Hi Koreen - did you figure out how to calculate the geometric mean with the raster calculator? Thanks!
The way you'd do this using the 10.x Raster Calculator tool would be:
("grid1" * "grid2" * "grid3" ) ** (1 / 3.)
Don't forget to include the "." after the number of rasters (3): in Python 2, which ArcGIS 10.x uses, division is integer division if both operands are integers.
>>> 1 / 3
0
>>> 1 / 3.
0.3333333333333333
Hi Curtis - I meant calculate the geometric mean of all the individual cells contained in one raster, like this:
with each of those being an individual cell of my raster. I think we realized that doing this to 1.2M cells in a raster simply implodes in Python; we couldn't get the script to handle more than 50 or so values at one time. Can someone out there with more programming experience explain why, or whether there's a better way to compute the geometric mean in Python or ArcGIS?
Thanks!
Not a spatial problem per se, and the SciPy library does a pretty good job for problems like this. It'll be included in the 10.3 release, but can be downloaded and installed separately in 10.x. With that in place, convert your raster to a NumPy array, then use the statistics built into SciPy to do the calculation:
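import arcpy
import scipy.stats

rast_path = 'C:/my_input_raster.tif'
# Read the raster into a NumPy array
raster_as_numpy_array = arcpy.RasterToNumPyArray(rast_path)
# axis=None computes one geometric mean over every cell in the array
raster_geometric_mean = scipy.stats.gmean(raster_as_numpy_array, axis=None)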
For a raster of 10M values, this takes a few seconds to run on my machine.
Hope that helps,
Shaun
Edit: Michael August, I've updated my answer to include 'axis=None', which computes the geometric mean for the whole matrix (instead of along one axis). With this change, it should work without any further steps.
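To illustrate the effect of the axis argument, here is a minimal sketch using plain NumPy rather than SciPy. The helper name `geometric_mean` and the small `cells` array are made up for illustration, and the cast to float64 is an assumption on my part to avoid losing precision with low-precision rasters. It assumes every cell value is strictly positive.

```python
import numpy as np

def geometric_mean(values, axis=None):
    """Geometric mean via exp(mean(log(x))); assumes all values are > 0."""
    logs = np.log(np.asarray(values, dtype=np.float64))
    return np.exp(logs.mean(axis=axis))

# Made-up 2 x 3 array standing in for the output of RasterToNumPyArray
cells = np.array([[1.0, 2.0, 4.0],
                  [8.0, 16.0, 32.0]])

per_column = geometric_mean(cells, axis=0)  # one value per column
whole_raster = geometric_mean(cells)        # a single value for all cells

print(per_column.shape)
print(whole_raster)
```

With axis=0 the result is an array with one entry per column, which is why a per-axis call on a raster returns many values; with axis=None the whole array is reduced to one number.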
>>> raster_as_numpy_array = arcpy.RasterToNumPyArray('C:\Users\xxxxx\Downloads\q47121g52be.tif\q47121g52be.tif')
>>> raster_geometric_mean = scipy.stats.mstats.gmean(raster_as_numpy_array)
>>> raster_geometric_mean
array([ 0., 0., 0., ..., 0., 0., 0.], dtype=float16)
So I tried the above on integer and float rasters and am not sure what I did wrong there. How do you get the geometric mean as a single reported value? If you can't tell, I am fairly new to Python. Thank you guys for all your help!
Once you have the array, check my examples in this thread.
So finish it with this part:
>>> cp = np.cumprod(a)
In my thread, I outlined two options: one using the cumulative product, and the other using the cumulative sum of the logs. The cumulative product approach will fail through numeric overflow if the array is large and/or contains large numbers. The log approach suffers less from this. The only difference is the way in which the final GM is determined. I documented this for teaching purposes and to provide some context on GM use and the calculation pitfalls. I am pretty sure that spreadsheet geometric mean functions don't use the cumulative product approach but use the log summation method behind the scenes. So read my thread carefully.
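The difference between the two approaches can be sketched in plain Python. This is not the code from the thread mentioned above; the function names `gmean_product` and `gmean_logs` and the sample values are hypothetical, and both functions assume strictly positive inputs.

```python
import math

def gmean_product(values):
    """Geometric mean via the running product; overflows easily."""
    product = 1.0
    for v in values:
        product *= v
    return product ** (1.0 / len(values))

def gmean_logs(values):
    """Geometric mean via the sum of logs; assumes all values are > 0."""
    log_sum = math.fsum(math.log(v) for v in values)
    return math.exp(log_sum / len(values))

small = [2.0, 8.0]         # geometric mean is 4
large = [1000.0] * 200000  # made-up stand-in for many raster cells

print(gmean_product(small))
print(gmean_logs(small))
print(gmean_product(large))  # the running product overflows to inf
print(gmean_logs(large))     # stays finite, close to 1000
```

On the small list both functions agree. On the large list the running product exceeds the range of a double and becomes infinite, while the log sum stays well within range.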