Calculate daily average of raster files based on their names in python

3693
27
Jump to solution
06-13-2017 10:15 AM
AmenehTavakol
New Contributor III

I have hundreds of SMAP soil moisture data and need to calculate the daily average of them. For every day there are 8 files (every 3 hours) and I need a code to calculate the average based on their names. 

I have a code but i am interested to know how can I define a code that average data based on their names? 

# Import arcpy module
import arcpy


# Local variables:
SMAP_L4_SM_gph_20150331T043000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c15_tif = "SMAP_L4_SM_gph_20150331T043000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c15.tif"
SMAP_L4_SM_gph_20150331T073000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0d_tif = "SMAP_L4_SM_gph_20150331T073000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0d.tif"
SMAP_L4_SM_gph_20150331T103000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c14_tif = "SMAP_L4_SM_gph_20150331T103000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c14.tif"
SMAP_L4_SM_gph_20150331T133000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf5_tif = "SMAP_L4_SM_gph_20150331T133000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf5.tif"
SMAP_L4_SM_gph_20150331T163000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf3_tif = "SMAP_L4_SM_gph_20150331T163000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf3.tif"
SMAP_L4_SM_gph_20150331T193000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf4_tif = "SMAP_L4_SM_gph_20150331T193000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf4.tif"
SMAP_L4_SM_gph_20150331T223000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0f_tif = "SMAP_L4_SM_gph_20150331T223000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0f.tif"
SMAP_L4_SM_gph_20150331T013000_Vv2030_001_Geophysical_Data_sm_surface_bdea9eb8_tif = "SMAP_L4_SM_gph_20150331T013000_Vv2030_001_Geophysical_Data_sm_surface_bdea9eb8.tif"
v_name_ = "D:\\NASA\\Data\\SMAP\\2014\\Processed\\%name%.tif"

# Process: Raster Calculator
arcpy.gp.RasterCalculator_sa("( \"%SMAP_L4_SM_gph_20150331T043000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c15.tif%\" + \"%SMAP_L4_SM_gph_20150331T073000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0d.tif%\" + \"%SMAP_L4_SM_gph_20150331T103000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c14.tif%\" +\"%SMAP_L4_SM_gph_20150331T133000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf5.tif%\" + \"%SMAP_L4_SM_gph_20150331T163000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf3.tif%\" + \"%SMAP_L4_SM_gph_20150331T193000_Vv2030_001_Geophysical_Data_sm_surface_bdea9bf4.tif%\" + \"%SMAP_L4_SM_gph_20150331T223000_Vv2030_001_Geophysical_Data_sm_surface_bdea9c0f.tif%\" + \"%SMAP_L4_SM_gph_20150331T013000_Vv2030_001_Geophysical_Data_sm_surface_bdea9eb8.tif%\") / 8", v_name_)


Your help is really appreciated.
0 Kudos
27 Replies
AmenehTavakol
New Contributor III

Right now I have to calculate the average for every cluster which is in this shape,

20150429
SMAP_L4_SM_gph_20150429T013000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec829.tif
SMAP_L4_SM_gph_20150429T043000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec80f.tif
SMAP_L4_SM_gph_20150429T073000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec7e9.tif
SMAP_L4_SM_gph_20150429T103000_Vv2030_001_Geophysical_Data_sm_surface_bf1ecad1.tif
SMAP_L4_SM_gph_20150429T133000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec808.tif
SMAP_L4_SM_gph_20150429T163000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec82d.tif
SMAP_L4_SM_gph_20150429T193000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec7eb.tif
SMAP_L4_SM_gph_20150429T223000_Vv2030_001_Geophysical_Data_sm_surface_bf1ec82a.tif

But I could not do that through getproperties and this code:

for rasters in main():
    Ave = np.mean()

So how can i define the average for every cluster?
0 Kudos
BlakeTerhune
MVP Regular Contributor

Check out the Cell Statistics tool that Curtis Price‌ mentioned. It takes a list of raster filenames and calculates an output raster with whatever summary statistic type you choose. Using my file grouping code above, you can do something like...

for date, rasters in raster_days.items():
    # Execute CellStatistics
    outCellStatistics = arcpy.sa.CellStatistics(rasters, "MEAN")

    # Save the output
    outCellStatistics.save("C:/example/cellstats_output")‍‍‍‍‍‍

Note that it looks like you do need the spatial analyst extension for Cell Statistics.

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")
AmenehTavakol
New Contributor III
import arcpy
import numpy as np

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

from collections import defaultdict

def main():
    raster_dir = r"D:\NASA\Data\SMAP\OutPut"
    arcpy.env.workspace = raster_dir
    raster_days = defaultdict(list)
    for r in arcpy.ListRasters():
        raster_days[r[15:23]].append(r)

    for date, rasters in raster_days.items():
        print(date)
        for r in rasters:
            print("\t{}".format(r))

if __name__ == '__main__':
    main()

for date, rasters in raster_days.item():
        # Execute CellStatistics
        outCellStatistics = arcpy.sa.CellStatistics(rasters, "MEAN")
        # Save the output
        outCellStatistics.save("D:/NASA\Data\SMAP\OutPut/Processed")

I tried this one but I received an error "for date, rasters in raster_days.item():
                                                              NameError: name 'raster_days' is not defined"

0 Kudos
curtvprice
MVP Esteemed Contributor

Blake there is a typo in your code above I think 

    for date, rasters in day_rasters.items():

should be

    for date, rasters in raster_days.items():
AmenehTavakol
New Contributor III

They are "for date, rasters in raster_days.item():"

But still I receive this error!! that "NameError: name 'raster_days' is not defined". 

0 Kudos
curtvprice
MVP Esteemed Contributor

That's because you put that code in the wrong place -- outside of the main() function.

The variable raster_days is not defined outside of the function.

BlakeTerhune
MVP Regular Contributor

Yes, I see it now, thanks. I'll update my code.

0 Kudos
curtvprice
MVP Esteemed Contributor

I think you want to look into the Cell Statistics tool. The order doesn't matter, you could use Model Builder‌ or Python (arcpy.ListRasters() or arcpy.da.Walk) to create your list of rasters and pass it to the tool.

AmenehTavakol
New Contributor III

Do you have an example of the code. I am very new in python so your help is really appreciated.

0 Kudos
curtvprice
MVP Esteemed Contributor
0 Kudos