Finding an Average Rainfall Data

05-02-2012 11:46 AM
RPatarasuk
New Contributor II
Hi All,

I have about 350 files of PRISM rainfall data to process, e.g. precipitation from 1970 to 2010, and each year has 13 files.

ppt_1970.01.tif
ppt_1970.02.tif
ppt_1970.03.tif
ppt_1970.04.tif
ppt_1970.05.tif
ppt_1970.06.tif
ppt_1970.07.tif
ppt_1970.08.tif
ppt_1970.09.tif
ppt_1970.10.tif
ppt_1970.11.tif
ppt_1970.12.tif
ppt_1970.14.tif
.
.
.
ppt_2010.11.tif
ppt_2010.12.tif
ppt_2010.14.tif

I am trying to write Python code to calculate a long-term average. Here is the code that I have so far. I have only tried it with 5-10 files, and the program only returns values for the last file (i.e. it doesn't add up the values of the previous files and find the average). Also, I would like to exclude "*.14" from the calculation.

# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Input data source
arcpy.env.workspace = "S:/Work/Risa/Trial & Error/input"
arcpy.env.overwriteOutput = True

# Set output folder
OutputFolder = "S:/Work/Risa/Trial & Error/output/"

# Loop through a list of files in the workspace
rasterFiles = arcpy.ListRasters()

# Local variables:
for filename in rasterFiles:
    print("Processing: " + filename)
    inRaster = arcpy.env.workspace + "/" + filename
    outRaster = OutputFolder + "/" + "AvgPrecip.tif"
   
    #***need to process from all of the files except the ones that have "*.14"

    # Process: Cell Statistics
    arcpy.gp.CellStatistics_sa(inRaster, outRaster, "MEAN", "DATA")

   
                                  
print "***DONE!!!"


Any help would be greatly appreciated. Thank you very much.

Risa
11 Replies
PhilMorefield
Occasional Contributor III
First, please use CODE tags when you post actual Python code: http://forums.arcgis.com/threads/48475-Please-read-How-to-post-Python-code

Second, the "14" file that you want to exclude is actually the annual average. No need to include the other 12 files for each year.

Third, in answer to your question you could do something like this:
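inList = []
for filename in rasterFiles:
    if filename.endswith('14.tif'):  # the names end in ".tif", so match "14.tif"
        inList.append(filename)
# this line was incorrect in your example: pass the whole list once, after the loop
outRaster = arcpy.sa.CellStatistics(inList, "MEAN", "DATA")
outRaster.save(OutputFolder + "AvgPrecip.tif")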
RPatarasuk
New Contributor II
Dear philmorefield,

Thank you very much for your help. I am very new to Python and to posting questions on the forum. I tried to incorporate what you suggested but still have no luck. It doesn't append *.14, and it still calculates only the last file processed. Here is the code that I modified.

# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Input data source
arcpy.env.workspace = "S:/Work/Risa/Trial & Error/input"
arcpy.env.overwriteOutput = True

# Set output folder
OutputFolder = "S:/Work/Risa/Trial & Error/output/"

# Loop through a list of files in the workspace
rasterFiles = arcpy.ListRasters()

# Local variables:
for filename in rasterFiles:
    
    print("Processing: " + filename)
    inRaster = arcpy.env.workspace + "/" + filename
    if filename.endswith("*.14.tif"):
        inRaster.append(filename)
        
    # Process: Cell Statistics
    outRaster = CellStatistics(inRaster, "MEAN", "DATA")
   
    # Save the output 
    outRaster.save("S:/Work/Risa/Trial & Error/output/AvgPrecip01.tif")
    
print "Average Calculated!!!"
PhilMorefield
Occasional Contributor III
There are a couple of problems that need to be addressed. I put in some comments to help explain.
# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Input data source
arcpy.env.workspace = "S:/Work/Risa/Trial_and_Error/input"  # don't use spaces or special characters in path names
arcpy.env.overwriteOutput = True

# Set output folder
OutputFolder = "S:/Work/Risa/Trial_and_Error/output/"

# Loop through a list of files in the workspace
rasterFiles = arcpy.ListRasters()

# The loop does nothing else but build the list
inRasters = []  # you need to do this outside of the loop
for filename in rasterFiles: 
    print("Processing: " + filename)
    if filename.endswith("14.tif"):
        inRasters.append(filename)
        
# Now we have a list of the files we want to average, but you only need to calculate the average once, 
# which means this part should not be in the loop, so we remove indentation
outRaster = CellStatistics(inRasters, "MEAN", "DATA")
   
# Save the output 
outRaster.save("S:/Work/Risa/Trial_and_Error/output/AvgPrecip01.tif")
    
print "Average Calculated!!!"

Looping through files is one of the most useful and common tasks in GIS. Once you get this mastered you'll be set!
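To make the pattern concrete without needing arcpy, here is a minimal sketch in plain Python. The cell_mean helper and the tiny 2x2 grids are hypothetical stand-ins for CellStatistics and the real rasters; the only point is that the loop builds the list and the mean is computed once, afterwards.

```python
# Hypothetical stand-in data: file name -> 2x2 grid of cell values.
grids = {
    "ppt_1970.12.tif": [[9.0, 9.0], [9.0, 9.0]],
    "ppt_1970.14.tif": [[1.0, 2.0], [3.0, 4.0]],
    "ppt_1971.14.tif": [[3.0, 4.0], [5.0, 6.0]],
}


def cell_mean(rasters):
    """Cell-by-cell mean of equally sized grids (stand-in for CellStatistics)."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    return [
        [sum(r[i][j] for r in rasters) / len(rasters) for j in range(cols)]
        for i in range(rows)
    ]


# Step 1: the loop only builds the list.
in_rasters = []
for filename in sorted(grids):
    if filename.endswith("14.tif"):
        in_rasters.append(grids[filename])

# Step 2: one call, outside the loop, over the whole list.
average = cell_mean(in_rasters)
print(average)
```

If the call to cell_mean were indented into the loop and given a single grid, each pass would overwrite the previous result, which is the "only the last file" behaviour described in the question.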
RPatarasuk
New Contributor II
Thank you very much for your help. It still produces the same results. I am just wondering why you put

inRaster = []

Isn't it the same as

rasterFiles = arcpy.ListRasters()

?

I have modified the code a bit and it now takes the average of all of the files. However, it does not exclude the files that end in *.14. I changed these lines:

rasterFiles.append(filename) instead of inRasters

and

outRaster = CellStatistics(rasterFiles, "MEAN", "DATA") instead of inRasters

Here is the modified code:

# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Input data source
arcpy.env.workspace = "S:/Work/Risa/Trial_Error/input"  # don't use spaces or special characters in path names
arcpy.env.overwriteOutput = True

# Set output folder
OutputFolder = "S:/Work/Risa/Trial_Error/output/"

# Loop through a list of files in the workspace
rasterFiles = arcpy.ListRasters()

# The loop does nothing else but build the list
inRaster = []  # you need to do this outside of the loop
for filename in rasterFiles: 
    print("Processing: " + filename)
    if filename.endswith("*.14.tif"):
        rasterFiles.append(filename)
        
# Now we have a list of the files we want to average, but you only need to calculate the average once, 
# which means this part should not be in the loop, so we remove indentation
outRaster = CellStatistics(rasterFiles, "MEAN", "DATA")
   
# Save the output 
outRaster.save("S:/Work/Risa/Trial_Error/output/AvgPrecip18.tif")
    
print "Average Calculated!!!"

#takes the average and including the file that ends with *.14

DarrenWiens2
MVP Honored Contributor
Jook, you need the second list (inRasters = []) so that you have an empty place to put the rasters ending in "14.tif".

"rasterFiles = arcpy.ListRasters()" lists all of the rasters; the loop then puts the "14.tif" files into "inRasters". Then you calculate the mean of all of the rasters in "inRasters".

The way you have modified the code will produce the original list of all rasters, plus any file ending in "14.tif", so you'll have two copies of each "14.tif".

Finally, just to be clear, you are not excluding the files ending in "14.tif"; you are using only them (i.e. you are excluding anything not ending in "14.tif").
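One way to check which names end up in which list is to run the same list logic on a few sample names. This is a plain-Python sketch with made-up file names, not the arcpy script itself. It also shows that a test against the literal "*.14.tif" suffix matches nothing, in which case rasterFiles would simply stay the full list and every file would go into the mean.

```python
# Made-up sample of what arcpy.ListRasters() might return.
raster_files = ["ppt_1970.01.tif", "ppt_1970.02.tif", "ppt_1970.14.tif"]

# The modified script's test: endswith() treats "*" literally, so no name matches.
matches = [name for name in raster_files if name.endswith("*.14.tif")]

# A separate, initially empty list collects only the names that pass the test.
in_rasters = []
for name in raster_files:
    if name.endswith("14.tif"):
        in_rasters.append(name)

print(matches)       # names the modified script would have appended
print(in_rasters)    # names selected into the second list
print(raster_files)  # the original list is left untouched
```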
0 Kudos
RPatarasuk
New Contributor II
Dear Darren,

Thank you very much for your reply. Maybe I said something wrong at the beginning. I want to include the files that end with *.01 to *.12 but exclude *.14.
PhilMorefield
Occasional Contributor III
You can do that if you want. But as I pointed out earlier, the data you are working with already include the annual averages: the files ending in "14". If you're just going to average all months together, you can make things simpler and faster by just using the "14" files.

But, this is the change you should make to average 12 files from each year:
inRasters = []  # you need to do this outside of the loop
for filename in rasterFiles: 
    print("Processing: " + filename)
    if filename[-6:] in [str(x).zfill(2) + '.tif' for x in xrange(1, 13)]:
        inRasters.append(filename)  # append to the new list, not to the list being looped over
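An alternative sketch of the same filter, assuming every file follows the ppt_YYYY.MM.tif pattern shown in the question: split the name on dots and keep it only when the middle token is a month from 01 to 12. The helper name is_monthly is made up for illustration.

```python
def is_monthly(filename):
    """Return True for names like ppt_1970.01.tif ... ppt_1970.12.tif."""
    parts = filename.split(".")
    # Expect exactly three pieces: "ppt_1970", "01", "tif".
    if len(parts) != 3 or parts[2].lower() != "tif":
        return False
    month = parts[1]
    return len(month) == 2 and month.isdigit() and 1 <= int(month) <= 12


sample = ["ppt_1970.01.tif", "ppt_1970.12.tif", "ppt_1970.14.tif", "dem.tif"]
monthly = [name for name in sample if is_monthly(name)]
print(monthly)
```

Unlike a plain "not the 14 file" test, this also skips unrelated rasters that happen to sit in the same folder.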
DarrenWiens2
MVP Honored Contributor
Or, just to be lazy:
if not filename.endswith("*.14.tif"):
PhilMorefield
Occasional Contributor III
Touché.

Though you could run into trouble if some other raster is lying around in there. Also, the endswith method doesn't treat asterisks as wildcards. It evaluates the string literally and looks for an asterisk in the file name.
>>> filename = 'ppt_1970.14.tif'
>>> filename.endswith('14.tif')
True
>>> filename.endswith('*.14.tif')
False
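If wildcard matching is what you want, Python's standard fnmatch module does treat the asterisk as a wildcard, and endswith also accepts a tuple of literal suffixes. A small sketch with sample names (the names and the first_quarter grouping are only for illustration):

```python
import fnmatch

names = ["ppt_1970.01.tif", "ppt_1970.14.tif", "ppt_1971.14.tif"]

# fnmatchcase() interprets "*" as "any run of characters".
annual = [n for n in names if fnmatch.fnmatchcase(n, "*.14.tif")]

# endswith() takes a tuple, so several literal suffixes can be tested at once.
first_quarter = [n for n in names if n.endswith((".01.tif", ".02.tif", ".03.tif"))]

print(annual)
print(first_quarter)
```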