Numpy_Snippets
Updated: 2016-09-09
Previous snippets:
None Jan 5, 2015
Documentation Jan 30
Edits May 5
Documentation
NumPy Reference — NumPy v1.9 Manual
for numpy python packages
http://www.lfd.uci.edu/~gohlke/pythonlibs/
other links
http://rintintin.colorado.edu/~wajo8931/docs/jochem_aag2011.pdf
-------------------------------------------------------------------------------------------------
As a companion to the Numpy Lessons series that I have posted on my blog, I have decided to maintain a series of snippets that don't fit comfortably into a coherent lesson. Like the lessons, they will be numbered sequentially, with links to the previous ones kept in the top section. Contributions and/or corrections are welcome.
All samples assume that the following imports are made. Other required imports will be noted when necessary.
# default imports assumed in all examples, whether or not a given example needs them
import numpy as np
import arcpy
This is a bit of a hodge-podge, but the end result is to produce running means for a data set over a 10-year time period.
Simple array creation is shown using two methods, as well as how to convert array contents to specific data types.
>>> year_data = np.arange(2005,2015,dtype='int') # 10 years' worth of records from 2005 up to, but not including, 2015
>>> some_data = np.arange(0,10,dtype='float') # some numbers...sequential and floating point in this case
>>> result = np.zeros(shape=(10,),dtype='float') # create an array of 0's with 10 records
>>> result.fill(-999) # replace the zeros with a null value, -999 in this case
>>> result_list = list(zip(year_data,some_data,result)) # zip the 3 arrays together; list() is needed in python 3.x, where zip returns an iterator
>>>
>>> dt = np.dtype([('year','int'), ('Some_Data', 'float'),('Mean_5year',np.float64)]) # combined array type
>>> result_array = np.array(result_list,dtype=dt) # produce the final array with the desired data type
>>> result_array
array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, -999.0),
(2008, 3.0, -999.0), (2009, 4.0, -999.0), (2010, 5.0, -999.0),
(2011, 6.0, -999.0), (2012, 7.0, -999.0), (2013, 8.0, -999.0),
(2014, 9.0, -999.0)],
dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>
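As an alternative to zipping the three arrays, the same structured array can probably be built by allocating empty records and filling each field by name. This is only a sketch under that assumption; the name alt_array is hypothetical and not part of the original session.

```python
import numpy as np

# Field names and types mirror the dtype used in the session above
dt = np.dtype([('year', 'int'), ('Some_Data', 'float'), ('Mean_5year', np.float64)])

# Allocate 10 zeroed records, then fill each column by its field name
alt_array = np.zeros(10, dtype=dt)           # alt_array is a hypothetical name
alt_array['year'] = np.arange(2005, 2015)
alt_array['Some_Data'] = np.arange(0, 10, dtype='float')
alt_array['Mean_5year'] = -999               # same null value as in the session above

print(alt_array[:3])
```

This form avoids the intermediate list of tuples, which may matter for larger data sets.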
The result_array now consists of three columns (fields), which can be accessed by name using index notation.
>>> result_array['year'] # accessing the year, data and result column values by field name
array([2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014])
>>> result_array['Some_Data']
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> result_array['Mean_5year']
array([-999., -999., -999., -999., -999., -999., -999., -999., -999., -999.])
>>>
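Because each field comes back as an ordinary array, it can also be used to build a boolean mask that selects whole records. The following is a minimal sketch using a reduced two-field version of the array; the names arr and recent are hypothetical.

```python
import numpy as np

# Reduced two-field version of the array from the session above
dt = np.dtype([('year', 'int'), ('Some_Data', 'float')])
arr = np.zeros(10, dtype=dt)
arr['year'] = np.arange(2005, 2015)
arr['Some_Data'] = np.arange(10, dtype='float')

# A comparison on one field gives a boolean array, one entry per record
mask = arr['year'] >= 2010

# Indexing with the mask keeps only the matching records, all fields included
recent = arr[mask]
print(recent)
```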
If this array, an ndarray, is converted to a recarray, field access can also be achieved using 'array.field' notation.
>>> result_v2 = result_array.view(np.recarray) # view it as a recarray to permit 'array.field' access
>>>
>>> result_v2.year
array([2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014])
>>> result_v2.Some_Data
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> result_v2.Mean_5year
array([-999., -999., -999., -999., -999., -999., -999., -999., -999., -999.])
>>>
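One point worth checking is that view() does not copy the data, so the ndarray and the recarray should share one buffer and a change made through either shows up in both. A small sketch, with hypothetical names base and rec:

```python
import numpy as np

dt = np.dtype([('year', 'int'), ('Mean_5year', 'float')])
base = np.zeros(3, dtype=dt)            # plain structured ndarray
rec = base.view(np.recarray)            # same buffer, attribute-style access

rec.Mean_5year[0] = 42.0                # write through the recarray view
shares = np.shares_memory(base, rec)    # True when both use the same memory

print(base['Mean_5year'], shares)
```

If an independent copy is wanted instead, calling copy() on the array before or after the view would be one way to get it.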
The remainder of the demonstration shows some of the things that can be done with ndarrays and recarrays. As an example, the 5-year running mean will be calculated and the null values in the Mean_5year column replaced with valid data. The 'np.convolve' function will be used to determine the running means, for no other reason than that I hadn't used it before. Since the input data are sequential numbers from 0 to 9, it is easy to do the mental math and confirm that the running mean is correct. The steps entail computing the means, padding the result back to the original length, and writing it into the Mean_5year column.
Here it goes...
>>> N = 5 # five year running mean step, see the help on convolve
>>> rm = np.convolve(result_v2['Some_Data'],np.ones((N,))/N, mode='valid') # a mouthful
>>> rm # however, there are only values for the mid-point year
array([ 2., 3., 4., 5., 6., 7.]) # so we need to pad by 2 on either end of the output
>>>
>>> pad_by = N//2 # floor division...plain N/2 returns a float (2.5) in python 3.x, which np.pad will not accept
>>>
>>> new_vals = np.pad(rm,pad_by,mode='constant',constant_values=-999) # padding the result to new_vals
>>> new_vals
array([-999., -999., 2., 3., 4., 5., 6., 7., -999., -999.])
>>>
>>> result_v2.Mean_5year = new_vals # set the new_vals into the correct column
>>>
>>> result_v2 # voila
rec.array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, 2.0),
(2008, 3.0, 3.0), (2009, 4.0, 4.0), (2010, 5.0, 5.000000000000001),
(2011, 6.0, 6.0), (2012, 7.0, 7.000000000000001),
(2013, 8.0, -999.0), (2014, 9.0, -999.0)],
dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>
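The convolve-then-pad steps above can be wrapped in a small helper and cross-checked against explicit window means. This is a sketch only; running_mean is a hypothetical function name, and it assumes an odd window length so that the padding is symmetric.

```python
import numpy as np

def running_mean(values, n, null=-999.0):
    """Centered n-point running mean, padded with `null` at both ends.

    Hypothetical helper; assumes n is odd so both ends get n // 2 pad values.
    """
    values = np.asarray(values, dtype='float')
    core = np.convolve(values, np.ones(n) / n, mode='valid')  # len(values) - n + 1 means
    half = n // 2                                             # floor division
    return np.pad(core, half, mode='constant', constant_values=null)

data = np.arange(0, 10, dtype='float')
padded = running_mean(data, 5)

# Cross-check: mean of each 5-value window, computed the slow way
explicit = np.array([data[i:i + 5].mean() for i in range(len(data) - 5 + 1)])

print(padded)
print(np.allclose(padded[2:-2], explicit))
```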
A bit messy, with that floating point representation thing appearing for a few numbers...let's clean it up by rounding the values in the 'Mean_5year' column to limit the number of decimal places that show up. The dtype stays the same; only the stored values change. This will be done incrementally.
>>> x = np.round(result_v2.Mean_5year,decimals=2)
>>> result_v2.Mean_5year = x
>>> result_v2
rec.array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, 2.0),
(2008, 3.0, 3.0), (2009, 4.0, 4.0), (2010, 5.0, 5.0),
(2011, 6.0, 6.0), (2012, 7.0, 7.0), (2013, 8.0, -999.0),
(2014, 9.0, -999.0)],
dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>
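Rounding changes the stored values. If only the display is the concern, formatting the output is another option that leaves the data untouched. A short sketch of the difference, using hypothetical names vals, shown and rounded:

```python
import numpy as np

# Values with the same kind of floating point noise seen in the output above
vals = np.array([5.000000000000001, 7.000000000000001, -999.0])

# Formatting only: returns a string, the stored values are unchanged
shown = np.array2string(vals, precision=2)

# Rounding: returns a new array whose values have actually been rounded
rounded = np.round(vals, decimals=2)

print(shown)
print(vals[0] == 5.0, rounded[0] == 5.0)
```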
So these snippets have shown some of the things that can be done with arrays and the subtle but important distinctions between numpy's array, ndarray and recarray forms.