
Numpy Snippets

 

Updated: 2016-09-09

Previous snippets:

None ... Jan 5, 2015

Documentation ... Jan 30

Edits ... May 5

 

Documentation

NumPy Reference — NumPy v1.9 Manual

Tentative NumPy Tutorial

Windows binaries for numpy and other python packages:

http://www.lfd.uci.edu/~gohlke/pythonlibs/

Other links:

http://rintintin.colorado.edu/~wajo8931/docs/jochem_aag2011.pdf

 

-------------------------------------------------------------------------------------------------

As a companion to the Numpy Lessons series posted in my blog, I have decided to maintain a series of snippets that don't comfortably fit into a coherent lesson.  Like the lessons, they will be sequentially numbered, with links to the previous ones kept in the top section.  Contributions and/or corrections are welcome.

 

All samples assume that the following imports are made.  Other required imports will be noted when necessary.

 

# default imports used in all examples whether they are or not
import numpy as np
import arcpy

 

This is a bit of a hodge-podge, but the end result is to produce running means for a data set over a 10-year time period.
Simple array creation is shown using two methods, as well as how to convert array contents to specific data types.

 

>>> year_data = np.arange(2005,2015,dtype='int')   # 10 years worth of records from 2005 up to, but not 2015
>>> some_data = np.arange(0,10,dtype='float')      # some numbers...sequential and floating point in this case
>>> result = np.zeros(shape=(10,),dtype='float')   # create an array of 0's with 10 records
>>> result.fill(-999)                              # provide a null value and fill the zero's with null values
>>> result_list = list(zip(year_data,some_data,result))  # zip the 3 arrays together (list() needed in Python 3)
>>>
>>> dt = np.dtype([('year','int'), ('Some_Data', 'float'),('Mean_5year',np.float64)]) # combined array type
>>> result_array = np.array(result_list,dtype=dt)  # produce the final array with the desired data type
>>> result_array
array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, -999.0),
       (2008, 3.0, -999.0), (2009, 4.0, -999.0), (2010, 5.0, -999.0),
       (2011, 6.0, -999.0), (2012, 7.0, -999.0), (2013, 8.0, -999.0),
       (2014, 9.0, -999.0)],
      dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>

 

The result_array now consists of three columns, which can be accessed by name using array slicing.

>>> result_array['year']                           # slicing the year, data and result column values
array([2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014])
>>> result_array['Some_Data']
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
>>> result_array['Mean_5year']
array([-999., -999., -999., -999., -999., -999., -999., -999., -999., -999.])
>>>

 

If this array, an ndarray, is converted to a recarray, field access can also be achieved using 'array.field' notation.

>>> result_v2 = (result_array.view(np.recarray))   # convert it to a recarray to permit 'array.field access'
>>>
>>> result_v2.year
array([2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014])
>>> result_v2.Some_Data
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
>>> result_v2.Mean_5year
array([-999., -999., -999., -999., -999., -999., -999., -999., -999., -999.])
>>>

 

The remainder of the demonstration basically shows some of the things that can be done with ndarrays and recarrays.  As an example, the 5-year running mean will be calculated and the Mean_5year column's null values replaced with valid data.  The 'np.convolve' method will be used to determine the running means for no other reason than I hadn't used it before.  Since the input data are sequential numbers from 0 to 9, it will be pretty easy to do the mental math to figure out whether the running mean is indeed correct.  The steps entail:

  1. decide upon the mean step to use (e.g. N=5),
  2. run the convolve method on the 'Some_Data' column in the result_v2 recarray,
  3. pad the resultant array so that the sizes of the running mean calculation array and the column array are equal.

Here it goes...

 

>>> N = 5                                          # five year running mean step, see the help on convolve
>>> rm = np.convolve(result_v2['Some_Data'],np.ones((N,))/N, mode='valid')  # a mouthful
>>> rm                                             # however, there are only values for the mid-point year
array([ 2.,  3.,  4.,  5.,  6.,  7.])              # so we need to pad by 2 on either end of the output
>>>
>>> pad_by = N//2                                  # floor division... plain / returns a float in Python 3.x
>>>
>>> new_vals = np.pad(rm,pad_by,mode='constant',constant_values=-999)   # padding the result to new_vals
>>> new_vals
array([-999., -999.,    2.,    3.,    4.,    5.,    6.,    7., -999., -999.])
>>>
>>> result_v2.Mean_5year = new_vals                # set the new_vals into the correct column
>>>
>>> result_v2                                      # voila
rec.array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, 2.0),
       (2008, 3.0, 3.0), (2009, 4.0, 4.0), (2010, 5.0, 5.000000000000001),
       (2011, 6.0, 6.0), (2012, 7.0, 7.000000000000001),
       (2013, 8.0, -999.0), (2014, 9.0, -999.0)],
      dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>


A bit messy with that floating point representation appearing for a few numbers... let's clean it up by rounding the values in the 'Mean_5year' column.  This will be done incrementally.

 

>>> x = np.round(result_v2.Mean_5year,decimals=2)
>>> result_v2.Mean_5year = x
>>> result_v2
rec.array([(2005, 0.0, -999.0), (2006, 1.0, -999.0), (2007, 2.0, 2.0),
       (2008, 3.0, 3.0), (2009, 4.0, 4.0), (2010, 5.0, 5.0),
       (2011, 6.0, 6.0), (2012, 7.0, 7.0), (2013, 8.0, -999.0),
       (2014, 9.0, -999.0)],
      dtype=[('year', '<i4'), ('Some_Data', '<f8'), ('Mean_5year', '<f8')])
>>>

 

So these snippets have shown some of the things that can be done with arrays and the subtle but important distinctions between numpy's array, ndarray and recarray forms.

Numpy Snippets

 

Updated: 2016-09-09

 

This is just a quick example of how to use existing arrays and export them to tables.  In this case I will use arcpy functionality to produce a dBase file.  Numpy can be used directly to produce text files as well.

To begin with:

  • two arrays of X and Y values are created in the range 0-10 inclusive (i.e. 11 numbers),
  • a data type is specified (dt)... in this case I assigned 'X' and 'Y' to the columns and specified a 64-bit floating point number,
  • an array, XY, is created by zipping the original arrays together with the specified data type,
  • arcpy is imported and an output table is created using the data access module's NumPyArrayToTable.

 

>>> import numpy as np
>>> X = np.arange(11)                # take some numbers
>>> Y = np.arange(11)                # ditto
>>> dt = [('X','<f8'),('Y','<f8')]   # specify a data type ie 64 bit floats
>>>
>>> XY = np.array(list(zip(X,Y)), dtype = dt) # create a structured array (list() needed in Python 3)
>>> XY
array([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0),
       (5.0, 5.0), (6.0, 6.0), (7.0, 7.0), (8.0, 8.0), (9.0, 9.0),
       (10.0, 10.0)],
      dtype=[('X', '<f8'), ('Y', '<f8')])
>>>
>>> import arcpy                     # now lets do some arcpy stuff
>>> out_table = 'c:/temp/test.dbf'
>>> arcpy.da.NumPyArrayToTable(XY,out_table)

 

-----------------------------------------------------

Now for the reveal...

 

[Image: Output_Table.jpg]

 

-----------------------------------------------------

Bring it back you say?   Nothing could be easier.

>>> in_array = arcpy.da.TableToNumPyArray(out_table,['OID','X','Y'])
>>> in_array
array([(0, 0.0, 0.0), (1, 1.0, 1.0), (2, 2.0, 2.0), (3, 3.0, 3.0),
       (4, 4.0, 4.0), (5, 5.0, 5.0), (6, 6.0, 6.0), (7, 7.0, 7.0),
       (8, 8.0, 8.0), (9, 9.0, 9.0), (10, 10.0, 10.0)],
      dtype=[('OID', '<i4'), ('X', '<f8'), ('Y', '<f8')])

-----------------------------------------------------

Out to *.csv you say?  Too easy (nothing fancy this time...just the numbers but formatted a bit).

 

    0,      0.00,      0.00
    1,      1.00,      1.00
    2,      2.00,      2.00
    3,      3.00,      3.00
    4,      4.00,      4.00
    5,      5.00,      5.00
    6,      6.00,      6.00
    7,      7.00,      7.00
    8,      8.00,      8.00
    9,      9.00,      9.00
   10,     10.00,     10.00
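The commands used to write the file aren't shown, but output like the listing above could be produced with np.savetxt; a sketch under that assumption (the format string is my guess at the layout):

```python
import io
import numpy as np

# rebuild the OID, X, Y columns as a plain 2D (float) array for text output
oid = np.arange(11)
xy = np.arange(11, dtype='float')
data = np.column_stack((oid, xy, xy))

# one format specifier per column: a 5-wide integer OID, two 9-wide floats
buf = io.StringIO()                    # stands in for open('c:/temp/test.csv', 'w')
np.savetxt(buf, data, fmt='%5d, %9.2f, %9.2f')
print(buf.getvalue())
```

np.savetxt accepts anything file-like, so the same call works on a real path.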

 

So NumPy and ArcPy do play 'nice' together.  Experiment a bit.   More later.

Numpy Snippets

 

Updated: 2016-09-09

 

The purpose of this post is to show how numpy can play nicely with arcpy to produce geometry and perform tasks like shape translation, rotation and scaling which are not readily available in arcpy's available functions.

 

To pay homage to ArcMap's ... Fish Net ... I propose a very scaled down version aptly named Phish_Nyet.  A fuller version will appear when ArcScripts 2.0 opens.  This is a demo script: all you do is vary the parameters in the demo() function within the script.

 

[Image: Sampling_grids.png]

 

Only the rectangular option will be presented here, however, it is possible to make all kinds of sampling grids, including hexagonal ones as shown above.

 

The following points apply:

  • Output is to a shapefile.
  • The output file must have a projected coordinate system.   To create grids in geographic coordinates, use Fishnet in ArcMap.  Why?  Because invariably people create grids in geographic coordinates, then project them without densifying the grid blocks.  This results in shapes which are in error, because the curvature associated with lines of latitude is not accounted for.
  • A corner point is specified as the origin of the net.  One can determine which corner to use by specifying a dX and dY with positive and/or negative values.  In this fashion, you can get your corner to be the top or bottom, left or right of the output.  If you want the middle to be the origin...do the math and get a corner.
  • The output is controlled by the cell widths (dX and dY), the number of columns and rows (i.e. in the X and Y directions) and a rotation angle, which is positive for clockwise rotation.

 

Notes:

  - grid_array - does the numpy magic of generating the grid shapes.  I have tried to be as verbose as possible.  In short, I generate a seed shape and propagate it.  I have intentionally kept rotate and output_polygons as visible functions so you can see how they work.

 

A fuller version will surface as I stated when ArcScripts 2.0 appears.  Change the parameters in the demo() function and run it.

 

"""
Phish_Nyet.py
Author:  Dan.Patterson@carleton.ca

Purpose: Produce a sampling grid with user defined parameters.
"""

import arcpy
import numpy as np

def demo():
    """Generate the grid using the following parameter"""
    output_shp = r'C:\temp\Phish_Nyet.shp'
    SR =  arcpy.SpatialReference(2951) # u'NAD_1983_CSRS_MTM_9' YOU NEED ONE!!!
    corner = [340000.0, 5022000.0]     # corner of grid
    dX = 1000.0;  dY = 1000.0          # X and Y cell widths
    cols = 3;  rows = 3                # columns/rows...grids in X and Y direction
    angle = 0                          # rotation angle, clockwise +ve
    # create the grid
    pnts = grid_array(corner,dX,dY,cols,rows,angle)
    output_polygons(output_shp,SR,pnts)
    print('\nPhish_Nyet has created... {}'.format(output_shp))

def grid_array(corner=[0,0],dX=1,dY=1,cols=1,rows=1,angle=0):
    """create the array of pnts to pass on to arcpy using numpy magic"""
    X = [0.0,0.0,1.0,1.0,0.0]                  # X,Y values for a unit square
    Y = [0.0,1.0,1.0,0.0,0.0]                  #
    seed = np.column_stack((X,Y)) * [dX,dY]    # first array corner values scaled
    u = [seed + [j*dX,i*dY] for i in range(0,rows) for j in range(0,cols)]
    pnts = np.array(u)                         #
    x = [rotate(p,angle) for p in pnts]        # rotate the scaled points
    pnts = [ p + corner for p in x]            # translate them
    return pnts

def rotate(pnts,angle=0):
    """rotate points about the origin in degrees, (+ve for clockwise) """
    angle = np.deg2rad(angle)               # convert to radians
    s = np.sin(angle);  c = np.cos(angle)   # rotation terms
    aff_matrix = np.array([[c, -s],[s, c]]) # rotation matrix
    XY_r = np.dot(pnts, aff_matrix)         # numpy magic to rotate pnts
    return XY_r

def output_polygons(output_shp,SR,pnts):
    """produce the output polygon shapefile"""
    msg = '\nRead the script header... A projected coordinate system required'
    assert (SR is not None) and (SR.type=='Projected'), msg
    polygons = []
    for pnt in pnts:                           # create the polygon geometry
        polygons.append(arcpy.Polygon(arcpy.Array([arcpy.Point(*xy) for xy in pnt]),SR))
    if arcpy.Exists(output_shp):               # overwrite any existing versions
        arcpy.Delete_management(output_shp)
    arcpy.CopyFeatures_management(polygons, output_shp)

if __name__ == '__main__':
    """Generate the grid using the listed parameters"""
    demo()  # modify the parameters in demo to run

 

Of course, other regular geometric shapes can be generated in a similar fashion, but not all of them pack together the way rectangles and hexagons do.

N-gon Demo

 

Updated:  2016-09-09

This post ... how to draw an octagon or hexagon in ArcGIS Desktop? ... led me back to an original post dealing with producing sampling grids: Numpy Snippets # 3 ... Phish_Nyet ... creating sampling grids using numpy and arcpy.  For completeness, here are further thoughts.

 

There are two implementations of n-gons... flat topped and pointy topped.  They differ only by their rotation angle relative to the X/Y axis.  In the case of a square, the rotation is 45°.  And yes... even a circle in ArcMap is represented as a 360-sided n-gon, so it too has a pointy and a flat top.

 

Once the seed shape is created, it can be placed around the centroid of known points by creating a polygon from the array outputs.  I normally use FeatureclassToNumPyArray and NumPyArrayToFeatureclass to perform the transition from points to array and back again.  In my previous blog, I exploited this to produce a sampling grid using rectangles and hexagons of known width, location and orientation, for both the flat and pointy topped examples.

 

There is nothing stopping one from creating any geometric shape in any configuration using these simple principles.  All that needs to be determined is the angles needed to produce the n-gon.  For example, the only two lines that need to be changed are these to represent the polygon (n-gon) angles.  From there, the desired width is used to create the final seed which can then be shifted into the desired configuration/location using other code samples included in my previous blogs.

 

Flat topped:   f_rad = np.deg2rad([180.,120.,60.,0.,-60.,-120.,-180.])   # angles in degrees

Pointy topped: p_rad = np.deg2rad([150.,90.,30.,-30.,-90.,-150.,150.])
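Those two lines of angles generalize to any n-gon; a sketch under that idea (the function name and parameters are mine, not from the original script):

```python
import numpy as np

def ngon_seed(n, size=1.0, flat_top=True):
    """Closed ring of n+1 points for a regular n-gon centred on (0, 0).

    flat_top=True starts the angles at 180 degrees (flat topped); False
    offsets them by half a step (pointy topped), mirroring f_rad and p_rad.
    """
    step = 360.0 / n
    offset = 0.0 if flat_top else step / 2.0
    angles = 180.0 - offset - step * np.arange(n + 1)   # closes back on itself
    rad = np.deg2rad(angles)
    return np.column_stack((np.cos(rad), np.sin(rad))) * size
```

For n=6 this reproduces the f_rad and p_rad point sets exactly; n=8 gives the octagon asked about in the post.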

 

 

"""
hexagon_demo_shape.py
Author: 
Dan.Patterson@carleton.ca

Purpose: create hexagon shapes in two forms, flat-topped and pointy-topped
Result:
   Produce hexagon of desired width in X direction centered about
   the origin (0,0)
NOTES:   see full code for other implementations
"""

import numpy as np
np.set_printoptions(precision=4,threshold=10,edgeitems=5,linewidth=75,suppress=True)
def hex_flat(size=1,cols=1,rows=1):
    """generate the points for the flat-headed hexagon """
    f_rad = np.deg2rad([180.,120.,60.,0.,-60.,-120.,-180.])
    X = np.cos(f_rad)*size;  Y = np.sin(f_rad)*size # scaled hexagon about 0,0
    seed = np.array(list(zip(X,Y)))      # array of coordinates (list() needed in Python 3)
    return seed

def hex_pointy(size=1,cols=1,rows=1):
    """pointy hex angles, convert to sin,cos, zip and send"""
    p_rad = np.deg2rad([150.,90,30.,-30.,-90.,-150.,150.])
    X = np.cos(p_rad)*size;  Y = np.sin(p_rad)*size # scaled hexagon about 0,0
    seed = np.array(list(zip(X,Y)))      # list() needed in Python 3
    return seed

if __name__ == '__main__':
    flat = hex_flat(700,1,1)
    pointy = hex_pointy(700,1,1)
    print('\nFlat headed hexagon \n{}'.format(flat))
    print('\nPointy headed hexagon \n{}'.format(pointy))

 

Outputs for flat and pointy headed hexagons.

 

Widths: 1 m (unit width), 100 units, 700 m.

Flat headed hexagon (unit width)

[[-1.     0.   ]
 [-0.5    0.866]
 [ 0.5    0.866]
 [ 1.     0.   ]
 [ 0.5   -0.866]
 [-0.5   -0.866]
 [-1.    -0.   ]]

Flat headed hexagon (100 unit width)

[[-100.        0.    ]
 [ -50.       86.6025]
 [  50.       86.6025]
 [ 100.        0.    ]
 [  50.      -86.6025]
 [ -50.      -86.6025]
 [-100.       -0.    ]]

Flat headed hexagon (700 m width)

[[-700.        0.    ]
 [-350.      606.2178]
 [ 350.      606.2178]
 [ 700.        0.    ]
 [ 350.     -606.2178]
 [-350.     -606.2178]
 [-700.       -0.    ]]


Pointy headed hexagon (100 unit width)

[[ -86.6025   50.    ]
 [   0.      100.    ]
 [  86.6025   50.    ]
 [  86.6025  -50.    ]
 [   0.     -100.    ]
 [ -86.6025  -50.    ]
 [ -86.6025   50.    ]]

Pointy headed hexagon (700 m width)

[[-606.2178  350.    ]
 [   0.      700.    ]
 [ 606.2178  350.    ]
 [ 606.2178 -350.    ]
 [   0.     -700.    ]
 [-606.2178 -350.    ]
 [-606.2178  350.    ]]

 

Enjoy.  Should you require the rotation code or shape generation code, let me know, or check the code in NumPy Snippets # 3 for guidance.

NumPy Snippets

 

Updated: 2016-09-09

 

Recently I posted about 'nothing' in None isn't... nor is 0 or 1 ... more explorations into geometry.

This snippet shows how to deal with nothing... errrr... nulls.  Simply put, for most numpy functions there is an option to account for numeric null values... NaN, 'not a number', in python parlance.  Remember, ArcMap often has to deal with null values in fields, and this is a frequent stumbling block for people trying to summarize their data.  Here is the snippet for you to think about, then to explore.

"""
numpy_NaN

Author:  Dan.Patterson@carleton.ca

Purpose:
Create an array using a 'seed' list, cast it as a float and then
do some column sums, with and without null (NaN) values

"""


import numpy as np

fields = ['a','b','c','d','e']        # field names used to define columns
seed = [['1','2','3','4','5'],
        ['2','3','4','5','1'],
        ['2','3','4','5','2']]

a = np.asarray(seed,dtype='float64')  # produce the array

b = np.sum(a,axis=0)                  # sum by the columns

print("\nSum Demo... \nUsing np.sum(array,axis=0)\nUsing np.nansum(array,axis=0)")
print('\nData:\n{}\n\nColumn sum no nulls:\n{}'.format(a,b))
#
# now with nulls

null = np.NaN                         # NaN... not a number ... or is it?
seed2 = [['1',null,'3','4','5'],
         [null,'3','4','5','1'],
         [null,'3',null,'5','2']]

a2 = np.asarray(seed2,dtype='float64')

b2 = np.sum(a2,axis=0)

c2 = np.nansum(a2,axis=0)

print('\nData with nulls... :\n{}\n\nColumn sum with nulls:\n{}'.format(a2,b2))
print('\nColumn sum omitting nulls:\n{}'.format(c2))

 

Now...the reveal...

 

Sum Demo...
Using np.sum(array,axis=0)
Using np.nansum(array,axis=0)

Data:
[[ 1.  2.  3.  4.  5.]
 [ 2.  3.  4.  5.  1.]
 [ 2.  3.  4.  5.  2.]]

Data with nulls... :
[[  1.  nan   3.   4.   5.]
 [ nan   3.   4.   5.   1.]
 [ nan   3.  nan   5.   2.]]

Column sum no nulls:       [  5.   8.  11.  14.   8.]
Column sum with nulls:     [ nan  nan  nan  14.   8.]
Column sum omitting nulls: [  1.   6.   7.  14.   8.]

 

Clever, isn't it?  Now there are other np.nan* functions to explore.
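A quick sketch of a few of them, using the same data-with-nulls as above:

```python
import numpy as np

a2 = np.array([[1, np.nan, 3, 4, 5],
               [np.nan, 3, 4, 5, 1],
               [np.nan, 3, np.nan, 5, 2]])

print(np.nanmean(a2, axis=0))    # column means, nulls omitted
print(np.nanmax(a2, axis=0))     # column maxima, nulls omitted
print(np.isnan(a2).sum(axis=0))  # a bonus: how many nulls per column
```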

There is ... still ... confusion regarding the proper use of the  Define Projection Tool versus the Project Tool

 

"my data don't line up..."

"I am sure about the coordinate system..."

"everything is far apart onscreen..."

 

We have all been there.  The written descriptions don't seem to catch on, so a visual guide might be what is needed.

Prior to proceeding... make sure you have seen the ...References... at the end of this section... that is what you need to understand.

 

So with tongue firmly planted in cheek...

 

[Table: Existing coordinate system / Desired coordinate system / Tool to use / Result.  The rows, shown as images in the original, illustrated the Project tool, the Project tool with a wrong geographic transformation, and the Define Projection tool.]

 

The choice really depends on:

  • what you know you have... not what you think you have...
  • what you really need...
  • applying the correct tool...   if it doesn't work out, undo what you did... locate the originals... read the metadata
  • understanding what you got and why.

 

References

 

 

And to cover some of the other combinations and permutations, just remember....things can get worse, before they get better...

The aggravation of aggregation ...

 

[Images: aggregation_demo.png and aggregation_demo2.png ... a raster with some classes, and its aggregation using the maximum]

No Spatial Analyst??? A raster is just a big array... hmmmm. I wonder.

What is available at all license levels?

Not on the list?  There are examples elsewhere of how to get other geometries into array format.  But everyone knows about old school ascii.  Splurge on disk space... ascii files are easy to edit, and there are many options available to get the data into array format.

 

Code section
ASCII example...

Demo code...

This simple sample code should give you some ideas.  In this version, edge considerations are not accounted for, so should that be an issue, you can pad the array with an appropriate collar.

"""
Script:  aggregate_demo.py
Modified: 2016-01-30
Author:  Dan.Patterson AT  carleton.ca
Purpose:  To demonstrate aggregation of raster data without the
spatial analyst extension.  A sample raster is created and methods
to convert an array to  a raster and vice versa are shown.
Notes:
- RasterToNumPyArray(in_raster, {lower_left_corner},
                        {ncols}, {nrows}, {nodata_to_value})
  arr = arcpy.RasterToNumPyArray(rast)
- NumPyArrayToRaster(in_array, {lower_left_corner}, {x_cell_size},
                    {y_cell_size}, {value_to_nodata})
  rast = arcpy.NumPyArrayToRaster(a, arcpy.Point(300000,5025000),
                                  10,10,-9999)
  rast.save(r"F:\Demos\raster_ops\test_agg")
    # esri grid, or add tif, jpg etc
"""
import numpy as np
from numpy.lib.stride_tricks import as_strided
np.set_printoptions(edgeitems=3,linewidth=80,precision=2,
                    suppress=True,threshold=50)
from textwrap import dedent

import arcpy
arcpy.env.overwriteOutput = True
def block_a(a, block=(3,3)):
    """
    Provide a 2D block view of a 2D array. No error checking made.
    Columns and rows outside of the block are truncated.
    """
    a = np.ascontiguousarray(a)
    r, c = block
    shape = (a.shape[0]//r, a.shape[1]//c) + block   # floor division for Python 3
    strides = (r*a.strides[0], c*a.strides[1]) + a.strides
    b_a = as_strided(a, shape=shape, strides=strides)
    return b_a
def agg_demo(n):
    """
Run the demo with a preset array shape and content.
       See the header.
    """
    a = np.random.randint(0, 6, size=n*n).reshape((n,n))  # random_integers is deprecated
    rast = arcpy.NumPyArrayToRaster(a,x_cell_size=10)
    agg_rast = arcpy.sa.Aggregate(rast,2,"MAXIMUM")
    agg_arr = arcpy.RasterToNumPyArray(agg_rast)
    # --- a_s is the strided array, a_agg_max is the strided array max
    a_s  = block_a(a, block=(2,2))
    a_agg_max = a_s.max(axis=(2,3))
    # ---
    frmt = "\nInput array..\n{}\n\n" \
          "Arcpy.sa aggregate..\n{}\n\n" \
          "Numpy aggregate..\n{}\n\n" \
          "All close? {}"
    yup = np.allclose(agg_arr,a_agg_max)
    print(dedent(frmt).format(a, agg_arr, a_agg_max, yup))
    return a, agg_arr, a_s, a_agg_max

if __name__=="__main__":
    """Returns the input array, its aggregation array from
    arcpy.sa, the block (strided) representation and the
    numpy aggregation array.
    """

    n=10
    a, agg_arr, a_s, a_agg_max = agg_demo(n)
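If Spatial Analyst (or arcpy itself) isn't available, the numpy half of the demo stands on its own; a sketch on a small fixed array so the block maxima can be checked by eye:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def block_max(a, block=(2, 2)):
    """Aggregate a 2D array by the maximum of each block (edges truncated)."""
    a = np.ascontiguousarray(a)
    r, c = block
    shape = (a.shape[0] // r, a.shape[1] // c) + block
    strides = (r * a.strides[0], c * a.strides[1]) + a.strides
    return as_strided(a, shape=shape, strides=strides).max(axis=(2, 3))

a = np.array([[0, 0, 2, 5],
              [1, 1, 2, 1],
              [4, 5, 5, 5],
              [1, 1, 3, 1]])
print(block_max(a))
```

Swapping .max for .mean, .min or .sum gives the other aggregation statistics.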

 

This is a sample ascii file, so that you can see the numeric inputs and the structure of the ascii header and its data.

ncols         10
nrows         10
xllcorner     300000
yllcorner     5025000
cellsize      10
NODATA_value  -9999
0 0 2 5 -1 3 5 2 5 0
-1 -1 2 1 1 -1 -1 4 0 4
4 5 5 5 1 4 1 1 2 2
1 1 3 1 2 0 5 -1 2 3
1 1 5 2 4 5 4 2 5 -1
2 1 0 -1 2 3 2 1 5 1
3 1 5 5 0 3 -1 3 -1 2
1 -1 0 5 2 5 2 1 -1 2
5 4 4 5 2 3 0 5 4 0
1 1 3 5 4 -1 4 5 1 5
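That header-plus-data layout can be read back without arcpy at all; a sketch with plain numpy, using a reduced grid and a StringIO object standing in for the file on disk:

```python
import io
import numpy as np

# a reduced version of the ascii grid above, as it would sit on disk
ascii_grid = """ncols         4
nrows         2
xllcorner     300000
yllcorner     5025000
cellsize      10
NODATA_value  -9999
0 0 2 5
-9999 -1 2 1
"""

# six header lines of keyword/value pairs, then the raster values
hdr = dict(line.split() for line in ascii_grid.splitlines()[:6])
arr = np.loadtxt(io.StringIO(ascii_grid), skiprows=6)
arr = np.where(arr == float(hdr['NODATA_value']), np.nan, arr)
print(hdr['cellsize'], arr.shape)
```

With the nodata cells flagged as NaN, the np.nan* functions from the earlier snippet apply directly.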

 

Does it work with big rasters?

Does it support other summary statistics?

Of course, I covered the details of raster statistics in an earlier post.

 

That's all for now...

I have been working with arrays and lists a lot lately and I wanted to view them in column mode rather than in row mode, to make documentation easier to follow and the output more intuitive.  The demo relies on the numpy module, but that is no issue since everyone has it with their ArcMap and ArcGIS Pro installation.

You can alter where an array gets parsed by changing the values in this line...

 

np.set_printoptions(edgeitems=2,linewidth=80,precision=2,suppress=True,threshold=8)

 

I altered the [ and ] characters common in lists and arrays, just to show you could do it.  I also added an indentation option.

If you like the array output better than the list output, you can simply make an array from a list using ... new_array = np.array(input_list) ...

 

"""
Script:    array_print_demo.py
Author:   
Dan.Patterson@carleton.ca

Modified:  2016-01-16
Purpose:   Demonstration of formatting arrays/lists in alternate formats
Functions:
  -  frmt_arr(arr, indents=0)
  -  indent_arr(arr, indents=1)  called by above
"""

import numpy as np
np.set_printoptions(edgeitems=2,linewidth=80,precision=2,suppress=True,threshold=8)

def frmt_arr(arr, indents=0):
    """
    Format arrays or lists with dimensions >=3 printed by rows rather
    than the usual column expression.  Indentation can be employed as well as
    the number of indent levels.  See string.expandtabs to alter the tab level.
    """

    if isinstance(arr, list):
        arr = np.asarray(arr)  # make sure inputs are array
    result=[]; temp = []
    arr_dim = arr.ndim
    if arr_dim < 3:
        if (arr_dim == 1) and (len(arr[0]) > 1):
            arr = arr.reshape((arr.shape[-1],1))
        if indents:
            arr = indent_arr(arr, indents)
        return arr
    elif arr_dim == 3:
        temp.append(arr)
    elif arr_dim > 3:
        temp = [sub for sub in arr]   
    for arr in temp:
        for x in zip(*arr):
            result.append("   ".join(map(str,x))) # use tabs \t, not space
        if arr_dim > 3:
            result.append("----")
        out = "\n".join(result)
    if indents:
        out = indent_arr(out, indents)
        #out = (str(out).replace("[","|")).replace("]","|")
        #tabs = "    "*indents # "\t"
        # out = tabs + out.replace("\n","\n" + tabs)
    return out
def indent_arr(arr, indents=1):
    """
    Add an indent to a str or repr version of an array.
    The number of indents is determined by the indents option.
    """
       
    out = (str(arr).replace("[","|")).replace("]","|")
    tabs = "    "*indents # "\t"
    out = tabs + out.replace("\n","\n" + tabs)
    return out
if __name__=='__main__':
    """largely a demonstration of slicing by array dimensions
    """

    # syntax  frmt_arr(arr, indents=1)
    a = np.random.randint(1,1000,16)
    a1 = a.reshape((8,2)).tolist()      # 8 pairs, as shown in the output below
    e = a.reshape((4,2,2)); e1 = e.tolist()

 

Your preference

>>> print(a1)
[[135, 944], [196, 335], [761, 521], [529, 687], [803, 393], [254, 797], [610, 605], [328, 516]]
>>> # or ...
>>> print(frmt_arr(a1,indents=1))
    ||135 944|
     |196 335|
     ...,
     |610 605|
     |328 516||

A 3D list or array
>>> print(e)
[[[135 944]
  [196 335]]
[[761 521]
  [529 687]]
[[803 393]
  [254 797]]
[[610 605]
  [328 516]]]
>>> # or ...
>>> print(frmt_arr(e,indents =2))
        |135 944|   |761 521|   |803 393|   |610 605|
        |196 335|   |529 687|   |254 797|   |328 516|

Date: 2015-10-08   Modified: 2017-04-19 new ***

Included:

  • Add X Y coordinates to table by distance or percent along poly* features  ***
  • Distance to a specific point
  • Cumulative distance
  • Inter-point distance
  • Azimuth to a point
  • Angle between consecutive points
  • Convert Azimuth to compass bearing   ***
  • Convert Degrees decimal minutes to decimal degrees ***

 

Purpose:

Many posts on GeoNet come from people looking to write a script or produce a tool for a task which can be handled in a much simpler fashion.  This is the first in a series of posts that will address how to use the field calculator to its fullest.  The use, and perhaps the limitations, of the field calculator should be guided by the following considerations:

  1. you need to perform some rudimentary task for which there is no existing tool (i.e. it is so simple to do... what is the point of having it built in)
  2. your datasets are relatively small ...
    • of course this is a subjective classification...  so if processing takes more than a minute... you have left the realm of small
  3. you don't have to perform the task that often...
    • basically you are trying to fix up an oversight in the initial data requirements, or the requirements have changed since project inception
    • you got the data elsewhere and you are trying to get it to meet project standards
    • you want to explore
  4. you have no clue how to script but want to dabble as a prelude to full script development.
  5. whatever else I left out ....

 

Notes:

  • Make sure you have projected data and a double field or nothing will make sense.  I care not about performing great circle calculations or determining geodetic areas.  The field calculator...IMHO... is not the place to do that.
  • Ensure that the sequence of points is indeed correct.  If you have been editing or fiddling around with the data, make sure that nothing is amiss with the data.  What you see, is often not what is.  You can get erroneous results from any environment
  • Sorting will not change the outcome, the results will be in feature order.
    • If you want them in another order, then use tools to do that.
    • I will be paralleling geometric calculations using NumPy and arcpy in a separate series, which removes this issue.
  • Selections can be used but they will be in feature order.
    • sorting and selecting can get it to look like you want it ... but it doesn't make it so... want and is,  are two separate concepts like project and define projection (but I digress)
  • VB anything is not covered.
    • It is time to move on to the next realm in the sequence and evolution of languages.
    • in ArcGIS Pro, VB isn't even an option, so get used to it
  • code is written verbosely, for the most part
    • I will try to use the simplest of programming concepts...  simplicity trumps coolness and speed.
    • parallel, optional and/or optimized solutions will be discussed elsewhere.
  • the math module is already imported

 

Usage:

For all functions, do the following:

  • select the Python parser ... unfortunately, it can't be selected by default
  • toggle on, Show code block
  • Pre-logic Script Code:
    • paste the code in this block
  • Expression box
    • paste the function call in the expression box
    • ensure that there is preceding space before the expression...it will generate an error otherwise
    • ie  dist_between(!Shape!)     ... call the dist_between function ... the shape field is required
    • field names are surrounded by exclamation marks ( ! ) ... the shapefile and file geodatabase standard

----------------------------------------------------------------------------------------------------------------------------------------

Add X or Y coordinate to table field by distance or percentage

For use in tables, to retrieve the X or Y coordinates in an appropriate field.  See the header for requirements.


Pre-logic Script Code:

def pnt_along(shape, value=0.0, use_fraction=False, XorY="X"): 
    """Position X or Y coordinate, x/y meters or decimal fraction along a line.
    :Requires:
    :--------
    : shape field: python parser use !Shape!
    : value: (distance or decimal fraction, 0-1)
    : use_fraction: (True/False)
    : XorY: specify X or Y coordinates
    :
    :Returns: the specified coordinate (X or Y) meters or % along line or boundary
    :-------
    :
    :Usage: pnt_along(!Shape!, 100, False, "X") # X, 100 m from start point
    :
    """
    XorY = XorY.upper()
    if use_fraction and (value > 1.0):
        value = value/100.0
    if shape.type.lower() == "polygon":
        shape = shape.boundary()
    pnt = shape.positionAlongLine(value, use_fraction)
    if XorY == 'X':
        return pnt.centroid.X
    else:
        return pnt.centroid.Y

expression =

pnt_along(!Shape!, 100, False, "X")  # X, 100 m from start point

--------------------------------------------------------------------------------------------------------------

Distance to a specific point

Calculate the distance to a specific point within a dataset.  For example, you can determine the distance from a point cloud's center to all other points.  Points not in the file can be obtained by other means.  Useful for summary information on inter-point distances as a prelude to a full clustering or k-means study.

For example, the potential workflow to get the distance of every point to the point clouds center might include the following steps:

  • add X and Y style fields to contain the coordinates
  • use Field Statistics to get the mean center (use a pen to record them... not everything is automagic)
  • add a Dist_to field to house the calculations
  • use the script below

 

Pre-logic Script Code:

"""  dist_to(shape, from_x, from_y)
input:      shape field, origin x,y
returns:    distance to the specified point
expression: dist_to(!Shape!, x, y)
"""

def dist_to(shape, from_x, from_y):
    x = shape.centroid.X
    y = shape.centroid.Y
    distance = math.sqrt((x - from_x)**2 + (y - from_y)**2)
    return distance

expression =

dist_to(!Shape!, x, y)
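Outside the field calculator, the same distance computation can be sketched with plain NumPy; the coordinates below are made up for illustration:

```python
import numpy as np

# hypothetical point coordinates; replace with your own X, Y values
pts = np.array([[0., 0.], [4., 0.], [0., 3.], [4., 3.]])
center = pts.mean(axis=0)                     # mean center of the point cloud
d = np.sqrt(((pts - center)**2).sum(axis=1))  # distance of each point to it
```

Each of the four corner points here sits 2.5 units from the mean center (2.0, 1.5).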

-----------------------------------------------------------------------------------------------------------------------------------------

Cumulative Distance

To determine the distance from a point in a data table to every other point in sequence.  Essentially a sequential perimeter calculation.

 

Pre-logic Script Code:

""" input shape field: returns cumulative distance between points
dist_cumu(!Shape!) #enter into the expression box"""
x0 = 0.0
y0 = 0.0
distance = 0.0
def dist_cumu(shape):
    global x0
    global y0
    global distance
    x = shape.centroid.X
    y = shape.centroid.Y
    if x0 == 0.0 and y0 == 0.0:
        x0 = x
        y0 = y
    distance += math.sqrt((x - x0)**2 + (y - y0)**2)
    x0 = x
    y0 = y
    return distance

expression =

dist_cumu(!Shape!)
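The same running total can be checked without arcpy by differencing a small, made-up coordinate array:

```python
import numpy as np

# hypothetical waypoints: two 3-4-5 triangles laid end to end
xy = np.array([[0., 0.], [3., 4.], [6., 8.]])
seg = np.sqrt((np.diff(xy, axis=0)**2).sum(axis=1))  # successive segment lengths
cumu = np.concatenate(([0.], np.cumsum(seg)))        # running distance from the start
```

Here the segment lengths are [5., 5.] and the cumulative distances are [0., 5., 10.].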

----------------------------------------------------------------------------------------------------------------------------------------

Inter-point distance

Determine distances between sequential point pairs (not cumulative).

 

Pre-logic Script Code:

""" dist_between(shape)
input: shape field
returns: distance between successive points
expression: dist_between(!Shape!)
"""

x0 = 0.0
y0 = 0.0
def dist_between(shape):
    global x0
    global y0
    x = shape.centroid.X
    y = shape.centroid.Y
    if x0 == 0.0 and y0 == 0.0:
        x0 = x
        y0 = y
    distance = math.sqrt((x - x0)**2 + (y - y0)**2)
    x0 = x
    y0 = y
    return distance

expression =

dist_between(!Shape!)

 

-----------------------------------------------------------------------------------------------------------------------------------------

Azimuth to a specific point

Determine the azimuth to a specific point, for example, a point cloud's center.

 

Pre-logic Script Code:

"""  azimuth_to(shape, from_x, from_y)
input:      shape field, from_x, from_y
returns:    angle between 0 and <360 between a specified point and others
expression: azimuth_to(!Shape!, from_x, from_y)
"""

def azimuth_to(shape, from_x, from_y):
    # atan2(dx, dy) measures clockwise from north and handles all quadrants,
    # unlike atan(dx/dy) which fails when dy is 0 and folds the quadrants
    radian = math.atan2(shape.centroid.X - from_x, shape.centroid.Y - from_y)
    degrees = math.degrees(radian)
    if degrees < 0:
        return degrees + 360.0
    else:
        return degrees

expression =

azimuth_to(!Shape!,from_x, from_y)
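The azimuth logic can be verified on its own; this sketch repeats the atan2 form with hypothetical offsets (dx, dy) from the reference point:

```python
import math

def azimuth(dx, dy):
    # atan2(dx, dy), not atan2(dy, dx), measures clockwise from north
    ang = math.degrees(math.atan2(dx, dy))
    return ang + 360.0 if ang < 0 else ang

azimuth(1, 0)    # due east  -> 90.0
azimuth(0, -1)   # due south -> 180.0
azimuth(-1, 0)   # due west  -> 270.0
```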

azimuth_1.png

 

-----------------------------------------------------------------------------------------------------------------------------------------

Angle between successive points

Determine the angle between points, for example, angle changes between waypoints.

 

Pre-logic Script Code:

"""  angle_between(shape)
input:      shape field
returns:    angle between successive points,
            NE +ve 0 to 90, NW +ve 90 to 180,
            SE -ve <0 to -90, SW -ve <-90 to -180
expression: angle_between(!Shape!)
"""

x0 = 0.0
y0 = 0.0
angle = 0.0
def angle_between(shape):
    global x0
    global y0
    x = shape.centroid.X
    y = shape.centroid.Y
    if x0 == 0.0 and y0 == 0.0:
        x0 = x
        y0 = y
        return 0.0
    radian = math.atan2((shape.centroid.Y - y0),(shape.centroid.X - x0))
    angle = math.degrees(radian)
    x0 = x
    y0 = y
    return angle

expression =

angle_between(!Shape!)

 

-----------------------------------------------------------------------------------------------------------------------------------------
Line direction or Azimuth to Compass Bearing

 

 

Can be used to determine the direction/orientation between two points which may or may not be on a polyline.  Provide the origin and destination points.  The origin may be the 0,0 origin or the beginning of a polyline or a polyline segment.

def line_dir(orig, dest, fromNorth=False):
    """Direction of a line given 2 points
    : orig, dest - two points representing the start and end of a line.
    : fromNorth - True gives the angle from north; False, relative to the x-axis
    :
    """

    orig = np.asarray(orig)
    dest = np.asarray(dest)
    dx, dy = dest - orig
    ang = np.degrees(np.arctan2(dy, dx))
    if fromNorth:
        ang = np.mod((450.0 - ang), 360.)
    return ang
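A quick check of the north-referenced conversion used above (values assumed for illustration):

```python
import numpy as np

ang = np.degrees(np.arctan2(1.0, 1.0))    # 45 deg from the x-axis (NE direction)
from_north = np.mod(450.0 - ang, 360.0)   # NE is also 45 deg from north
east = np.mod(450.0 - 0.0, 360.0)         # along the x-axis: 0 deg -> 90 deg from north
```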

-----------------------------------------------------------------------------------------------------------------------------------------

Convert Azimuth to Compass Bearing

 

Once the Azimuth to a particular point is determined, this can be converted to a compass direction, centered in 22.5 degree bands.  The type of compass can be altered to suit... see script header.

 

Pre-logic Script Code:

import numpy as np
global a
global c
c = np.array(['N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
              'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW', 'N'])
a = np.arange(11.25, 360., 22.5)
def compass(angle):
    """Return the compass direction based on supplied angle.
    :Requires:
    :--------
    : angle - angle(s) in degrees, no check made for other formats.
    : - a single value, list or np.ndarray can be used as input.
    : - angles are assumed to be from compass north, alter to suit.
    :
    :Returns: The compass direction.
    :-------
    :
    :Notes:
    :-----
    : Compass ranges can be altered to suit the desired format.
    : See various wiki's for other options. This incarnation uses 22.5
    : degree ranges with the compass centered on the range.
    : ie. N between 348.75 and 11.25 degrees, range equals 22.5)
    :
    :----------------------------------------------------------------------
    """

    if isinstance(angle, (float, int, list, np.ndarray)):
        angle = np.atleast_1d(angle)
    comp_dir = c[np.digitize(angle, a)]
    if len(comp_dir) == 1:
        comp_dir = comp_dir[0]
    return comp_dir

expression =

compass(!Azimuth_Field_Name!)  # python parser, field name enclosed in exclamation marks
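The binning can be exercised outside ArcMap; this repeats the class and break arrays with a few test angles:

```python
import numpy as np

c = np.array(['N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
              'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW', 'N'])
a = np.arange(11.25, 360., 22.5)          # 22.5 degree band edges
angles = np.array([0., 45., 95., 180., 270., 350.])
dirs = c[np.digitize(angles, a)]          # ['N' 'NE' 'E' 'S' 'W' 'N']
```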

 

--------------------------------------------------------------------------------------------------------------------------------------------------------------

Degrees decimal minutes to decimal degrees

def ddm_ddd(a, sep=" "):
    """ convert a degree, decimal minute string to decimal degrees
    : a - degree, decimal minute string
    : sep - usually a space, but check
    """
    d, m = [float(i) for i in a.split(sep)]
    sign = -1.0 if d < 0 else 1.0    # take the sign from the degrees
    dd = sign*(abs(d) + abs(m)/60.)
    return dd

 

For example   ddm_ddd(!YourStringField!, sep=" ") will convert a string/text field into a double in your new field
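A standalone check of the conversion (the function is repeated so the snippet runs on its own; the sample strings are hypothetical):

```python
def ddm_ddd(a, sep=" "):
    # degree, decimal-minute string -> decimal degrees
    d, m = [float(i) for i in a.split(sep)]
    sign = -1.0 if d < 0 else 1.0
    return sign * (abs(d) + abs(m) / 60.0)

ddm_ddd("45 30.0")    # 45.5
ddm_ddd("-75 15.0")   # -75.25
```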

This post seemed interesting for several reasons:   Iterate and Add Values from many fields

  • it is generally the type of thing that we know how to do in a spreadsheet easily
  • it is a different thought process that one uses to translate into code
  • I got to experiment with block based functions that weren't related to raster or image data

 

Reflecting before problem solving

As a preamble, let's examine the year in different ways.

Shaping up your year in various ways....

the conventional year, by day

 

[[  1,   2,   3, ..., 363, 364, 365]] 

the year by average month

      [[  1,   2,   3, ...,  28,  29,  30], 

       [ 31,  32,  33, ...,  58,  59,  60],

       [ 61,  62,  63, ...,  88,  89,  90],

       ...,   ... snip ....

       [271, 272, 273, ..., 298, 299, 300],

       [301, 302, 303, ..., 328, 329, 330],

       [331, 332, 333, ..., 358, 359, 360]]

the year by weeks

      [[  1,   2,   3, ...,   5,   6,   7], 

       [  8,   9,  10, ...,  12,  13,  14],

       [ 15,  16,  17, ...,  19,  20,  21],

       ...,  ... snip ....

       [344, 345, 346, ..., 348, 349, 350],

       [351, 352, 353, ..., 355, 356, 357],

       [358, 359, 360, ..., 362, 363, 364]]

how about the average lunar cycle?

      [[  1,   2,   3, ...,  26,  27,  28],

       [ 29,  30,  31, ...,  54,  55,  56],

       [ 57,  58,  59, ...,  82,  83,  84],

       ...,

       [281, 282, 283, ..., 306, 307, 308],

       [309, 310, 311, ..., 334, 335, 336],

       [337, 338, 339, ..., 362, 363, 364]]

 

Your year can be shaped in different ways... but notice that the reshaped year never works out to exactly 365 days (not talking about leap years here).

The year is divisible into chunks... the fiddly-bits get truncated.  Notice the different year lengths in the above (365, 360, 364, 364).

Did you ever wonder why your birthday is on a different sequentially changing day every year?

 

On to the problem

So here is some data.  It is constructed so that you can do the 'math' yourself.  The human brain is pretty good at picking out the patterns...it is just less adept at formulating a plan to implement it.  So here goes..

 

The task is to take a set of data and determine some properties on a block by block basis.  In this example, the maximum needs to be determined for blocks of 4 values in each row.  Coincidentally... (aka, to simplify the problem) this yields 5 blocks of 4 in each row.  This type of block is denoted as a 1x4 block... one row by 4 columns, if you wish.

 

1.  Input data.... subdivide the data into 1x4 blocks

[[  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20]
 [ 21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40]
 [ 41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60]
 [ 61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80]
 [ 81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100]]

 

2.  Imagine that the data are split into 5 blocks of 1x4 values in each of the 5 rows, which means you would have 25 1x4 blocks of values.  Here is an example: [1, 2, 3, 4].  We can see that the maximum is 4, which most of you know how to derive in some programming language using some interface, whether it be the field calculator or a tool in ArcToolbox.

 

3.  Using your mind's eye, or a spreadsheet, determine the maximum for each block of 4, within each row, for all rows.

The maximum of each 1x4 block in each row, would yield.

[[  4   8  12  16  20]
 [ 24  28  32  36  40]
 [ 44  48  52  56  60]
 [ 64  68  72  76  80]
 [ 84  88  92  96 100]]

 

4.  The final stage of the cited problem was to determine the maximum of each of the 1x4 blocks within each row.  That is, the maximum of the values produced in step 2.

The results on a row basis are as follows:

 

[ 20  40  60  80 100]
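In NumPy terms, steps 1 through 4 amount to a reshape followed by two max calls; a sketch with the same 5x20 data:

```python
import numpy as np

a = np.arange(1, 101).reshape(5, 20)   # the 5x20 input data
blocks = a.reshape(5, 5, 4)            # five 1x4 blocks per row
block_max = blocks.max(axis=2)         # step 3: max of each 1x4 block
row_max = block_max.max(axis=1)        # step 4: max of those, by row
```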

 

---- Example 2 ----
5.  Let's do the sum of each 1x4 block and determine the row max

     Again, I won't show the intermediate step since it is visually confusing unless you are interested in it.

 

6. sums of each 1x4 block in each row

[[ 10  26  42  58  74]
 [ 90 106 122 138 154]
 [170 186 202 218 234]
 [250 266 282 298 314]
 [330 346 362 378 394]]

 

7. maximum of (6) by row        [ 74 154 234 314 394]
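The sum variant only swaps max for sum at the block stage; a sketch:

```python
import numpy as np

a = np.arange(1, 101).reshape(5, 20)
block_sum = a.reshape(5, 5, 4).sum(axis=2)  # step 6: 1x4 block sums per row
row_max = block_sum.max(axis=1)             # step 7: row maximums of the sums
```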

 

---- Example 3 ----

8.  Do the sum, but with a masked array.  The mask locations should not be included in the calculations.

[[1 2 3 4 5 -- 7 8 9 10 11 -- 13 14 15 16 17 -- 19 20]
 [21 22 23 -- 25 26 27 28 29 -- 31 32 33 34 35 -- 37 38 39 40]
 [41 -- 43 44 45 46 47 -- 49 50 51 52 53 -- 55 56 57 58 59 --]
 [61 62 63 64 65 -- 67 68 69 70 71 -- 73 74 75 76 77 -- 79 80]
 [81 82 83 -- 85 86 87 88 89 -- 91 92 93 94 95 -- 97 98 99 100]]

 

9. now with masked values

 

10. sum of the 1x 4 blocks accounting for the mask

[[ 10  20  30  58  56]
 [ 66 106  92 102 154]
 [128 138 202 164 174]
 [250 200 210 298 236]
 [246 346 272 282 394]]

 

11. maximum of 10 by row      [58 154 202 298 394]
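The masked values shown above appear to be the multiples of 6, so the whole example can be reproduced with numpy.ma; masked cells drop out of the sums automatically:

```python
import numpy as np
import numpy.ma as ma

a = np.arange(1, 101).reshape(5, 20)
m = ma.masked_where(a % 6 == 0, a)           # mask every multiple of 6
block_sum = m.reshape(5, 5, 4).sum(axis=2)   # step 10: block sums, mask excluded
row_max = block_sum.max(axis=1)              # step 11: row maximums
```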

# -------------------------------------------------------------------------------------------------------------------------

Now, this can obviously be extended to look at longer data lists.  The following example looks at how to determine the maximum monthly value and maximum sum over a 5 year period.  To simplify the example and to facilitate mental math, a year has 360 days, and 1 month has 30 days.

---- the trick ----

a = int_range((5,360),1,1)  # rows, columns, start, step
b = block_reshape(a,block=(1,30))
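int_range and block_reshape are helper functions that are not listed in this snippet; a minimal sketch of what they might look like, matching the call signatures above and the shapes reported below:

```python
import numpy as np

def int_range(shape=(5, 360), start=1, step=1):
    # sequential integers reshaped to rows x columns
    rows, cols = shape
    return np.arange(start, start + rows*cols*step, step).reshape(shape)

def block_reshape(a, block=(1, 30)):
    # carve a 2D array into (row blocks, col blocks, block rows, block cols)
    r, c = a.shape
    br, bc = block
    return a.reshape(r//br, br, c//bc, bc).swapaxes(1, 2)

a = int_range((5, 360), 1, 1)         # rows, columns, start, step
b = block_reshape(a, block=(1, 30))   # b.shape -> (5, 12, 1, 30)
```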

 

---- Example 4 ----

1.  Input data.... subdivide the data into 1x30 blocks... each 360 day year, by 30 days per month

[[   1    2    3 ...,  358  359  360]
 [ 361  362  363 ...,  718  719  720]
 [ 721  722  723 ..., 1078 1079 1080]
 [1081 1082 1083 ..., 1438 1439 1440]
 [1441 1442 1443 ..., 1798 1799 1800]]

 

2.  Now imagine what that would look like if subdivided

3.  maximum of each 1x30 block in each row

[[  30   60   90  120  150  180  210  240  270  300  330  360]
 [ 390  420  450  480  510  540  570  600  630  660  690  720]
 [ 750  780  810  840  870  900  930  960  990 1020 1050 1080]
 [1110 1140 1170 1200 1230 1260 1290 1320 1350 1380 1410 1440]
 [1470 1500 1530 1560 1590 1620 1650 1680 1710 1740 1770 1800]]

 

4.  maximum of (3) by row

[ 360  720 1080 1440 1800]

 

---- Example 5 ----

5.  Let's do the sum of each 1x30 block and determine the row max

6. sums of each 1x30 block in each row

[[  465  1365  2265  3165  4065  4965  5865  6765  7665  8565  9465 10365]
 [11265 12165 13065 13965 14865 15765 16665 17565 18465 19365 20265 21165]
 [22065 22965 23865 24765 25665 26565 27465 28365 29265 30165 31065 31965]
 [32865 33765 34665 35565 36465 37365 38265 39165 40065 40965 41865 42765]
 [43665 44565 45465 46365 47265 48165 49065 49965 50865 51765 52665 53565]]

7. maximum of (6) by row

[10365 21165 31965 42765 53565]
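The same monthly results fall out of a single reshape, without any helper functions; a sketch:

```python
import numpy as np

days = np.arange(1, 1801).reshape(5, 360)        # 5 years of 360-day years
month_sum = days.reshape(5, 12, 30).sum(axis=2)  # 1x30 block sums: 12 months per year
year_max = month_sum.max(axis=1)                 # step 7: maximum monthly sum per year
```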

 

Summary:

Still not convinced?  Well what is the cumulative sum of 1 to 30 inclusive:

[  1,   3,   6,  10,  15,  21,  28,  36,  45,  55,  66,  78,  91, 105, 120, 136,
    153, 171, 190, 210, 231, 253, 276, 300, 325, 351, 378, 406, 435, 465 ]

How about the next 30 days:

[  31,   63,   96,  130,  165,  201,  238,  276,  315,  355,  396,  438,  481,  525,  570,
   616,  663,  711,  760,  810,  861,  913,  966, 1020, 1075, 1131, 1188, 1246, 1305, 1365]

Now notice the first 2 entries in (6) above... 465 and 1365

 

---- What is the magic??? -----

>>> a.shape   # 5 years, 360 days in a year

(5, 360)

>>> b.shape   # 5 years, 12 months, 30 days per 1 month

(5, 12, 1, 30)

>>> c.shape   # reshape by month

(5, 12)

>>> d.shape   # number of years

(5,)

>>> c1.shape  # reshape the 5 years by month and sum each month, take the max

(5, 12)

>>> d1.shape  # sums for the 5 years, take the yearly max

(5,)

>>>

 

And finally in code

 

let's make up 5 years of data where it rains 1mm per day, day in-day out
>>> # here we go
>>> x = np.ones((5, 360), dtype='int32')
>>> x.shape
(5, 360)
>>> x.min(), x.max()
(1, 1)
>>> # so far so good
>>>
>>> x1 = block_reshape(x, block=(1, 30)) # produce the monthly blocks
>>> x2 = ((x1.T).max(axis=0))            # start reorganizing
>>> x2 = (x2.T).squeeze()                # more reorganizing
>>> x3 = x2.max(axis=1)                  # get the maximum per year
>>> #
>>> x4 =((x1.T).sum(axis=0))             # do the sums
>>> x4 = (x4.T).squeeze()                # finish reorganizing
>>> x5 = x4.max(axis=1)                  # and finish up with the maximums
>>>
>>> x   # the daily precipitation, for 360 days for 5 years
array([[1, 1, 1, ..., 1, 1, 1],
       [1, 1, 1, ..., 1, 1, 1],
       [1, 1, 1, ..., 1, 1, 1],
       [1, 1, 1, ..., 1, 1, 1],
       [1, 1, 1, ..., 1, 1, 1]])
>>> x2  # The monthly maximum for 5 years (12 months for 5 years)
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
>>> x3  # the yearly summary for the maximums for 5 years
array([1, 1, 1, 1, 1])
>>> x4  # the monthly sums  12 months for 5 years
array([[30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
       [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
       [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
       [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
       [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]])
>>> x5  # the maximum monthly sum for each year
array([30, 30, 30, 30, 30])
>>>

 

That's all for now

I have made reference before to using NumPy and arcpy together to perform particular tasks.  Sometimes the combination of the two enables the user to do things they wouldn't otherwise be able to do.  This ditty presents a very simple way to sort a dataset and produce a permanently sorted version, while maintaining the original feature identification numbers.  To begin, I will remind/introduce the reader to the arcpy data access module's functions, as well as comparable ones in NumPy.

 

On the arcmap side, the user should become familiar with:

  • arcpy.Describe
  • arcpy.Sort
  • arcpy.da.FeatureClassToNumPyArray and
  • arcpy.da.NumPyArrayToFeatureClass. 

From the  numpy side:

  • np.set_printoptions
  • np.arange and
  • np.sort

The latter performs the same work as arcpy.Sort, but it doesn't have any license restrictions as the Sort tool does.  It also allows you to produce a sequential list of numbers like the oft used Autoincrement field calculator function.
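The NumPy side of that sort is worth seeing on its own; a small sketch with a made-up structured array:

```python
import numpy as np

dt = [('X', '<f8'), ('Y', '<f8')]
pnts = np.array([(2., 1.), (1., 3.), (1., 1.)], dtype=dt)
srt = np.sort(pnts, order=['X', 'Y'])   # sort on X, ties broken by Y
rank = np.arange(srt.size)              # sequential ids, like Autoincrement
```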

 

The script will perform the following tasks:

  • import the required modules,
  • specify the input file,
  • the spatial reference of the input,
  • the fields used to sort on, in the order of priority,
  • a field to place the rank values, and
  • specify the output file.

 

One can add the field to place the ranks into ahead of time or add a line of code to do it within the script (see arcpy.AddField_management).

 

# imports ......

import numpy as np
import arcpy

np.set_printoptions(edgeitems=4,linewidth=75,precision=2,
                    suppress=True,threshold=10)

arcpy.env.overwriteOutput = True

# inputs  ......

input_shp = r'c:\!Scripts\Shapefiles\RandomPnts.shp'
all_fields = [fld.name for fld in arcpy.ListFields(input_shp)]
SR = arcpy.Describe(input_shp).spatialReference

sort_fields = ['X','Y']
order_field = 'Sort_id'

output_shp = r'c:\!Scripts\Shapefiles\SortedPnts.shp'

# outputs ......

arr = arcpy.da.FeatureClassToNumPyArray(input_shp, "*", spatial_reference=SR)
arr_sort = np.sort(arr, order=sort_fields)

arr_sort['Sort_id'] = np.arange(arr_sort.size)

arcpy.da.NumPyArrayToFeatureClass(arr_sort, output_shp, ['Shape'], SR)


# summary .....

print('\nInput file: {}\nField list: {}'.format(input_shp,all_fields))

print('Sort by: {},  ranks to: {}'.format(sort_fields, order_field))

 

result

 

Input file: c:\!Scripts\Shapefiles\RandomPnts.shp

Field list: [u'FID', u'Shape', u'Id', u'X', u'Y', u'Sort_id']

Sort by: ['X', 'Y'],  ranks to: Sort_id

 

Sort_np_arc.png 

Give it a try with your own files.

It is really a pain that certain highly used functions are only available at the advanced license level.  This is an alternative to using Excel to produce a pivot table from ArcMap tabular data.

 

Flipping rows and columns in data generally works smoothly when the table contains one data type, whether it be integer, float or text.  Problems arise in Excel because it lets you add data without regard to what is already there, so columns (and rows) end up with mixed data types.  Uniformity by column is the rule.

 

In NumPy, each column has a particular data type.  The data type controls the operations that can be performed on it.  Numeric fields can have all the number type operations used...similarly for string/text fields.  It is possible to cast a field as an "object" type allowing for mixed type entries.  The nice thing about this type, is that you can't really do anything with it unless it is recast into a more useful form...but it does serve as a conduit to other programs or just for presentation purposes.

 

In the following example, the line

a = # your array goes here

can be derived using

a = arcpy.da.FeatureClassToNumPyArray(....)   # see the FeatureClassToNumPyArray help

 

The nature of np.unique changed in version 1.9, so it can return the number of occurrences of each unique value as well.  If you are using ArcGIS Pro, you can use the newer version if desired, by simply changing the np.unique call as shown below.

 

a_u, idx, counts = np.unique(a_s, return_inverse=True, return_counts=True)

 

Array conversion to summary table or pivot table

Input and output

Well... who needs an advanced license or excel ...

Assume we have an array of the format shown in the Input section.  We can determine the counts or sums of unique values in a field, using the following.

  • sort the array on a field,
  • get unique values in that field,
  • sum using the values in another field as weights
  • rotate if desired
    import numpy as np
    a = # your array goes here
    a_s = a[np.argsort(a, order="Class")]
    a_u, idx = np.unique(a_s["Class"], return_inverse=True)
    bin_cnt = np.bincount(idx, weights=a_s['Count'])
    ans = np.array((a_u, bin_cnt), dtype='object')
    print("a_u\n{}\nidx {}\nanswer\n{}".format(a_u, idx, ans))
    rot90 = np.rot90(ans, k=1)
    and_flipud = np.flipud(rot90)  # np.flipud(np.rot90(ans, k=1))
    frmt = "pivot table... rotate 90, flip up/down\n{}"
    print(frmt.format(and_flipud))

 

The trick is to set the data type to 'object'. You just use FeatureClassToNumPyArray or TableToNumPyArray and their inverses to get to/from array format.  Ergo....pivot table should NOT be just for an advanced license
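Here is the same unique/bincount summary as a self-contained run, with the Class and Count columns typed in to match the input table below:

```python
import numpy as np

dt = [('Class', 'U5'), ('Count', '<i4')]
a = np.array([('a', 10), ('c', 1), ('b', 2), ('a', 5), ('a', 5),
              ('b', 2), ('a', 10), ('b', 2), ('c', 1), ('a', 10)], dtype=dt)
a_s = np.sort(a, order='Class')                 # sort on the Class field
a_u, idx = np.unique(a_s['Class'], return_inverse=True)
sums = np.bincount(idx, weights=a_s['Count'])   # Count summed by Class
```

This reproduces the sums shown in the results: a -> 40, b -> 6, c -> 2.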

For all-ish combos, you can just add the desired lines to the above

 

for i in range(4):
    print("\nrotated {}\n{}".format(90*i, np.rot90(a, k=i)))
for i in range(4):
    f = "\nrotate by {} and flip up/down\n{}"
    print(f.format(90*i, np.flipud(np.rot90(a, k=i))))
for i in range(4):
    f = "\nrotate by {} and flip left/right\n{}"
    print(f.format(90*i, np.fliplr(np.rot90(a, k=i))))

Input table with the following fields

'ID', 'X', 'Y', 'Class', 'Count'

 

>>> input array...
[(0, 6.0, 0.0, 'a', 10)
 (1, 7.0, 9.0, 'c', 1)
 (2, 8.0, 6.0, 'b', 2)
 (3, 3.0, 2.0, 'a', 5)
 (4, 6.0, 0.0, 'a', 5)
 (5, 2.0, 5.0, 'b', 2)
 (6, 3.0, 2.0, 'a', 10)
 (7, 8.0, 6.0, 'b', 2)
 (8, 7.0, 9.0, 'c', 1)
 (9, 6.0, 0.0, 'a', 10)]

>>> # the results

a_u  # unique values
['a' 'b' 'c']
idx [0 0 0 0 0 1 1 1 2 2]

answer # the sums
[['a' 'b' 'c']
[40.0 6.0 2.0]]

pivot table... rotate 90, flip up/down
[['a' 40.0]
['b' 6.0]
['c' 2.0]]