
Py... blog


You have 12 months of data in some raster form... You want some statistical parameter... There are areas of nodata, the extents are all the same... and ... you have a gazillion of these to do.

Sounds like you have a 'cube'... the original 'space-time cube'.  You can pull space from a time slice... you can slice time through space.  At every location on a 'grid' you have Z as a sequence over time.

 

Here is the code.  I will use ascii files, but they don't have to be ascii... you just need to prep your files before you use them.

 

Source data originally from .... here .... thanks Xander Bakker.

 

import os
import numpy as np
from textwrap import dedent, indent
import arcpy

arcpy.env.overwriteOutput = True

ft = {'bool': lambda x: repr(x.astype('int32')),
      'float': '{: 0.3f}'.format}
np.set_printoptions(edgeitems=3, linewidth=80, precision=2, suppress=True,
                    threshold=80, formatter=ft)
np.ma.masked_print_option.set_display('-')  # change to a single -

# ---- Processing temporal ascii files ----
# Header information
# ncols 720
# nrows 360
# xllcorner -180
# yllcorner -90
# cellsize 0.5
# NODATA_Value -9999
# ---- Begin the process ----
#
cols = 720
rows = 360
ll_corner =  arcpy.Point(-180., -90.0)  # to position the bottom left corner
dx_dy = 0.5
nodata = '-9999'
#
# ---- create the basic workspace parameters, create masked arrays ----
#
out_file = r'c:\Data\ascii_samples\avg_yr.tif'
folder = r'C:\Data\ascii_samples'
arcpy.env.workspace = folder
ascii_files = arcpy.ListFiles("*.asc")
a_s = [os.path.join(folder, i) for i in ascii_files]
arrays = []
for arr in a_s:
    a = np.mafromtxt(arr, dtype='int', comments='#',
                     delimiter=' ', skip_header=6,
                     missing_values=nodata, usemask=True)
    arrays.append(a)
#
# ---- A sample calculation from the inputs.... calculate the mean ----
#
N = len(arrays)                    # number of months... in this case
arr_shp = (N,) + arrays[0].shape   # we want a (month, col, row) array
msk = arrays[0].mask               # clone the mask... assume they are the same
e = np.zeros(arr_shp, 'int')       # one way is create a zeros array and fill
for i in range(len(arrays)):
    e[i] = arrays[i]
a = np.ma.array(e, mask=np.tile(msk, (N, 1, 1)))  # replicate the 2D mask for each month
#
# ---- your need here... ie. Calculate a mean ----
#
avg = a.mean(axis=0)  # calculate the average down through the months
#
# ---- send it out to be viewed in ArcMap or ArcGIS Pro ----
#
value_to_nodata = int(avg.get_fill_value())
out = avg.view(np.ndarray, fill_value=value_to_nodata)
g = arcpy.NumPyArrayToRaster(out, ll_corner, dx_dy, dx_dy)
g.save(out_file)

So the basic process is simple.  I have coded this verbosely and read the input parameters manually from the ascii header, since that is the quickest way... and you would probably know what they are from the start.
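The core of the process... stack the months, then take a masked mean... can be sanity-checked without arcpy on toy data.  A minimal sketch (the grids and the nodata placement are invented):

```python
import numpy as np

# three tiny 2x2 'monthly' grids... -9999 marks nodata in the same cell each month
nodata = -9999
months = [np.array([[1, nodata], [3, 4]]),
          np.array([[3, nodata], [5, 6]]),
          np.array([[5, nodata], [7, 8]])]

# stack into a (month, row, col) cube and mask the nodata cells
cube = np.ma.masked_equal(np.stack(months), nodata)

# average down through the months... masked cells stay masked in the result
avg = cube.mean(axis=0)
avg_filled = avg.filled(nodata)  # the always-nodata cell comes back out as nodata
```

The same idea scales up to the 12 ascii grids above... swap the toy arrays for the mafromtxt results.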

 

So in this example... 12 months of some variable, averaged while accounting for the nodata cells.  Do the map stuff: define its projection... project it, give it some symbology and move on.

I will leave that for those that make maps.

Modify to suit... maybe I will get this into a toolbox someday

 

 

NOTE.....

 

Now in the linked example, there was a need to simply convert those inputs to rasters.  In that case you would consolidate the salient portions of the script as follows and create the output rasters within the masked-array creation loop.

......
for arr in a_s:
    a = np.mafromtxt(arr, dtype='int', comments='#',
                     delimiter=' ', skip_header=6,
                     missing_values=nodata, usemask=True)
    value_to_nodata = int(a.get_fill_value())
    out = a.view(np.ndarray, fill_value=value_to_nodata)
    r = arcpy.NumPyArrayToRaster(out, ll_corner, dx_dy, dx_dy)
    out_file = arr.replace(".asc", ".tif")
    r.save(out_file)
    del r, a

.....

 

So for either doing statistical calculations for temporal data, or for format conversion... there are options available where arcpy and numpy play nice.

Geometry...

That is what I am normally interested in.  The baggage (aka, attributes) tag along for the ride and I normally find it easier to keep the baggage separate until I am done with the geometry. For those following along, see my previous post.

 

Let us compare some of the ways that we can pull the geometry out of a featureclass.  The following demonstrations can be followed in your own workflow for testing your specific cases.

 

Imports first ___________________________________________________________________________________

import sys
import numpy as np
import arcpy
import arcgis

_______________________________________________________________________________________________

I prefer to import modules least-polluting first, so that if there is any conflict during import, the namespace is grabbed by what I want, in order of importance.

 

Searchcursor __________________________________________________________________________________

# ---- pick a featureclass ----
fc0 = r'drive:\your_spaceless_folder_path\your_geodatabase.gdb\polygon'

# ---- using arcpy.Describe ----
desc = arcpy.Describe(fc0)
shp = desc.shapeFieldName    # ---- using dot notation ----

# ---- using arcpy.da.Describe ----
desc = arcpy.da.Describe(fc0)
shp = desc['shapeFieldName'] # ---- note... now a dictionary ----

# ---- geometry, as X, Y ----
flds = [shp + "@"]
shps = [row[0] for row in arcpy.da.SearchCursor(fc0, flds)]

_______________________________________________________________________________________________

 

Which of course you can begin to use to get basic properties.  For example:

Geometry properties  ____________________________________________________________________________

for s in shps:
    print(s.type, s.partCount, s.pointCount, s.length, s.area)
polygon 1 5 40.0 100.0
polygon 1 10 64.0 64.0
polygon 2 14 99.41640786499875 182.0

_______________________________________________________________________________________________

 

Then you can get to work and convert to a NumPy array quickly and simply.  Since I know this is a polygon featureclass, it only takes a couple of lines to perform the task.

 

Get to the point(s) ______________________________________________________________________________

pnts = []
for shp in shps:
    for j in range(shp.partCount):
        pt = shp.getPart(j)
        p_list = [(pnt.X, pnt.Y) for pnt in pt if pnt]
        pnts.extend(p_list)
dt = [('Xs', '<f8'), ('Ys', '<f8')]
a = np.asarray(pnts, dtype=dt)

_______________________________________________________________________________________________

The basic difference between the array forms is how you want to work with them.

In the example above array 'a' has a specified data type (dtype).  The fields/columns of the array can be accessed by name (Xs and Ys).  Since this array is a structured array, the access would be a['Xs'] and a['Ys'].  If I converted this to a record array, one could use object.field notation.
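A small illustration of the two access styles (the coordinates are invented):

```python
import numpy as np

pnts = [(300020.0, 5000000.0), (300010.0, 5000010.0)]
dt = [('Xs', '<f8'), ('Ys', '<f8')]

a = np.asarray(pnts, dtype=dt)   # structured array... access fields by name
xs = a['Xs']

r = a.view(np.recarray)          # record array... object.field notation works
ys = r.Ys
```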

 

The coordinates are nothing more than the same type of number, so the field names can be dispensed with altogether.  In this case, the sentient being is responsible for knowing what they are working with.  Both forms of the same data are shown below.

_______________________________________________________________________________________________

a  # a structured array
array([(300020.0, 5000000.0), (300010.0, 5000000.0), (300010.0, 5000010.0), ...,
       (300002.0, 5000002.0), (300008.0, 5000002.0), (300005.0, 5000008.0)],
      dtype=[('Xs', '<f8'), ('Ys', '<f8')])

a_nd = np.asarray(pnts)  # as an np.ndarray
array([[ 300020.00,  5000000.00],
       [ 300010.00,  5000000.00],
       [ 300010.00,  5000010.00],
       ...,
       [ 300002.00,  5000002.00],
       [ 300008.00,  5000002.00],
       [ 300005.00,  5000008.00]])
a_nd.dtype
dtype('float64')

_______________________________________________________________________________________________

 

The following demo function can be used on your data to examine some of the options and explore some of the properties and methods available to each.  Just don't forget the imports at the start of the post.

 

The demo  _____________________________________________________________________________________

def _demo(in_fc):
    """Run the demo and return some objects
    : create a SpatialDataFrame class object from a featureclass
    : create a record array from a_sdf
    : create a numpy.ndarray using the da module
    """

    SR = arcpy.Describe(in_fc).spatialReference

    sdf = arcgis.SpatialDataFrame
    a_sdf = sdf.from_featureclass(in_fc,
                                  sql_clause=None,
                                  where_clause=None,
                                  sr=SR)
    a_rec = sdf.to_records(a_sdf)  # record array
    a_np = arcpy.da.FeatureClassToNumPyArray(in_fc,
                                             field_names="*",
                                             spatial_reference=SR,
                                             explode_to_points=True)
    a_np2 = fc_g(in_fc)  # a geometry-only helper based on the cursor approach above (not shown)
    return sdf, a_sdf, a_rec, a_np, a_np2

 

Now... behind all the magic of the above options, a searchcursor is doing the acquisition in every case.  The options do, however, provide access to methods and properties which are unique to their data class.  In many instances these are shared.

 

Here is what they 'look like'.  In the case of the SpatialDataFrame (a_sdf), the same data converted to a record array, and the conventional da.FeatureClassToNumPyArray... all fields were included.  In the last option, a_np2, just the geometry is returned as demonstrated in the above examples, with the addition of handling the geometry parts and points when the polygon is 'exploded' to its constituent points.

 

a_sdf
Out[40]:
   OBJECTID  Id Text_fld                                              SHAPE
0         1   1     None  {'rings': [[[300020, 5000000], [300010, 500000...
1         2   2        C  {'rings': [[[300010, 5000020], [300010, 500001...
2         3   3        A  {'rings': [[[300010, 5000010], [300010, 500002...

a_rec
Out[41]:
rec.array([ (0, 1, 1, None, {"rings": [[[300020, 5000000], [300010, 5000000], [300010, 5000010], [300020, 5000010], [300020, 5000000]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(1, 2, 2, 'C', {"rings": [[[300010, 5000020], [300010, 5000010], [300000, 5000010], [300000, 5000020], [300010, 5000020]], [[300002, 5000018], [300002, 5000012], [300008, 5000012], [300008, 5000018], [300002, 5000018]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(2, 3, 3, 'A', {"rings": [[[300010, 5000010], [300010, 5000020], [300020, 5000020], [300020, 5000010], [300010, 5000010]], [[300010, 5000010], [300010, 5000000], [300000, 5000000], [300000, 5000010], [300010, 5000010]], [[300005, 5000008], [300002, 5000002], [300008, 5000002], [300005, 5000008]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}})],
          dtype=[('index', '<i8'), ('OBJECTID', '<i8'), ('Id', '<i8'), ('Text_fld', 'O'), ('SHAPE', 'O')])

a_np
Out[42]:
array([(1, [300020.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000010.0], 1, 'None', 40.0, 100.0), ...,
       (3, [300002.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300008.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300005.0, 5000008.0], 3, 'A', 99.41640786499875, 182.0)],
      dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)), ('Id', '<i4'), ('Text_fld', '<U255'), ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])

a_np2
Out[43]:
array([(1, 0, 300020.0, 5000000.0), (1, 0, 300010.0, 5000000.0),
       (1, 0, 300010.0, 5000010.0), ..., (3, 1, 300002.0, 5000002.0),
       (3, 1, 300008.0, 5000002.0), (3, 1, 300005.0, 5000008.0)],
      dtype=[('ID_num', '<i4'), ('Part_num', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')])

 

When the SpatialDataFrame is converted to a numpy record array, the geometry field (Shape) has a dtype of 'O', which is an object array.  This will be the 'normal' case since it is unlikely that all polygons will contain the same number of parts and points per part.  To truly be a numpy array, the 'shape' of the array needs to be consistent.  
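A toy case shows why (the ring coordinates are invented): rings with different point counts can only live in an object array, whereas exploding to points restores a regular shape.

```python
import numpy as np

# two 'polygons' with different point counts... ragged data
ring_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 0.0)]
ring_b = [(0.0, 0.0), (5.0, 5.0), (0.0, 0.0)]

geoms = np.empty(2, dtype='O')   # only an object array can hold both as-is
geoms[0], geoms[1] = ring_a, ring_b

# 'exploding' to points yields a consistent (N, 2) float array
pnts = np.array(ring_a + ring_b)
```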

 

The latter two representations (a_np and a_np2) deal with this by converting the polygons to points.  These points can be converted back to polygons after they are used in some process, such as moving, calculating parameters or reprojecting.

 

Next installment... working with geometry in its various representations.

ArcGIS API for Python  ... version 1.2.0

Another cleverly named product to provide more clarity to the other named kindred... ArcGIS for Desktop, ArcGIS Pro, Arc... etcetera.  The link to the help is here.  The ability to work with Jupyter notebooks, NumPy, SciPy and Arcpy is touted and welcomed (...and there is something about web mapping and stuff as well).

 

Where stuff is

Locating ArcGIS in your installation path depends on how you installed it... for a single user (aka no sharesies) or for anyone.  This describes the single-user situation.

To begin, import the module and check its __path__ and __file__ properties.  My installation path is in an ArcPro folder that I created... yours will differ, but beyond the base folder everything should be the same.

 

Basic import______________________________________________________________________

In [1]: import arcgis
In [2]: arcgis.__path__
Out[2]: ['C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgis']

In [3]: arcgis.__file__

Out[3]: 'C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgis\\__init__.py'

_________________________________________________________________________________

 

The __init__.py file is read during the import and a whole load of imports are done to cover access to pretty well all available modules contained in the ArcGIS path.

 

If you only want to work with the geometry or Pandas related functionality, you can import the SpatialDataFrame class directly.

 

SpatialDataFrame___________________________________________________________________

In [1]: from arcgis import SpatialDataFrame as sdf

# ---- or ----

In [1]: from arcgis.features._data.geodataset import SpatialDataFrame as sdf

In [2]: sdf.__module__  # to confirm the path

Out[2]: 'arcgis.features._data.geodataset.geodataframe'

_________________________________________________________________________________

 

The SpatialDataFrame class is pretty well all that is in the geodataframe.py script.

Another useful pair of calls within that class are from_featureclass and to_featureclass, which can be obtained by a variety of other imports or by a direct call to the io module's fileops.py.

 

Featureclass access________________________________________________________________

In [3]: from arcgis.features._data.geodataset.io import from_featureclass, to_featureclass

# ---- or ----

In [4]: from arcgis.features._data.geodataset import io

_________________________________________________________________________________

 

If you prefer the module.function method of calling, the option in ln [4] can be used to convert a featureclass to a SpatialDataFrame.

 

The Reveal________________________________________________________________________

In [5]: fc0 = r'drive:\your_space_free_path\your_geodatabase.gdb\polygon'

In [6]: a = io.from_featureclass(fc0)

In [7]: print("\nSpatialDataFrame 'a' \n{}".format(a))  # ---- the sdf contents ----

 

SpatialDataFrame 'a'
   OBJECTID  Id Text_fld                                              SHAPE
0         1   1     None  {'rings': [[[300020, 5000000], [300010, 500000...
1         2   2        C  {'rings': [[[300010, 5000020], [300010, 500001...
2         3   3        A  {'rings': [[[300010, 5000010], [300010, 500002...

In [8]: type(a)

Out[8]: <class 'arcgis.features._data.geodataset.geodataframe.SpatialDataFrame'>

_________________________________________________________________________________

 

Behind the Scenes

Great so far... BUT when you delve deeper into what is going on under the hood, you will find out that the from_featureclass method...

  1. imports the SpatialDataFrame class
  2. uses arcpy.da.SearchCursor to convert the featureclass into lists of
    1. values (non-geometry fields)
    2. geometry (geometry field)
  3. the value fields are then converted to a pandas dataframe (pd.DataFrame)
  4. the dataframe in step 3 is then used with the geometry values to produce a SpatialDataFrame modelled after a Pandas dataframe.
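Schematically, those steps amount to the following (a plain-Python stand-in... the cursor rows and field names are invented, and a dict stands in for the pandas objects):

```python
# pretend these rows came from arcpy.da.SearchCursor(fc, ['OBJECTID', 'Text_fld', 'SHAPE@'])
rows = [(1, 'None', {'rings': [[[300020, 5000000], [300010, 5000000]]]}),
        (2, 'C',    {'rings': [[[300010, 5000020], [300010, 5000010]]]})]

# step 2... split the cursor output into value fields and geometry
values = [row[:-1] for row in rows]
geometry = [row[-1] for row in rows]

# steps 3 and 4... the values become a dataframe and the geometry column is attached
frame = {'OBJECTID': [v[0] for v in values],
         'Text_fld': [v[1] for v in values],
         'SHAPE': geometry}
```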

 

Final Comments

So... in conclusion, an arcpy.da searchcursor is used to get all the necessary geometry and attribute data, which ultimately ends up in a SpatialDataFrame... which is like a Pandas dataframe... but geometry enabled.

Sort of like geopandas... (geopandas on GitHub) but relying on arc* functionality and code for the most part.

 

More to Come

There will be more posts as I have time... the next post... Geometry.... part II

Requirements:  You need to have ArcGIS Pro installed and the associated installation of jupyter notebooks as described in previous blogs. 

 

This is a novel way of presenting all things python, arcpy and eventually through the Python API for ArcGIS Pro...

 

Background link   Arcgis Pro 2... Jupyter Notebook Setup Basics

 

The link here...distance.pdf .... should open up the explanatory pdf for the accompanying jupyter notebook.  If not... click the attachment below.

 

Unfortunately, Jive isn't at the stage to allow for this type of interactive content.

 

If you have comments, questions or found a bug, send them to me via email.

In the last blog post... ArcGIS Pro 2... Creating Desktop Shortcuts .. I showed how to create desktop shortcuts to some of the more commonly used applications within esri's distribution of python.  In this post, the Jupyter Notebook will be addressed.

 

Before you start on this venture, make sure that you have read up on notebooks and see if they are for you and your workflow and not something that '... everyone is doing it, and so should I.... '.  They do have their place, but they require maintenance.

 

The first order of business.... 

  • Create a folder location to store your notebooks... it is just way easier to keep them in one location and distribute them from there.  I chose to put them in my local machine's GitHub folder, in a separate folder within. ... C:\GitDan\JupyterNoteBooks ... pretty clever ehhh?
  • Right-click on the file ---- "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/jupyter-notebook-script.py" ---- and select create shortcut  which will simply create a shortcut to that file.
    • Go to the Shortcut tab and edit the Target line...
    • put ---- C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe ---- in front of what is there... this yields the whole path to python in the Pro distribution, plus the script to run (yes, the paths are that long and cryptic).
    • C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/jupyter-notebook-script.py"
  • In the Start in: line, put the path to the folder that you are going to house your notebooks.  In my example, this was the folder ---- C:\Git_Dan\JupyterNoteBooks ----
  • Finally, right-click on the shortcut, select Copy, flip over to your Desktop and Paste it there.  
  • yes... I know you can go to the command interface and run the command line from there, but why bother.  You can also use Anaconda Navigator in other non-ArcGIS Pro environments, but the installation and setup of that application within the Pro environment isn't worth the extra effort.

 

 

Python runs that script, which imports notebook.notebookapp as shown below.  That import does the work of setting up notebooks to work in your target folder.

""" Source code of ---- jupyter-notebook-script.py ---- located in
:  _drive_:\_folder_\bin\Python\envs\arcgispro-py3\Scripts ... where _drive_ and _folder_
:  are unique to your machine... for example
:  ---- C:\ArcPro\bin\Python\envs\arcgispro-py3\Scripts ----
"""

if __name__ == '__main__':
    import sys
    import notebook.notebookapp

    sys.exit(notebook.notebookapp.main())

 

The details aren't really relevant... just click on the shortcut, create your notebook and begin sharing... but before you do that, begin your reading here.

 

And from a recent post... perhaps the future might look like this...

 

Here are a few tips for creating desktop shortcuts to make working with Pro a tad easier.

Installation of packages is given in pictorial form at the end of the blog (I sort of forgot... )

 

I like to have stuff easily accessed. Desktop icons work for me. Here is an example of having the pro.bat, Spyder and Jupyter IDE icons created and organized in one spot.

 

 

Launch File Explorer and find your installation folder. In my case, I installed Pro in a folder with the same name.  The rest of your folder structure will be the same after this point. Navigate to the jupyter-qtconsole-script.py script within the path (see 1 and 2 in the figure)

 

 

Once the properties dialog is open, click on Shortcut, then ensure your path emulates the example below.  It consists of two parts... the first is the location of python.exe within the installation path... the second is the aforementioned script which is run and noted above.  I make sure that the Start in: is specified as well.

 

 

 

If you work on your machine solely, you can click on the Advanced button in the previous dialog and toggle off the Run as administrator checkbox.  This will just avoid you having to answer a security question when the shortcut is used.

 

And now you are done.

 

I created one for proenv.bat as well as Spyder for ArcGIS Pro since I use different versions of Spyder and python and separate conda distributions, so I keep different shortcuts.

Here are the equivalent options for the Spyder shortcut.

Target...  (pythonw.exe location and the spyder-script.py location)

         C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/spyder-script.py"

 Starts in....
     C:\ArcPro\bin\Python\envs\arcgispro-py3\Scripts
Now you can fire up that puppy and put it to some useless work.
So that is the Jupyter QtConsole... on to Spyder and Pythonwin later....
Enjoy
----------------------------------------------------------------------------------------------------------------------------------
A pictorial guide to installing packages in Pro
Get to the package manager
To get there, look south-west
Install, update and add.  If you miss anything, just go back and add it.
That is about it.
Dan_Patterson

Circling the point

Posted by Dan_Patterson Champion Feb 21, 2017

Again, points and directions have arisen.  This time, ...in this post..., there was a need to place points within a polygon for geocoding based on compass directions.  The simplest solution turns out to be to create a centroid, then generate points about it at some distance and angle.

 

So the solution can be approached using search cursors, but as usual I like numpy, since you can vectorize most operations.  The code below reads in the centroid points of an input polygon theme (r"...\fishnet_label.shp")... in this case the centroids of a fishnet, since they are easy to create.  From there, the steps are simple:

 

  • convert the input point file to an array (FeatureClassToNumPyArray)
  • collect the point coordinates that fall on a circle at a distance and angle from the centroid (the _circle def)
  • stack the points and convert back to a point file (NumPyArrayToFeatureClass)

 

Of course, you can put the angles in the table, provide sub-identifiers and even the compass representation in text form.  The points can also be created with random angles and distances with little code modification.  A couple of examples are below, followed by the code.  See Darren's example in the link for a cursor approach.

 

points around circle .... points around circle 2

-------------------------------------------------------------------

And the code

"""
:Script:  createcompass.py
:Author:  Dan_Patterson@carleton.ca
: if north angle is needed, you can use this to convert
:    ang = np.mod((450.0 - ang), 360.)
"""

import numpy as np
import arcpy


def _circle(radius=10, theta=22.5, xc=0.0, yc=0.0):
    """Produce points around a circle.
    :  radius - distance from centre
    :  theta - either a single value to form angles about a circle or
    :        - a list or tuple of the desired angles
    """

    if isinstance(theta, (list, tuple)):
        angles = np.deg2rad(np.array(theta))
    else:
        angles = np.deg2rad(np.arange(180.0, -180.0-theta, step=-theta))
    x_s = radius*np.cos(angles) + xc    # X values
    y_s = radius*np.sin(angles) + yc    # Y values
    pnts = np.c_[x_s, y_s]
    return pnts


# --------------------------------------------------------------------
if __name__ == '__main__':
    """produce some points around a centroid at angles and distances"""

    inFC = r"C:\Data\points\fishnet_label.shp"
    outFC = r"C:\Data\points\pnts2.shp"
    radius = 2
    theta = 22.5  # or a list like... theta = [0, 90, 180, 270]
    a = arcpy.da.FeatureClassToNumPyArray(inFC, ["SHAPE@X", "SHAPE@Y"])
    b = [_circle(radius, theta, a['SHAPE@X'][i], a['SHAPE@Y'][i])
         for i in range(len(a))]
    c = np.vstack(b)
    c.dtype = a.dtype
    arcpy.da.NumPyArrayToFeatureClass(c, outFC, c.dtype.names)
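A quick pure-NumPy check of the circle math (no arcpy needed)... every generated point should sit exactly radius units from the centre.  This repeats the _circle logic in miniature rather than importing it:

```python
import numpy as np

def circle_pnts(radius=10, theta=22.5, xc=0.0, yc=0.0):
    # angles stepped clockwise from 180 degrees, as in _circle above
    angles = np.deg2rad(np.arange(180.0, -180.0, -theta))
    return np.c_[radius * np.cos(angles) + xc, radius * np.sin(angles) + yc]

p = circle_pnts(radius=2, theta=90.0, xc=5.0, yc=5.0)    # 4 points about (5, 5)
d = np.sqrt((p[:, 0] - 5.0)**2 + (p[:, 1] - 5.0)**2)     # distances back to the centre
```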

 

See other posts about circles and related things.

circles sectors rings buffers and the n-gons

New ... Now that python 3.x is 8 years old, it became obvious that Continuum would update their package distribution

I have just started playing with non-Arc*ish stuff for now, since only the 3.4 distribution is officially supported in ArcGIS PRO.  I was pleasantly surprised to see one of my old-time favourite IDEs back ... Pythonwin ... by Mark Hammond.

 

Mix a little Spyder, 'win and Jupyter across multiple screens working on separate things and they all behaved nice.

I will post more as I find it... but for now, here they are... (pycharm and others will join the play group when I have time).

 

anaconda for python 3.6

 

And a closer look at Spyder

Check it out and compare anaconda package documentation

:-----------------------------------------------------------------------------------------

Matplotlib

Yes... version 2.0.0 is now here.  check ... http://matplotlib.org/ for more information 

It is part of the new distribution and enables you to work with the 'classic' style (sic status quo) or the new style.

Dan_Patterson

Spanning trees

Posted by Dan_Patterson Champion Jan 31, 2017

Connecting the dots.  Spanning trees have a wide variety of applications in GIS... or did.  I have been finishing writings on distance and geometry and I almost forgot this one.  The principle is simple: connect the dots so that no line (edge) overlaps, but only connects at a point.  In practice, you want the tree to produce a minimal length/distance output.  There are a variety of named forms, each with their subtle nuance and application, but my favorite is Prim's... only because I sort of understand it.  I am not sure if this is his implementation exactly, but it is close enough.  So I post it here for those that want to try it.  Besides, Prim's can be used to produce mazes, a far more useful application of dot connection.

 

My favorite code header to cover imports and array printing and some bare-bones graphing

 

import sys
import numpy as np
import matplotlib.pyplot as plt
from textwrap import dedent, indent

ft = {'bool': lambda x: repr(x.astype('int32')),
      'float': '{: 0.1f}'.format}
np.set_printoptions(edgeitems=10, linewidth=100, precision=2,
                    suppress=True, threshold=120,
                    formatter=ft)
np.ma.masked_print_option.set_display('-')

script = sys.argv[0]

 

Now the ending of the script where the actual calls are performed and an explanation of what is happening occurs.

 

# ---------------------------------------------------------------------
if __name__ == "__main__":
    """Main section...   """
    #print("Script... {}".format(script))
    # ---- Take a few points to get you started ----
    a = np.array([[0, 0], [0,8], [10, 8],  [10,0], [3, 4], [7,4]])
    #
    idx = np.lexsort((a[:, 1], a[:, 0]))  # sort X, then Y
    a_srt = a[idx,:]                   # slice the sorted array
    d = _e_dist(a_srt)                 # determine the square form distances
    pairs = mst(d)                     # get the orig-dest pairs for the mst
    plot_mst(a_srt, pairs)             # a little plot
    o_d = connect(a_srt, d, pairs)     # produce an o-d structured array

 

Now the rest is just filler.  The code defs are given below.

 

def _e_dist(a):
    """Return a 2D square-form euclidean distance matrix.  For other
    :  dimensions, use e_dist in ein_geom.py"""

    b = a.reshape(np.prod(a.shape[:-1]), 1, a.shape[-1])
    diff = a - b
    d = np.sqrt(np.einsum('ijk,ijk->ij', diff, diff)).squeeze()
    #d = np.triu(d)
    return d


def mst(W, copy_W=True):
    """Determine the minimum spanning tree for a set of points represented
    :  by their inter-point distances... ie their 'W'eights
    :Requires:
    :--------
    :  W - edge weights (distance, time) for a set of points. W needs to be
    :      a square array or a np.triu perhaps
    :Returns:
    :-------
    :  pairs - the pair of nodes that form the edges
    """

    if copy_W:
        W = W.copy()
    if W.shape[0] != W.shape[1]:
        raise ValueError("W needs to be square matrix of edge weights")
    Np = W.shape[0]
    pairs = []
    pnts_seen = [0]  # Add the first point                   
    n_seen = 1
    # exclude self connections by assigning inf to the diagonal
    diag = np.arange(Np)
    W[diag, diag] = np.inf
    #
    while n_seen != Np:                                    
        new_edge = np.argmin(W[pnts_seen], axis=None)
        new_edge = divmod(new_edge, Np)
        new_edge = [pnts_seen[new_edge[0]], new_edge[1]]
        pairs.append(new_edge)
        pnts_seen.append(new_edge[1])
        W[pnts_seen, new_edge[1]] = np.inf
        W[new_edge[1], pnts_seen] = np.inf
        n_seen += 1
    return np.vstack(pairs)


def plot_mst(a, pairs):
    """plot minimum spanning tree test """
    plt.scatter(a[:, 0], a[:, 1])
    ax = plt.axes()
    ax.set_aspect('equal')
    for pair in pairs:
        i, j = pair
        plt.plot([a[i, 0], a[j, 0]], [a[i, 1], a[j, 1]], c='r')
    lbl = np.arange(len(a))
    for label, xpt, ypt in zip(lbl, a[:,0], a[:,1]):
        plt.annotate(label, xy=(xpt, ypt), xytext=(2,2), size=8,
                     textcoords='offset points',
                     ha='left', va='bottom')
    plt.show()
    plt.close()

def connect(a, dist_arr, edges):
    """Return the full spanning tree, with points, connections and distance
    : a - point array
    : dist - distance array, from _e_dist
    : edge - edges, from mst
    """

    p_f = edges[:, 0]
    p_t = edges[:, 1]
    d = dist_arr[p_f, p_t]
    n = p_f.shape[0]
    dt = [('Orig', '<i4'), ('Dest', 'i4'), ('Dist', '<f8')]
    out = np.zeros((n,), dtype=dt)
    out['Orig'] = p_f
    out['Dest'] = p_t
    out['Dist'] = d
    return out

 

The output from the sample points is hardly exciting.  But you can see the possibilities for the other set.
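To put numbers to it, here is the same Prim's idea run end-to-end on four corner points... a self-contained miniature that redoes the defs above rather than importing them:

```python
import numpy as np

pts = np.array([[0., 0.], [0., 3.], [4., 0.], [4., 3.]])   # a 4 x 3 rectangle
diff = pts[:, None, :] - pts[None, :, :]
W = np.sqrt((diff ** 2).sum(axis=-1))      # square-form distance matrix
np.fill_diagonal(W, np.inf)                # exclude self connections

seen, pairs = [0], []
while len(seen) < len(pts):
    flat = np.argmin(W[seen])              # cheapest edge leaving the tree
    i, j = divmod(int(flat), len(pts))
    pairs.append((seen[i], int(j)))
    seen.append(int(j))
    W[seen, j] = np.inf                    # close the new node off
    W[j, seen] = np.inf

total = sum(np.hypot(*(pts[a] - pts[b])) for a, b in pairs)
# edges (0,1), (0,2), (2,3)... total length 3 + 4 + 3 = 10
```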

 

This one is for a 100 points, with a minimum spacing of 3 within a 100x100 unit square.  Sprightly solution even on an iThingy using python 3.5 

 

 

 

Now on to maze creation .....

Dan_Patterson

Vincenty

Posted by Dan_Patterson Champion Jan 24, 2017

Distance calculations using longitude and latitude on the ellipsoid.   Haversine or Vincenty?

The choice depends on the resolution needed.  This thread on Computing Distances reminded me that I had the latter kicking around and Bruce Harold's Reverse Geocoding uses it.  

 

So I thought I would throw the reference in the Py..Links so I/we don't forget.  

The code can be found on my GitHub link but I have attached a copy to this post.

 

I spared no effort in code verbosity, and I tried to remain faithful to Chris Veness's documentation and presentation of the original code implementation.  His page contains an interactive calculator implemented in JavaScript if you just need a few calculations.  I have included a few sample calculations here should you be interested.

 

:Examples:
: long0  lat0  long1  lat1   dist       initial    final  head to
: -75.0, 45.0, -75.0, 46.0   111141.548   0.000,   0.000   N
: -75.0, 46.0, -75.0, 45.0   111141.548 180.000, 180.000   S
: -76.0, 45.0, -75.0, 45.0    78846.334  89.646,  90.353   E
: -75.0, 45.0, -76.0, 45.0    78846.334 270.353, 269.646   W
: -76.0, 46.0, -75.0, 45.0   135869.091 144.526, 145.239   SE
: -75.0, 46.0, -76.0, 45.0   135869.091 215.473, 214.760   SW
: -76.0, 45.0, -75.0, 46.0   135869.091  34.760,  35.473   NE
: -75.0, 45.0, -76.0, 46.0   135869.091 325.239, 324.526   NW
: -90.0,  0.0,   0.0,  0.0 10018754.171  90.000,  90.000   1/4 equator
: -75.0,  0.0, -75.0, 90.0 10001965.729   0.000,   0.000   to N pole
:

 

So use Bruce's if you need to geocode... use Chris's if you just need a few points... or play with this incarnation should you need to incorporate a function in your own code.  I will get around to converting it to NumPy eventually, so one can process large sets of origin-destinations that need a tad more than a spherical estimate.  The current version of the program uses iteration, which makes it a poor candidate for vectorization on arrays, but there are other implementations one can use.
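For comparison, a spherical (haversine) estimate is trivial to implement and to vectorize.  A minimal sketch, assuming a mean Earth radius of 6371008.8 m (an assumption of this sketch, not a value from the attached code):

```python
import math

def haversine(long0, lat0, long1, lat1, radius=6371008.8):
    """Great-circle distance in metres on a sphere of the given radius."""
    long0, lat0, long1, lat1 = map(math.radians, (long0, lat0, long1, lat1))
    a = (math.sin((lat1 - lat0) / 2.0) ** 2 +
         math.cos(lat0) * math.cos(lat1) * math.sin((long1 - long0) / 2.0) ** 2)
    return 2.0 * radius * math.asin(math.sqrt(a))

print(haversine(-75.0, 45.0, -75.0, 46.0))   # ~111195 m vs 111141.548 on the ellipsoid
```

The difference against the first ellipsoidal figure in the table above is roughly 0.05%... good enough for many jobs, which is exactly why the choice depends on the resolution needed.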

 

The most recent version of vincenty.py can be found on my github repository numpy_samples/vincenty.py at master · Dan-Patterson/numpy_samples · GitHub 

Generate closest features by distance


From near.py  See references


Emulating Generate Near Table from ArcMap

Let us begin with finding the closest 3 points to every point in a point set.
Well, that is easy... we just use the 'Generate Near Table' tool.

You have been spoiled. This tool is only available at the Advanced license level. You are now working in a job that uses a Standard license... what to do!?

Of course!.... roll out your own.
We will begin with a simple call to 'n_near' in near.py.

We can step through the process...

Begin with array 'a'. Since we are going to use einsum to perform the distance calculations, we need to clone and reshape the array to facilitate the process.

The following array, 'a', represents the 4 corners of a 2x2 unit square, with a centre point. The points are arranged in clockwise order.

>>> a # a.shape => (5, 2)
array([[0, 0],
       [0, 2],
       [2, 2],
       [2, 0],
       [1, 1]], dtype=int32)
The array reshaping is needed in order to subtract the arrays via broadcasting.
>>> b = a.reshape(np.prod(a.shape[:-1]), 1, a.shape[-1])
>>> b # b.shape => (5, 1, 2)
array([[[0, 0]],
       [[0, 2]],
       [[2, 2]],
       [[2, 0]],
       [[1, 1]]], dtype=int32)

I have documented the details of the array construction and einsum notation elsewhere. Suffice to say, we can now subtract the two arrays, perform the einsum product summation and finish with the euclidean distance calculation.

The difference array produces 5 blocks of 5x2 values. The summation of the products of these arrays essentially yields the squared distance, from which, euclidean distance is derived. There are other ways of doing this, such as dot product calculations. I prefer einsum methods since it can be scaled up from 1D to n-D unlike most other approaches.

 

>>> diff = b - a # diff.shape => (5, 5, 2)

 

The 'diff' array looks like the following. I took the liberty of using a function in arr_tools (on github) to rearrange the array into a more readable form ( https://github.com/Dan-Patterson/numpy_samples/blob/master/formatting/arr_frmts.py )

>>> import arr_tools as art
>>> art.frmt_(diff)
Array...
-shape (5, 5, 2), ndim 3
. 0 0 0 -2 -2 -2 -2 0 -1 -1
. 0 2 0 0 -2 0 -2 2 -1 1
. 2 2 2 0 0 0 0 2 1 1
. 2 0 2 -2 0 -2 0 0 1 -1
. 1 1 1 -1 -1 -1 -1 1 0 0



The distance calculation is pretty simple, just a bit of einsum notation, get rid of some extraneous dimensions if present and there you have it...

>>> dist = np.einsum('ijk,ijk->ij', diff, diff) # the magic happens...
>>> d = np.sqrt(dist).squeeze() # get rid of extra 'stuff'
>>> d # the distance array...
array([[ 0.0, 2.0, 2.8, 2.0, 1.4],
       [ 2.0, 0.0, 2.0, 2.8, 1.4],
       [ 2.8, 2.0, 0.0, 2.0, 1.4],
       [ 2.0, 2.8, 2.0, 0.0, 1.4],
       [ 1.4, 1.4, 1.4, 1.4, 0.0]])

The result as you can see from the above is a row-column structure much like that derived from scipy's cdist function. Each row and column represents a point, resulting in the diagonal having a distance of zero.
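To convince yourself that the einsum expression really does produce squared euclidean distances, it can be cross-checked against a brute-force double loop:

```python
import numpy as np

a = np.array([[0, 0], [0, 2], [2, 2], [2, 0], [1, 1]], dtype=float)
b = a.reshape(5, 1, 2)                         # reshape for broadcasting
diff = b - a
d = np.sqrt(np.einsum('ijk,ijk->ij', diff, diff))

# brute-force check, point by point
chk = np.array([[np.hypot(p[0] - q[0], p[1] - q[1]) for q in a] for p in a])
assert np.allclose(d, chk)
print(np.round(d[0], 1))   # distances from the first corner
```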

The next step is to get a sorted list of the distances. This is where np.argsort comes into play, since it returns a list of indices that represent the sorted distance values. The indices are used to pull out the coordinates in the appropriate order.

>>> kv = np.argsort(d, axis=1) # sort 'd' on last axis to get keys
>>> kv
array([[0, 4, 1, 3, 2],
       [1, 4, 0, 2, 3],
       [2, 4, 1, 3, 0],
       [3, 4, 0, 2, 1],
       [4, 0, 1, 2, 3]])

>>> coords = a[kv] # for each point, pull out the points in closest order
>>> a[kv].shape # the shape is still not ready for use...
(5, 5, 2)

The coordinate array (coords) needs to be reshaped so that the X, Y pair values can be laid out in row format for final presentation. Each point calculates the distances to itself and the other points, so the array has 5 groupings of 5 pairs of coordinates. This can be reshaped, to produce 5 rows of x, y values using the following.

>>> s0, s1, s2 = coords.shape
>>> coords = coords.reshape((s0, s1*s2)) # the result will be a 2D array...
>>> coords
array([[0, 0, 1, 1, 0, 2, 2, 0, 2, 2],
       [0, 2, 1, 1, 0, 0, 2, 2, 2, 0],
       [2, 2, 1, 1, 0, 2, 2, 0, 0, 0],
       [2, 0, 1, 1, 0, 0, 2, 2, 0, 2],
       [1, 1, 0, 0, 0, 2, 2, 2, 2, 0]], dtype=int32)

Each row represents an input point in the order they were input. Compare input array 'a' with the first two columns of the 'coords' array to confirm. The remaining columns are pairs of the x, y values arranged by their distance sorted order (more about this later).

The distance values are then sorted in ascending order. Obviously, the first value in each list will be the distance of each point to itself (0.0) so it is sliced off leaving the remaining distances.

>>> dist = np.sort(d)[:,1:] # slice sorted distances, skip 1st
>>> dist
array([[ 1.4, 2.0, 2.0, 2.8],
       [ 1.4, 2.0, 2.0, 2.8],
       [ 1.4, 2.0, 2.0, 2.8],
       [ 1.4, 2.0, 2.0, 2.8],
       [ 1.4, 1.4, 1.4, 1.4]])

If you examine the points that were used as input, they formed a rectangle with a point in the middle. It should come as no surprise that the first column represents the distance of each point to the center point (the last row). The next two columns are the distance of each point to its adjacent neighbour while the last column is the distance of each point to its diagonal. The exception is of course the center point (last row) which is equidistant to the other 4 points.

The rest of the code is nothing more than a fancy assemblage of the resultant data into a structure that can be used to output a structured array of coordinates and distances, which can be brought into ArcMap to form various point or polyline assemblages.
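That assemblage step can be sketched compactly.  `n_near_sketch` below is a hypothetical condensation of the idea behind near.py's `n_near`, not the actual function:

```python
import numpy as np

def n_near_sketch(a, N=2):
    """For each point: the N closest other points and their distances,
    returned as a structured array (a condensed stand-in for n_near)."""
    diff = a[:, None, :] - a
    d = np.sqrt(np.einsum('ijk,ijk->ij', diff, diff))
    kv = np.argsort(d, axis=1)
    coords = a[kv[:, 1:N + 1]].reshape(len(a), -1)   # skip self at column 0
    dist = np.sort(d, axis=1)[:, 1:N + 1]
    dt = [('ID', '<i4'), ('Xo', '<f8'), ('Yo', '<f8')]
    dt += [('C{}_{}'.format(i, xy), '<f8') for i in range(N) for xy in 'XY']
    dt += [('Dist{}'.format(i), '<f8') for i in range(N)]
    out = np.zeros((len(a),), dtype=dt)
    out['ID'] = np.arange(len(a))
    out['Xo'], out['Yo'] = a[:, 0], a[:, 1]
    data = np.hstack((coords, dist))
    for j, name in enumerate(out.dtype.names[3:]):
        out[name] = data[:, j]
    return out

a = np.array([[0, 0], [0, 2], [2, 2], [2, 0], [1, 1]], dtype=float)
res = n_near_sketch(a, N=2)
print(res['Dist0'])   # every point's nearest neighbour distance
```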


Here are the results from the script...

:-----------------------------------------------------------------
:Closest 2 points for points in an array. Results returned as
: a structured array with coordinates and distance values.
Demonstrate n_near function ....
:Input points... array 'a'
[[0 0]
[0 2]
[2 2]
[2 0]
[1 1]]
:output array
ID     Xo   Yo C0_X C0_Y C1_X C1_Y Dist0 Dist1
0.00 0.00 0.00 1.00 1.00 0.00 2.00 1.41 2.00
1.00 0.00 2.00 1.00 1.00 0.00 0.00 1.41 2.00
2.00 1.00 1.00 0.00 0.00 0.00 2.00 1.41 1.41
3.00 2.00 2.00 1.00 1.00 0.00 2.00 1.41 2.00
4.00 2.00 0.00 1.00 1.00 0.00 0.00 1.41 2.00
:------------------------------------------------------------------
This is the final form of the array.
array([(0, 0.0, 0.0, 1.0, 1.0, 0.0, 2.0, 1.4142135..., 2.0),
       (1, 0.0, 2.0, 1.0, 1.0, 0.0, 0.0, 1.4142135..., 2.0),
       (2, 1.0, 1.0, 0.0, 0.0, 0.0, 2.0, 1.4142135..., 1.4142135...),
       (3, 2.0, 2.0, 1.0, 1.0, 0.0, 2.0, 1.4142135..., 2.0),
       (4, 2.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.4142135..., 2.0)],
      dtype=[('ID', '<i4'),
             ('Xo', '<f8'), ('Yo', '<f8'),
             ('C0_X', '<f8'), ('C0_Y', '<f8'),
             ('C1_X', '<f8'), ('C1_Y', '<f8'),
             ('Dist0', '<f8'), ('Dist1', '<f8')])

I took the liberty of doing some fiddling with the format to make it easier to read. It should be readily apparent that this array could be used as input to arcpy.da.NumPyArrayToFeatureClass so that you can produce a featureclass or shapefile of the data.

That is about all now. There are a variety of ways to perform the same thing... hope this adds to your arsenal of tools.


References:
----------
http://desktop.arcgis.com/en/arcmap/latest/tools/analysis-toolbox/generate-near-table.htm

https://github.com/Dan-Patterson/numpy_samples/blob/master/geometry/scripts/near.py


----------------------------------------------------------------------------

Code posted on my GitHub repository... called circle_make.py, perhaps a bit of a misnomer since it works to create all things associated with 'circular' features, which would include ellipses, triangles, squares, rectangles, sectors, arcs, pentagons, hexagons, octagons and n-gons... anything whose points can be placed on a circle.

 

Here are some pictures, you can examine the code at your leisure.  The functions (def) can be used in scripts to work with arcpy, numpy and with some stretching... the field calculator.  If you have any useful examples, pass them on.

 

With donut holes in the middle... the radiating line is a matplotlib artifact...I didn't want to waste time removing it.  

Note that the ring widths are not equal... they need not be, you just set the threshold distances you want.

These last two have a rotation set.  And the ellipse is the result of scaling the y-values and rotating the coordinates.

 

As a simple example of the internal structure of the inputs, the following is an example of two triangles (3 points on a circle) with holes.  The data input is simply an input array as shown by the coordinates and the plotting routine handles the output.

a = buffer_ring(outer=10, inner=8, theta=120, xc=0.0, yc=0.0)
b = buffer_ring(outer=10, inner=8, theta=120, xc=10.0, yc=0.0)
a0 = Polygon(a, closed=False)
b0 = Polygon(b, closed=False)
#plot_([a,b])
a0p = a0.get_xy()
b0p = b0.get_xy()
props = b0.properties()
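The actual buffer_ring lives in circle_make.py on GitHub.  As a rough illustration only, here is a hedged reconstruction that reproduces the coordinates shown below; the angle convention (start at 180 degrees, sweep clockwise) is inferred from that output, not taken from the real code:

```python
import numpy as np

def buffer_ring(outer, inner, theta, xc=0.0, yc=0.0):
    """Hypothetical reconstruction: a closed clockwise outer ring plus a
    closed counter-clockwise inner ring, with a point every `theta` degrees."""
    ang = np.arange(180.0, -180.0, -theta)        # clockwise from 180 degrees
    ang = np.radians(np.append(ang, 180.0))       # repeat the start to close
    outer_ring = np.c_[outer * np.cos(ang), outer * np.sin(ang)]
    inner_ring = np.c_[inner * np.cos(ang[::-1]), inner * np.sin(ang[::-1])]
    return np.vstack((outer_ring, inner_ring)) + [xc, yc]

a = buffer_ring(outer=10, inner=8, theta=120)
print(np.round(a, 3))
```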

 

>>> a0p
array([[-10.000, 0.000],
       [ 5.000, 8.660],
       [ 5.000, -8.660],
       [-10.000, -0.000],
       [-8.000, -0.000],
       [ 4.000, -6.928],
       [ 4.000, 6.928],
       [-8.000, 0.000]])
>>> b0p
array([[ 0.000, 0.000],
       [ 15.000, 8.660],
       [ 15.000, -8.660],
       [ 0.000, -0.000],
       [ 2.000, -0.000],
       [ 14.000, -6.928],
       [ 14.000, 6.928],
       [ 2.000, 0.000]])

The triangles are as follows; the coordinates are above.

The outer rings go clockwise and the inner rings are counter clockwise.  You will notice that the first and last point of each ring are identical.  

Belated Happy 8th python...  Grown quite a bit over the intervening 8 years.  

 

What is new in Python 3.7.... you are growing so fast

 

This blog-ette is just to show some things that should be used more in python 3.x.  Sadly... they aren't even available in python 2.7.x.  (The party is going to be delayed for ArcMap 10.5)

 

I often work with lists, arrays and tabular data, often of unknown length.  If you want to get the 3rd element in a list or array or table, you can do the old slicing thing, which is fine.

  • But what if you only want the first two but not the rest?
  • What about the last 2 but not the rest?  
  • How about the middle excluding the first or last two?  
  • How about some weird combination of the above.
  • What if you are a digital hoarder and are afraid to throw anything away just in case...

 

Yes, slicing can do it.  But what if you could save yourself a step or two?

To follow along... just watch the stars... * ... in the following lines.  I printed out the variables all on one line, so they will appear as a tuple, just like usual.  Now, when I say 'junk', I mean the leftovers from the single slices.  Before I forget... you can't have two stars on the same line, but you can really mess with your head by parsing stacked lines with starred variable assignment.

 

Let's keep it simple ... an array (cause I like them) ... and a list as inputs to the variable assignments... 

 

>>> import numpy as np
>>> a = np.arange(10)
>>>
>>> # ---- play with auto-slicing to variables ----
>>> a0, a1, *a2 = a       # keep first two, the rest is junk
>>> a0, a1, a2
(0, 1, [2, 3, 4, 5, 6, 7, 8, 9])

>>> a0, a1, *a2, a3 = a   # keep first two and last, junk the rest
>>> a0, a1, a2, a3
(0, 1, [2, 3, 4, 5, 6, 7, 8], 9)

>>> *a0, a1, a2, a3 = a   # junk everything except the last 3
>>> a0, a1, a2, a3
([0, 1, 2, 3, 4, 5, 6], 7, 8, 9)

>>> # ---- What about lists? ----
>>> *a0, a1, a2, a3 = a.tolist()  # just convert the array to a list
>>> a0, a1, a2, a3
([0, 1, 2, 3, 4, 5, 6], 7, 8, 9)

 

Dictionaries too.

>>> k = list('abcd')
>>> v = [1,2,3,4,5]
>>> dct = {k:v for k,v in zip(k, v)}  # your dictionary comprehension; zip stops at the shorter input
{'c': 3, 'b': 2, 'a': 1, 'd': 4}

>>> # ---- need to add some more ----
>>> dct = dict(f=9, g=10, **dct, h=7)
>>> dct
{'h': 7, 'c': 3, 'b': 2, 'a': 1, 'g': 10, 'f': 9, 'd': 4}
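A closely related pattern (also 3.5+) is merging mappings with ** inside a dict display; later keys win on collision:

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 3}
merged = {**d1, **d2}      # d2's value for 'b' overwrites d1's
print(merged)              # {'a': 1, 'b': 20, 'c': 3} in insertion order (3.7+)
```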

 

Thinking about the transition to 3?

Update (2017-03)

Pro 1.4.1 uses python 3.5.3, hopefully the next release will use python 3.6 since it is already out and 3.7 is in development.

ArcMap will catchup, but I suspect not at the same rate/pace as PRO.

 

References

Python 3.0 Release | Python.org Happy Birthday Python

Python 2.7 Countdown 

For more examples far more complex than the above, see...

python 3.x - Unpacking, Extended unpacking, and nested extended unpacking - Stack Overflow 

A stats ditty... so I don't forget and some of you may be interested in graphics using MatPlotLib

bivariate_normal.png

Code for the above...written verbosely so you get the drift.

"""
Script:    bivariate_normal.py
Path:      F:\A0_Main\
Author:    Dan.Patterson@carleton.ca

Created:   2015-06-06
Modified:  2015-06-06  (last change date)
Purpose:   To examine the effect of parameters on the bivariate distribution
Requires:  numpy and matplot lib
Notes:
  see help on (np.random.normal)
  x = numpy.array([1, 2, 3])           # put in the values for the x values
  y = numpy.array([10, 20, 30])        # ditto for y
  XX, YY = numpy.meshgrid(x, y)        # make your 'mesh'
  ZZ = XX + YY                         # in this case the sum of X and Y
  ZZ => array([[11, 12, 13],
               [21, 22, 23],
               [31, 32, 33]])
  These are equivalent to the output of meshgrid:
  YY, XX = numpy.mgrid[10:40:10, 1:4]  # Y, X
  ZZ = XX + YY
  YY, XX = numpy.ogrid[10:40:10, 1:4]  # open grids; broadcasting fills in
  ZZ = XX + YY
  as is the atleast_2d version:
  XX, YY = numpy.atleast_2d(x, y)
  YY = YY.T                            # transpose to allow broadcasting
  ZZ = XX + YY
References: many
"""
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(edgeitems=3,linewidth=75,precision=2,
                    suppress=True,threshold=10)
# (1) make a floating point grid and get some stats
#     100x100 grid cells numbered from top left
Xs = np.arange(0., 100.1, 1) # as floats
Ys = np.arange(0., 100.1, 1)
Xc = Xs.mean();   Yc = Ys.mean()
Xmin = Xs.min();  Xmax = Xs.max()
Ymin = Ys.min();  Ymax = Ys.max()
# (2) ....now do some work.... -----------------------------------------|
X,Y = np.meshgrid(Xs,Ys)   # X, Y as a meshgrid
XX = X**2                  # square the X
out = 2*X + Y - 3.0        # do the spatial math
Z = X**2 + Y**2            # getting it now?       
# (3) .... calculate the pdf -------------------------------------------|
#   normal pdf/ellipse
rho = -0.5
s_x = 60.0
s_y = 45.0   # 4*3 ratio Xc,Yc = (50,50) stds at 2Std level
d_x = ((X-Xc)/s_x)
d_y = ((Y-Yc)/s_y)
rho2 = (1.0-rho**2)
m_rsq = np.sqrt(rho2)
lower = 2.0*(1.0 - rho**2)                 # if rho = 0, lower = 2
upper = d_x**2 + d_y**2 - 2.0*rho*d_x*d_y  # if rho = 0, drop the last term
left = 1.0/(2.0*np.pi*s_x*s_y*m_rsq)
f_xy = left*np.exp(-upper/lower)           # pdf = left * exp(-upper/lower)
# (4) .... make a figure -----------------------------------------------|
fig = plt.figure(facecolor='white')
ax = fig.add_subplot(1, 1, 1)
plt.axis('equal')
plt.axis([Xmin, Xmax, Ymax, Ymin]) # decreasing Y
plt.set_cmap('Blues')
cont = plt.contourf(Xs, Ys, f_xy, origin='upper')
plt.title("Bivariate normal distribution")
plt.xlabel("X ==>")
plt.ylabel("<== Y")
cbar = plt.colorbar(cont)
cbar.ax.set_ylabel('values')
lbl_r = "rho = {}\n2 std x\n1 std y".format(abs(rho))  # reverse rho since plotting -ve Y
plt.text(0,20,lbl_r)
#
plt.show()
plt.close()
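The meshgrid / mgrid / ogrid equivalences sketched in the docstring can be verified directly:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])
XX, YY = np.meshgrid(x, y)             # dense 2D grids
ZZ = XX + YY
YY2, XX2 = np.mgrid[10:40:10, 1:4]     # same grids, index-notation style
YY3, XX3 = np.ogrid[10:40:10, 1:4]     # 'open' grids; broadcasting fills in
assert (ZZ == XX2 + YY2).all()
assert (ZZ == XX3 + YY3).all()
print(ZZ)
```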

Check out their code gallery for a multitude of options on the MatPlotLib Home Page.

So ... new interface, time to try out some formatting and stuff.  What better topic than how to order, structure and view 3D data, like images or raster data of mixed data types for the same location, or of uniform data type where the 3rd dimension represents time.

 

I will make it simple.  Begin with 24 integer numbers and arrange them into all the possible configurations in 3D.  Then it is time to mess with your mind and show you how to convert from one arrangement to another.  Sort of like Rubik's Cube, but simpler.

 

So here is the generating script (note the new cool python syntax highlighting... nice! ... but you still can't change the brownish background color, stifling any personal code preferences).  The def just happens to be number 37... it has no meaning, just 37 in a collection of functions.

import numpy as np
from textwrap import dedent

def num_37():
    """(num_37) playing with 3D arrangements...
    :Requires:
    :--------
    :  Arrays are generated within... nothing required
    :Returns:
    :-------
    :  An array of 24 sequential integers with shape = (2, 3, 4)
    :Notes:
    :-----
    :  References to numpy, transpose, rollaxis, swapaxes and einsum.
    :  The arrays below are all the possible combinations of axes that can be
    :  constructed from a sequence size of 24 values to form 3D arrays.
    :  Higher dimensionalities will be considered at a later time.
    :
    :  After this, there is some fancy formatting as covered in my previous blogs.
    """

    nums = np.arange(24)      #  whatever, just shape appropriately
    a = nums.reshape(2,3,4)   #  the base 3D array shaped as (z, y, x)
    a0 = nums.reshape(2,4,3)  #  y, x axes, swapped
    a1 = nums.reshape(3,2,4)  #  add to z, reshape y, x accordingly to main size
    a2 = nums.reshape(3,4,2)  #  swap y, x
    a3 = nums.reshape(4,2,3)  #  add to z again, resize as before
    a4 = nums.reshape(4,3,2)  #  swap y, x
    frmt = """
    Array ... {} :..shape  {}
    {}
    """

    args = [['nums', nums.shape, nums],
            ['a', a.shape, a], ['a0', a0.shape, a0],
            ['a1', a1.shape, a1], ['a2', a2.shape, a2],
            ['a3', a3.shape, a3], ['a4', a4.shape, a4],
            ]
    for i in args:
        print(dedent(frmt).format(*i))
    return a

 

And here are the results

Arrays 'a' (the base) and 'a0':

Array ... a :..shape  (2, 3, 4)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

# This is the base array...

Array ... a0 :..shape  (2, 4, 3)
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]]]

 

In any event, I prefer to think of a 3D array as consisting of ( Z, Y, X ) if they do indeed represent the spatial component.  In this context, however, Z is not simply taken as elevation as might be the case for a 2D raster.  The mere fact that the first axis is denoted with a 2 or above, indicates to me that it is a change array.  Do note that the arrays need not represent anything spatial at all, but this being a place for GIS commentary, there is often an implicit assumption that at least two of the dimensions will be spatial.

 

To go from array a to a0, and conversely, we need to reshape the array.  Array shaping can be accomplished using a variety of numpy methods, including rollaxis, swapaxes, transpose and einsum, to name a few.

 

The following can be summarized:

R   rollaxis  - roll the chosen axis back by the specified positions

E   einsum    - for now, just see the swapping of letters in the ijk sequence

S   swapaxes  - change the position of specified axes

T   transpose - similar to swapaxes, but with multiple changes

 

 

a0 = np.rollaxis(a, 2, 1)           #  a = np.rollaxis(a0, 2, 1)
a0 = np.swapaxes(a, 2, 1)           #  a = np.swapaxes(a0, 1, 2)
a0 = a.swapaxes(2, 1)               #  a = a0.swapaxes(1, 2)
a0 = np.transpose(a, (0, 2, 1))     #  a = np.transpose(a0, (0, 2, 1))
a0 = a.transpose(0, 2, 1)           #  a = a0.transpose(0, 2, 1)
a0 = np.einsum('ijk -> ikj', a)     #  a = np.einsum('ijk -> ikj', a0)
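One caution worth a quick check: reshape keeps the raveled order, while the axis-swapping routines reorder strides, so reshape(2, 4, 3) gives the same shape but not the same element layout as the swaps.  The swap family members do agree with one another:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
swapped = a.swapaxes(2, 1)                       # shape (2, 4, 3)
assert (swapped == np.transpose(a, (0, 2, 1))).all()
assert (swapped == np.rollaxis(a, 2, 1)).all()
assert (swapped == np.einsum('ijk->ikj', a)).all()
assert (swapped != a.reshape(2, 4, 3)).any()     # reshape is NOT the same layout
print(swapped[0])                                # each (4, 3) block is a[i].T
```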

 

When you move on to higher values for the first dimension, you have to be careful about which of these you can use; it is generally better to use reshape or stride tricks to perform the reshaping.

Arrays 'a1' and 'a2':

Array ... a1 :..shape  (3, 2, 4)
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])
Array ... a2 :..shape  (3, 4, 2)
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])

Converting a2 back to a:
>>> from numpy.lib import stride_tricks as ast
>>> back_to_a = a2.reshape(2, 3, 4)
>>> again_to_a = ast.as_strided(a2, a.shape, a.strides)
>>> back_to_a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> again_to_a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

 


Now for something a little bit different

 

Array 'a' which has been used before.  It has a shape of (2, 3, 4).  Consider it as 2 layers or bands occupying the same space.

array([[[ 0, 1, 2, 3],
        [ 4, 5, 6, 7],
        [ 8, 9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

 

A second array, 'b', can be constructed using the same data, but shaped differently, (3, 4, 2).  The dimension consisting of two parts is effectively swapped between the two arrays.  It can be constructed from:

 

>>> x = np.arange(12)
>>> y = np.arange(12, 24)
>>>
>>> b = np.array(list(zip(x,y))).reshape(3,4,2)
>>> b
array([[[ 0, 12],
        [ 1, 13],
        [ 2, 14],
        [ 3, 15]],

       [[ 4, 16],
        [ 5, 17],
        [ 6, 18],
        [ 7, 19]],

       [[ 8, 20],
        [ 9, 21],
        [10, 22],
        [11, 23]]])
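The zip construction above is equivalent to just moving a's first axis to the end.  A quick check:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
x = np.arange(12)
y = np.arange(12, 24)
b = np.array(list(zip(x, y))).reshape(3, 4, 2)

assert (b == np.moveaxis(a, 0, -1)).all()   # band axis moved to the end
assert (b == a.transpose(1, 2, 0)).all()    # same thing, spelled differently
assert (a[0] == b[..., 0]).all()            # the same 2D slice, either way
```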

 

If you look closely, you can see that the numeric values from 0 to 11 are ordered in a 3x4 block in array 'a', but appear as 12 entries in a column, split between 3 subarrays of 'b'.  The same data can be sliced from their respective array dimensions to yield

 

... sub-array 'a[0]' or ... sub-array 'b[...,0]'

yields

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

 

The arrays can be raveled to reveal their internal structure.

 
>>> b.strides  # (64, 16, 8)
>>> a.strides  # (96, 32, 8)
>>> a.ravel()
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
>>> b.ravel()
[ 0 12  1 13  2 14  3 15  4 16  5 17  6 18  7 19  8 20  9 21 10 22 11 23]
>>> a0_r = a[0].reshape(3, 4, -1)  # a0_r.shape = (3, 4, 1)
>>> a0_r
array([[[ 0],
        [ 1],
        [ 2],
        [ 3]],

       [[ 4],
        [ 5],
        [ 6],
        [ 7]],

       [[ 8],
        [ 9],
        [10],
        [11]]])

Enough for now.  Learning how to reshape and work with array structures can certainly make dealing with raster data much easier.