Dan_Patterson

ArcGIS API for Python... Geometry... part II

Blog Post created by Dan_Patterson Champion on Jul 31, 2017

Geometry...

That is what I am normally interested in.  The baggage (aka, attributes) tag along for the ride and I normally find it easier to keep the baggage separate until I am done with the geometry. For those following along, see my previous post.

 

Let us compare some of the ways that we can pull the geometry out of a featureclass.  The following demonstrations can be followed in your own workflow for testing your specific cases.

 

Imports first ___________________________________________________________________________________

import sys
import numpy as np
import arcpy
import arcgis

_______________________________________________________________________________________________

I prefer to import modules in the order of less polluting first to ensure namespace is grabbed by what I want in order of importance in case there is any conflict during import.

 

Searchcursor __________________________________________________________________________________

# ---- pick a featureclass ----
fc0 = r'drive:\your_spaceless_folder_path\your_geodatabase.gdb\polygon'

# ---- using arcpy.Describe ----
desc = arcpy.Describe(fc0)
shp = desc.shapeFieldName    # ---- using dot notation ----

# ---- using arcpy.da.Describe ----
desc = arcpy.da.Describe(fc0)
shp = desc['shapeFieldName'] # ---- note... now a dictionary ----

# ---- geometry, as X, Y ----
flds = [shp + "@"]
shps = [row[0] for row in arcpy.da.SearchCursor(fc0, flds)]

_______________________________________________________________________________________________

 

Which of course you can begin to use to get basic properties.  In this example,

Geometry properties  ____________________________________________________________________________

for s in shps:
    print(s.type, s.partCount, s.pointCount, s.length, s.area)
polygon 1 5 40.0 100.0
polygon 1 10 64.0 64.0
polygon 2 14 99.41640786499875 182.0

_______________________________________________________________________________________________

 

Then you can get to work and convert to a NumPy array quickly and simply.  Since I know this is a polygon featureclass, it only takes a couple of lines to perform the task.

 

Get to the point(s) ______________________________________________________________________________

pnts = []
for shp in shps:
    for j in range(shp.partCount):
        pt = shp.getPart(j)
        p_list = [(pnt.X, pnt.Y) for pnt in pt if pnt]
        pnts.extend(p_list)
dt = [('Xs', '<f8'), ('Ys', '<f8')]
a = np.asarray(pnts, dtype=dt)

_______________________________________________________________________________________________

The basic difference between the array forms is how you want to work with them.

In the example above array 'a' has a specified data type (dtype).  The fields/columns of the array can be accessed by name (Xs and Ys).  Since this array is a structured array, the access would be a['Xs'] and a['Ys'].  If I converted this to a record array, one could use object.field notation.

 

The coordinates are nothing more than the same type of number, so the field names can be dispensed with altogether.  In this case, the sentient being is responsible for knowing what they are working with.  Both forms of the same data are shown below

_______________________________________________________________________________________________

a # a structured array
array([(300020.0, 5000000.0), (300010.0, 5000000.0), (300010.0, 5000010.0), ...,
          (300002.0, 5000002.0), (300008.0, 5000002.0), (300005.0, 5000008.0)],
      dtype=[('Xs', '<f8'), code="" ('ys',="" '<f8')])

a_nd = np.asarray(pnts)  # as an np.ndarray
array([[ 300020.00,  5000000.00],
       [ 300010.00,  5000000.00],
       [ 300010.00,  5000010.00],
       ...,
       [ 300002.00,  5000002.00],
       [ 300008.00,  5000002.00],
       [ 300005.00,  5000008.00]])
a_nd.dtype
dtype('float64')

_______________________________________________________________________________________________

 

The following demo function can be used on your data to examine some of the options and explore some of the properties and methods available to each.  Just don't forget the imports at the start of the post

 

The demo  _____________________________________________________________________________________

def _demo(in_fc):
    """Run the demo and return some objects
    : create a SpatialDataFrame class object from a featureclass
    : create a record array from a_sdf
    : create a numpy.ndarray using the da module
    """

    SR = arcpy.Describe(in_fc).SpatialReference

    sdf = arcgis.SpatialDataFrame
    a_sdf = sdf.from_featureclass(in_fc,
                                  sql_clause=None,
                                  where_clause=None,
                                  sr=SR)
    a_rec = sdf.to_records(a_sdf)  # record array
    a_np = arcpy.da.FeatureClassToNumPyArray(in_fc,
                                             field_names="*",
                                             spatial_reference=SR,
                                             explode_to_points=True)
    a_np2 = fc_g(in_fc)
    return sdf, a_sdf, a_rec, a_np, a_np2

 

Now... behind all the magic of the above options, a searchcursor is behind the acquisition of all of the options shown above.  The options do, however, provide access to methods and properties which are unique to their data class.  In many instances these are shared.

 

Here is what the 'look like' .  In the case of the SpatialDataFrame, a_sdf, the same when converted to a record array and the conventional da.FeatureClassToNumPyArray... all fields were included.  in the last option, a_np2, just the geometry is returned as demonstrated in the above examples, with the addition of handling the geometry parts and point when the polygon is 'exploded' to it's constituent points.

 

a_sdf
Out[40]:
   OBJECTID  Id Text_fld                                              SHAPE
0         1   1     None  {'rings': [[[300020, 5000000], [300010, 500000...
1         2   2        C  {'rings': [[[300010, 5000020], [300010, 500001...
2         3   3        A  {'rings': [[[300010, 5000010], [300010, 500002...

a_rec
Out[41]:
rec.array([ (0, 1, 1, None, {"rings": [[[300020, 5000000], [300010, 5000000], [300010, 5000010], [300020, 5000010], [300020, 5000000]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(1, 2, 2, 'C', {"rings": [[[300010, 5000020], [300010, 5000010], [300000, 5000010], [300000, 5000020], [300010, 5000020]], [[300002, 5000018], [300002, 5000012], [300008, 5000012], [300008, 5000018], [300002, 5000018]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(2, 3, 3, 'A', {"rings": [[[300010, 5000010], [300010, 5000020], [300020, 5000020], [300020, 5000010], [300010, 5000010]], [[300010, 5000010], [300010, 5000000], [300000, 5000000], [300000, 5000010], [300010, 5000010]], [[300005, 5000008], [300002, 5000002], [300008, 5000002], [300005, 5000008]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}})],
          dtype=[('index', '<i8'), ('OBJECTID', '<i8'), ('Id', '<i8'), ('Text_fld', 'O'), ('SHAPE', 'O')])

a_np
Out[42]:
array([(1, [300020.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000010.0], 1, 'None', 40.0, 100.0), ...,
       (3, [300002.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300008.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300005.0, 5000008.0], 3, 'A', 99.41640786499875, 182.0)],
      dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)), ('Id', '<i4'), ('Text_fld', '<U255'), ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])

a_np2
Out[43]:
array([(1, 0, 300020.0, 5000000.0), (1, 0, 300010.0, 5000000.0),
       (1, 0, 300010.0, 5000010.0), ..., (3, 1, 300002.0, 5000002.0),
       (3, 1, 300008.0, 5000002.0), (3, 1, 300005.0, 5000008.0)],
      dtype=[('ID_num', '<i4'), ('Part_num', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')])

 

When the SpatialDataFrame is converted to a numpy record array, the geometry field (Shape) has a dtype of 'O', which is an object array.  This will be the 'normal' case since it is unlikely that all polygons will contain the same number of parts and points per part.  To truly be a numpy array, the 'shape' of the array needs to be consistent.  

 

The latter two representations, (a_np and a_np2) deal with this but converting the polygons to points.  These points can be converted back to polygons after they are used in some process such as moving, calculating parameters, reprojecting.

 

Next installation... working with geometry in its various representations.

Outcomes