Return all Geometries that Intersect with other Geometries

11-27-2023 01:42 PM
ericsamson_tract
New Contributor

Hey all,

I need to drop duplicate geometries from a spatially enabled dataframe. To do this, I first convert the SHAPE column to a new column holding each geometry's WKT value:

 

sdf['SHAPE_WKT'] = sdf['SHAPE'].apply(lambda geom: geom.WKT if geom else None)

 

 

However, this takes a long time to run. Is there any way to limit the size of my sdf by only running this on the geometries that intersect with other geometries? I know you can run intersections between different feature layers, but what about within the SHAPE column of the current working sdf?

 

For anyone curious, I can't drop duplicates on the SHAPE column directly because it raises the following error:

 

sdf.drop_duplicates(subset='SHAPE', keep='first')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
In  [46]:
Line 1:     sdf.drop_duplicates(subset='SHAPE', keep='first')

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\util\_decorators.py, in wrapper:
Line 311:   return func(*args, **kwargs)

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\core\frame.py, in drop_duplicates:
Line 6125:  duplicated = self.duplicated(subset, keep=keep)

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\core\frame.py, in duplicated:
Line 6262:  labels, shape = map(list, zip(*map(f, vals)))

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\core\frame.py, in f:
Line 6235:  labels, shape = algorithms.factorize(vals, size_hint=len(self))

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\core\algorithms.py, in factorize:
Line 749:   codes, uniques = values.factorize(na_sentinel=na_sentinel)

File C:\Users\Eric\AppData\Local\ESRI\conda\envs\assemblage_env\Lib\site-packages\pandas\core\arrays\base.py, in factorize:
Line 1028:  uniques_ea = self._from_factorized(uniques, self)

TypeError: _from_factorized() takes 2 positional arguments but 3 were given
---------------------------------------------------------------------------

 

 

1 Reply
EarlMedina
Esri Regular Contributor

I believe you can identify which records intersect one another with a spatial join. In my small test case, this works and returns the expected records:

 

 

from arcgis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://machine.domain.com/portal", "user", "pass")
fl_url = "https://src.domain.com/server/rest/services/test/FeatureServer/0"
fl = FeatureLayer(fl_url, gis)
df = fl.query(as_df=True)

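# Spatially join the dataframe to a copy of itself so that every
# record is paired with each record it intersects.
joined = df.spatial.join(
    df.copy(),
    how="left",
    op="intersects",
    left_tag="left",
    right_tag="right"
)

# Every record intersects itself, so discard the self-matches and
# keep only the pairs made up of two different records.
intersecting_records = joined.loc[joined.OBJECTID_left != joined.OBJECTID_right]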
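
To tie this back to the original goal, here is a minimal sketch of how the IDs from the join could be used to limit the WKT conversion to the intersecting rows before dropping duplicates. I have not run it against a real feature layer. `FakeGeom` and `drop_duplicate_shapes` are hypothetical names made up for this example, and `FakeGeom` only stands in for a geometry object that exposes a `.WKT` attribute, so the snippet runs with pandas alone.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class FakeGeom:
    """Hypothetical stand-in for a geometry; only the .WKT attribute is used."""
    WKT: str


def drop_duplicate_shapes(sdf, candidate_ids, id_col="OBJECTID", shape_col="SHAPE"):
    """Drop rows whose geometry WKT repeats the WKT of an earlier row.

    Only rows whose ID is in candidate_ids are converted to WKT.
    """
    is_candidate = sdf[id_col].isin(candidate_ids)
    # getattr() maps missing geometries (None/NaN) to None instead of raising.
    wkt = sdf.loc[is_candidate, shape_col].apply(lambda geom: getattr(geom, "WKT", None))
    # Flag repeats of an earlier WKT; missing geometries are never flagged.
    repeats = wkt.notna() & wkt.duplicated(keep="first")
    return sdf.drop(index=wkt[repeats].index)


sdf = pd.DataFrame(
    {
        "OBJECTID": [1, 2, 3, 4],
        "SHAPE": [
            FakeGeom("POINT (0 0)"),
            FakeGeom("POINT (0 0)"),  # exact duplicate of OBJECTID 1
            FakeGeom("POINT (5 5)"),  # not a candidate, so never converted
            None,                     # missing geometry
        ],
    }
)

deduped = drop_duplicate_shapes(sdf, candidate_ids={1, 2})
print(deduped["OBJECTID"].tolist())  # [1, 3, 4]
```

With the join result above, the candidate IDs could presumably be taken as `set(intersecting_records["OBJECTID_left"])`, since a self-join pairs records in both directions. Note that comparing WKT text only catches geometries whose WKT strings are exactly identical.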

 

 

Hope this helps! This was a really good question that I'm sure many will get value from.