Unable to convert shapefile using pandas dataframe from_featureclass

07-21-2023 08:16 AM
DanielStoelb
Occasional Contributor III

I'm running a script to process an attached zipped shapefile, which converts it to a featureset to append into a hosted feature service. When processing the zipped shapefile, I'm running into an error when using the pandas.DataFrame.spatial.from_featureclass method. Here's the error that I'm getting:

Column SHAPE does not exist

Here's the extract of my code:

 

import os
import shutil

import pandas as pd
# Importing these registers the .spatial accessor on pandas DataFrames.
from arcgis.features import GeoAccessor, GeoSeriesAccessor

# Note: logger and download_attachment() are defined elsewhere in the script.
def extract_geometry_from_shp(layer, objectid, attachmentid):
    # Download the zipped shapefile attachment and convert it to a featureset.
    attachment_folder = os.path.join(os.getcwd(), "attachments")
    file_path = download_attachment(layer=layer, oid=objectid, attachmentid=attachmentid, download_folder=attachment_folder)
    extract_path = os.path.join(attachment_folder, f"{objectid}_{attachmentid}")
    logger.info(f"extracting shapefile(s) to {extract_path}")
    shutil.unpack_archive(file_path, extract_path)
    logger.info("extracted shapefile(s)")
    for shp_file in os.listdir(extract_path):
        if shp_file.endswith(".shp"):
            shp_file = os.path.join(extract_path, shp_file)
            logger.info(f"converting to featureset - shapefile: {shp_file}")
            # This is the call that raises "Column SHAPE does not exist".
            sdf = pd.DataFrame.spatial.from_featureclass(shp_file)
            logger.debug(sdf.tail())
            f_set = sdf.spatial.to_featureset()
            logger.debug(f_set)
            # Only the first shapefile found in the archive is converted.
            return f_set
    return None  # no .shp file was found in the archive

 

To further clarify, I am running this in an ArcGIS Notebook. This used to work last year, but seems to not be working any more.

Further, the zipped shapefiles can be added using the "Add Data" tool in a Web AppBuilder product we have.

Thoughts?
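
One quick check that can be run before the conversion step is to confirm that the attached archive actually contains the three mandatory shapefile components (.shp, .shx and .dbf). The following is only a minimal sketch using the standard library; the function name missing_shapefile_parts is hypothetical and is not part of the script above or of the ArcGIS API for Python.

```python
import zipfile
from pathlib import PurePosixPath

# The three components every shapefile needs (.prj, .cpg, etc. are optional).
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_path):
    """Return {shapefile path without extension: sorted list of missing parts}.

    Hypothetical helper for illustration only. Base names are compared
    case-sensitively, so "A.SHP" and "a.dbf" count as different shapefiles.
    """
    found = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            member = PurePosixPath(name)  # zip members always use "/" separators
            ext = member.suffix.lower()
            if ext in REQUIRED_EXTENSIONS:
                # Key on folder + base name so nested shapefiles stay separate.
                key = str(member.with_suffix(""))
                found.setdefault(key, set()).add(ext)
    return {stem: sorted(REQUIRED_EXTENSIONS - exts) for stem, exts in found.items()}
```

An empty list for a given shapefile means all three components are present; it does not say anything about whether their contents are valid.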

1 Solution

Accepted Solutions
DanielStoelb
Occasional Contributor III

I was able to resolve this issue by upgrading the ArcGIS Online notebook to the advanced runtime (I was originally using Standard). 

A question for Esri folks that are monitoring these threads - are there any documents out there online that indicate why it needs to be advanced? Is there a different way that the pandas.DataFrame.spatial.from_featureclass() function is run in standard versus advanced? I could not find anything in the documentation online referencing this, unless I’ve missed something on my end.

I don’t know how many other users are affected by this change, but it seems like it would be major, especially for those that have had scripts running in ArcGIS Online notebooks for a while.

9 Replies
DanPatterson
MVP Esteemed Contributor

Code formatting ... the Community Version - Esri Community

will provide line numbers to facilitate comment and reference


... sort of retired...
DanielStoelb
Occasional Contributor III

Just edited it. Hadn't posted code in a forum in a long time. Thanks for the reminder.

Brian_Wilson
Occasional Contributor III

The following all assumes your shapefile might still be readable, just not by the tool you selected.

Some things you have probably already tried 🙂

-- Can you just open it in Pro by adding it to a map?

-- Can you use a Feature to Feature copy to create another feature class and then feed that to pandas?

-- Can you try bypassing "arcgis" and use the "shapefile" module (installed as the pyshp package) instead?

 

 

import pandas as pd
import shapefile
sf = shapefile.Reader("Oregon.shp", encoding = 'utf-8')
fields = [x[0] for x in sf.fields][1:]
print(fields)
records = sf.records()
df = pd.DataFrame(records, columns = fields)
# but this is not spatial data, just attributes...
# probably not what you want but fast and easy!!

 

 

You should be able to manipulate it with geopandas too, but first you have to install it. Conda was taking so long to "solve the environment" with the default environment that I just made a new one temporarily with "conda create --name=geopandas -c conda-forge geopandas"

Then, after running "conda activate geopandas", you could try this:

 

import geopandas as gp
df = gp.read_file('Oregon.shp')
df.to_file('Oregon_copy.shp')

 

Just three lines to read and write a feature class... Of course you can do the usual Pandas dataframe manipulations on "df". Geopandas is not arcpy, so it won't be the same but it's nice to have alternatives when one thing does not work. Geopandas depends on "Fiona" to convert file formats so in theory you should be able to write output to a FGDB.

Brian_Wilson
Occasional Contributor III

I upgraded to Pro 3 today, and with the latest arcpy I was able to install geopandas into the same environment, very convenient.

DanielStoelb
Occasional Contributor III

I'm getting an issue when trying to use the shapefile method: "Shapefile dbf header lacks expected terminator."

The files that we are attempting to import appear to have blank dbf files. I tried adding one to Pro, but it won't work, stating "Empty Path Specified". I was just attempting to add the zip file directly to a new map.

I've attached an example zip file for reference.
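
To see what is actually in a suspect .dbf file, its fixed-size header can be read directly. The sketch below assumes the standard dBASE III layout (record count at bytes 4-7, header length at bytes 8-9, record length at bytes 10-11, and a single 0x0D byte closing the field descriptor list). The function name read_dbf_header is hypothetical, and this is a diagnostic aid only, not how any of the libraries in this thread necessarily parse the file.

```python
import struct


def read_dbf_header(dbf_path):
    """Summarize a dBASE header and check its field-list terminator byte.

    Hypothetical diagnostic helper; assumes the dBASE III header layout.
    """
    with open(dbf_path, "rb") as f:
        header = f.read(32)
        if len(header) < 32:
            raise ValueError(f"{dbf_path}: too short to hold a dBASE header")
        # Little-endian: uint32 record count, uint16 header length, uint16 record length.
        record_count, header_length, record_length = struct.unpack("<IHH", header[4:12])
        if header_length >= 33:
            # The terminator is the last byte of the header.
            f.seek(header_length - 1)
            terminator = f.read(1)
        else:
            terminator = b""  # header length is too small to be valid
    return {
        "record_count": record_count,
        # 32 header bytes + 32 bytes per field + 1 terminator byte.
        "field_count": max(0, (header_length - 33) // 32),
        "record_length": record_length,
        "terminator_ok": terminator == b"\r",
    }
```

A result with terminator_ok set to False would be consistent with the "lacks expected terminator" message, while a record_count of 0 would indicate a table with a valid structure but no rows.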

Brian_Wilson
Occasional Contributor III

You need to provide a DIFFERENT zip because that one works!! 🙂 Yes, the DBF file is empty.  I copied the DBF file out of the folder and added it to a Pro map as a Standalone Table to make sure it has a row in it. It has a single column OID with a value of 0.

I unzipped the file and rewrote your code. Here it is, as I ran it in a notebook.

import pandas as pd
import arcgis
import os
def extract_geometry_from_shp(path):
    os.chdir(path)
    for shp_file in os.listdir('.'):
        if shp_file.endswith(".shp"):
            print(shp_file)
            sdf = pd.DataFrame.spatial.from_featureclass(shp_file)   
            f_set = sdf.spatial.to_featureset() 
            return f_set
    return None
path="C:/Users/bwilson/Downloads/072922 Corey Fire"
df=extract_geometry_from_shp(path)
df

dfc620cf6643426fb2e01d5e33f56a91.shp
<FeatureSet> 1 features

 No errors..?? I am disappointed.

DanielStoelb
Occasional Contributor III

I am not sure why it isn't working on my end. Using that same shapefile and attempting to run this in an ArcGIS Online notebook, it fails to execute properly, stating it can't find a SHAPE field. Could the ArcGIS Online notebook environment be the root cause of this? It properly finds the file using my earlier code, but seems to fail during the sdf = pd.DataFrame.spatial.from_featureclass(shp_file) step.

And here are the logs pertaining to it:

2023-07-24 21:00:51,264: INFO - extracting shapefile(s) to /arcgis/home/attachments/220_115
2023-07-24 21:00:51,274: INFO - extracted shapefile(s)
2023-07-24 21:00:51,276: INFO - converting to featureset - shapefile: /arcgis/home/attachments/220_115/dfc620cf6643426fb2e01d5e33f56a91.shp
2023-07-24 21:00:51,281: ERROR - Column SHAPE does not exist

NicholasGiner1
Esri Contributor

Hi Daniel, the big difference between the "Standard" and "Advanced" Notebook runtimes is that Advanced contains the ArcPy library. If you read the Geometry Engines section of the Python API doc on the spatially enabled DataFrame (SeDF), you'll see that the Python API automatically chooses a geometry engine based on whether ArcPy is present. So it's possible that this is why you experienced different behavior when switching from the Standard to the Advanced runtime.

thanks,

Nick Giner

Product Manager, ArcGIS Notebooks
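
For anyone comparing runtimes, a rough way to see which geometry-related packages a given notebook environment exposes is to test whether they can be imported. This is only a sketch: the candidate list (arcpy, shapely, shapefile) is an assumption about which packages are relevant, the function name available_geometry_backends is hypothetical, and the check does not show which engine the Python API actually selects.

```python
import importlib.util


def available_geometry_backends(candidates=("arcpy", "shapely", "shapefile")):
    """Report which of the named top-level packages can be imported here.

    Hypothetical helper; the default candidate names are an assumption.
    """
    # find_spec returns None when a top-level module cannot be located.
    return {name: importlib.util.find_spec(name) is not None for name in candidates}


print(available_geometry_backends())
```

Running this in both a Standard and an Advanced notebook and comparing the output would show whether ArcPy availability is the only difference among these packages.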
