Overwriting hosted table in ArcGIS Online with CSV file from Pandas dataframe in Python

FredMitchell · ‎04-01-2024

I have a script that I'm using to sort through a very large feature layer with thousands of records and create a summary table of them by type. It also calculates the count of records of the last 5 years to the current date for each year rather than the entire calendar year and then calculates the average. My output table is the type, average, and current count for the year. Right now I have this outputting to a CSV but I'm trying to get that CSV to overwrite a hosted feature table. I've referenced some posts on the Esri Community forums and it seems like this should be possible but I keep getting an error. Here is my code:

import arcpy, os, uuid
import pandas as pd
import datetime
from datetime import timedelta
from arcgis.features import GeoAccessor, FeatureLayer
from arcgis.gis import GIS

# variables

url = 'arcgis online url'
username = 'username'
password = 'password'
hostedTableID = 'item ID'
inputfc = r'feature layer from sde on server'
outputCSVFile = r'csv in folder on server'

# get table from AGOL
gis = GIS(url, username, password)
CrimeTrendsTable = gis.content.get(hostedTableID)
CrimeTrendsTableLyr = CrimeTrendsTable.tables[0]
crimeLayer = FeatureLayer(CrimeTrendsTableLyr.url, gis=gis)

# truncate table
crimeLayer.manager.truncate()

# import feature class and create slice of the data set in a new dataframe
df = pd.DataFrame.spatial.from_featureclass(inputfc)
df = df.sort_values(['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE'], ascending= [True, True, True], ignore_index=True)
df = df.loc[:, ['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE']]

# set variables for today and the previous 5 years
today = datetime.datetime.today()
todate1 = today - timedelta(365)
todate2 = today - timedelta(730)
todate3 = today - timedelta(1095)
todate4 = today - timedelta(1460)
todate5 = today - timedelta(1825)
this_year = datetime.datetime.today().year
one_year_ago = this_year - 1
two_years_ago = this_year - 2
three_years_ago = this_year - 3
four_years_ago = this_year - 4
five_years_ago = this_year - 5

# create new dataframes for current and each previous year to date
df0 = df.loc[(df['OFFENSES_YEAR']==this_year) & (df['FROM_DATE'] < today)]
df1 = df.loc[(df['OFFENSES_YEAR']==one_year_ago) & (df['FROM_DATE'] < todate1)]
df2 = df.loc[(df['OFFENSES_YEAR']==two_years_ago) & (df['FROM_DATE'] < todate2)]
df3 = df.loc[(df['OFFENSES_YEAR']==three_years_ago) & (df['FROM_DATE'] < todate3)]
df4 = df.loc[(df['OFFENSES_YEAR']==four_years_ago) & (df['FROM_DATE'] < todate4)]
df5 = df.loc[(df['OFFENSES_YEAR']==five_years_ago) & (df['FROM_DATE'] < todate5)]

# create tables for each current and previous year grouped by crime type, add together previous year tables and average
table0 = df0.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count().reset_index()\
            .rename(columns={"OFFENSES_YEAR" : this_year})
table1 = df1.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table2 = df2.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table3 = df3.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table4 = df4.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table5 = df5.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
avgtable = pd.concat([table1, table2, table3, table4, table5]).groupby('CRIME_STAT_TYPE').mean().reset_index()\
             .rename(columns={"OFFENSES_YEAR" : "previous5yearAverage"})
trendstable = avgtable.join(table0.set_index('CRIME_STAT_TYPE'), on='CRIME_STAT_TYPE').fillna(0)
trendstable.to_csv(outputCSVFile, sep='\t', encoding='utf-8')

# update hosted table from csv file
csvDF = GeoAccessor.from_table(outputCSVFile)
adds_fs = csvDF.spatial.to_featureset()
adds_dict = adds_fs.to_dict()
adds = adds_dict["features"]
crimeLayer.edit_features(adds=adds)

Here's an example of some data from my csv table:

	CRIME_STAT_TYPE	previous5yearAverage	2024
0	AGGRAVATED ASSAULT	12.000000	9.0
1	ALL OTHER OFFENSES	54.800000	46.0
2	ANIMAL CRUELTY	1.666667	1.0
3	ARSON	1.000000	0.0
4	BAD CHECKS	1.750000	1.0
5	BURGLARY/BREAKING AND ENTERING	44.400000	50.0
45	WIRE FRAUD	6.000000	0.0

This is the error I'm getting when I run the adds_fs = csvDF.spatial.to_featureset() line:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
In  [60]:
Line 1:     adds_fs = csvDF.spatial.to_featureset()

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in to_featureset:
Line 3573:  return FeatureSet.from_dict(self.__feature_set__)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in __feature_set__:
Line 3297:  if self.sr is None:

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in sr:
Line 3503:  for g in self._data[self.name]

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\frame.py, in __getitem__:
Line 3505:  indexer = self.columns.get_loc(key)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\indexes\base.py, in get_loc:
Line 3631:  raise KeyError(key) from err

KeyError: None
---------------------------------------------------------------------------

Also here is a screenshot of the CSV. I think something is happening with how pandas is creating the CSV. When trying to read the CSV file, it is not being read correctly. See below:

I'm wondering if it has something to do with the delimiter being used or something else?

I'm very new to all of this so any help I could get would be greatly appreciated.

FredMitchell · ‎04-01-2024

Hi @EarlMedina,

I've got some code that works finally! By taking the dataframe and putting it into an XLS file and then pushing it to a table in a file geodatabase I was able to append to the hosted table from there:

# update hosted table from xls file
trendstable.to_excel(outputXLSFile)
trendstable = arcpy.ExcelToTable_conversion(outputXLSFile, outputTable, 'Sheet1')
trendstable = arcpy.management.Append(inputs=[outputTable], target=CrimeTrendsTableLyr.url)[0]

I'm still not sure why the CSV wouldn't work but I'm glad to have a solution!

View solution in original post

FredMitchell · ‎04-01-2024

I also tried it this way and got a different error:

csvDF = pd.read_csv(outputCSVFile, sep=',')
adds = csvDF.to_dict("records")
crimeLayer.edit_features(adds=adds)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
In  [22]:
Line 1:     crimeLayer.edit_features(adds=adds)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\layer.py, in edit_features:
Line 3408:  return self._con.post_multipart(path=edit_url, postdata=params)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_con\_connection.py, in post_multipart:
Line 1270:  return self._handle_response(

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_con\_connection.py, in _handle_response:
Line 1008:  self._handle_json_error(data["error"], errorcode)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_con\_connection.py, in _handle_json_error:
Line 1031:  raise Exception(errormessage)

Exception: Cannot perform operation. Invalid operation parameters.
'adds' parameter is invalid
Object reference not set to an instance of an object.
(Error Code: 400)
---------------------------------------------------------------------------

EarlMedina · ‎04-01-2024

Hey, my initial guess from the key error is there's a missing column header. I noticed you are exporting to csv without taking out the index column. Can you try changing this line

trendstable.to_csv(outputCSVFile, sep='\t', encoding='utf-8')

to

trendstable.to_csv(outputCSVFile, sep='\t', encoding='utf-8', index=False)

FredMitchell · ‎04-01-2024

Hi @EarlMedina,

Thanks so much for your reply! I noticed this earlier today and thought that might be an issue as well and so I tried it with index=False and with header=False. I also tried not defining the separator. When I open the csv in excel all the data is there. However, when I open it in ArcGIS Pro some of the fields show all the rows as Null. I wonder if this might have something to do with why it's not working. Same thing happens when I try loading it into the GeoAccessor. I recently tried re-working my code to push the dataframe to an XLS instead of a CSV and then to a table. This seems to be working but I'm still not sure why the CSV is loading with null values. I'll post my code with the XLS work around if I can get it to fully work.

FredMitchell · ‎04-01-2024

Hi @EarlMedina,

I've got some code that works finally! By taking the dataframe and putting it into an XLS file and then pushing it to a table in a file geodatabase I was able to append to the hosted table from there:

# update hosted table from xls file
trendstable.to_excel(outputXLSFile)
trendstable = arcpy.ExcelToTable_conversion(outputXLSFile, outputTable, 'Sheet1')
trendstable = arcpy.management.Append(inputs=[outputTable], target=CrimeTrendsTableLyr.url)[0]

I'm still not sure why the CSV wouldn't work but I'm glad to have a solution!