Hello,
I'm not sure if this is the right place to send this so please correct me if I am wrong.
I had a script that converted data from a clients site to a usable GIS format until they changed the output. I've tried two different directions with no luck. Due to how an earlier part of the script works, CSV to Table, and the nature of the data being converted; the 'total_length' field that's imported might end up being a string or an integer (TableToTable conversion will not work). I need it to look at the field type and if it's already an integer then populate the 'length_val' field with the number. If it's a string then strip it down to the number (typically the greater than, less than or tilde off the front) and populate the 'length_val' field.
In other words:
If 'total_length' = STRING, then send only the number to 'length_val'
Otherwise, if 'total_length' = INTEGER, then calculate the field.
I've been using update cursors to do a lot of the rest of the work, but that doesn't have to be the case. Any assistance would be great.
Thanks,
John
Row[3] = 'total_length'
Row[17] = 'length_val'
---First Attempt---
with arcpy.da.UpdateCursor(aoi_table_gdb,aoi_fields) as cursor:
totlen = "total_length" #<--- This was just an idea
fields = arcpy.ListFields(aoi_table_gdb)
for field in fields:
if field.name = '"'+ totlen +'"' and field.type = String:
for row in cursor:
if row[3] == None:
row[17] = None
elif row[3].find(str('~ ')) > -1:
row[17] = row[3].lstrip(str('~ '))
elif row[3].find(str('> ')) > -1:
row[17] = row[3].lstrip(str('> '))
elif row[3].find(str('< ')) > -1:
row[17] = row[3].lstrip(str('< '))
else:
row[17] = row[3]
else:
for row in cursor:
row[17] = row[3]
cursor.updateRow(row)
---Second Attempt---
with arcpy.da.UpdateCursor(aoi_table_gdb,aoi_fields) as cursor:
for row in cursor:
#Length Value
if row[3] == None:
row[17] = None
arcpy.AddWarning("A. Row: " + str(row[1]) + " - " + str(row[3]))
elif row[3].isdigit():
row[17] = row[3]
arcpy.AddWarning("B. Row: " + str(row[1]) + " - " + str(row[3]))
else:
length = filter(unicode.isdigit, row[3])
row[17] = length
arcpy.AddWarning("C. Row: " + str(row[1]) + " - " + str(row[3]))
Outside of ESRI stack you can take care of the unwanted characters before converting to your gdb table. This is just an example of reading the .csv and using string manipulation to remove the '>, < or ~ characters from the total_length field, then using the arcpy.da.NumPyArrayToTable method to get it back into ESRI landia.
The ztest.csv is just 3 cols and 3 rows for testing consisting of:
total_length,IntCol,TextCol
<999.99,999,Some Text
~111.89,123,Some more text
>1.42,321,Even more text
import arcpy import numpy import pandas as pd df = pd.read_csv('H:\ztest.csv') df['total_length'] = df['total_length'].map(lambda x: x.lstrip('<>~')) dfarr = numpy.array(df.to_records(), numpy.dtype([('total_length', '<f8'),('IntCol', numpy.int32),('TextCol', '|S25')])) tab = 'in_memory\dfarr_tab' if arcpy.Exists(tab): arcpy.Delete_management(tab) arcpy.da.NumPyArrayToTable(dfarr, tab) with arcpy.da.SearchCursor(tab, '*') as cursor: for row in cursor: print str(row[0]) + " " + str(row[1]) + " " + str(row[2]) + " " + str(row[3])
If there is no chance that floats are involved, the str.isdigit method can be used like you already tried. Instead of type checking for int, str, or unicode; you can just cast the field as a string and process it like a string.
row[3] = None if row[17] is not None: row[3] = int(filter(unicode.isdigit, unicode(row[17])))
If a float could show up in the string fields, you could go with a regular expression approach.
row[3] = None f = re.findall(r"[-+]?\d*\.\d+|\d+", unicode(row[17]))[0] f = float(f) row[3] = int(f)
The above regular expression is abridged in that it doesn't capture scientific notation, exponents, etc....
There is a good discussion on extracting floats from Python strings over at Stackoverflow: How to extract a floating number from a string in Python [duplicate].
Wow, you guys are awesome. I've been combing through all of this and still running into a few issues, but this has been exciting.
Blake, I tried to incorporate your code, but the values are all dropping through to None.
I'm going to look at the last few examples/ideas and see if I can get them hashed out. I'd like to force the CSV to make the "total_length" column a string from the start, but unfortunately the CSVtoTable code I'm using runs a check on the data and makes the columns what the data are. So if it's all integers then it makes it a "LONG" but if it's got a string in it then it defaults to TEXT. We use it to convert other tables so I thought I'd incorporate it here. (It's actually to solve a separate issue with this entire conversion, but in this case it was one step forward to take two back).
Thanks everyone for your help so far and when I have it figured out I'll post the result and mark the correct answer.
Have a great weekend!