ArcPy - Check a field type and execute depending on the type

03-13-2015 11:00 AM
JohnSemanchik
New Contributor II

Hello,

I'm not sure if this is the right place to post this, so please correct me if I'm wrong.

I had a script that converted data from a client's site to a usable GIS format until they changed the output.  I've tried two different approaches with no luck.  Because of how an earlier part of the script works (CSV to Table) and the nature of the data being converted, the 'total_length' field might be imported as either a string or an integer (so a straight TableToTable conversion will not work).  I need the script to check the field type: if it's already an integer, populate the 'length_val' field with the number; if it's a string, strip it down to the number (typically a greater-than, less-than, or tilde off the front) and then populate 'length_val'.

In other words:

If 'total_length' = STRING, then send only the number to 'length_val'

Otherwise, if 'total_length' = INTEGER, then calculate the field. 

I've been using update cursors to do a lot of the rest of the work, but that doesn't have to be the case.  Any assistance would be great. 

Thanks,

John

Row[3] = 'total_length'

Row[17] = 'length_val'
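The per-row cleanup I'm picturing boils down to something like this (plain Python, the helper name is made up, just to illustrate the logic):

```python
def clean_length(value):
    """Illustrative helper (not in my script): return the numeric part of a
    total_length value, whether the field came in as LONG or as TEXT."""
    if value is None:
        return None
    if isinstance(value, int):
        return value  # already a number, pass it straight through
    # strip any leading '~ ', '> ' or '< ' before converting
    return int(value.lstrip('~<> '))
```

Inside the update cursor that would just be row[17] = clean_length(row[3]).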

---First Attempt---

with arcpy.da.UpdateCursor(aoi_table_gdb, aoi_fields) as cursor:
    totlen = "total_length"  # <--- This was just an idea
    # Check the field type once, up front (field.type is a string like 'String')
    is_string = False
    for field in arcpy.ListFields(aoi_table_gdb):
        if field.name == totlen and field.type == 'String':
            is_string = True
    # Single pass over the cursor; nesting the row loop inside the field
    # loop exhausts the cursor on the first field
    for row in cursor:
        if not is_string:
            row[17] = row[3]
        elif row[3] is None:
            row[17] = None
        elif row[3].find('~ ') > -1:
            row[17] = row[3].lstrip('~ ')
        elif row[3].find('> ') > -1:
            row[17] = row[3].lstrip('> ')
        elif row[3].find('< ') > -1:
            row[17] = row[3].lstrip('< ')
        else:
            row[17] = row[3]
        cursor.updateRow(row)  # must run for every row, inside the loop

---Second Attempt---

with arcpy.da.UpdateCursor(aoi_table_gdb, aoi_fields) as cursor:
    for row in cursor:
        #Length Value
        if row[3] is None:
            row[17] = None
            arcpy.AddWarning("A. Row: " + str(row[1]) + " - " + str(row[3]))
        elif str(row[3]).isdigit():  # cast first: .isdigit() fails on an int
            row[17] = row[3]
            arcpy.AddWarning("B. Row: " + str(row[1]) + " - " + str(row[3]))
        else:
            length = filter(unicode.isdigit, row[3])
            row[17] = length
            arcpy.AddWarning("C. Row: " + str(row[1]) + " - " + str(row[3]))
        cursor.updateRow(row)  # this was missing, so no edits were saved

12 Replies
JamesCrandall
MVP Frequent Contributor

Outside of the ESRI stack you can take care of the unwanted characters before converting to your gdb table.  This is just an example of reading the .csv and using string manipulation to remove the '>', '<' or '~' characters from the total_length field, then using the arcpy.da.NumPyArrayToTable method to get it back into ESRI landia.


The ztest.csv is just three columns and three rows for testing, consisting of:

total_length,IntCol,TextCol
<999.99,999,Some Text
~111.89,123,Some more text
>1.42,321,Even more text

import arcpy
import numpy
import pandas as pd

df = pd.read_csv(r'H:\ztest.csv')  # raw string so the backslash stays literal
df['total_length'] = df['total_length'].map(lambda x: x.lstrip('<>~'))
# index=False keeps the DataFrame index out of the record array so the
# dtype below lines up with the three CSV columns
dfarr = numpy.array(df.to_records(index=False),
                    numpy.dtype([('total_length', '<f8'),
                                 ('IntCol', numpy.int32),
                                 ('TextCol', '|S25')]))
tab = r'in_memory\dfarr_tab'
if arcpy.Exists(tab):
    arcpy.Delete_management(tab)

arcpy.da.NumPyArrayToTable(dfarr, tab)
with arcpy.da.SearchCursor(tab, '*') as cursor:
    for row in cursor:
        print str(row[0]) + " " + str(row[1]) + " " + str(row[2]) + " " + str(row[3])
JoshuaBixby
MVP Esteemed Contributor

If there is no chance that floats are involved, the str.isdigit method can be used like you already tried.  Instead of type checking for int, str, or unicode, you can just cast the field as a string and process it like a string.

row[17] = None
if row[3] is not None:
    row[17] = int(filter(unicode.isdigit, unicode(row[3])))

If a float could show up in the string fields, you could go with a regular expression approach.

import re

row[17] = None
if row[3] is not None:
    f = re.findall(r"[-+]?\d*\.\d+|\d+", unicode(row[3]))[0]
    f = float(f)
    row[17] = int(f)

The above regular expression is abridged in that it doesn't capture scientific notation, exponents, etc.

There is a good discussion on extracting floats from Python strings over at Stack Overflow:  How to extract a floating number from a string in Python [duplicate].
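For instance, run against the sorts of values in this thread, that expression pulls out the number as expected (extract_int is just an illustrative name for the steps above):

```python
import re

# the regular expression from above; doesn't handle scientific notation
FLOAT_RE = r"[-+]?\d*\.\d+|\d+"

def extract_int(text):
    """Grab the first number in a string like '< 999.99' and truncate to int."""
    f = float(re.findall(FLOAT_RE, text)[0])
    return int(f)
```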

JohnSemanchik
New Contributor II

Wow, you guys are awesome.  I've been combing through all of this and still running into a few issues, but this has been exciting. 

Blake, I tried to incorporate your code, but the values are all dropping through to None. 

I'm going to look at the last few examples/ideas and see if I can get them hashed out.  I'd like to force the CSV to make the "total_length" column a string from the start, but unfortunately the CSVtoTable code I'm using runs a check on the data and types the columns according to what the data are.  So if a column is all integers it becomes a LONG, but if it has a string in it then it defaults to TEXT.  We use it to convert other tables, so I thought I'd incorporate it here.  (It actually solves a separate issue with this entire conversion, but in this case it was one step forward to take two back.)
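If reading the CSV directly ever becomes an option, pandas can apparently be told up front to treat the column as text regardless of its contents; a sketch, with an inline stand-in for the real file and its values:

```python
import pandas as pd
from io import StringIO

# Stand-in for the client CSV; the values here are illustrative only
csv_text = u"total_length,IntCol\n120,1\n<999.99,2\n~111,3\n"

# dtype=str forces 'total_length' in as text even when every value happens
# to be numeric, so the same string cleanup runs no matter what arrives
df = pd.read_csv(StringIO(csv_text), dtype={'total_length': str})
cleaned = df['total_length'].str.lstrip('<>~ ')
```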

Thanks everyone for your help so far and when I have it figured out I'll post the result and mark the correct answer. 

Have a great weekend!
