Update cursor with joined tables work around w/ dictionaries

6304
14
Jump to solution
04-12-2012 09:24 AM
MathewCoyle
Frequent Contributor
This post could easily be called "How I fell in love with dictionaries"

Drawing the idea from this post http://forums.arcgis.com/threads/52511-Cool-cursor-dictionary-constructor-one-liner

I've come up with a solution to a nagging problem I know I have been having, and I believe some others have as well, of not being able to reliably use an update cursor when dealing with joined tables. I was really happy with my first foray into dictionaries, and I thought I'd share my work around for anyone looking to optimize some tedious processing with joins. My data was ~900k rows of forest stand data in one table, and a strata reference table of ~50 rows to calculate volumes. My previous method of using a permanent JoinField, processing, then deleting those fields, took approximately 3.5 hours. Temporary joins never worked for me in the manner I needed. Using dictionaries instead of joins, that time was reduced to under 15 minutes.

This code goes through any table and creates a list of field names for every field other than OID and the key field you want to reference.

Here is the fairly complete code to create the dictionary
    print "Starting function"     # Define and setup variables, tables, key field etc     calc_table = arcpy.MakeTableView_management(table_path)     vol_tab = join_table_path     strata_tab = "in_memory/temp"     arcpy.MakeTableView_management(vol_tab, strata_tab)     joinField = "STRATA"          # Create list of value fields, leaving out OID field and key/join field     flistObj = arcpy.ListFields(strata_tab)     flist = []     for f in flistObj:         if f.type != "OID" and f.name != joinField:             flist.append(f.name)      # Create empty dict object then populate each key with a sub dict by row using value fields as keys     strataDict = {}      for r in arcpy.SearchCursor(strata_tab):         fieldvaldict = {}         for field in flist:             fieldvaldict[field] = r.getValue(field)         strataDict[r.getValue(joinField)] = fieldvaldict      del strata_tab, flistObj


In the update cursor you can then either explicitly reference dictionary objects like this
    rows = arcpy.UpdateCursor(calc_table, "\"%s\" IS NOT NULL" % joinField)     for row in rows:         strata = row.getValue(joinField)         variable = strataDict[strata]["sub_key_field"]


What I did was use a reference list to reference the dictionary to keep things legible, and so I could remember what went where. This may not even be necessary for some people, but it helped me conceptually. Without getting in to too much detail, here's essentially my update cursor sans the actual calculations.
    species = [     ("C","Fb","FB_STEMS"),("C","Sw","SW_STEMS"),("C","Pj","PJ_STEMS"), # 0,1,2     ("C","Pl","PJ_STEMS"),("C","Lt","LT_STEMS"),("C","Sb","SB_STEMS"), # 3,4,5     ("D","Bw","BW_STEMS"),("D","Aw","AW_STEMS"),("D","Pb","PB_STEMS")  # 6,7,8     ]     sp_fields = [("SP1","SP1_PER"),("SP2","SP2_PER"),("SP3","SP3_PER"),     ("SP4","SP4_PER"),("SP5","SP5_PER")]     print "Beginning updates"     rows = arcpy.UpdateCursor(calc_table, "\"%s\" IS NOT NULL" % joinField)     for row in rows:         strata = row.getValue(joinField)         for sp, per in sp_fields:             sp_type = row.getValue(sp)             spp_f = float(row.getValue(per))             if spp_f > 0:                 for grp, spec, stem in species:                     stem_f = strataDict[strata][stem]                     (...)


Hopefully that didn't get too convoluted, anyone else have anything that might contribute in terms of optimization?
Tags (2)
14 Replies
LisaMay
New Contributor
I'm creating a dictionary from featureclasses - emassDict =
{u'1': [2009621.0, 2009622.0, 2009624.0, 2009623.0, 2009625.0, 2009626.0, 2009627.0]}{u'2': [2009633.0]}{u'3': [2009632.0, 2009631.0, 2009630.0, 2009629.0, 2009628.0]}{u'4': [2009617.0, 2009611.0, 2009610.0, 2009614.0, 2009620.0, 2009612.0, 2009616.0, 2009615.0, 2009613.0, 2009618.0, 2009607.0, 2009605.0, 2009619.0, 2009609.0, 2009606.0, 2009608.0]}{u'5': [2009604.0, 2009601.0, 2009600.0, 2009603.0, 2009602.0]}{u'6': [2009100.0]}{u'7': [2009009.0]}{u'8': [2009004.0, 2009005.0, 2009007.0, 2009008.0, 2009001.0, 2009003.0, 2009002.0, 2009006.0]}{u'9': [2009500.0]}

In this same script I want to update one of the fields "iField" with the values from the dictionary - if the key matches the values in another field "eZoneName".    The dictionary is being created, but "iField" is not being populated.  I'm not receiving any error messages so it has to be in the logic, but I can't see it.  Please help, the total script is here:

eZones = r"C:\temp\NLF.gdb\NLF_EM_2009_Dissolve"
eZoneName = str("UniqueID")
iField = "All_EM_List"

eIncidents = r"C:\temp\NLF.gdb\NLF_EM_2009_Identity"
emNameField = ("E_MASS")
joinField = "Dissolve_FID"
arcpy.MakeFeatureLayer_management(eIncidents, "eIncidentsLayer")
    
with arcpy.da.UpdateCursor(eZones, (eZoneName, iField)) as zoneRows:
    for zone in zoneRows:
        eZoneNameString = zone[0]
        queryString = '"' + eZoneName + '" = ' + "'" + eZoneNameString + "'"

        arcpy.MakeFeatureLayer_management(eZones, "CurrenteZonesLayer", queryString)

        try:
            arcpy.SelectLayerByLocation_management("eIncidentsLayer", "CONTAINED_BY", "CurrenteZonesLayer")
            
            emassDict = dict()
            for row in arcpy.SearchCursor("eIncidentsLayer"):
                emName = row.getValue(emNameField)
                snName = row.getValue(joinField)
                
                if snName in emassDict:
                    emassDict[snName].append(emName)
                else:
                    emassDict[snName] = [emName]
                    
            print emassDict


            if  eZoneNameString == [snName]:
                zone[1] = [emName]
                zoneRows.updateRow(zone)
                
        except arcpy.ExecuteError:
            print(arcpy.GetMessages(0))

        finally:
            arcpy.Delete_management("CurrenteZonesLayer")

arcpy.Delete_management("eIncidentsLayer")
del zone, zoneRows
0 Kudos
MathewCoyle
Frequent Contributor
This is never true so you are not stepping into the update line.

if  eZoneNameString == [snName]:


I also think you may be confusing lists and dictionaries.
0 Kudos
TylerLauzen1
New Contributor

This is how I've implemented the new DA cursors. The cursor is limited to the fields you want to update with the join field located at index 0, the dictionary is created with row[0] as the key and easily accessed by the update cursor.

#Define fields to update, and the field to use as join field

Fields = ['Direction', 'Cost', 'year', 'Color']

Key = "UniqueID"

Fields.insert(0, Key)

#Create Dictionaries ; The dictionaries store the values from the update table in memory

x = len(Fields)

UpdateDict = {}

#Iterates through all values in the table and stores them in the update dictionary

#Dictionary format; Join Field value : list of field values

with arcpy.da.SearchCursor(Table, Fields) as cursor:

    for row in cursor:

        FieldValDict = {}

        for y in range(1,x):

            FieldValDict = row

        UpdateDict[row[0]] = FieldValDict

#Updates the FC from the Update Dictionary

#Uses the Join Field value to look up update values

with arcpy.da.UpdateCursor(Input, Fields) as cursor:

    for row in cursor:

        for y in range(1,x):

            row = UpdateDict[row[0]]

            cursor.updateRow(row)

0 Kudos
Luke_Pinner
MVP Regular Contributor

You could replace:

        FieldValDict = {}

        for y in range(1,x):

            FieldValDict = row

With:

        FieldValDict = dict(zip(fields[1:], row[1:]))

0 Kudos
FilipKrál
Occasional Contributor III

Hi all,

Arcapi has these kind of functions for joining tables:

join_using_dict

https://github.com/NERC-CEH/arcapi/blob/master/arcapi.py/#L2052-2153

update_col_from_dict

https://github.com/NERC-CEH/arcapi/blob/master/arcapi.py/#L1207-1272

In many cases it is much faster than the Join Tool.

Filip.