Struggling to understand field mapping

clt_cabq · ‎07-20-2023

I'm writing a process to append data from one table to a new one with a different schema, so I need to use field mappings to associate each input field with its output field where the field names differ. Reading through the help guides on field mappings, I see the approach below and I'm not sure why it is done this way. The code seems circular: it sets a variable equal to an object, sets an attribute on that object, and then assigns the object back to where it came from. Does anyone have an explanation of what is being accomplished here?

These lines of code are from this help article:

FieldMap help

diam_name = fm_diam.outputField
diam_name.name = 'Veg_Diam'
fm_diam.outputField = diam_name

I really wish there was an easier way to achieve this, the fieldmap/fieldmappings approach seems really arcane.

JohannesLindner · ‎07-20-2023

They read the outputField property, change it, and then set the property to the changed value.

Here's the doc for the outputField property:

outputField
(Read and Write)
The properties of the output field are either set or returned in a Field object.
Field

So outputField is a Field object. Why don't they construct a brand-new Field and set outputField to that?

Well, let's look at Field's properties:

That's a lot of properties you'd have to set. Granted, you probably only really need name and type, but maybe some other properties are important for your use case. And the FieldMap object automatically sets these properties to appropriate values.

So the easier way is to get this automatically created Field object, change one of its properties (name), and give it back to the FieldMap.
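To see why the write-back step matters, here is a minimal pure-Python sketch. It does not use arcpy: the `Field` and `FieldMap` classes below are hypothetical stand-ins, written under the assumption that reading `outputField` hands back a copy rather than a live reference, and the input field name `DIAMETER` is made up for illustration.

```python
import copy


class Field:
    """Hypothetical stand-in for arcpy.Field; only a few attributes."""

    def __init__(self, name, type_="String", length=50):
        self.name = name
        self.type = type_
        self.length = length


class FieldMap:
    """Hypothetical stand-in for arcpy.FieldMap (assumes the getter copies)."""

    def __init__(self, field):
        self._output_field = field

    @property
    def outputField(self):
        # Hand back a copy, so edits to it do not touch the stored field.
        return copy.copy(self._output_field)

    @outputField.setter
    def outputField(self, field):
        self._output_field = copy.copy(field)


fm_diam = FieldMap(Field("DIAMETER", "Double", 8))

# Editing the returned object directly only changes a throwaway copy.
fm_diam.outputField.name = "Veg_Diam"
name_after_direct_edit = fm_diam.outputField.name

# Get, modify, set: the change is written back to the FieldMap.
diam_name = fm_diam.outputField
diam_name.name = "Veg_Diam"
fm_diam.outputField = diam_name
name_after_round_trip = fm_diam.outputField.name
```

Under that assumption, the direct edit is silently lost while the three-line round trip sticks, and the other properties (type, length) come along unchanged.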

clt_cabq wrote: "I really wish there was an easier way to achieve this, the fieldmap/fieldmappings approach seems really arcane."

I've never really used FieldMap, but I agree, this isn't the simplest way. If you know your input fields, you can just execute an Append or Merge manually and copy the Python code from the Geoprocessing History.

Alternatively, you could do it with the arcpy.da.*Cursor objects.

Have a great day!
Johannes

clt_cabq · ‎07-20-2023

Thank you for this Johannes, this helps me understand what is going on here and that makes sense.

That said, in the interval since I posted this, I was able to implement a nested Insert/Search cursor that accomplishes the same result, and I think it is much simpler. The field mapping tools make sense in the desktop geoprocessing tools, but they are challenging to implement in code.

TylerT · ‎07-22-2023

@clt_cabq et al,
I am working on mapping an existing feature class to a new feature class with name changes only, so I'm doing some research on all the options for this.

Field mappings are attractive to me because they are a parameter in the Export Features tool. Nice and clean in one fell swoop.

arcpy.conversion.ExportFeatures(in_features, out_features, {where_clause}, {use_field_alias_as_name}, {field_mapping}, {sort_field})

Alter Field is an option but requires an additional step.

arcpy.management.AlterField(in_table, field, {new_field_name}, {new_field_alias}, {field_type}, {field_length}, {field_is_nullable}, {clear_field_alias})

Now, I'm reading that you used a nested Insert/Search cursor. Can you expound on this technique? I've worked with cursors minimally and am curious what your solution looks like. Thx.

Tyler

Anonymous User · ‎07-22-2023

I have several fieldmapping snippets that I refer to often for this purpose. They really help when doing spatial joins and keeping only the fields you want to bring over, or when exporting only the fields you want without needing a Delete Field step after the export. I was going to post this for clt_cabq but ran out of time. Here are four variations on using field mappings:

"""
Keeping only a selected list of fields:

Creates a fieldmap that keeps all of the output fields listed in the keepfields:
"""
import arcpy

fc = r'C:\Users\...\Documents\ArcGIS\Local.gdb\address'
fieldmappings = arcpy.FieldMappings()
fieldmappings.addTable(fc)

keepfields = ['address','bldg_no','space','units']

# Remove every field that is neither required nor listed in keepfields
for field in fieldmappings.fields:
    if not field.required and field.name not in keepfields:
        fieldmappings.removeFieldMap(fieldmappings.findFieldMapIndex(field.name))

arcpy.FeatureClassToFeatureClass_conversion(fc, r'C:\Users\...\Documents\ArcGIS\Projects\Testing', 'adr.shp', '', fieldmappings)

"""
Keeping only the changed fields version:

Creates a field map between two feature classes, keeps only the fields that are required, and
maps the incoming names to new field names in the destination fc.
"""
fc1 = r'C:\Users\...\Documents\ArcGIS\Local.gdb\address'
fc2 = r'C:\Users\...\Documents\ArcGIS\Local.gdb\parcel'

# {incoming field name: desired name}
fldPair = {'AddressNum': 'bldg_no',
           'PreDirecti': 'st_dir',
           'PreType': 'st_type',
           'StreetName': 'st_name',
           'PostDirect': 'suf_dir',
           'Unit': 'units',
           'City': 'town',
           'ZipCode': 'zip',
           'FullAddres': 'address'}

fldmap = arcpy.FieldMappings()

# Creating field maps for the two files
fldmap.addTable(fc1)
fldmap.addTable(fc2)

# Removing all unwanted fields
for field in fldmap.fields:
    if not field.required:
        fldmap.removeFieldMap(fldmap.findFieldMapIndex(field.name))

# create field map using fields from the incoming fc (ifld) to fields in the target
# tfld
for ifld, tfld in fldPair.items():
    ufldmap = arcpy.FieldMap()
    ufldmap.addInputField(fc1, ifld)
    ufldmap.addInputField(fc2, tfld)
    # ufldmap.mergeRule = "Unique"
    ufldmap_name = ufldmap.outputField
    ufldmap_name.name = tfld
    ufldmap.outputField = ufldmap_name
    fldmap.addFieldMap(ufldmap)


# schema_type='NO_TEST' is needed for Append to apply the field mapping
arcpy.Append_management(fc2, fc1, schema_type='NO_TEST', field_mapping=fldmap)



"""
Keeping all mapped fields, but changing some field names version:

Creates a field map between the targetfc and incomingfc and changes the mapping
of the fields listed in the fldPair dictionary
"""

fc1 = r'C:\Users\...\Documents\ArcGIS\Local.gdb\address'
fc2 = r'C:\Users\...\Documents\ArcGIS\Local.gdb\parcel'

# {incoming field name: targetFC field}
fldPair = {'AddressNum': 'bldg_no',
           'PreDirecti': 'st_dir',
           'PreType': 'st_type',
           'StreetName': 'st_name',
           'PostDirect': 'suf_dir',
           'Unit': 'units',
           'City': 'town',
           'ZipCode': 'zip',
           'FullAddres': 'address'}

fldmap = arcpy.FieldMappings()

# Creating field maps for the two files
fldmap.addTable(fc1)
fldmap.addTable(fc2)

# Removing fields whose names will be changed.
for field in fldmap.fields:
    if field.name in fldPair.keys():
        fldmap.removeFieldMap(fldmap.findFieldMapIndex(field.name))

# create field map using fields from the incoming fc (ifld) to fields in the target
# tfld
for ifld, tfld in fldPair.items():
    ufldmap = arcpy.FieldMap()
    ufldmap.addInputField(fc1, ifld)
    ufldmap.addInputField(fc2, tfld)
    # ufldmap.mergeRule = "Unique"
    ufldmap_name = ufldmap.outputField
    ufldmap_name.name = tfld
    ufldmap.outputField = ufldmap_name
    fldmap.addFieldMap(ufldmap)

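# out_fc (the output feature class path) is not defined in this snippet and must be set first.
# 'INTERSECT' is a match_option value, so it is passed by keyword.
arcpy.SpatialJoin_analysis(fc2, fc1, out_fc, match_option='INTERSECT', field_mapping=fldmap)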

"""
Renaming fields version of single featureclass (for exporting):

Creates a fieldmap of output fields and maps the input fields names.

"""
fc = r'C:\Users\..\Documents\ArcGIS\Projects\Testing\Default.gdb\team_city'
out_features = r'C:\Users\..\Documents\ArcGIS\Projects\Testing\Default.gdb\team_city_update'

fieldmappings = arcpy.FieldMappings()
fieldmappings.addTable(fc)

# {incoming field name : targetFC field}
fldPair = {'UIDENT': 'bldg_no',
           'POPCLASS': 'st_dir',
           'STATEABB': 'st_type'}

for kfld, tfld in fldPair.items():
    fldmp = fieldmappings.getFieldMap(fieldmappings.findFieldMapIndex(kfld))
    ufldmap_name = fldmp.outputField
    ufldmap_name.name = tfld
    ufldmap_name.aliasName = tfld
    fldmp.outputField = ufldmap_name
    fieldmappings.replaceFieldMap(fieldmappings.findFieldMapIndex(kfld), fldmp)

arcpy.conversion.ExportFeatures(fc, out_features, field_mapping=fieldmappings)

Hope this helps. If you can get into the fieldmap object in the debugger, you can see how things work and what other properties you can set using these methods.

clt_cabq · ‎07-24-2023

Thanks for this Jeff, appreciate it. I like the idea of field mappings because it is explicit; I hate it because it seems very tedious, with lots of room for error. I was looking at having to do mappings for about 15 fields. The addTable approach looks like it might be the efficient solution to that.

clt_cabq · ‎07-24-2023

What I'm doing is taking records from one table that has undergone a lot of modification via field calcs, to change values, intersects, etc. with the result being a table that is very messy and then appending those to a new, empty table that has a clean schema. Field names from the first have to be matched to the second and the code I am using to do this looks like this:

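import arcpy

# "..." stands for the field names omitted here; list the real names in full.
# empty_fc and source_fc are the paths to the target and source tables.
outfields = ['SHAPE@', 'field1', ..., 'fieldn']
in_fields = ['SHAPE@', 'field_a', ..., 'field_z']

with arcpy.da.InsertCursor(empty_fc, outfields) as insertcursor:
    with arcpy.da.SearchCursor(source_fc, in_fields) as cursor:
        for row in cursor:
            new_row = list(row)
            insertcursor.insertRow(new_row)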

The outfields and in_fields lists perform the field mapping by position: field1 is populated with values from field_a, and so on, so the two lists must be in matching order. The number of values in each inserted row has to equal the number of fields in outfields, or you get an error. In this case I'm populating a spatial table (a feature class), and the SHAPE@ token carries the geometry from the source data over to the new features. You can also modify the input rows on the fly before inserting them into the new table. In my case, the last field in the empty table isn't in the source table, but I can create an empty value and append it to the end of the input row like this:

new_row = list(row)
pgm = '' # creates an empty value
new_row.append(pgm)
# appends new item to input row so the number of input
# values match the output destinations.
insertcursor.insertRow(new_row)

This seems to work really efficiently and avoids the whole field mapping mess. I run this process twice, once against a spatial layer with about 30K records and once against a non-spatial table with about 115K records, and it seems to move pretty quickly. Hope this helps!
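One way to keep the two positional lists from drifting apart is to derive both from a single dictionary. The following is only a sketch of that idea in plain Python, not the code from the post above: the field names, the target-only field name `pgm`, and the helpers `build_row` and `copy_rows` are all hypothetical, and plain lists stand in for the cursors so the logic can be followed without arcpy.

```python
# Hypothetical source-to-target field names; replace with the real ones.
field_pairs = {
    "field_a": "field1",
    "field_b": "field2",
    "field_z": "fieldn",
}

# Both lists come from the same dict (insertion-ordered in Python 3.7+),
# so position i in in_fields always corresponds to position i in out_fields.
in_fields = ["SHAPE@"] + list(field_pairs.keys())
# "pgm" is a made-up name for a field that exists only in the target table.
out_fields = ["SHAPE@"] + list(field_pairs.values()) + ["pgm"]


def build_row(source_row, extra_values=("",)):
    """Return the source values followed by values for target-only fields."""
    return list(source_row) + list(extra_values)


def copy_rows(search_rows, insert_row):
    """Pass every padded source row to insert_row; return the row count."""
    count = 0
    for row in search_rows:
        insert_row(build_row(row))
        count += 1
    return count


# Plain-Python stand-ins for the two cursors.
fake_source = [("geom1", 1, "x", 2.5), ("geom2", 2, "y", 3.5)]
target = []
copied = copy_rows(fake_source, target.append)
```

With real data, the same `copy_rows` call could presumably take an `arcpy.da.SearchCursor` opened on `in_fields` as `search_rows` and the `insertRow` method of an `arcpy.da.InsertCursor` opened on `out_fields` as `insert_row`.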

TylerT · ‎07-22-2023

ESRI uses this pattern in other places too. For instance in arcpy.mp:

It took me a while to get somewhat comfortable with it. My current mental model is: get object (think of as template) > modify properties > set object, which gets me over the hump.

This pattern does feel idiosyncratic, and I've never truly understood if the API could or couldn't be improved to set these properties directly.

Coincidentally, I am studying field mappings right now and have some questions that I will post separately.

Tyler

clt_cabq · ‎07-24-2023

Thanks - that way of looking at this helps; I hadn't considered that getting the object initially leaves it as essentially an 'empty' occurrence even once its attributes have been modified.

TylerT · ‎07-24-2023

With inspiration from @Anonymous User's snippets above, here's what I cobbled together to take care of simple renaming for my needs. You only need to pass in a name mapper dictionary of changes; an optional parameter controls whether the full schema is returned.

import arcpy


def fms_builder(fc, name_mapper, full_schema=True):
    """
    Build a FieldMappings object.  For renaming fields only.  full_schema=True keeps the
    full schema; False discards non-required fields that are not in name_mapper.
    """
    # Instantiate FieldMappings
    fms = arcpy.FieldMappings()

    # With full schema, start from every field in fc; otherwise only the
    # renamed fields are added in the loop below
    if full_schema:
        fms.addTable(fc)

    # Loop thru name mapper dictionary
    for name_old, name_new in name_mapper.items():

        # Remove old FieldMap fields if full schema added
        if full_schema:
            fms.removeFieldMap(fms.findFieldMapIndex(name_old))

        # Instantiate FieldMap
        fm = arcpy.FieldMap()

        # Add InputField to FieldMap
        fm.addInputField(fc, name_old)

        # Get copy of outputField properties as starting point
        f_output_field = fm.outputField

        # Develop/modify outputField properties
        f_output_field.name = name_new
        f_output_field.aliasName = name_new

        # Set outputField properties
        fm.outputField = f_output_field

        # Add FieldMap to FieldMappings
        fms.addFieldMap(fm)

    return fms
    
if __name__ == "__main__":
    # Name mapper dictionary: old names to new names
    name_mapper = {'f1_old': 'f1_new'
                  ,'f2_old': 'f2_new'
                  ,'f3_old': 'f3_new'
                  ,'f4_old': 'f4_new'}

    # fc is not defined in this snippet; set it to the input feature class path first
    fms = fms_builder(fc, name_mapper)

    # View results as a string in Jupyter (display is a notebook built-in)
    display(fms.exportToString())

Tyler