Calculate Field with arcpy

JuttaSchiller · ‎09-26-2014

At first I made a statistic analyst to get the sum of the field area for equal Ids.Now I need the sum of all area fields from the output-table therefore I did another statistic analyst to get the sum. With a searchcursor I tried to get the sumvalue to put it in the calculate field tool. But in the end it is not calculating the percent, so the field I add before and I don’t know what I did wrong. I need to do it with arcpy because it is a small part of a big tool I’m trying to build and the sumvalue will change from input to input.

It would be wonderful if you can tell me my mistake.

The following code is working, but there is just a 0 in the percent field.

import arcpy

import os

from arcpy import env

env.overwriteOutput = True

env.workspace = r"D:\Users\jul\ers\cities_UA\resultfolder"

inputshp = r"D:\Users\jul\ers\cities_UA\resultfolder\at005l_innsbruck_result.shp"

outtable= r"D:\Users\jul\ers\cities_UA\resultfolder\outtable.dbf"

arcpy.Statistics_analysis(inputshp, outtable, [["area", "SUM"]], "Id")

outtableSUM= r"D:\Users\jul\ers\cities_UA\resultfolder\outtableSUM.dbf"

arcpy.Statistics_analysis(outtable, outtableSUM, [["SUM_area", "SUM"]])

with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:

    for row in cursor:

          print (int(row[0]))



fieldsum = (int(row[0]))

print fieldsum

arcpy.AddField_management(outtable, "prozent", "DOUBLE")#working

arcpy.CalculateField_management(outtable, "prozent", "([SUM_area]*100)/fieldsum", "VB")

curtvprice · ‎09-26-2014

Hey everybody: Posting Code blocks in the new GeoNet‌

Jutta, Ian is right, let me put it another way: you need to pack the sum you've calculated into the calculate expression string like this. I added spaces to your expression for readability. They are not required, it's a style thing.

arcpy.CalculateField_management(outtable, "prozent",
    "( [SUM_area] * 100 ) / {}".format(fieldsum)

A shortcoming here is that you are using a folder workspace so everything writes to shapefile/dbf which truncates all your field names (and there are other limitations too). I highly recommend moving to file gdb, or try the in_memory gdb (very fast - a good choice for small tables like this).

Another piece of advice, always use PYTHON or PYTHON_9.3 with Calculate_Field so your scripts will work in background GP or maybe someday in ArcGIS Pro. (Calculate Field does not support the default VB parser in x64 arcpy.)

I always use upper case field names in my code (even though they are case-insensitive) because... INFO.

tmp1 = "in_memory/xxsum1"
# sum area by ID
arcpy.Statistics_analysis(inputshp, tmp1, [["AREA", "SUM"]], "ID")
# sum SUM_AREA - using arcpy.da cursor (nice plan, Ian!)
fieldsum = 0
with arcpy.da.SearchCursor(tmp1, "SUM_AREA") as cursor:
    for row in cursor:
        fieldsum += row[0]
# create percent field and populate it
arcpy.AddField_management(tmp1, "PROZENT", "DOUBLE")
expr = "100.0 *  !SUM_AREA! / {}".format(fieldsum)
arcpy.CalculateField_management(tmp1, "PROZENT", expr, "PYTHON")
arcpy.CopyRows_management(tmp1, out_table)
arcpy.Delete_management(tmp1)

View solution in original post

IanMurray · ‎09-26-2014

Hi Jutta,

with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:

for row in cursor:

print (int(row[0]))

fieldsum = (int(row[0]))

print fieldsum

you are setting the value of fieldsum after the cursor finishes running. Are you trying to sum up the values in "SUM_SUM_ar"? if so, you need to make it a variable prior to the cursor then add the value of int(row[0]) to it each time

fieldsum = 0

with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:

for row in cursor:

fieldsum+= int(row[0]))

print fieldsum

curtvprice · ‎09-26-2014

Hey everybody: Posting Code blocks in the new GeoNet‌

Jutta, Ian is right, let me put it another way: you need to pack the sum you've calculated into the calculate expression string like this. I added spaces to your expression for readability. They are not required, it's a style thing.

arcpy.CalculateField_management(outtable, "prozent",
    "( [SUM_area] * 100 ) / {}".format(fieldsum)

A shortcoming here is that you are using a folder workspace so everything writes to shapefile/dbf which truncates all your field names (and there are other limitations too). I highly recommend moving to file gdb, or try the in_memory gdb (very fast - a good choice for small tables like this).

Another piece of advice, always use PYTHON or PYTHON_9.3 with Calculate_Field so your scripts will work in background GP or maybe someday in ArcGIS Pro. (Calculate Field does not support the default VB parser in x64 arcpy.)

I always use upper case field names in my code (even though they are case-insensitive) because... INFO.

tmp1 = "in_memory/xxsum1"
# sum area by ID
arcpy.Statistics_analysis(inputshp, tmp1, [["AREA", "SUM"]], "ID")
# sum SUM_AREA - using arcpy.da cursor (nice plan, Ian!)
fieldsum = 0
with arcpy.da.SearchCursor(tmp1, "SUM_AREA") as cursor:
    for row in cursor:
        fieldsum += row[0]
# create percent field and populate it
arcpy.AddField_management(tmp1, "PROZENT", "DOUBLE")
expr = "100.0 *  !SUM_AREA! / {}".format(fieldsum)
arcpy.CalculateField_management(tmp1, "PROZENT", expr, "PYTHON")
arcpy.CopyRows_management(tmp1, out_table)
arcpy.Delete_management(tmp1)

JuttaSchiller · ‎09-27-2014

Hi Ian and Curtis Price,

thanks so much for your answers and advices. I will try them on Monday because I do not have ArcGIS 10.2 at home.

JuttaSchiller · ‎09-29-2014

Thank you so much. It works perfectly.

JoshuaBixby · ‎09-29-2014

When doing Python scripting in ArcGIS, I tend to favor native ArcPy methods over geoprocessing tools. The geoprocessing tools are convenient, but that convenience can come with a performance hit, and the hit can be significant with some tools. Even within the ArcPy realm, I have found methods in the data access module (da) noticeably more efficient than corresponding methods in the base site package (arcpy). (I think this has been one of the main selling points of the da module from the beginning)


# import functions from modules that are available but not commonly imported


from collections import defaultdict
from numpy import fromiter, dtype


# sum area by ID


stats = defaultdict(int)
with arcpy.da.SearchCursor(inputshp, ['ID','AREA']) as cur:
    for k, v in cur:
        stats += v


# sum SUM_AREA - using sum over an iterable of dict's values
fieldsum = sum(stats.itervalues())


# create iterable and populate numpy array


stats_iterable = ((k, v, 100.0 * v / fieldsum) for (k, v) in stats.iteritems())
tmp1 = fromiter(stats_iterable,
                dtype([('ID', 'i4'), ('SUM_AREA', 'f8'), ('PROZENT', 'f8')]))


# Dump numpy array to table


arcpy.da.NumPyArrayToTable(tmp1, out_table)


del tmp1


del stats


....

The code above only uses 2 arcpy functions, both of which are in the data access module. On a randomly generated million-record test data set, cutting out most of the arcpy functions reduced the runtime by more than 75%. Profiling the original code shows the bulk of the extra runtime comes from a single function: analysis.py:(Statistics). I have seen this numerous times, i.e., scripts that call Statistics_analysis on large data sets get really bogged down. I am not sure how large your data sets are, but if they are large and you want to speed them up, rolling your own functions can usually get you performance gains.

JuttaSchiller · ‎10-01-2014

Thank you for this answer, Joshua Bixby. I will try it!