Slow performance with python?

12-22-2010 06:12 AM
AndrewTewkesbury
New Contributor
Hi,

I'm new to Python and have just written my first script, but I'm very disappointed with the performance achieved. I'm not sure whether this is a problem with the script or inherent to Python. First, my script...

I'm working on a problem of spatially joining road edge vertices to the nearest road centreline vertex. The potential problem here is that wide road edges can be closer to an adjacent road's centreline than to their own, and hence would be joined to the wrong road. To solve this I draw a line between two candidate vertices and proceed only if this line is completely within the road area (the joining line does not jump between roads).

Here is an excerpt of the script, which loops through the road edge vertices:

#start the counter. Used to select vertices sequentially
count = 1
#Setup search cursor for the road outline vertices
rows1 = arcpy.SearchCursor("RoadOutline_Points_Layer")
#Loop through each road outline vertex joining each to its nearest intrinsic centreline vertex
for row1 in rows1:
    #create a variable used to select outline (OL) vertices by their unique ID
    id_sel = "OLID = %d" % count

    #Select a point based upon its ID
    arcpy.SelectLayerByAttribute_management("RoadOutline_Points_Layer","NEW_SELECTION",id_sel)

    #Calculate the distance between each road centreline vertex and the selected outline point
    arcpy.Near_analysis(p_centrelinepoints,"RoadOutline_Points_Layer")

    #Sort the centreline vertices by this calculated distance and export to a new featureclass
    arcpy.Sort_management(p_centrelinepoints,"RoadCentreline_Points_Near","NEAR_DIST")

    #Delete the near analysis fields from the centre line points
    arcpy.DeleteField_management(p_centrelinepoints,["NEAR_DIST","NEAR_FID"])

    #Create a layer from the new FC
    arcpy.MakeFeatureLayer_management("RoadCentreline_Points_Near","RoadCentreline_Points_Near_Layer")

    #Setup a loop for the centreline vertices so that, starting with the nearest point, they are iterated through until the conditions are met. The conditions being that the line between the two points is intrinsic to the road area
    #Variables for loop control and counter respectively
    nearloop = 0
    nearloop_count = 0

    #loop until the conditions are met
    while nearloop == 0:
        #Increment the counter
        nearloop_count = nearloop_count + 1
        #create a variable to select the centreline vertex by unique ID (CLID)
        objectid_sel = "OBJECTID = %d" % nearloop_count
        #Select the sorted record
        arcpy.SelectLayerByAttribute_management("RoadCentreline_Points_Near_Layer","NEW_SELECTION",objectid_sel)

        #Merge with the selected outline point to give a new FC. This gives a feature class with the road outline vertex and a candidate centreline vertex
        arcpy.Merge_management(["RoadCentreline_Points_Near_Layer","RoadOutline_Points_Layer"],"temp_pair_points")

        #Draw a line between these two points
        arcpy.PointsToLine_management("temp_pair_points","temp_pair_line")

        #Erase the line by the road area. If all of the line is erased then the path between the two points is intrinsic to the road and so is ok to spatially join.
        arcpy.Erase_analysis("temp_pair_line",p_area,"temp_pair_line_Erase",p_precision)

        #Test to see if all of the line has been erased by finding the number of records in the erased layer
        erasecount = arcpy.GetCount_management("temp_pair_line_Erase")
        erasecount = int(erasecount.getOutput(0))

        #Test the erased layer. If there are no erase features the point is good and can be joined with the outline point, otherwise the loop continues.
        if erasecount == 0:
            #The conditions are met stop the loop
            nearloop = 1
            #Join the two points
            #Create a layer from the feature class containing the two points
            arcpy.MakeFeatureLayer_management("temp_pair_points","temp_pair_points_layer")
            #Select the centreline point
            arcpy.SelectLayerByAttribute_management("temp_pair_points_layer","NEW_SELECTION","CLID IS NOT NULL")
            #Setup a search cursor for this featureclass
            rows2 = arcpy.SearchCursor("temp_pair_points_layer")

            for row2 in rows2:
                #get the unique ID of the selected centreline vertex
                joinCLID = row2.getValue("CLID")

            #Write this(Join) to the outline vertex
            arcpy.CalculateField_management("RoadOutline_Points_Layer","CLID",joinCLID)

        else:
            #The conditions are not met continue the loop
            nearloop = 0

    #This outline vertex is complete, increment the counter by 1 to start the next point
    count = count + 1
I know this is poorly constructed code, but it does work. My problem is that the performance is terrible. For each outline vertex the above process takes approx. 50 seconds on a decent machine. 1 sq km of data will have about 9000 road edge vertices, which would take around 5 days (9000 x 50 s = 450,000 s, or about 5.2 days). I don't understand why it is taking so long; it is not too far from the individual processes being performed manually in real time. Looking at the processes, the 'heaviest' is the near analysis, but I timed this (from the toolbox) and it took 5 seconds. I have timed all of the processes individually, and for each point they add up to about 11 seconds. Any ideas as to why it is taking 5x longer through Python?
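One way to see where the extra time goes is to time each step inside the script itself rather than from the toolbox. The following is only a minimal sketch: `timed`, `step_totals` and `report` are hypothetical helper names, and `time.sleep` stands in for the real geoprocessing calls so that it runs without ArcGIS.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named step.
step_totals = defaultdict(float)

@contextmanager
def timed(step_name):
    """Add the elapsed wall-clock time of the enclosed block to step_totals."""
    start = time.perf_counter()
    try:
        yield
    finally:
        step_totals[step_name] += time.perf_counter() - start

def report():
    """Return (step, seconds) pairs, slowest step first."""
    return sorted(step_totals.items(), key=lambda item: item[1], reverse=True)

# Stand-in workload (assumption): in the real script each block would wrap
# one geoprocessing call, e.g. `with timed("near"): arcpy.Near_analysis(...)`.
for _ in range(3):
    with timed("select"):
        time.sleep(0.01)
    with timed("near"):
        time.sleep(0.03)

for name, seconds in report():
    print("%-8s %.2f s" % (name, seconds))
```

Wrapping every call in the loop this way should show whether the missing seconds sit in one tool or are spread across the per-call overhead of all of them.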

Incidentally, the nested loop which searches for a correctly joined centreline vertex almost always iterates only once, as the closest vertex is almost always ok, so this will not be eating up time on a regular basis.

Any help greatly appreciated.

Andrew
5 Replies
ChrisSnyder
Regular Contributor III
Could you repost your code using code tags (just select the code and click the "#" button in the message composition GUI)? That way the all-important indentation will come across and the code will be easier to read.
AndrewTewkesbury
New Contributor
Thanks Chris, hopefully it should make a bit more sense now.
ChrisSnyder
Regular Contributor III
That is pretty hard core geoprocessing! Some suggestions/comments:

1) Instead of arcpy.SelectLayerByAttribute_management("RoadOutline_Points_Layer","NEW_SELECTION",id_sel) use the MakeFeatureLayer tool (and its SQL parameter). I think it is much faster at executing SQL selections because there is no "SelectionType" (CLEAR_SELECTION, SWITCH_SELECTION, etc.) overhead.

2) Try using the "in_memory" workspace. This should increase the speed of erase, near, etc. by ~30%, assuming the input and output FCs are both in the in_memory workspace. See http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//002w0000005s000000.htm

3) You should consider running the Near tool for all the features all at once (instead of just point by point). Then loop through the results. If I'm not mistaken, this would probably be the #1 time saver in your process. Then do the erase all at once too. This would require some geometry writing using an insert cursor to write out the "connect the two dots" line features (use the Near output to define the relationships).

4) Don't run the line arcpy.DeleteField_management(p_centrelinepoints,["NEAR_DIST","NEAR_FID"]). Might save space on disk, but it takes time to delete fields.

5) Instead of the erase, maybe a Select By Location is all that is required (if the line gets selected - selected count = 1 because it intersects with the road area, isn't that the same as an erase returning an empty feature class?).

6) Embedded cursors are slow. Consider using a Python dictionary to emulate embedded cursor behavior: http://forums.arcgis.com/threads/8428-Setting-a-Single-Record-to-Selected-with-Python?p=25930&viewfu...
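Suggestions 3 and 6 can be combined roughly as follows. This is a hedged sketch, not the actual tool workflow: a brute-force distance calculation in plain Python stands in for a batch run of the Near tool (or Generate Near Table, if several candidates per point are needed), the coordinates are made-up sample data, and `build_candidates`, `join_vertices` and `is_within_road` are hypothetical names.

```python
import math

# Hypothetical sample data: {id: (x, y)}. In the real workflow these would be
# read once from the two point feature classes with a search cursor.
outline_pts = {1: (0.0, 0.0), 2: (10.0, 1.0)}
centre_pts = {101: (0.0, 3.0), 102: (9.0, 4.0), 103: (0.5, 1.0)}

def build_candidates(outline, centre, max_candidates=3):
    """Map each outline ID to its nearest centreline IDs, closest first."""
    candidates = {}
    for olid, (ox, oy) in outline.items():
        ranked = sorted(
            (math.hypot(cx - ox, cy - oy), clid)
            for clid, (cx, cy) in centre.items()
        )
        candidates[olid] = [clid for _, clid in ranked[:max_candidates]]
    return candidates

def join_vertices(candidates, is_within_road):
    """Pick, per outline vertex, the first candidate accepted by the test."""
    joined = {}
    for olid, clids in candidates.items():
        for clid in clids:
            if is_within_road(olid, clid):
                joined[olid] = clid
                break
    return joined

candidates = build_candidates(outline_pts, centre_pts)
# Placeholder test that rejects one pair, standing in for the erase or
# select-by-location check against the road polygon.
joined = join_vertices(candidates, lambda olid, clid: (olid, clid) != (1, 103))
print(joined)
```

The point of the dictionary is that the distances are worked out once for all vertices, and the inner loop then becomes a plain lookup instead of a select, sort and merge per vertex.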

Hope something here helps!
by Anonymous User
Not applicable
I'm experiencing a lot of slowness as well, and this thread made me re-examine my code. I'm currently reading AutoCAD files and writing three of the components (the points, polygons and polylines) into Oracle via SDE. The polygons are temporary because I spatially join them with the points so that I can have the point data associated with the polygons. My question is about in-memory workspaces and the temporary layer created by MakeFeatureLayer_management(). The documentation says it lasts for the rest of the session. Exactly how long is a session, and is it something I need to free when done?

http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00170000006p000000.htm

I'm wondering if my memory usage is in a death spiral because I'm reading about 600 to 1000 CAD files.
ChrisSnyder
Regular Contributor III
A "session" is however long the .exe (python.exe, ArcMap.exe, whatever.exe) runs... That is, unless you explicitly delete an in_memory featureclass using the gp.Delete_management() tool, it will persist until you close the application that created the in_memory featureclass. Usually I just use the same name/variable for the in_memory featureclass (inMemFC = "in_memory\\overwrite_me") and the "gp.overwriteoutput = True" setting so that in a loop I just overwrite the in_memory layer from the previous loop. If you don't choose to overwrite the in_memory featureclass, at least be sure to delete the old one(s) before the end of the loop. If you don't, and you have a lot of loops/data, you can easily fill up your RAM.
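The delete-before-the-end-of-the-loop pattern could look something like this. It is a sketch under assumptions: `process_batch` and the three `fake_*` functions are hypothetical names used so the example runs without ArcGIS, and in a real arcpy script the `delete` argument would be something like arcpy.Delete_management.

```python
def process_batch(items, make_temp, use_temp, delete):
    """Create, use, and always delete one temporary dataset per item."""
    results = []
    for item in items:
        temp = make_temp(item)
        try:
            results.append(use_temp(temp))
        finally:
            # Free the temporary dataset even if use_temp raised.
            delete(temp)
    return results

# Fake stand-ins (assumptions) that only track which temp datasets exist.
live = set()

def fake_make(name):
    path = "in_memory\\" + name
    live.add(path)
    return path

def fake_use(path):
    return len(live)  # number of temp datasets alive while processing

def fake_delete(path):
    live.discard(path)

peak = process_batch(["a", "b", "c"], fake_make, fake_use, fake_delete)
print(peak, live)
```

With the delete in a `finally` block, only one temporary dataset exists at any time, so memory use should stay flat however many files are processed.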

Feature layers can be overwritten as well (they are just references to data on disk - or datasets in the in_memory workspace). I found that in v9.3 at least, a feature layer will persist a lock on the source FC and prevent you from deleting the FGDB that contains the source dataset that the feature layer references. Delete the feature layer, and then you can delete the FGDB.