This content has been marked as final. Show 19 replies
Are your datasets in different coordinate systems (projections)? This may account for the difference. I think some tools perform best when the data is NOT in decimal degrees.
Thanks for your reply - both datasets are in the same projection (Albers Equal Area Conic), which is in meters. Any other thoughts? Thanks again!
Since you are looping through your features, are you sure that some of the subsequent loops aren't in effect overwritting some of the previous selections/counts?
Can you post the code you are using?
Yea, definitely... I left out my time-keeping bits and a some irrelevant processes in the beginning, but here's the rest:
print "Start time:",time.ctime()
StartSecs = time.clock()
from arcpy import env
env.workspace = "C:/AF_WorkingFiles/AgHealth_Iowa/Databases/AHS_NitrateWells.mdb"
# User-defined variables...
Wells2 = "C:/.../FC_Wells2" #Input wells feature class
Wells3 = "C:/.../FC_Wells3" #Output feature class
AFOs = "C:/.../AFOs"
AFO_Confine = "C:/.../AFO_ConfinementOnly"
AFO_Feedlot = "C:/.../AFO_FeedlotOnly"
AFO_Mixed = "C:/.../AFO_MixedOnly"
Hogs = "C:/.../Hogs"
# Delete fields if they exist...
# Run near tool...
# Create feature layer...
# Add count fields...
L2 = ["Count_10kmAFOs","Count_10kmConfmnts","Count_10kmFeedlots", \
for item in L2:
# Create ancillary feature layers...
# ASSIGN COUNTS TO FIELDS
# Set variables...
SelNew = "NEW_SELECTION"
SelDist = "WITHIN_A_DISTANCE"
SelAddTo = "ADD_TO_SELECTION"
SelClear = "CLEAR_SELECTION"
# Assign counts to each row...
cur1 = arcpy.UpdateCursor("Wells2Lyr")
for row in cur1:
UnqKey = row.Concat_DatasetPrimkey
# Select well at row...
WhereClause = '[Concat_DatasetPrimkey]' + " = '" + UnqKey + "'"
# Select ancillary layers within 10km...
# Count selected records in each layer and assign to appropriate fields...
row.Count_10kmAFOs = int(arcpy.GetCount_management("AFOLyr").getOutput(0))
row.Count_10kmConfmnts = int(arcpy.GetCount_management("ConfineLyr").getOutput(0))
row.Count_10kmFeedlots = int(arcpy.GetCount_management("FeedLyr").getOutput(0))
row.Count_10kmMixed = int(arcpy.GetCount_management("MixedLyr").getOutput(0))
row.Count_10kmHogs = int(arcpy.GetCount_management("HogLyr").getOutput(0))
# Clear all selected records...
# Create final feature class from results...
# Delete cursor...
del cur1, row
Can you repost using the code tag button (#)? Makes it easier to ready as indents are critical in Python...
And does anyone know why arcpy vs. manual selection would produce different counts, and/or why my arcpy results were correct most of the time, but incorrect some of the time?
The tool SelectLayerByLocation and spatial joins are different algorithms. There are differences in what the tolerances used are and the algorithms. To get consistent results, you should use the same tool with the same options and tolerances.
I must emphasise Chris's comment about placing your code in the code tags (#) as this makes reading your code easier. This is even more important with Python as indentation is everything! Look at your code you posted how are we to tell when code is outside a loop? Anyway...
I have two ideas:
- Is the code where you clear existing selections outside the For loop? Can't tell as there is no formating...
- In your selection code arcpy.SelectLayerByLocation_management("AFOLyr",SelDist,"Wells2Lyr",10000,SelAddTo) you are doing a selection and adding it the existing selection. Was that your intention? Are you not trying to do a new selection for each iteration of the cursor? If so replace SelAddTo with SelNew.
Sorry - I knew it would be difficult to read, but I didn't know how to get around it (i.e., I didn't know about the # button ...I'm new to posting forum code). My apologies! Here's another go at it...
NOTE: In the code below I have simplified what's happening. This code is where I began, and produced the same count discrepancies as the more complex code. Perhaps this is a better to start since it produced the same problematic result...
CURTIS: I use SelectLayerByLocation in my code and then use SelectByLocation manually in ArcMap for the check.
DUNCAN: Regarding the selection type, I use "Add to" because I have also selected one row in the table. So, for each row in the table, I select the row (i.e. a single feature), add to this selection by selecting ancillary data within 10,000m of this feature, then count the selected ancillary data and write this value to the appropriate field in the row. Below I have included only one ancillary data layer, because as stated above, this produces the same count discrepancies. When selecting features within 10,000m from multiple ancillary data layers, my "GetCount" tool is specific to each ancillary data layer - therefore it returns the count for each layer and not the count of everything selected. Hope this helps clear up my reasoning for using "Add to" - if not just let me know!
I appreciate your continued help with this!
# Imports... import arcpy import sys import traceback from arcpy import env # Environments... env.workspace = "C:/AF_WorkingFiles/AgHealth_Iowa/Databases/AHS_NitrateWells.mdb" # User-defined variables... Wells2 = "C:/.../FC_Wells2" #Input feature class Wells3 = "C:/.../FC_Wells3" #Output feature class AFOs = "C:/.../AFOs" try: # Delete fields if they exist... #(left out) # Create feature layer... arcpy.MakeFeatureLayer_management(Wells2,"Wells2Lyr") # Add count fields... L2 = ["Count_10kmAFOs","Count_10kmConfmnts","Count_10kmFeedlots", \ "Count_10kmMixed","Count_10kmHogs"] for item in L2: arcpy.AddField_management("Wells2Lyr",item,"LONG") # Create ancillary feature layer... arcpy.MakeFeatureLayer_management(AFOs,"AFOLyr") # Set variables... SelNew = "NEW_SELECTION" SelDist = "WITHIN_A_DISTANCE" SelAddTo = "ADD_TO_SELECTION" SelClear = "CLEAR_SELECTION" # Assign counts to each row... cur1 = arcpy.UpdateCursor("Wells2Lyr") for row in cur1: UnqKey = row.Concat_DatasetPrimkey # Select feature at row... WhereClause = '[Concat_DatasetPrimkey]' + " = '" + UnqKey + "'" arcpy.SelectLayerByAttribute_management("Wells2Lyr",SelNew,WhereClause) # Select ancillary layer within 10km... arcpy.SelectLayerByLocation_management("AFOLyr",SelDist,"Wells2Lyr",10000,SelAddTo) # Count selected records in each layer and assign to appropriate fields... row.Count_10kmAFOs = int(arcpy.GetCount_management("AFOLyr").getOutput(0)) cur1.updateRow(row) # Clear all selected records... arcpy.SelectLayerByAttribute_management("Wells2Lyr",SelClear) arcpy.SelectLayerByAttribute_management("AFOLyr",SelClear) # Create final feature class from results... arcpy.CopyFeatures_management("Wells2Lyr",Wells3) except: #(error handling specifics listed here) finally: del cur1, row
Don't know if this will help but if you add a selectbylocation tool to model builder fill it in and export it to Python one sees how the tool parameters should be completed.
For selectbylocation the distance also includes its units as shown below:
arcpy.SelectLayerByLocation_management("layer1","WITHIN_A_DISTANCE", "layer2","100 Meters", "ADD_TO_SELECTION")
Still don't get your logic to why you are adding to a selection, for this line you are adding to the selection of "AFOLyr"
# Select ancillary layer within 10km... arcpy.SelectLayerByLocation_management("AFOLyr",SelDist,"Wells2Lyr",10000,SelAddTo)
which you then clear with this line:
Re. comment 1: The ArcGIS10 Resource Center gives an example without units (http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//001700000072000000 - see bottom of page), and I read somewhere (maybe on that page??) that if no unit is given it uses the linear unit of the input coordinate system (meters in my case). All data I'm using is in the same projection, so everything is in meters.
Re. comment 2: For example, lets says you are in ArcMap and you select records from "Schools" where "MascotColor" = red. If you would then like to find "Streets" within 10km of the selected "red" features, you would have to use "add to current selection." Even though the streets data is in a different dataset, Arc sees all selections - i.e., from all datasets - and therefore you have to add to your current selection (which is within "Schools") even though you are making a selection from "Streets." If you use "new selection" it will clear all selections in all datasets to make the new selection, and it will no longer be based on "red" records only (i.e., it will be based on all mascot colors). Therefore, to keep my selection of the single row/feature, I have to "add to" when selecting ancillary data. Does this make any sense?
The odd bit is that my code works part of the time, and therefore is incorrect only part of the time. It just seems strange to me that it would work sometimes and not others. I feel like if it were something in the way I use a tool, none of my results would be right. Thoughts?
Ok I just thought may be the missing units could be a potential issue, but it does not sound like it.
I understand your explanation but I disagree. Whilst Arcmap reports the total number of features selected in the statusbar your code is working at the layer level. Your code selects a single well based upon the DataSetPrimkey and this selection is used to do a within distance selection on the layer "AFOLyr". So for argument sakes it selects 100 features in the AFOLyr layer. Yes Arcmap will display a selection of 101 in the statusbar but you have 1 well selected and 100 AFOLyr features. Your get count code is counting the number of selected features in the AFOLyr layer so get count will return a value of 100. Your code then clears the selection on each layer.
The NEW_Selection clears the selection on the AFOLyr layer as that is what you are selecting on, this does not clear the single selected well point.
So why I keep banging on about the selection type is that your discrepancies (which is something to be concerned about) may be due to existing selections on your dataset which you are unintentionally adding to.
Humour me, make it a NEW selection and see what your results are, the worse case is that they still have discrepancies which indicates something else is at fault?
I tried running the code on a small data subset using "new selection," and I received the same results :-/ I've noticed, however, that running "Select Layer By Location" in the ArcMap Python window and running "Select By Location" manually from the Selection drop-down menu create the count differences I'm seeing. So, I think there is some underlying difference between these two tools regarding how they draw circles. For a single feature I tested, "Select Layer By Location" highlighted two features up to 80 meters outside of my specified distance, and "Select By Location" did not highlight these features. I plan to inquire about it at the ESRI UC this summer... maybe someone there will have an answer?! Thanks for your persistence with trying to help me figure this one out!
I just spoofed up some point data with a single point selected and carried out a selectbylocation with a within distance of 100m on a line layer. For me the selectbylocation tool from the menu selects exactly the same polylines as the selectbylocation geo-processing tool when called from the Python window. So I cannot replicate your problem.
Soo... may be the datasets are corrupt in some way? Things to consider are:
- What are their storage format, do they differ? Are you trying to select data in a shapefile using a geodatabase dataset?
- If you are using geodatabases, is it an old 9.1 database I know they have lower precision? If in arcsde, export to shapefiles and try those.
- Spatial index has become corrupted, remove it and add it again
- Geomtries are corrupted, use the check geometry tool
- Have you set geoprocessor environments in ArcMap (go to geoprocessing > environments and clear anything that may have been set in a previous session)
Looks like you are not the only person having this problem, look here.
Huh, strange you received the same counts with both methods. Did you use the two "different" tools (i.e., SelectByLocation and SelectLayerByLocation)? How large was your spoof dataset? About 2/3 of my data is in agreement between the two tools, and when disagreement occurs it's seemingly random. Could there be a chance your dataset was too small??
The storage format is the same for all the data I'm using: a personal geodatabase created in ArcCatalog 10.0 - so everything's a feature class. Regarding spatial index and geometries, I've actually "re-created" my well points dataset a couple of times when my clients have made changes, including one time recently, and the count problem still occurred. The ancillary data was downloaded from a state website.
I would love to upload the data for your testing, but unfortunately it's private and can't be shared. I do appreciate you being willing to look at it though!
The link you provided (re: people w/ a similar problem) didn't connect to the appropriate page; it took me to a Developer Summit login page for voting on something. It's definitely relieving to know others have experienced similar issues though! Maybe try sending the link again?
And on a random, new-forum-user note... is there a way to get email notification when someone replies to your thread? I'm "subscribed" to this thread but never receive alerts on when posts occur.
Again, many thanks!
Oh, and my geoprocessing environments are good/cleared of any unwanted settings...
Interesting that link has stopped working, I always double check those sort of things, as there is nothing more annoying than a broken link! It looks like ESRI are removing ideas off the website, may be when they are a bit unpalatable for them?
You might need to go into your forum profile settings and tick some box on to get the emails, can't remember or its your spam filter blocking them?
I did test the behaviour of SelectByLocation and SelectLayerByLocation on a small dataset (if you consider 38,000 polylines small) and experienced no difference in selection count.
It looks like there is something fundamentally different between the two tools as alluded to by curtvprice. You should bring this up with their support and they will probably want some of the data to try and replicate it.
One last idea (which won't fix the problem but may indicate which one you have more faith in) is to create a buffer layer from your points and use those instead of doing a within distance selection. You'll have a defined geometry which you can see rather than some black box processing done by the tool... Good luck!
I have tried creating buffers for a subset of the data and had the same problem. The frustrating thing is, it's not really an issue of choosing the 'better' tool; I need to automate this process, and Python only supports the SelectLayerByLocation tool. When I run each of the two tools by hand in ArcMap, I get the same problem that's reflected when I run SelectLayerByLocation in code, so I think it may be true that these two tools just work differently. SelectByLocation seems to produce the 'true' results, however I cannot implement SelectByLocation by hand for thousands of records and manually imput the count from each selection into a table - it really has to be automated, which leaves me no choice but to go with SelectLayerByLocation :-/ I wonder why the two tools work so differently... and I wonder why the tool worked for you. Maybe I'll test out a polyline dataset since that's what you used.
Anyhow, many thanks, and happy Friday!!
It very well may be that these two different 'SelectByLocation' tools (when specifying the 'WITHIN_A_DISTANCE' option) are using two different buffering methods: A geodesic distance vs. a euclidean distance. I suspect that the toolbox tool is using the geodesic distance method while the (older) menu-based tool may still be using the euclidean method.
Which result tool gives the "correct" answer?
Is your dataset in a Geographic coordinate system or is it projected?
Some more info:
Both methods are correct - just depends on how you look at it.
Thanks for your input (and the links)! Most of your comments have already been addressed in previous posts, for instance I've tried using "COMPLETELY_WITHIN" (buffers) instead of "WITHIN_A_DISTANCE" (of points), all data are in the same projected coord. system (and stored in a personal GDB), and the "correct" tool seems to be SelectByLocation (based on its return of all points visually within the buffer feature). (I realize "correct" tool is somewhat arbitrary since you cannot compare each tool to a "true" value, and instead can only compare them to each other. My "correct" tool assignment was based on visual inclusion as viewed in ArcMap, where the data frame coord. system is the same as that of my data.) I think you're right in that there could definitely be something going on with differences in geodesic dists vs. euclidean dists. Is possible for this difference to be >80 meters though? For example, I've found cases where SelectLayerByLocation included a feature that was >80m beyond the perimeter of the buffer feature (as measured in ArcMap with the distance tool)...