GetCount not working properly

05-04-2012 07:03 AM
AbbyFlory
New Contributor
Hi,

I am using PyScripter to write code that does the following:

1) Turn an input feature class into a feature layer ("Input_FL"),
2) Turn an ancillary-data feature class into a feature layer ("Ancill_FL"),
3) For each row in "Input_FL", select "Ancill_FL" within 10km,
4) Set Count = arcpy.GetCount_management(Ancill_FL).getOutput(0),
5) Update specific field in row with Count value,
6) Once all rows have finished, create new feature class from feature layer "Input_FL".
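
For readers following along, the six steps above could be sketched roughly as below. This is a minimal sketch and not the code actually used in this thread: the function names, the `NearCount` field (assumed to exist already), and the paths passed in are placeholders, and it assumes the ArcGIS 10.0-style cursors (`arcpy.SearchCursor`, `arcpy.UpdateCursor`). It uses `NEW_SELECTION` on the ancillary layer so each count starts from a clean selection.

```python
def distance_string(value, unit="Kilometers"):
    """Build a linear-unit string such as '10 Kilometers' for search_distance."""
    return "{0} {1}".format(value, unit)


def count_within_distance(input_fc, ancill_fc, out_fc, count_field="NearCount",
                          distance=distance_string(10)):
    # arcpy is imported here so the helper above can be used without ArcGIS.
    import arcpy

    # Steps 1 and 2: feature classes -> feature layers.
    arcpy.MakeFeatureLayer_management(input_fc, "Input_FL")
    arcpy.MakeFeatureLayer_management(ancill_fc, "Ancill_FL")

    oid_field = arcpy.Describe("Input_FL").OIDFieldName
    oid_sql = arcpy.AddFieldDelimiters("Input_FL", oid_field)

    # Collect the object IDs first so no cursor is open while selecting.
    oids = [row.getValue(oid_field) for row in arcpy.SearchCursor("Input_FL")]

    counts = {}
    for oid in oids:
        # Step 3: select one input feature, then the ancillary features near it.
        arcpy.SelectLayerByAttribute_management(
            "Input_FL", "NEW_SELECTION", "{0} = {1}".format(oid_sql, oid))
        arcpy.SelectLayerByLocation_management(
            "Ancill_FL", "WITHIN_A_DISTANCE", "Input_FL", distance, "NEW_SELECTION")
        # Step 4: GetCount returns a Result; getOutput(0) is the count as a string.
        counts[oid] = int(arcpy.GetCount_management("Ancill_FL").getOutput(0))

    # Step 5: clear the selection so the update cursor visits every row.
    arcpy.SelectLayerByAttribute_management("Input_FL", "CLEAR_SELECTION")
    cursor = arcpy.UpdateCursor("Input_FL")
    for row in cursor:
        row.setValue(count_field, counts[row.getValue(oid_field)])
        cursor.updateRow(row)
    del cursor

    # Step 6: write the layer out as a new feature class.
    arcpy.CopyFeatures_management("Input_FL", out_fc)
```

Passing the distance with an explicit unit (for example `distance_string(10000, "Meters")`) avoids relying on the default unit of the input coordinate system.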

Upon checking the results, the majority of counts are correct, but some are off by one or two.  My "checking" consists of performing the selection manually in ArcMap and viewing the selected items counted in the attribute table (along with visual inspection).

Has anyone else experienced this problem?  And does anyone know why arcpy vs. manual selection would produce different counts, and/or why my arcpy results were correct most of the time, but incorrect some of the time? 

Thanks in advance for any help!
19 Replies
AbbyFlory
New Contributor
Duncan:
Re. comment 1: The ArcGIS10 Resource Center gives an example without units (http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//001700000072000000 - see bottom of page), and I read somewhere (maybe on that page??) that if no unit is given it uses the linear unit of the input coordinate system (meters in my case).  All data I'm using is in the same projection, so everything is in meters.

Re. comment 2: For example, let's say you are in ArcMap and you select records from "Schools" where "MascotColor" = red.  If you would then like to find "Streets" within 10km of the selected "red" features, you would have to use "add to current selection."  Even though the streets data is in a different dataset, Arc sees all selections - i.e., from all datasets - and therefore you have to add to your current selection (which is within "Schools") even though you are making a selection from "Streets."  If you use "new selection" it will clear all selections in all datasets to make the new selection, and it will no longer be based on "red" records only (i.e., it will be based on all mascot colors).  Therefore, to keep my selection of the single row/feature, I have to "add to" when selecting ancillary data.  Does this make any sense?

The odd bit is that my code works part of the time, and therefore is incorrect only part of the time.  It just seems strange to me that it would work sometimes and not others.  I feel like if it were something in the way I use a tool, none of my results would be right.  Thoughts?
DuncanHornby
MVP Notable Contributor
Abby,

OK, I just thought the missing units might be an issue, but it doesn't sound like it.

I understand your explanation but I disagree. Whilst ArcMap reports the total number of features selected in the status bar, your code is working at the layer level. Your code selects a single well based upon the DataSetPrimkey, and this selection is used to do a within-distance selection on the layer "AFOLyr". So, for argument's sake, say it selects 100 features in the AFOLyr layer. Yes, ArcMap will display a selection of 101 in the status bar, but you have 1 well selected and 100 AFOLyr features. Your GetCount code is counting the number of selected features in the AFOLyr layer, so GetCount will return a value of 100. Your code then clears the selection on each layer.

NEW_SELECTION clears the selection on the AFOLyr layer, as that is the layer you are selecting on; it does not clear the single selected well point.

The reason I keep banging on about the selection type is that your discrepancies (which are something to be concerned about) may be due to existing selections on your dataset which you are unintentionally adding to.

Humour me: make it a NEW_SELECTION and see what your results are. The worst case is that the discrepancies remain, which would indicate that something else is at fault.
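
To make the point about selection types concrete, here is a toy model in plain Python (not arcpy; the `select` function and the feature IDs are made up for illustration). It only shows how a per-layer count can be inflated if an earlier selection on that layer is ever left in place and the next one is added to it.

```python
def select(current, matches, selection_type="NEW_SELECTION"):
    """Return the layer's selection after applying one selection operation.

    current and matches are sets of feature IDs. This is a toy model of
    selection semantics, not arcpy itself.
    """
    if selection_type == "NEW_SELECTION":
        return set(matches)                 # replaces whatever was selected before
    if selection_type == "ADD_TO_SELECTION":
        return set(current) | set(matches)  # keeps the earlier selection too
    raise ValueError("unsupported selection type: " + selection_type)


# Ancillary features within the distance of two different wells (made-up IDs).
near_well_1 = {1, 2, 3}
near_well_2 = {3, 4}

# NEW_SELECTION: each well's count is independent of what came before.
new_counts = []
selection = set()
for matches in (near_well_1, near_well_2):
    selection = select(selection, matches, "NEW_SELECTION")
    new_counts.append(len(selection))

# ADD_TO_SELECTION without clearing in between: the second count is inflated.
add_counts = []
selection = set()
for matches in (near_well_1, near_well_2):
    selection = select(selection, matches, "ADD_TO_SELECTION")
    add_counts.append(len(selection))

print(new_counts)  # [3, 2]
print(add_counts)  # [3, 4]
```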

Duncan
AbbyFlory
New Contributor
Duncan:
I tried running the code on a small data subset using "new selection," and I received the same results 😕  I've noticed, however, that running "Select Layer By Location" in the ArcMap Python window and running "Select By Location" manually from the Selection drop-down menu create the count differences I'm seeing.  So, I think there is some underlying difference between these two tools regarding how they draw circles.  For a single feature I tested, "Select Layer By Location" highlighted two features up to 80 meters outside of my specified distance, and "Select By Location" did not highlight these features.  I plan to inquire about it at the ESRI UC this summer... maybe someone there will have an answer?!  Thanks for your persistence with trying to help me figure this one out!
DuncanHornby
MVP Notable Contributor
Abby,

I just spoofed up some point data with a single point selected and carried out a selectbylocation with a within distance of 100m on a line layer. For me the selectbylocation tool from the menu selects exactly the same polylines as the selectbylocation geo-processing tool when called from the Python window. So I cannot replicate your problem.

So... maybe the datasets are corrupt in some way? Things to consider:

  • What is their storage format, and do they differ? Are you trying to select data in a shapefile using a geodatabase dataset?

  • If you are using geodatabases, is it an old 9.1 database? I know they have lower precision. If the data is in ArcSDE, export to shapefiles and try those.

  • The spatial index may have become corrupted; remove it and add it again.

  • The geometries may be corrupted; use the Check Geometry tool.

  • Have you set geoprocessing environments in ArcMap? (Go to Geoprocessing > Environments and clear anything that may have been set in a previous session.)

You could also try uploading a zipped copy of your data for us to see if we can replicate the problem?

Looks like you are not the only person having this problem, look here.

Duncan
AbbyFlory
New Contributor
Duncan,

Huh, strange you received the same counts with both methods.  Did you use the two "different" tools (i.e., SelectByLocation and SelectLayerByLocation)?  How large was your spoof dataset?  About 2/3 of my data is in agreement between the two tools, and when disagreement occurs it's seemingly random.  Could there be a chance your dataset was too small?? 

The storage format is the same for all the data I'm using: a personal geodatabase created in ArcCatalog 10.0 - so everything's a feature class.  Regarding spatial index and geometries, I've actually "re-created" my well points dataset a couple of times when my clients have made changes, including one time recently, and the count problem still occurred.  The ancillary data was downloaded from a state website.

I would love to upload the data for your testing, but unfortunately it's private and can't be shared.  I do appreciate you being willing to look at it though! 

The link you provided (re: people w/ a similar problem) didn't connect to the appropriate page; it took me to a Developer Summit login page for voting on something.  It's definitely relieving to know others have experienced similar issues though!  Maybe try sending the link again?

And on a random, new-forum-user note... is there a way to get email notification when someone replies to your thread?  I'm "subscribed" to this thread but never receive alerts on when posts occur. 

Again, many thanks!

Abby
AbbyFlory
New Contributor
Oh, and my geoprocessing environments are good/cleared of any unwanted settings...
DuncanHornby
MVP Notable Contributor
Abby,

Interesting that the link has stopped working. I always double-check that sort of thing, as there is nothing more annoying than a broken link! It looks like ESRI are removing ideas from the website, maybe when they are a bit unpalatable for them?

You might need to go into your forum profile settings and tick a box to get the emails (I can't remember exactly), or maybe it's your spam filter blocking them?

I did test the behaviour of SelectByLocation and SelectLayerByLocation on a small dataset (if you consider 38,000 polylines small) and experienced no difference in selection count.

It looks like there is something fundamentally different between the two tools as alluded to by curtvprice. You should bring this up with their support and they will probably want some of the data to try and replicate it.

One last idea (which won't fix the problem but may indicate which one you have more faith in) is to create a buffer layer from your points and use those instead of doing a within distance selection. You'll have a defined geometry which you can see rather than some black box processing done by the tool... Good luck!
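
In the same spirit of replacing the black box with something you can inspect, a brute-force distance test could be run against whatever the tool selected. The sketch below is plain Python with made-up coordinates; `check_selection` is a hypothetical helper, it assumes point features in a projected coordinate system with meter units, and for lines or polygons the distance to the nearest edge would be needed instead.

```python
import math


def check_selection(origin, candidates, selected_ids, distance):
    """Compare a tool's selection with a brute-force planar distance test.

    origin       -- (x, y) of the input point, in projected units (e.g. meters)
    candidates   -- dict mapping feature ID -> (x, y) of each ancillary point
    selected_ids -- IDs the selection tool reported
    distance     -- search distance in the same units as the coordinates

    Returns (extra, missing): IDs selected although farther than distance,
    and IDs within distance that were not selected.
    """
    within = set()
    for fid, (x, y) in candidates.items():
        if math.hypot(x - origin[0], y - origin[1]) <= distance:
            within.add(fid)
    selected = set(selected_ids)
    return sorted(selected - within), sorted(within - selected)


# Made-up coordinates in meters; feature 3 lies 10,080 m from the origin.
well = (0.0, 0.0)
ancillary = {1: (3000.0, 4000.0), 2: (0.0, 9999.0),
             3: (10080.0, 0.0), 4: (6000.0, 7000.0)}

extra, missing = check_selection(well, ancillary, selected_ids=[1, 2, 3],
                                 distance=10000.0)
print(extra)    # [3]
print(missing)  # [4]
```

Any ID that shows up in `extra` or `missing` is a feature worth measuring by hand in ArcMap.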

Duncan
AbbyFlory
New Contributor
Duncan:

Thanks!

I have tried creating buffers for a subset of the data and had the same problem.  The frustrating thing is, it's not really an issue of choosing the 'better' tool; I need to automate this process, and Python only supports the SelectLayerByLocation tool.  When I run each of the two tools by hand in ArcMap, I get the same problem that's reflected when I run SelectLayerByLocation in code, so I think it may be true that these two tools just work differently.  SelectByLocation seems to produce the 'true' results; however, I cannot run SelectByLocation by hand for thousands of records and manually input the count from each selection into a table - it really has to be automated, which leaves me no choice but to go with SelectLayerByLocation 😕  I wonder why the two tools work so differently... and I wonder why the tool worked for you.  Maybe I'll test out a polyline dataset since that's what you used.

Anyhow, many thanks, and happy Friday!!

Abby
ChrisSnyder
Regular Contributor III
It very well may be that these two different 'SelectByLocation' tools (when specifying the 'WITHIN_A_DISTANCE' option) are using two different buffering methods: a geodesic distance vs. a Euclidean distance. I suspect that the toolbox tool is using the geodesic distance method while the (older) menu-based tool may still be using the Euclidean method.

Which tool gives the "correct" answer?

Is your dataset in a Geographic coordinate system or is it projected?

Some more info:
----------------
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00080000001s000000
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//000800000019000000
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00s500000022000000.htm

Both methods are correct - just depends on how you look at it.
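
As a rough illustration of how the two kinds of distance can disagree, the sketch below compares a great-circle (geodesic on a sphere) distance with the straight-line distance between the same two points after projecting them. Everything here is an assumption made for illustration: the Earth is treated as a sphere, and spherical Mercator is used only because its distortion is large and easy to compute. It is not a claim about which method either tool actually uses, or about the projection of the data in this thread.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def geodesic_distance(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters (haversine formula on a sphere)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mercator_xy(lon, lat):
    """Project to spherical Mercator coordinates in meters."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def planar_distance(lon1, lat1, lon2, lat2):
    """Euclidean distance in meters between the projected points."""
    x1, y1 = mercator_xy(lon1, lat1)
    x2, y2 = mercator_xy(lon2, lat2)
    return math.hypot(x2 - x1, y2 - y1)


# Two points 0.1 degrees of longitude apart, at three different latitudes.
for lat in (0.0, 30.0, 60.0):
    g = geodesic_distance(0.0, lat, 0.1, lat)
    p = planar_distance(0.0, lat, 0.1, lat)
    print(lat, round(g), round(p), round(p / g, 3))
```

At the equator the two distances agree, while at 60 degrees latitude the projected distance is about twice the great-circle one, so the size of the disagreement depends heavily on the projection and on where the data sits within it.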
AbbyFlory
New Contributor
Hey Chris,

Thanks for your input (and the links)!  Most of your comments have already been addressed in previous posts; for instance, I've tried using "COMPLETELY_WITHIN" (buffers) instead of "WITHIN_A_DISTANCE" (of points), all data are in the same projected coord. system (and stored in a personal GDB), and the "correct" tool seems to be SelectByLocation (based on its return of all points visually within the buffer feature).  (I realize "correct" tool is somewhat arbitrary since you cannot compare each tool to a "true" value, and instead can only compare them to each other.  My "correct" tool assignment was based on visual inclusion as viewed in ArcMap, where the data frame coord. system is the same as that of my data.)  I think you're right in that there could definitely be something going on with differences in geodesic dists vs. euclidean dists.  Is it possible for this difference to be >80 meters though?  For example, I've found cases where SelectLayerByLocation included a feature that was >80m beyond the perimeter of the buffer feature (as measured in ArcMap with the distance tool)...
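
One way to put a number on the ">80 meters" question is a back-of-the-envelope calculation. The sketch below assumes a simple first-order model in which a projected length equals the true ground length multiplied by the projection's local scale factor; the function names are made up, and the UTM value is only a familiar reference point since the projection used for this data is not stated in the thread.

```python
def planar_vs_true_gap(distance_m, scale_factor):
    """Difference in meters between a projected (planar) length and the
    true ground length when the local scale factor is scale_factor."""
    return abs(scale_factor - 1.0) * distance_m


def scale_factor_needed(gap_m, distance_m):
    """Scale-factor deviation from 1.0 needed to produce gap_m over distance_m."""
    return gap_m / distance_m


search_distance = 10000.0  # 10 km, from the original question

# Deviation needed to explain an 80 m disagreement over 10 km.
needed = scale_factor_needed(80.0, search_distance)
print(needed)  # 0.008, i.e. 0.8 %

# UTM's central-meridian scale factor of 0.9996 gives a far smaller gap.
print(planar_vs_true_gap(search_distance, 0.9996))
```

On this arithmetic, an 80 m gap over 10 km needs a distortion of about 0.8%, while a scale factor of 0.9996 accounts for only about 4 m. So a geodesic vs. Euclidean difference alone would explain the gap only if the projection distorts distances quite heavily in the study area.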

Abby