Adding random spatial error to addresses using python?

01-03-2014 03:35 AM
MichelleDeBartolo-Stone
New Contributor
Hi.  I have geocoded a list of ~2,500 survey respondent addresses and now want to add random spatial error to each location based on underlying population density while also retaining points within their original census block and keeping points out of water bodies.  I am not familiar with Python and am wondering if this is something I could do using Python within ArcGIS 10.2.  ...or maybe there is a better way?   

Development varies from urban areas to large rural tracts, and geocoding addresses to the census block centroid isn't ideal in the rural areas.

Essentially, my thoughts are:
(1) Add random spatial error to each point: 360 degrees, distance based on population density.  e.g. if pop. density is < 1 person/sq mile, then distance offset would be within 2 mile radius; if pop. density is 1 < x < 50, then distance offset would be within 1 mile radius...
(2) Keep point within the boundary of the census block from which it originates
(3) Keep point out of water bodies.

I appreciate any thoughts, advice, or help that may be offered.  Thank you.
Michelle
4 Replies
GeraldineEGGERMONT
New Contributor II
Hi Michelle,

In short, you could use a SearchCursor that iterates over your address points. For each point, it first creates a custom buffer zone with a radius linked to the population density, limited to the underlying census polygon and excluding water bodies, and then creates a random point within this zone using Create Random Points. Merge all the points together when the cursor is done.
I can provide details if you want, but this is going to be very slow, it requires ArcInfo/ArcGIS Advanced, and it may not be easy if you're not familiar with Python.
I would suggest geocoding your addresses to a less precise level instead (e.g. street instead of address, or city instead of street). Could this be an option for you?

Geraldine
MichelleDeBartolo-Stone
New Contributor
Hi Geraldine,
Thank you for your response.  I could geocode to the census block centroid, but in rural areas I would lose more spatial information than I would ideally like... but this may be the best option.  I am not familiar with SearchCursor.  If it is not too much trouble to provide details on your suggestion, I would like to take a look to see whether it is something I could take on, or whether it is worth taking on.  I forgot to mention that I would need to retain a unique ID for each point, associated with each survey respondent.  Thank you again for your response.
Sincerely,
Michelle
GeraldineEGGERMONT
New Contributor II
Hello Michelle,

You should first prepare your census and address feature classes (this can be done without Python):

  • Remove water bodies from census with Erase

  • Add a field representing the radius of the buffer to census, and calculate it based on the population density field

  • Join the radius field to the address fc with a Spatial Join.
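For the radius field, the rule from the original question could be written as a small function, for instance in a standalone script or adapted for the Field Calculator. This is only a sketch: the name radius_miles is made up, the first two tiers come from the question, and the value used for denser blocks (0.5 mile) is a placeholder assumption because the question leaves it open. The question also doesn't say what happens at exactly 1 person/sq mile; this sketch puts it in the 1 mile tier.

```python
def radius_miles(pop_density, dense_radius=0.5):
    """Return the maximum offset distance (miles) for a population density.

    pop_density: people per square mile.
    dense_radius: placeholder for densities of 50 or more (an assumption,
    not a value given in the original question).
    """
    if pop_density < 1:
        return 2.0   # very sparse: offset within a 2 mile radius
    if pop_density < 50:
        return 1.0   # sparse: offset within a 1 mile radius
    return dense_radius
```

For example, radius_miles(0.5) gives 2.0 and radius_miles(10) gives 1.0.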


Then create your SearchCursor like this:

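import arcpy

addresses = r"c:\data.gdb\addresses"
census = r"c:\data.gdb\census"
fields = ["ADDRESS_ID", "RADIUS"]

with arcpy.da.SearchCursor(addresses, fields) as cursor:
    for row in cursor:
        ID = row[0]      # first field in the cursor (ADDRESS_ID)
        radius = row[1]  # second field (RADIUS, from the Spatial Join)
        # the per-point steps described below go here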


then for each row (= each point):

  • Create a variable containing the ADDRESS_ID (ID = row[0], i.e. the first field in your cursor)

  • Buffer the point using radius as distance

  • Select the underlying census polygon using Make Feature Layer + Select by Location

  • Clip the buffer with the selected census polygon. The output is going to be the constraining feature class for the Create Random Points tool

  • Create a random point in this zone with Create Random Points

  • Add an ID field to the random point (same format as your ADDRESS_ID field from the address fc)

  • Calculate the field, assign it the ID variable

  • Clear the census selection.
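To make the logic of these steps easier to follow, here is a rough pure-Python stand-in. It does not use the ArcGIS tools at all: instead of Buffer, Clip and Create Random Points, it draws candidate points inside the buffer circle and rejects any that fall outside the census block or inside water. The function names, the toy coordinates and the max_tries limit are all made up for illustration, and it assumes projected coordinates in the same units as the radius.

```python
import math
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def jitter_point(x, y, radius, block, water=(), rng=random, max_tries=1000):
    """Return a random (x, y) within `radius` of the input point that lies
    inside `block` and outside every polygon in `water`, or None if no
    candidate is accepted within max_tries."""
    for _ in range(max_tries):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # The square root spreads points evenly over the circle's area.
        dist = radius * math.sqrt(rng.random())
        nx = x + dist * math.cos(angle)
        ny = y + dist * math.sin(angle)
        if not point_in_polygon(nx, ny, block):
            continue  # left the census block: try again
        if any(point_in_polygon(nx, ny, w) for w in water):
            continue  # landed in water: try again
        return nx, ny
    return None


# Hypothetical data: one address in a 10 x 10 block with a small pond.
block = [(0, 0), (10, 0), (10, 10), (0, 10)]
pond = [(4, 4), (6, 4), (6, 6), (4, 6)]
addresses = {"A001": (5.0, 3.0, 2.0)}  # ADDRESS_ID -> (x, y, radius)

rng = random.Random(42)  # fixed seed so the run is repeatable
jittered = {}
for address_id, (x, y, radius) in addresses.items():
    # Keying the result by ADDRESS_ID keeps the link to the respondent.
    jittered[address_id] = jitter_point(x, y, radius, block, [pond], rng)
```

Handling water as a rejection test here plays the same role as erasing water from the census polygons beforehand. A None result means the point could not be moved within the limit and would need to be handled separately.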


You could either append each point to an empty point feature class you create before starting the cursor, or merge all points together when the cursor is done.
To make your script run faster, use geometry objects instead of feature classes as tool inputs and outputs whenever possible.
If you want to start learning Python, there is a free tutorial (3 hours) here. There are other training courses (not always free) and plenty of examples in the Online Help. Exporting a ModelBuilder model to a Python script sometimes helps in figuring out the correct syntax.

Good luck and don't hesitate to come back to me if you have questions,

Geraldine
MichelleDeBartolo-Stone
New Contributor
Geraldine,
Thank you for your detailed response.  I will attempt this and see how far I can get.  Thank you also for directing me to the Python tutorial- that will be very helpful. 
Michelle