POST
The $ARCHOME\ArcToolbox\Scripts directory contains all of the code for the Python Tools. "LocalI.py" is the one you are looking for. Best wishes, MJ
Posted 02-10-2015 11:22 AM

POST
Thanks for spotting this, Can... it was, in fact, an error in our documentation. Rest assured that the values are calculated correctly in the code... but you can always check for yourself, as it is in Python :-). We will update the documentation ASAP. Again, many thanks. Mark Janikas, Product Developer
Posted 02-10-2015 11:04 AM

POST
Hi Dan, While the k-function has had some speed improvements over the last few releases, it is an inherently slow algorithm. How many point features do you have? The 999 permutations are causing the delay... but it shouldn't run out of memory, so at some point it should finish. At 10.2.1 we moved a portion of our core calculations to C++ to improve performance. The k-function is slated for this improvement in the next release (i.e., post 10.2.1). We saw 50-times speedups for some of our clustering algorithms, so perhaps the k-function will see a dramatic improvement as well. That said, the nature of the algorithm will always be poor from a Big O perspective.
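To illustrate why the permutations dominate the runtime, here is a minimal sketch of a k-function with a simulation envelope (plain NumPy, no edge correction, unit-square study area assumed; this is a sketch, not the tool's actual implementation). Each permutation repeats the full O(n^2) pairwise-distance pass:

```python
import numpy as np

def k_function(points, distances):
    # Naive Ripley's K estimate: count pairs within each distance band.
    # O(n^2) in the number of points; no edge correction (sketch only).
    n = len(points)
    area = 1.0  # unit-square study area assumed for this sketch
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    pairs = np.array([(d < r).sum() - n for r in distances], dtype=float)
    return area * pairs / (n * (n - 1))

rng = np.random.default_rng(0)
pts = rng.random((200, 2))
bands = np.linspace(0.05, 0.25, 5)
observed = k_function(pts, bands)

# The confidence envelope repeats the full O(n^2) pass once per simulated
# pattern, which is where the hours go when n is large:
n_perms = 99  # the tool's default of 999 would take roughly 10x longer
sims = np.array([k_function(rng.random((200, 2)), bands) for _ in range(n_perms)])
lower, upper = sims.min(axis=0), sims.max(axis=0)
```

Doubling the feature count roughly quadruples the cost of every one of those passes, which is why the algorithm scales poorly regardless of implementation language.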
Posted 11-15-2013 07:33 AM

POST
The patience paid off: 550 hours and it finally produced an output! At least it didn't run out of memory! The local stats shouldn't, as they process a row-compressed sparse weights structure that does not require any transpose to calculate... each row is processed independently and deleted from memory before proceeding to the next.

Regarding the earlier comment about extrapolating the processing time: the number of features yields a constant per-feature processing time ONLY if the number of neighbors is fixed, as with k-nearest neighbors. If you are using a distance band with the default value, you could have wildly different numbers of neighbors for each feature. (The default also needs an additional loop to calculate: one pass to find the distance that assures every feature has at least one neighbor, and then the calculation loop using that distance.)

We have a change request to allow k-nearest neighbors and Delaunay on-the-fly in the tool itself, but for the former we would have to add another parameter... for now you have to use a Spatial Weights Matrix file (SWM), which has a 2GB limit for 32-bit, so you may be out of luck on that end.

Your best bet for a faster solve of Hot Spot Analysis on a dataset that large is to choose a smaller distance band: it will solve dramatically faster, AND that statistic does not require that each feature have a neighbor, because each feature is considered a neighbor of itself. For Cluster and Outlier Analysis, that wouldn't work. Best, MJ
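As a rough illustration of the row-at-a-time idea (a simplified sketch with binary, unstandardized weights; `local_gi_star_rowwise` is a hypothetical name, not the tool's code): each feature needs only its own neighbor list, so nothing resembling a full N-by-N matrix is ever held in memory.

```python
import numpy as np

def local_gi_star_rowwise(values, neighbors):
    # Sketch of a local Getis-Ord Gi* computed one sparse row at a time:
    # each feature's neighbor list is touched once and can then be freed.
    x = np.asarray(values, dtype=float)
    n = len(x)
    xbar, s = x.mean(), x.std()
    results = np.empty(n)
    for i, nbrs in enumerate(neighbors):
        ids = list(nbrs) + [i]        # Gi* counts the feature as its own neighbor
        w = np.ones(len(ids))         # binary weights for this sketch
        sw, sw2 = w.sum(), (w ** 2).sum()
        num = (w * x[ids]).sum() - xbar * sw
        den = s * np.sqrt((n * sw2 - sw ** 2) / (n - 1))
        results[i] = num / den
    return results

vals = [1, 2, 9, 8, 1, 2]
nbrs = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
z = local_gi_star_rowwise(vals, nbrs)  # positive where high values cluster
```

The row loop is why the memory footprint tracks the largest single neighbor list rather than the total number of linkages.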
Posted 08-06-2013 01:25 PM

POST
I am the dev for the product. Thanks to all for the replies. Yes, if the weights are different, then the results are going to be different. This is particularly evident with Inverse Distance, as we apply a hybrid to avoid weights greater than 1: if dist < 1, then w = 1; else w = 1/dist^exponent.

We also use the randomization assumption for the variance, but there is no "Monte Carlo"... that would be a "permutation" approach, which we do not use due to the extreme computational cost. We also apply a "two-sided" alternative hypothesis. So, if you get the same weights into R in the form of a listw:

1. Make sure you honor the row standardization approach in both products.
2. Try to use "Fixed Distance" to assure that our alternative IDW doesn't cause the issue.
3. Make sure the alternative hypothesis is "two.sided".

I am attaching a zip file that contains CA counties and the SWM/GAL files necessary to compare. Just run Moran inside ArcGIS using caQueen.swm, and then run the R script to use spdep with caQueen.gal. You will note that they are the same. The image is in the zip file as well if you want to just take my word for it... but the R script should show you how to call moran.test in a manner consistent with ArcGIS. Thanks much, MJ
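The hybrid inverse-distance rule described above is simple enough to state as a one-liner (a sketch; the function name is mine, not an Esri API):

```python
def hybrid_idw(dist, exponent=1.0):
    # Hybrid inverse-distance weight: capped at 1 for distances below 1,
    # otherwise the usual 1/d^exponent decay.
    return 1.0 if dist < 1.0 else 1.0 / dist ** exponent
```

The cap is what makes results diverge from a plain 1/d^p weighting whenever any pair of features sits closer than one unit apart, so unit choice matters when comparing against other packages.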
Posted 10-23-2012 08:33 AM

POST
As a follow-up... I am being informed that this bug has been fixed in SP5, and I am positive that it does not persist in 10.1. As such, please try installing SP5 and see if it fixes the problem. MJ
Posted 09-24-2012 01:52 PM

POST
Here is some code that places your N-by-N matrix into the sparse GWT format:
import locale as LOCALE

fo = open("weights.txt", "r")
fw = open("weights.gwt", "w")

# Copy the header (the unique ID field name) straight across.
header = fo.readline()
fw.write("%s" % header)

# Each remaining row holds: rowID w1 w2 ... wN
data = fo.readlines()
for row in data:
    rowVals = row.split()
    rowID = rowVals[0]
    weights = rowVals[1:]
    for colID, weight in enumerate(weights):
        w = LOCALE.atof(weight)
        # Only non-zero weights go into the sparse GWT file;
        # neighbor IDs are assumed to be 1..N in column order.
        if w != 0.0:
            fw.write("%s %i %s\n" % (rowID, colID + 1, weight))

fo.close()
fw.close()
I put your format into "weights.txt" and get out "weights.gwt". This will allow you to use it directly in most of our tools. However, if you want to construct the SWM format, it is perhaps best to change the header line to:

fw.write("%s %s %s\n" % (header.strip(), "NID", "WEIGHT"))

Then open the output file in Excel and save it in DBF format. Lastly, call the "Generate Spatial Weights Matrix" tool in ArcGIS and use the "Convert from Table" conceptualization to get the binary SWM format. There are many ways to skin this cat, and I really like that you are attempting to use PySAL for it. Unfortunately, I have been too busy with the release to get to working with conversions in the latest and greatest PySAL. Perhaps I'll wrap some code into my WeightsUtilities to convert full NumPy arrays to SWM formats...
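In the spirit of that last thought, here is a rough sketch of what a dense-NumPy-array-to-GWT converter could look like (a hypothetical helper: `dense_to_gwt`, the default ID field name, and the 1-based ID assumption are all mine, not part of any Esri or PySAL API):

```python
import numpy as np

def dense_to_gwt(weights, path, id_field="MYID", ids=None):
    # Write a dense n-by-n weights matrix in the sparse text GWT layout:
    # header = unique ID field name, then "ID NeighborID Weight" for each
    # non-zero entry. Feature IDs default to 1..n.
    n = weights.shape[0]
    ids = ids if ids is not None else list(range(1, n + 1))
    with open(path, "w") as fw:
        fw.write("%s\n" % id_field)
        for i in range(n):
            for j in np.nonzero(weights[i])[0]:
                fw.write("%s %s %s\n" % (ids[i], ids[j], weights[i, j]))

w = np.array([[0.0, 0.5], [0.5, 0.0]])
dense_to_gwt(w, "dense_weights.gwt")
```

The `np.nonzero` call on each row is what keeps the output sparse, mirroring the `if w != 0.0` test in the loop above.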
Posted 09-27-2011 09:42 AM

POST
Hello mjcohn, You are quite right that we do NOT currently have an option for what you are looking for. We are going to be coming out with a Group Similar Features tool in the next release that will perform traditional and spatially constrained clustering based on k-means and minimum spanning tree algorithms. The output from this tool will provide groups that, like yours, would be great to use as input for spatial weights. In short, we have discussed this, but it is not in our current dev plans.

If you know Python, you could construct a text-based weights file and use it as input to the Generate Spatial Weights Matrix tool (as a DBF), or directly as a GWT-type file. The format would be:

UniqueIDFieldName
UniqueID NeighborID Weight   (in your case, the weight is always 1)
...

IMO, that would be your quickest solution. I'll talk with the rest of the SS crew, and perhaps we will have what you are asking for as a sample... or perhaps in a future release.
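For example, a group-membership weights file of that shape could be generated with a few lines of Python (a sketch; `groups_to_gwt` and the ID field name are hypothetical, and `feature_groups` maps a unique feature ID to a group label):

```python
from collections import defaultdict

def groups_to_gwt(feature_groups, path, id_field="MYID"):
    # Write the text weights format described above, where every pair of
    # features sharing a group label are neighbors with weight 1.
    by_group = defaultdict(list)
    for fid, grp in feature_groups.items():
        by_group[grp].append(fid)
    with open(path, "w") as fw:
        fw.write("%s\n" % id_field)
        for members in by_group.values():
            for i in members:
                for j in members:
                    if i != j:
                        fw.write("%s %s 1\n" % (i, j))

groups_to_gwt({1: "A", 2: "A", 3: "B"}, "group_weights.gwt")
```

Note that a singleton group contributes no lines at all; such features end up with no neighbors, which some tools handle and others do not.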
Posted 03-21-2011 03:18 PM

POST
Hi Saul, Unfortunately, ArcGIS does not accept GAL files. If you are going to use text-based weights, the format is:

UniqueIDFieldName      <- the first line gives the name of the unique ID field
1 5 0.20000            <- ID NeighborID Weight
1 4 0.20000
...

I think GeoDa has a way to store the weights in GWT format, which is very close to what you need for our software; you may need to adjust the header. You can also create spatial weights using the Generate Spatial Weights Matrix tool. You will get distance-based, k-nearest neighbors, and Delaunay triangulation, as well as polygon contiguity. The resulting *.swm file can be used in our tools and is quite fast to read and write, as they are binary files. Hope this helps! Take Care, MJ
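If GeoDa only gives you a GAL file, a small script can rewrite it into the text layout above (a sketch that assumes the simple one-line GAL header; real GeoDa headers may carry extra fields, and the helper name is mine):

```python
def gal_to_gwt(gal_path, gwt_path, id_field="MYID"):
    # GAL lists contiguity as, per feature, a line "id neighbor_count"
    # followed by a line of neighbor IDs. Rewrite each neighbor as an
    # "ID NeighborID Weight" line with weight 1.
    with open(gal_path) as fo, open(gwt_path, "w") as fw:
        fo.readline()                        # skip the GAL header line
        fw.write("%s\n" % id_field)
        while True:
            head = fo.readline().split()
            if not head:
                break
            fid, count = head[0], int(head[1])
            for nid in fo.readline().split()[:count]:
                fw.write("%s %s 1\n" % (fid, nid))

# A tiny example GAL file: 3 features, feature 2 touches 1 and 3.
with open("example.gal", "w") as f:
    f.write("3\n1 1\n2\n2 2\n1 3\n3 1\n2\n")

gal_to_gwt("example.gal", "example.gwt")
```

Contiguity weights are symmetric, so every linkage should appear twice in the output (once per direction), as in the example.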
Posted 03-18-2011 08:53 AM

POST
Hi kaerben, In order for me to understand why your process is stalling at 0%, I would need to know what you have defined as your neighborhood. Can you respond with the "Conceptualization of Spatial Relationships" you are using... and if it is distance based, what is the cutoff? Are you using polygon contiguity? Have you attempted to create a Spatial Weights Matrix ahead of time? That tool prints out some information on the sparseness of your weights matrix, which would illuminate the problem significantly. Also, what version of the software are you using?

As far as limitations are concerned, I believe that the upper limit on most CPUs for Spatial Autocorrelation and General G is roughly 30 million non-zero linkages in a Spatial Weights Matrix. So, if you construct a Spatial Weights Matrix using k-nearest neighbors with k = 60, you would be right at that threshold (60 * 500,000 = 30 million). Most applications require far fewer than 60 neighbors, so you may have some wiggle room. The local stats (Hot Spot and Local Moran) are far more robust and should solve with more non-zero linkages, but they take a bit longer to calculate because they need to write features using an UpdateCursor (which is slow, but being sped up greatly in 10.1).

Right off the bat, I would suggest running the tools with a Spatial Weights Matrix using 8 nearest neighbors, just to get an idea of the time required, even if that is not the conceptualization you are interested in. Just to give a (very) loose comparison: for a random set of 500,000 point features it took me:

- 1 minute 59 seconds to construct a Spatial Weights Matrix (*.swm file) with 8 nearest neighbors
- 4 minutes 20 seconds to run the Spatial Autocorrelation tool with the given *.swm file
- 16 minutes 6 seconds to run Hot Spot Analysis (most of the extra time due to the previously mentioned UpdateCursor)

All of our algorithms for the global and local stats in 10.0 are based on sparse math (you also get this math using a Spatial Weights Matrix in 9.3)... and if you request a solution where the spatial weights are dense (i.e. all features have lots of neighbors), the tool can break down when the dataset is large. I am thinking this is what is happening with your data. Perhaps you took the defaults, and the distance cutoff is giving you features with thousands of neighbors. In 9.3 and beyond, the default cutoff is the distance that assures every feature has at least one neighbor. When the polygons are irregular in size, you often get a vastly uneven number of neighbors.

Moving along... if you are using the Generate Spatial Weights Matrix tool, all of the algorithms for finding neighbors EXCEPT polygon contiguity are extremely fast, quite competitive with any other package. This speed has been added to the tools directly in 10.0, so if you are using 10.0 you could possibly skip the Spatial Weights construction... but of course you won't get k-nearest neighbors or Delaunay. We are currently working on a much faster implementation of polygon contiguity (scheduled for 10.1), which is definitely a needed improvement.

Lastly, I am quite willing to help you out directly if you want to send me your data. If any of the attributes are sensitive, feel free to delete those fields before forwarding. Due to the size of your dataset, you would need to use FTP or something like Dropbox to get it to me; the latter has a 2GB free limit. I have not used it yet, but we could try it out together if you are so inclined. As the programmer, I am at your service: mjanikas@esri.com Best wishes, MJ
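The linkage arithmetic above is worth keeping handy when sizing a k-nearest-neighbor weights matrix (a trivial sketch; the function name is mine):

```python
def knn_linkages(n_features, k):
    # Each feature contributes exactly k non-zero linkages in a
    # k-nearest-neighbor spatial weights matrix.
    return n_features * k

# The ~30 million linkage ceiling mentioned above:
at_limit = knn_linkages(500_000, 60)
# The suggested 8 neighbors leaves plenty of headroom:
suggested = knn_linkages(500_000, 8)
```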
Posted 09-30-2010 03:15 PM

POST
Hello wisfool, I am the programmer for the tool and I would be glad to help you out. Would it be possible to send me your data? We have quite a few harness tests for "Ripley" and "Reduce" and we are not getting the error. If you cannot send me the data, then perhaps I could get in touch with you remotely to try and debug. You can reach me at my coords: mjanikas@esri.com Thanks so much, MJ
Posted 09-21-2010 11:21 AM