This content has been marked as final. Show 11 replies
Thank you very much for your reply. I will try what you suggested and I should be coming back here with either some results for discussions (or more questions!) if that's okay.
Just a couple of things regarding the interpolation. My area of study is data scarce and I didn't have a choice. I would have only three points of NDVI where the data was actually collected. I think I will take the risk and still use the interpolation/NDVI analysis - or do you think the results would be worthless?
Hi again Bill, I hope you'll see this.
I'm trying to get my head around some of your instructions, but with no success. Now I have the indicator grid for the overlapping datasets, but I'm very confused with how to use it in the calculations. I don't think I quite understood the following:
�?�Wherever you see an ordinary unary or binary arithmetic operation in the formula (including, but not limited to, negation, addition, subtraction, multiplication, division, and powers), use the same one in Map Algebra.
�?�Wherever you see a sum over cases compute a zonal sum. If the sum is divided by N it's really an average so just use a zonal average for the whole thing
That went straight over my head - really sorry.
If you have the chance to clarify that - it would be so much appreciated. I will keep on trying in the meantime.
Thanks a lot for that, Bill. You've been a great help. Very much appreciated.
Following the instructions you gave me above, I still have some doubts if what I'm doing is correct.
I followed the steps, but some there's some confusion:
In the formula xi_i = (x_i - m)/s, for instance, the zone grid is x_i, correct? I'm not sure how to obtain that zone grid (and I'm sure this is really simple!). Is this anything to do with the indicator grid I created initially?
Another question is regarded the zonal average of [xi]*[eta] that you mentioned. Do I obtain that by simply multiplying those two grids and dividing it by N?
I think I've figured it out now. And yes, it was very simple. But feel free to make any comments. I'm new to all this, so any reassurance is more than welcome.
I'm getting very strange results for the correlations, so I'm assuming there's something wrong with the way I'm creating the zone grid, unfortunately. I don't seem to get this one constant value at every cell that you mentioned above.
Is the formula for the standardized values xi_i = (x_i - m)/s the one that should be used to create this grid?
I've used the indicator to produce grids of the means ([mX] and [mY], and standard deviations ([sX] and [sY]). But in the formula above, what will be the x_i? I know this should be a very straight forward operation, but I'm sure what I'm doing is wrong.
My question remains:
if I�??m computing Z as ([x] - [m])/[s] for grids [x] (the original data), [m] (a constant grid equal to the mean of [x] throughout the zone), and [s] (a constant grid equal to the sd of [x] within the zone),
by [x] (original data) do you mean:
1) �?? the sum of [x] using the INDICATOR grid (created earlier) in the �??feature zone�?� field on SA Zonal Statistics, VALUE in the SA �??zone field�?�, and the ORIGINAL DATA [X] as the �??input raster�?� field?
If that�??s the case, my result will be a range of values (high to low), which will give me 2 values for a correlation in the end. Note that I will be doing the same thing for the [Y] original value, which will also give me �??high to low�?� values.
2) �?? if by [x] (original data) you mean the raw raster data (say, my NDVI data), I will also have �??high and low�?� values.
Basically, my only question is: what exactly is [x] in the formula ([x] - [m])/[s]?
Sorry for being such a hard work.
This is a great example of how incorporating R Statistical Software into ArcGIS would be a huge benefit. R has no problem calculating correlations on data that are provided as an "array", i.e. rows and columns. There is no need to reinvent the wheel, although it does make for an interesting excercise.
My advice would be to abandon ArcGIS for the heavy lifting and just download R. It's notoriously ugly to program, but with a few Google searches I bet you could write correlation scripts in a couple of hours. Using Python, R, and rpy2 I've written script tools to automate this procedure. But....
One issue I haven't been able to resolve is the reliability of correlation coefficients when the data are spatially autocorrelated, which may be a problem.
I'll get off my soapbox now, but this is an area in which I would love to see more improvement.
In ArcInfo workstation this was as easy as typing the following at the grid prompt.
correlation grid1 grid2
ESRI - why have we lost functions with ArcGIS?
What is the difference between using the above procedure and using band statistics to calculate the correlation between rasters? I am hoping to find the correlations between my environmental variables prior to further analysis in maxent. Any thoughts would be greatly appreciated.
Like Jessica I am also trying to find which of my environmental layers are highly correlated before using them in Maxent. The correlation matrix that it produces has many values outside of the -1 to 1 range, can anyone tell me what the problem is there? (I'm using the worldclim 1-19 layers, 30arc seconds, clipped to Europe).