Moran's I explanation

12-29-2010 07:15 AM
ToddMcDonnell
New Contributor II
I am using the Moran's I tool to check for spatial autocorrelation in the residuals generated from the OLS model to determine if geographically weighted regression might be appropriate for the data.

I am wondering what the Moran's I tool uses to calculate the z-score? I have read the documentation and it seems that it determines the difference between each residual value and the mean of all the residuals, and then checks to see if these magnitudes are clustered or dispersed (or random).

Can someone verify my interpretation and/or provide additional explanation?
9 Replies
JeffreyEvans
Occasional Contributor III
A quick Google on Moran's-I provides considerable material on your question. ESRI does nothing special in calculating z(i).

Just a few.

http://cran.r-project.org/web/packages/ape/vignettes/MoranI.pdf

http://en.wikipedia.org/wiki/Moran%27s_I

http://www.spatialanalysisonline.com/
LaurenScott
Occasional Contributor
Hi Todd,
Our mathematics for the Global Moran's I tool is given here:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/How_Spatial_Autocorrelation_Global_Mor...

This document also provides information about interpretation and FAQs.  I hope this helps 🙂
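For readers who want to see the arithmetic, here is a rough sketch of the textbook Global Moran's I and its z-score under the randomization assumption. It is not Esri's code; the function name `morans_i` and the dense list-of-lists weights matrix are assumptions made for illustration, and the linked help topic remains the reference for what the tool actually does. The point for Todd's question is that the statistic is built from cross-products of the deviations from the mean at neighboring locations, not from the deviations alone.

```python
import math
from typing import List, Tuple


def morans_i(values: List[float], w: List[List[float]]) -> Tuple[float, float, float]:
    """Return (observed I, expected I, z-score) for a dense weights matrix w.

    Hypothetical helper for illustration; requires more than 3 observations
    and at least one non-zero weight.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]            # deviations from the mean

    s0 = sum(sum(row) for row in w)           # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    m2 = sum(d * d for d in z)                # sum of squared deviations

    i_obs = (n / s0) * (cross / m2)           # observed Moran's I
    e_i = -1.0 / (n - 1)                      # expected I under no autocorrelation

    # Variance of I under the randomization assumption (standard formula).
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n))
    b2 = n * sum(d ** 4 for d in z) / m2 ** 2  # sample kurtosis of the values
    a = n * ((n * n - 3 * n + 3) * s1 - n * s2 + 3 * s0 ** 2)
    b = b2 * ((n * n - n) * s1 - 2 * n * s2 + 6 * s0 ** 2)
    c = (n - 1) * (n - 2) * (n - 3) * s0 ** 2
    var_i = (a - b) / c - e_i ** 2

    z_score = (i_obs - e_i) / math.sqrt(var_i)
    return i_obs, e_i, z_score


# Four residuals along a line; each location neighbors the adjacent ones.
residuals = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_obs, e_i, z_score = morans_i(residuals, weights)
print(i_obs, e_i, z_score)
```

A positive z-score means neighboring values are more alike than expected under spatial randomness; a negative one means they are more dissimilar.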

Your post said something about testing your OLS regression residuals in order to determine if GWR is appropriate for your data.  Please keep in mind that spatial autocorrelation in your OLS residuals almost always means you are missing a key explanatory variable from your model.  GWR is a regression method that deals with non-stationarity... it is not a fix for misspecification nor a method specifically designed to address spatially autocorrelated residuals.

In case you might be interested, we have lots of resources about the tools in the Spatial Statistics toolbox: www.bit.ly/spatialstats.  We have a sample script, for example, called Exploratory Regression.  The documentation for that tool includes strategies for finding a properly specified OLS model.


Anyway, I hope very much that this information is helpful to you.
Best wishes,
Lauren M Scott, PhD
Esri
Geoprocessing, Spatial Statistics
toddsams
New Contributor III
Thanks Lauren,

Non-stationarity can result in spatial autocorrelation, correct? 

So, if you use GWR to accommodate the non-stationarity, it should help reduce spatial autocorrelation, right?

Todd
JeffreyEvans
Occasional Contributor III
Todd,
Nonstationarity is a second-order spatial effect and is commonly referred to as "pseudo-autocorrelation". Often nonstationarity in ecological models is caused by a spatial dependency on an auxiliary variable. Moran's-I is a statistic used to identify first-order spatial effects. It is important to understand these distinctions before interpreting the results of an exploratory spatial data analysis. You can think of first-order effects as global and second-order as local. In traditional geostatistics there are a few models of nonstationarity. The most conservative is stationary mean and variance and the most relaxed is a stationary mean. You can explore nonstationarity using the LISA (local Moran's I) or Getis-Ord (Gi*) statistics.

I do disagree with Lauren in that if your residuals appear heteroscedastic this is often indicative of autocorrelation or non-independence in your model, but not always spatial dependency. It is a fallacy that residual structure indicates model misspecification. You can also have the variable that is causing a spatial dependency present in the regression and still observe structure in your residuals, as you would expect. The mere presence of variables exhibiting first- or second-order effects does not detrend the effect in an OLS. GWR is a regression model that is appropriate when you have a second-order spatial effect and was designed specifically to account for nonstationarity. GWR does not "reduce spatial autocorrelation" but rather incorporates it. If autocorrelation is due to first-order effects then a spatial regression or spatial autoregressive model is called for (or even an OLS with a spatial lag term). Since GWR is a local regression, akin to ridge or kernel regression, it does a poor job of estimating first-order trend. However, with second-order spatial effects one would expect to have a fairly non-linear relationship, making GWR very attractive. Alternative methods would be a conditional autoregressive (CAR) model, or non-parametric (Random Forests) or semi-parametric (spline regression) models.
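To make the "spatial lag term" concrete: a spatial lag is usually taken to be the weighted average of a variable over each observation's neighbors. The sketch below assumes row-standardized weights and a dense list-of-lists matrix; the helper name `spatial_lag` is hypothetical and not part of any Esri or R API. Note that when the lagged variable is the dependent variable itself, the lag is endogenous, so dedicated spatial regression software typically estimates such models by maximum likelihood or instrumental variables rather than plain OLS.

```python
from typing import List


def spatial_lag(values: List[float], w: List[List[float]]) -> List[float]:
    """Row-standardized spatial lag: mean of each observation's neighbors.

    Hypothetical helper for illustration; observations with no neighbors
    (zero row sum) get a lag of 0.0.
    """
    lagged = []
    for row in w:
        row_sum = sum(row)
        if row_sum == 0:
            lagged.append(0.0)
            continue
        lagged.append(sum(wij * v for wij, v in zip(row, values)) / row_sum)
    return lagged


# Four observations along a line; each neighbors the adjacent ones.
y = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(spatial_lag(y, weights))
```

The resulting column can then be added as an extra explanatory variable alongside the original predictors.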

Spatial statistics is a rather large sub-discipline of statistics and a complex one at that. While ESRI has developed some very nice tools for exploratory analysis I would highly encourage you to read some of the primary literature and consider using a statistical package (e.g., R, S+) that is better suited for this type of analysis.

Some good places to start in understanding spatial statistics are:

Fortin, M-J & M. Dale (2006) Spatial Analysis: A Guide for Ecologists. Cambridge University Press.

Haining, R. (2003) Spatial Data Analysis: Theory and Practice. Cambridge University Press.

Isaaks, E.H. & R.M. Srivastava (1989) An Introduction to Applied Geostatistics. Oxford University Press.
toddsams
New Contributor III
Thank you for the descriptive information and references Jeffrey.

In response to Lauren's comment that "GWR is not a method specifically designed to address spatially autocorrelated residuals"...

It would seem that if a process is nonstationary, this could be enough to result in spatial autocorrelation amongst the residuals in the global model. For example, if the z-score from Moran's I shows significant clustering amongst the residuals AND the Koenker OLS diagnostic is significant, this might lead one to believe that the clustering in the residuals is resulting from a nonstationary process, and can be resolved using GWR, correct?

This line of reasoning also seems to correspond with the ESRI documentation:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/000851_Use_the_Spatial_Autocorrelation...
toddsams
New Contributor III
Thanks Bill. This information is helpful.
PeterMarkus1
New Contributor II
The Moran's I tool reports a z-score. Where can I find a more detailed explanation of how the z-score is defined and how it works?

I am writing a master's thesis and therefore need an explanation of the z-score in the Moran's I tool!

I would be grateful for any pointers!

Thanks!
LaurenScott
Occasional Contributor
These help topics should help:
http://resources.arcgis.com/en/help/main/10.1/#/What_is_a_z_score_What_is_a_p_value/005p000000060000...
http://resources.arcgis.com/en/help/main/10.1/#/How_Hot_Spot_Analysis_Getis_Ord_Gi_works/005p0000001...

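The Gi* result (shown in the math for the second link) is the z-score.  A z-score is expressed in standard deviations: it tells you how many standard deviations the observed value lies from the value expected under the null hypothesis.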
I hope this provides what you need.
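As a rough illustration of why the Gi* value is itself a z-score, here is a sketch of the standard Getis-Ord Gi* formula: the weighted local sum minus its expected value, divided by its standard deviation. This is not the tool's source code; the function name `getis_ord_gi_star` and the dense weights matrix are assumptions, and the weights are assumed to include each feature as its own neighbor, as Gi* requires.

```python
import math
from typing import List


def getis_ord_gi_star(values: List[float], w: List[List[float]]) -> List[float]:
    """Return the Gi* z-score for every observation.

    Hypothetical helper for illustration; w[i][i] should be non-zero so that
    each feature is included in its own neighborhood, and no neighborhood
    should cover every observation with equal weights (zero denominator).
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the values (population form).
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    scores = []
    for row in w:
        w_sum = sum(row)
        w_sq_sum = sum(wij * wij for wij in row)
        local_sum = sum(wij * v for wij, v in zip(row, values))
        # Local sum minus its expectation, scaled by its standard deviation.
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Four features along a line; each neighborhood includes the feature itself.
x = [1.0, 2.0, 3.0, 4.0]
weights = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
print(getis_ord_gi_star(x, weights))
```

Large positive scores flag clusters of high values (hot spots) and large negative scores flag clusters of low values (cold spots).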
Best wishes,
Lauren

Lauren M Scott, PhD
Esri
PeterMarkus1
New Contributor II
I have found a good explanation in: Fotheringham, A. Stewart, Martin E. Charlton, and C. Brunsdon. 2000. Quantitative Geography: Perspectives on Spatial Data Analysis. London: SAGE Publications Ltd.