Question about Emerging Hot Spot Analysis results

07-07-2022 09:00 PM
Vanilla2020
New Contributor III

Hi @LynneBuie @LaurenGriffin and folks from Esri Spatial Statistics Team, 

I am using the Emerging Hot Spot Analysis tool to analyze a time series of raster imagery data and have the following questions about the results from this tool:

1) There are several variables in the Emerging Hot Spot Analysis result (Output Features), as shown in the screenshot below. However, neither the tool documentation nor online resources explain them specifically. I am particularly confused about the variable TREND_BIN (trend bin). Is it the same as the EMERGING_{ANALYSIS_VARIABLE}_TREND_BIN variable or the {ANALYSIS_VARIABLE}_TREND_BIN variable that is added back to the Input Space Time Cube, as described on this page: https://pro.arcgis.com/en/pro-app/latest/tool-reference/space-time-pattern-mining/emerginghotspots.h...?

I don’t fully understand what the variable EMERGING_{ANALYSIS_VARIABLE}_TREND_BIN measures. The documentation describes it as “The result category used to classify each location as having a statistically significant upward or downward trend for hot/cold spot z-scores.” Are the z-scores from the Getis-Ord Gi* statistic? For statistically significant positive Gi* z-scores, the larger the z-score is, the more intense the clustering of high values (hot spot). For statistically significant negative z-scores, the smaller the z-score is, the more intense the clustering of low values (cold spot). So, what does an upward or downward trend in those z-scores imply? How does it help to understand the trends in spatial clustering in the data?

Vanilla2020_0-1657252328531.png

2) I would like to visualize the variable EMERGING_{ANALYSIS_VARIABLE}_HS_BIN, which is added back to the Input Space Time Cube. I tried Visualize Space Time Cube in 2D, but none of the Display Theme options offers this variable. I also tried using Add Multidimensional Layer to load the space-time cube and selecting this variable in the Select Variables table, as described on this page (https://pro.arcgis.com/en/pro-app/2.6/help/data/imagery/working-with-a-multidimensional-raster-layer...); however, it didn’t allow me to apply Unique Values symbology to this variable.

Does anyone have any ideas or suggestions? Thank you very much!

 

6 Replies
LaurenGriffin
Esri Contributor

Hi! Sorry for the delay in our reply. Both @LynneBuie and I were swamped with the Esri User Conference this week. But I'm so glad you brought this to our attention. As you point out, the documentation is not clear at all. The table, here, does correctly document the fields that are added to the Input Space Time Cube (STC) after you run Emerging Hot Spot Analysis (EHSA), but it doesn't document how to interpret the fields that appear in the EHSA output layer (as you found). Hopefully, this will help:

The EHSA tool runs Gi*, comparing each bin and its space-time neighbors to the Global Window (Entire Cube, Neighborhood Time Step, or Individual Time Step) to determine if the local values are significantly larger (hot spot) or significantly lower (cold spot).

LaurenGriffin_0-1657926733962.png
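For readers who want to see the statistic itself, here is a minimal sketch of the standard Gi* z-score formula in Python with NumPy. It is not the EHSA tool's code: it assumes a purely spatial toy neighborhood (each bin plus its adjacent bins on a line) and uses all bins as the comparison window, whereas the tool uses space-time neighbors and the chosen Global Window. The function name `getis_ord_gi_star` and the data are made up for illustration.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every bin.

    values  : 1-D array of bin values (length n)
    weights : n x n weights matrix; the diagonal is nonzero because
              Gi* includes the focal bin in its own neighborhood.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)   # global standard deviation
    w_sum = w.sum(axis=1)                      # sum of weights per bin
    w_sq_sum = (w ** 2).sum(axis=1)            # sum of squared weights per bin
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical data: 11 bins on a line with a cluster of high values in the middle.
values = np.array([1, 1, 1, 1, 9, 10, 9, 1, 1, 1, 1])
idx = np.arange(values.size)
# Binary weights: a bin's neighborhood is itself plus the bins directly next to it.
weights = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)

gi = getis_ord_gi_star(values, weights)
for i, z in enumerate(gi):
    print(f"bin {i}: Gi* z-score = {z:+.2f}")
```

Bins in the middle of the high-value cluster get positive z-scores (candidate hot spots) and bins surrounded by low values get negative z-scores (candidate cold spots).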

 

Once all bins in the cube have been assessed (for the categories shown in the graphic above), the tool runs the Mann-Kendall test on the Gi* z-score values to get the trend for each location (column).

LaurenGriffin_1-1657926799414.png

The PATTERN field in the EHSA output layer is one of potentially 17 different categories documented here. If you read the description for each category, you'll notice that to classify each location, the tool needs to know the count and percentage of significant bins and if they are hot or cold... and if they are at the top of the cube (most recent time steps) or not... And for most of the patterns this is enough, but to determine if the hot bins are intensifying or the cold bins are diminishing it computes Mann-Kendall on the Gi* z-score values. The TREND_Z, TREND_P, and TREND_BIN values in the EHSA output layer are those Mann-Kendall results.
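To make the Mann-Kendall step concrete, here is a minimal sketch of the test applied to one location's column of Gi* z-scores. It is an illustration, not the tool's implementation: it omits the tie correction, and the -3 to 3 coding returned by the hypothetical `trend_bin` helper is an assumption that mirrors the hot spot bin coding (sign for direction, magnitude for the 90%, 95%, or 99% confidence level).

```python
import math

def mann_kendall(series):
    """Return (S, z, p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    s = 0
    # Compare every value with every later value: +1 if it rose, -1 if it fell.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p

def trend_bin(z, p):
    """Assumed coding: +/-3 (p < 0.01), +/-2 (p < 0.05), +/-1 (p < 0.10), else 0."""
    if p < 0.01:
        level = 3
    elif p < 0.05:
        level = 2
    elif p < 0.10:
        level = 1
    else:
        return 0
    return level if z > 0 else -level

# Hypothetical Gi* z-scores for one location over 10 time steps.
rising = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3, 2.9, 3.4, 3.8, 4.1]
for name, column in [("rising", rising), ("falling", rising[::-1]), ("flat", [2.0] * 10)]:
    s, z, p = mann_kendall(column)
    print(f"{name}: S={s}, z={z:.2f}, p={p:.4f}, bin={trend_bin(z, p)}")
```

The sign of the resulting z-score only says whether the Gi* z-scores are rising or falling over time; whether that means "intensifying" or "diminishing" depends on whether the location is a hot spot or a cold spot.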

If that doesn't answer your first question, please ask again and I'll do my best to clarify.

For your second question, you want to visualize the EMERGING_{ANALYSIS_VARIABLE}_HS_BIN. The 3rd column of the variable table indicates that particular variable is available in 3D. To visualize it, there are two steps:

1) Insert a NEW SCENE (select New Global Scene if it makes sense to show the curvature of the earth, otherwise select New Local Scene).

LaurenGriffin_2-1657928287414.png

2) Run the Visualize Space Time Cube in 3D tool.

LaurenGriffin_3-1657928324659.png

Then you can navigate around the 3D scene with your mouse.

LaurenGriffin_4-1657928426343.png

I'm not sure if this will be helpful to you or not (it's NOT based on raster input), but @KevinButler-Analysis and I created a Learning Path for a workshop we did for UCGIS. 

I hope this answers your questions. If not, please let me know and I'm happy to try again.

Best wishes,

Lauren Griffin, Esri

 

Vanilla2020
New Contributor III

Hi Lauren, 

Thank you very much for taking the time to reply to my question. With your help, I was able to visualize the EMERGING_{ANALYSIS_VARIABLE}_HS_BIN variable (my second question). Thank you!

For the first question, I would still like to clarify one point. You said that Mann-Kendall is applied to the Gi* z-score values to see if the hot bins are intensifying or the cold bins are diminishing.

For statistically significant positive Gi* z-scores, the larger the z-score is, the more intense the clustering of high values (hot spot). For statistically significant negative z-scores, the smaller the z-score is, the more intense the clustering of low values (cold spot). So if a bin is a hot spot, a statistically significant increase in the Gi* z-scores over time means the hot spot is intensifying. However, if a bin is a cold spot, a statistically significant increase in the Gi* z-scores over time means the cold spot is diminishing.

If I only look at the classification of the variable EMERGING_{ANALYSIS_VARIABLE}_HS_BIN, the areas classified as Up Trend (99% Confidence) can be either intensifying hot spots (increasing Gi* z-scores) or diminishing cold spots (increasing Gi* z-scores). So my point is that this variable by itself cannot distinguish where a hot spot is intensifying from where a cold spot is diminishing. Could you explain more about this? Thank you very much!

Vanilla2020_0-1658164165365.png

 

LaurenGriffin
Esri Contributor

Your interpretation of the Gi* z-scores is exactly right! Technically, though, you could have a z-score of 5.0 for time period N-2, 4.5 for time period N-1, and 4.0 for time period N (where N is the total number of time steps in the cube). All of those z-scores would be statistically significant at the 0.01 level (ignoring the FDR correction for simplicity), but the pattern would not be an intensifying hot spot because the z-scores are getting smaller. Even if the z-scores are increasing, Mann-Kendall would sort out whether the increases are significant: tiny increases probably wouldn't be considered "intensifying" but rather "persistent", and Mann-Kendall considers all the values in the column, not just the last couple.

If you look at the tool in the Geoprocessing pane toolbox, you can right-click it and see the source code. I can try to find the section of the code that provides the exact descriptions for each of the patterns, or you can see if you can find it. Let me know if you want additional information.

Again, thank you so much for pointing out the need to improve our documentation on the Emerging Hot Spot Analysis tool outputs! An issue has been created, so better documentation will be available in a future software release.
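The interpretation confirmed at the start of this reply (a rising Gi* z-score means "intensifying" for a hot spot but "diminishing" for a cold spot) can be sketched as a small lookup. This is only the direction logic, not the tool's PATTERN classification, which also depends on the count and percentage of significant bins and where they sit in the cube. The function name `read_trend` and the -3 to 3 coding of both inputs are assumptions for illustration.

```python
def read_trend(latest_hs_bin: int, trend_bin: int) -> str:
    """Describe a location from its most recent Gi* bin and its Mann-Kendall trend bin.

    Both arguments are assumed to use a -3..3 coding: the sign gives the
    direction (hot/cold, up/down) and 0 means not statistically significant.
    """
    if trend_bin == 0:
        return "no significant trend in Gi* z-scores"
    direction = "up" if trend_bin > 0 else "down"
    if latest_hs_bin > 0:
        kind = "hot spot"
        # Rising z-scores strengthen a hot spot; falling z-scores weaken it.
        change = "intensifying" if trend_bin > 0 else "diminishing"
    elif latest_hs_bin < 0:
        kind = "cold spot"
        # Rising z-scores weaken a cold spot; falling z-scores strengthen it.
        change = "diminishing" if trend_bin > 0 else "intensifying"
    else:
        return (f"{direction} trend in Gi* z-scores, but the latest bin is "
                "not a significant hot or cold spot")
    return f"{kind}, {change} ({direction} trend in Gi* z-scores)"

# The same "up" trend reads differently for a hot spot and a cold spot.
print(read_trend(latest_hs_bin=3, trend_bin=3))
print(read_trend(latest_hs_bin=-3, trend_bin=3))
print(read_trend(latest_hs_bin=3, trend_bin=-2))
```

This is why the trend bin on its own is ambiguous: it has to be read together with the hot or cold spot status of the location.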

Best wishes,

Lauren

Vanilla2020
New Contributor III

Thanks Lauren, I think I now understand how Gi* and Mann-Kendall work together to classify each bin as one of the 17 categories. If I want to create maps based on the Emerging Hot Spot Analysis results, which variables would you recommend using? I feel that the most important one is the variable PATTERN, which indicates the 17 categories. For the variable EMERGING_{ANALYSIS_VARIABLE}_HS_BIN, what kind of inference can I make from it (since it just tells the up/down trend for the Gi* z-scores, but nothing directly about any intensifying or diminishing of hot/cold spots)?

Thanks very much again! 

LaurenGriffin
Esri Contributor

Hi again,

The EMERGING_{ANALYSIS_VARIABLE}_HS_BIN variable is one of 7 values for each bin in the Space Time Cube and can be visualized in 3D:

3: Statistically significant clustering of high/large values at the 0.01 level

2: Statistically significant clustering of high/large values at the 0.05 level

1: Statistically significant clustering of high/large values at the 0.10 level

0: No statistically significant clustering

-1: Statistically significant clustering of low/small values at the 0.10 level

-2: Statistically significant clustering of low/small values at the 0.05 level

-3: Statistically significant clustering of low/small values at the 0.01 level

That variable does not include the Mann-Kendall trend information. The 2D output layer from Emerging Hot Spot Analysis (EHSA) shows the patterns and encapsulates both the hot spot and the trend information.

To me, it is difficult to draw conclusions from 3D maps (unless your data set is very small). The 2D maps (like the default output from the EHSA tool) seem more useful, but it really depends on the questions you're hoping to answer with your analysis.

If you haven't seen it already, consider watching this video (at about 5:41, I give some examples of interpreting the EHSA results). The entire learning path might interest you as well; my colleague Kevin provides some great examples of Time Series Clustering and Forecasting.
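If you export the bin values and want readable labels, the seven codes listed above can be kept in a simple lookup. This is a minimal sketch: the label wording is my own shorthand for the descriptions in the list, and the `summarize_column` helper and the sample column are hypothetical.

```python
from collections import Counter

# Code -> label, following the seven values listed above.
HS_BIN_LABELS = {
    3: "hot spot (0.01 level)",
    2: "hot spot (0.05 level)",
    1: "hot spot (0.10 level)",
    0: "not significant",
    -1: "cold spot (0.10 level)",
    -2: "cold spot (0.05 level)",
    -3: "cold spot (0.01 level)",
}

def summarize_column(hs_bins):
    """Count how many time steps at one location fall into each category."""
    return Counter(HS_BIN_LABELS[b] for b in hs_bins)

# Hypothetical HS_BIN values for one location, oldest time step first.
column = [0, 0, 1, 2, 3, 3, 3]
for label, count in summarize_column(column).items():
    print(f"{label}: {count} time step(s)")
```

A per-location tally like this shows how many bins were significant, but it says nothing about the trend; that part comes from the Mann-Kendall results.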

Best wishes!

Lauren Griffin, Esri

This presentation was originally given at a workshop in partnership with University Consortium for Geographic Information Science (ucgis.org). For the first time in history, we are experiencing a global pandemic and analyzing it as it happens. Because the spread of any infectious disease occurs ...
Vanilla2020
New Contributor III

That is helpful. Thank you very much for your time and help! 
