Blog discussion - GeoAI toolbox (AutoML and Text Analysis tools in ArcGIS Pro 3.0)

NicholasGiner1 · ‎08-31-2022

We recently released two blogs discussing the new GeoAI toolbox in ArcGIS Pro 3.0.

This discussion is for questions related to the blogs or the tools themselves, or to share how you are using these tools or any feedback you have on them.

PriyankaTuteja · ‎11-16-2022

@jiewilliam Wow!! That's a big number!!

I think 31M records is too many to process on a 16 GB machine; I do not think the RAM is sufficient in that case. I would suggest getting the prediction results in smaller chunks.
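
One way to act on the chunking suggestion is to split the records into ID ranges and run the prediction on one range at a time. The sketch below is plain Python and only computes the ranges and matching where clauses. The chunk size, the `OBJECTID` field name, and the assumption that IDs run from 1 to the record count without gaps are all illustrative, not something the tool requires.

```python
def chunk_ranges(total_records, chunk_size):
    """Yield inclusive (start, end) ID ranges, assuming IDs run 1..total_records."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 1
    while start <= total_records:
        end = min(start + chunk_size - 1, total_records)
        yield start, end
        start = end + 1


def where_clauses(total_records, chunk_size, id_field="OBJECTID"):
    """Build one SQL where clause per chunk (id_field is an assumed field name)."""
    return [
        f"{id_field} >= {start} AND {id_field} <= {end}"
        for start, end in chunk_ranges(total_records, chunk_size)
    ]


# Example: 31 million records split into chunks of 1 million records each.
clauses = where_clauses(31_000_000, 1_000_000)
print(len(clauses))
print(clauses[0])
```

Each clause could then drive a selection or export step, with the prediction run once per subset and the outputs merged afterwards.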

ChristopherBerry1 · ‎01-31-2023

Hi Nicholas,

Thanks for the Blog. I am reasonably new to ML and have previously had some limited experience with RF. I have begun trialling the AutoML tool in Arc and have a number of questions I was hoping I might be able to get you to help me with.

When running the application with the 'Algorithms' section left blank, the only algorithm being used initially was a decision tree. When I tried manually selecting other models, it would not process and gave me an error (as seen below in Grab 1). I think I have resolved this by increasing the time limit from the default 60 to 240; would this have been the issue? Now when I select all the algorithms, or leave the mode field blank, a Decision Tree and a LightGBM algorithm are used (as seen below in Grab 2). Is this still ignoring the other model types? I also note that in my example multiple decision trees are being used (see Grab 2). Are these just Decision Trees with different hyperparameters?
In Grab 2, and in your example, there is an ensemble model. Is this model a combination of all the models or a selection? Is there any way to investigate the statistics of this model with regard to the other models it has used, the weighting, etc.?
Following on from model training, is it the 'best' model from the ML Leaderboard that is used in the *.dlpk file? Is there a way to select the algorithm you want to use, for example if I wanted to use the ensemble as opposed to the Decision Tree?

In Predict using AutoML the fields are somewhat different to the Train tool. For the 'Input Prediction Features' section, if I choose my points shapefile which has been used in training, how does it know which field to predict? Is this somehow defined in the *.dlpk file?

Thanks again for your blog and any help you may be able to offer in helping me grasp the fundamentals of the tool.

GRAB 1

GRAB 2

SurajBaloni · ‎02-02-2023

Hi Christopher,

I will try to answer some of your questions:

1. If you leave the algorithm field blank, then the tool will try to fit all available models on your data. However, if the training of a model fails during the process, that model is ignored and the tool proceeds to fit the next model on your data. In your case it looks like a decision tree and a LightGBM model were trained and the other models couldn't be fitted.

Second part of the first question - Yes. These are different variations of decision tree with different hyperparameters and different variable combinations.

2. Yes. The ensemble is a combination of multiple models. The composition, performance and other details of the ensemble model can be found in the report that the tool generates. You can also use this report to compare other aspects of the evaluated models. Please note that the report is an optional output parameter which you can choose to generate (please look out for the report parameter under the additional outputs section of the training tool).

3. Yes. The best performing model is what is saved in the *.dlpk. At the moment, there is no option to choose any other model apart from the best model for inference.

4. The variables that were used during the training are saved in the dlpk, and when you provide a feature class or shapefile to the inference tool, the tool tries to find variables with the same names in the provided shapefile. If the names of the fields in your inference shapefile are different from what you used during training, then you will need to provide a mapping of the fields in the Map Explanatory variables section of the Inference tool.
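
The name matching described in point 4 can be pictured with a small, standalone Python sketch. It is not the tool's code: `build_field_mapping`, the `renames` dictionary, and the field names are all hypothetical, and the sketch only shows how a rename mapping fills in for fields whose names differ between training and inference data.

```python
def build_field_mapping(training_fields, inference_fields, renames=None):
    """Match each training variable to a field in the inference data.

    renames is a hypothetical dict {training_name: inference_name} for
    fields whose names differ. Returns (mapping, missing).
    """
    renames = renames or {}
    available = set(inference_fields)
    mapping, missing = {}, []
    for name in training_fields:
        # Use the explicit rename if one was given, else expect the same name.
        candidate = renames.get(name, name)
        if candidate in available:
            mapping[name] = candidate
        else:
            missing.append(name)
    return mapping, missing


# Illustrative field names only.
mapping, missing = build_field_mapping(
    training_fields=["elevation", "slope", "rainfall"],
    inference_fields=["elev_m", "slope", "landuse"],
    renames={"elevation": "elev_m"},
)
print(mapping)
print(missing)
```

Any name left in `missing` is a variable that would need either a mapping entry or a matching field in the inference data.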

Hope this helps.

~ Karthik

Edit - You can also try to run the tool in Advanced mode. The tool might pick up more models when you choose this mode.

ChristopherBerry1 · ‎02-08-2023

Hi Karthik,

Thanks very much for your detailed response.

I haven't been able to find any details of the composition of the ensemble. Will this always be what is used? I would imagine an ensemble is always going to give a better result than an individual model?

SurajBaloni · ‎06-18-2023

Hi Christopher,

An ensemble will always be created, but that does not necessarily mean it will be the best model. There will be a few cases where an individual model works better than the ensemble, but in the majority of cases the ensemble will be better than the individual models.
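
To illustrate why an ensemble is not guaranteed to win, here is a minimal sketch of a weighted-average ensemble with made-up numbers. It assumes a simple weighted average, which may differ from how the AutoML tool actually composes its ensemble (the report describes the real composition); the models, predictions, and weights are purely illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def weighted_ensemble(predictions, weights):
    """Combine per-model predictions as a weighted average (weights must sum to 1)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * p for w, p in zip(weights, row)) for row in zip(*predictions)]


y_true = [10, 20, 30, 40]
model_a = [11, 19, 31, 39]  # small errors that alternate in sign
model_b = [14, 24, 34, 44]  # consistently biased by +4
model_c = [9, 21, 29, 41]   # small errors opposite to model_a

# Averaging a good model with a biased one gives a worse result than the good model alone.
ens_ab = weighted_ensemble([model_a, model_b], [0.5, 0.5])
# Averaging two models whose errors cancel gives a better result than either.
ens_ac = weighted_ensemble([model_a, model_c], [0.5, 0.5])

for name, pred in [("a", model_a), ("b", model_b), ("a+b", ens_ab), ("a+c", ens_ac)]:
    print(name, round(rmse(y_true, pred), 3))
```

In this toy case the `a+b` ensemble scores worse than model `a` alone, while the `a+c` ensemble beats both of its members, which matches the point that the ensemble usually helps but not always.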

In order to get the composition, you will need to generate the report. In the report, if you click on the ensemble model, the composition will be displayed. You can generate the report by populating the parameter Report (In additional outputs) before you run the tool. The tool output window will then show you a link to the report that was generated.

~ Karthik

TranDuy · ‎05-11-2023

Hi Nicholas,

I am running the Train using AutoML tool in ArcGIS Pro. It works on my PC (version 3.0), but on my laptop with ArcGIS Pro 3.1 it does not work, even though I used the same dataset. I have attached a screenshot of the problem here. Can you please let me know how to fix this?

AutoML trained model issue.jpg

Many thanks,

Duy

SurajBaloni · ‎06-18-2023

Hello Duy,

Would you be able to share the dataset? We will have to try to recreate this issue.

~ Karthik

xingchenc · ‎09-18-2023

Hi @NicholasGiner1

I am very interested in this rideshare demand prediction demo (EsriDevEvents/ds-usa-plenary-automl-2023: This demo highlights new AutoML capabilities within the Ge...), but I am unable to download the ArcGIS Pro project package file listed on the webpage because it requires logging in with an Esri credential. Can you please make it public?

NicholasGiner1 · ‎01-16-2024

Hi @xingchenc - thanks for reaching out about this. I've reshared the link publicly, please let me know if you have any issues accessing it.

DevSummit2023_AutoML_PlenaryDemo_April2023.ppkx

thanks,

Nick Giner

xingchenc · ‎01-09-2024

@NicholasGiner1

Hi Nicholas,

I would like to report a possible bug: when I use the AutoML tool in a disconnected environment, or when the current portal is set to a portal with a self-signed certificate, the tool returns error messages.

Upon checking the source code, I noticed it tries to connect to the portal. Can you explain why this is necessary? I didn't see any parameters in the tool that relate to the portal.