development tips for AGOL notebooks?

davedoesgis · ‎04-28-2022

I'm trying to create a couple of live data feeds with Python. I would normally build them with arcpy and Windows scheduled tasks, but IT permissions issues are putting up roadblocks. It looks like AGOL Notebooks might suit my needs, and I will probably run an ArcPy kernel (we have credits). I have some specific questions, but any tips or tricks moving to the AGOL Notebook environment are appreciated.

If you're using arcpy to do processing, how do you publish to AGOL? In Pro, I would have a local feature class in an APRX map layer, which I use to create a service definition draft and service definition. My symbology and metadata all tag along for the ride. In AGOL notebooks, I'm envisioning my feature classes and rasters will be variables, such as the output from a geoprocessing task, but then what's the preferred way to publish that to AGOL?

Where do you store temp data? I uploaded a mostly empty zipped FGDB to my notebook's files and unzipped it. I can read and write there, so I guess that works, but I'm not sure this is the best way to use this tool. No specific question here, just wondering about general strategies for temp data storage. Do you tend to use more in-memory feature classes and rasters in this environment? Or do you publish temp data to AGOL and then consume it?

How about development workflow? Do you develop in a local IDE (e.g.: Visual Studio Code, Spyder, PyCharm) or Pro Notebook and push that out to your AGOL Notebook? Or do you just develop in AGOL? I think I would go crazy without a debugger...

How about source code management in Git? I see AGOL Notebooks have snapshots, but wondering if anyone has tips for checking code into Git.

I could keep going, but any strategies, things you struggled with, or clever hacks are appreciated. My brain is on overload thinking about moving from IDE-based arcpy code to the AGOL Notebook environment.

thanks!

MobiusSnake · ‎04-29-2022

A few tips:

Re: an IDE, absolutely go that way. Developing in an IDE is way easier than a notebook, get it working there first with the debugger and then push it to the notebook. Use source control from your local copy.
In-memory feature classes are definitely the way to go provided you don't need to rely on domains/etc. For files, you can use the notebook environment's /tmp directory.
Use the ArcGIS API for Python wherever possible over ArcPy equivalents. This might not be relevant for your specific task since you're publishing, but I use scheduled notebooks for analysis a lot, and reading/writing to services is better with the new API. (An exception would be for AGOL analysis tools with ArcPy equivalents, e.g. if you need a spatial join, use in-memory feature classes and ArcPy's spatial join over the AGOL version, for credit reasons.)
If you need to store credentials, keep them in a file in the notebook's /arcgis/home directory and not the notebook.
For configuration and logging, create a hosted feature service with two tables, one with config/key values and the other for writing log messages to. You can then build a dashboard that reads from your log table.
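As a rough illustration of the log-table idea above, here is a minimal sketch. The field names (`log_time`, `level`, `source`, `message`) and the 255-character limit are assumptions, not anything from the thread. `write_log` only relies on the table object having an `edit_features(adds=...)` method, which the ArcGIS API for Python provides on hosted tables and feature layers; a stand-in class is used so the sketch runs without a portal connection.

```python
import time


def build_log_row(level, message, source="my_notebook"):
    """Return one row in the {"attributes": {...}} shape used for adds.

    Field names are hypothetical; match them to your own log table.
    """
    return {
        "attributes": {
            "log_time": int(time.time() * 1000),  # epoch milliseconds
            "level": level.upper(),
            "source": source,
            "message": message[:255],  # assumed text field length
        }
    }


def write_log(log_table, level, message, source="my_notebook"):
    """Append one log row to anything exposing edit_features(adds=...)."""
    row = build_log_row(level, message, source)
    return log_table.edit_features(adds=[row])


class FakeTable:
    """Stand-in for a hosted table so the sketch runs offline."""

    def __init__(self):
        self.rows = []

    def edit_features(self, adds=None):
        adds = adds or []
        self.rows.extend(adds)
        return {"addResults": [{"success": True} for _ in adds]}


table = FakeTable()
result = write_log(table, "info", "Feed refreshed")
print(table.rows[0]["attributes"]["level"], result["addResults"][0]["success"])
```

In a real notebook the `FakeTable` would be replaced by the table object from your hosted feature service, and a dashboard could then read the same table.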

davedoesgis · ‎04-29-2022

@MobiusSnake - Thanks, this is encouraging. I love the idea of logging to a table and building a dashboard. Checking the Notebook task output is super clunky.

How do you publish from ArcPy to AGOL? I've done it via creating .sddraft and .sd files, but it's super clunky. I was hoping there's a more integrated way if I'm running my code on AGOL. Can I at least use in-memory feature classes and rasters, or do I need to persist them?

About that /tmp directory, that's a nice easter egg. It doesn't show up in the file listing. Does AGOL clean that up for us, or do we still have to manage it?

MobiusSnake · ‎04-29-2022

I haven't had to publish services within AGOL so I can't really say much about that specific workflow.

I clean up the /tmp folder myself but not sure if it's done automatically at some interval. I know if you write a file to it then close and re-open the notebook within a short period of time your file will still be there though, so it's not immediate clean-up.
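Since automatic clean-up is not confirmed, one way to avoid relying on it is to let Python remove scratch files itself. The sketch below uses only the standard library; `run_with_scratch` and `demo` are hypothetical helper names. With `parent=None`, `tempfile` picks the system temp location, which is normally `/tmp` on Linux.

```python
import os
import tempfile


def run_with_scratch(work, parent=None):
    """Call work(scratch_dir) in a temporary folder that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix="feed_", dir=parent) as scratch:
        return work(scratch)


def demo(scratch):
    """Write a small file into the scratch folder and report that it exists."""
    path = os.path.join(scratch, "download.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write("id,value\n1,42\n")
    return path, os.path.exists(path)


path, existed_during = run_with_scratch(demo)
# The file existed while the work ran and is gone once the block exits.
print(existed_during, os.path.exists(path))
```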

RaviNarayanan · ‎04-29-2022

@davedoesgis

What type of services do you publish? You can check out the publishing capabilities available in the ArcGIS API for Python for tasks such as publishing a feature layer from an FGDB, shapefile, service definition, etc.

https://developers.arcgis.com/python/guide/accessing-and-creating-content/#Publishing-an-item-as-a-w...

Would be interested to hear more on your comment about accessing task output being clunky and see how we could improve the experience here.

Anything you write under /arcgis/home will be persisted and can be accessed anytime. Other file locations are temporary and content will be lost once the container goes away.
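One possible use of the persisted folder is keeping a small state file between scheduled runs. This is only a sketch: the file name `feed_state.json` and the helper names are made up, and the fallback to the system temp folder exists just so the same code also runs on a local machine where `/arcgis/home` is absent.

```python
import json
import os
import tempfile

HOME = "/arcgis/home"  # persisted location named in the reply above


def state_path(filename, home=HOME):
    """Use the persisted home folder when it exists, else a local fallback."""
    base = home if os.path.isdir(home) else tempfile.gettempdir()
    return os.path.join(base, filename)


def save_state(state, path):
    """Write a small dict of run state as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)


def load_state(path, default=None):
    """Read the state file, or return a default when it does not exist yet."""
    if not os.path.exists(path):
        return {} if default is None else default
    with open(path, encoding="utf-8") as f:
        return json.load(f)


path = state_path("feed_state.json")  # hypothetical file name
save_state({"last_run": "2022-04-29T00:00:00Z"}, path)
print(load_state(path)["last_run"])
```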

davedoesgis · ‎04-29-2022

I guess I called two things clunky in my reply. I need a thesaurus 🙂

Regarding seeing the task output, that is referencing how many mouse clicks it takes to just see the latest output compared to something like tailing a log file on a server. You have to log in to AGOL, find and open the Notebook, click tasks, click on your task, click the last run. I guess it's not that hard, but compared to a server log file I can access via UNC path, I just open it from my Notepad++ history (if it isn't open already) and hit Ctrl-End.

Regarding publishing out of an ArcPy environment, I just find the whole .sddraft and .sd files process cumbersome. I have to run getWebLayerSharingDraft(), set a bunch of properties, run exportToSDDraft(), and then run StageService_server(). And then I don't think I even have the item ID or a reference to what was created, so more code to find it before I can further work with it. Depending on how many properties I want to tweak, I'm looking at minimum ~40 lines of code. In the Pro GUI, I right click on a map layer and choose share as web layer. There's a few properties to fill out and you're on your way. It seems like it's as simple as running a GP tool (e.g.: clip is one line of ArcPy code), so I was surprised to see how much there is to do behind the scenes when you're coding it.

Creating or overwriting a feature layer with a zipped FGDB is pretty easy in the 'arcgis' package. But you still have to make sure your FGDB only contains the FCs to upload and zip it up before you can get started. It's not too bad, but I sure wish there was a way to just publish an unzipped feature class with one command.
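The zipping step can at least be wrapped in a small helper. The sketch below uses only the standard library and assumes the geodatabase is an ordinary `.gdb` folder on disk; `zip_fgdb` is a hypothetical name. It keeps the `.gdb` folder as the top-level entry in the archive and skips `.lock` files. The demo builds a fake folder with placeholder files rather than a real geodatabase.

```python
import os
import tempfile
import zipfile


def zip_fgdb(gdb_path, zip_path):
    """Zip a file geodatabase folder so the archive contains '<name>.gdb/...'."""
    gdb_path = os.path.abspath(gdb_path)
    parent = os.path.dirname(gdb_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(gdb_path):
            for name in files:
                if name.endswith(".lock"):
                    continue  # skip lock files left by open sessions
                full = os.path.join(root, name)
                # Store paths relative to the parent so the .gdb folder is kept.
                zf.write(full, os.path.relpath(full, parent))
    return zip_path


# Demo with placeholder files standing in for a real geodatabase.
with tempfile.TemporaryDirectory() as tmp:
    gdb = os.path.join(tmp, "demo.gdb")
    os.mkdir(gdb)
    for fname in ("a00000001.gdbtable", "session.sr.lock"):
        open(os.path.join(gdb, fname), "w").close()
    archive = zip_fgdb(gdb, os.path.join(tmp, "demo.zip"))
    with zipfile.ZipFile(archive) as zf:
        print(zf.namelist())
```

The resulting zip is what you would then upload as a File Geodatabase item and publish with the `arcgis` package; that part is not shown here.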

And if you have a variable that stores an in-memory feature class (i.e.: output from a GP tool), is there some way to publish those? I suspect we're looking at more code to store them before you get to the workflows I mentioned above.

I'm looking at possibly going from an FGDB feature class to a pandas dataframe (using pd.DataFrame.spatial.from_featureclass). Once I have that, is there a pretty streamlined workflow to update a FeatureLayer with that?

RaviNarayanan · ‎05-03-2022

Hi @davedoesgis Thanks for this additional info. We will take your feedback into consideration when making future enhancements.

Tasks can also be accessed/managed via ArcGIS API for Python. Here is a sample code:

from arcgis.gis import GIS

gis = GIS("home")
# tasks are a user resource and can be accessed as below
user_tasks = gis.users.me.tasks
item = gis.content.get("mynotebooksitemId")
my_notebook_task = user_tasks.search(item, True, "ExecuteNotebook")
# get the status of a task run from its properties
my_notebook_task_property = my_notebook_task[0].runs[0].properties
# you can get the result of that run from the notebook item's resources
task_id = my_notebook_task_property['task']['id']
run_id = my_notebook_task_property['runId']
task_result = item.resources.get(task_id + '_' + run_id + '.json')

Regarding creation of feature layer from in-memory layer and pandas data frame, there are some references here:

Data IO with SeDF - Accessing Data

Data IO with SeDF - writing data

thanks,

Ravi

davedoesgis · ‎05-03-2022

Thanks, Ravi. Those links are going to fill in a lot of my missing knowledge. It looks pretty easy to go from SEDF to hosted feature layer via sdf.spatial.to_featurelayer. What's the simplest way to append an SEDF to a hosted feature layer? (Or a total overwrite works, too.)

I've seen that FeatureLayer.append() takes various input formats, but SDF is not one of them. It looks like FeatureCollection is supported, and I can export an SDF to that. To append local data to a hosted feature layer, I'm envisioning a path like: FGDB feature class >> SEDF >> Feature Collection json >> FeatureLayer.append(). Does that seem like the simplest route?

Other ideas that I like less:

This sample converts an SDF to a list of dict objects, which are appended with FeatureLayer.edit_features(adds=<list>). I understand it, but looping through all the fields seems more cumbersome.
If I upload a zipped FGDB item and create a hosted feature layer from it, overwriting the former will update the latter. This is more of an overwrite than an append, which is probably ok, but has quite a few steps to save out a new FGDB, zip it up, upload it, clean up temp data, etc.
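For the `edit_features(adds=<list>)` route, a sketch of a batching helper is below. It assumes `features` is a plain list of feature objects or dicts (for example, something like the `.features` list of the FeatureSet returned by `sdf.spatial.to_featureset()`, which would avoid looping over fields by hand) and that `layer` exposes `edit_features(adds=...)` returning a dict with an `addResults` list. The helper names, the chunk size of 500, and the stand-in `FakeLayer` are assumptions made so the sketch runs offline.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def append_features(layer, features, chunk_size=500):
    """Add features in batches and return (added, failed) counts."""
    added = failed = 0
    for batch in chunked(list(features), chunk_size):
        result = layer.edit_features(adds=batch)
        for r in result.get("addResults", []):
            if r.get("success"):
                added += 1
            else:
                failed += 1
    return added, failed


class FakeLayer:
    """Stand-in for a hosted feature layer so the sketch runs offline."""

    def __init__(self):
        self.batch_sizes = []

    def edit_features(self, adds=None):
        adds = adds or []
        self.batch_sizes.append(len(adds))
        return {"addResults": [{"success": True} for _ in adds]}


layer = FakeLayer()
rows = [{"attributes": {"id": i}} for i in range(1200)]
print(append_features(layer, rows), layer.batch_sizes)
```

Batching keeps each request small, and counting the `success` flags gives a quick check that every row actually landed.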

thanks!

xlt208 · ‎10-25-2023

Hi @davedoesgis!

In the October 2023 update of ArcGIS Online, we added a "manage tasks" experience on the ArcGIS Notebooks home page that allows you to manage all notebook tasks accessible to you. We hope this new experience can help you quickly check notebook tasks 😀.

Let us know if you have any other questions.

Thank you!

Lingtao
Product Engineer for ArcGIS Notebooks