Scheduled Notebook sometimes exceeds execution timeout - help on debugging

07-13-2023 02:30 AM
MappyIan
Occasional Contributor II

Hi there, I'm fairly new to Notebooks and need some help debugging one that fails sporadically.

I've created a Notebook (in ArcGIS Online) that has four main steps:

  1. backs up a hosted feature layer view to a temporary FGDB
  2. appends the data from the temporary FGDB to a separate hosted feature layer
  3. updates the records added to the hosted feature layer to add a date/timestamp
  4. deletes the temporary FGDB

I've scheduled this Notebook to run every 15 minutes (which is the highest frequency available).

It's been running for a month (I initially set it off on June 14th), but sometimes fails to complete successfully.  Twice in the last week it's failed to run on multiple successive runs and has been disabled automatically by ArcGIS Online.  I've added try/except statements around all bits of code that are doing something, e.g. in the function to export an item to a temporary FGDB as shown below:

def backup_item_as_fgdb(itemid):
    item = gis.content.get(itemid)
    try:
        result = item.export(item.title + "_backup_" + now_dt(), "File Geodatabase", tags="VEL occupancy backup", snippet="Backup of {} taken on {}".format(item.title, dt.datetime.now().strftime("%c")))
        print("VEL occupancy data successfully exported to FGDB.")
    except:
        print("An error occurred exporting the data to an FGDB: " + item.title)
    try:
        # move exported file to relevant folder (in my ArcGIS Online account)
        result.move(agol_backup_folder)
        print("Temporary FGDB backup file successfully moved to folder: " + agol_backup_folder)
    except:
        print("An error occurred moving the FGDB backup to the specified folder: " + agol_backup_folder)
    return result.id

But when it fails, I don't get any useful messages out of the task details view for the failed task, the errors block just says:

    "errors": [
      "",
      "[ERROR] - Terminating execute notebook job 32a37804e23d4deb834897fe28694e6a as scheduled notebook execution timeout 15 minutes exceeded."
    ],

 

Rather than printing the exceptions, do I need to do something else with them?  Do I need to return them?
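
A bare `except:` followed by `print` discards the exception type and message, so the output only says that something went wrong. Below is a minimal sketch of one alternative, assuming the aim is to record the full traceback and then let the failure propagate. `run_step` and `flaky_export` are hypothetical names used only for illustration; `flaky_export` stands in for the real `item.export(...)` call.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("backup")


def run_step(name, func, *args, **kwargs):
    """Run one step; on failure, log the full traceback and re-raise."""
    try:
        result = func(*args, **kwargs)
    except Exception:
        # log.exception records the message plus the current traceback
        log.exception("Step %r failed", name)
        raise
    log.info("Step %r succeeded", name)
    return result


def flaky_export(title):
    # Hypothetical stand-in for item.export(...); it always fails here
    raise RuntimeError("export failed for " + title)


caught = None
try:
    run_step("export to FGDB", flaky_export, "VEL occupancy")
except RuntimeError as exc:
    caught = str(exc)
```

Re-raising means a later step never runs against a result that was not created, which is what happens in the function above when the export fails and `result` is never assigned.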

I'd like to know at which of the four main steps it's failing so I can investigate further.  Is anyone able to give any advice on how I can debug this better?  Something somewhere is taking more than 15 minutes to run (which it shouldn't be) but it only happens sporadically and I really want to find out where the problem is so that I can be confident that this will run successfully all the time as it's our main backup/archive process for this project.
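
One way to see which step is slow is to log a timestamped line when each step starts and ends. The sketch below is only an illustration under stated assumptions: `timed_step`, `log_line`, `placeholder_step` and `LOG_PATH` are hypothetical names, and the log file is written to a temporary directory so the example runs anywhere. In a hosted notebook a different writable location would be needed, and whether a file written there survives a terminated run is not something this thread confirms.

```python
import os
import tempfile
import time
from contextlib import contextmanager

# Assumption: any writable path works; a temp directory keeps the sketch portable
LOG_PATH = os.path.join(tempfile.gettempdir(), "backup_steps.log")
step_log = []  # (step name, status, seconds elapsed)


def log_line(message):
    """Print a timestamped line and append it to LOG_PATH straight away."""
    line = f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}"
    print(line, flush=True)
    with open(LOG_PATH, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


@contextmanager
def timed_step(name):
    """Log the start and end of a step and record how long it took."""
    start = time.monotonic()
    log_line(f"START {name}")
    status = "FAILED"
    try:
        yield
        status = "OK"
    finally:
        elapsed = time.monotonic() - start
        step_log.append((name, status, elapsed))
        log_line(f"{status} {name} ({elapsed:.1f}s)")


def placeholder_step():
    # Hypothetical stand-in for one of the four real steps
    time.sleep(0.01)


for step_name in ("1 export view to FGDB", "2 append to layer",
                  "3 add timestamps", "4 delete temporary FGDB"):
    with timed_step(step_name):
        placeholder_step()
```

If a run is killed at the timeout, the last `START` line without a matching `OK` or `FAILED` line points at the step that was still running.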

Any help would be greatly appreciated.  Full task details of the failed task included below in case it's any help.

Thanks in advance - Ian

 

{
  "result": {
    "jobId": "32a37804e23d4deb834897fe28694e6a",
    "type": "executeNotebook",
    "status": "FAILED",
    "username": "**my username was here**",
    "startTime": 1689190241118,
    "endTime": 1689191226757,
    "messages": [
      "Input Notebook Path:  /arcgis/home/.tasks/32a37804e23d4deb834897fe28694e6a/c13203e4fe554e8094d233ed5ed84db8.ipynb",
      "Output Notebook Path:  /arcgis/home/.tasks/32a37804e23d4deb834897fe28694e6a/output.ipynb",
      "Start processing time: 2023-07-12 19:30:42.046459"
    ],
    "errors": [
      "",
      "[ERROR] - Terminating execute notebook job 32a37804e23d4deb834897fe28694e6a as scheduled notebook execution timeout 15 minutes exceeded."
    ],
    "inputs": {
      "itemId": "c13203e4fe554e8094d233ed5ed84db8",
      "updatePortalItem": true,
      "saveInjectedParameters": false,
      "notebookParameters": "{}",
      "runId": "826766592cd04b60888579ca881d75ee",
      "taskId": "39f24117f0e5434bbb0c0a925829b54f"
    },
    "results": {},
    "customAttributes": {
      "isCancelled": false
    },
    "jobError": null,
    "jobType": null,
    "serverId": null,
    "notebookId": null,
    "itemId": null,
    "openNotebookProgress": null,
    "notebookUrl": null
  }
}

 

4 Replies
JoeBullard
New Contributor II

Hi, did you get anywhere with this? I'm facing a similar item export issue and it's driving me round the bend! There's seemingly no pattern to when it happens and I'm finding it impossible to debug. My notebook usually takes 2-3 minutes, but at times it times out at 60 minutes and fails as a result.

MappyIan
Occasional Contributor II

Hi @JoeBullard 

Sorry to hear you're having the same/similar problem.  I know what you mean about it driving you round the bend; it's so frustrating.

I did make a bit of progress: I narrowed the issue down to the first step of my four-step backup routine, which is the step that exports the layer to a temporary FGDB.  I have absolutely no idea why this fails sometimes, and like you I can't spot any pattern.
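
Since the failure shows up as a timeout rather than an exception, a plain try/except cannot see it if the export simply never returns. A minimal sketch of one way to bound a single call, assuming it is acceptable to give up on a slow export and report an error instead: `call_with_deadline` and `slow_export` are hypothetical names, and `slow_export` stands in for the real export call. Python cannot stop the abandoned thread, so a slow export may still finish later and leave an extra item behind.

```python
import threading
import time


def call_with_deadline(func, timeout_seconds):
    """Run func in a daemon thread; raise TimeoutError if it overruns."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func()
        except Exception as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_seconds)
    if thread.is_alive():
        # The worker is abandoned, not stopped; it may still complete later
        raise TimeoutError(f"call did not finish within {timeout_seconds}s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def slow_export():
    # Hypothetical stand-in for an export that takes too long
    time.sleep(0.5)
    return "fgdb-item-id"


timed_out = False
try:
    call_with_deadline(slow_export, timeout_seconds=0.1)
except TimeoutError:
    timed_out = True
```

With a deadline well under the 15-minute limit, the notebook gets a chance to print which step overran before the scheduler terminates the job.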

To try and mitigate the issue I added "try" and "except" statements around the line of code that exports the dataset to FGDB.  If an exception is caught during the export, I just get it to try and export to FGDB again.  

This has resolved most of the issues I was experiencing.  It's rare that both the first and second attempts to export the dataset as FGDB fail, but it does sometimes happen, which then causes the Notebook to be terminated.
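
The retry described above can be generalised to a bounded number of attempts with a pause between them. This is a sketch under assumptions, not the code actually used in the Notebook: `retry` and `flaky_export` are hypothetical names, and `flaky_export` simulates an export that fails twice before succeeding.

```python
import time


def retry(func, attempts=3, delay_seconds=1.0, backoff=2.0):
    """Call func until it succeeds or attempts run out; re-raise the last error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            print(f"Attempt {attempt} of {attempts} failed: {exc!r}")
            if attempt < attempts:
                # Wait before the next attempt, lengthening the pause each time
                time.sleep(delay_seconds)
                delay_seconds *= backoff
    raise last_error


calls = {"count": 0}


def flaky_export():
    # Hypothetical stand-in for item.export(...): fails twice, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("export failed")
    return "fgdb-item-id"


item_id = retry(flaky_export, attempts=3, delay_seconds=0.01)
```

A retry like this only helps when the export raises an exception. If the call hangs instead, control never returns to the loop, so the total time across all attempts still has to fit inside the scheduled run's limit.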

I never managed to get to the bottom of why the export to FGDB sometimes fails, and I didn't log a support case with ESRI as it's such an intermittent problem and impossible to replicate.

Hope this helps.

JoeBullard
New Contributor II

Thanks for this. Yes, I have resorted to try/except blocks, but it still seems to happen on occasion. I think I'm going to raise a ticket.

MappyIan
Occasional Contributor II

Hi @JoeBullard, if you get anywhere raising a support ticket with ESRI I'd be very grateful if you could update this thread as I'd really like to get to the bottom of the issue.
