ArcGIS Enterprise 11.1-11.2: Restoring tile cache breaks scene layers

10-11-2023 02:17 AM
NicolasGIS
Occasional Contributor III

Hello,

I have been facing an issue since we upgraded from 11.0 to 11.1: when we restore our tile cache datastore using webgisdr on the standby environment, the restored scene layers are corrupted and broken.


Though the restore is reported as successful, trying to visualize the scene layers fails.
All requests for the tile description of the service fail:
=> Hosted/Buildings_3D/SceneServer/layers/0/nodes/root?f=json
returns:

{
    "error": {
        "code": 500,
        "message": "Server error unable to process request",
        "details": []
    }
}

 

 

But validation of the datastores from ArcGIS Server Manager or the ArcGIS Server admin interface is successful. The webgisdr log says the restore was successful as well.
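Since datastore validation passes even when the layers are broken, requesting each layer's root node directly is a more telling check. The following is a minimal Python sketch of such a probe, not an Esri tool: the services directory URL, the service names and the optional token are placeholders to adapt, and it assumes a broken layer answers either with an HTTP error or with a JSON body containing an "error" key, as in the response above.

```python
import json
import urllib.parse
import urllib.request


def root_node_url(services_url, service_name, layer_id=0):
    """Build the root-node URL of a hosted scene service layer."""
    return (f"{services_url.rstrip('/')}/Hosted/{service_name}"
            f"/SceneServer/layers/{layer_id}/nodes/root?f=json")


def is_broken(body):
    """True when the body is not valid JSON or is a JSON error envelope."""
    try:
        payload = json.loads(body)
    except ValueError:
        return True
    return isinstance(payload, dict) and "error" in payload


def check_scene_layer(services_url, service_name, token=None, timeout=30):
    """Return True when the root node of layer 0 can be fetched without error."""
    url = root_node_url(services_url, service_name)
    if token:
        # Passing the token as a query parameter is an assumption about the setup.
        url += "&token=" + urllib.parse.quote(token)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers HTTP errors, connection failures and timeouts.
        return False
    return not is_broken(body)
```

Calling check_scene_layer with your own services directory URL for each hosted scene service after a restore gives a quick list of the layers that did not survive it.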

With ArcGIS Server Manager logging in debug mode, these 500 errors seem to be related to a 404 from CouchDB:

[Screenshot: NicolasGIS_0-1697027288545.png]

This issue occurs on both ArcGIS Enterprise deployments we are running. The procedure to restore on standby has been in place for several versions and we never faced any issue (except at 11.0 when restoring the relational datastore, cf. https://community.esri.com/t5/high-availability-and-disaster-recovery-questions/cannot-restore-arcgi... which was later fixed with a patch).

I opened a case with support 6 months ago, after the upgrade (case 03338742), and so far there has been no progress...
What we tried:
- manually restoring the tile cache datastore from production tile cache backup: same results
- manually restoring the tile cache datastore from webgisdr backup: same results
- restoring on a datastore content folder stored on C:\ of the VM to exclude drive performance issue: same results

Interesting facts:
- It's not systematic: sometimes all the scene services are broken, sometimes only some of them, but never have all of them been restored successfully. Most of the time, they are all broken.
- Procedure repeated dozens of times, always with the same result
- Publishing a new scene layer on the standby datastore that holds the broken restored services works: the tile cache datastore itself is functional
- Publishing a new scene layer on the production tile cache datastore and then restoring it fails as well (so the issue is not restricted to scene layers published from previous versions)

It is worth mentioning that we could potentially lose all our work: there is currently no way to export scene layers as a backup solution because of "BUG-000143562 - Exporting a multipatch hosted feature layer as a file geodatabase from ArcGIS Online results in a corrupted geodatabase", so there is currently no way to retrieve the tile cache datastore content...

I would really appreciate your inputs @ChristopherPawlyszyn and @JonathanQuinn ...

Thanks

17 Replies
ChristopherPawlyszyn
Esri Contributor

Thanks @NicolasGIS. Can you try unregistering both the tile cache and relational data stores from the ArcGIS Server site, stopping the ArcGIS Data Store service, renaming (or deleting) the arcgisdatastore directory, reregistering the data stores, and then attempting the restore again?

We're still iterating through some fixes internally, but I am unsure when they'll be able to make it into a public patch, so this is merely a temporary workaround until that time.


-- Chris Pawlyszyn
NicolasGIS
Occasional Contributor III

Hello @ChristopherPawlyszyn,

Apologies for my late reply. I have been working hard on trying to make it work.
Unfortunately, it does not. I wonder whether there is another bug on top of the "DIRECT_FILE_BACKUP" one.

When, as you suggested, I unregister the datastore and try to restore it manually using the "restoredatastore" command following this article:
https://support.esri.com/en-us/knowledge-base/how-to-restore-arcgis-data-store-000021700

I get the following error:

 

Error encountered: Machine 'https://GISSTORE.COMPANY.COM:2443/arcgis/datastoreadmin' returned an error. 'Failed to restore 'tile cache' data store.
Caused by: null'

 

Unfortunately, with this method, nothing is written to the logs. It seems that setting ArcGIS Server to DEBUG mode is not taken into account.

I also tried several times to restore the full webgisdr backup using the following workflow:
- Initializing new VMs from scratch
- Installing and configuring ArcGIS Enterprise 11.1 components on each VM using ArcGIS PowerShell DSC, as if it were prod
- At the end, the standby ArcGIS Enterprise deployment is functional
- Restoring the 11.1 webgisdr backup taken from production, with the tileCache backup configured as "REPLICATION_BACKUP" (I double-checked the metadata file and it is a "REPLICATION_BACKUP" tileCache)

After that, the relational datastore is successfully restored but the tileCache one fails after a while, always with this error:

 

{
    "jobId": "329f74b3-45d2-40f2-a267-599f1941214e",
    "errorMessage": "Failed to register tile cache data store.\nCaused by: The specified GIS Server site already has a tile cache data store.",
    "description": "Deploy data store snapshot 20231125-020011-08-FULL from \\\\myser\\WebGISSite1701272254236\\dataStore\\a3729eb8-a53e-4ac5-88a4-d90a56801b3b",
    "lastModified": "2023-11-29 18:52",
    "status": "failed"
}

 

In the arcgis datastore server logs:

 

<Msg time="2023-11-29T18:20:33,793" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Applying WebGIS DR snapshot for tile cache data store...</Msg>
<Msg time="2023-11-29T18:20:37,821" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Stopping nonsql database </Msg>
<Msg time="2023-11-29T18:20:37,821" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">NoSQL database was stopped </Msg>
<Msg time="2023-11-29T18:33:01,1" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">prepare for tile cache data store configuration...</Msg>
<Msg time="2023-11-29T18:33:01,1" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure vm.args...</Msg>
<Msg time="2023-11-29T18:33:01,7" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare default.ini...</Msg>
<Msg time="2023-11-29T18:33:01,18" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare SSL certificates...</Msg>
<Msg time="2023-11-29T18:33:01,303" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure D:\arcgis\arcgisdatastore\nosqldata\etc\local.ini...</Msg>
<Msg time="2023-11-29T18:33:02,335" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">configuring tile cache data store...</Msg>
<Msg time="2023-11-29T18:33:04,850" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">create user usr_83dhi...</Msg>
<Msg time="2023-11-29T18:33:05,884" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">create tile cache site store if needed ...</Msg>
<Msg time="2023-11-29T18:33:08,350" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Register new title cache data store...</Msg>
<Msg time="2023-11-29T18:33:17,834" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Performing restore on a existing tile cache data store...</Msg>
<Msg time="2023-11-29T18:33:17,850" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Found backup set specified:DEFAULT_LOCAL</Msg>
<Msg time="2023-11-29T18:33:17,850" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Validate storage store for the source backup store...</Msg>
<Msg time="2023-11-29T18:33:17,850" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Validate source backup set...</Msg>
<Msg time="2023-11-29T18:33:17,881" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">prepare for tile cache data store configuration...</Msg>
<Msg time="2023-11-29T18:33:17,881" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure vm.args...</Msg>
<Msg time="2023-11-29T18:33:17,881" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare default.ini...</Msg>
<Msg time="2023-11-29T18:33:17,881" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare SSL certificates...</Msg>
<Msg time="2023-11-29T18:33:17,991" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure D:\arcgis\arcgisdatastore\staging\tileCache\tcs_2jikygfh\etc\local.ini...</Msg>
<Msg time="2023-11-29T18:52:43,162" type="WARNING" code="110379" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to restore 'tile cache' data store. null</Msg>
<Msg time="2023-11-29T18:52:43,162" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">com.esri.arcgis.datastore.common.DataStoreException
	at com.esri.arcgis.datastore.model.couchdb.CouchDBManager.restore(CouchDBManager.java:8270)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.b(DataStoreAdminClient.java:4857)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.restore(DataStoreAdminClient.java:3796)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.a(DataStoreAdminClient.java:12829)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.deployWebGISSnapshot(DataStoreAdminClient.java:11771)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient$e.run(DataStoreAdminClient$e.java:10906)
	at java.base/java.lang.Thread.run(Unknown Source)
</Msg>
<Msg time="2023-11-29T18:52:43,162" type="SEVERE" code="111002" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to import data to your replicated site. Failed to restore 'tile cache' data store.
Caused by: null</Msg>
<Msg time="2023-11-29T18:52:43,162" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">com.esri.arcgis.datastore.common.DataStoreException: Failed to restore 'tile cache' data store.
Caused by: null
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.restore(DataStoreAdminClient.java:3801)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.a(DataStoreAdminClient.java:12829)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.deployWebGISSnapshot(DataStoreAdminClient.java:11771)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient$e.run(DataStoreAdminClient$e.java:10906)
	at java.base/java.lang.Thread.run(Unknown Source)
</Msg>
<Msg time="2023-11-29T18:52:43,162" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Stopping nonsql database </Msg>
<Msg time="2023-11-29T18:52:43,162" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">NoSQL database was stopped </Msg>
<Msg time="2023-11-29T18:52:48,85" type="WARNING" code="110804" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to register tile cache data store. The specified GIS Server site already has a tile cache data store.</Msg>
<Msg time="2023-11-29T18:52:48,85" type="DEBUG" code="9999" source="Data Store" process="9136" thread="36" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">com.esri.arcgis.datastore.common.DataStoreException: The specified GIS Server site already has a tile cache data store.
	at com.esri.arcgis.datastore.model.couchdb.CouchDBManager.registerDataStore(CouchDBManager.java:661)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.registerTileCacheDataStore(DataStoreAdminClient.java:13919)
	at com.esri.arcgis.datastore.client.DataStoreAdminClientForServer.registerDataStore(DataStoreAdminClientForServer.java:2843)
	at com.esri.arcgis.datastore.client.DataStoreAdminClientForServer.registerDataStore(DataStoreAdminClientForServer.java:2744)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.a(DataStoreAdminClient.java:13026)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.deployWebGISSnapshot(DataStoreAdminClient.java:11771)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient$e.run(DataStoreAdminClient$e.java:10906)
	at java.base/java.lang.Thread.run(Unknown Source)
</Msg>

 

When this occurs, I unregister the datastores from ArcGIS Server Manager, uninstall ArcGIS Data Store from the VM, rename the "ArcGIS" installation folder, rename the "arcgisdatastore" content folder,
reinstall the 11.1 datastore without configuring it, and try a manual restore using "restoredatastore", but it fails as mentioned before. Also, it seems to me that the error:

 

Failed to register tile cache data store. The specified GIS Server site already has a tile cache data store.

 

is misleading because it occurs after the following error, and I don't think it is the root cause of the failed restore:

 

Failed to restore 'tile cache' data store. null
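
To check the order of the failures without reading through the DEBUG lines, the WARNING and SEVERE entries can be pulled out of the Data Store log and sorted by time. This is a rough Python sketch that only relies on the <Msg> layout visible in the excerpts of this thread; it assumes the part after the comma in the time attribute is an unpadded millisecond count, which is a guess from the sample values.

```python
import re
from datetime import datetime

# Matches one <Msg ...>...</Msg> entry; the message text may span several lines.
MSG_RE = re.compile(
    r'<Msg time="(?P<time>[^"]+)" type="(?P<type>[^"]+)" code="(?P<code>[^"]+)"'
    r'[^>]*>(?P<text>.*?)</Msg>',
    re.DOTALL,
)


def parse_time(stamp):
    """Split '2023-11-29T18:52:48,85' into (datetime, milliseconds)."""
    base, _, millis = stamp.partition(",")
    # Assumption: the value after the comma is milliseconds without zero padding.
    return datetime.strptime(base, "%Y-%m-%dT%H:%M:%S"), int(millis or 0)


def problems(log_text, levels=("WARNING", "SEVERE")):
    """Return (time, code, first line of message) tuples sorted by time."""
    found = []
    for match in MSG_RE.finditer(log_text):
        if match.group("type") not in levels:
            continue
        lines = match.group("text").strip().splitlines()
        first_line = lines[0] if lines else ""
        found.append((parse_time(match.group("time")), match.group("code"), first_line))
    return sorted(found, key=lambda item: item[0])
```

Run against the first log excerpt, it lists code 110379 (restore failed) before code 110804 (site already has a tile cache data store), which is consistent with the registration error being a consequence rather than the cause.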

 

I also tried the following workflow:
- Initializing new VMs from scratch
- Installing and configuring ArcGIS Enterprise 11.0 components on each VM using ArcGIS PowerShell DSC, as if it were prod
- At the end, the standby ArcGIS Enterprise 11.0 deployment is functional
- Restoring the last 11.0 webgisdr backup taken from production. At the time, "REPLICATION_BACKUP" was the default
- At the end, the webgisdr restore works fine and my 11.0 deployment works perfectly. So I guess we can exclude an issue with my workflow, as it works perfectly at 11.0
- Upgrading the deployment to 11.1
- Restoring the 11.1 webgisdr backup taken from production, with the tileCache backup configured as "REPLICATION_BACKUP"

After that, the relational datastore is successfully restored but the tileCache one fails with the same error as with the "restoredatastore" command:

 

<Msg time="2023-12-01T12:21:24,172" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Applying WebGIS DR snapshot for tile cache data store...</Msg>
<Msg time="2023-12-01T12:21:24,277" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Performing restore on a existing tile cache data store...</Msg>
<Msg time="2023-12-01T12:21:24,326" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Found backup set specified:DEFAULT_LOCAL</Msg>
<Msg time="2023-12-01T12:21:24,326" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Validate storage store for the source backup store...</Msg>
<Msg time="2023-12-01T12:21:24,326" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Validate source backup set...</Msg>
<Msg time="2023-12-01T12:21:24,421" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to cleanup the staging directory for tilecache data store...</Msg>
<Msg time="2023-12-01T12:21:24,421" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">prepare for tile cache data store configuration...</Msg>
<Msg time="2023-12-01T12:21:24,421" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure vm.args...</Msg>
<Msg time="2023-12-01T12:21:24,433" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare default.ini...</Msg>
<Msg time="2023-12-01T12:21:24,433" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Prepare SSL certificates...</Msg>
<Msg time="2023-12-01T12:21:24,792" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Configure D:\arcgis\arcgisdatastore\staging\tileCache\tcs_2jikygfh\etc\local.ini...</Msg>
<Msg time="2023-12-01T12:26:02,844" type="WARNING" code="110379" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to restore 'tile cache' data store. null</Msg>
<Msg time="2023-12-01T12:26:02,844" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">com.esri.arcgis.datastore.common.DataStoreException
	at com.esri.arcgis.datastore.model.couchdb.CouchDBManager.restore(CouchDBManager.java:8270)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.b(DataStoreAdminClient.java:4857)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.restore(DataStoreAdminClient.java:3796)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.a(DataStoreAdminClient.java:12829)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.deployWebGISSnapshot(DataStoreAdminClient.java:11771)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient$e.run(DataStoreAdminClient$e.java:10906)
	at java.base/java.lang.Thread.run(Unknown Source)
</Msg>
<Msg time="2023-12-01T12:26:02,844" type="SEVERE" code="111002" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">Failed to import data to your replicated site. Failed to restore 'tile cache' data store.
Caused by: null</Msg>
<Msg time="2023-12-01T12:26:02,844" type="DEBUG" code="9999" source="Data Store" process="1152" thread="32" methodName="" machine="GISSTORE.COMPANY.COM" user="" elapsed="" requestID="">com.esri.arcgis.datastore.common.DataStoreException: Failed to restore 'tile cache' data store.
Caused by: null
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.restore(DataStoreAdminClient.java:3801)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.a(DataStoreAdminClient.java:12829)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient.deployWebGISSnapshot(DataStoreAdminClient.java:11771)
	at com.esri.arcgis.datastore.client.DataStoreAdminClient$e.run(DataStoreAdminClient$e.java:10906)
	at java.base/java.lang.Thread.run(Unknown Source)
</Msg>

 

Could you please help? What do you think the issue could be? Should I open another ticket?

Thanks,

Nicolas

NicolasGIS
Occasional Contributor III

Hi @ChristopherPawlyszyn ,

Any update? Any idea why the workaround is not working in my case?

Thanks

malte
New Contributor II

I am having the same issue. All .couch files in the folders under arcgisdatastore\nosqldata\data\shards are exactly 13 KB, while the ones from the old datastore are much larger. Unfortunately, unregistering and reregistering the datastore didn't work for me.
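
A quick way to look for the same symptom is to list the .couch files under the shards folder that are suspiciously small. The Python sketch below is only an illustration: the 16 KiB threshold is an arbitrary value chosen to catch the 13 KB files described here, and the path has to be adjusted to the local data store content directory.

```python
from pathlib import Path


def small_couch_files(shards_dir, max_bytes=16 * 1024):
    """Yield (path, size in bytes) for .couch files at or below max_bytes."""
    for path in sorted(Path(shards_dir).rglob("*.couch")):
        size = path.stat().st_size
        if size <= max_bytes:
            yield path, size
```

Comparing the output for the restored standby with the same listing on production shows which shard files came back nearly empty.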

NicolasGIS
Occasional Contributor III

Hi @malte,

How did you find the issue? By restoring from webgisdr or with a tileCache restore?

What error do you get?

Is the workflow suggested by @ChristopherPawlyszyn working on your side?

Thanks,

Nicolas

malte
New Contributor II

Hi,

I am trying to restore from a webgisdr backup. I don't get any error messages during the restore, only when I try to access a scene layer. None of the solutions mentioned worked for me.

NicolasGIS
Occasional Contributor III

TLDR for new readers stumbling upon this long thread and facing the same issue.

BUG-000162528 - Upgraded scene layers cannot be restored to a replicated deployment at ArcGIS Enterprise version 11.1.

The official workaround is this one:

 

The following changes should be performed on all ArcGIS Data Store machines with the tile cache data store configured:
- Modify the 'tilecache_backup_type' property in '<arcgisdatastore>/etc/datastore.properties' to 'REPLICATION_BACKUP'.
- Restart the ArcGIS Data Store service.

Once the changes are complete for the machines in the site:
- Take a new backup from the primary deployment.
- Restore to the secondary deployment.
- Verify scene layers are accessible on the secondary deployment.
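
For several machines, the property change in the workaround can be scripted. The Python helper below is a minimal sketch, assuming datastore.properties is a plain key=value text file; only the property name, its value and the file path come from the workaround text, and everything else (function names, appending the key when it is missing) is illustrative. Make a copy of the file before changing it.

```python
import re


def _key_pattern(key):
    # Matches "key=value" or "key: value" on a single line, keeping the prefix.
    return re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*[=:][ \t]*)[^\r\n]*", re.MULTILINE)


def get_property(text, key):
    """Return the current value of key, or None when the key is absent."""
    match = _key_pattern(key).search(text)
    if match is None:
        return None
    return match.group(0)[len(match.group(1)):].strip()


def set_property(text, key, value):
    """Return the properties text with key set to value (appended if missing)."""
    pattern = _key_pattern(key)
    if pattern.search(text):
        return pattern.sub(lambda m: m.group(1) + value, text, count=1)
    separator = "" if not text or text.endswith("\n") else "\n"
    return f"{text}{separator}{key}={value}\n"
```

The text returned by set_property(text, "tilecache_backup_type", "REPLICATION_BACKUP") would then be written back to '<arcgisdatastore>/etc/datastore.properties' before restarting the ArcGIS Data Store service.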

 

But as you can read in more detail in this thread, I was never able to make it work.

Also, I get the same issue at 11.2, so I think the BUG title should be updated as well.

NicolasGIS
Occasional Contributor III

Hello @ChristopherPawlyszyn ,

It's been 4 months since your last reply and, after days/weeks of trials, I still can neither restore our production ArcGIS Enterprise nor upgrade it. The workaround does not work, as mentioned several times. Support has given up on me with the workaround.

Any progress? Hints? Timeline?

Thanks,

Nicolas

/cc @CedricDespierreCorporon 
