GeoEvent Gateway automatically stopping - GeoEvent Manager webpage not loading

11-09-2023 06:58 AM
Nikhil_Kommidi
New Contributor

Hello,

Recently our GeoEvent Manager webpage stopped loading. When I checked our services control panel, I found that the GeoEvent Gateway service had stopped. When I try to start it, it stops again automatically in less than a minute. The GeoEvent Server, ArcGIS Server and ArcGIS Data Store services are running fine. Both of our platform services, Zookeeper and Spark, are started and currently running.

I have gone through the error logs. The gateway.log file shows:

ERROR [WrapperSimpleAppMain:GatewayManager@162] - Error starting the zookeeper.
java.nio.file.FileAlreadyExistsException: C:\ProgramData\Esri\GeoEvent-Gateway\zookeeper-data
ERROR [WrapperSimpleAppMain:GatewayManager@109] - Error starting Gateway.
java.nio.file.FileAlreadyExistsException: C:\ProgramData\Esri\GeoEvent-Gateway\zookeeper-data

The wrapper.log file shows:

Shutdown failed: Timed out waiting for signal from JVM.
ERROR | wrapper JVM did not exit on request, terminated

Ours is a production environment. Kindly help!

Thanks,

Nikhil. 

JeffSilberberg
Occasional Contributor III

First off I hope you have opened a ticket with support on this -- 

Second, if it were me, after getting a good snapshot/backup, I would move the file that it's complaining already exists out of the path of execution and try to restart. But that's just a guess. Again, you need a support ticket for things like this.
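
The "move it out of the way" step could be sketched roughly as below. This is only an illustration of the guess above, not a procedure confirmed by Esri: the helper name `move_aside` and the backup folder are hypothetical, and it should only be tried with the GeoEvent Server and GeoEvent Gateway services stopped and a good backup in hand.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def move_aside(path: Path, backup_root: Path) -> Optional[Path]:
    """Move `path` into `backup_root` under a timestamped name.

    Returns the new location, or None if nothing exists at `path`.
    """
    # is_symlink() also catches a dangling link, for which exists() is False.
    if not path.exists() and not path.is_symlink():
        return None
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{path.name}.{stamp}"
    shutil.move(str(path), str(target))
    return target


# Example call. The source path comes from the error in gateway.log;
# the backup folder is a hypothetical choice:
# move_aside(Path(r"C:\ProgramData\Esri\GeoEvent-Gateway\zookeeper-data"),
#            Path(r"C:\Temp\geoevent-gateway-backup"))
```

Moving rather than deleting keeps the original content available to restore or to hand to support.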

 

 

Nikhil_Kommidi
New Contributor

Hi Jeff,

Thanks for the reply! I got in touch with the support team. They first suggested stopping and restarting ArcGIS Server, GeoEvent Server and the Gateway. When that did not solve the problem, they suggested an administrative reset of GeoEvent Server. That is not an option we want to take on a production server. Although we have a backup, we are still trying to find a solution that does not require downtime.

My first thought was to check the network ports on the firewall, but I found nothing unusual. I was also wondering what could cause any of the ports used by GeoEvent Server to be closed without anyone changing anything.
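
For anyone repeating the port check, a minimal sketch of testing whether something is accepting TCP connections on a given port is shown below. The helper name `port_is_open` is hypothetical. Port 6143 is the HTTPS port GeoEvent Manager is normally served on; any other port numbers should be taken from the Esri documentation for your release rather than from this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example call, run on the GeoEvent Server machine itself:
# port_is_open("localhost", 6143)
```

A failed connection from the machine itself usually means nothing is listening on that port (for example because the process behind it has stopped), which is a different situation from a firewall blocking remote access.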

Nikhil. 

RJSunderman
Esri Regular Contributor

@Nikhil_Kommidi -- It would be good to know which release of GeoEvent Server you are running and whether it is running on a stand-alone ArcGIS Server (not federated with an Enterprise portal), or if an Enterprise portal is part of the architecture, what role the ArcGIS Server used to run GeoEvent Server plays.

I'm confident that the ArcGIS Server platform services you mention are not part of the problem. The ArcGIS Server 'Synchronization_Service' used to run an instance of Zookeeper is only referenced by GeoEvent Server when initializing its configuration following a fresh product installation (or following an administrative reset). The ArcGIS Server platform service is checked only to see if there is an old GeoEvent Server configuration in there which might be imported and upgraded to your current release. After that GeoEvent Server uses an instance of Zookeeper managed by the GeoEvent Gateway and does not make further use of the ArcGIS Server platform service.

ArcGIS Server is constantly communicating with the GeoEvent Gateway. If the GeoEvent Gateway service is stopped (or has crashed) the GeoEvent Server service needs to be stopped. GeoEvent Server cannot run without its Gateway managing the Apache Kafka message broker and Zookeeper distributed configuration store.

If you are looking at the .../GeoEvent/data/log/wrapper.log and see that the GeoEvent Server's JVM has been shut down, that means GeoEvent Server is not running (regardless of the state of the GeoEvent Server service shown in the Windows MMC Services console). If the JVM is not running, your GeoEvent Server is not receiving, adapting or processing real-time data. You also won't be able to launch the GeoEvent Manager web application.
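
A quick way to pull the most recent error entries out of a log such as wrapper.log or gateway.log is sketched below. The helper name `error_lines` is hypothetical, and it simply matches the text `ERROR` as it appears in the log lines quoted earlier in this thread; pass in the full path to the log on your own system.

```python
from pathlib import Path
from typing import List


def error_lines(log_path: str, marker: str = "ERROR", last: int = 20) -> List[str]:
    """Return up to `last` of the most recent lines in the log that contain `marker`."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    matches = [line for line in text.splitlines() if marker in line]
    return matches[-last:]


# Example call (replace the argument with the real location of the log):
# for line in error_lines("wrapper.log"):
#     print(line)
```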

It is likely that when you try to start the GeoEvent Gateway it attempts to coordinate its Kafka topics with the Zookeeper configuration. When that fails the GeoEvent Gateway cannot initialize and shuts down. The Kafka and Zookeeper managed by GeoEvent Gateway are very tightly coupled. Kafka cannot do its job without Zookeeper (and vice versa).

If you see indications that a file beneath C:\ProgramData\Esri\GeoEvent-Gateway\zookeeper-data already exists and this is interfering with the GeoEvent Gateway initializing either its Kafka or Zookeeper ... I can only guess that something has corrupted the Gateway's runtime files. I can offer the advice that creating system restore points using a VM snapshot (for example) is not a reliable way to back up your GeoEvent Server. A snapshot of a VM is not "application consistent" for Esri software. GeoEvent Server in particular may fail to restart following a revert to a VM snapshot if real-time data was actively being ingested, adapted, processed, and/or disseminated when the VM snapshot image was taken. When running normally the GeoEvent Gateway is actively writing data to disk -- a VM snapshot may capture an inconsistent replica or internal state which causes one or more Kafka topics to become corrupted.
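
As background on the exception itself: Java's java.nio.file API raises FileAlreadyExistsException when it tries to create something at a path that is already occupied. With `Files.createDirectories` in particular, that happens when the path exists but is not a directory (for example a plain file or a link that no longer resolves). Whether the Gateway uses that call is an assumption, but it makes it worth checking what actually sits at the reported path. A minimal sketch, with `describe_path` as a hypothetical helper name:

```python
import os


def describe_path(path: str) -> str:
    """Classify what exists at `path`: directory, file, symlink, broken symlink, other, or missing."""
    if os.path.islink(path):
        # os.path.exists() follows the link, so False here means the target is gone.
        return "symlink" if os.path.exists(path) else "broken symlink"
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    if os.path.exists(path):
        return "other"
    return "missing"


# Example call, using the path from the error in gateway.log:
# print(describe_path(r"C:\ProgramData\Esri\GeoEvent-Gateway\zookeeper-data"))
```

Anything other than "directory" or "missing" would be a useful detail to include in the support ticket.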

I do not like recommending an administrative reset as it is the most destructive remedial step you can perform, particularly prior to the 11.1 release when the reset obliterates any Input, Output, GeoEvent Service, GeoEvent Definitions and other configurable elements you have created using GeoEvent Manager. However, if files which exist beneath C:\ProgramData\Esri\GeoEvent-Gateway are interfering with stopping and restarting the GeoEvent Gateway, an administrative reset really is your only option.
