Service configuration questions - instances, isolation, & SOC.exes

10-01-2010 09:03 AM
TomCohen
Occasional Contributor
Hi,

There are many discussions in the forums about configuring min / max service instances for ArcGIS Server map services (set in the pooling tab in ArcCatalog / Manager). I have spent a lot of time reading these threads and various pieces of ESRI documentation, but I still have a number of outstanding questions that no one seems to answer.  If anybody can provide some input on these I would be extremely grateful!

1) The max number of instances (SOC.exes) is variously recommended as 3 - 4 per cpu core or 1 - 2 per cpu core (depending on which ESRI document you read...).  Assuming 3 - 4 per cpu core, is this per map service, or across all services?  This may seem like a dumb question but I just want to be sure.

2) How is max instances (see above) affected by high / low isolation?  If I have high isolation (1 thread per SOC.exe process), then 4 max instances must translate to 4 SOC.exes running (plus the system SOC process).  If I have low isolation with 4 threads per process set in the processes tab, and I set max instances to 4 in the pooling tab, will I end up with 1 SOC.exe (+1 system)?  If yes, does this mean that I can have 16 max instances (4 SOC.exe * 4 threads per process) on a single core machine?  This is the most important issue for me.

3) Are there any guidelines on the maximum safe number of threads per process in low isolation?

4) All documentation that I have seen on low isolation says something like "low isolation is less stable as if the container process fails the service instances that share the container also fail".  However the documentation never says what the impact of this is - will the SOM start a new SOC and complete any outstanding operations, or will the user receive an error?  Also, how likely is the container process to fail?  Could this happen from something like an SDE connection being closed by a firewall and a service instance crashing?

Thanks to anyone who can contribute to this,
Tom
12 Replies
ScottNoldy
New Contributor III
I have exactly the same questions.  I'm posting so this thread doesn't drop down too far.
PedroPeña
New Contributor II
We have been searching and asking about the same questions for a few months and haven't managed to resolve them. All of this is very murky and ambiguous, and even ESRI Spain couldn't give us a satisfactory answer...
I'm following this post with a lot of interest, and I don't think I'm the only one...
Can anyone from the ESRI technical staff clarify?
Thanks!!
by Anonymous User
Not applicable
Hi Tom,

  Below are some answers to your questions, but some ambiguity will always remain, because there are so many variables (hardware, amount of data published, data sources, network infrastructure). To tune your system properly you have to monitor it. Depending on the load you are handling, you may find that you need to stick strictly to the recommendations, or that you can exceed them for your purposes.


1) The max number of instances (SOC.exes) is variously recommended as 3 - 4 per cpu core or 1 - 2 per cpu core (depending on which ESRI document you read...).  Assuming 3 - 4 per cpu core, is this per map service, or across all services?  This may seem like a dumb question but I just want to be sure.


You are correct that past documentation states that 3-4 Instances per cpu was the recommendation, and newer documentation suggests 1-2 Instances per cpu. These recommendations are to maximize throughput, and come from performance testing done by Esri. The best source for this type of information is the Enterprise GIS Resource Center.

The instances-per-CPU figure refers to the total number of ArcSOC.exe processes running on the machine, also known as SOC Capacity.
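To make the arithmetic concrete, here is a minimal sketch of how the per-CPU guideline turns into a machine-wide SOC Capacity. The function name and the quad-core example are assumptions for illustration only; the multipliers are the 1-2 and 3-4 figures quoted above, not an official formula.

```python
def soc_capacity(cpu_cores: int, instances_per_core: int) -> int:
    """Machine-wide ArcSOC.exe budget implied by a per-core guideline.

    Hypothetical helper: it only multiplies the two numbers, so the
    result covers every service on the machine, not each service.
    """
    if cpu_cores < 1 or instances_per_core < 1:
        raise ValueError("both arguments must be at least 1")
    return cpu_cores * instances_per_core


# Compare the newer (1-2) and older (3-4) guidelines on a quad-core machine.
for per_core in (1, 2, 3, 4):
    print(f"{per_core} per core on 4 cores -> capacity {soc_capacity(4, per_core)}")
```

Whichever multiplier you pick, treat the result as a starting point to test against, not a hard limit.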


2) How is max instances (see above) affected by high / low isolation?  If I have high isolation (1 thread per SOC.exe process), then 4 max instances must translate to 4 SOC.exes running (plus the system SOC process).  If I have low isolation with 4 threads per process set in the processes tab, and I set max instances to 4 in the pooling tab, will I end up with 1 SOC.exe (+1 system)?  If yes, does this mean that I can have 16 max instances (4 SOC.exe * 4 threads per process) on a single core machine?  This is the most important issue for me.


Yes, you could have 16 GIS Service Instances, on 4 actual ArcSOC.exe Processes.
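As a rough sketch of that relationship for a single service, assuming instances are simply packed into processes up to the configured number per process (a simplification; the real SOM behaviour may differ, so verify by watching the process list):

```python
import math


def expected_soc_processes(max_instances: int, instances_per_process: int = 1) -> int:
    """Estimate ArcSOC.exe processes for one service at full load.

    Hypothetical helper. instances_per_process=1 models high isolation;
    larger values model low isolation. The system SOC process is not counted.
    """
    if max_instances < 0 or instances_per_process < 1:
        raise ValueError("invalid instance or process settings")
    return math.ceil(max_instances / instances_per_process)


print(expected_soc_processes(4))       # high isolation, max 4 -> 4 processes
print(expected_soc_processes(4, 4))    # low isolation, 4 per process -> 1 process
print(expected_soc_processes(16, 4))   # low isolation, max 16 -> 4 processes
```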


3) Are there any guidelines on the maximum safe number of threads per process in low isolation?


I have not seen any recommendations on using Low Isolation, let alone on the number of threads to allocate. You may have some success with it, but I would only use it on services that handle simple requests, and I would test and monitor for stability.


4) All documentation that I have seen on low isolation says something like "low isolation is less stable as if the container process fails the service instances that share the container also fail".  However the documentation never says what the impact of this is - will the SOM start a new SOC and complete any outstanding operations, or will the user receive an error?  Also, how likely is the container process to fail?  Could this happen from something like an SDE connection being closed by a firewall and a service instance crashing?


You are correct that changing isolation can affect the stability of the ArcSOC.exe process. A single process can be configured to run multiple Map Service instances, so when you set the Service Max to 4, all four instances could be handled by one process. If you opt to do this, I would only implement it for services that handle small, simple requests.

The ArcSOC.exe process will fail if the firewall kills a connection to SDE, no matter the isolation level. The SOM will treat the service just like any other, and it should start a new process if the ArcSOC.exe process crashes. Errors could vary depending on when the ArcSOC.exe crashes, and what requests are being made.

The bottom line is that Esri makes a recommendation for optimal performance, for the server to handle a consistent heavy load. Your needs may be a bit different. Through some testing and monitoring you may find that you are able to stretch the recommendations, or that you need to strictly follow them. Some of the best information, tools for testing, and recommendations can be found on the Enterprise GIS Resource Center.

I hope this helps,
Andrew
PedroPeña
New Contributor II
Thank you, Andrew, this helps us for sure.
One question, to check our understanding. Suppose we are limited by CPU cores to 3-4 simultaneous service instances per core (the old rule), so that 16 is the maximum number of service instances running at the same time on a single quad-core machine. In that case a high isolation configuration would be better and more desirable than a low isolation one, provided the machine has plenty of RAM, because there is no problem with 16 ArcSOC.exe processes running at the same time. The limitation comes more from the number of processors (and, of course, the ArcGIS Server licensing cost too!!) than from other hardware factors such as memory. So, to avoid degrading server performance, we should change the machine's Capacity in the Host Properties from the default value (unlimited) to 16.
Once again,
Thanks!
Pedro.
LeoDonahue
Occasional Contributor III
The one item that is missing from this thread is "pooled services" vs "non-pooled services".

Assumption:
Single machine deployment, single socket, dual core CPU. 
4 service instances per SOC CPU core x 2 cores = 8 max service instances
1 map service with min of 0 and max of 8 instances assigned.

Pooled Services in high isolation:
each service instance runs in its own process
(could potentially have 8 SOC.exe processes running  or 0 - depending on use)

Pooled Services in low isolation:
multiple service instances can run in one process
(ESRI docs indicate you can have anywhere from 8 - 256 instances per process)

Non-Pooled Service - isolation does not apply
(number of simultaneous users = number of service instances possible)

If your application is using Pooled Services and does not change application state, then "my opinion" is that Pooled Services in high isolation is ok because the WebADF releases the service instance back to the pool when it doesn't need it. That being said, and given the assumption above, you could have a lot of users of one map service on a dual core system and have decent performance.

Per ESRI docs, 4 factors come into play:


  1. Usage time of the service

  2. The number of overall possible users

  3. The frequency of requests to a service

  4. The intensity of the processing required per request

by Anonymous User
Not applicable
Nice Response Leo, and excellent points!

Pooled Services are most efficient and while there could be a max of 2 instances for the service, they can handle requests from many clients.

Non-Pooled Services are less efficient and maintain a 1 to 1 relationship between server process and client. So max of 2 here would only allow 2 clients to connect. Non-pooled services are used more with versioned editing, but this workflow has been replaced with more efficient Feature Services in API Applications.

*Don't use Non-Pooled Services unless you absolutely have to!

As for Pedro..

While the number of Services running can go beyond 16, it all comes down to how many requests the hardware can handle at a specific point in time. I could have 30 Services running on a single quad core machine, and it could run fine with moderate requests... but if all 30 services were very busy I'd have a traffic jam at the CPU level!

In your case, if you have 10 Services with a Min of 1 and a Max of 2 (a potential of 20 busy instances), you could set the SOC Capacity to 16 to avoid a traffic jam from too many ArcSOC.exe processes trying to use the CPU. Busier services will use the Max of 2 instances, and less busy services will only use 1. (The SOM will control/balance this.) Setting the capacity to 16 will ensure that no more than 16 ArcSOC.exe processes are running on the container at a given time. While this can keep you at an "x" per CPU level, you may find errors in your log saying that the Server is at capacity.
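The numbers in that example can be checked with a small sketch. It assumes high isolation (one instance per ArcSOC.exe), and the function and key names are made up for illustration; it is plain arithmetic, not a model of how the SOM actually balances instances.

```python
def plan_capacity(services, soc_capacity):
    """Summarise instance demand for (min, max) service settings under a cap.

    Hypothetical helper: assumes one instance per ArcSOC.exe (high isolation).
    """
    min_total = sum(lo for lo, _ in services)
    max_total = sum(hi for _, hi in services)
    running_at_peak = min(max_total, soc_capacity)
    return {
        "min_total": min_total,              # instances started up front
        "max_total": max_total,              # instances if every service is busy
        "running_at_peak": running_at_peak,  # what the capacity actually allows
        "blocked_at_peak": max_total - running_at_peak,
    }


# 10 services, each with Min 1 / Max 2, on a machine with SOC Capacity 16.
print(plan_capacity([(1, 2)] * 10, soc_capacity=16))
```

The four blocked instances at peak are where the "Server is at capacity" log messages would be expected to come from.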

In my opinion, I would just be mindful of the number of services published to the server and plan appropriately what content goes in each service. (Just because we have an mxd with Layers, it doesn't always mean it has to be a new service!) Continue to monitor the server for performance and see whether any timeouts are recorded in the Server Statistics, which can be viewed from ArcCatalog. If you end up with a few too many services, understand that you may be asking the hardware to do too much work and it may slow down when very busy. Use the knowledge you have now to help limit the load, and if needed test setting a SOC capacity to limit the workload.

While licensing more CPU is always something to consider, there are times when another server is needed to handle the load. I consider a GIS Server to be like an additional co-worker who makes maps and does real GIS analysis. As the department and demand grows, I may need to hire more help!

Cheers,
Andrew
AndresCastillo
MVP Regular Contributor

excellent @Andrew Stauffer

JamalNUMAN
Legendary Contributor

But what does the instance physically mean? Is it the GIS service or the ArcSOC.exe?

Instance per machine?!

Instance per process?!

What is the instance?

----------------------------------------
Jamal Numan
Geomolg Geoportal for Spatial Information
Ramallah, West Bank, Palestine
TomCohen
Occasional Contributor
Thanks very much for your responses Andrew (& Leo), this has helped clarify some of the more complex aspects of ArcGIS Server for me.

While waiting for some feedback on this thread I decided to put some of my questions to the test. I developed a performance testing tool and hammered the server via the REST interface under various configurations. For my (very simple) services I found that low isolation worked very well and significantly reduced memory usage:

With 10 services running in high isolation and min / max instances at 1 / 2 for each service, I found that lots of simultaneous requests (distributed randomly across the services) resulted in about 17 SOC.exe processes, each consuming around 84 MB. In low isolation (4 threads per process) the same test gave me 10 services with some running up to around 228 MB (high, but less than 4 x 84). My services are so straightforward (and pooled) that I have no major reservations about low isolation.

One peculiarity came out of my test: I ran 8 simultaneous requests against a single service in low isolation at 4 threads per process, and min / max instances at 1 / 8. I expected to see two SOC.exe processes with high memory usage, but instead I only saw 1 SOC.exe (high memory) and the requests appeared to be serviced 4 at a time (when I say 8 simultaneous requests, 8 requests from a web client went live at the same time). I thought this could potentially be the SOM limiting throughput, though any other suggestions anyone has would be much appreciated (I was using 9.3.0)!
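For anyone who wants to repeat this kind of test, here is a minimal sketch of firing N requests at the same moment. It is not the tool described above: the function names are made up for illustration, and the URL in the usage comment is a placeholder you would replace with a request against your own service.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url: str) -> int:
    """Fetch a URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.status


def fire_simultaneously(url, n_requests, fetch=http_get):
    """Send n_requests at once; return a (status, seconds) pair per request."""
    barrier = threading.Barrier(n_requests)

    def worker(_):
        barrier.wait()  # hold every thread until all of them are ready
        start = time.perf_counter()
        status = fetch(url)
        return status, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        return list(pool.map(worker, range(n_requests)))


# Usage (placeholder URL, replace with a real map service request):
# for status, seconds in fire_simultaneously("http://myserver/...", 8):
#     print(status, round(seconds, 2))
```

If the response times come back in two clear groups, that would suggest the requests are being serviced in batches, as in the 4-at-a-time behaviour described above.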