Where to store imagery for best performance: in SDE?

Anonymous User · ‎06-01-2016

Hi all. I did a bit of reading to find out how to get the best speed and quality for aerial imagery. We have 3" aerials that arrived as GeoTIFFs. I kept them as GeoTIFF with 80% JPEG compression in the YCbCr color space and built pyramids and overviews. Apparently JPEG is much faster to serve than lossless or wavelet (ECW/JPEG 2000) compression, per these links:

http://blog.cleverelephant.ca/2015/02/geotiff-compression-for-dummies.html

https://blogs.esri.com/esri/arcgis/2010/12/21/rasters-get-speed-save-space/

Additionally, bilinear resampling was used since imagery is continuous data (i.e. not nearest neighbor, which is used for discrete data).

http://gis.stackexchange.com/questions/17328/what-resampling-technique-should-be-used-when-projectin...
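For anyone who wants to reproduce the settings above outside ArcGIS, here is a minimal sketch that only builds the equivalent GDAL command lines (it does not run them). It assumes the GDAL command-line tools, 3-band 8-bit RGB input (required for PHOTOMETRIC=YCBCR), and a GDAL version whose gdaladdo accepts bilinear resampling. The file names and overview levels are placeholders, not values from this thread.

```python
import shlex


def gdal_translate_args(src, dst, jpeg_quality=80):
    """Build a gdal_translate command for a tiled, JPEG-in-TIFF GeoTIFF.

    Assumes 3-band, 8-bit RGB input; YCbCr photometric needs that.
    """
    return [
        "gdal_translate",
        "-co", "COMPRESS=JPEG",
        "-co", "PHOTOMETRIC=YCBCR",
        "-co", f"JPEG_QUALITY={jpeg_quality}",
        "-co", "TILED=YES",
        src,
        dst,
    ]


def gdaladdo_args(dst, levels=(2, 4, 8, 16, 32), resampling="bilinear"):
    """Build a gdaladdo command that adds JPEG/YCbCr-compressed pyramids."""
    return [
        "gdaladdo",
        "-r", resampling,
        "--config", "COMPRESS_OVERVIEW", "JPEG",
        "--config", "PHOTOMETRIC_OVERVIEW", "YCBCR",
        "--config", "INTERLEAVE_OVERVIEW", "PIXEL",
        dst,
        *[str(level) for level in levels],
    ]


# Print the commands so they can be reviewed before running them by hand.
# "input.tif" and "output.tif" are hypothetical file names.
print(shlex.join(gdal_translate_args("input.tif", "output.tif")))
print(shlex.join(gdaladdo_args("output.tif")))
```

The quality value of 80 mirrors the setting described above; other replies in this thread mention 75% and 100%, so treat it as a parameter to test rather than a fixed recommendation.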

Now, my main question is this: where should I store this in order to publish a service that is as fast as possible? In SDE or in a file geodatabase? I intend to consume the imagery as a tiled service in web viewers to get the fastest performance. We have Server 10.2.2. Eventually this year we'll migrate to 10.4.

The link below states that imagery referenced in a mosaic dataset (which is what should be used) should NOT be stored in a database, for performance reasons, but as standalone files in the file system. It also states that overviews should not be stored in SDE either:

https://community.esri.com/docs/DOC-8061 (it's down a few pages, it's one of the responses from Esri)

In other words, if I understand correctly, the original image files themselves and the overviews should NOT be copied into a geodatabase. The mosaic dataset should point to the plain original files. What is not clear however is whether the mosaic dataset should be on the SDE or not, before publishing to a service. Again, I think I've got the quality and size optimal from the parameterization above but I just want to know now where to store the imagery for max. speed.

Also, I noticed some discussion of projection, and that Esri recommends keeping imagery in its original projection. Our imagery was delivered to us in Georgia State Plane Ft NAD83 (no mention of which NAD realization in the metadata).

https://blogs.esri.com/esri/apl/2014/06/24/designing-and-optimizing-image-services-for-high-performa...

However, based on the link above, I am wondering whether, for faster performance, storing the imagery reprojected to Web Mercator would make it serve approx. a second faster? Otherwise ArcGIS Server of course reprojects on the fly from State Plane to Web Mercator in our web viewers, all of which have Esri basemaps. I am also wondering whether forcing the web viewers to State Plane, by loading our vector layers first to set the projection, would make Esri’s servers deliver the Esri basemaps to the viewer in State Plane? Or is there another way to get around this programmatically and request Esri basemaps in a State Plane projection instead of Web Mercator? On this last point I shall research more. For now I shall leave it in its original projection. At least it's in Mercator, so the performance penalty should be smaller, according to the document above.

NeilAyres · ‎06-01-2016

I can't comment on image datasets that are gazillions of GB in size, but usually what I do is...

Keep the imagery in whatever format it came in. The exception here is ECW format, which I raster copy to TIFF, using YCbCr 100% JPEG as the compression, as you suggested.

These all live in a subdirectory system directly below an fgdb in which I build the mosaic dataset which points into this file system. I also put the overviews inside the fgdb (using the define overviews tool).

When publishing, load the mosaic into an mxd, then set the data frame coord sys to Web Mercator (WKID 3857?).

For speed, you may want to consider making the whole thing a tile cache, as long as the imagery is not being constantly updated. Tile caches are very fast, but usually end up taking more physical disk space than the original imagery.
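To get a feel for how deep such a cache would have to go for 3" imagery, here is a rough sketch. It assumes the common Web Mercator tiling scheme used by the ArcGIS Online / Google Maps basemaps (256-pixel tiles, resolution halving at each level), and it ignores the latitude scale factor of Web Mercator, so the figures are nominal. A cache built in State Plane would use its own tiling scheme. The function names are illustrative only.

```python
import math

# Nominal resolution of level 0 in the ArcGIS Online / Google Maps scheme
# (metres per pixel at the equator, 256 x 256 pixel tiles).
LEVEL0_RES_M = 156543.03392804097
INCH_M = 0.0254


def level_resolution(level):
    """Nominal metres per pixel at a given cache level."""
    return LEVEL0_RES_M / 2 ** level


def finest_level_needed(source_res_m):
    """Coarsest level whose pixels are at least as fine as the source."""
    return math.ceil(math.log2(LEVEL0_RES_M / source_res_m))


def pyramid_overhead(levels):
    """Total pixel count of `levels` levels relative to the finest alone.

    Each coarser level holds a quarter of the pixels of the one below it,
    so the sum approaches 4/3.
    """
    return sum(0.25 ** i for i in range(levels))


source_res = 3 * INCH_M  # 3 inch pixels = 0.0762 m
level = finest_level_needed(source_res)
print(level, round(level_resolution(level), 4), round(pyramid_overhead(level + 1), 3))
```

Under these assumptions the cache needs to go one level past the 0.149 m level to keep 3" detail, and the coarser levels add roughly a third more pixels on top of the finest one. Actual disk usage also depends on the tile format and compression, which this sketch does not model.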

RebeccaStrauch__GISP · ‎06-01-2016

We keep ours in FGDB/mosaic datasets. In addition to what Neil suggested, if you have access to or can purchase the Image Server extension, that can improve speed, access and on-the-fly functions for services, beyond what can be done from just the mosaic datasets and tile caches (in my opinion). It can also, in theory, handle reprojects on the fly, but mosaic datasets can do this too.

ArcGIS for Server | Image Extension

If you haven't seen this one yet...

Image Management Workflows | ArcGIS Resource Center

CodyBenkelman · ‎06-01-2016

Kevin

I only have time for a brief reply so let me know if you need more.

For fastest performance, you'll want to convert the imagery to raster tile cache. Note this is 8 bits/pixel only, 3 colors only, and you can't change which image is on top if you have overlapping imagery - but I assume you do not have overlapping data. Also note this DOES involve rewriting your data into a new format, so you're resampling and duplicating data (and you can then decide if you want to take the original data offline), but if performance is your #1 priority, then serving cached imagery (same format as Esri base maps) is definitely the fastest.

We did a 60 minute webinar that discusses details, tradeoffs, and workflows here: Esri Training | Sharing Cached Imagery in ArcGIS

One detail that may be worth discussing further (and testing) is fixing everything to GA State Plane. Most web viewers work in Web Mercator, but if a) you're building a custom viewer and b) you want to use only GA State Plane and c) fast performance is #1 priority, you can build your tile cache in GA State Plane and configure your viewers to use that projection. But note that in this scenario the Esri base maps will NOT perform as fast, since you're reprojecting them...

But as noted there are tradeoffs, so if you want to test serving the original imagery before committing to the caching process, your summary above is correct, and this does perform well (I can point you to examples, e.g. NAIP data at http://www.arcgis.com/home/item.html?id=3f8d2d3828f24c00ae279db4af26d566 ). You may find it will be fast enough, and serving from the original data will preserve the best image quality (there is some loss of quality whenever you resample e.g. to create the tile cache)

Leave images in original projection and original format on disk - make sure your server has a fast connection to the source data (e.g. direct attached storage, ideally an SSD disk, not network attached storage)
Access the imagery using a mosaic dataset. If you already have an enterprise database, you can use SDE, but a file geodatabase performs just as fast.

For more information on all of this, you can start with this landing page http://esriurl.com/ImageManagement and follow links to the Image Management Guidebook.

Anonymous User · ‎06-01-2016

It's a terabyte of imagery. Yes I already built tiles (overviews) and I want to serve it as a tiled service for max. performance. I hope I did it right. It built the .ovrs. I also built pyramids. With parameters as given above. Only took a few days, much faster than we thought it would take. Maybe Arc 10.2 sped the process up or maybe our new hardware from this year was a massive upgrade.

Cody, thank you for your reply, very helpful. Yes, I set the viewers to our projection. So that's good: Esri is reprojecting to fit our viewer, not the other way around. Sounds good, because I imagine Esri has extremely fast hardware and network in the datacenter.

And an SSD array for our server, hmm. Gosh, that'd be fast, but storing imagery on that would probably be far beyond our budget. We have NAS now, with fast fiber interconnects, RAID striping and 10k disks, so they are as fast as can be for spinning disks. Also 96 gigs of RAM on brand new HP blades with good specs, and a fast Comcast external connection and fiber intranet. Perhaps on the next upgrade cycle we can acquire SSDs.

ModyBuchbinder · ‎06-01-2016

A few more tips from my experience.

If your files are TIFF compressed in YCbCr (I think 75% is very good too) and about 5k x 5k pixels in size, that works best.
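As a quick sanity check on what a 5k x 5k image means for 3" imagery, the arithmetic can be sketched as below. It assumes square images, 3 bands and 8 bits per band; the function names are made up for illustration.

```python
def image_footprint_ft(pixels_per_side, pixel_size_in):
    """Ground length of one side of a square image, in feet."""
    return pixels_per_side * pixel_size_in / 12.0


def uncompressed_bytes(pixels_per_side, bands=3, bytes_per_sample=1):
    """Raw size of a square image before any compression."""
    return pixels_per_side ** 2 * bands * bytes_per_sample


side_ft = image_footprint_ft(5000, 3)   # 5000 px at 3 inches per pixel
raw_bytes = uncompressed_bytes(5000)    # 3-band, 8-bit
print(side_ft, raw_bytes)
```

So each such image covers a square of 1250 ft per side and holds 75,000,000 bytes of raw pixels before JPEG compression reduces it.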

Then try to put the FGDB itself (not the rasters) locally (it is not too big) and compress it (if you do not add more rasters).

All this might help a little more.

Anonymous User · ‎06-02-2016

A database compress? Like how you have to do with versioned SDE?

ModyBuchbinder · ‎06-04-2016

Compress file geodatabase.

That makes it read-only and much faster.

Anonymous User · ‎06-03-2016

OK, so it sounds like the fastest thing to do is NOT to store it in SDE, but to store the imagery, tile overviews and pyramids all in a file geodatabase and publish this? And bring Esri basemaps in reprojected to fit our State Plane system?

CodyBenkelman · ‎06-03-2016

Kevin

From your description and use of terminology, I am concerned that there is still confusion.

No, imagery should not be stored IN the geodatabase. "Imagery" includes full res source images, AND overviews, AND Pyramids. Those should all be stored as files on disk (typically Tiff), and this is IF you are serving a dynamic image service using a mosaic dataset. The mosaic dataset IS inside the GDB but it is only a data structure that points at the image files - a small fraction of the data volume of the imagery.
The FASTEST performance is (typically) from a cached tile service, which is different from a dynamic image service using a mosaic dataset. Do not confuse Overviews with Tile Cache. They are different formats, different structures, and served differently.
My qualifier above ("typically" fastest) presumes all data is in a single projection (web mercator OR state plane) but if you're using our base maps (web mercator) in a web client configured for state plane, those will be reprojected, and that will degrade performance - should not be a BIG slowdown, but it will be slower than a web map in web mercator. Given that requirement, it is very possible that serving imagery as a dynamic image service will be just as fast as cached imagery, and in that case you can save yourself a) the time and b) the disk space and c) the loss of image quality due to resampling that is required to generate raster tile cache.

My recommendation (if you have ArcGIS Server and the Image Extension which I believe you indicated you do) is to test this configuration first, and see if it's "fast enough":

Build mosaic dataset in state plane projection
ensure images have pyramids
build overviews on the mosaic dataset
serve as an image service
connect into your web client, with web client configured as state plane

Test without Esri base maps showing. Then add Esri base maps and determine if performance is slower, and also if the performance is fast enough.
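If it helps to put numbers on "fast enough", the server side of this test can be timed with a small script. The sketch below assumes the image service is reachable through the ArcGIS REST API exportImage operation. The service URL, the bounding box and the WKID are placeholders (2240 is assumed here for Georgia West in US feet; use the WKID of your own zone). It measures server response time only, not rendering in the web client or the base maps.

```python
import time
import urllib.parse
import urllib.request


def export_image_url(service_url, bbox, wkid, size=(1024, 768)):
    """Build an exportImage request for an ArcGIS image service."""
    params = {
        "bbox": ",".join(str(v) for v in bbox),  # xmin,ymin,xmax,ymax
        "bboxSR": wkid,
        "imageSR": wkid,  # same SR as the bbox, so no reprojection is asked for
        "size": f"{size[0]},{size[1]}",
        "format": "jpgpng",
        "f": "image",
    }
    return f"{service_url.rstrip('/')}/exportImage?{urllib.parse.urlencode(params)}"


def fetch(url, timeout=30):
    """Download the response body and discard it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def time_requests(urls, fetch_fn=fetch):
    """Return the elapsed seconds for each request, in order."""
    timings = []
    for url in urls:
        start = time.perf_counter()
        fetch_fn(url)
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical service URL, extent and WKID; replace all three.
SERVICE = "https://example.com/arcgis/rest/services/Aerials/ImageServer"
WKID = 2240
urls = [export_image_url(SERVICE, (2200000, 1350000, 2202000, 1351500), WKID)]
# Uncomment to run against a real service:
# print(time_requests(urls))
```

Requesting several different extents and scales gives a fairer picture than repeating one request, since repeated requests may be answered from a cache somewhere along the way.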

IF this shows performance problems, then I'd create a raster tile cache for a *subset* of your area (in state plane), and serve that as a cached map service. Refer to the LTS link I gave you above if you don't know how to do this. Then test this subset to verify if it's faster.

I'd also give the user the ability to turn off the Base maps, and that should speed performance when the user is simply viewing your imagery and does not need the base maps (hidden under the imagery)

Side note - this should not be slower than ~1.5 to 2 seconds to repaint a screen when panning and zooming. If it is, we should look at other causes outside ArcGIS Server - e.g. disk to server bottlenecks, or network bandwidth anywhere between the server and your office. Unless the service is improperly configured, it will not take longer than that to serve imagery from a dynamic image service, and it should be well under 1 second for raster tile cache w/o reprojection.

Cody B.