High performance computing with ArcGIS

03-22-2015 05:22 PM
PatriciaCarbajales-Dale
New Contributor III

Hello,

We are trying to set up geospatial services in our high performance computing cluster running Linux. Somebody at ESRI talked to one of our faculty and mentioned the possibility of installing ArcGIS Server with ArcGIS Engine on top. I'm doubtful this will be the solution for using multiple nodes for faster performance and processing of geospatial data.

I know of other solutions with Python, GDAL, and R. Just wondering if anybody has tried and successfully implemented something like this with ESRI products.

Thank you in advance,

Patricia

SarahAmbrose
Esri Contributor

Hi Patricia,

Have you checked out gis-tools-for-hadoop? It’s available on GitHub, and may be what you are looking for. There is also a Big Data group on GeoNet where you can ask questions about GIS Tools for Hadoop.

PatriciaCarbajales-Dale
New Contributor III

Hi Sarah,

Thank you for your answer. Yes, we have worked briefly with the GIS Tools for Hadoop. However, somebody at the Dev Summit mentioned that the current tools are going away in the near future. Any confirmation on this would be extremely helpful.

Thank you,

Patricia

SarahAmbrose
Esri Contributor

Hi Patricia,

There are no plans to deprecate or remove the GIS Tools for Hadoop.

Let me know if you have any more questions,

Sarah

PatriciaCarbajales-Dale
New Contributor III

Thanks so much, Sarah. This is very helpful!

MiguelMorgado
Occasional Contributor

Hi Patricia, what kind of data do you want to serve?

If what you want is to set up map services to be used by web applications, then ArcGIS Server (AGS) in high availability behind a load balancer would be the solution.

If what you want is to use geoprocessing tools, then you probably need to look at it a different way.

Can you provide more details?

PatriciaCarbajales-Dale
New Contributor III

Hi Miguel,

Thank you for your reply. The goal is to work with large datasets and perform intensive analysis and processing. I do not have any use for web applications, just pure analysis of large data or hundreds/thousands of geospatial files. For example: a process where we have 25,000 raster files and we want to reproject them, aggregate fields, and calculate some statistics. The process on one desktop can take up to 3 months. If we can distribute the process (for example, using traditional high-throughput computing), we can run the same model in just hours.
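Since each raster is processed independently, this kind of batch is embarrassingly parallel and maps naturally onto a cluster array job. Below is a minimal Python sketch of the pattern: split the file list into chunks, one per cluster job, and let each job process only its own chunk. The names `split_batch` and `process_chunk` are illustrative, `process_chunk` is a placeholder for the real per-file work (e.g. a GDAL or arcpy reprojection call), and the `SLURM_ARRAY_TASK_ID` environment variable assumes a SLURM-style array scheduler.

```python
import math
import os

def split_batch(paths, n_jobs):
    """Split a list of raster paths into n_jobs roughly equal chunks,
    one chunk per cluster job (e.g. one SLURM array task)."""
    size = math.ceil(len(paths) / n_jobs)
    return [paths[i:i + size] for i in range(0, len(paths), size)]

def process_chunk(chunk):
    # Placeholder for the real per-file step: reproject each raster
    # (e.g. with gdal.Warp or an arcpy tool), aggregate fields, and
    # compute statistics. Here it just records that each file was seen.
    return [f"{path}: done" for path in chunk]

if __name__ == "__main__":
    # 25,000 rasters spread over 100 cluster jobs -> 250 files per job.
    paths = [f"tile_{i:05d}.tif" for i in range(25000)]
    jobs = split_batch(paths, 100)
    # Each array task reads its own index and processes only its chunk.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", 0))
    results = process_chunk(jobs[task_id])
```

With 100 such jobs running concurrently, the serial months-long run shrinks to roughly 1/100th of the wall-clock time, which is the high-throughput effect described above.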

Let me know if this makes sense or you need more details.

Regards,

Patricia

MiguelMorgado
Occasional Contributor

OK, understood. Well, you can easily set up AGS in HA with hundreds of VMs processing data; however, I doubt this would be the best option available.

I would use Python tools or other specialized software.

M

curtvprice
MVP Esteemed Contributor

miguelmorgadoKD2, is this still your opinion even if the tools involved aren't easily available in other software — for example, more specialized geoprocessing tools like the hydro raster tools?
