Workflow: Get started in Data Pipelines

06-15-2023 11:47 AM
DuncanMackey
Esri Contributor

To help you get started with Data Pipelines, here is an example workflow you can follow to learn about a variety of tools and features. In this workflow, you will create a feature layer in ArcGIS Online containing new housing building permits enhanced with median household income in Seattle, Washington, USA.

The steps to complete this workflow include loading data from two different types of sources, cleaning up and filtering the data, performing a spatial join, and finally writing to a feature layer in ArcGIS Online. To learn more about Data Pipelines, visit the documentation.

Let's get started!

Step 1: To begin building the data pipeline, you’ll connect to the building permit dataset from Seattle’s open data website.

  1. Navigate to the Inputs list and choose Public URL to add it to the diagram. Configure the following parameters:
    1. Set URL to “https://data.seattle.gov/api/views/76t5-zqzr/rows.csv?accessType=DOWNLOAD”
    2. Set Data format to “CSV or delimited”
    3. Click the Preview button to inspect the dataset.
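If you want to peek at the same file outside of Data Pipelines, here is a minimal Python sketch of what the preview does conceptually: read delimited text and show the header plus the first few rows. This is not how Data Pipelines is implemented. The helper names and the sample rows are hypothetical, and the PermitNum column is assumed for illustration; only the URL and the other column names come from this workflow.

```python
import csv
import io
import urllib.request

# URL from Step 1 of the workflow.
PERMITS_URL = "https://data.seattle.gov/api/views/76t5-zqzr/rows.csv?accessType=DOWNLOAD"


def download_text(url):
    """Fetch a URL and decode the body as UTF-8 (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8-sig")


def preview_csv(text, limit=5):
    """Return the header and the first `limit` rows of delimited text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        if len(rows) >= limit:
            break
        rows.append(row)
    return reader.fieldnames, rows


# Hypothetical sample standing in for the downloaded file.
sample = (
    "PermitNum,PermitClassMapped,Latitude,Longitude\n"
    "1001,Residential,47.61,-122.33\n"
    "1002,Non-Residential,47.62,-122.35\n"
)
header, first_rows = preview_csv(sample, limit=1)
print(header)
print(first_rows)
```

To try it against the real dataset, pass `download_text(PERMITS_URL)` to `preview_csv` instead of the sample text.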

Step 2: Filter the dataset to only include new residential building permits.

  1. Add a Filter by attribute tool from the Tools list and configure:
    1. Connect the Filter by attribute tool to the Public URL input from the previous step. Here is an image of what your data pipeline should look like after adding Filter by attribute: DuncanMackey_0-1686840251093.png
    2. Click the Build new query button to add a filter expression
    3. In the Query builder dialog that appears, select Expression and click Next
    4. Configure the query as follows:
      • PermitClassMapped is Residential
      • PermitTypeMapped is Building
      • PermitTypeDesc is New

DuncanMackey_1-1686840389295.png

    5. Click Add to save the filter expression
    6. Click the Preview button on the Filter by attribute tool pane to inspect the results
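For readers who think in code, the filter expression from this step can be sketched in plain Python. This is only an illustration of the logic, assuming the three conditions must all be true for a record to be kept. The sample records and the PermitNum column are hypothetical.

```python
def is_new_residential_building(row):
    """True when a permit record meets all three conditions from Step 2."""
    return (
        row.get("PermitClassMapped") == "Residential"
        and row.get("PermitTypeMapped") == "Building"
        and row.get("PermitTypeDesc") == "New"
    )


# Hypothetical records with the three fields used by the query.
permits = [
    {"PermitNum": "1001", "PermitClassMapped": "Residential",
     "PermitTypeMapped": "Building", "PermitTypeDesc": "New"},
    {"PermitNum": "1002", "PermitClassMapped": "Residential",
     "PermitTypeMapped": "Building", "PermitTypeDesc": "Addition/Alteration"},
    {"PermitNum": "1003", "PermitClassMapped": "Non-Residential",
     "PermitTypeMapped": "Building", "PermitTypeDesc": "New"},
]

filtered = [row for row in permits if is_new_residential_building(row)]
print([row["PermitNum"] for row in filtered])
```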

Step 3: Create a new point geometry from longitude and latitude fields to spatially enable the dataset.

  1. Add a Create geometry tool from the Tools list and configure as follows:
    1. Set Geometry type to "Point"
    2. Set Geometry format to "XYZ"
    3. Set X field to "Longitude"
    4. Set Y field to "Latitude"
    5. Click the Preview button and navigate to the Map preview tab to explore the dataset.
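Conceptually, this step turns two numeric columns into one geometry value. Here is a small Python sketch of that idea, using a GeoJSON-style dictionary as a stand-in for the geometry field. The function name and the choice to return None for missing or out-of-range coordinates are assumptions made for illustration, not a description of how the Create geometry tool handles bad values.

```python
def to_point(row, x_field="Longitude", y_field="Latitude"):
    """Build a GeoJSON-style Point from a record, or None if coordinates are unusable."""
    try:
        x = float(row[x_field])
        y = float(row[y_field])
    except (KeyError, TypeError, ValueError):
        # Missing field, empty string, or non-numeric text.
        return None
    if not (-180 <= x <= 180 and -90 <= y <= 90):
        # Outside the valid longitude/latitude range (also rejects NaN).
        return None
    return {"type": "Point", "coordinates": [x, y]}


# Hypothetical records: one valid, one with an empty coordinate.
good = to_point({"Longitude": "-122.33", "Latitude": "47.61"})
bad = to_point({"Longitude": "", "Latitude": "47.61"})
print(good)
print(bad)
```

Note that X maps to longitude and Y maps to latitude, which is the same order used in the tool parameters above.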

Step 4: Add the census income survey dataset to the data pipeline. This dataset is hosted in Living Atlas.

  1. Add a Feature layer source from the Inputs list
    1. Click Select item
    2. Click My content and select Living Atlas in the dropdown menu that appears
    3. Search for and select ACS Median Household Income Variables - Boundaries
    4. Click Confirm
    5. Set Layer to "Tract"

Step 5: Keep only the required data by selecting and renaming the median household income field.

  1. Add a Select fields tool from the Tools list
    1. Connect to Tract input added in the previous step
    2. Select "B19049_001E" (the income field we are interested in) and "shape"
  2. Add an Update fields tool
    1. Connect to Select fields tool added in the previous step
    2. Select "B19049_001E"
    3. Set New field name to median_household_income_12_months
    4. Your Updates parameter should now look like this: DuncanMackey_2-1686841144602.png
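The combined effect of Select fields and Update fields can be sketched in a few lines of Python: keep only the listed fields, then rename one of them. The function name, the sample record, and its values are hypothetical; only the field names "B19049_001E", "shape", and median_household_income_12_months come from this step.

```python
def select_and_rename(record, keep, renames):
    """Keep only the fields in `keep`, then apply the `renames` mapping."""
    selected = {name: record[name] for name in keep}
    return {renames.get(name, name): value for name, value in selected.items()}


# Hypothetical tract record; the values are placeholders, not real census data.
tract = {
    "GEOID": "00000000000",
    "B19049_001E": 85000,
    "shape": "<polygon geometry>",
    "NAME": "Example tract",
}

result = select_and_rename(
    tract,
    keep=["B19049_001E", "shape"],
    renames={"B19049_001E": "median_household_income_12_months"},
)
print(result)
```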

Step 6: Now join the two datasets with a spatial join, using the building permit dataset’s newly created point geometry field and the census dataset’s existing polygon geometry field.

  1. Add a Join tool from the Tools list, and configure as follows:
    1. Set Target dataset to "Create geometry"
    2. Set Join dataset to "Update fields: 1"
    3. Enable Use spatial relationship
    4. Set Spatial relationship to "Within"
    5. Confirm that the geometry fields auto-populated
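If you are curious what a "Within" spatial join does under the hood, here is a rough Python sketch: for each point, find the polygon that contains it and copy that polygon's attribute onto the point. It uses a simple ray casting test on a single polygon ring in planar coordinates, which is a simplification I am assuming for illustration and not the algorithm Data Pipelines uses. The function names and sample data are hypothetical, and keeping unmatched points with an empty income value is also an assumption.

```python
def point_in_ring(x, y, ring):
    """Ray casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons, field="median_household_income_12_months"):
    """Attach `field` from the first containing polygon to each point (None if no match)."""
    joined = []
    for point in points:
        x, y = point["geometry"]
        match = next(
            (poly for poly in polygons if point_in_ring(x, y, poly["ring"])), None
        )
        joined.append({**point, field: match[field] if match else None})
    return joined


# Hypothetical data: one square "tract" and two permits.
tracts = [{
    "ring": [(-122.4, 47.6), (-122.3, 47.6), (-122.3, 47.7), (-122.4, 47.7)],
    "median_household_income_12_months": 85000,
}]
permits = [
    {"PermitNum": "1001", "geometry": (-122.35, 47.65)},  # inside the square
    {"PermitNum": "1002", "geometry": (-122.20, 47.65)},  # outside the square
]

joined = spatial_join(permits, tracts)
print(joined)
```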

Step 7: Finally, write the result of the join to a feature layer.

  1. Add a Feature layer output from the Outputs list
    1. Confirm that Geometry field is set to GEOMETRY
    2. Set a unique Output name for the new feature layer
    3. Click Run in the diagram toolbar. This will process all of the data in both initial datasets, perform each operation, and write the data to the new feature layer. 
    4. Once complete, click Output results and open the new layer in Map Viewer to explore the results.

Here's an example of what your final diagram will look like if you have been following along:

DuncanMackey_3-1686841606402.png

To recap, we've loaded data from two different sources, done some basic transformations, filtered and spatially enabled a dataset, and spatially joined the two datasets. We then wrote the results of all this work to a new feature layer, ready for further use in ArcGIS Online. Nice!

Thanks for following along, and let us know if you would like to see additional content like this. We can't wait to see what you do with Data Pipelines!


4 Comments
ClayDonaldsonSWCA
Occasional Contributor II

Very exciting - are there plans to integrate the analysis tools available in web maps? It would be amazing to use Data Pipelines to buffer, intersect, etc.

SarahAmbrose
Esri Contributor

Thanks for the feedback @ClayDonaldsonSWCA 

We're working towards building a model-like experience for analysis as well, using the same components and interface. Currently, we plan to keep Data Pipelines as its own application, focused on data prep. Analysis and the model builder will be in Map Viewer, so that you can perform more interactive analysis and take advantage of things like filtering, symbology, and charting. You will be able to use the layers you create through Data Pipelines in analysis, so the two apps will be companions to one another.

Sarah Ambrose
Product Engineer, Web Analysis

JulienCHARBONNAUD
New Contributor III

Hi,

Are there plans to support CSV as an output format, instead of a layer? I would use it to update the CSV files used by Survey123 survey forms.

Thanks

Julien

BethanyScott
Esri Contributor

Hi @JulienCHARBONNAUD ,

Thank you for reaching out.

Currently, Data Pipelines is centered around data ingest of different data sources and writing them to hosted layers in ArcGIS Online.

I'm interested in learning more about your workflow. Could you please contact me directly at bscott@esri.com so we can discuss in more detail?

Thanks again!

Bethany