Get multiple JSON from a bucket using AWS API service and convert/save to single CSV

2812
0
04-29-2022 09:06 PM
ChristopherCharles-Noriega
New Contributor II

I need assitance with my Python script.

So the code below uses the Boto3 library to get a JSON file from the AWS API and converts/saves it to a CSV.

But what I need help with is getting multiple JSON files and converting/saving them all to a single CSV file, I've achieved this in the past (see bottom block code below) but I'm unsure how to do this with this particular API AWS script.  

Note that I will be using the script to get more than 100 files

Can anyone help? View the code below. 

So the first file is 'data/JSON_GetAll_page_1.txt', but I need to get:

data/JSON_GetAll_page_2.txt

data/JSON_GetAll_page_3.txt

data/JSON_GetAll_page_4.txt

data/JSON_GetAll_page_5.txt

View code block 5.

 

import boto3

client = boto3.resource('s3',
                        aws_access_key_id = '',
                        aws_secret_access_key = '',
                        region_name = ''
)

import os
data_path = "/saved_csvfile/"
import json

content_object = client.Object('bucket', 'data/JSON_GetAll_page_1.txt')
file_content = content_object.get()['Body'].read().decode('utf-8')
json_content = json.loads(file_content)
print(json_content['payload'])
import pandas as pd
df = pd.DataFrame(json_content['payload'])
print('Converting from JSON to CSV file ...')
for i in range(2,6):
    content_object = client.Object('bucket', 'data/JSON_GetAll_page_{i}.txt')
    file_content = content_object.get()['Body'].read().decode('utf-8')
    new_df = pd.DataFrame(json_content['payload'])

    df = pd.concat([df, new_df])
print('Successfully Converted')
df.to_csv('JSON_GetAll_page_1-5.csv',index=False)
df.to_csv(os.path.join(data_path, 'JSON_GetAll_page_1-5.csv'))
print('CSV Saved!')

 

 

I have done this in the past with this script:

 

from arcgis.gis import GIS
import pandas as pd
import requests
import os

gis = GIS("home")
url = 'https://()JSON_GetAll_page_1.txt'
data_path = "/arcgis/home/data/"
res = requests.get(url)

df = pd.DataFrame(res.json()['payload'])
print('Converting from JSON to CSV file ...')

for i in range(2,3):
    # res = requests.get(f'https://()JSON_GetAll_page_{i}.txt')
    new_df = pd.DataFrame(res.json()['payload'])

    df = pd.concat([df, new_df])
print('Successfully Converted')

df.to_csv('JSON_GetAll_page_1.csv',index=False)
df.to_csv(os.path.join(data_path, 'JSON_GetAll_page_1.csv'))
print('CSV Saved!')

 

 

Tags (5)
0 Kudos
0 Replies