Split comma separated values to multiple rows

861
21
Jump to solution
03-15-2024 05:59 AM
Ed_
by MVP Regular Contributor
MVP Regular Contributor

Based on the sample data below I am trying to split each comma separated value in `MUNICIPALI` column to a new as shown in desired output. However, I am getting an error that `Projects_City` cannot open, it does not exist, but then I thought line 6 and onwards will create it. How can I fix this issue? On a side note, the comma separated values are a lot more than the sample data. For, example, one of them rows have 18 or something comma separated values so, will line 10 work for that? 

Code:

 

 

 

 

import arcpy
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
out_df = r"O:\MyProject.gdb\Projects_City_Updated" # Doesn't exist, expect it to be created in line 6
fldlst = ["OBJECTID", "PRJID", "MUNICIPALI"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
    with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
        for row in searchCursor:
            for attrbPart in row[3].split(","):
                insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))

 

 

 

 

 

Error:

 

 

 

 

Traceback (most recent call last):
  File "o:\path\SplitCommas2Row.py", line 6, in <module>
    with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
RuntimeError: cannot open 'O:\MyProject.gdb\Projects_City_Updated'

 

 

 

 

 

Update

After creating the table, please see updated code below, I am getting the following error, how can I fix it?

 

Code:

 

 

 

import arcpy

# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
    out_path=r"O:\MyProject.gdb",
    out_name="Projects_City_Updated",
    template=in_df,
    config_keyword="",
    out_alias=""
)

out_df = r"O:\MyProject.gdb\Projects_City_Updated"

fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]

with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
    with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
        for row in searchCursor:
            for attrbPart in row[3].split(","):
                insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))

 

 

 

 

Error:

 

 

 

Traceback (most recent call last):
  File "o:\SplitCommas2Row.py", line 22, in <module>
    for attrbPart in row[3].split(","):
IndexError: tuple index out of range

 

 

 

Sample Data:

OBJECTIDPRJTIDMUNICIPALIZIPCODE
11A1
22A, B1,2
33A, B, C1,2,3
44A, B, C, D1,2,3,4
55A1
66C3

 

Desired output:

OBJECTIDPRJTIDMUNICIPALIZIPCODE
11A1
22A1
32B2
43A1
53B2
63C3
74A1
84B2
94C3
104D4
115A1
126C3
Question | Analyze | Visualize
Tags (1)
0 Kudos
21 Replies
Ed_
by MVP Regular Contributor
MVP Regular Contributor

Hi @JoshuaBixby hope all is well, so the code works fine however there's a slight issue and that's there are municipalities in the data sets that are made of two or more words so they have space(s) in their names. And so the code for some reason removes the spaces from the names. How can I fix that?  Moreover, there are some municipalities whose names are composed of 4 words such as `Lauderdale By The Sea` 

Input data sample row:

Ed__0-1710867223209.png

Output (from script) data sample row:

Ed__1-1710867280536.png

 

 

 

Question | Analyze | Visualize
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

I updated the code in my original post, try it out and let me know.