comma delimited field

4629
8
Jump to solution
02-18-2014 02:56 PM
AmyKlug
Occasional Contributor III
Hello!

I am working with a table and I would like to get this:

number(stored as text)        Code        Year
0500, 0600, 0700               ABC1         2012

to this:

number           Code          Year
0500               ABC1        2012
0600               ABC1        2012             
0700               ABC1        2012

I have gotten to where I can insert each number/space/comma into a new row but obviously I need it to break it at the commas. Ideas?
Tags (2)
0 Kudos
1 Solution

Accepted Solutions
XanderBakker
Esri Esteemed Contributor
Hi Amy,

I think this will do the job:

import arcpy, os  # access your table(s) intbl = r'C:\path\to\your\FileGeodatabase.gdb\inputtable' outtbl = r'C:\path\to\your\FileGeodatabase.gdb\outputtable'  # optionally create output table, using the input table as schema # CreateTable_management (out_path, out_name, {template}, {config_keyword}) outpath, outname = os.path.split(outtbl) arcpy.CreateTable_management(outpath, outname, intbl)  # have a close look at the fieldnames, some names may be restricted  # and are changed by ArcGIS when imported from for instance Excel flds = ['number','Code','Year']  # do a search cursor on the input table and an insert cursor on the output table outcurs = arcpy.da.InsertCursor(outtbl, flds) with arcpy.da.SearchCursor(intbl, flds) as incurs:     for row in incurs:         numbers = row[0].split(',')         for number in numbers:             # for each 'number' a record is written             outcurs.insertRow((number.strip(), row[1], row[2]))  del outcurs


Kind regards,

Xander

View solution in original post

0 Kudos
8 Replies
XanderBakker
Esri Esteemed Contributor
Hi Amy,

I think this will do the job:

import arcpy, os  # access your table(s) intbl = r'C:\path\to\your\FileGeodatabase.gdb\inputtable' outtbl = r'C:\path\to\your\FileGeodatabase.gdb\outputtable'  # optionally create output table, using the input table as schema # CreateTable_management (out_path, out_name, {template}, {config_keyword}) outpath, outname = os.path.split(outtbl) arcpy.CreateTable_management(outpath, outname, intbl)  # have a close look at the fieldnames, some names may be restricted  # and are changed by ArcGIS when imported from for instance Excel flds = ['number','Code','Year']  # do a search cursor on the input table and an insert cursor on the output table outcurs = arcpy.da.InsertCursor(outtbl, flds) with arcpy.da.SearchCursor(intbl, flds) as incurs:     for row in incurs:         numbers = row[0].split(',')         for number in numbers:             # for each 'number' a record is written             outcurs.insertRow((number.strip(), row[1], row[2]))  del outcurs


Kind regards,

Xander
0 Kudos
RobertBorchert
Frequent Contributor III
Export it to a text file. open it in word.

Replace the commas with tabs and the space with tabs if that is correct for you.

Add the column headers at the top and import it back in.

Hello!

I am working with a table and I would like to get this:

number(stored as text)        Code        Year
0500, 0600, 0700               ABC1         2012

to this:

number           Code          Year
0500               ABC1        2012
0600               ABC1        2012             
0700               ABC1        2012

I have gotten to where I can insert each number/space/comma into a new row but obviously I need it to break it at the commas. Ideas?
0 Kudos
JamesCrandall
MVP Frequent Contributor
Just to be different and offer an alternative, you could employ the numpy and pandas libraries.  I just tend to do lots of dev outside of the ESRI stack for processing things, so this might be handy.


import arcpy
import pandas as pd
import numpy as np

input_table = r'H:\Documents\ArcGIS\Default.gdb\fliptab'

##convert the input_table into a numpy array
nparr = arcpy.da.TableToNumPyArray(input_table, ['number', 'code', 'year'])

##convert the array into a pandas data frame
df = pd.DataFrame(nparr.tolist(), columns=['number', 'code', 'year'])

##deconstruct the column containing the commas and transpose into rows
s = df['number'].apply(lambda x: pd.Series(x.split(','))).stack()
s.index = s.index.droplevel(-1)
s.name = 'number'

##join the original dataframe and the temp series
del df['number']
outdf = df.join(s)

##convert the df into a numpy array
out_nparr = np.array(outdf.to_records(), np.dtype([('code', '|S10'), ('year', np.int32), ('number', '|S10')]))
outTable = r'H:\Documents\ArcGIS\Default.gdb\fliptab_output'

##finally covert the numpy array back into the gdb table
arcpy.da.NumPyArrayToTable(out_nparr, outTable, ("code", "year", "number"))      
XanderBakker
Esri Esteemed Contributor
While we're at it, using part of the idea James suggested (the numpy part) and mixing it with some list comprehensions I came up with this:

import arcpy, numpy
intbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\intable'
outtbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\tst01'
flds = ('number_','Code','Year_')

# list comprehensions 
lst_in = [row for row in arcpy.da.SearchCursor(intbl, flds)]
lst_out = [(num, row[1], row[2]) for row in lst_in for num in row[0].split(', ')]

# use numpy to store the table
npa = numpy.array(lst_out, numpy.dtype([('number', '|S10'), ('code', '|S10'), ('year', numpy.int32)]))
arcpy.da.NumPyArrayToTable(npa, outtbl, ("number", "code", "year"))



Kind regards,

Xander
AmyKlug
Occasional Contributor III
Hi Amy,

I think this will do the job:



You are correct sir! Thanks again Xander


EDIT:

Oh hey, I didn't see the last 2 posts. Nice!
0 Kudos
JamesCrandall
MVP Frequent Contributor
While we're at it, using part of the idea James suggested (the numpy part) and mixing it with some list comprehensions I came up with this:

import arcpy, numpy
intbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\intable'
outtbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\tst01'
flds = ('number_','Code','Year_')

# list comprehensions 
lst_in = [row for row in arcpy.da.SearchCursor(intbl, flds)]
lst_out = [(num, row[1], row[2]) for row in lst_in for num in row[0].split(', ')]

# use numpy to store the table
npa = numpy.array(lst_out, numpy.dtype([('number', '|S10'), ('code', '|S10'), ('year', numpy.int32)]))
arcpy.da.NumPyArrayToTable(npa, outtbl, ("number", "code", "year"))



Kind regards,

Xander


Nice!  I pulled from one of our impmentations with lots of pandas DataFrame processing, but I like this suggestion for just numpy work.
0 Kudos
XanderBakker
Esri Esteemed Contributor
Nice!  I pulled from one of our impmentations with lots of pandas DataFrame processing, but I like this suggestion for just numpy work.


Hi James, it's posts like yours (thinking outside the box) that lets me learn something new every day... (+1 for that).

Kind regards,

Xander
0 Kudos
JamesCrandall
MVP Frequent Contributor
Hi James, it's posts like yours (thinking outside the box) that lets me learn something new every day... (+1 for that).

Kind regards,

Xander


Well I appreciate that!  I really like seeing the ways to write more "pythonic" --- it's so difficult coming from the C#/.NET/ArcObjects world to realign my thought process.
0 Kudos