Joining Large files (900,000 records)

04-03-2017 11:52 AM
BrianRizzo
New Contributor II

A join operation takes about 12 hours to complete in Pro (900,000 records with about 10 variables). The same operation in Desktop takes about 6 hours. Is this normal? If so, what does it say about handling big data, given that 900,000 records is not really big data?

Platform: recent i7, 16 GB RAM, SSD.

14 Replies
BrianRizzo
New Contributor II

Joshua, Kory

Thanks for the replies. The join is 1:1. I exported a table from a point file and processed it in SPSS, which generated an output file that added a new field to the table. Initially I joined this new table, which was in CSV form, to the original point file. The join worked well, but saving the join takes a very long time, as indicated previously. I also created a GDB table from the CSV and joined this to the point file. Again the join worked, but the save operation took as long as the previous attempt. Interestingly, I tried both solutions in Desktop and Pro. Desktop processes the save in about half the time: 6 hours!

I'm writing a script as suggested by Dan and hope this addresses my problem.

Brian 
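Dan's suggested script is not shown in this thread. As a rough sketch of one common way to script a 1:1 join like the one Brian describes, the new field can be read into a dictionary keyed on the join field and then attached to the point records in a single pass. The field names `POINT_ID` and `SCORE` and the inline sample data are hypothetical stand-ins for the exported point table and the SPSS output.

```python
import csv
import io

# Hypothetical stand-ins for the exported point table and the SPSS output.
points_csv = io.StringIO("POINT_ID,X,Y\n1,10.0,20.0\n2,11.5,21.5\n3,12.0,19.0\n")
spss_csv = io.StringIO("POINT_ID,SCORE\n3,0.7\n1,0.2\n2,0.5\n")

# Build the lookup once: join key -> value of the new field.
lookup = {row["POINT_ID"]: row["SCORE"] for row in csv.DictReader(spss_csv)}

# Single pass over the points, attaching the new field by key.
joined = []
for row in csv.DictReader(points_csv):
    row["SCORE"] = lookup.get(row["POINT_ID"])  # None when there is no match
    joined.append(row)
```

Because each lookup goes straight to the matching key, the work grows roughly in proportion to the number of records rather than with the product of the two table sizes. Within ArcGIS the same pattern is typically written with `arcpy.da` cursors instead of CSV readers.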

JoshuaBixby
MVP Esteemed Contributor

What about indexes? Is the join field indexed in both data sets?

Even without indexes, I still can't get my head around why it would take so long. I have worked with data sets of a million records or more. Sure, they are slower than what most people are used to, but 6 hours to export? That just doesn't make sense to me.

BrianRizzo
New Contributor II

Thanks everyone for the help. It turns out that INDEXING BOTH files is critical, especially when the files become very large.

Thanks everyone. Sorry to take up your time. I should have been more careful reading the help! To be honest, I'm not sure how indexing works, so I had no idea it could have such a big impact.
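For anyone else unsure why an index matters so much, here is a small sketch of the general idea. It is an analogy only, not how the geodatabase implements its indexes: without an index, every record has to scan the other table for its match; with an index, the keys are sorted once and each match is found by binary search. The function names and sample data are made up for illustration.

```python
import bisect

def join_by_scan(points, table):
    """Join without an index: scan the table for every point."""
    steps = 0
    joined = []
    for pid, name in points:
        for tid, value in table:
            steps += 1  # one key comparison
            if tid == pid:
                joined.append((pid, name, value))
                break
    return joined, steps

def join_with_index(points, table):
    """Join with an index: sort the keys once, then binary-search them."""
    index = sorted((tid, pos) for pos, (tid, _) in enumerate(table))
    keys = [tid for tid, _ in index]
    joined = []
    for pid, name in points:
        i = bisect.bisect_left(keys, pid)
        if i < len(keys) and keys[i] == pid:
            joined.append((pid, name, table[index[i][1]][1]))
    return joined

n = 2000
points = [(i, f"pt{i}") for i in range(n)]
table = [(i, i * 10) for i in reversed(range(n))]  # stored in a different order

scan_result, scan_steps = join_by_scan(points, table)
index_result = join_with_index(points, table)
```

Both functions return the same joined rows, but the scan version needs on the order of n²/2 comparisons when the tables are ordered differently, while a binary search over 900,000 sorted keys needs only about 20 comparisons per record.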

BrianWade
Occasional Contributor

Just checking: did you index only the join fields? Also, how long did it take after you indexed?

Thanks

IanMurray
Frequent Contributor
0 Kudos