Listing GIS files on disc

03-12-2014 08:52 AM
BartDe_Bruyn
New Contributor
I found this script to list shapefiles, geodatabases, dBASE files, and coverages on a disc.
The script runs well, but when I run it a second time, data lines are added, so my listing gets larger every time.
I suppose it is because I use the yield statement? Is there a way to avoid this? I am a new user of Python/ArcPy. Can anyone help me a little? I have already read that yield is difficult for new users to understand.
Thanks in advance,
Bart De Bruyn
****************************************************************

import os
import arcpy

workspace = r"T:\test"
output = r"H:\testlist.txt"
outFile = open(output, "w") 
lijn=""
  
def inventory_data(workspace, datatypes):
     """
     Generates full path names under a catalog tree for all requested
     datatype(s).
     Parameters:
     workspace: string
         The top-level workspace that will be used.
     datatypes: string | list | tuple
         Keyword(s) representing the desired datatypes. A single
         datatype can be expressed as a string, otherwise use
         a list or tuple. See arcpy.da.Walk documentation
          for a full list.
     """
     for path, path_names, data_names in arcpy.da.Walk(
             workspace, datatype=datatypes):
         for data_name in data_names:
             yield os.path.join(path, data_name)

for feature_class in inventory_data(r"T:\test", "Any"):

    lijn = lijn + feature_class + "\n"
    outFile.write(lijn)

outFile.close()
6 Replies
MathewCoyle
Frequent Contributor
I'm not sure exactly, but try running this. Is it duplicating your entire list each time?

import os
import arcpy

workspace = r"C:\working\roger"
output = r"C:\testlist.txt"

def inventory_data(workspace, datatypes, output):
    """
    Writes the full path name of every item of the requested
    datatype(s) under a catalog tree to a text file.
    Parameters:
    workspace: string
        The top-level workspace that will be used.
    datatypes: string | list | tuple
        Keyword(s) representing the desired datatypes. A single
        datatype can be expressed as a string, otherwise use
        a list or tuple. See arcpy.da.Walk documentation
        for a full list.
    output: string
        Path of the text file to write; it is overwritten on every run.
    """
    with open(output, "w") as outFile:
        for path, path_names, data_names in arcpy.da.Walk(
                workspace, datatype=datatypes):
            for data_name in data_names:
                # Text mode already translates "\n" to the platform line
                # ending, so os.linesep is not needed here.
                outFile.write(os.path.join(path, data_name) + "\n")


inventory_data(workspace, "Any", output)




Edit: I think I see the issue: you have too many loops here. Try my revised code above.

Edit2: Silly me, the issue is just that the lijn variable keeps building up with each loop, and the whole accumulated string is written to the file on every pass. Just remove it entirely.

import os
import arcpy

workspace = r"T:\test"
output = r"H:\testlist.txt"


def inventory_data(workspace, datatypes):
    """
    Generates full path names under a catalog tree for all requested
    datatype(s).
    Parameters:
    workspace: string
        The top-level workspace that will be used.
    datatypes: string | list | tuple
        Keyword(s) representing the desired datatypes. A single
        datatype can be expressed as a string, otherwise use
        a list or tuple. See arcpy.da.Walk documentation
        for a full list.
    """
    for path, path_names, data_names in arcpy.da.Walk(
            workspace, datatype=datatypes):
        for data_name in data_names:
            yield os.path.join(path, data_name)

with open(output, "w") as outFile:
    for f in inventory_data(workspace, "Any"):
        # Write each path exactly once. Text mode translates "\n" to the
        # platform line ending, so os.linesep is not needed.
        outFile.write(f + "\n")
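
To see why the original listing grew, here is a minimal sketch that needs no arcpy. The three file names are made-up stand-ins for what arcpy.da.Walk would yield, and the two helper functions are hypothetical names used only for this illustration. The first mimics the original loop (append to lijn, then write all of lijn); the second writes each path once. It also suggests that yield itself is not the cause.

```python
import io

# Hypothetical stand-ins for the paths a generator would yield.
paths = ["roads.shp", "rivers.shp", "parcels.dbf"]


def write_accumulating(paths, out):
    """Mimic the original loop: grow one string and write it every pass."""
    lijn = ""
    for p in paths:
        lijn = lijn + p + "\n"
        out.write(lijn)  # writes everything collected so far, again


def write_each_once(paths, out):
    """Write each path exactly once."""
    for p in paths:
        out.write(p + "\n")


buggy = io.StringIO()
write_accumulating(paths, buggy)

fixed = io.StringIO()
write_each_once(paths, fixed)

# 3 paths give 1 + 2 + 3 = 6 lines in the accumulating version.
print(len(buggy.getvalue().splitlines()))
print(len(fixed.getvalue().splitlines()))
```

With n paths the accumulating version writes n(n+1)/2 lines, which is why the listing looked like it kept getting longer.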
BrianO_keefe
Occasional Contributor III

I'm new to Python. The 'workspace' variable is confusing to me. Do I need a file to be located at that location? Or should that point to my GIS DB server?

BlakeTerhune
MVP Regular Contributor

In this case, I think the workspace variable is simply where the search starts. You could also look into using arcpy.da.Walk().

IanMurray
Frequent Contributor

The workspace variable is the root directory you want searched for spatial data. If you set it to "C:/", it would check all folders and subfolders on the C drive for spatial data and write the results to a report. If you have a particular folder, say "D:/ForestryData", that contains folders and subfolders to be checked, and you use that as the workspace, it would check everything under "D:/ForestryData". I'm not sure whether you can point it at a database connection.
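
A rough illustration of "root directory plus everything below it", using the standard library's os.walk as a stand-in. This is an assumption made so the sketch runs without arcpy: arcpy.da.Walk yields the same kind of (path, names, names) tuples but also understands geodatabases, which os.walk does not. The folder layout and the list_under helper are invented for the example.

```python
import os
import shutil
import tempfile


def list_under(root):
    """Yield the full path of every file under root, at any depth."""
    for path, dir_names, file_names in os.walk(root):
        for name in file_names:
            yield os.path.join(path, name)


# Build a small throwaway tree to act as the "workspace".
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub", "deeper"))
for rel in ("roads.shp",
            os.path.join("sub", "rivers.shp"),
            os.path.join("sub", "deeper", "parcels.dbf")):
    open(os.path.join(root, rel), "w").close()

# Files in the root and in every subfolder are found from one start point.
found = sorted(os.path.relpath(p, root) for p in list_under(root))
print(found)

shutil.rmtree(root)  # clean up the throwaway tree
```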

BrianO_keefe
Occasional Contributor III

I work with enterprise databases (SQL Server), so I have to create SDE connections to datasets within databases. I don't have a mapped drive letter that will do this. Is there an option for enterprise-level database settings?

BlakeTerhune
MVP Regular Contributor

It's still just an SDE connection file. You can either save your connection file to a local (or network) drive, or you can refer to the connection files with a relative path in ArcCatalog as simply "Database Connections\MyConnectionFile.sde".

The ArcCatalog connection files are actually stored in a hidden folder at C:\Users\usernamehere\AppData\Roaming\ESRI\Desktop10.2\ArcCatalog

I like to have an explicit location for the connection file. If you use the ArcCatalog relative path, the available connections depend on which user is logged in to the machine running the script.
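
A minimal sketch of the explicit-location approach, reusing the generator from earlier in the thread. The folder C:\connections is a made-up example location, and the walk parameter is a hypothetical addition so the function can be tried without arcpy installed. It assumes arcpy.da.Walk accepts a connection file as its top-level workspace, which is worth checking against your ArcGIS version.

```python
import os


def inventory_data(workspace, datatypes="Any", walk=None):
    """Yield full path names of all requested datatypes under workspace.

    walk is a hypothetical hook for testing; by default arcpy.da.Walk
    is used, so arcpy is only imported when it is actually needed.
    """
    if walk is None:
        import arcpy
        walk = arcpy.da.Walk
    for path, path_names, data_names in walk(workspace, datatype=datatypes):
        for data_name in data_names:
            yield os.path.join(path, data_name)


# Assumed explicit location of a saved connection file (example path only).
sde_connection = r"C:\connections\MyConnectionFile.sde"

# With arcpy available, the listing would be produced like this:
# for item in inventory_data(sde_connection, "Any"):
#     print(item)
```

Because the path is spelled out, the script does not depend on which user's ArcCatalog folder happens to hold the connection.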
