XSLT stylesheet for parsing and updating metadata.

04-25-2011 12:51 PM
DennisH
New Contributor II
We have a standard line placed in the summary of each of our SDE feature classes that states when the data was last updated.

e.g. "Data Last Updated: 2010-12-04"

Up until now, updating this line has been a tedious, manual process. Can someone provide an example XSLT stylesheet that parses the metadata for a string like "Data Last Updated: 2010-12-04" and replaces it with a new date?

Any help would be appreciated!
4 Replies
DennisH
New Contributor II
Anyone?
denisedavis
Occasional Contributor III
Have you tried asking in the SDE forum?
RachelS_
New Contributor III
I am also dealing with this for multiple items. Have you looked at this help topic?
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Editing_metadata_for_many_ArcGIS_items...
KimOllivier
Occasional Contributor III
My solution for merely updating the date uses a Python module called ElementTree. It was written some time ago and may have been superseded by better tools, but it does the job well for me at 9.3.
http://docs.python.org/library/xml.etree.elementtree.html
http://effbot.org/zone/element-index.htm

I export the metadata for a given feature class to XML. Unfortunately, you cannot otherwise access the metadata in a geodatabase directly. At 9.3 there is no tool to unload metadata, so you have to do this manually in ArcCatalog; this is fixed at 10.0. If the feature class were a shapefile, its metadata would already be an XML file.

Then I load this exported file into a Python structure with ElementTree, which makes it easy to find each date element and change it. ElementTree then writes the edited structure out faithfully to a revised XML file. (While I was there, I also cleared the processing history.)
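
As a minimal sketch of the same ElementTree approach applied to the string in the original question (this is not part of the script further down): the helper `update_stamp`, the pattern `STAMP`, and the sample fragment are all hypothetical names introduced for illustration. The sketch assumes the stamp sits in the text of some element and scans every element, so it does not depend on which tag holds the summary.

```python
import re
import xml.etree.ElementTree as ET

# Matches the stamp from the original question,
# e.g. "Data Last Updated: 2010-12-04". Group 1 keeps the label.
STAMP = re.compile(r"(Data Last Updated:\s*)\d{4}-\d{2}-\d{2}")

def update_stamp(root, new_date):
    """Replace the date in every 'Data Last Updated: YYYY-MM-DD' stamp.

    Scans the text of every element under root (hypothetical helper).
    Returns the number of elements whose text was changed.
    """
    changed = 0
    for elem in root.iter():
        if elem.text and STAMP.search(elem.text):
            # A function replacement avoids ambiguity between the group
            # reference and the digits of the new date.
            elem.text = STAMP.sub(lambda m: m.group(1) + new_date, elem.text)
            changed += 1
    return changed

# Hypothetical fragment; real exported metadata has many more elements,
# and the element holding the summary is an assumption here.
sample = ("<metadata><dataIdInfo>"
          "<idPurp>Parcels. Data Last Updated: 2010-12-04</idPurp>"
          "</dataIdInfo></metadata>")
root = ET.fromstring(sample)
count = update_stamp(root, "2011-04-25")
new_text = root.find("dataIdInfo/idPurp").text
```

For a file on disk, the same helper could be called on `ET.parse(path).getroot()` and the tree written back with `tree.write(...)`, as the script below does for its date elements.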

Finally there is a tool that will load metadata into a geodatabase.
# LoadMetadata.py
# with altered dates for current month
# using element tree
# create original Metadata with same name as layer
# run this to alter to name_et.xml
# reload into filegeodatabase
# Note there is no tool to unload metadata
# 15 March 2010

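import arcgisscripting
import sys,os,datetime
# tree.iter() needs ElementTree 1.3 or later (bundled with Python 2.7)
try :
    import elementtree.ElementTree as ET   # standalone package, as originally used
except ImportError :
    import xml.etree.ElementTree as ET     # standard library version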
print
print

def alter(xmlfile,edDate,publishDate,createDate) :
    """
    read xml file for featureclass or table
    change dates to today,loading date and LINZ extract date
    empty processing logs
    write out file with _et suffix
    return file name
    """
    # print xmlfile
    tree = ET.parse(xmlfile)
    ## print tree.getroot().tag, tree.getroot().text,tree.getroot().tail,tree.getroot().attrib

    # Edition Date
    elem = list(tree.iter("resEdDate"))[0] 
    # print elem.tag,elem.text
    elem.text = edDate
    ## print elem.text

    # Reference Date 001 (Creation)
    elem = list(tree.iter("refDate"))[0] 
    # print elem.tag,elem.text
    elem.text = createDate
    ## print elem.text
    # note there may be two of these dates
    # DateTypCd 001 and 002
    # Reference Date 002 (Publication)
    if len(list(tree.iter("refDate"))) > 1 :
        elem = list(tree.iter("refDate"))[1]
        # print elem.tag,elem.text
        elem.text = publishDate
    else :
        # print "Skipping publication date",xmlfile
        pass
    ## print elem.text
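    # clear out lineage if it exists
    try :
        lin = list(tree.iter("lineage"))[0]
        # print lin.tag
        lin.clear()
        lin.text = "Cleared"
    except IndexError :
        # no lineage element in this file
        gp.AddMessage("Skipping clear lineage")
    # insert _et before the extension only, even if the path contains other dots
    base, ext = os.path.splitext(xmlfile)
    outfile = base + "_et" + ext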
    tree.write(outfile)
    return outfile

# ---------------------- main ----------------------

# create the geoprocessor first so gp.AddMessage is available below
gp = arcgisscripting.create(9.3)

try :
    publishDate = sys.argv[1]
    createDate  = sys.argv[2]
    if createDate == '#' :
        createDate = publishDate
    gp.AddMessage(publishDate+" "+str(type(publishDate)))
except IndexError :
    # no dates supplied: default to the first Saturday of the current month
    today = datetime.datetime.now()
    first = today.replace(day=1)
    # % 7 keeps the offset in 0..6 when the month starts on a Sunday
    firstSat = first + datetime.timedelta((5 - first.weekday()) % 7)
    publishDate = firstSat.strftime("%Y%m%d")
    createDate  = publishDate

# override
# publishDate = '20100804'
# createDate = '20100710'

os.chdir("e:/crs/customer/conservation/metadata")
edDate = str(datetime.datetime.now().date()).replace("-","")
edDate = publishDate # '20100914'
gp.AddMessage(edDate+" edit date")
gp.AddMessage(publishDate+" publish date")
gp.AddMessage(createDate+" create date")

ws = "e:/crs/customer/conservation/corax.gdb"
metasrc = "e:/crs/customer/conservation/metadata"
gp.Workspace = ws

os.chdir(metasrc)
print
print ws
print metasrc
print



lstFC = gp.ListFeatureClasses("*")

for fc in lstFC :
    # print fc
    fcxml = metasrc+"/"+fc+".xml"
    if os.path.exists(fcxml) :
        etxml = alter(fcxml,edDate,publishDate,createDate) 
        gp.MetadataImporter_conversion(etxml,fc)
        print fc,"updated"
        gp.AddMessage(fc+" updated")
    else :
        print fcxml,"not found"
        gp.AddError(fcxml+" not found")
        
lstTab = gp.ListTables("*")
for tab in lstTab :
    # print tab
    tabxml = metasrc+"/"+tab+".xml"
    if os.path.exists(tabxml) :
        etxml = alter(tabxml,edDate,publishDate,createDate) 
        gp.MetadataImporter_conversion(etxml,tab)
        print tab,"updated"
        gp.AddMessage(tab+" updated")
    else :
        print tabxml,"not found"
        gp.AddError(tabxml+" not found")
# geodatabase metadata
etxml = alter(metasrc+"/corax.xml",edDate,publishDate,createDate)
gp.AddWarning("Load corax_et.xml to geodatabase by hand")

There are a lot more tools at 10 to handle metadata.
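
When no dates are passed in, the script above defaults the publish date to the first Saturday of the current month. As a standalone sketch of that arithmetic (the function name `first_saturday` is hypothetical and not part of the script), the offset is taken modulo 7 so that a month starting on a Sunday gives the 7th rather than a date in the previous month:

```python
import datetime

def first_saturday(day):
    """Return the first Saturday of the month containing `day`."""
    first = day.replace(day=1)
    # weekday(): Monday is 0 ... Saturday is 5, Sunday is 6.
    # The modulo keeps the offset in the range 0..6.
    offset = (5 - first.weekday()) % 7
    return first + datetime.timedelta(days=offset)

# May 2011 starts on a Sunday; April 2011 starts on a Friday.
may = first_saturday(datetime.date(2011, 5, 15))
april = first_saturday(datetime.date(2011, 4, 25))
stamp = may.strftime("%Y%m%d")   # same format the script uses
```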