As we were writing this book, we developed some utilities to make it easier to manipulate OpenDocument files. We hope they are equally useful to you.
OpenDocument uses the JAR format. Rather than having to unjar each document before running an XSLT transformation on it, we wrote this program, which lets you perform a transformation on a member of a JAR file without having to expand it. It also lets you create a JAR file (without a manifest) as output, if your output is intended to be used as an OpenDocument file.
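To make "without having to expand it" concrete, here is a minimal sketch of reading one member straight out of a packed file. It is not part of the downloadable examples: the class name MemberLookup and both method names are hypothetical, and the package is built in memory only so that the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/* Hypothetical helper class, for illustration only. */
class MemberLookup {

    /* Return the text of the named member, or null if it is not present. */
    static String readMember(byte[] packed, String memberName) {
        try (ZipInputStream zip =
                 new ZipInputStream(new ByteArrayInputStream(packed))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (memberName.equals(entry.getName())) {
                    /* The stream is now positioned at this member's data */
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[4096];
                    int n;
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
            return null;    // reached the end without finding the member
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /* Build a one-member package in memory so the sketch needs no files. */
    static byte[] packOne(String memberName, String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry(memberName));
                zip.write(text.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] pkg = packOne("content.xml", "<doc/>");
        System.out.println(readMember(pkg, "content.xml"));
        System.out.println(readMember(pkg, "styles.xml"));
    }
}
```

Returning null for a missing member lets the caller decide how to report the problem, rather than handing an exhausted stream to the XML parser.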
One other thing prevents us from using a standard transform program such as those provided with Xalan to process the manifest.xml file in an OpenDocument file. The manifest contains a <!DOCTYPE> declaration that references a file named Manifest.dtd. DTD files are used in validating documents. Even though we aren’t doing validation on the files, Xalan tries to resolve the file reference and fails, as Manifest.dtd does not exist in the JAR file before or after expansion. Example C.1, “Entity Resolver to Ignore DTDs” is a custom EntityResolver which eliminates this problem by providing an empty DTD whenever Xalan tries to resolve an external entity whose name ends with .dtd.
[This is file ResolveDTD.java in directory appc in the downloadable example files.]
Example C.1. Entity Resolver to Ignore DTDs
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class ResolveDTD implements EntityResolver {
    public InputSource resolveEntity (String publicId, String systemId)
    {
        if (systemId.endsWith(".dtd"))
        {
            StringReader stringInput = new StringReader(" ");
            return new InputSource(stringInput);
        }
        else
        {
            return null;    // default behavior
        }
    }
}
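As a quick check of the idea, the following sketch parses a manifest-like string whose <!DOCTYPE> points at a Manifest.dtd that does not exist. It assumes the JDK's built-in JAXP parser rather than Xalan's command-line tools, and the class and method names are hypothetical; the resolver logic is written inline so that the sketch stands alone.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/* Hypothetical demo class; not part of the downloadable examples. */
class ResolveDTDDemo {

    /* Parse XML text, answering every request for a .dtd with an empty DTD,
       and return the name of the root element. */
    static String rootNameIgnoringDTD(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) ->
                (systemId != null && systemId.endsWith(".dtd"))
                    ? new InputSource(new StringReader(" "))
                    : null);    // null requests the default behavior
            Document doc =
                builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String manifest =
            "<!DOCTYPE manifest:manifest PUBLIC"
            + " \"-//OpenOffice.org//DTD Manifest 1.0//EN\" \"Manifest.dtd\">"
            + "<manifest:manifest xmlns:manifest="
            + "\"urn:oasis:names:tc:opendocument:xmlns:manifest:1.0\"/>";
        System.out.println(rootNameIgnoringDTD(manifest));
    }
}
```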
Now that we have overcome the problem of the phantom DTD, we can write the main transformation program, ODTransform.java. It takes the following command line arguments:
-in
This is followed by the name of the input file. If you’re operating on a compressed document, then this is the name of the member within the OpenDocument file, which you must name with the -inOD argument.
-inOD
This is followed by the name of a compressed OpenDocument document.
-xsl
This is followed by the name of the XSLT file that does the transformation.
-out
This is followed by the name of the output file. If it is to be a member of an output OpenDocument file, then you must specify that file name with the -outOD argument.
-outOD
This is followed by the name of a compressed OpenDocument file that will be produced as output.
-param
This is followed by the name of a parameter to be passed to the transformation, and the value of that parameter.
Thus, if you are transforming a plain file to another plain file, you might have a command line like this:
ODTransform -in content.xml -xsl transform.xslt -out result.txt
To transform the content.xml file inside a document named myfile.odt, producing a non-compressed output file, you might have a command line like this:
ODTransform -inOD myfile.odt -in content.xml -xsl transform.xsl -out result.txt
And, to transform content.xml inside a document named myfile.odt to produce a new content.xml inside a result document named newfile.odt, your command line would be:
ODTransform -inOD myfile.odt -in content.xml -xsl transform.xslt -outOD newfile.odt -out content.xml
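The -param option corresponds to a call to Transformer.setParameter(), which supplies a value for a top-level <xsl:param> in the stylesheet. The sketch below shows that mechanism in isolation; the stylesheet, the parameter name title, and the class name are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/* Hypothetical demo class; not part of the downloadable examples. */
class ParamDemo {

    /* A tiny stylesheet that outputs the value of its "title" parameter. */
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='title'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='$title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /* Run the stylesheet on a dummy document with one parameter set. */
    static String transformWithParam(String name, String value) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setParameter(name, value);
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<doc/>")),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transformWithParam("title", "Chapter One"));
    }
}
```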
When creating an OpenDocument file as output, the program must also create a META-INF/manifest.xml file. The extension given in the -outOD parameter determines the media-type in the manifest.
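The lookup from extension to media type can be sketched with a map, as below. This is only an illustration of the idea under our own assumptions: the class name is hypothetical, only four of the OpenDocument extensions are listed, and an unknown extension yields null rather than a placeholder string.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/* Hypothetical helper class, for illustration only. */
class MediaTypeLookup {
    private static final String BASE = "application/vnd.oasis.opendocument.";
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        /* A partial table; the full program lists many more extensions */
        TYPES.put("odt", BASE + "text");
        TYPES.put("ods", BASE + "spreadsheet");
        TYPES.put("odp", BASE + "presentation");
        TYPES.put("odg", BASE + "graphics");
    }

    /* Return the media type for a file name, or null if it is not known.
       The extension is compared without regard to case. */
    static String mediaTypeFor(String fileName) {
        int dotPos = fileName.lastIndexOf('.');
        if (dotPos < 0) {
            return null;    // no extension at all
        }
        String extension =
            fileName.substring(dotPos + 1).toLowerCase(Locale.ROOT);
        return TYPES.get(extension);
    }

    public static void main(String[] args) {
        System.out.println(mediaTypeFor("newfile.odt"));
    }
}
```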
Example C.2, “XSLT Transformation for OpenDocument files” shows the code, which you will find in file ODTransform.java in directory appc in the downloadable example files.
Example C.2. XSLT Transformation for OpenDocument files
/*
 * ODTransform.java
 * (c) 2003-2005 J. David Eisenberg
 * Licensed under LGPL
 *
 * Program purpose: to perform an XSLT transformation
 * on a member of an OpenDocument file, either
 * after unzipping or while still in its zipped state.
 * Output may go to a normal file or a zipped file.
 */

import javax.xml.transform.TransformerFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerConfigurationException;

import org.xml.sax.XMLReader;
import org.xml.sax.InputSource;
import org.xml.sax.ContentHandler;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLReaderFactory;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import java.util.Hashtable;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.JarEntry;
import java.util.Vector;
import java.util.zip.ZipException;

public class ODTransform
{
    String inputFileName = null;    // input file name, or member name...
    String inputODName = null;      // ...if given an OpenDocument input file
    String outputFileName = null;   // output file name, or member name...
    String outputODName = null;     // ...if given an OpenDocument output file
    String xsltFileName = null;     // XSLT file is always a regular file
    Vector params = new Vector();   // parameters to be passed to transform

    public void doTransform( )
    throws TransformerException, TransformerConfigurationException,
        SAXException, ZipException, IOException
    {
        /* Set up the XSLT transformation based on the XSLT file */
        File xsltFile = new File( xsltFileName );
        StreamSource streamSource = new StreamSource( xsltFile );
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer( streamSource );

        /* Set up parameters for transform */
        for (int i=0; i < params.size(); i += 2)
        {
            transformer.setParameter((String) params.elementAt(i),
                (String) params.elementAt(i + 1));
        }

        /* Create an XML reader which will ignore any DTDs */
        XMLReader reader = XMLReaderFactory.createXMLReader();
        reader.setEntityResolver( new ResolveDTD() );

        InputSource inputSource;

        if (inputODName == null)
        {
            /* This is an unpacked file. */
            inputSource = new InputSource(
                new FileInputStream( inputFileName ) );
        }
        else
        {
            /* The input file should be a member of an OD file.
               Check to see if the input file name really exists
               within the JAR file */
            JarInputStream jarStream =
                new JarInputStream( new FileInputStream( inputODName ),
                    false );
            JarEntry jarEntry;
            while ( (jarEntry = jarStream.getNextJarEntry() ) != null &&
                !(inputFileName.equals(jarEntry.getName()) ) )
                // do nothing
                ;
            if (jarEntry == null)
            {
                /* Ran off the end of the JAR file without a match */
                throw new IOException( inputFileName + " not found in "
                    + inputODName );
            }
            inputSource = new InputSource( jarStream );
        }

        SAXSource saxSource = new SAXSource( reader, inputSource );
        saxSource.setSystemId( inputFileName );

        if (outputODName == null)
        {
            /* We want a regular file as output */
            FileOutputStream outputStream =
                new FileOutputStream( outputFileName );
            transformer.transform( saxSource,
                new StreamResult( outputStream ) );
            outputStream.close();
        }
        else
        {
            /* The output file name is the name of a member of
               a JAR file (which we will build without a manifest) */
            JarOutputStream jarStream =
                new JarOutputStream( new FileOutputStream( outputODName ) );
            JarEntry jarEntry = new JarEntry( outputFileName );
            jarStream.putNextEntry( jarEntry );
            transformer.transform( saxSource,
                new StreamResult( jarStream ) );

            /* Close the member file, then add the manifest */
            jarStream.closeEntry();
            createManifestFile( jarStream );

            /* Close the JAR file to complete the file */
            jarStream.close();
        }
    }

    /* Check to see if the command line arguments make sense */
    private void checkArgs( String[] args )
    {
        int i;

        if (args.length == 0)
        {
            showUsage( );
            System.exit( 1 );
        }

        i = 0;
        while ( i < args.length )
        {
            if (args[i].equalsIgnoreCase("-in"))
            {
                if ( i+1 >= args.length) { badParam("-in"); }
                inputFileName = args[i+1];
                i += 2;
            }
            else if (args[i].equalsIgnoreCase("-out"))
            {
                if ( i+1 >= args.length) { badParam("-out"); }
                outputFileName = args[i+1];
                i += 2;
            }
            else if (args[i].equalsIgnoreCase("-xsl"))
            {
                if ( i+1 >= args.length) { badParam("-xsl"); }
                xsltFileName = args[i+1];
                i += 2;
            }
            else if (args[i].equalsIgnoreCase("-inod"))
            {
                if ( i+1 >= args.length) { badParam("-inOD"); }
                inputODName = args[i+1];
                i += 2;
            }
            else if (args[i].equalsIgnoreCase("-outod"))
            {
                if ( i+1 >= args.length) { badParam("-outOD"); }
                outputODName = args[i+1];
                i += 2;
            }
            else if (args[i].equalsIgnoreCase("-param"))
            {
                if ( i+2 >= args.length) { badParam("-param"); }
                params.addElement( args[i+1] );
                params.addElement( args[i+2] );
                i += 3;
            }
            else
            {
                System.out.println( "Unknown argument " + args[i] );
                System.exit( 1 );
            }
        }

        if (inputFileName == null)
        {
            System.out.println("No input file name specified.");
            System.exit( 1 );
        }
        if (outputFileName == null)
        {
            System.out.println("No output file name specified.");
            System.exit( 1 );
        }
        if (xsltFileName == null)
        {
            System.out.println("No XSLT file name specified.");
            System.exit( 1 );
        }
    }

    /* If not enough arguments for a parameter, show error and exit */
    private void badParam( String paramName )
    {
        System.out.println("Not enough parameters to " + paramName);
        System.exit(1);
    }

    /*
        Creates the manifest file for a compressed OpenDocument file.
        The mType array contains pairs of filename extensions and
        corresponding mimetypes. The comparison to find the extension
        is done in a case-insensitive manner.
    */
    private void createManifestFile( JarOutputStream jarStream )
    {
        String [] mType = {
            "odt", "application/vnd.oasis.opendocument.text",
            "ott", "application/vnd.oasis.opendocument.text-template",
            "odg", "application/vnd.oasis.opendocument.graphics",
            "otg", "application/vnd.oasis.opendocument.graphics-template",
            "odp", "application/vnd.oasis.opendocument.presentation",
            "otp", "application/vnd.oasis.opendocument.presentation-template",
            "ods", "application/vnd.oasis.opendocument.spreadsheet",
            "ots", "application/vnd.oasis.opendocument.spreadsheet-template",
            "odc", "application/vnd.oasis.opendocument.chart",
            "otc", "application/vnd.oasis.opendocument.chart-template",
            "odi", "application/vnd.oasis.opendocument.image",
            "oti", "application/vnd.oasis.opendocument.image-template",
            "odf", "application/vnd.oasis.opendocument.formula",
            "otf", "application/vnd.oasis.opendocument.formula-template",
            "odm", "application/vnd.oasis.opendocument.text-master",
            "oth", "application/vnd.oasis.opendocument.text-web",
        };

        JarEntry jarEntry;
        int dotPos;
        String extension;
        String mimeType = null;
        String outputStr;

        dotPos = outputODName.lastIndexOf(".");
        extension = outputODName.substring( dotPos + 1 );
        for (int i=0; i < mType.length && mimeType == null; i+=2)
        {
            if (extension.equalsIgnoreCase( mType[i] ))
            {
                mimeType = mType[i+1];
            }
        }

        if (mimeType == null)
        {
            System.err.println("Cannot find mime type for extension "
                + extension );
            mimeType = "UNKNOWN";
        }

        try
        {
            jarEntry = new JarEntry( "META-INF/manifest.xml");
            /* Begin the member; writing without this call fails */
            jarStream.putNextEntry( jarEntry );

            jarStream.write( "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                .getBytes() );
            jarStream.write( "<!DOCTYPE manifest:manifest PUBLIC \"-//OpenOffice.org//DTD Manifest 1.0//EN\" \"Manifest.dtd\">"
                .getBytes() );
            jarStream.write("<manifest:manifest xmlns:manifest=\"urn:oasis:names:tc:opendocument:xmlns:manifest:1.0\">"
                .getBytes() );

            outputStr = "<manifest:file-entry manifest:media-type=\"" +
                mimeType + "\" manifest:full-path=\"/\"/>";
            jarStream.write( outputStr.getBytes() );

            outputStr = "<manifest:file-entry manifest:media-type=\"text/xml\" manifest:full-path=\"" +
                outputFileName + "\"/>";
            jarStream.write( outputStr.getBytes() );

            jarStream.write("</manifest:manifest>".getBytes() );
            jarStream.closeEntry();
        }
        catch (IOException e)
        {
            System.err.println("Cannot write file:");
            System.err.println( e.getMessage() );
        }
    }

    /* If no arguments are provided, show this brief help section */
    private void showUsage( )
    {
        System.out.println("Usage: ODTransform options");
        System.out.println("Options:");
        System.out.println(" -in inputFilename");
        System.out.println(" -xsl transformFilename");
        System.out.println(" -out outputFilename");
        System.out.println("If the input filename is within an OpenDocument file, then:");
        System.out.println(" -inOD inputOpenDocFileName");
        System.out.println("If you wish to output an OpenDocument file, then:");
        System.out.println(" -outOD outputOpenDocumentFileName");
        System.out.println("To pass a parameter to the transformation:");
        System.out.println(" -param name value");
        System.out.println( );
        System.out.println("Argument names are case-insensitive.");
    }

    public static void main(String[] args)
    {
        ODTransform transformApp = new ODTransform( );
        transformApp.checkArgs( args );
        try
        {
            transformApp.doTransform( );
        }
        catch (Exception e)
        {
            System.out.println("Unable to transform");
            System.out.println(e.getMessage());
        }
    }
}
You need to set the class path correctly in order to use this program; you may either set your CLASSPATH system environment variable, or set it up in a shell script. Example C.3, “Invoking the Transform Program” shows the bash shell script that we used. You will find it in file odtransform.sh in directory appc in the downloadable example files.
Example C.3. Invoking the Transform Program
java -cp /usr/local/xmljar/xalan-j_2_6_0/bin/xalan.jar:\
/usr/local/xmljar/xalan-j_2_6_0/bin/xercesImpl.jar:\
/usr/local/xmljar/xalan-j_2_6_0/bin/xml-apis.jar:\
/usr/local/xmljar/xalan-exts/utils.jar:\
/usr/local/odbook/ODTransform \
    ODTransform $1 $2 $3 $4 $5 $6 $7 $8 $9 ${10} ${11} ${12} ${13} ${14} \
    ${15} ${16} ${17}
As an application of the preceding script, we present an alternate method of indenting the unpacked files via a simple XSLT transformation. Example C.4, “XSLT Transformation for Indenting” shows this transformation, which simply copies the entire document tree while setting indent to yes in the <xsl:output> element.
Example C.4. XSLT Transformation for Indenting
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/">
    <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>
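If you would rather try the indenting stylesheet without any files at all, the following sketch applies it to a string using the JDK's built-in XSLT processor. The class and method names are hypothetical, and the exact amount of indentation you see depends on the processor and its version, so treat the output as illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/* Hypothetical demo class; not part of the downloadable examples. */
class IndentDemo {

    /* The indenting stylesheet, held as a string */
    static final String INDENT_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' indent='yes'/>"
        + "<xsl:template match='/'>"
        + "<xsl:copy-of select='.'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /* Copy the whole document, letting the serializer add line breaks. */
    static String indent(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(
                    new StreamSource(new StringReader(INDENT_XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indent("<a><b>x</b><c/></a>"));
    }
}
```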
We now present a Perl program to invoke this transformation on all the XML files in an unpacked OpenDocument file. We will need to set two paths: one to the transformation script, and one to the location of the preceding XSLT transformation. Make sure you use absolute paths for setting variables $script_location and $transform_location, because find() changes directories as it traverses the directory tree. This is file od_indent.pl in the appc directory in the downloadable example files.
Example C.5. Program to Indent OpenDocument Files via XSLT
#!/usr/bin/perl

use File::Find;

#
#   This program indents XML files within a directory.
#   A simple XSLT transform is used to indent the XML.
#

#
#   Path where you have installed the OpenDocument transform script.
#
$script_location = "/your/path/to/odtransform.sh";

#
#   Path where you have installed the XSLT transformation.
#
$transform_location = "/your/path/to/od_indent.xsl";

if (scalar @ARGV != 1)
{
    print "Usage: $0 directory\n";
    exit;
}

if (!-e $script_location)
{
    print "Cannot find the transform script at $script_location\n";
    exit;
}

if (!-e $transform_location)
{
    print "Cannot find the XSLT transformation file at " ,
        "$transform_location\n";
    exit;
}

$dir_name = $ARGV[0];

if (!-d $dir_name)
{
    print "The argument to $0 must be the name of a directory\n";
    print "containing XML files to be indented.\n";
    exit;
}

#
#   Indent all XML files.
#
find(\&indent, $dir_name);

#   Warning:
#   This subroutine creates a temporary file with the format
#   __tempnnnn.xml, where nnnn is the current time( ). This
#   will avoid name conflicts when used with OpenOffice.org documents,
#   even though the technique is not sufficiently robust for general use.
#
sub indent
{
    my $xmlfile = $_;
    my $command;
    my $result;

    if ($xmlfile =~ m/\.xml$/)
    {
        $time = time();
        print "Indenting $xmlfile\n";
        $command = "$script_location " .
            "-in $xmlfile -xsl $transform_location -out __temp$time.xml";
        $result = system( $command );
        if ($result == 0 && -e "__temp$time.xml")
        {
            unlink $xmlfile;
            rename "__temp$time.xml", $xmlfile;
        }
        else
        {
            print "Error occurred while indenting $xmlfile\n";
        }
    }
}
This process may insert newlines in text as well as between elements. In cases where elements contain other elements, this is not a problem, as OpenDocument ignores whitespace between elements. When expanding text elements, though, the extra newlines could cause extra spaces to appear when repacking the document. Thus, you should use this method to indent the XML document only when you do not want to repack the resulting files.
When using XSLT with OpenDocument files, you will want to make sure you have declared all the appropriate namespaces. Rather than selecting exactly the namespaces that your document uses, we provide all of the namespaces for OpenDocument in Example C.6, “XSLT Framework for Transforming OpenDocument”, which you may use as a framework for your transformations. This is file framework.xsl in directory appc in the downloadable example files.
Example C.6. XSLT Framework for Transforming OpenDocument
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
    xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
    xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
    xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0"
    xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
    xmlns:anim="urn:oasis:names:tc:opendocument:xmlns:animation:1.0"

    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:math="http://www.w3.org/1998/Math/MathML"
    xmlns:xforms="http://www.w3.org/2002/xforms"

    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
    xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
    xmlns:smil="urn:oasis:names:tc:opendocument:xmlns:smil-compatible:1.0"

    xmlns:ooo="http://openoffice.org/2004/office"
    xmlns:ooow="http://openoffice.org/2004/writer"
    xmlns:oooc="http://openoffice.org/2004/calc"
>

<xsl:template match="/office:document-content">
    <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>
If you are creating an OpenDocument file from a file where white space has been preserved, you will have to convert runs of spaces into <text:s> elements, and convert tabs and line feeds into <text:tab-stop> and <text:line-break> elements. This task is not easily done in native XSLT. Example C.7, “Transforming Whitespace to OpenDocument XML” is a Java extension for Xalan which will do what you need. You will note that we create elements and attributes complete with namespace prefix. This is certainly not a recommended practice, but createElementNS() and setAttributeNS() create xmlns attributes rather than a prefixed name. You will find this Java code in file ODWhiteSpace.java in directory appc in the downloadable example files.
Example C.7. Transforming Whitespace to OpenDocument XML
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.apache.xpath.NodeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class ODWhiteSpace
{
    public ODWhiteSpace () {}

    public static NodeList compressString( String str )
    {
        ODWhiteSpace whiteSpace = new ODWhiteSpace();
        return whiteSpace.doCompress( str );
    }

    private Document tempDoc;       // necessary for creating elements
    private StringBuffer strBuf;    // where non-whitespace accumulates
    private NodeSet resultSet;      // the value to be returned
    private int pos;                // current position in string
    private int startPos;           // where blanks begin accumulating
    private int nSpaces;            // number of consecutive spaces
    private boolean inSpaces;       // handling spaces?
    private char ch;                // current character in buffer
    private char prevChar;          // previous character in buffer
    private Element element;        // element to be added to node list

    /**
     * Create OpenDocument elements for a string.
     * @param str the string to compress.
     * @return a NodeList for insertion into an OpenDocument file
     */
    public NodeList doCompress( String str )
    {
        if (str.length() == 0)
        {
            return null;
        }

        tempDoc = null;
        strBuf = new StringBuffer( str.length() );

        try
        {
            tempDoc = DocumentBuilderFactory.newInstance().
                newDocumentBuilder().newDocument();
        }
        catch(ParserConfigurationException pce)
        {
            return null;
        }

        resultSet = new NodeSet();
        resultSet.setShouldCacheNodes(true);

        pos = 0;
        startPos = 0;
        nSpaces = 0;
        inSpaces = false;
        ch = '\u0000';

        while (pos < str.length())
        {
            prevChar = ch;
            ch = str.charAt( pos );
            if (ch == ' ')
            {
                if (inSpaces)
                {
                    nSpaces++;
                }
                else
                {
                    emitText( );
                    nSpaces = 1;
                    inSpaces = true;
                    startPos = pos;
                }
            }
            else if (ch == 0x000a || ch == 0x000d)
            {
                if (prevChar != 0x000d) // ignore LF or CR after CR.
                {
                    emitPending( );
                    element = tempDoc.createElement("text:line-break");
                    resultSet.addNode(element);
                }
            }
            else if (ch == 0x09)
            {
                emitPending( );
                element = tempDoc.createElement("text:tab-stop");
                resultSet.addNode(element);
            }
            else
            {
                if (inSpaces)
                {
                    emitSpaces( );
                }
                strBuf.append( ch );
            }
            pos++;
        }

        emitPending( );     // empty out anything that's accumulated
        return resultSet;
    }

    /**
     * Emit accumulated spaces or text
     */
    private void emitPending( )
    {
        if (inSpaces)
        {
            emitSpaces( );
        }
        else
        {
            emitText( );
        }
    }

    /**
     * Emit accumulated text.
     * Creates a text node with currently accumulated text.
     * Side effect: empties accumulated text buffer
     */
    private void emitText( )
    {
        if (strBuf.length() != 0)
        {
            Text textNode = tempDoc.createTextNode( strBuf.toString( ) );
            resultSet.addNode( textNode );
            strBuf = new StringBuffer( );
        }
    }

    /**
     * Emit accumulated spaces.
     * If these are leading blanks, emit only a
     * <text:s> element; otherwise a blank plus
     * a <text:s> element (if necessary)
     * Side effect: sets accumulated number of spaces to zero.
     * Side effect: sets "inSpaces" flag to false
     */
    private void emitSpaces( )
    {
        if (nSpaces != 0)
        {
            if (startPos != 0)
            {
                Text textNode = tempDoc.createTextNode( " " );
                resultSet.addNode( textNode );
                nSpaces--;
            }
            if (nSpaces >= 1 || startPos == 0)
            {
                element = tempDoc.createElement( "text:s" );
                element.setAttribute( "text:c",
                    Integer.toString( nSpaces ) );
                resultSet.addNode( element );
            }
            inSpaces = false;
            nSpaces = 0;
        }
    }
}
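To see the conversion rules without setting up a Xalan extension, here is a simplified sketch that produces the markup as a string instead of a node list. It is an approximation under our own assumptions: the class name is hypothetical, a CR followed by LF counts as one line break (any other CR or LF produces its own break), and only the characters & and < are escaped.

```java
/* Hypothetical string-based sketch of the whitespace rules. */
class WhiteSpaceSketch {

    /* Convert runs of spaces, tabs, and line ends to OpenDocument markup. */
    static String encode(String str) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            char ch = str.charAt(i);
            if (ch == ' ') {
                int start = i;
                while (i < str.length() && str.charAt(i) == ' ') {
                    i++;
                }
                int nSpaces = i - start;
                if (start != 0) {
                    /* Not leading blanks: keep one ordinary space */
                    out.append(' ');
                    nSpaces--;
                }
                if (nSpaces > 0) {
                    out.append("<text:s text:c=\"")
                       .append(nSpaces).append("\"/>");
                }
            } else if (ch == '\t') {
                out.append("<text:tab-stop/>");
                i++;
            } else if (ch == '\r' || ch == '\n') {
                out.append("<text:line-break/>");
                i++;
                /* Treat CR LF as a single line break */
                if (ch == '\r' && i < str.length()
                        && str.charAt(i) == '\n') {
                    i++;
                }
            } else {
                if (ch == '&') {
                    out.append("&amp;");
                } else if (ch == '<') {
                    out.append("&lt;");
                } else {
                    out.append(ch);
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("name:   value\tdone"));
    }
}
```

A run of three spaces in the middle of the text becomes one ordinary space plus a <text:s> element with a count of two; a run at the very start of the string becomes a <text:s> element alone.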
This is the same program as Example 2.3, “Program show_meta.pl”, except that it uses the XML::SAX module instead of XML::Simple. XML::SAX is a Perl module for the Simple API for XML, which interfaces to an event-driven parser. The parser issues many kinds of events as it parses a document; the ones we are interested in are the events that occur when an element starts, when it ends, and when we encounter the element’s text content. To use XML::SAX, you must specify a handler object, which is a Perl package that contains subroutines that are called when the parser detects events. The handler subroutines receive two parameters: a reference to the parser, and a data hash with information about the event. Here are the subroutines that we will implement, the keys from the data hash that we are interested in, and how we will use their values.
start_element
This subroutine is called whenever the parser detects an opening tag for an element. The relevant keys are Name, which gives the element name, and Attributes, a hash with one entry per attribute; each entry is itself a hash with Name and Value keys.
The program will store the element name in a scalar $element and the attributes in a global array @attributes. It sets a global scalar $text to the null string; this variable will be used to collect all the element’s text content.
characters
This subroutine is called whenever the parser detects a series of characters within an element. The relevant key is Data, which holds the characters.
The text is concatenated to the end of the $text variable. This is necessary because a single sequence of text may generate multiple calls to the character handler.
end_element
This subroutine is called whenever the parser detects a closing tag for an element. The relevant key is Name, which gives the element name.
Upon encountering the end of an element, the program will add the element name as a key in a hash named %element_info. The hash value will be an anonymous array consisting of the $text content followed by the @attributes array.
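For readers more at home in Java, the same three-handler pattern can be sketched with the JDK's SAX parser. This is an analogue of the approach described above, not a translation of the Perl program: the class name MetaSaxSketch and the method collect are hypothetical, and the result is a map from element name to a list holding the text content followed by attribute names and values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/* Hypothetical handler class, for illustration only. */
class MetaSaxSketch extends DefaultHandler {
    private final Map<String, List<String>> info = new LinkedHashMap<>();
    private List<String> attributes = new ArrayList<>();
    private StringBuilder text = new StringBuilder();

    /* Element start: remember its attributes and clear the text. */
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        attributes = new ArrayList<>();
        for (int i = 0; i < atts.getLength(); i++) {
            attributes.add(atts.getQName(i));
            attributes.add(atts.getValue(i));
        }
        text = new StringBuilder();
    }

    /* Text may arrive in several pieces, so accumulate it. */
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    /* Element end: store the text followed by the attribute pairs. */
    @Override
    public void endElement(String uri, String localName, String qName) {
        List<String> entry = new ArrayList<>();
        entry.add(text.toString());
        entry.addAll(attributes);
        info.put(qName, entry);
    }

    /* Parse a string and return the collected information. */
    static Map<String, List<String>> collect(String xml) {
        try {
            MetaSaxSketch handler = new MetaSaxSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.info;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<office:meta xmlns:office='o' xmlns:dc='d'>"
            + "<dc:title>Report</dc:title></office:meta>";
        System.out.println(collect(xml).get("dc:title"));
    }
}
```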
Here is the rewritten program, which you will find in file sax_show_meta.pl in the appc directory in the downloadable example files.
Example C.8. Program sax_show_meta.pl
#!/usr/bin/perl

#
#   Show meta-information in an OpenDocument file.
#
use XML::SAX;
use IO::File;
use Text::Wrap;
use Carp;

use strict 'vars';

my $suffix;         # file suffix
my $parser;         # instance of XML::SAX parser
my $handler;        # module that handles elements, etc.
my $filehandle;     # file handle for piped input
my $info;           # the hash returned from the parser
my @attributes;     # attributes from a returned element
my %attr_hash;      # hash of attribute names and values

#
#   Check for one argument: the name of the OpenDocument file
#
if (scalar @ARGV != 1)
{
    croak("Usage: $0 document");
}

#
#   Get file suffix for later reference
#
($suffix) = $ARGV[0] =~ m/\.(\w\w\w)$/;

#
#   Create an object containing handlers for relevant events.
#
$handler = MetaElementHandler->new();

#
#   Create a parser and tell it where to find the handlers.
#
$parser =
    XML::SAX::ParserFactory->parser( Handler => $handler);

#
#   Input to the parser comes from the output of member_read.pl
#
$ARGV[0] =~ s/[;|'"]//g;    # eliminate dangerous shell metacharacters
$filehandle = IO::File->new( "perl member_read.pl $ARGV[0] meta.xml |" );

#
#   Parse and collect information.
#
$parser->parse_file( $filehandle );

#
#   Retrieve the information collected by the parser
#
$info = $handler->get_info();

#
#   Output phase
#
print "Title: $info->{'dc:title'}[0]\n"
    if ($info->{'dc:title'}[0]);

print "Subject: $info->{'dc:subject'}[0]\n"
    if ($info->{'dc:subject'}[0]);

if ($info->{'dc:description'}[0])
{
    print "Description:\n";
    $Text::Wrap::columns = 60;
    print wrap("\t", "\t", $info->{'dc:description'}[0]), "\n";
}

print "Created: ";
print format_date($info->{'meta:creation-date'}[0]);
print " by $info->{'meta:initial-creator'}[0]"
    if ($info->{'meta:initial-creator'}[0]);
print "\n";

print "Last edit: ";
print format_date($info->{"dc:date"}[0]);
print " by $info->{'dc:creator'}[0]"
    if ($info->{'dc:creator'}[0]);
print "\n";

#
#   Take attributes from the meta:document-statistic element
#   (if any) and put them into %attr_hash
#
@attributes = @{$info->{'meta:document-statistic'}};
if (scalar(@attributes) > 1)
{
    shift @attributes;      # discard the text content
    %attr_hash = @attributes;

    if ($suffix eq "odt")
    {
        print "Pages: $attr_hash{'meta:page-count'}\n";
        print "Words: $attr_hash{'meta:word-count'}\n";
        print "Tables: $attr_hash{'meta:table-count'}\n";
        print "Images: $attr_hash{'meta:image-count'}\n";
    }
    elsif ($suffix eq "ods")
    {
        print "Sheets: $attr_hash{'meta:table-count'}\n";
        print "Cells: $attr_hash{'meta:cell-count'}\n"
            if ($attr_hash{'meta:cell-count'});
    }
}

#
#   A convenience subroutine to make dates look
#   prettier than ISO-8601 format.
#
sub format_date
{
    my $date = shift;
    my ($year, $month, $day, $hr, $min, $sec);
    my @monthlist = qw (Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

    ($year, $month, $day, $hr, $min, $sec) =
        $date =~ m/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})/;
    return "$hr:$min on $day $monthlist[$month-1] $year";
}

package MetaElementHandler;

my %element_info;   # the data structure that we are creating
my $element;        # name of element being processed
my @attributes;     # attributes for this element
my $text;           # text content of the element

sub new
{
    my $class = shift;
    my %opts = @_;
    bless \%opts, $class;
}

sub reset
{
    my $self = shift;
    %$self = ();
}

#
#   Store current element and its attributes.
#
sub start_element
{
    my ($self, $parser_data) = @_;
    my $hashref;
    my $item;       # loop control variable

    $element = $parser_data->{"Name"};
    @attributes = ();   # discard attributes of any earlier element
    foreach $item (keys %{$parser_data->{"Attributes"}})
    {
        $hashref = $parser_data->{"Attributes"}{$item};
        push @attributes, $hashref->{"Name"}, $hashref->{"Value"};
    }
    $text = "";     # no text content yet.
}

#
#   Create an entry into a hash for the element that is ending
#
sub end_element
{
    my ($self, $parser_data) = @_;

    $element = $parser_data->{"Name"};
    $element_info{$element} = [$text, @attributes];
}

#
#   Accumulate element's text content.
#
sub characters
{
    my ($self, $parser_data) = @_;
    $text .= $parser_data->{"Data"};
}

#
#   Return a reference to the %element_info hash
#
sub get_info
{
    my $self = shift;
    return \%element_info;
}
If you need to create multiple directory levels, but your system doesn’t have the equivalent of Linux’s mkdir --parents option, use the program shown in Example C.9, “Program to Create Directories”, which is in file make_directories.pl in the appc directory in the downloadable example files.
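If you are already working in Java, the standard library can create every missing level in one call. The sketch below is not the make_directories.pl program; the class and method names are hypothetical, and it relies only on java.nio.file.Files.createDirectories(), which does nothing for levels that already exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/* Hypothetical helper class, for illustration only. */
class MakeDirectories {

    /* Create the directory and any missing parent directories. */
    static Path makeAll(String path) {
        try {
            return Files.createDirectories(Paths.get(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String dir : args) {
            System.out.println("Created " + makeAll(dir));
        }
    }
}
```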
Copyright (c) 2005 O’Reilly & Associates, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".