Tip: Convert from HTML to XML with HTML Tidy
By Benoit Marchal
2003-12-16

Further Processing
Because XHTML documents are well-formed XML documents, you can insert them into an XML workflow. More specifically, you can post-process them with regular XML tools (XSL stylesheets, parsers, and the like).
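As a rough illustration of the parser route, the sketch below reads a small XHTML document with the XML parser in Python's standard library and lists the image file names. The sample markup, the `image_files` helper, and the file names are assumptions made for this example; they are not taken from the tip's listings.

```python
import xml.etree.ElementTree as ET

# Hypothetical Tidy output; a real gallery page will differ.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Holiday photos</title></head>
  <body>
    <p><img src="beach.jpg" alt="The beach"/></p>
    <p><img src="harbour.jpg" alt="The harbour"/></p>
  </body>
</html>"""

# XHTML elements live in this namespace, so queries must name it.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def image_files(xhtml_text):
    """Return the src attribute of every img element in the document."""
    root = ET.fromstring(xhtml_text)
    return [img.get("src") for img in root.iterfind(".//x:img", NS)]


print(image_files(XHTML))
```

The same call would fail on the original HTML if it contained unclosed tags, which is why the conversion to XHTML comes first. Named entities such as `&nbsp;` can still trip a plain XML parser that does not read the XHTML DTD, so real pages may need extra care.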
In fact, I am not entirely happy with the XHTML vocabulary for this application. Because it is a publishing vocabulary, XHTML carries very little structure, so I prefer to maintain photo galleries in the ad hoc XML vocabulary shown in Listing 3 (originally introduced in my tip "Divide and conquer large XML documents"). To illustrate an XML workflow, I have written a small XSL stylesheet (see Listing 4) that retrieves the titles, file names, dates, and descriptions from the XHTML document. The stylesheet generates a more structured version of the document that is easier to work with.
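The same kind of restructuring can be sketched outside XSL. The example below uses Python's standard library instead of a stylesheet, and everything specific in it is an assumption: the page layout (one `div` per photo with `date` and `description` classes), the `gallery` and `photo` element names, and the `to_gallery` and `text_of` helpers are hypothetical and are not the vocabulary of Listing 3 or the stylesheet of Listing 4.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical page layout: one div per photo, holding a heading,
# an image, a date paragraph, and a description paragraph.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Gallery</title></head>
  <body>
    <div>
      <h2>The beach</h2>
      <img src="beach.jpg" alt="The beach"/>
      <p class="date">2003-08-14</p>
      <p class="description">Low tide at sunset.</p>
    </div>
  </body>
</html>"""


def text_of(parent, path):
    """Return the stripped text of the first match, or '' if there is none."""
    node = parent.find(path, NS)
    return (node.text or "").strip() if node is not None else ""


def to_gallery(xhtml_text):
    """Build a structured (hypothetical) gallery element from the XHTML page."""
    root = ET.fromstring(xhtml_text)
    gallery = ET.Element("gallery")
    for div in root.iterfind(".//x:div", NS):
        img = div.find("x:img", NS)
        if img is None:
            continue  # skip divs that do not describe a photo
        photo = ET.SubElement(gallery, "photo", file=img.get("src", ""))
        ET.SubElement(photo, "title").text = text_of(div, "x:h2")
        ET.SubElement(photo, "date").text = text_of(div, "x:p[@class='date']")
        ET.SubElement(photo, "description").text = text_of(
            div, "x:p[@class='description']"
        )
    return gallery


print(ET.tostring(to_gallery(PAGE), encoding="unicode"))
```

Each photo ends up as one element with its file name, title, date, and description in predictable places, which is the property that makes a structured vocabulary easier to maintain than presentation markup.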
First published by IBM developerWorks