Tip: Convert from HTML to XML with HTML Tidy

By Benoit Marchal
2003-12-16
Further Processing

Because XHTML documents are valid XML documents, you can insert them into an XML workflow. More specifically, you can post-process them with regular XML tools (XSL, parsers, and the like).
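As a minimal sketch of what "regular XML tools" means here, the following Python snippet reads an XHTML fragment with a stock XML parser from the standard library. The fragment, the file names in it, and the helper name image_sources are assumptions made up for illustration; they are not taken from the article's listings.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment, standing in for a document cleaned up by HTML Tidy.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Photo gallery</title></head>
  <body>
    <p><img src="sunset.jpg" alt="Sunset"/></p>
    <p><img src="harbor.jpg" alt="Harbor"/></p>
  </body>
</html>"""

# XHTML elements live in this namespace, so queries must be namespace-aware.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def image_sources(xhtml_text):
    """Return the src attribute of every img element, in document order."""
    root = ET.fromstring(xhtml_text)
    return [img.get("src") for img in root.iterfind(".//x:img", XHTML_NS)]


if __name__ == "__main__":
    print(image_sources(XHTML))
```

The same parse would fail on typical legacy HTML (unclosed tags, unquoted attributes), which is why the conversion to XHTML has to come first.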

In fact, XHTML is not the vocabulary I would choose for this application. Because it's a publishing vocabulary, XHTML has very little structure: its markup says how a page is laid out, not what the data means. I prefer to maintain photo galleries through the ad hoc XML vocabulary shown in Listing 3 (originally introduced in my tip, "Divide and conquer large XML documents"). To illustrate an XML workflow, I have written a small XSL stylesheet (see Listing 4) that retrieves the titles, file names, dates, and descriptions from the XHTML document and generates a more structured version that is easier to work with.
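The idea of pulling titles, file names, dates, and descriptions out of XHTML and into a more structured vocabulary can be sketched in Python as well. This is not the stylesheet from Listing 4, and it does not reproduce the vocabulary from Listing 3. The XHTML layout (a div of class "photo" holding an h2, an img, and two classed paragraphs), the output element names (gallery, photo, title, date, description), and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical XHTML input; the real page layout may differ.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="photo">
      <h2>Sunset</h2>
      <img src="sunset.jpg" alt="Sunset"/>
      <p class="date">2003-08-14</p>
      <p class="description">Sunset over the bay.</p>
    </div>
  </body>
</html>"""


def to_gallery(xhtml_text):
    """Build a structured <gallery> tree from the assumed XHTML layout."""
    root = ET.fromstring(xhtml_text)
    gallery = ET.Element("gallery")
    for div in root.iterfind(".//x:div[@class='photo']", NS):
        # The file name becomes an attribute; the rest become child elements.
        src = div.find("x:img", NS).get("src")
        photo = ET.SubElement(gallery, "photo", file=src)
        ET.SubElement(photo, "title").text = div.findtext("x:h2", namespaces=NS)
        ET.SubElement(photo, "date").text = div.findtext(
            "x:p[@class='date']", namespaces=NS)
        ET.SubElement(photo, "description").text = div.findtext(
            "x:p[@class='description']", namespaces=NS)
    return gallery


if __name__ == "__main__":
    print(ET.tostring(to_gallery(SAMPLE), encoding="unicode"))
```

An XSL stylesheet expresses the same mapping declaratively, with one template per source pattern, which is usually the better fit once the transformation lives inside an XML workflow.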



Article Pages:
Preserve Legacy Web Sites With This Handy Utility
Tool Of The Trade
Listing 1. index.html (an excerpt)
Tidying Up
Listing 2. index.xml (an excerpt)
Further Processing
Listing 3. index-transform.xml (an excerpt)
Listing 4. cleanup.xsl
Conclusion

First published by IBM developerWorks



