Extensible Stylesheets
The browser has a peculiar dimensionality that natively supports moving around
complexity between different data and language domains. One of those domains is
XML (Extensible Markup Language) and it comes practically in the form of RSS
(Really Simple Syndication) Feeds, Atom Feeds, Site Maps, OPML (Outline
Processor Markup Language) Outlines and other various XML representations.
XSLT (Extensible Stylesheet Language Transformations) is a language for
transforming XML into different output formats. You can expose and transform
big or tiny blobs of atom.xml, rss.xml, sitemap.xml, and opml.xml files into
PDF (Portable Document Format) but that’s outside the scope of this article.
In some respects, XSLT is considered “dead” technology, but take the word dead
with a grain of salt. “X is dead, and Y killed it” is a common trope on the
Internet. You can find articles and comments for any X of your choosing.
DTD (Document Type Definition) for XHTML (EXtensible HyperText Markup Language)
in its source generated from XML; you’d be surprised.
XSLT operates as an XML templating language, and a rather verbose one at that.
If you go deep enough, the verbosity gets unwieldy, and like all programming
shenanigans it’s a perpetual rabbit hole. Here are my practical notes for
working with XML and XSLT in a web context for adding style and presentation
while maintaining a bit of sanity.
Formats, File Extensions, and MIME types
The XSLT transformations discussed here will be limited to XHTML output.
Raw XML in the browser has no styles associated with it (example) so styles are
added with XSLT (example).
The MIME (Multipurpose Internet Mail Extensions) type definition for XSLT is
application/xslt+xml. A file extension ending in .xsl or .xslt is the commonly
accepted and used form.
The MIME type definition for XHTML is application/xhtml+xml but it’s usually
served using the text/html content type so that browsers assume HTML instead of
XML parsing. XHTML has differences from HTML which you can take a look at in
this XHTML in a nutshell article.
HTML vs XHTML is an epic and historic flame war. Think tabs vs. spaces,
self-closing tags vs. non self-closing tags, or any other versus trope you can
imagine.
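For local testing, the content types above can be wired into Python's built-in static file server. This is only a rough sketch and not a production setup: the class name `FeedHandler` and the helper `serve` are names I made up, and the mapping simply restates the types named in this section.

```python
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler


class FeedHandler(SimpleHTTPRequestHandler):
    """Static file handler that knows the XSLT and XHTML content types."""

    # Copy the default extension table, then add the types discussed above.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xsl": "application/xslt+xml",
        ".xslt": "application/xslt+xml",
        ".xhtml": "application/xhtml+xml",
    }


def serve(directory=".", port=8000):
    """Serve `directory` on localhost until interrupted (hypothetical helper)."""
    handler = partial(FeedHandler, directory=directory)
    HTTPServer(("127.0.0.1", port), handler).serve_forever()
```

Nothing starts until `serve()` is called, for example `serve("public", 8000)` where `public` is whatever folder holds the feed and the stylesheet.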
XML Validation and Formatting
You can validate and check an XML document for well-formedness using xmllint
from the libxml2 library. W3C (The World Wide Web Consortium) offers an online
feed validation service, but an offline validator sets up a better feedback
loop and is a lot more robust.
XML has multiple validation grammars in the form of schemas. RELAX NG (REgular
LAnguage for XML Next Generation) is one of those schema language formats.
Schema examples can be found in RFCs (Request for Comments) or in niche places
around the web. For example, here’s an RSS rng file, an ATOM rnc file, and an
ATOM rng file.
The catch is that these validation schema files may have differing use cases or
may be out of spec due to time, but they’re still worth looking at.
RELAX NG has both a standard xml.rng syntax and a compact xml.rnc syntax.
Offline validation with xmllint does not support the rnc compact schema syntax,
but rng works. According to the xmllint manual it supports RELAX NG, WXS (W3C
XML Schema), and Schematron. Schema conversion is needed to validate ATOM feeds
locally.
Conversion between rnc and rng can be achieved with the Java program trang
(usually goes by the name jing-trang in package repositories).
- Trang
- Trang converts between different schema languages for XML: RELAX NG (XML
  syntax), RELAX NG compact syntax, XML 1.0 DTDs, and W3C XML Schema (WXS).
In my case, and maybe yours, it’s easier to run trang on an already well
specified and well formed XML document. This produces a basic rng schema file
for validation and adding more rules.
shell
trang rss.xml rss.rng
trang atom.xml atom.rng
trang opml.xml opml.rng
Validate XML using the rng file with xmllint and the --relaxng flag. The
--noout flag disables printing the output to the command line.
shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml validates
If it fails to validate, xmllint reports the element that breaks the schema’s grammar.
shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml:25: element description: Relax-NG validity error : Did not expect element description there
rss.xml fails to validate
Pretty print XML with --pretty 1 for basic formatting or --pretty 2 for “one
attribute per line” white space formatting.
shell
xmllint --pretty 1 rss.xml
xmllint --pretty 2 rss.xml
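On a machine without xmllint, the well-formedness check and the basic pretty print can be roughly approximated with Python's standard library. This is a sketch under assumptions: the function names and the two sample strings are mine, and the standard library does not do RELAX NG validation, so it is no replacement for the --relaxng step.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


def pretty(xml_text):
    """Re-indent a well-formed document with two spaces per level."""
    return minidom.parseString(xml_text).toprettyxml(indent="  ")


# Made-up samples: the second one never closes its <title> element.
good = "<rss><channel><title>Example</title></channel></rss>"
bad = "<rss><channel><title>Example</channel></rss>"

print(check_well_formed(good))
print(check_well_formed(bad))
print(pretty(good))
```

The parser message for the broken sample includes the line and column of the mismatch, which is about as much as a well-formedness check can tell you.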
Stylesheet Processing and Validation
The command line XSLT processor xsltproc can be used to process stylesheets
offline and works only on stylesheets up to version 1.1. If using xsltproc as a
validation tool for xsl files, you’ll have to downgrade the version declaration
from version 3.0 to version 1.1. Version 1.0 is the version that most browsers
support.
shell
xsltproc rss.xsl
shell
xsltproc rss.xsl rss.xml
Other processors like Xalan-Java support XSLT up to version 1.0, and Saxon
supports it up to version 3.0.
Stylesheet Boilerplate and Transformations
Below is one variation of a stylesheet that transforms XML to XHTML. A typical
XHTML document skeleton is embedded within along with XSLT elements for
processing and transformation.
xsl
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  version="1.1"
>
  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>XHTML Document</title>
        <meta
          http-equiv="Content-Type"
          content="text/html; charset=utf-8"
        />
        <meta
          name="viewport"
          content="width=device-width, initial-scale=1, maximum-scale=1"
        />
      </head>
      <body>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
In the above, namespace attributes in the form xmlns:itunes extend the
document. You could think of them as imports for extending features and
avoiding naming conflicts. The URL points to the “allowed” vocabulary.
For example, the Atom Activity Streams namespace could be added under
xmlns:activity and extend the stylesheet with an understanding of Activity
Streams related vocabulary. Namespaces can also be used to extend processing
instructions like xmlns:xsl for XSLT processors that support them.
xsl
<xsl:stylesheet
  xmlns:activity="http://activitystrea.ms/specs/atom/1.0/"
  version="1.1"
>
xml
<activity:verb>post</activity:verb>
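To see why namespaces avoid naming conflicts, it helps to look at how a parser stores a prefixed tag. The sketch below uses Python's ElementTree with three made-up documents; the `as` prefix in the second one is my own choice, picked only to show that the prefix itself is arbitrary and the URL is what counts.

```python
import xml.etree.ElementTree as ET

ACTIVITY = "http://activitystrea.ms/specs/atom/1.0/"

# Two made-up documents that bind different prefixes to the same namespace URL.
doc_a = f'<entry xmlns:activity="{ACTIVITY}"><activity:verb>post</activity:verb></entry>'
doc_b = f'<entry xmlns:as="{ACTIVITY}"><as:verb>post</as:verb></entry>'

# ElementTree stores each tag as "{namespace URL}local-name".
tag_a = ET.fromstring(doc_a)[0].tag
tag_b = ET.fromstring(doc_b)[0].tag
print(tag_a)
print(tag_a == tag_b)

# A <verb> without a namespace is a different element name altogether.
plain = ET.fromstring("<entry><verb>post</verb></entry>")
print(plain[0].tag)
```

Both prefixed documents end up with the same expanded tag name, while the unprefixed `verb` stays distinct from it.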
Reference the xsl stylesheet from an XML document with the xml-stylesheet
processing instruction and the browser handles the rest.
xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml-stylesheet href="/rss.xsl" type="text/xsl"?>
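If the feed is generated by a script, the same xml-stylesheet line can be added programmatically. Here is a minimal sketch with Python's minidom, assuming a made-up one-element feed and the /rss.xsl path from the snippet above.

```python
from xml.dom import minidom

# Made-up minimal feed; "/rss.xsl" is the stylesheet path used in this article.
doc = minidom.parseString('<rss version="2.0"><channel/></rss>')

pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'href="/rss.xsl" type="text/xsl"'
)
# The instruction has to come before the root element.
doc.insertBefore(pi, doc.documentElement)

output = doc.toxml()
print(output)
```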
XSLT works in conjunction with XPath (the XML Path Language), which is somewhat
similar to CSS (Cascading Style Sheets) selectors. Command line programs like
xmlstarlet make use of XPath expressions for selecting data from parts of an
XML document.
xsl
<xsl:value-of select="/rss/channel/atom:link[@rel='next']/@href"/>
The XPath expression from the select attribute above gets the href value from
the <link> tag in the atom namespace whose rel attribute is next, which is
equal to https://example.com/page/2/rss.xml.
xml
<atom:link rel="next" href="https://example.com/page/2/rss.xml" />
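A quick way to poke at this kind of expression without a browser is Python's ElementTree. Treat the following as a sketch: the feed is made up around the atom:link sample above, `link_href` is a name I invented, and ElementTree only supports a subset of XPath, so the expression has to be adapted slightly.

```python
import xml.etree.ElementTree as ET

# Made-up feed fragment built around the atom:link sample above.
FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <atom:link rel="self" href="https://example.com/rss.xml" />
    <atom:link rel="next" href="https://example.com/page/2/rss.xml" />
  </channel>
</rss>"""

NS = {"atom": "http://www.w3.org/2005/Atom"}


def link_href(feed_xml, rel):
    """Return the href of the atom:link with the given rel, or None."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    # ElementTree paths are relative to the element, so the leading "/rss"
    # is dropped, and the attribute is read with .get() instead of "/@href".
    link = root.find(f"channel/atom:link[@rel='{rel}']", NS)
    return None if link is None else link.get("href")


print(link_href(FEED, "next"))
print(link_href(FEED, "previous"))
```

The first call prints the page 2 URL. The second prints `None` because the sample feed has no link with that rel value.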
If you’re familiar with CSS, then you’re in luck. Cheat sheets for XPath are
everywhere across the Internet in a “CSS to XPath” format. Test expressions
locally with xsltproc, xmlstarlet or an online XPath expression test bed.
css
rss > channel > link[rel="next"][href] {
  display: inline;
}
Value selections, for loops, and switch statements are the more commonly used
XSLT elements.
Attributes and Value Selection
Create attributes with the xsl:attribute element. The attributes are added to
the parent tag. Select values with the xsl:value-of element.
xsl
<a>
  <xsl:attribute name="href">
    <xsl:value-of select="/rss/channel/atom:link[@rel='next']/@href"/>
  </xsl:attribute>
</a>
<!-- Output: <a href="https://example.com/page/2/rss.xml"></a> -->
xsl
<img>
  <xsl:attribute name="alt"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="title"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="src"><xsl:value-of select="/rss/channel/image/url"/></xsl:attribute>
</img>
<!-- Output: <img alt="image" title="image" src="/image"></img> -->
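To make the semantics of xsl:attribute concrete, here is the first snippet replayed imperatively in Python. It is only an illustration of what the processor does, not how a stylesheet is run: the one-link channel is made up, and ElementTree stands in for the XSLT processor.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up channel holding the link that the stylesheet snippet selects.
channel = ET.fromstring(
    '<channel xmlns:atom="http://www.w3.org/2005/Atom">'
    '<atom:link rel="next" href="https://example.com/page/2/rss.xml"/>'
    "</channel>"
)

a = ET.Element("a")
# xsl:attribute + xsl:value-of: compute a value, attach it to the parent tag.
a.set("href", channel.find("atom:link[@rel='next']", NS).get("href"))

html = ET.tostring(a, encoding="unicode", short_empty_elements=False)
print(html)
```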
For Each
A typical for each construction executes over a range of XML tags with the
xsl:for-each element.
xsl
<xsl:for-each select="/rss/channel/item">
  <h2>
    <xsl:value-of select="title" />
  </h2>
</xsl:for-each>
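The detail worth noticing in the loop above is that select="title" is evaluated relative to the current item, not the document root. The same iteration written out in Python makes that explicit. The two-item feed and the `headings` helper are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Made-up feed with two items.
FEED = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <item><title>First post</title></item>
    <item><title>Tabs &amp; spaces</title></item>
  </channel>
</rss>"""


def headings(feed_xml):
    """Return one <h2> string per <item>, like the xsl:for-each above."""
    root = ET.fromstring(feed_xml)
    out = []
    # select="/rss/channel/item": every <item> directly under <channel>.
    for item in root.findall("channel/item"):
        # select="title" is looked up relative to the current <item>.
        title = item.findtext("title", default="")
        out.append(f"<h2>{escape(title)}</h2>")
    return out


for line in headings(FEED):
    print(line)
```

The channel's own title is never emitted, because the lookup only happens inside each item.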
Switch Statements
A switch statement construction is executed with a combination of the
xsl:choose, xsl:otherwise, and xsl:when elements. The test attribute on
xsl:when contains the condition.
xsl
<xsl:choose>
  <xsl:when test="/rss/channel/atom:link[@rel='previous']/@href">
    <xsl:attribute name="href">
      <xsl:value-of select="/rss/channel/atom:link[@rel='previous']/@href"/>
    </xsl:attribute>
  </xsl:when>
  <xsl:otherwise>
    <xsl:attribute name="href">/</xsl:attribute>
  </xsl:otherwise>
</xsl:choose>
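The test in xsl:when passes when the expression selects at least one node, so the block above reads as "use the previous link if it exists, otherwise fall back to /". Here is the same decision sketched in Python, with two made-up feeds and a hypothetical `previous_href` helper.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

# Two made-up feeds: one with a rel="previous" link, one without.
WITH_PREVIOUS = """<rss xmlns:atom="http://www.w3.org/2005/Atom"><channel>
  <atom:link rel="previous" href="https://example.com/page/1/rss.xml" />
</channel></rss>"""
WITHOUT_PREVIOUS = "<rss><channel><title>Example</title></channel></rss>"


def previous_href(feed_xml):
    """Mirror the xsl:choose above: the previous link's href, else "/"."""
    root = ET.fromstring(feed_xml)
    link = root.find("channel/atom:link[@rel='previous']", NS)
    # xsl:when: the test passes when the href attribute exists.
    if link is not None and link.get("href") is not None:
        return link.get("href")
    # xsl:otherwise: the fallback when no xsl:when matched.
    return "/"


print(previous_href(WITH_PREVIOUS))
print(previous_href(WITHOUT_PREVIOUS))
```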
View the XSLT elements and function reference for the complete list of
instructions.
Conclusion
There you have it, a basic overview and approach to working with XML, XSLT, and
XHTML.