XHTML 1.0 is a transitional language whose purpose is to port HTML 4 documents to XML. So, let's take a look at the few steps needed to convert HTML 4 documents to XHTML 1.0. Though it may make this migration a little more tedious, we will always keep in mind the direction taken by XHTML 1.1: this will save us a lot of time during later migrations.
The root element (<html>) must contain the declaration for the XHTML namespace.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<!-- Headers -->
</head>
<body>
<!-- Document body -->
</body>
</html>
The XML declaration is required when the document uses a character encoding other than the defaults, the UTF-8 or UTF-16 character encodings (by the way, Internet Explorer handles the very same document differently if you write the XML declaration).
HTML user agents, instead, read the character encoding from the Content-Type HTTP header or a meta element. Therefore, if you want documents with non-default character encodings to be portable and you can't make sure the web server provides the right HTTP headers, you need to include both the XML declaration and the corresponding meta http-equiv element; e.g.:
<meta http-equiv="Content-type" content="text/html; charset=EUC-JP" />
In HTML 4, some elements may omit the end tag (e.g. the <p> element). In XHTML 1.0, instead, all non-EMPTY elements must have an end tag. EMPTY elements must either have an end tag or a slash before the closing angle bracket. E.g.:
<hr></hr> <hr />
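An empty instance of a non-EMPTY element, on the other hand, should not use the minimized form: write <p> </p> rather than <p />.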
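Attributes cannot be minimized: each one must have an explicit value, even those that may be written as a bare name (e.g. compact or checked in HTML). The correct syntax is: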
<input type="checkbox" checked="checked" />
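As a rough illustration of the rules covered so far, here is a hypothetical HTML 4 fragment followed by one possible XHTML 1.0 rewrite. The form, its action URL and the field name are invented for the example.

```html
<!-- HTML 4: omitted end tags, unclosed empty elements, minimized attribute -->
<p>First paragraph
<p>Second paragraph
<hr>
<form action="/subscribe" method="post">
<p><input type="checkbox" name="newsletter" checked> Subscribe
</form>

<!-- XHTML 1.0: every element closed, every attribute given a value -->
<p>First paragraph</p>
<p>Second paragraph</p>
<hr />
<form action="/subscribe" method="post">
<p><input type="checkbox" name="newsletter" checked="checked" /> Subscribe</p>
</form>
```

Note the space before the slash in the empty elements: it is not required by XML, but it is the form used throughout this document.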
script and style elements
In XHTML, the script and style elements are declared as having #PCDATA content. As a result, < and & will be treated as the start of markup, and entities, such as &lt; and &amp;, will be expanded to the corresponding characters. You can avoid this by wrapping the content of the script and style elements within a CDATA marked section:
<script type="text/javascript"> <![CDATA[ [...] ]]> </script>
id and name attributes
In HTML 4, some elements (a, applet, form, frame, iframe, img and map) can use two different attributes as fragment identifiers: name and id. In XHTML 1.0, instead, only the id attribute is a legal fragment identifier, while the name attribute is formally deprecated and will be removed in XHTML 1.1.
When resolving URIs that end with a fragment identifier (e.g. "#foo"), XML and HTML differ: the former requires elements with an id attribute, the latter elements with a name attribute. Therefore, to ensure maximum forward and backward compatibility, identical values may be supplied for both of these attributes:
<a id="foo" name="foo">
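Moreover, only fragment identifiers matching the pattern [A-Za-z][A-Za-z0-9:_.-]* should be used.
In HTML 4, attributes with a pre-defined set of values are interpreted case-insensitively (e.g. the type attribute of an input element). In XHTML 1.0, the interpretation of these values is case-sensitive and all values are lower-case.
isindex
Don't include more than one isindex element in the document head. In any case, the isindex element is deprecated in favor of the input element.
lang and xml:lang attributes
Use both the lang and xml:lang attributes when specifying the language of an element; the value of the xml:lang attribute takes precedence.
In XML, the ampersand character ("&") always declares the beginning of an entity reference. Therefore, all ampersands used in a document that are to be treated as literal characters must be expressed themselves as an entity reference (i.e. "&amp;"), including ampersands inside the href attribute of an a element; e.g.: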
<a href="http://www.kernel-panic.it/myscript.php?id=123456&amp;name=user">
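The last few guidelines (lower-case enumerated values, paired lang and xml:lang attributes, escaped ampersands) can be sketched together in one short fragment. The field name, the search URL and the text content are hypothetical, chosen only for illustration.

```html
<!-- Enumerated attribute values in lower case: "text", not "TEXT" -->
<p><input type="text" name="q" /></p>

<!-- Language declared with both attributes, using the same value -->
<p xml:lang="it" lang="it">Buongiorno</p>

<!-- Literal ampersands escaped, both in the href and in the link text -->
<p><a href="/search?q=xhtml&amp;page=2">Results for XHTML &amp; CSS</a></p>
```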
In tables, the tbody element will be inferred by the parser of an HTML user agent, but not by the parser of an XML user agent. Therefore, you should always explicitly add a tbody element if it is referred to in a CSS selector.
style elements
In HTML 4 and XHTML, the style element can be used to define document-internal style rules; XML, instead, uses an XML stylesheet declaration. In order to be compatible with this convention, style elements should have their fragment identifier set using the id attribute, and an XML stylesheet declaration should reference this fragment. E.g.:
<?xml-stylesheet href="/css/mystyle.css" type="text/css"?>
<?xml-stylesheet href="#pageStyle" type="text/css"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>An internal stylesheet example</title>
<style type="text/css" id="pageStyle">
p { color: blue; font-weight: bold; }
</style>
</head>
<body>
<p> Blue, bold paragraph. </p>
</body>
</html>
The &apos; entity (the apostrophe, U+0027) was introduced in XML 1.0 but does not appear in HTML (to ensure compatibility, you should use &#39;).
XHTML 1.1 has removed all the features that were deprecated in HTML 4 and XHTML 1.0: therefore, the Transitional and Frameset document types are no longer available in XHTML 1.1. In general, the strategy of XHTML 1.1 is to define a markup language that is rich in structural functionality, but relies upon style sheets for presentation. The main differences can be summarized as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
the lang attribute has been completely removed in favour of the xml:lang attribute;
on the a and map elements, the name attribute has been removed in favour of the id attribute;
the "ruby" collection of elements has been added.