Because XHTML is an XML application, certain practices that were perfectly legal in SGML-based HTML 4 must be changed. You have already seen XHTML syntax in the previous chapter, so the differences between XHTML and HTML should be easy to spot. The following sections compare XHTML and HTML.
Well-formedness is a new concept introduced by XML. Essentially, it means that every element must have a closing tag (or be written as an empty element) and that elements must be properly nested.
CORRECT: Nested Elements
<p>Here is an emphasized <em>paragraph</em>.</p>
INCORRECT: Overlapping Elements
<p>Here is an emphasized <em>paragraph.</p></em>
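One way to see the difference between the two fragments above is to hand them to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration, and the check covers well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the fragment, False otherwise.

    Hypothetical helper for illustration; it tests well-formedness only.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


nested = "<p>Here is an emphasized <em>paragraph</em>.</p>"
overlapping = "<p>Here is an emphasized <em>paragraph.</p></em>"

print(is_well_formed(nested))       # True
print(is_well_formed(overlapping))  # False
```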
XHTML documents must use lower case for all HTML element and attribute names. This is necessary because an XHTML document is an XML document, and XML is case-sensitive. For example, <li> and <LI> are different tags.
In HTML, certain elements are permitted to omit the end tag. XML does not allow end tags to be omitted: every element must either have an end tag or, if it is empty, end its start tag with />.
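The case sensitivity described above can be observed with any XML parser. The following is a small sketch using Python's standard xml.etree.ElementTree module; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# An XML parser preserves the case of element names exactly as written,
# so <li> and <LI> end up as two different element types.
root = ET.fromstring("<ul><li>one</li><LI>two</LI></ul>")
tags = [child.tag for child in root]
print(tags)  # ['li', 'LI']

# Because the names differ, <li> cannot be closed by </LI>.
try:
    ET.fromstring("<ul><li>one</LI></ul>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False
print(mismatch_accepted)  # False
```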
CORRECT: Terminated Elements
<p>Here is a paragraph.</p><p>here is another paragraph.</p> <br/><hr/>
INCORRECT: Unterminated Elements
<p>Here is a paragraph.<p>here is another paragraph. <br><hr>
All attribute values, including numeric values, must be quoted.
CORRECT: Quoted Attribute Values
<td rowspan="3">
INCORRECT: Unquoted Attribute Values
<td rowspan=3>
XML does not support attribute minimization. Attribute-value pairs must be written in full. Attribute names such as compact and checked cannot occur in elements without their value being specified.
CORRECT: Non-Minimized Attributes
<dl compact="compact">
INCORRECT: Minimized Attributes
<dl compact>
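The two attribute rules above (quoted values, no minimization) are both enforced by the XML parser itself. The sketch below feeds the four sample tags to Python's standard xml.etree.ElementTree module; the helper parses is hypothetical, and closing tags have been added to the samples so that each one is a complete fragment.

```python
import xml.etree.ElementTree as ET


def parses(markup: str) -> bool:
    """Hypothetical helper: True if the fragment is accepted by an XML parser."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


results = {
    "quoted": parses('<td rowspan="3"></td>'),
    "unquoted": parses("<td rowspan=3></td>"),
    "full": parses('<dl compact="compact"></dl>'),
    "minimized": parses("<dl compact></dl>"),
}

for name, ok in results.items():
    print(name, ok)
```

Only the quoted and fully written forms are accepted; the unquoted and minimized forms are rejected before any XHTML-specific processing takes place.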
When a browser processes attributes, it does the following −
Strips leading and trailing whitespace.
Maps sequences of one or more white space characters (including line breaks) to a single inter-word space.
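The two steps listed above can be sketched in a few lines of Python. The function name normalize_attribute_value is made up for this illustration, and the code models only the rule as stated here, not the behaviour of any particular browser.

```python
def normalize_attribute_value(value: str) -> str:
    """Hypothetical helper modelling the two steps described in the text."""
    # str.split() with no arguments discards leading and trailing white space
    # and splits on runs of white space, including line breaks and tabs.
    # Joining with a single space maps each run to one inter-word space.
    return " ".join(value.split())


normalized = normalize_attribute_value("  nav \n  main\tmenu ")
print(repr(normalized))  # 'nav main menu'
```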
In XHTML, the script and style elements should not contain “<” and “&” characters directly; if they do, these characters are treated as the start of markup. Entities such as “&lt;” and “&amp;” are recognized as entity references by the XML processor and are expanded to the “<” and “&” characters respectively.
Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
<script type="text/javascript">
<![CDATA[
... unescaped script content here ...
]]>
</script>
An alternative is to use external script and style documents.
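To illustrate the effect of a CDATA section, the sketch below parses the same script body with and without the wrapper, using Python's standard xml.etree.ElementTree module. The script text itself (a, b, c, run) is invented for the example.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section, "<" and "&" are plain character data.
with_cdata = (
    '<script type="text/javascript"><![CDATA['
    "if (a < b && b < c) { run(); }"
    "]]></script>"
)
script = ET.fromstring(with_cdata)
print(script.text)  # if (a < b && b < c) { run(); }

# Without the CDATA section, the same characters are read as markup
# and the fragment is rejected.
without_cdata = (
    '<script type="text/javascript">'
    "if (a < b && b < c) { run(); }"
    "</script>"
)
try:
    ET.fromstring(without_cdata)
    accepted = True
except ET.ParseError:
    accepted = False
print(accepted)  # False
```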
XHTML recommends replacing the name attribute with the id attribute. Note that in XHTML 1.0, the name attribute of the elements a, applet, form, frame, iframe, img, and map is formally deprecated, and it will be removed in a subsequent version of XHTML.
HTML and XHTML both have some attributes with pre-defined, limited sets of values, for example the type attribute of the input element. In HTML and XML, these are called enumerated attributes. Under HTML 4, the interpretation of these values was case-insensitive, so a value of TEXT was equivalent to a value of text.
Under XHTML, the interpretation of these values is case-sensitive, so all of these values are defined in lower case.
HTML and XML both permit references to characters by their hexadecimal value. In HTML, these references can be written as either &#Xnn; or &#xnn;, and both forms are valid. In XHTML documents, you must use only the lower-case form, such as &#xnn;.
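The lower-case requirement comes from XML itself, which can be checked with a parser. The following sketch uses Python's standard xml.etree.ElementTree module and the code point 41 (the letter A) as an arbitrary example.

```python
import xml.etree.ElementTree as ET

# Lower-case "x": a valid hexadecimal character reference in XML.
lower = ET.fromstring("<p>&#x41;</p>")
print(lower.text)  # A

# Upper-case "X": legal in HTML 4, but not well-formed XML.
try:
    ET.fromstring("<p>&#X41;</p>")
    upper_accepted = True
except ET.ParseError:
    upper_accepted = False
print(upper_accepted)  # False
```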
All XHTML elements must be nested within the <html> root element. All other elements can have sub-elements, which must have matching start and end tags and be correctly nested within their parent element. The basic document structure is −
<!DOCTYPE html....>
<html>
<head> ... </head>
<body> ... </body>
</html>
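As a rough check of this structure, the sketch below parses a minimal document with Python's standard xml.etree.ElementTree module and inspects the root element and its children. The XHTML 1.0 Strict DOCTYPE, the title, and the paragraph text are assumptions chosen for the example, and the helper has_single_root is hypothetical. The xmlns attribute is left out, as in the outline above.

```python
import xml.etree.ElementTree as ET

# Example document; the DOCTYPE and the content are assumptions for this sketch.
document = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

root = ET.fromstring(document)
children = [child.tag for child in root]
print(root.tag)   # html
print(children)   # ['head', 'body']


def has_single_root(markup: str) -> bool:
    """Hypothetical helper: False if the parser rejects the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Content placed after the root element makes the document ill-formed.
print(has_single_root("<html></html><body></body>"))  # False
```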