The Extensible HyperText Markup Language, or XHTML, is a markup language that has the same expressive possibilities as HTML, but a stricter, more verbose syntax. Whereas HTML is an application of SGML, a very flexible markup language, XHTML is an application of XML, a more restrictive subset of SGML. Because they must be well-formed (syntactically correct), XHTML documents can be processed automatically with a standard XML library, unlike HTML, which requires a relatively complex, lenient, and generally custom parser (though an SGML parser library could possibly be used). XHTML can be thought of as the intersection of HTML and XML in many respects, since it is a reformulation of HTML in XML. XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000. XHTML 1.1 became a W3C Recommendation on May 31, 2001.

Overview

XHTML is the successor to HTML. As such, many consider XHTML to be the current or latest version of HTML. However, XHTML is a separate recommendation; the W3C continues to recommend the use of XHTML 1.1, XHTML 1.0, and HTML 4.01 for web publishing.

Motivation

The need for a stricter version of HTML was felt primarily because World Wide Web content now needs to be delivered to many devices (such as mobile devices) apart from traditional computers, where extra resources cannot be devoted to supporting the additional complexity of HTML syntax. Another goal for XHTML and XML was to reduce the demands on parsers and user agents in general. With HTML, user agents increasingly took on the burden of “correcting” errant documents. XML instead requires user agents to fail when encountering malformed XML. This means an XHTML browser can theoretically be faster, and can be made to run more easily on miniaturized devices, than a comparable HTML browser. The recommendation that browsers report an error rather than attempt to render malformed content should help eliminate malformed content.
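The fail-on-error behaviour described here can be observed with any standard XML library. The following is a minimal sketch using Python's xml.etree.ElementTree; the choice of Python, the helper name is_well_formed, and the sample fragments are assumptions made for illustration only. A well-formed fragment parses, while variants that a lenient HTML parser would tolerate are rejected.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small well-formed XHTML fragment (illustrative content, not from the article).
WELL_FORMED = (
    '<div xmlns="%s">'
    '<p class="intro">Line one<br />Line two</p>'
    '</div>' % XHTML_NS
)

# Variants a lenient HTML parser would accept but an XML parser must reject.
MALFORMED = {
    "unclosed empty element": WELL_FORMED.replace("<br />", "<br>"),
    "unquoted attribute value": WELL_FORMED.replace('class="intro"', "class=intro"),
    "minimized attribute": WELL_FORMED.replace('class="intro"', "class"),
}


def is_well_formed(markup):
    """Return True if a standard XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("well-formed sample:", is_well_formed(WELL_FORMED))
    for label, markup in MALFORMED.items():
        print(label + ":", is_well_formed(markup))
```

Note that this checks well-formedness only; it does not validate the markup against an XHTML DTD.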
Even when authors do not validate code and simply test against an XML browser, errors will be revealed.

An especially useful feature XHTML inherits from its XML underpinnings is XML namespaces. With namespaces, authors or communities of authors can define their own XML elements, attributes and content models to mix within XHTML documents. This is similar to the semantic flexibility of the ‘class’ attribute from HTML, but with much more power. W3C XML namespaces and schemas that can be mixed with XHTML include MathML for semantic math markup and Scalable Vector Graphics for markup of vector graphics.

Differences from HTML

The changes from HTML to first-generation XHTML 1.0 are minor and are mainly to achieve conformance with XML. The most important change is the requirement that the document must be well formed and that all elements must be explicitly closed, as required in XML. In XML, all element and attribute names are case-sensitive, so the XHTML approach has been to define all tag names to be lowercase. This contrasts with some earlier established traditions which began around the time of HTML 2.0, when many used uppercase tags. In XHTML, all attribute values must be enclosed in quotes (either 'single' or "double" quotes may be used). In contrast, quoting was sometimes optional in SGML, and hence in HTML, where quotes may be omitted in some circumstances. XML dispensed with the intricate rules for determining when quotes were required or could be omitted by simply requiring them in all cases. All elements must also be explicitly closed, including empty (also known as singleton) elements such as img and br. This can be done by adding a closing slash to the start tag: <img /> and <br />. Attribute minimization (e.g., a bare selected attribute) is also prohibited, because the attribute “selected” then contains no explicit value; instead, use selected="selected". More differences are detailed in the W3C XHTML 1.0 recommendation.

Adoption

Adoption of XHTML has happened at an uneven pace.
The similarities between HTML 4.01 and XHTML 1.0 have led many web authors, content management systems, and entire sites to eagerly adopt the initial W3C XHTML 1.0 recommendation. To aid authors in the transition, the W3C has included an appendix to the XHTML 1.0 recommendation describing how to publish XHTML documents as HTML-compatible documents and serve them as HTML. In a sense, XHTML served in this way becomes a new version of HTML: identified as ‘text/html’ and treated by browsers as HTML. The only difference is that the HTML-compatible XHTML includes an extra ‘xml:lang’ attribute foreign to previous versions of HTML.

Browsers' adoption has, in contrast, been incomplete. While almost every browser under active development includes the XML parsing support that XHTML requires, many issues remain unaddressed even though years have passed since XHTML 1.0 reached Recommendation status (January 2000). Foremost among these issues is Microsoft's Internet Explorer (MSIE). Though MSIE has had XML parsing capabilities since version 5.0 in 1999 and can consume XHTML content as XML when it is identified as ‘application/xhtml+xml’, ‘application/xml’, or ‘text/xml’, the recognition of XHTML semantics is disabled by default. Moreover, MSIE ships with ‘application/xhtml+xml’ set to an unknown type, so that files of this type are treated as downloads. This means that nearly every browser surfing the internet has XHTML capabilities, but around 85% of them have XHTML disabled in the default install.

While most other browsers respond properly to all of the possible XHTML MIME types, they are not fully compliant. Mozilla, for example, does not incrementally render XML as it receives it over the network in the way it does with HTML. Issues such as these may exist because of the disruption caused by Internet Explorer, but they show that support from other browser vendors is incomplete as well.
Compounding the issues, Windows Internet Explorer 7 did not enable the XHTML capabilities by default.

Obstacles from browser vendors have slowed the effective rate of adoption. Without broader browser support, XHTML documents must continue to be served as MIME type ‘text/html’ files, and therefore some of the advantages of XML (namespaces, faster parsing and smaller-footprint browsers) remain unattainable on a wide scale. Recently, some have begun to question why authors ever made the leap into authoring in XHTML. A campaign now exists discouraging authors from following the W3C’s Appendix C HTML compatibility guidelines, suggesting that doing so is a mistake. Because of a vacuum of information and the lack of forthcoming browser support, XHTML adoption among authors is actually beginning to reverse.

XHTML 1.0

The original XHTML W3C Recommendation, XHTML 1.0, was simply a reformulation of HTML 4.01 in XML. There are three different "flavors" of XHTML 1.0 (Strict, Transitional, and Frameset), each equal in scope to its respective HTML 4.01 version. XHTML 1.0 Transitional supports everything found in XHTML 1.0 Strict, but also permits the use of a number of elements and attributes that are judged presentational.

XHTML 1.1

The most recent XHTML W3C Recommendation is XHTML 1.1: Module-based XHTML, which is a reformulation of XHTML 1.0 Strict using a set of modules selected from a larger set defined in Modularization of XHTML, a W3C Recommendation which provides a modularization framework, a standard set of modules, and various conformance definitions. All deprecated features of HTML, such as presentational elements and framesets, have been removed from this version. Presentation is controlled purely by Cascading Style Sheets (CSS). This version also allows for ruby markup support, needed for East Asian languages (especially CJK).
Although Modularization of XHTML allows small chunks of XHTML to be re-used by other XML applications in a well-defined manner, and for XHTML to be extended for specialized purposes, XHTML 1.1 adds the concept of a "strictly conforming" document: such a document cannot employ these features; it must be a complete document containing only elements defined in the modules required by XHTML 1.1. For example, if a document is extended by using elements from the XHTML Frames (frameset) module, it may still be described as XHTML 1.1, but not as strictly conforming XHTML 1.1. Instead, it might be described as an XHTML Host Language Conforming Document, if the relevant criteria are satisfied.

The XHTML 2.0 draft specification

Work on XHTML 2.0 is, as of 2006, still underway. The XHTML 2.0 draft is controversial because it breaks backward compatibility with all previous versions and is therefore, in effect, a new markup language created to circumvent (X)HTML's limitations rather than simply a new version. Many compatibility issues are easily addressed, however, by parsing XHTML 2.0 the same way a user agent would parse XHTML 1.1: via an XML parser and a default CSS document conforming to the XHTML 2.0 recommendation.

New features brought into the HTML family of markup languages by XHTML 2.0:
A new list element type, nl, will be included to specifically designate a list as a navigation list. This will be useful in creating nested menus, which are currently created by a wide variety of means such as nested unordered lists or nested definition lists.
Any element will be able to act as a hyperlink, similar to XLink. However, XLink itself is not compatible with XHTML due to design differences.
Any element will be able to reference alternative media with the src attribute.
The alt attribute of the img element has been removed: alternative text will be given in the content of the img element, much like the object element.
A single heading element (h) will be added. The level of these headings will be indicated by the nested section elements, each with its own h heading.
The remaining presentational elements i, b and tt, still allowed in XHTML 1.x (even Strict), will be absent from XHTML 2.0. The only somewhat presentational elements remaining will be sup and sub for superscript and subscript respectively, because they have significant non-presentational uses and are required by certain languages. All other tags are meant to be semantic instead (e.g. strong for strong or bolded text) while allowing the user agent to control the presentation of elements via CSS.
The property and about attributes will be added to facilitate the conversion from XHTML to RDF/XML.

Other members of the XHTML family

Valid XHTML documents

An XHTML document that conforms to the XHTML specification is said to be a valid document. In a perfect world, all browsers would follow the web standards and valid documents would render predictably on every browser and platform. Although validating XHTML does not ensure cross-browser compatibility, it is a recommended first step. A document can be checked for validity with the W3C Markup Validation Service.

DOCTYPEs

For a document to validate, it must contain a Document Type Declaration, or DOCTYPE. A DOCTYPE declares to the browser which Document Type Definition (DTD) the document conforms to. A Document Type Declaration should be placed at the very beginning of an XHTML document, even before the <html> tag. The system identifier part of the DOCTYPE, which in these examples is the URL that begins with "http", need only point to a copy of the DTD to use if the validator cannot locate one based on the public identifier (the other quoted string).
It does not need to be the specific URL that is in these examples; in fact, authors are encouraged to use local copies of the DTD files when possible. The public identifier, however, must be character-for-character the same as in the examples.

These are the most common XHTML Document Type Declarations:
XHTML 1.0 Strict
XHTML 1.0 Transitional
XHTML 1.0 Frameset
XHTML 1.1
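The declarations for the four published versions listed above all follow one pattern: the keyword PUBLIC, a public identifier, then a system identifier. As a rough sketch in Python (chosen only for illustration), the table below records the standard W3C identifiers for each version. The two helpers, doctype_for and identify_doctype, are hypothetical names introduced here, and the local path dtd/xhtml1-strict.dtd is an assumed example. Together they show that the system identifier may point at a local copy of the DTD, while the public identifier is compared character for character.

```python
import re

# Version name -> (public identifier, W3C-hosted system identifier).
XHTML_DOCTYPES = {
    "XHTML 1.0 Strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "XHTML 1.0 Transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "XHTML 1.0 Frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
    "XHTML 1.1": (
        "-//W3C//DTD XHTML 1.1//EN",
        "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd",
    ),
}

# Simplification: only double-quoted identifiers are matched,
# although XML also allows single quotes here.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>')


def doctype_for(version, system_id=None):
    """Build the declaration for a version, optionally pointing at another DTD copy."""
    public_id, default_system_id = XHTML_DOCTYPES[version]
    return '<!DOCTYPE html PUBLIC "%s" "%s">' % (public_id, system_id or default_system_id)


def identify_doctype(document):
    """Return the version name whose public identifier matches exactly, else None."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None
    declared_public_id = match.group(1)
    for version, (public_id, _system_id) in XHTML_DOCTYPES.items():
        if declared_public_id == public_id:  # exact, case-sensitive comparison
            return version
    return None


if __name__ == "__main__":
    # Hypothetical local DTD path; the public identifier is unchanged.
    local = doctype_for("XHTML 1.0 Strict", "dtd/xhtml1-strict.dtd")
    print(local)
    print(identify_doctype(local))
```

A declaration whose public identifier differs even in letter case is not recognized by this sketch, which mirrors the character-for-character requirement.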
XHTML 2.0
XHTML 2.0 is currently (August 2006) in a draft phase, so its document type declaration is not yet final.