Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
HTML 5 defines the fifth major revision of the core language of the World Wide Web, HTML. This document describes the set of guiding principles used by the HTML Working Group for the development of HTML 5. The principles offer guidance for the design of HTML in the areas of compatibility, utility and interoperability.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is the First Public Working Draft of "HTML Design Principles" produced by the HTML Working Group, part of the HTML Activity. The Working Group intends to publish this document as a Working Group Note. The working group is working on a new version of HTML not yet published under TR. In the meantime, you can access the HTML 5 Editor's draft. The appropriate forum for comments on this document is public-html-comments@w3.org, a mailing list with a public archive.
The decision to request publication of the document was based on a poll of the members of the HTML working group, with the results being 51 "Yes" votes, 2 "No" votes, and 1 "Formally Object" vote.
The specific objection recorded was judged to be a comment that can be addressed in future drafts, not a critical reason to delay publication. Full consensus is not a prerequisite to publication: the decision of the HTML working group to publish the document reflects the group's intent to signal to the community that it should begin carefully reviewing the document, and to encourage wide review of the document within and outside of W3C.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
In the HTML Working Group, we have representatives from many different communities, including the WHATWG and other W3C Working Groups. The HTML 5 effort under WHATWG, and much of the work on various W3C standards over the past few years, have been based on different goals and different ideas of what makes for good design. To make useful progress, we need to have some basic agreement on goals for this group.
These design principles are an attempt to capture consensus on design approach. They are pragmatic rules of thumb that must be balanced against each other, not absolutes. They are similar in spirit to the TAG's findings in Architecture of the World Wide Web, but specific to the deliverables of this group.
Many language specifications define a set of conformance requirements for valid documents, and corresponding conformance requirements for implementations processing these valid documents. HTML 5 is somewhat unusual in also defining implementation conformance requirements for many constructs that are not allowed in conforming documents.
This dual nature of the spec allows us to have a relatively clean and understandable language for authors, while at the same time supporting existing documents that make use of older or nonstandard constructs, and enabling better interoperability in error handling.
Some of the design principles below apply much more to the conformance requirements for content (the "conforming language") while others apply much more to the conformance requirements for implementations (the "supported language"). Since the supported language is a strict superset of the conforming language, there is considerable overlap, but the principles will do their best to make clear which set of requirements they apply to.
There are many ways of interpreting compatibility. Sometimes the terms "backwards compatibility" and "forwards compatibility" are used, but sometimes the meaning of those terms can be unclear. The principles in this section address different facets of compatibility.
This principle applies primarily to the supported language.
Existing content often relies upon expected user agent processing and behavior to function as intended. Processing requirements should be specified to ensure that user agents implementing this specification will be able to handle most existing content. In particular, it should be possible to process existing HTML documents as HTML 5 and get results that are compatible with the existing expectations of users and authors, based on the behavior of existing browsers. It should be made possible, though not necessarily required, to do this without mode switching.
Content relying on existing browser behavior can take many forms. It may rely on elements, attributes or APIs that are part of earlier HTML specifications, but not part of HTML 5, or on features that are entirely proprietary. It may depend on specific error handling rules. In rare cases, it may depend on a feature from earlier HTML specifications not being implemented as specified.
When considering changes to legacy features or behavior, relative to current implementations and author expectations, the following questions should be considered:
The benefit of the proposed change should be weighed against the likely cost of breaking content, as measured by these criteria. In some cases, it may be desirable to make a nonstandard feature or behavior part of the conforming language, if it satisfies a valid use case. However, the fact that something is part of the supported language does not by itself mean that relying on it is condoned or encouraged.
Many sites use broken markup, such as badly nested elements (<b>a<i>b</b>c</i>), and both authors and users have expectations based on the error handling used by legacy user agents. We need to define processing requirements that remain compatible with the expected handling of such content.
Some sites rely on the <u> element giving the presentational effect of an underline.
This principle applies primarily to the conforming language.
On the World Wide Web, authors are often reluctant to use new language features that cause problems in older user agents, or that do not provide some sort of graceful fallback. HTML 5 document conformance requirements should be designed so that Web content can degrade gracefully in older or less capable user agents, even when making use of new elements, attributes, APIs and content models.
It is not necessarily appropriate to consider every Web user agent ever made, including even very old versions of browsers or tools that are extremely unpopular even in their niche markets. However, strong consideration should be given to the following categories of user agents. It is highly likely that content authors will find it important to target these categories:
In some cases, a new feature may simply not apply to a certain class of user agents, or may be impractical to design in a way that can degrade. For example, new scripting APIs cannot be made to work in scriptless user agents. But in many cases, approaches like the following can be used:
This list is not exhaustive; in some cases slightly more complicated approaches are more effective.
The default presentation of the proposed irrelevant attribute can be emulated through the CSS rule [irrelevant] { display: none; }.
Proposed new multimedia elements like <canvas> fallback </canvas> or <video> fallback </video> allow fallback content. Older user agents will show "fallback" while user agents supporting canvas or video will show the multimedia content.
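A script that draws on a canvas can apply the same idea by checking for the drawing API before using it. The following is a minimal sketch, not a definitive implementation: the helper name `get2dContextOrNull` is hypothetical, and plain objects stand in for DOM elements so that the example runs outside a browser.

```javascript
// Hypothetical helper: returns a 2D drawing context when the element
// supports the canvas API, or null when it does not (for example in an
// older user agent that treats <canvas> as an unknown element).
function get2dContextOrNull(element) {
  if (element && typeof element.getContext === "function") {
    return element.getContext("2d");
  }
  return null;
}

// Stand-ins for real DOM elements (assumptions for illustration only).
const legacyElement = {}; // no getContext method
const modernElement = { getContext: (kind) => ({ kind }) };

console.log(get2dContextOrNull(legacyElement)); // null
console.log(get2dContextOrNull(modernElement).kind); // 2d
```

When the helper returns null, the script simply does nothing and the user sees the fallback content inside the element.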
The proposed getElementsByClassName() method can be made considerably faster than pure ECMAScript implementations found in existing libraries, but a script-based implementation can be used when the native version is not available.
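One way such a fallback might look is sketched below. The helper name `findByClassName` is hypothetical, the fallback handles only a single class name, and it is written in modern syntax for brevity; a version aimed at older user agents would need older syntax. Plain objects stand in for DOM nodes so that the sketch runs outside a browser.

```javascript
// Hypothetical helper: uses the native method when present, otherwise
// walks the tree in script. Returns an array of matching elements.
function findByClassName(root, className) {
  // Prefer the native implementation when the user agent provides it.
  if (typeof root.getElementsByClassName === "function") {
    return Array.from(root.getElementsByClassName(className));
  }
  // Script-based fallback: depth-first walk in document order.
  const matches = [];
  const visit = (node) => {
    for (const child of node.childNodes || []) {
      if (child.nodeType !== 1) continue; // skip non-element nodes
      const classes = String(child.className || "").split(/\s+/);
      if (classes.includes(className)) matches.push(child);
      visit(child);
    }
  };
  visit(root);
  return matches;
}

// Mock tree standing in for a document (an assumption for illustration).
const tree = {
  nodeType: 1,
  className: "",
  childNodes: [
    {
      nodeType: 1, className: "note warning", id: "a",
      childNodes: [
        { nodeType: 3 }, // text node
        { nodeType: 1, className: "note", id: "b", childNodes: [] },
      ],
    },
    { nodeType: 1, className: "other", id: "c", childNodes: [] },
  ],
};

console.log(findByClassName(tree, "note").map((el) => el.id)); // [ 'a', 'b' ]
```

Callers use the same function in either case, so content keeps working in user agents without the native method and becomes faster where it exists.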
The <datalist> element can be associated with an <input> element and may contain a hidden <select> element. This way the fallback for the intended "combo box" control can be a text field or a text field with an associated pop-up menu in existing mainstream browsers.
If there is already a widely used and implemented technology covering particular use cases, consider specifying that technology in preference to inventing something new for the same purpose. Sometimes, though, new use cases may call for a new approach instead of more extensions on an old approach.
contenteditable="" was already used and implemented by user agents. No need to invent a new feature.
When a practice is already widespread among authors, consider adopting it rather than forbidding it or inventing something new.
Authors already use the <br/> syntax as opposed to <br> in HTML and there is no harm done by allowing that to be used.
Revolutions sometimes change the world for the better. Most often, however, it is better to evolve an existing design than to throw it away. This way, authors don't have to learn new models and content will live longer. Specifically, this means that one should prefer to design features so that old content can take advantage of new features without having to make unrelated changes. And implementations should be able to add new features to existing code, rather than having to develop whole separate modes.
Switching to XML syntax requires a global change, so continue supporting classic HTML syntax as well.
These principles call for a design that makes sure HTML can be used effectively for its many intended purposes.
Changes to the spec should solve actual real-world problems. Abstract architectures that don't address an existing need are less favored than pragmatic solutions to problems that web content faces today. And existing widespread problems should be solved, when possible.
In case of conflict, consider users over authors over implementors over specifiers over theoretical purity. In other words, costs or difficulties to the user should be given more weight than costs to authors, which in turn should be given more weight than costs to implementors, which should be given more weight than costs to authors of the spec itself, which should be given more weight than those proposing changes for theoretical reasons alone. Of course, it is preferred to make things better for multiple constituencies at once.
Ensure that features work with the security model of the web. Preferably, address security considerations directly in the specification.
Communicating between documents from different sites is useful, but an unrestricted version could put user data at risk. Cross-document messaging is designed to allow this without violating security constraints.
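On the receiving side, a document can keep such messaging safe by checking where each message came from before acting on it. The sketch below assumes the received event exposes `origin` and `data` properties; the factory name `createMessageHandler` and the example origins are hypothetical, and plain objects stand in for real message events so that the code runs outside a browser.

```javascript
// Hypothetical factory: wraps a callback so that it only ever sees
// messages whose origin is on an explicit allow list.
function createMessageHandler(trustedOrigins, onMessage) {
  const trusted = new Set(trustedOrigins);
  return function handle(event) {
    if (!trusted.has(event.origin)) {
      return false; // ignore messages from unknown sites
    }
    onMessage(event.data, event.origin);
    return true;
  };
}

// Simulated events (assumptions for illustration only).
const received = [];
const handler = createMessageHandler(
  ["https://partner.example"],
  (data) => received.push(data)
);

handler({ origin: "https://partner.example", data: "hello" });
handler({ origin: "https://attacker.example", data: "ignored" });

console.log(received); // [ 'hello' ]
```

In a browser, a handler like this would typically be registered with `window.addEventListener("message", handler)`.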
HTML should allow separation of content and presentation. For this reason, markup that expresses structure is usually preferred to purely presentational markup. However, structural markup is a means to an end such as media independence. Profound and detailed semantic encoding is not necessary if the end can be reached otherwise. Defining reasonable default presentation for different media may be sufficient. HTML strikes a balance between semantic expressiveness and practical usefulness. Names of elements and attributes in the markup may be pragmatic (for brevity, history, simplicity) rather than completely accurate.
The article element defines an individual article, but not the details of how it is displayed. A journal article may be the only article on a page, formatted in multiple columns, while a blog post may share a page with multiple other articles and be presented in a box with a border.
The b and i elements are widely used, so it is better to give them good default rendering for various media, including aural, than to try to ban them.
The two serializations should be designed in such a way that the DOM trees produced by the respective parsers appear as consistently as feasible to scripts and other program code operating on the document trees. Discrepancies can be allowed for compatibility with legacy implementations, but the differences should be minimized.
Unless required for compatibility with legacy implementations and deployed content, gratuitous differences in syntactic appearance should also be avoided.
The HTML (text/html) parser puts elements in the http://www.w3.org/1999/xhtml namespace in the DOM for compatibility with the XML syntax of HTML 5.
These principles exist to improve the chances of HTML implementations being truly interoperable.
Prefer to clearly define behavior that content authors could rely on, in preference to vague or implementation-defined behavior. This way, it is easier to author content that works in a variety of user agents. However, implementations should still be free to make improvements in areas such as user interface and quality of rendering.
Simple solutions are preferred to complex ones, when possible. Simpler features are easier for user agents to implement, more likely to be interoperable, and easier for authors to understand. But this should not be used as an excuse to avoid satisfying the other principles.
Error handling should be defined so that interoperable implementations can be achieved. Prefer graceful error recovery to hard failure, so that users are not exposed to authoring errors.
Features should be designed for universal access. This category covers various principles related to that.
Features should, when possible, work across different platforms, devices, and media. This should not be taken to mean that a feature should be omitted just because some media or platforms can't support it. For example, interactive features should not be omitted merely because they cannot be represented in a printed document.
The general reflowability of HTML text makes it more suitable to variable screen dimensions than a representation of exact glyph positions.
A hyperlink cannot be actuated in a printed document, but that is no reason to omit the a element.
Enable publication in all world languages. But this should not be taken as equalizing writing systems by prohibiting features that do not apply to all of them. Features for packing multiple translations of a document in a single file are out of scope.
Supporting Unicode allows text in most of the world's languages, including mixing of text in different languages.
Italic text is useful because it applies to many bicameral scripts, even though some scripts have no such concept. Similarly, ruby is useful for many scripts, even though it has a CJK focus.
Text in element content has better language support than text in attribute content; in element content ruby annotations can be inserted, as well as dir attributes and bdo elements in case the Unicode bidirectional algorithm is insufficient to correctly order adjacent runs of mixed direction text.
Design features to be accessible to users with disabilities. Access by everyone regardless of ability is essential. This does not mean that features should be omitted entirely if not all users can make full use of them, but alternate mechanisms should be provided.
The image in an img may not be visible to blind users, but that is a reason to provide alternate text, not to leave out images.
The progress element is intrinsically accessible, as it has unambiguous progress bar semantics that permit mapping to accessibility APIs that can represent progress indicators.
The editors would like to thank Charles McCathieNevile, Chris Wilson, Dan Connolly, Henri Sivonen, Ian Hickson, Jirka Kosek, Lachlan Hunt, Nik Thierry, Philip Taylor, Richard Ishida, Stephen Stewart, and Steven Faulkner for their contributions to this document, as well as all the people who have contributed to HTML 5 over the years to improve the Web!
If you contributed to this document but your name is not listed above, please let the editors know so they can correct this omission.