Access Control for Cross-site Requests

W3C Working Draft 14 February 2008

This Version:
http://www.w3.org/TR/2008/WD-access-control-20080214/
Latest Version:
http://www.w3.org/TR/access-control/
Previous Versions:
http://www.w3.org/TR/2007/WD-access-control-20071126/
http://www.w3.org/TR/2007/WD-access-control-20071001/
http://www.w3.org/TR/2007/WD-access-control-20070618/
http://www.w3.org/TR/2007/WD-access-control-20070215/
http://www.w3.org/TR/2006/WD-access-control-20060517/
http://www.w3.org/TR/2005/NOTE-access-control-20050613/
Editor:
Anne van Kesteren (Opera Software ASA) <annevk@opera.com>

Abstract

This document defines a mechanism to enable client-side cross-site requests. It defines two request algorithms, one for GET and one for non-GET methods, for use by specifications that want to enable cross-site requests in the "protocols" they define. If such a protocol is used on the example.org server and a resource on the hello-world.invalid server opts in to the mechanism, the protocol can be used to fetch that resource cross-site.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the 14 February 2008 Working Draft of the "Access Control for Cross-site Requests" document. It is expected that this document will progress along the W3C Recommendation track. This document is produced by the Web Application Formats (WAF) Working Group. The WAF Working Group is part of the Rich Web Clients Activity in the W3C Interaction Domain.

Please send comments to the WAF Working Group's public mailing list public-appformats@w3.org with [access-control] at the start of the subject line. Archives of this list are available. See also W3C mailing list and archive usage guidelines.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

1. Introduction

Web application technologies commonly apply same-origin restrictions to network requests. These restrictions prevent a Web application running from one origin from obtaining data retrieved from another origin, and also limit the number of unsafe HTTP requests that can be automatically launched toward destinations that differ from the running application's origin.

In Web application technologies that follow this pattern, network requests typically use ambient authentication and session management information, including HTTP authentication and cookie information.

This specification extends this model in several ways:

This specification is a building block for other specifications, so-called hosting specifications, which will define the precise model by which this specification is used. Among others, such specifications are likely to include XMLHttpRequest Level 2, XBL 2.0, and HTML 5 (for its server-sent events feature).

The design of this specification is based on requirements and use cases, both included as appendices. A FAQ describing the design decisions is also available.

If an author has a simple text resource residing at http://example.com/hello which contains the string "Hello World!" and would like the hello-world.invalid domain to be able to access it, the resource combined with an HTTP header introduced by this specification would look as follows:

Access-Control: allow <hello-world.invalid>

Hello World!

The hello-world.invalid domain can now access this document using XMLHttpRequest; for instance, with the following ECMAScript snippet:

var client = new XMLHttpRequest();
client.open("GET", "http://example.com/hello")
client.onreadystatechange = function() { /* do something */ }
client.send()

It gets slightly more complicated if the author wants to be able to handle cross-site requests using the HTTP DELETE and POST methods. In that case the author needs to first reply to a method check request that uses the OPTIONS method and then needs to handle the actual request that uses the POST or DELETE method and give an appropriate response. The response to the method check request could have the following HTTP headers specified:

Access-Control: allow <hello-world.invalid>
Access-Control-Max-Age: 3628800

The Access-Control-Max-Age header indicates how long the response can be cached, so that for subsequent requests, within the specified time, no method check request has to be made. The response to the actual request can contain this header:

Access-Control: allow <hello-world.invalid>

In contrast to handling a request involving a non-GET method, making a request like that is not difficult for the author, as the complexity of invoking the additional method check request is the task of the user agent. Using XMLHttpRequest again and assuming the application were hosted at http://calendar.invalid/app the author could use the following ECMAScript snippet:

function deleteItem(itemId, updateUI) {
  var client = new XMLHttpRequest()
  client.open("DELETE", "http://calendar.invalid/app")
  client.onload = updateUI
  client.onerror = updateUI
  client.onabort = updateUI
  client.send("id=" + itemId)
}

XMLHttpRequest Level 2 includes support for cross-site access requests though it has not yet been published as a W3C Working Draft at the time of writing.

In case of XML resources, authors can also specify which domains are allowed to access (using the GET HTTP method) the resource inside the XML:

<?access-control allow="http://hello-world.invalid https://test.example.net"?>
<hello type="world"/>

The author is allowed to combine these techniques:

Access-Control: allow <http://hello-world.invalid>

<?access-control allow="https://test.example.net"?>
<hello type="world"/>

2. Conformance Criteria

This specification is applicable to both user agents and hosting specifications. This specification will only apply in certain contexts, and hosting specifications defining such contexts will define when and how this specification applies.

As well as sections marked as non-normative, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

In this specification, the words must, must not, should, should not and may are to be interpreted as described in RFC 2119. [RFC2119]

A conformant hosting specification is one that implements all the requirements listed in this specification that are applicable to hosting specifications. For instance, such a specification needs to define what the source for the access control origin is.

A conformant user agent is one that implements all the requirements listed in this specification that are applicable to user agents, while also being consistent with the requirements listed in the hosting specification.

User agents may employ any algorithm to implement this specification, so long as the end result is indistinguishable from the result that would be obtained by the specification's algorithms.

2.1 Terminology

Terminology is generally defined throughout the specification. However, the few definitions that did not really fit anywhere else are defined here instead.

The term ToASCII algorithm means that the ToASCII algorithm as described in RFC 3490 is applied with both the AllowUnassigned and UseSTD3ASCIIRules flags set. [RFC3490]

There is a case-insensitive match of strings s1 and s2 if after mapping the ASCII character range A-Z to the range a-z both strings are identical.
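
The matching rule above can be illustrated with a minimal sketch. The function names asciiLowercase and caseInsensitiveMatch are hypothetical and not part of this specification. Only the range A-Z is mapped, so the result can differ from a locale-aware or full Unicode lowercase conversion.

```javascript
// Hypothetical helper: maps only ASCII A-Z to a-z and leaves every
// other character (including non-ASCII letters) untouched.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, function (c) {
    return String.fromCharCode(c.charCodeAt(0) + 0x20);
  });
}

// Two strings match case-insensitively if they are identical after
// the ASCII-only mapping above.
function caseInsensitiveMatch(s1, s2) {
  return asciiLowercase(s1) === asciiLowercase(s2);
}
```

Under this sketch "Access-Control" matches "access-control", whereas U+00C9 and U+00E9 do not match each other because neither lies in the ASCII range.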

U+0009, U+000A, U+000D and U+0020 are space characters.

A space-separated list is a string of which the items are separated by one or more space characters (in any order). The string may also be prefixed or suffixed with zero or more space characters.

To obtain the list of values from a space-separated list user agents must take the string, replace any sequence of space characters with a single U+0020 character, then drop any leading or trailing U+0020 character, then chop the resulting string at each occurrence of a U+0020 character, then drop all U+0020 values, and then return the list of values.
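
The steps above could be sketched as follows. The function name splitOnSpaces is an assumption made for illustration; user agents may use any implementation that gives the same result.

```javascript
// Hypothetical helper following the steps above for a space-separated list.
function splitOnSpaces(input) {
  // Replace any sequence of space characters (U+0009, U+000A, U+000D,
  // U+0020) with a single U+0020 character.
  var collapsed = input.replace(/[\t\n\r ]+/g, " ");
  // Drop a leading or trailing U+0020 character.
  var trimmed = collapsed.replace(/^ /, "").replace(/ $/, "");
  // Chop at each U+0020 and discard anything that is not a real value
  // (for example the empty string produced by an empty input).
  return trimmed.split(" ").filter(function (value) {
    return value !== "" && value !== " ";
  });
}
```

For instance, the allow pseudo-attribute value "http://hello-world.invalid https://test.example.net" yields a list of two values, and an empty string yields an empty list.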

An XML MIME type is text/xml, application/xml or any MIME type (excluding parameters) ending in +xml.
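
As an illustration, a check for an XML MIME type might look like the sketch below. The function name isXMLMimeType is hypothetical, and lowercasing the type before comparison is an assumption based on MIME types being case-insensitive.

```javascript
// Hypothetical helper: takes a Content-Type value and reports whether
// the MIME type, excluding parameters, is an XML MIME type.
function isXMLMimeType(contentType) {
  // Strip parameters such as "; charset=utf-8" and normalize case.
  var type = contentType.split(";")[0].trim().toLowerCase();
  return type === "text/xml" ||
         type === "application/xml" ||
         /\+xml$/.test(type);
}
```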

Two URIs are same-origin if after performing scheme-based normalization on both URIs as described in section 5.3.3 of RFC 3987 the scheme, ihost and port components are identical. If either URI does not have an ihost component the URIs must not be considered same-origin. [RFC3987]
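
A rough sketch of the same-origin comparison follows. It assumes that the URL parser built into current ECMAScript environments is an acceptable stand-in for the scheme-based normalization of RFC 3987 (it lowercases the scheme and host and drops the default port for common schemes); that substitution, and the function name isSameOrigin, are not part of this specification.

```javascript
// Hypothetical helper: compares scheme, host and port after the
// normalization performed by the built-in URL parser.
function isSameOrigin(a, b) {
  var ua, ub;
  try {
    ua = new URL(a);
    ub = new URL(b);
  } catch (e) {
    return false; // Unparseable input is never same-origin in this sketch.
  }
  // A URI without a host component is never same-origin.
  if (ua.hostname === "" || ub.hostname === "") {
    return false;
  }
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port;
}
```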

3. Security Considerations

The cross-site access request algorithm defined in this specification is an extension of the same-origin policy in contexts where the same-origin policy currently applies. This impacts hosting specifications referencing the algorithm, user agents implementing it, and authors using it. Below we discuss the security considerations for these groups.

Hosting specifications

Hosting specifications should limit the request headers an author can set and get, and should forbid setting user credentials through any API defined in the hosting specification.

Hosting specifications using the cross-site access request should properly deal with redirects. In particular, if a same-origin request is redirected to a non same-origin URI the specification should abort the request and either terminate completely (as it did until now) or use the cross-site access request algorithm on the non same-origin URI.

User agents

When making a cross-site access request, user agents should ensure to:

User agents must also take care to properly normalize Unicode and to properly interpret IDNs to prevent URI spoofing attacks. [RFC3490]

Authors

Application authors should be aware that content retrieved from another site is not itself trustable. Authors should protect themselves against cross-site scripting attacks by not rendering or executing the retrieved content directly without validating that content.

Authors sharing content with domains that are on shared hosting environments should ensure that they do not allow access from arbitrary ports on those domains.

example.com could host some user-sensitive data protected by HTTP authentication or cookies. It has an agreement with company.invalid to share this data and therefore uses the following HTTP header:

Access-Control: allow <company.invalid:*>

Now company.invalid happens to be on a shared hosting environment with evil.example.net (they share the IP address). Because of this, evil.example.net can host content at company.invalid:9999 and access user-sensitive data at example.com if the user, by phishing for instance, goes to company.invalid:9999. This can be prevented by clearly specifying the port or omitting it entirely, letting it default to the default port of the URI scheme:

  • Access-Control: allow <company.invalid:80>
  • Access-Control: allow <company.invalid>

Authors are to ensure that GET requests on their applications have no side effects. If by some means an attacker finds out what applications a user is associated with, the attacker might "attack" these applications with GET requests that can affect the user's data (if the user is already authenticated with any of these applications by means of cookies or HTTP authentication).

Authors are encouraged to check the Access-Control-Origin HTTP header, especially for non-GET requests, to ensure that in case of policy change they do not inadvertently allow access due to race conditions (when such access should be denied).

For different authors sharing one host name (people.example.org/~author-name/) it is not possible to allow access only from a certain author as the other authors could trivially work around this through DOM scripting. Sharing access with an author who shares the host name with someone else is therefore discouraged.

Integrity protection of the access control policy statements may be required. This could be achieved by use of SSL/TLS, for example.

4. Syntax

This section defines the various syntactic constructs this specification introduces. A number of these constructs are defined using ABNF as defined in RFC 2616. [RFC2616]

RFC 2616 is used as ABNF basis to ensure that the new headers have equivalent constructs to those introduced in that specification.

4.1 Access Item

An access item is either a single * character (always matches) or a domain that can contain a wildcard at the start and can optionally have a scheme and port specified. An access item must match the following ABNF:

access-item    = [scheme "://"] domain-pattern [":" port-pattern] | "*"
domain-pattern = domain | "*." domain
port-pattern   = port | "*"

scheme and port are used as defined in RFC 3986. domain is an internationalized domain name as defined in RFC 3490. [RFC3986] [RFC3490]

In addition to matching the above ABNF, the ToASCII algorithm must apply successfully (without errors) to each label component of the subdomain (if any) from the access item.
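
To make the grammar more concrete, the following is a simplified sketch of an access item parser. It assumes the domain is already in its ASCII (ToASCII-processed) form and accepts only letters, digits and hyphens in labels, so it does not perform the ToASCII check itself. The function name parseAccessItem and the shape of the returned object are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a description of the access item, or
// null if the string does not match the (simplified) grammar.
function parseAccessItem(item) {
  if (item === "*") {
    return { any: true }; // A single "*" always matches.
  }
  // [scheme "://"] ["*."] domain [":" (port | "*")]
  var pattern = /^(?:([A-Za-z][A-Za-z0-9+.\-]*):\/\/)?(\*\.)?([A-Za-z0-9\-]+(?:\.[A-Za-z0-9\-]+)*)(?::(\d*|\*))?$/;
  var m = pattern.exec(item);
  if (!m) {
    return null;
  }
  return {
    any: false,
    scheme: m[1] === undefined ? null : m[1].toLowerCase(),
    subdomainWildcard: m[2] !== undefined,
    domain: m[3].toLowerCase(),
    port: m[4] === undefined ? null : m[4] // digits, "*" or null if omitted
  };
}
```

With this sketch, "http://*.example.org:*" parses into the scheme "http", a wildcard domain "example.org" and a wildcard port, while "*example.org" is rejected because the wildcard is not followed by a dot.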

Several examples of conforming access items:

The following access items would make the user agent deny access to the resource as the access items are invalid:

The following access items are identical:

The following access items would match http://foo.bar.example.org:80:

4.2 Access-Control HTTP Response Header

Retrieved resources can have one or more Access-Control headers defined. These headers must match the following ABNF:

Access-Control = "Access-Control" ":" 1#rule
rule           = "allow" 1*(LWS pattern) [LWS "exclude" 1*(LWS pattern)]
pattern        = "<" access-item ">"

As stated by RFC 2616, multiple Access-Control headers can be combined.

The syntax of access items when used in the Access-Control HTTP header is restricted to internationalized domain names to which the ToASCII algorithm has been applied (as HTTP does not support Unicode).

LWS is used as defined by RFC 2616. The pattern production above must not include implied LWS. Implied LWS is allowed everywhere else. [RFC2616]

Access-Control: allow <*.example.org> exclude <*.public.example.org>
Access-Control: allow <webmaster.public.example.org>

The above example indicates that every subdomain of example.org can access the resource, including webmaster.public.example.org, but with the exclusion of all other subdomains of public.example.org.

Access-Control: allow <example.org>

The above example means that example.org and all its subdomains can access the resource.
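
The header grammar in this section can be sketched as a small parser for the header value (the text after "Access-Control:"). This is an illustration only: the function names are hypothetical, the access items inside the angle brackets are not validated here, and literals are matched case-insensitively on the assumption that RFC 2616 ABNF literals are case-insensitive.

```javascript
// Hypothetical helper: collects the items between "<" and ">".
function extractPatterns(text) {
  return text.match(/<[^<>\s]+>/g).map(function (p) {
    return p.slice(1, -1);
  });
}

// Hypothetical helper: parses a header value into a list of rules, each
// with an allow list and a (possibly empty) exclude list. Returns null
// if any rule does not match the grammar.
function parseAccessControlValue(value) {
  var rulePattern = /^allow((?:[ \t\r\n]+<[^<>\s]+>)+)(?:[ \t\r\n]+exclude((?:[ \t\r\n]+<[^<>\s]+>)+))?$/i;
  var rules = [];
  var parts = value.split(","); // 1#rule is a comma-separated list
  for (var i = 0; i < parts.length; i++) {
    var text = parts[i].trim();
    if (text === "") {
      continue; // Empty list elements are skipped.
    }
    var m = rulePattern.exec(text);
    if (!m) {
      return null;
    }
    rules.push({
      allow: extractPatterns(m[1]),
      exclude: m[2] === undefined ? [] : extractPatterns(m[2])
    });
  }
  return rules.length > 0 ? rules : null;
}
```

Applied to the value "allow <*.example.org> exclude <*.public.example.org>", the sketch yields one rule whose allow list holds "*.example.org" and whose exclude list holds "*.public.example.org".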

4.3 <?access-control?> Processing Instruction

XML resources may include one or more <?access-control?> processing instructions within the XML Prolog, to indicate (if the access control check is performed) from which domains their content can be accessed. [XML]

The processing instruction takes two pseudo-attributes, allow and exclude, which each take a space-separated list of access items. The allow pseudo-attribute must be specified.

The allow and exclude pseudo-attributes, when specified, must each contain at least one access item.

An <?access-control?> processing instruction that is part of the XML Prolog must be parsed using the same syntax rules as described in the XML Stylesheet PI specification. <?access-control?> processing instructions outside the XML Prolog are ignored. [XMLSSPI]

The above means that the following examples would be non-conforming and would make the user agent deny access to the resource:

4.4 Access-Control-Max-Age HTTP Response Header

The Access-Control-Max-Age HTTP response header indicates how long the results of a method check request can be cached in a method check result cache. The Access-Control-Max-Age HTTP header must match the following ABNF:

Access-Control-Max-Age = "Access-Control-Max-Age" ":" delta-seconds

The delta-seconds production is defined in RFC 2616. [RFC2616]
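
As a small illustration, the header value could be parsed as shown below. The function name parseMaxAge is hypothetical, and returning null for a value that is not a plain sequence of decimal digits is an assumption of this sketch.

```javascript
// Hypothetical helper: returns the number of seconds, or null if the
// value is not a sequence of one or more decimal digits.
function parseMaxAge(value) {
  var text = value.trim();
  if (!/^[0-9]+$/.test(text)) {
    return null;
  }
  return parseInt(text, 10);
}
```

For example, the value 3628800 used in the introduction corresponds to 3628800 / 86400 = 42 days.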

4.5 Access-Control-Policy-Path HTTP Response Header

The Access-Control-Policy-Path HTTP response header gives a path that together with the request URI is used to determine for which set of URIs the result of an access control check applies. The Access-Control-Policy-Path HTTP header must match the following ABNF:

Access-Control-Policy-Path = "Access-Control-Policy-Path" ":" abs_path

The abs_path production is defined in RFC 2616. [RFC2616]

4.6 Access-Control-Origin HTTP Request Header

The Access-Control-Origin HTTP request header indicates where the cross-site access request or method check request originates from. The header must match the following ABNF:

Access-Control-Origin = "Access-Control-Origin" ":" access control origin

The access control origin can be the literal string "null" (without quotation marks) in case the request originates from a data: URI for instance.

5. Processing Model

This section (including subsections) describes the processing models that user agents and hosting specifications have to implement. A hosting specification can "implement" an algorithm by referencing it and carefully defining how the return values are handled.

5.1 Cross-site Access Request

The cross-site access request algorithm takes the parameters request URI and request method and returns several values as described below. The algorithm can be invoked by hosting specifications that wish to provide cross-site requests.

If request method is equal to GET, the user agent must follow the cross-site GET access request algorithm. Otherwise, it must follow the cross-site non-GET access request algorithm.

Those algorithms have shared return values that hosting specifications can use to instruct user agents what to do. The status return flag indicates the status of the cross-site access request. It takes the value "success" when cross-site access to the resource is allowed, "same-origin" if the cross-site request turned into a same-origin request due to redirects, "network" if a network error of some sort occurred, and "abort" if the user aborted the request. The uri return flag is used when the status return flag is "same-origin", to indicate the URI which the specification can use for a subsequent same-origin request.

As this algorithm is used by hosting specifications, those specifications must handle all values of the status return flag and handle the uri return flag.

The access control origin is a representation of the source of the request. It is the scheme followed by ://, followed by the domain without any trailing U+002E (.), if any, where each part of the domain has had the ToASCII algorithm applied. Then, if port is not the default port for the scheme, follow it by : and the port. If the source of the request does not have a host-based authority (data: URI scheme, for instance), the access control origin is the literal string "null" (without the quotation marks).
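
The serialization described above might be sketched as follows. The sketch assumes the source of the request is available as a URI string and relies on the built-in URL parser, which converts internationalized host names to their ASCII form and reports an empty port when the port is the default for a scheme it knows. Both that reliance and the function name accessControlOrigin are assumptions, not requirements of this specification.

```javascript
// Hypothetical helper: serializes the access control origin for a URI.
function accessControlOrigin(sourceURI) {
  var url = new URL(sourceURI);
  // No host-based authority (for example a data: URI) gives "null".
  if (url.hostname === "") {
    return "null";
  }
  // Remove a trailing U+002E (.) from the domain, if any.
  var domain = url.hostname.replace(/\.$/, "");
  // url.protocol already ends with ":", so only "//" is added.
  var origin = url.protocol + "//" + domain;
  // url.port is empty when the port is the default for the scheme.
  if (url.port !== "") {
    origin += ":" + url.port;
  }
  return origin;
}
```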

Hosting specifications using cross-site access requests must define the source of the request for the access control origin. Because different protocols determine their origin in different ways, it is not possible to define this in a generic way.

While following the requirements for cross-site access requests, user agents must ensure that for each request (including redirects, et cetera) the Access-Control-Origin HTTP request header is set, with the value set to access control origin.

5.1.1 Cross-site GET Access Request

The steps below describe what user agents must do for cross-site GET access requests. These are requests using the HTTP GET method to a non same-origin URI, the request URI.

  1. Let origin be the access control origin.

  2. Let current request URI be the request URI.

    The current request URI can be modified when applying the generic redirect steps.

  3. Then make a request to current request URI using the HTTP method GET and observe the following request rules:

    If the response is an HTTP redirect

    Apply the generic redirect steps.

    If the user cancels the request

    Apply the generic abort steps.

    If there is a network error

    Apply the generic network error steps.

    Otherwise

    Perform an access control check. If it returns "fail", apply the generic network error steps. Otherwise, if it returns "pass", terminate this algorithm and return with the status flag set to "success". Do not actually terminate the request.

5.1.2 Cross-site Non-GET Access Request

To protect servers against cross-site access with methods that have side effects, a preflight request, called a method check request, is made to ensure that the server accepts the request. The result of the preflight request can be stored in the method check result cache. If the server expects frequent cross-site requests to different resources, it can reply with a special header that will incur another preflight request, after which many different resources can be accessed without requiring any more preflight requests. These mechanisms are described in detail in this section.

There are basically two different types of caching policies. One is optimized for a single cross-site URI and one is optimized for a set of cross-site URIs residing on the same origin. The simple case follows the following scenario:

  1. The user agent gets the request from a protocol, such as XMLHttpRequest, to perform a cross-site request using the custom XMODIFY method from access control origin http://example.org to http://blog.invalid/entries/hello-world.

  2. The user agent performs an OPTIONS request to http://blog.invalid/entries/hello-world to which the response includes the following HTTP metadata:

    Access-Control: allow <example.org>
    Access-Control-Max-Age: 151200

  3. The user agent then performs the desired XMODIFY request to http://blog.invalid/entries/hello-world as this was allowed by the resource. In addition, for the coming 151200 seconds, or forty-two hours, no OPTIONS request will be needed.

The slightly more complicated scenario where our blog.invalid server wants to allow access to all /entries/ resources, and not just /entries/hello-world, is as follows:

  1. The user agent gets the request from a protocol again. This time to perform four PUT requests to http://blog.invalid/entries/pointland, http://blog.invalid/entries/lineland, http://blog.invalid/entries/flatland, and http://blog.invalid/entries/spaceland. Meanwhile the blog.invalid server has been updated so fewer requests are required.

  2. The user agent performs an OPTIONS request to http://blog.invalid/entries/pointland to which the response includes the following HTTP metadata:

    Access-Control-Policy-Path: /entries/

  3. The user agent then performs another OPTIONS request to http://blog.invalid/entries/ to which the response includes the following HTTP metadata:

    Access-Control: allow <example.org>
    Access-Control-Policy-Path: /entries/
    Access-Control-Max-Age: 151200

  4. The user agent can now perform all the requests it wants, as long as the access control origin stays http://example.org and the request URI starts with http://blog.invalid/entries/, within the next forty-two hours. (This is because the second OPTIONS request got a response that confirmed that /entries/ is indeed the correct path.)

As mentioned, cross-site non-GET access requests use a method check result cache. This cache consists of a set of entries. Each entry has an origin, uri, uri prefix, and an expiry time field. The uri and uri prefix fields are mutually exclusive. Entries must be removed when the time specified in the expiry time field has passed since storing the entry. Entries can also be added and removed per the algorithms below. They are added and removed in such a way that there can never be duplicate items in the cache or two items where the origin field values are identical and the uri field value of one item starts with the uri prefix field value of another item.
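
One possible shape for such a cache is sketched below. The names createMethodCheckCache, append and find are hypothetical, the clock is passed in as a function so that expiry can be demonstrated, and the sketch does not enforce the no-duplicates invariant described above; that is left to the algorithms that add and remove entries.

```javascript
// Hypothetical cache sketch. "now" is a function returning the current
// time in milliseconds, for example Date.now.
function createMethodCheckCache(now) {
  var entries = [];

  // Remove entries whose expiry time has passed.
  function dropExpired() {
    var t = now();
    entries = entries.filter(function (e) { return e.expires > t; });
  }

  return {
    // field is either "uri" or "uriPrefix"; the two are mutually exclusive.
    append: function (origin, field, value, maxAgeSeconds) {
      dropExpired();
      var entry = {
        origin: origin,
        uri: null,
        uriPrefix: null,
        expires: now() + maxAgeSeconds * 1000
      };
      entry[field] = value;
      entries.push(entry);
    },
    // Returns the entry whose origin matches and whose uri is identical
    // to requestURI or whose uriPrefix starts requestURI, or null.
    find: function (origin, requestURI) {
      dropExpired();
      for (var i = 0; i < entries.length; i++) {
        var e = entries[i];
        var uriMatch = e.uri !== null && e.uri === requestURI;
        var prefixMatch = e.uriPrefix !== null &&
                          requestURI.indexOf(e.uriPrefix) === 0;
        if (e.origin === origin && (uriMatch || prefixMatch)) {
          return e;
        }
      }
      return null;
    }
  };
}
```

In the second scenario above, an entry with origin http://example.org and uri prefix http://blog.invalid/entries/ would be found for each of the four PUT requests until 151200 seconds have passed.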

The steps below describe what user agents must do for cross-site non-GET access requests. These are requests to a non same-origin URI with an HTTP request method other than GET that first need to be authorized using either a method check result cache entry or a method check request.

  1. Let origin be the access control origin.

  2. Let request method be the request method parameter passed to the cross-site access request algorithm.

  3. Let current request URI be the request URI.

    The current request URI can be modified when applying the generic redirect steps.

  4. If there is an entry in the method check result cache where the origin matches the origin field value, and the current request URI is identical to the uri field value or starts with the uri prefix field value, proceed to the next step.

    Otherwise, remove the cache entry, if any, and then make a method check request. This is a request using the HTTP OPTIONS method to the current request URI. Observe the following request rules while making the request:

    If the response is an HTTP redirect

    Apply the generic redirect steps.

    If the user cancels the download

    Apply the generic abort steps.

    If there is a network error

    Apply the generic network error steps.

    Otherwise
    If the response does not include an Access-Control-Policy-Path HTTP header

    Perform an access control check. If it returns "fail", apply the generic network error steps. Otherwise, if it returns "pass", append a cache entry and set the uri field value to the current request URI.

    If the response does include an Access-Control-Policy-Path HTTP header
    1. If the Access-Control-Policy-Path can not be successfully parsed apply the generic network error steps.

    2. Let policy URI be the result of resolving the Access-Control-Policy-Path HTTP header value against the current request URI.

    3. If policy URI does not end with a trailing slash, append one.

    4. If policy URI does not match the start of the current request URI apply the generic network error steps.

    5. If a trailing slash was appended to policy URI, remove it.

    6. If policy URI is identical to the current request URI, go to the next step. Otherwise, make a request to policy URI using the HTTP OPTIONS method and observe the following request rules:

      If the user cancels the download

      Apply the generic abort steps.

      If the response is an HTTP redirect
      If there is a network error

      Apply the generic network error steps.

      Otherwise

      If there is no Access-Control-Policy-Path header or it can not be successfully parsed apply the generic network error steps.

      Let policy URI check be the result of resolving the Access-Control-Policy-Path header value against the current request URI. If policy URI and policy URI check are not identical apply the generic network error steps.

    7. Perform an access control check. If it returns "fail", apply the generic network error steps. Otherwise, if it returns "pass", remove all entries in the method check result cache where origin is identical to the origin field value and the uri and uri prefix field values either start with or are exact matches for the policy URI. Then append a cache entry and set the uri prefix field value for that entry to the policy URI.

      The "remove the cache entry" steps are quite different from what is described above. The above set of steps is a specific removal action performed before putting the new policy in place.

      The append a cache entry set of steps does not always lead to changes to the cache. For instance, if the response did not include an Access-Control-Max-Age header the user agent is not required to add something to the method check result cache.

  5. Make a request to the current request URI using HTTP method request method and observe the request rules below while making the request.

    If the response is an HTTP redirect

    First remove the cache entry and then apply the generic network error steps.

    If the user cancels the download

    Apply the generic abort steps.

    If there is a network error

    Apply the generic network error steps.

    Otherwise

    Perform an access control check. If it returns "fail", remove the cache entry, then apply the generic network error steps. Otherwise, if it returns "pass", terminate this algorithm and return with the status flag set to "success". Do not actually terminate the request.
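
Sub-steps 1 through 5 of the Access-Control-Policy-Path case in step 4 above can be sketched as follows. The function name resolvePolicyURI is hypothetical, a return value of null stands for "apply the generic network error steps", and treating any value that does not start with "/" as a parse failure is a simplification of the abs_path production.

```javascript
// Hypothetical helper: returns the policy URI, or null where the steps
// above would apply the generic network error steps.
function resolvePolicyURI(policyPath, currentRequestURI) {
  // Sub-step 1 (simplified): an abs_path starts with "/".
  if (typeof policyPath !== "string" || policyPath.charAt(0) !== "/") {
    return null;
  }
  // Sub-step 2: resolve the path against the current request URI.
  var policyURI;
  try {
    policyURI = new URL(policyPath, currentRequestURI).href;
  } catch (e) {
    return null;
  }
  // Sub-step 3: append a trailing slash if there is none.
  var slashAppended = false;
  if (policyURI.charAt(policyURI.length - 1) !== "/") {
    policyURI += "/";
    slashAppended = true;
  }
  // Sub-step 4: the policy URI has to match the start of the request URI.
  if (currentRequestURI.indexOf(policyURI) !== 0) {
    return null;
  }
  // Sub-step 5: remove the slash again if it was appended.
  if (slashAppended) {
    policyURI = policyURI.slice(0, -1);
  }
  return policyURI;
}
```

For the scenario described earlier, resolving /entries/ against http://blog.invalid/entries/pointland gives http://blog.invalid/entries/, which differs from the request URI, so the user agent goes on to make the second OPTIONS request of sub-step 6.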

5.1.3 Generic Cross-site Access Request Algorithms

The variables used in the generic sets of steps are part of the algorithms that invoke these sets of steps.

The generic redirect steps are as follows:

If the new URI scheme is not supported, infinite loop precautions are violated, or something else went wrong, apply the generic network error steps. Otherwise, let current request URI be the new URI and then follow this set of steps:

  1. If the current request URI contains the userinfo production, as defined in section 3.2.1 of RFC 3986, apply the generic network error steps. [RFC3986]

  2. If the current request URI and origin are same-origin, terminate the algorithm that invoked this set of steps and return with the uri flag set to the current request URI and the status flag set to "same-origin".

  3. Otherwise, transparently follow the redirect while observing the set of request rules.
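
The three steps above can be sketched as a decision function. This is an illustration under several assumptions: the names sameOriginAs and applyRedirectSteps are hypothetical, the preceding checks for unsupported schemes and redirect loops are not modeled, and the built-in URL parser is used, which cannot detect an empty userinfo component (a bare "@").

```javascript
// Hypothetical helper: compares a parsed URL with an access control
// origin string such as "http://example.org" or "null".
function sameOriginAs(url, origin) {
  if (origin === "null" || url.hostname === "") {
    return false;
  }
  var o = new URL(origin);
  return o.protocol === url.protocol &&
         o.hostname === url.hostname &&
         o.port === url.port;
}

// Hypothetical helper: returns { status: ... } where the invoking
// algorithm terminates, or { follow: uri } where the redirect is followed.
function applyRedirectSteps(newURI, origin) {
  var url;
  try {
    url = new URL(newURI);
  } catch (e) {
    return { status: "network" };
  }
  // Step 1: a URI containing userinfo is a network error.
  if (url.username !== "" || url.password !== "") {
    return { status: "network" };
  }
  // Step 2: the request turned into a same-origin request.
  if (sameOriginAs(url, origin)) {
    return { status: "same-origin", uri: url.href };
  }
  // Step 3: follow the redirect under the same request rules.
  return { follow: url.href };
}
```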

Whenever the generic abort steps are applied, terminate the algorithm that invoked this set of steps and return with the status flag set to "abort".

Whenever the generic network error steps are applied, terminate the algorithm that invoked this set of steps and return with the status flag set to "network".

Remove the cache entry means removing the entry in the method check result cache where origin is identical to the origin field value and current request URI is either identical to the uri field value or starts with the uri prefix field value.

To append a cache entry means to create a new entry in the method check result cache with origin as origin field value and as expiry time field value the value of the Access-Control-Max-Age header, if any. If there is no Access-Control-Max-Age header or the Access-Control-Max-Age header can not be successfully parsed, the user agent may choose to nevertheless cache the entry for a short period of time or not store a cache entry at all.

The uri or uri prefix field is set when the above set of steps is referenced.

5.2 Access Control Check

5.2.1 Access Control Check Algorithm

When a user agent has to make an access control check for a particular resource, it must then associate the following with that resource:

The allow lists and exclude lists are unordered lists of access items. The allow lists are guaranteed to be non-empty and the exclude lists can be empty.

After associating the aforementioned lists and when all HTTP headers have been received, the user agent must run the following algorithm (unless stated otherwise):

  1. Parse the Access-Control headers. If any value does not conform to the syntax required, terminate the algorithm and return "fail". Otherwise, if parsed successfully, then for each rule append a new list item to the HTTP access control list, where the allow list is constructed of each access item following "allow" and the exclude list of each access item following "exclude". The exclude list will be empty if "exclude" is not present.

  2. Then run the list check on the HTTP access control list.

  3. If the requested resource has an XML MIME type and a non-empty entity body, go to the next step. Otherwise, if the allow access flag is "false", then terminate the algorithm and return "fail". If the allow access flag is "true", then terminate the algorithm and return "pass".

  4. Parse the resource as an XML document using a streaming XML parser, following the rules set forth in the XML specification up to but not including the root-element start tag. Then process the encountered <?access-control?> processing instructions (if any). [XML]

    If there is either an XML parse error or failure to parse the processing instructions, then terminate the overall algorithm and return "fail". Otherwise, run the following steps for each <?access-control?> processing instruction:

    1. If the processing instruction does not have a single allow pseudo-attribute and optionally a single exclude pseudo-attribute, then terminate the overall algorithm and return "fail".

    2. Append a new list item to the PI access control list where the allow list is the result of parsing the allow pseudo-attribute value and the exclude list is the result of parsing the exclude pseudo-attribute value, if specified, or empty otherwise. If any obtained value does not match the access item syntax or if no value was obtained, then terminate the overall algorithm and return "fail".

  5. Then run the list check on the PI access control list.

  6. If the allow access flag is "false", then return "fail". Otherwise, if the allow access flag is "true", then return "pass".
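Step 4 of the algorithm above reads the XML prolog only. One possible way to collect the <?access-control?> processing instructions that appear before the root-element start tag is sketched below using Python's xml.parsers.expat module. The function name and the None-on-error convention are assumptions of this sketch. It returns the raw processing instruction data and does not parse the pseudo-attributes.

```python
import xml.parsers.expat


class _RootReached(Exception):
    """Raised internally to stop parsing at the root-element start tag."""


def prolog_access_control_pis(document: bytes):
    """Return the data of each <?access-control?> PI found before the
    root element, or None if an XML parse error occurs first."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_processing_instruction(target, data):
        if target == "access-control":
            found.append(data)

    def on_start_element(name, attributes):
        # The first start tag is the root element: stop here so that
        # nothing at or after it is processed.
        raise _RootReached

    parser.ProcessingInstructionHandler = on_processing_instruction
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(document, True)
    except _RootReached:
        pass
    except xml.parsers.expat.ExpatError:
        return None  # corresponds to returning "fail"
    return found
```

Processing instructions inside or after the root element are never reported, because parsing stops at the first start tag.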

5.2.2 Shared Algorithm

The algorithm in this section is to be read as if it was part of the algorithm that invoked it. The "overall set of steps" and "overall algorithm" are references to the algorithm that invoked the algorithm defined in this section.

The list check algorithm takes a list of items consisting of allow and exclude lists. For each item in the list, run the following steps:

  1. If there is no match for any access item from the allow list against the access control origin, process the next list item. If there is no next list item, then go to the next step in the overall set of steps.

  2. If the exclude list is non-empty and there is a match for any access item from the exclude list against the access control origin, process the next list item. If there is no next list item, then go to the next step in the overall set of steps.

  3. Set the allow access flag to "true" and go to the next step in the overall set of steps.
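The list check can be sketched as a short function. This is only an illustration: the function and parameter names are hypothetical, each list item is modelled as an (allow list, exclude list) pair, the allow access flag is assumed to start out as "false", and the access item check of section 5.3 is passed in as the `matches` callable.

```python
def list_check(access_control_list, origin, matches):
    """Return True if the allow access flag would be set to "true".

    access_control_list: iterable of (allow_list, exclude_list) pairs.
    matches: callable (origin, access_item) -> bool.
    """
    for allow_list, exclude_list in access_control_list:
        # Step 1: no allow item matches, so move on to the next list item.
        if not any(matches(origin, item) for item in allow_list):
            continue
        # Step 2: an exclude item matches, so move on to the next list item.
        if any(matches(origin, item) for item in exclude_list):
            continue
        # Step 3: allowed and not excluded by this list item.
        return True
    # No list item granted access; the flag keeps its assumed initial value.
    return False
```

Note that an exclude list only affects the list item it belongs to. An origin excluded by one list item is still granted access if another list item allows it without excluding it.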

5.3 Access Item Check

The algorithm described below determines whether there is a match between an access control origin (http://test.example.org for instance) and an access item (*.example.org, or * for instance).

The following table gives some example outcomes of this algorithm with the access control origin in the first column, the access item in the second column, and the result in the final column.

Access control origin        Access item      Result
null                         *                Match
null                         example.org      No match
http://example.org           EXAMPLE.OrG      Match
http://example.org:81        example.org      No match
http://example.org           example.org      Match
http://site.example.org      *.org            Match
http://xn--74h.example.org   ☺.example.org    Match

To determine whether an access control origin and an access item match, user agents must run the following algorithm:

  1. Let origin be access control origin and item be access item.

  2. If item is a single U+002A (*), there is a match. Terminate this algorithm.

  3. If origin is "null", there is no match. Terminate this algorithm.

  4. If item does not have a port-pattern, let the port of item be the default port for the scheme of item or, if item does not have a scheme, let it be the default port of the scheme of origin.

  5. If item has a scheme and it does not case-insensitively match the scheme from origin, there is no match. Terminate this algorithm.

  6. Remove the scheme from item (if it has one specified) and origin, including the :// sequence following it.

  7. If origin does not have a port component specified, let the port of origin be the default port for the scheme of origin.

  8. If the port from item does not match the port from origin and is not *, there is no match. Terminate this algorithm.

  9. Remove the port from item and origin, including the U+003A (:) preceding it.

  10. If item has a single U+002E (.) as last character, remove that character from item.

  11. Let origin list be origin split on the U+002E (.) character (dropping that character in the process) and item list be item split on the U+002E (.) character (dropping that character in the process). Ensure that the order is preserved.

  12. Reverse the order of origin list and item list.

  13. Now process the first list item of both origin list and item list using the following steps:

    1. Let the item from origin list be origin label and the item from item list be item label.

    2. If item label is a single U+002A (*) character, then go to the next step in the overall set of steps.

    3. Apply the ToASCII algorithm to item label and store the result in item label.

    4. If origin label does not case-insensitively match item label, there is no match (terminate the overall algorithm).

      Otherwise, apply this set of steps to the next list item of both origin list and item list. If the origin list has no next list item, there is no match (terminate the overall algorithm). If the item list has no next list item, then go to the next step in the overall set of steps.

  14. There is a match. Terminate this algorithm.
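The matching steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation. The function names and the default port table (http and https only) are assumptions, and the access item is assumed to have the shape [scheme "://"] domain-pattern [":" port-pattern] without IP literals. ToASCII is approximated with Python's built-in "idna" codec. When the labels run out, the sketch tests the item list first, which is the reading that reproduces the example table; under this reading an access item such as example.org also matches an origin such as http://site.example.org.

```python
# Assumed default ports; other schemes are not handled by this sketch.
DEFAULT_PORTS = {"http": "80", "https": "443"}


def _split_scheme(value):
    """Split 'scheme://rest' into (lower-cased scheme or None, rest)."""
    scheme, separator, rest = value.partition("://")
    return (scheme.lower(), rest) if separator else (None, value)


def _split_port(value):
    """Split 'host:port' into (host, port or None)."""
    host, separator, port = value.rpartition(":")
    return (host, port) if separator else (value, None)


def _to_ascii(label):
    # Approximation of the ToASCII algorithm using Python's idna codec.
    return label.encode("idna").decode("ascii")


def access_item_matches(origin, item):
    if item == "*":                                    # step 2
        return True
    if origin == "null":                               # step 3
        return False
    origin_scheme, origin_rest = _split_scheme(origin)
    item_scheme, item_rest = _split_scheme(item)
    if item_scheme is not None and item_scheme != origin_scheme:
        return False                                   # step 5
    origin_host, origin_port = _split_port(origin_rest)
    item_host, item_port = _split_port(item_rest)
    if item_port is None:                              # step 4
        item_port = DEFAULT_PORTS.get(item_scheme or origin_scheme)
    if origin_port is None:                            # step 7
        origin_port = DEFAULT_PORTS.get(origin_scheme)
    if item_port != "*" and item_port != origin_port:  # step 8
        return False
    if item_host.endswith("."):                        # step 10
        item_host = item_host[:-1]
    origin_labels = origin_host.split(".")[::-1]       # steps 11 and 12
    item_labels = item_host.split(".")[::-1]
    for index, item_label in enumerate(item_labels):   # step 13
        if index >= len(origin_labels):
            return False  # origin ran out while item still has labels
        if item_label == "*":
            return True
        if origin_labels[index].lower() != _to_ascii(item_label).lower():
            return False
    return True  # item list exhausted: step 14
```

For instance, `access_item_matches("http://example.org:81", "example.org")` returns False because the item's port defaults to 80, while `access_item_matches("http://example.org:81", "example.org:*")` returns True.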

Requirements

While the requirements use "normative" terminology this appendix does not affect conformance and is therefore non-normative.

The requirements that influenced the design of the Access Control for Cross-site Requests specification are as follows:

  1. Must not introduce new attack vectors, such as:

    1. Must not introduce attack vectors to servers that are protected only by a firewall.

      The solution should not introduce additional attack vectors against services that are protected only by way of firewalls. This requirement addresses "intranet" style services that authorize any requests that can be sent to the service.

      Note that this requirement does not preclude HEAD, OPTIONS, or GET requests (even with ambient authentication and session information).

    2. It should not be possible to perform cross-site non-safe operations, i.e., HTTP operations except for GET, HEAD, and OPTIONS, without an authorization check being performed.

    3. Should try to prevent dictionary-based, distributed, brute-force attacks that try to get login accounts to 3rd party servers, to the extent possible.

    4. Should properly enforce security policy in the face of commonly deployed proxy servers sitting between the user agent and any of the servers with which the user agent is communicating.

    5. Should not allow loading and exposing of resources from 3rd party servers without explicit consent of these servers as such resources can contain sensitive information.

  2. Must not require content authors or site maintainers to implement new or additional security protections to preserve their existing level of security protection.

  3. Must be deployable to IIS and Apache without requiring actions by the server administrator in a configuration where the user can upload static files, run serverside scripts (such as PHP, ASP, and CGI), control HTTP headers, and control authorization, but only do this for URIs under a given set of subdirectories on the server.

  4. Must be able to deploy support for cross-site GET requests without having to use server-side scripting (such as PHP, ASP, or CGI) on IIS and Apache.

  5. The solution must be applicable to arbitrary media types. It must be deployable without requiring special packaging of resources, or changes to resources' content.

  6. It should be possible to configure distinct cross-site authorization policies for different target resources that reside within the same origin.

  7. It should be possible to distribute content of any type. Likewise, it should be possible to transmit content of any type to the server if the protocol in use allows such functionality.

  8. It should be possible to allow only specific servers, or sets of servers to fetch the resource.

  9. Must not require that the server filters the entity body of the resource in order to deny cross-site access to all resources on the server.

  10. Cross-site requests should not require API changes other than allowing cross-site requests. This means that the following examples should work for resources residing on http://test.invalid (modulo changes to the respective specifications to allow cross-site requests):

  11. It should be possible to issue methods other than GET to the server, such as POST and DELETE.

  12. Should be compatible with commonly used HTTP authentication and session management mechanisms. For example, on an IIS server, where authentication and session management are generally done by the server before ASP pages execute, this should also be possible for cross-site requests. The same applies to PHP on Apache.

  13. Should reduce the risk of inadvertently allowing access when it is not intended. That is, it should be clear to the content provider when access is granted and when it is not.

Use Cases

The use cases appendix documents several potential use cases that guided development of the Access Control work. This appendix does not affect conformance and is therefore non-normative.

Design Decision FAQ

This appendix documents several frequently asked questions and their corresponding responses. As it does not affect conformance it is non-normative.

Why is there a second check for non-GET requests?

For non-GET requests two access control checks are performed. Initially a "permission to make the request" check is done on the response to the method check request. And then a "permission to read" check is done on the response to the actual request. Both of these checks need to succeed in order for success to be relayed to the protocol (e.g. XMLHttpRequest).

The "permission to make the request" check is mostly performed because deployed servers don't expect cross-site non-GET requests, such as a DELETE request, to be made. If they reply positively to the method check request the client knows it can go ahead and perform the actual desired request.
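The order of the two checks can be illustrated with a small sketch. All names are hypothetical: the two request functions and the access control check are passed in as callables, and returning None stands in for relaying failure to the protocol.

```python
def perform_non_get_request(send_method_check, send_actual_request,
                            access_control_check):
    """Run both checks; return the actual response, or None on failure."""
    # "Permission to make the request": checked on the response to the
    # method check request, before the actual request is sent.
    method_check_response = send_method_check()
    if access_control_check(method_check_response) != "pass":
        return None
    # "Permission to read": checked on the response to the actual request.
    response = send_actual_request()
    if access_control_check(response) != "pass":
        return None
    return response
```

In this sketch the actual request is never sent when the first check fails, which is the property that protects servers that do not expect cross-site non-GET requests.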

Why is POST not treated identically to GET?

While most cross-site POST requests are already possible using an HTML form element with the enctype attribute set to text/plain, this only works if the receiving server does not check the submission media type.

Cross-site requests to a server that, for instance, accepts only application/xml submissions (and checks the media type strictly) are not possible using the HTML form element. As the policy described by this specification would allow cross-site POST requests carrying the media type application/xml (and others), a method check request has to be performed for POST as well to ensure that the server is OK with such a request. (We cannot introduce a new attack vector.)

Why are cookies and authentication information sent in the request?

Sending cookies and authentication information enables user-specific cross-site widgets (external XBL file). It also allows for a user authenticated data storage API that services can use to store data in.

Cookies and authentication information are already sent cross-site for various HTML elements, such as img, script, and form. This means that such a request does not introduce a new attack vector.

Why can cookies and authentication information not be provided by the script author for the request?

This would allow a dictionary-based, distributed search for cookies or user credentials.

Why is the client the policy enforcement point?

The client already is the policy enforcement point for these requests. The mechanism allows the server to opt in to letting the client expose the data. This is something clients currently do not do, and servers rely upon that.

Note however that the server is in full control. Based on the value of the Access-Control-Origin header in cross-site requests it can simply opt to return no data at all or not provide the necessary handshake (in the form of the Access-Control HTTP headers and <?access-control?> processing instructions).

What about the JSONRequest proposal?

JSONRequest has been considered by the Web Application Formats Working Group and the group has concluded that it does not meet the documented requirements. For instance, requests originating from the JSONRequest API do not have cookies or user credentials, and JSONRequest is format specific.

References

[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. IETF, March 1997.
[RFC2616]
Hypertext Transfer Protocol -- HTTP/1.1, R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, editors. IETF, June 1999.
[RFC3490]
Internationalizing Domain Names in Applications (IDNA), P. Faltstrom, P. Hoffman, A. Costello. IETF, March 2003.
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter, editors. IETF, January 2005.
[RFC3987]
Internationalized Resource Identifiers (IRIs), M. Duerst, M. Suignard, editors. IETF, January 2005.
[XML]
Extensible Markup Language (XML) 1.0 (Fourth Edition), T. Bray et al., editors. W3C, August 2006.
Namespaces in XML 1.0 (Second Edition), T. Bray et al., editors. W3C, August 2006.
[XMLSSPI]
Associating Style Sheets with XML documents, J. Clark, editor. W3C, June 1999.

Acknowledgments

The editor would like to thank Arthur Barstow, Benjamin Hawkes-Lewis, Björn Höhrmann, Cameron McCormack, David Håsäther, David Orchard, Dean Jackson, Eric Lawrence, Frank Ellerman, Frederick Hirsch, Graham Klyne, Hal Lockhart, Henri Sivonen, Ian Hickson, Jonas Sicking, Lachlan Hunt, Maciej Stachowiak, Marc Silbey, Marcos Caceres, Mark Nottingham, Martin Dürst, Matt Womer, Michael Smith, Mohamed Zergaoui, Sharath Udupa, Sunava Dutta, Surya Ismail, Thomas Roessler, Tyler Close, and Zhenbin Xu for their contributions to this specification.

Special thanks to Brad Porter, Matt Oshry and R. Auburn, who all helped edit earlier versions of this document.