<?xml version="1.0" standalone="no" ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<!--
http://xml.resource.org/authoring/rfc2629.dtd
-->
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc sortrefs='yes'?>
<?rfc autobreaks='no'?>
<rfc ipr="full3667" docName="draft-kindberg-tag-uri-07">
   <front>
      <title abbrev="Tag URIs">The 'tag' URI Scheme</title>
      <author initials="T.P.J.G." surname="Kindberg" fullname="Tim Kindberg">
         <organization>Hewlett-Packard Corporation</organization>
         <address>
            <postal>
               <street>Hewlett-Packard Laboratories</street>
               <street>Filton Road</street>
               <street>Stoke Gifford</street>
               <city>Bristol</city>
               <code>BS34 8QZ</code>
               <country>UK</country>
            </postal>
            <phone>+44 117 312 9920</phone>
            <email>timothy@hpl.hp.com</email>
         </address>
      </author>
      <author initials="S." surname="Hawke" fullname="Sandro Hawke">
         <organization>World Wide Web Consortium</organization>
         <address>
            <postal>
               <street>32 Vassar Street</street>
               <street>Building 32-G508</street>
               <city>Cambridge</city>
               <region>MA</region>
               <code>02139</code>
               <country>USA</country>
            </postal>
            <phone>+1 617 253-7288</phone>
            <email>sandro@w3.org</email>
         </address>
      </author>
      <date month="January" year="2005"/>
<!--
<area>Applications</area>
<keyword></keyword>
-->
      <abstract>
         <t>This document describes the "tag" Uniform Resource Identifier (URI) scheme.
         Tag URIs (also known as "tags") are designed to be unique across space and time while being tractable to humans. 
         They are distinct from most other URIs in that there is no authoritative resolution mechanism. 
         A tag may be used purely as an entity identifier. 
         Furthermore, using tags has some advantages over the common practice of using "http" URIs as
         identifiers for non-HTTP-accessible resources.</t>
      </abstract>
      <note title="Terminology">
         <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.</t>
      </note>
      <note title="Disclaimer">
         <t>The views and opinions of authors expressed herein do not
   necessarily state or reflect those of the World Wide Web Consortium,
   and may not be used for advertising or product endorsement purposes.
   This proposal has not undergone technical review within the
   Consortium and must not be construed as a Consortium recommendation.
</t>
      </note>
      <note title="Further Information and Discussion of this Document">
         <t>Further information about the tag URI scheme, covering its motivation, genesis and discussion, can be obtained from http://www.taguri.org.</t>
         <t>Earlier drafts of this document have been discussed on uri@w3.org. The authors welcome further discussion and comments.</t>
      </note>
   </front>
   <middle>
      <section title="Introduction" anchor="INTRO">
         <t>
            A tag is a type of Uniform Resource Identifier (URI)
            <xref target="RFC3986 "/>
            designed to meet the following requirements:
            <vspace blankLines="1"/>
         </t>
         <list style="numbers">
            <t>Identifiers are likely to be unique across space and time, and come from a practically inexhaustible supply.</t>
            <t>Identifiers are relatively convenient for humans to mint (create), read, type, remember, etc.</t>
            <t>No central registration is necessary, at least for holders of domain names or email addresses; and there is negligible cost to mint each new identifier.</t>
            <t>The identifiers are independent of any particular resolution scheme.</t>
         </list>
         <t>
            For example, the above requirements may apply in the case of a user who wants to place identifiers on their documents:
            <vspace blankLines="1"/>
         </t>
         <list style="letters">
            <t>The user wants to be reasonably sure that the identifier is unique. Global uniqueness is valuable because it prevents identifiers from becoming unintentionally ambiguous.</t>
            <t>The identifiers should be tractable to the user, who should, for example, be able to mint new identifiers conveniently, 
            to memorise them, and to type them into emails and forms.</t>
            <t>The user does not want to have to communicate with anyone else in order to mint identifiers for their documents.</t>
            <t>The user wants to avoid identifiers that might be taken to imply the existence of an electronic resource accessible via a default resolution mechanism, when no such electronic resource exists.</t>
         </list>
         <t>Existing identification schemes satisfy some but not all of the requirements above. For example:</t>
         <t>
            UUIDs <xref target="UUID"/>, <xref target="ISO11578"/> are hard for humans to read.
         </t>
         <t>
            OIDs <xref target="OID"/>, <xref target="RFC3061"/> and Digital Object Identifiers <xref target="DOI"/> 
            require entities to register as naming authorities, even in cases where the entity already holds a domain name registration.
         </t>
         <t>
            URLs (in particular, "http" URLs) are sometimes used as identifiers that satisfy most of the above requirements. 
            Many users and organisations have already registered a domain name, and the use of the domain name to mint 
            identifiers comes at no additional cost. But there are drawbacks to URLs-as-identifiers:
            <vspace blankLines="1"/>
         </t>
         <list style="symbols">
            <t>An attempt may be made to resolve a URL-as-identifier, even though there is no resource accessible at the "location".</t>
            <t>Domain names change hands, and the new assignee of a domain name cannot be sure that they are minting new names.
            For example, if example.org is assigned first to a user Smith and then to a user Jones, there is no systematic way for 
            Jones to tell whether Smith has already used a particular identifier such as http://example.org/9999.</t>
            <t>Entities could rely on purl.org or a similar service as a (first-come, first-served) assigner of unique URIs; 
            but a solution without reliance upon another entity such as the Online Computer Library
            Center (OCLC, which runs purl.org) may be preferable.
            </t>
         </list>
         <t>
         Lastly, many entities -- especially individuals -- are assignees of email addresses but not domain names. 
         It would be preferable to enable those entities to mint unique identifiers.
         </t>
      </section>
      <section title="Tag Syntax and Rules" anchor="TAG_SPEC">
         <t>This section first specifies the syntax of tag URIs and gives examples. It then describes a set of rules for minting tags designed to make them unique. Finally, it discusses the resolution and comparison of tags.</t>
         <section title="Tag Syntax and Examples" anchor="SYNTAX">
            <t>
               The general syntax of a tag URI, in ABNF <xref target="RFC2234"/>, is:
               <vspace blankLines="1"/>
            </t>
            <list style="empty">
               <t>tagURI        = "tag:" taggingEntity ":" specific [ "#" fragment ]</t>
            </list>
            <t>
               Where:
               <vspace blankLines="1"/>
            </t>
            <list style="empty">
               <t>taggingEntity = authorityName "," date</t>
               <t>authorityName = DNSname / emailAddress</t>
               <t>
                  date          = year ["-" month ["-" day]]
               </t>
               <t>
                  year          = 4DIGIT
               </t>
               <t>
                  month         = 2DIGIT
               </t>
               <t>
                  day           = 2DIGIT
               </t>
               <t>
                  DNSname       = DNScomp *( "." DNScomp )  ; see RFC 1035 <xref target="RFC1035"/>
               </t>
               <t>DNScomp       = alphaNum [*(alphaNum /"-") alphaNum]</t>
               <t>emailAddress  = 1*(alphaNum /"-"/"."/"_") "@" DNSname</t>
               <t>alphaNum      = DIGIT / ALPHA</t>
               <t>
                  specific      = *( pchar / "/" / "?" ) ; pchar from RFC 3986  <xref target="RFC3986 "/>
               </t>
               <t>
                  fragment      = *( pchar / "/" / "?" ) ; same as RFC 3986  <xref target="RFC3986 "/>
               </t>
            </list>
            <t>The component "taggingEntity" is the name space part of the URI. 
               To avoid ambiguity, the domain name in "authorityName" 
               (whether an email address or a simple domain name) MUST be fully qualified. It is 
               RECOMMENDED that the domain name be in lowercase form. Alternative formulations of the same authority name 
               will be counted as distinct and 
               hence tags containing them will be unequal (see
               <xref target="EQUALITY"/>). 
               For example, tags beginning "tag:EXAMPLE.com,2000:" are never equal to those beginning "tag:example.com,2000:", 
               even though they refer to the same domain name.
               
            </t>
            <t>Authority names could, in principle, belong to any syntactically distinct namespaces whose names are assigned to a unique entity at a time. Those include, for example, certain IP addresses, certain MAC addresses, and telephone numbers. However, to simplify the tag scheme, we restrict authority names to be domain names and email addresses. Future standards efforts may allow use of other authority names following syntax that is disjoint from this syntax. To allow for such developments, software that processes tags MUST NOT reject them on the grounds that they are outside the syntax defined above.</t>
            <t>
               The component "specific" is the name-space-specific part of the URI: 
               it is a string of URI characters (see restrictions in syntax specification) chosen by the minter of the URI.
               Note that the "specific" component allows
               for "query" subcomponents as defined in RFC 3986  <xref target="RFC3986 "/>. 
               It is RECOMMENDED that specific identifiers be human-friendly.
            </t>
            <t>
               Tag URIs may optionally end in a fragment identifier, in accordance with the general
               syntax of RFC 3986  <xref target="RFC3986 "/>. 
            </t>
            <t>
               In the interests of tractability to humans, tags SHOULD NOT
               be minted with percent-encoded parts.
               However, the tag syntax does allow percent-encoded characters, in the "pchar" elements
               (defined in RFC 3986  <xref target="RFC3986 "/>).
            </t>
            <t>
               Examples of tag URIs are:
               <vspace blankLines="1"/>
            </t>
            <list style="empty">
               <t>tag:timothy@hpl.hp.com,2001:web/externalHome</t>
               <t>tag:sandro@w3.org,2004-05:Sandro</t>
               <t>tag:my-ids.com,2001-09-15:TimKindberg:presentations:UBath2004-05-19</t>
               <t>tag:blogger.com,1999:blog-555</t>
               <t>tag:yaml.org,2002:int</t>
            </list>
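            <t>
               As an illustration only, the grammar above can be transcribed into a regular expression. 
               The following Python sketch is not part of this specification, and the names TAG_RE and parse_tag are hypothetical. 
               It checks only the shape of the date (digit counts), not whether the month and day are valid. 
               A result of None means only that the string lies outside the syntax defined above, 
               which by itself is not grounds for software to reject it.
            </t>
            <figure><artwork><![CDATA[
```python
import re

# Building blocks transcribed from the ABNF above.
ALNUM = r"[A-Za-z0-9]"
DNS_COMP = rf"{ALNUM}(?:[A-Za-z0-9-]*{ALNUM})?"
DNS_NAME = rf"{DNS_COMP}(?:\.{DNS_COMP})*"
EMAIL = rf"[A-Za-z0-9._-]+@{DNS_NAME}"
# year ["-" month ["-" day]]; digit counts only.
DATE = r"[0-9]{4}(?:-[0-9]{2}(?:-[0-9]{2})?)?"
# *( pchar / "/" / "?" ), with pchar as in RFC 3986.
CHARS = (r"(?:[A-Za-z0-9._~!$&'()*+,;=:@/?-]"
         r"|%[0-9A-Fa-f]{2})*")

TAG_RE = re.compile(
    rf"tag:(?P<authority>{EMAIL}|{DNS_NAME}),(?P<date>{DATE})"
    rf":(?P<specific>{CHARS})(?:#(?P<fragment>{CHARS}))?"
)


def parse_tag(uri):
    """Split a tag URI into its components, or return None."""
    match = TAG_RE.fullmatch(uri)
    return match.groupdict() if match else None


print(parse_tag("tag:yaml.org,2002:int"))
print(parse_tag("tag:sandro@w3.org,2004-05:Sandro"))
```
]]></artwork></figure>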
         </section>
         <section title="Rules for Minting Tags" anchor="RULES">
            <t>As <xref target="SYNTAX"/> has specified, each tag includes a "tagging entity" followed, optionally, by a specific identifier. 
               The tagging entity is designated by an "authority name" -- a fully qualified domain name or an email address 
               containing a fully qualified domain name -- followed by a date. The date is chosen to make the tagging entity globally unique, exploiting the fact that domain names and email addresses are assigned to at most one entity at a time. That entity then ensures that it mints unique identifiers.</t>
            <t>
               The date specifies, according to the Gregorian calendar and UTC, any particular day on which the authority name was assigned to the tagging entity at 00:00 UTC (the start of the day). The date MAY be a past or present date on which the authority name was assigned at that moment. The date is specified using one of the "YYYY", "YYYY-MM" and "YYYY-MM-DD" formats allowed by the ISO 8601 standard
               <xref target="ISO8601"/> (see also RFC 3339 <xref target="RFC3339"/>). The tag specification permits no other formats. Tagging entities MUST ascertain the date with sufficient accuracy to avoid accidentally using a date on which the authority name was not in fact assigned 
               (many computers and mobile devices have poorly synchronised clocks). 
               The date MUST be reckoned from UTC -- which may differ from the date in the tagging entity's local timezone at 00:00 UTC. 
               That distinction can generally be safely ignored in practice, but not on the day of the authority name's assignment. 
               In principle it would otherwise be possible on
               that day for the previous assignee and the new assignee to use the same date and thus mint the same tags.  
            </t>
            <t>In the interests of brevity, the month and day default to "01". A day value of "01" MAY be omitted; 
            a month value of "01" MAY be omitted unless it is followed by a day value other than "01". 
            For example, "2001-07" is the date 2001-07-01 and "2000" is the date 2000-01-01. 
            All date formulations specify a moment (00:00 UTC) of a single day, and not a period of a day or more such as 
            "the whole of July 2001" or "the whole of 2000". 
            Assignment at that moment is all that is required to use a given date.</t>
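            <t>
               The defaulting rule can be sketched in Python as follows. This is an illustration, not part of the specification, 
               and the function name tag_date_to_day is hypothetical. The function returns the single calendar day (reckoned in UTC) 
               that a tag date denotes. It is useful for reasoning about assignment, but not for comparing tags, 
               since tags are compared as character strings.
            </t>
            <figure><artwork><![CDATA[
```python
import re
from datetime import date

# year ["-" month ["-" day]], as in the tag syntax.
_DATE_RE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def tag_date_to_day(text):
    """Return the day whose 00:00 UTC moment the tag date denotes."""
    match = _DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a tag date: {text!r}")
    year, month, day = match.groups()
    # An omitted month or day defaults to "01".
    return date(int(year), int(month or "01"), int(day or "01"))


print(tag_date_to_day("2001-07"))
print(tag_date_to_day("2000"))
```
]]></artwork></figure>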
            <t>
               Tagging entities should be aware that alternative formulations of the same date will be counted as distinct and hence tags containing them will be unequal. 
               For example, tags beginning "tag:example.com,2000:" are never equal to those beginning 
               "tag:example.com,2000-01-01:", even though they refer to the same date (see
               <xref target="EQUALITY"/>).
            </t>
            <t>An entity MUST NOT mint tags under an authority name that was assigned to a different entity at 00:00 UTC on the given date, 
            and it MUST NOT mint tags under a future date.</t>
            <t>An entity that acquires an authority name immediately after a period during which the name was unassigned MAY mint tags as if the entity had been assigned the name during the unassigned period. This practice has considerable potential for error and MUST NOT be used unless the entity has substantial evidence that the name was unassigned during that period. The authors are currently unaware of any mechanism that would count as evidence, other than daily polling of the "whois" registry.</t>
            <t>For example, Hewlett-Packard holds the domain registration for hp.com and may mint any tags rooted at that name with a current or past date when it held the registration. It must not mint tags such as "tag:champignon.net,2001:" under domain names not registered to it. 
            It must not mint tags dated in the future, such as "tag:hp.com,2999:". If it obtains assignment of "extremelyunlikelytobeassigned.org" on 2001-05-01, then it must not mint tags under "extremelyunlikelytobeassigned.org,2001-04-01" unless it has evidence proving that that name was continuously unassigned between 2001-04-01 and 2001-05-01.</t>
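            <t>
               The two prohibitions above can be sketched as a simple check. The following Python fragment is an illustration under stated assumptions, 
               not part of the specification: the function may_mint and its parameters are hypothetical, the entity is assumed to know 
               the UTC instants between which it held the authority name, and the special case of a preceding unassigned period is deliberately not modelled.
            </t>
            <figure><artwork><![CDATA[
```python
from datetime import date, datetime, time, timezone


def may_mint(tag_day, held_from, held_until, today):
    """Check a tag date against one period of assignment.

    tag_day    -- the day denoted by the tag date (a date)
    held_from  -- UTC instant at which the name was assigned
    held_until -- UTC instant at which it lapsed, or None
    today      -- the current date, reckoned in UTC
    """
    if tag_day > today:
        return False  # future dates are never allowed
    # What matters is assignment at 00:00 UTC on the tag date.
    moment = datetime.combine(tag_day, time(0, 0),
                              tzinfo=timezone.utc)
    if moment < held_from:
        return False
    return held_until is None or moment < held_until


acquired = datetime(2001, 5, 1, 9, 30, tzinfo=timezone.utc)
now = date(2005, 1, 15)
print(may_mint(date(2001, 4, 1), acquired, None, now))
print(may_mint(date(2001, 5, 2), acquired, None, now))
```
]]></artwork></figure>
            <t>
               Note that, in this sketch, a name acquired at 09:30 UTC on 2001-05-01 cannot be used with the date 2001-05-01, 
               because the name was not yet assigned at 00:00 UTC on that day.
            </t>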
            <t>A tagging entity mints specific identifiers that are unique within its context, 
            in accordance with any internal scheme that uses only URI characters.
            Tagging entities SHOULD use record-keeping procedures to achieve uniqueness. 
            Some tagging entities (e.g. corporations, mailing lists) consist of many people, 
            in which case group decision-making SHOULD also be used to achieve uniqueness. 
            The outcome of such decision-making could be to delegate control over
            parts of the namespace. For example, the assignees of example.com could delegate control over all tags with 
            the prefixes tag:example.com,2004:fred: and tag:example.com,2004:bill: respectively to the individuals with 
            internal names "fred" and "bill" on 2004-01-01.</t>
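            <t>
               One possible form of record-keeping and delegation is sketched below in Python. It is an illustration only: 
               the class TagMinter and its methods are hypothetical, the records are kept in memory, 
               and a real tagging entity would need durable records shared by everyone entitled to mint.
            </t>
            <figure><artwork><![CDATA[
```python
class TagMinter:
    """Record-keeping minter for one part of a tag namespace."""

    def __init__(self, prefix):
        self.prefix = prefix      # e.g. "tag:example.com,2004:"
        self._issued = set()      # specifics minted here
        self._delegated = set()   # sub-prefixes handed to others

    def mint(self, specific):
        """Return a new tag, refusing duplicates."""
        if any(specific.startswith(d) for d in self._delegated):
            raise ValueError("that part of the namespace is delegated")
        if specific in self._issued:
            raise ValueError("already minted: " + specific)
        self._issued.add(specific)
        return self.prefix + specific

    def delegate(self, name):
        """Hand control of the prefix name + ":" to a new minter."""
        sub = name + ":"
        if any(sub.startswith(d) or d.startswith(sub)
               for d in self._delegated):
            raise ValueError("overlaps an existing delegation")
        if any(s.startswith(sub) for s in self._issued):
            raise ValueError("tags already minted under " + sub)
        self._delegated.add(sub)
        return TagMinter(self.prefix + sub)


root = TagMinter("tag:example.com,2004:")
fred = root.delegate("fred")
bill = root.delegate("bill")
print(fred.mint("report-1"))
print(bill.mint("report-1"))
```
]]></artwork></figure>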
         </section>
         <section title="Resolution of Tags" anchor="RESOLUTION">
            <t>
               There is no authoritative resolution mechanism for tags. 
               Unlike most other URIs, tags can only be used as identifiers, 
               and are not designed to support resolution. If authoritative resolution is a desired feature, 
               a different URI scheme should be used.
<!--
However, the denotation of a tag SHOULD be unambiguous in the context of its use. (As the guidelines in RFC 2718, Section 2.2
               <xref target="RFC2718"/>
               state, the resource that a URI identifies should be well defined.)
-->
<!--
However, that does not entail an authoritative resolution mechanism; for example, "mid" and "cid" URIs <xref 
target="RFC2392" /> simply label messages and associated content. 
-->
            </t>
         </section>
         <section title="Equality of Tags" anchor="EQUALITY">
            <t>Tags are simply strings of characters and are considered equal if and only if they are completely indistinguishable in their machine representations when using the same character encoding. 
               That is, one can compare tags for equality by comparing the numeric codes of their characters, in sequence, for numeric equality.
               This criterion for equality allows for simplification of tag-handling software, 
               which does not have to transform tags in any way to compare them.
            </t>
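            <t>
               In most programming languages this criterion is the ordinary string equality test. The following Python sketch is illustrative only, 
               and the function name tags_equal is hypothetical. It applies no case folding, date expansion, or percent-decoding, 
               so alternative formulations of the same authority name or date compare as unequal.
            </t>
            <figure><artwork><![CDATA[
```python
def tags_equal(a, b):
    """Compare two tags character by character, unchanged."""
    return a == b


# Identical strings are equal.
print(tags_equal("tag:example.com,2000:x",
                 "tag:example.com,2000:x"))
# Same domain name, different case: unequal.
print(tags_equal("tag:EXAMPLE.com,2000:x",
                 "tag:example.com,2000:x"))
# Same date, different formulation: unequal.
print(tags_equal("tag:example.com,2000:x",
                 "tag:example.com,2000-01-01:x"))
```
]]></artwork></figure>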
         </section>
      </section>
      <section title="Security Considerations" anchor="SECURITY">
         <t>Minting a tag, by itself, is an operation internal to the tagging entity with no external consequences. The consequences of using an improperly minted tag (due to malice or error) in an application depend on the application, and must be considered in the design of any application that uses tags.</t>
         <t>There is a significant possibility of minting errors by people who fail to apply the rules governing dates, or who use a shared (organizational) authority-name without prior organization-wide agreement. Tag-aware software MAY help catch and warn against these errors. As stated in <xref target="SYNTAX"/>, however, to allow for future expansion, software MUST NOT reject tags that do not conform to the syntax specified there.</t>
         <t>A malicious party could make it appear that the same domain name or email address was assigned to each of two or more entities. Tagging entities SHOULD use reputable assigning authorities, and verify assignment wherever possible.</t>
         <t>Entities SHOULD also avoid the potential for malicious exploitation of clock skew, by using authority names that were assigned continuously from well before to well after 00:00 UTC on the date chosen for the tagging entity -- preferably by intervals in the order of days.</t>
      </section>
      <section title="IANA Considerations" anchor="IANA">
         <t>
             The IANA is asked to register the tag URI scheme as specified in this document
             and summarised in the following template:
         </t>
<t>
URI scheme name: tag
</t> 
<t>
Status: permanent
</t> 
<t>
URI scheme syntax: see <xref target="TAG_SPEC"/>
</t> 
<t>
Character encoding considerations: percent-encoding is allowed in 'specific' and 'fragment' components (see <xref target="TAG_SPEC"/>)
</t> 
<t>
Intended usage: see <xref target="INTRO"/> and <xref target="RESOLUTION"/>
</t> 
<t>
Applications and/or protocols that use this URI scheme name: Any applications that use URIs as identifiers without 
requiring dereference, such as RDF, YAML, and Atom.
</t> 
<t>
Interoperability considerations: none 
</t> 
<t>
Security considerations: see <xref target="SECURITY"/>
</t> 
<t>
Relevant publications: none
</t> 
<t>
Contact: Tim Kindberg (timothy@hpl.hp.com) and Sandro Hawke (sandro@w3.org)
</t> 
<t>
Author/Change controller: Tim Kindberg and Sandro Hawke 
</t> 
      </section>
   </middle>
   <back>
      <references title="Normative References">
         <reference anchor="ISO8601">
            <front>
               <title>Data elements and interchange formats -- Information interchange -- Representation of dates and times</title>
               <date month="" year="1988"/>
            </front>
            <seriesInfo name="ISO (International Organization for Standardization)" value="ISO 8601:1988"/>
         </reference>

         <reference anchor="RFC3986 ">
            <front>
               <title>Uniform Resource Identifier (URI): Generic Syntax</title>
               <author initials="T." surname="Berners-Lee" fullname="Tim Berners-Lee"/>
               <author initials="R." surname="Fielding" fullname="Roy Fielding"/>
               <author initials="L." surname="Masinter" fullname="Larry Masinter"/>
               <date month="January" year="2005"/>
            </front>
            <seriesInfo name="RFC" value="3986"/>
         </reference>

         <?rfc include='reference.RFC.2234.xml'?>
         <?rfc include='reference.RFC.1035.xml'?>
      </references>
      <references title="Informative References">
         <reference anchor="DOI">
            <front>
               <title>Information Identifiers</title>
               <author initials="N." surname="Paskin" fullname="Norman Paskin"/>
               <date month="April" year="1997"/>
            </front>
            <seriesInfo name="Learned Publishing" value="Vol. 10, No. 2, pp. 135-156"/>
            <seriesInfo name="" value="(see also www.doi.org)"/>
         </reference>
         <reference anchor="ISO11578">
            <front>
               <title>Information technology - Open Systems Interconnection - Remote Procedure Call (RPC)</title>
               <date month="" year="1996"/>
            </front>
            <seriesInfo name="ISO (International Organization for Standardization)" value="ISO/IEC 11578:1996"/>
         </reference>
         <reference anchor="OID">
            <front>
               <title>Specification of abstract syntax notation one (ASN.1)</title>
               <date month="" year="1988"/>
            </front>
            <seriesInfo name="ITU-T recommendation" value="X.208"/>
            <seriesInfo name="" value="(see also RFC 1778)"/>
         </reference>
<!--
         <?rfc include='reference.RFC.2392.xml'?>
         <?rfc include='reference.RFC.2718.xml'?>
-->
         <?rfc include='reference.RFC.3061.xml'?>
         <?rfc include='reference.RFC.3339.xml'?>
         <reference anchor="UUID">
            <front>
               <title>UUIDs and GUIDs</title>
               <author initials="P." surname="Leach" fullname="Paul Leach"/>
               <author initials="R." surname="Salz" fullname="Rich Salz"/>
               <date month="" year="1997"/>
            </front>
            <seriesInfo name="Internet-Draft" value="draft-leach-uuids-01"/>
         </reference>
      </references>
   </back>
</rfc>