| TOC |
|
This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 25, 2004.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This memo defines the core features of the Extensible Messaging and Presence Protocol (XMPP), a protocol for streaming XML[1] elements in order to exchange messages and presence information in close to real time. While XMPP provides a generalized, extensible framework for transporting structured information, it is used mainly for the purpose of building instant messaging and presence applications that meet the requirements of RFC 2779.
| TOC |
| TOC |
The Extensible Messaging and Presence Protocol (XMPP) is an open XML[1] protocol for near-real-time messaging, presence, and request-response services. The basic syntax and semantics were developed originally within the Jabber open-source community, mainly in 1999. In 2002, the XMPP WG was chartered with developing an adaptation of the Jabber protocol that would be suitable as an IETF instant messaging (IM) and presence technology. As a result of work by the XMPP WG, the current memo defines the core features of XMPP; XMPP IM[21] defines the extensions required to provide the instant messaging and presence functionality defined in RFC 2779[2].
The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119[3].
The authors welcome discussion and comments related to the topics presented in this document. The preferred forum is the <xmppwg@jabber.org> mailing list, for which archives and subscription information are available at <http://www.jabber.org/cgi-bin/mailman/listinfo/xmppwg/>.
This document is in full compliance with all provisions of Section 10 of RFC 2026. Parts of this specification use the term "jabber" for identifying namespaces and other protocol syntax. Jabber[tm] is a registered trademark of Jabber, Inc. Jabber, Inc. grants permission to the IETF for use of the Jabber trademark in association with this specification and its successors, if any.
| TOC |
Although XMPP is not wedded to any specific network architecture, to date it usually has been implemented via a typical client-server architecture, wherein a client utilizing XMPP accesses a server over a TCP[4] socket.
The following diagram provides a high-level overview of this architecture (where "-" represents communications that use XMPP and "=" represents communications that use any other protocol).
C1 - S1 - S2 - C3
/ \
C2 - G1 = FN1 = FC1
The symbols are as follows:
A server acts as an intelligent abstraction layer for XMPP communications. Its primary responsibilities are to manage connections from or sessions for other entities (in the form of XML streams to and from authorized clients, servers, and other entities) and to route appropriately-addressed XML stanzas among such entities over XML streams. Most XMPP-compliant servers also assume responsibility for the storage of data that is used by clients (e.g., contact lists for users of XMPP-based instant messaging and presence applications); in this case, the XML data is processed directly by the server itself on behalf of the client and is not routed to another entity. Compliant server implementations MUST ensure in-order processing of XML stanzas between any two entities.
Most clients connect directly to a server over a TCP socket and use XMPP to take full advantage of the functionality provided by a server and any associated services. Although there is no necessary coupling of an XML stream to a TCP socket (e.g., a client COULD connect via HTTP[22] polling or some other mechanism), this specification defines a binding of XMPP to TCP only. Multiple resources (e.g., devices or locations) MAY connect simultaneously to a server on behalf of each authorized client, with each resource differentiated by the resource identifier of a JID (e.g., <node@domain/home> vs. <node@domain/work>) as defined under Addressing Scheme. The RECOMMENDED port for connections between a client and a server is 5222, as registered with the Internet Assigned Numbers Authority (IANA)[5] (see Port Numbers).
A gateway is a special-purpose server-side service whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging system, as well as to translate the return data back into XMPP. Examples are gateways to Internet Relay Chat (IRC), Short Message Service (SMS), SIMPLE, SMTP, and legacy instant messaging networks such as AIM, ICQ, MSN Messenger, and Yahoo! Instant Messenger. Communications between gateways and servers, and between gateways and the foreign messaging system, are not defined in this document.
Because each server is identified by a network address and because server-to-server communications are a straightforward extension of the client-to-server protocol, in practice the system consists of a network of servers that inter-communicate. Thus user-a@domain1 is able to exchange messages, presence, and other information with user-b@domain2. This pattern is familiar from messaging protocols (such as SMTP) that make use of network addressing standards. Communications between any two servers are OPTIONAL. If enabled, such communications SHOULD occur over XML streams that are bound to TCP sockets. The RECOMMENDED port for connections between servers is 5222, as registered with the Internet Assigned Numbers Authority (IANA)[5] (see Port Numbers).
| TOC |
An entity is anything that can be considered a network endpoint (i.e., an ID on the network) and that can communicate using XMPP. All such entities are uniquely addressable in a form that is consistent with RFC 2396[23]. For historical reasons, the address of such an entity is called a Jabber Identifier or JID. A valid JID contains a set of ordered elements formed of a domain identifier, node identifier, and resource identifier in the following format: [node@]domain[/resource]. Each allowable portion of a JID (node identifier, domain identifier, and resource identifier) may be up to 1023 bytes in length, resulting in a maximum total size (including the '@' and '/' separators) of 3071 bytes.
All JIDs are based on the foregoing structure. The most common use of this structure is to identify an instant messaging user, the server to which the user connects, and the user's active session or connection (e.g., a specific client) in the form of <user@host/resource>. However, node types other than clients are possible; for example, a specific chat room offered by a multi-user chat service could be addressed as <room@service> (where "room" is the name of the chat room and "service" is the hostname of the multi-user chat service) and a specific occupant of such a room could be addressed as <room@service/nick> (where "nick" is the occupant's room nickname). Many other JID types are possible (e.g., <domain/resource> could be a server-side script or service).
The domain identifier is the primary identifier and is the only REQUIRED element of a JID (a mere domain identifier is a valid JID). It usually represents the network gateway or "primary" server to which other entities connect for XML routing and data management capabilities. However, the entity referenced by a domain identifier is not always a server, and may be a service that is addressed as a subdomain of a server and that provides functionality above and beyond the capabilities of a server (e.g., a multi-user chat service, a user directory, or a gateway to a foreign messaging system).
The domain identifier for every server or service that will communicate over a network SHOULD resolve to a Fully Qualified Domain Name. A domain identifier MUST be not more than 1023 bytes in length and MUST conform to the Nameprep[6] profile of stringprep[7].
The node identifier is an optional secondary identifier placed before the domain identifier and separated from the latter by the '@' character. It usually represents the entity requesting and using network access provided by the server or gateway (i.e., a client), although it can also represent other kinds of entities (e.g., a chat room associated with a multi-user chat service). The entity represented by a node identifier is addressed within the context of a specific domain; within instant messaging and presence applications of XMPP this address is called a "bare JID" and is of the form <node@domain>.
A node identifier MUST be no more than 1023 bytes in length and MUST conform to the Nodeprep profile of stringprep[7].
The resource identifier is an optional tertiary identifier placed after the domain identifier and separated from the latter by the '/' character. A resource identifier may modify either a <node@domain> or mere <domain> address. It usually represents a specific session, connection (e.g., a device or location), or object (e.g., a participant in a multi-user chat room) belonging to the entity associated with a node identifier. A resource identifier is opaque to both servers and other clients, and is typically defined by a client implementation when it provides the information necessary to complete Resource Binding (although it may be generated by a server on behalf of a client). An entity may maintain multiple resources simultaneously.
A resource identifier MUST be no more than 1023 bytes in length and MUST conform to the Resourceprep profile of stringprep[7].
The syntax for a JID is defined below using Augmented Backus-Naur Form as defined in RFC 2234[8]. The IPv4address and IPv6address rules are defined in Appendix B of RFC 2373[9]; the hostname rule is defined in Section 3.2.2 of RFC 2396[23]; the allowable character sequences that conform to the node rule are defined by the Nodeprep profile of stringprep[7] as documented in this memo; the allowable character sequences that conform to the resource rule are defined by the Resourceprep profile of stringprep[7] as documented in this memo.
jid = [ node "@" ] domain [ "/" resource ]
domain = hostname / IPv4address / IPv6address
After Stream Authentication and, if appropriate, Resource Binding, the receiving entity for a stream MUST determine the initiating entity's JID.
For server-to-server communications, the initiating entity's JID SHOULD be the authorization identity, derived from the authentication identity as defined in RFC 2222[13] if no authorization identity was specified during stream authentication.
For client-to-server communications, the "bare JID" (<node@domain>) SHOULD be the authorization identity, derived from the authentication identity as defined in RFC 2222[13] if no authorization identity was specified during stream authentication; the resource identifier portion of the "full JID" (<node@domain/resource>) SHOULD be the resource identifier negotiated by the client and server during Resource Binding.
The receiving entity MUST ensure that the resulting JID (including node identifier, domain identifier, resource identifier, and separator characters) conforms to the rules and formats defined earlier in this section.
| TOC |
Two fundamental concepts make possible the rapid, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and XML stanzas. These terms are defined as follows:
- Definition of XML Stream:
- An XML stream is a container for the exchange of XML elements between any two entities over a network. An XML stream is negotiated from an initiating entity (usually a client or server) to a receiving entity (usually a server), normally over a TCP socket, and corresponds to the initiating entity's "session" with the receiving entity. The start of the XML stream is denoted unambiguously by an opening XML <stream> tag (with appropriate attributes and namespace declarations), while the end of the XML stream is denoted unambiguously by a closing XML </stream> tag. An XML stream is unidirectional; in order to enable bidirectional information exchange, the initiating entity and receiving entity MUST negotiate one stream in each direction (the "initial stream" and the "response stream"), normally over the same TCP connection.
- Definition of XML Stanza:
- An XML stanza is a discrete semantic unit of structured information that is sent from one entity to another over an XML stream. An XML stanza exists at the direct child level of the root <stream/> element and is said to be well-balanced if it matches production [43] content of the XML specification[1]). The start of any XML stanza is denoted unambiguously by the element start tag at depth=1 of the XML stream (e.g., <presence>), and the end of any XML stanza is denoted unambiguously by the corresponding close tag at depth=1 (e.g., </presence>). An XML stanza MAY contain child elements (with accompanying attributes, elements, and CDATA) as necessary in order to convey the desired information. The only defined XML stanzas are <message/>, <presence/>, and <iq/> as defined under XML Stanzas; an XML element sent for the purpose of stream encryption, stream authentication, or server dialback is not considered to be an XML stanza.
Consider the example of a client's session with a server. In order to connect to a server, a client MUST initiate an XML stream by sending an opening <stream> tag to the server, optionally preceded by a text declaration specifying the XML version and the character encoding supported (see Inclusion of Text Declaration; see also Character Encoding). Subject to local policies and service provisioning, the server SHOULD then reply with a second XML stream back to the client, again optionally preceded by a text declaration. Once the client has completed Stream Authentication, the client MAY send an unbounded number of XML stanzas over the stream to any recipient on the network. When the client desires to close the stream, it simply sends a closing </stream> tag to the server (alternatively, the stream may be closed by the server), after which both the client and server SHOULD close the underlying TCP connection as well.
Those who are accustomed to thinking of XML in a document-centric manner may wish to view a client's session with a server as consisting of two open-ended XML documents: one from the client to the server and one from the server to the client. From this perspective, the root <stream/> element can be considered the document entity for each "document", and the two "documents" are built up through the accumulation of XML stanzas sent over the two XML streams. However, this perspective is a convenience only, and XMPP does not deal in documents but in XML streams and XML stanzas.
In essence, then, an XML stream acts as an envelope for all the XML stanzas sent during a session. We can represent this in a simplistic fashion as follows:
|--------------------|
| <stream> |
|--------------------|
| <presence> |
| <show/> |
| </presence> |
|--------------------|
| <message to='foo'> |
| <body/> |
| </message> |
|--------------------|
| <iq to='bar'> |
| <query/> |
| </iq> |
|--------------------|
| ... |
|--------------------|
| </stream> |
|--------------------|
The attributes of the stream element are as follows:
We can summarize as follows:
| initiating to receiving | receiving to initiating
---------+---------------------------+-----------------------
to | hostname of receiver | silently ignored
from | silently ignored | hostname of receiver
id | silently ignored | session key
xml:lang | default language | default language
version | signals XMPP 1.0 support | signals XMPP 1.0 support
The following rules apply to the generation and handling of the 'version' attribute:
The stream element MUST possess both a streams namespace declaration and a default namespace declaration (as "namespace declaration" is defined in the XML namespaces specification[10]). For detailed information regarding the streams namespace and default namespace, see Namespace Names and Prefixes.
If the initiating entity includes the 'version' attribute set to a value of "1.0" in the initial stream header, the receiving entity MUST send a <features/> child element (prefixed by the streams namespace prefix) to the initiating entity in order to announce any stream-level features that can be negotiated (or capabilities that otherwise need to be advertised). Currently this is used only for Stream Encryption, Stream Authentication, and Resource Binding as defined herein, and for Session Establishment as defined in XMPP IM[21]; however, the stream features functionality could be used to advertise other negotiable features in the future. If an entity does not understand or support some features, it SHOULD silently ignore them.
XML streams in XMPP 1.0 SHOULD be encrypted as defined under Stream Encryption and MUST be authenticated as defined under Stream Authentication. If the initiating entity attempts to send an XML Stanza before the stream has been authenticated, the receiving entity SHOULD return a <not-authorized/> stream error to the initiating entity and then terminate both the XML stream and the underlying TCP connection.
The root stream element MAY contain an <error/> child element that is prefixed by the streams namespace prefix. The error child MUST be sent by a compliant entity (usually a server rather than a client) if it perceives that a stream-level error has occurred.
The following rules apply to stream-level errors:
The syntax for stream errors is as follows:
<stream:error>
<defined-condition xmlns='urn:ietf:params:xml:ns:xmpp-streams'/>
<text xmlns='urn:ietf:params:xml:ns:xmpp-streams'>
OPTIONAL descriptive text
</text>
[OPTIONAL application-specific condition element]
</stream:error>
The <error/> element:
The <text/> element is OPTIONAL. If included, it SHOULD be used only to provide descriptive or diagnostic information that supplements the meaning of a defined condition or application-specific condition. It SHOULD NOT be interpreted programmatically by an application. It SHOULD NOT be used as the error message presented to a user, but MAY be shown in addition to the error message associated with the included condition element (or elements).
The following stream-level error conditions are defined:
As noted, an application MAY provide application-specific stream error information by including a properly-namespaced child in the error element. The application-specific element SHOULD supplement or further qualify a defined element. Thus the <error/> element will contain two or three child elements:
<stream:error>
<xml-not-well-formed
xmlns='urn:ietf:params:xml:ns:xmpp-streams'/>
<text xml:lang='en' xmlns='urn:ietf:params:xml:ns:xmpp-streams'>
Some special application diagnostic information!
</text>
<escape-your-data xmlns='application-ns'/>
</stream:error>
</stream:stream>
This section contains two simplified examples of a stream-based "session" of a client on a server (where the "C" lines are sent from the client to the server, and the "S" lines are sent from the server to the client); these examples are included for the purpose of illustrating the concepts introduced thus far.
A basic "session":
C: <?xml version='1.0'?>
<stream:stream
to='example.com'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
S: <?xml version='1.0'?>
<stream:stream
from='example.com'
id='someid'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
... encryption, authentication, and resource binding ...
C: <message from='juliet@example.com'
to='romeo@example.net'
xml:lang='en'>
C: <body>Art thou not Romeo, and a Montague?</body>
C: </message>
S: <message from='romeo@example.net'
to='juliet@example.com'
xml:lang='en'>
S: <body>Neither, fair saint, if either thee dislike.</body>
S: </message>
C: </stream:stream>
S: </stream:stream>
A "session" gone bad:
C: <?xml version='1.0'?>
<stream:stream
to='example.com'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
S: <?xml version='1.0'?>
<stream:stream
from='example.com'
id='someid'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
... encryption, authentication, and resource binding ...
C: <message xml:lang='en'>
<body>Bad XML, no closing body tag!
</message>
S: <stream:error>
<xml-not-well-formed
xmlns='urn:ietf:params:xml:ns:xmpp-streams'/>
</stream:error>
S: </stream:stream>
| TOC |
XMPP includes a method for securing the stream from tampering and eavesdropping. This channel encryption method makes use of the Transport Layer Security (TLS)[11] protocol, along with a "STARTTLS" extension that is modelled after similar extensions for the IMAP[25], POP3[26], and ACAP[27] protocols as described in RFC 2595[28]. The namespace name for the STARTTLS extension is 'urn:ietf:params:xml:ns:xmpp-tls', which adheres to the format defined in The IETF XML Registry[24].
An administrator of a given domain MAY require the use of TLS for client-to-server communications, server-to-server communications, or both. Clients SHOULD use TLS to secure the streams prior to attempting to complete Stream Authentication, and servers SHOULD use TLS between two domains for the purpose of securing server-to-server communications.
The following rules apply:
- Case 1 -- The initiating entity has been configured with a set of trusted root certificates:
- Normal certificate validation processing is appropriate, and SHOULD follow the rules defined for HTTP over TLS[12]. The trusted roots may be either a well-known public set or a manually configured Root CA (e.g., an organization's own Certificate Authority or a self-signed Root CA for the service as defined under High Security). This case is RECOMMENDED.
- Case 2 -- The initiating entity has been configured with the receiving entity's self-signed service certificate:
- Simple comparison of public keys is appropriate. This case is NOT RECOMMENDED (see High Security for details).
If the above methods fail, the certificate SHOULD be presented to a human (e.g., an end user or server administrator) for approval; if presented, the receiver MUST deliver the entire certificate chain to the human, who SHOULD be given the option to store the Root CA certificate (not the service or End Entity certificate) and to not be queried again regarding acceptance of the certificate for some reasonable period of time.
When an initiating entity secures a stream with a receiving entity, the steps involved are as follows:
The following example shows the data flow for a client securing a stream using STARTTLS (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example).
Step 1: Client initiates stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 2: Server responds by sending a stream tag to client:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='c2s_123'
from='example.com'
version='1.0'>
Step 3: Server sends the STARTTLS extension to client along with authentication mechanisms and any other stream features:
<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
</mechanisms>
</stream:features>
Step 4: Client sends the STARTTLS command to server:
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5: Server informs client to proceed:
<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5 (alt): Server informs client that TLS negotiation has failed and closes both stream and TCP connection:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
</stream:stream>
Step 6: Client and server attempt to complete TLS negotiation over the existing TCP connection.
Step 7: If TLS negotiation is successful, client initiates a new stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 7 (alt): If TLS negotiation is unsuccessful, Server2 closes TCP connection.
Step 8: Server responds by sending a stream header to client along with any available stream features:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
from='example.com'
id='c2s_234'
version='1.0'>
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
<mechanism>EXTERNAL</mechanism>
</mechanisms>
</stream:features>
Step 9: Client continues with Stream Authentication.
The following example shows the data flow for two servers securing a stream using STARTTLS (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example).
Step 1: Server1 initiates stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 2: Server2 responds by sending a stream tag to Server1:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
from='example.com'
id='s2s_123'
version='1.0'>
Step 3: Server2 sends the STARTTLS extension to Server1 along with authentication mechanisms and any other stream features:
<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
<required/>
</starttls>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
</mechanisms>
</stream:features>
Step 4: Server1 sends the STARTTLS command to Server2:
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5: Server2 informs Server1 to proceed:
<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5 (alt): Server2 informs Server1 that TLS negotiation has failed and closes stream:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
</stream:stream>
Step 6: Server1 and Server2 attempt to complete TLS negotiation via TCP.
Step 7: If TLS negotiation is successful, Server1 initiates a new stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 7 (alt): If TLS negotiation is unsuccessful, server closes TCP connection.
Step 8: Server2 responds by sending a stream header to Server1 along with any available stream features:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
from='example.com'
id='s2s_234'
version='1.0'>
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
<mechanism>EXTERNAL</mechanism>
</mechanisms>
</stream:features>
Step 9: Server1 continues with Stream Authentication.
| TOC |
XMPP includes a method for authenticating a stream by means of an XMPP-specific profile of the Simple Authentication and Security Layer (SASL)[13]. SASL provides a generalized method for adding authentication support to connection-based protocols, and XMPP uses a generic XML namespace profile for SASL that conforms to Section 4 ("Profiling Requirements") of RFC 2222[13] (the namespace name that qualifies XML elements used in stream authentication is 'urn:ietf:params:xml:ns:xmpp-sasl', which adheres to the format defined in The IETF XML Registry[24]).
The following rules apply:
When an initiating entity authenticates with a receiving entity, the steps involved are as follows:
This series of challenge/response pairs continues until one of three things happens:
Section 4 of the SASL specification[13] requires that the following information be supplied by a protocol definition:
- service name:
- "xmpp"
- initiation sequence:
- After the initiating entity provides an opening XML stream header and the receiving entity replies in kind, the receiving entity provides a list of acceptable authentication methods. The initiating entity chooses one method from the list and sends it to the receiving entity as the value of the 'mechanism' attribute possessed by an <auth/> element, optionally including an initial response to avoid a round trip.
- exchange sequence:
- Challenges and responses are carried through the exchange of <challenge/> elements from receiving entity to initiating entity and <response/> elements from initiating entity to receiving entity. The receiving entity reports failure by sending a <failure/> element and success by sending a <success/> element; the initiating entity aborts the exchange by sending an <abort/> element. Upon successful negotiation, both sides consider the original XML stream to be closed and new stream headers are sent by both entities.
- security layer negotiation:
- The security layer takes effect immediately after sending the closing ">" character of the <success/> element for the receiving entity, and immediately after receiving the closing ">" character of the <success/> element for the initiating entity. The order of layers is first TCP, then TLS, then SASL, then XMPP.
- use of the authorization identity:
- The authorization identity may be used by xmpp to denote the <node@domain> of a client or the sending <domain> of a server.
The following SASL-related error conditions are defined:
The following example shows the data flow for a client authenticating with a server using SASL, normally after successful TLS negotiation (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example).
Step 1: Client initiates stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 2: Server responds with a stream tag sent to client:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='c2s_234'
from='example.com'
version='1.0'>
Step 3: Server informs client of available authentication mechanisms:
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
</mechanisms>
</stream:features>
Step 4: Client selects an authentication mechanism:
<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl'
mechanism='DIGEST-MD5'/>
Step 5: Server sends a base64[14]-encoded challenge to client:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cmVhbG09InNvbWVyZWFsbSIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgi LGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNzCg== </challenge>
The decoded challenge is:
realm="somerealm",nonce="OA6MG9tEQGm2hh",\
qop="auth",charset=utf-8,algorithm=md5-sess
Step 5 (alt): Server returns error to client:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<incorrect-encoding/>
</failure>
</stream:stream>
Step 6: Client sends a base64[14]-encoded response to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9InNvbWVub2RlIixyZWFsbT0ic29tZXJlYWxtIixub25jZT0i T0E2TUc5dEVRR20yaGgiLGNub25jZT0iT0E2TUhYaDZWcVRyUmsiLG5jPTAw MDAwMDAxLHFvcD1hdXRoLGRpZ2VzdC11cmk9InhtcHAvZXhhbXBsZS5jb20i LHJlc3BvbnNlPWQzODhkYWQ5MGQ0YmJkNzYwYTE1MjMyMWYyMTQzYWY3LGNo YXJzZXQ9dXRmLTgK </response>
The decoded response is:
username="somenode",realm="somerealm",\
nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\
nc=00000001,qop=auth,digest-uri="xmpp/example.com",\
response=d388dad90d4bbd760a152321f2143af7,charset=utf-8
Step 7: Server sends another base64[14]-encoded challenge to client:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZAo= </challenge>
The decoded challenge is:
rspauth=ea40f60335c427b5527b84dbabcdfffd
Step 7 (alt): Server returns error to client:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism-too-weak/>
</failure>
</stream:stream>
Step 8: Client responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9: Server informs client of successful authentication:
<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9 (alt): Server informs client of failed authentication:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<temporary-auth-failure/>
</failure>
</stream:stream>
Step 10: Client initiates a new stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 11: Server responds by sending a stream header to client along with any additional features (or an empty features element):
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='c2s_345'
from='example.com'
version='1.0'>
<stream:features>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<session xmlns='urn:ietf:params:xml:ns:xmpp-session'>
</stream:features>
The following example shows the data flow for a server authenticating with another server using SASL, normally after successful TLS negotiation (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example).
Step 1: Server1 initiates stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 2: Server2 responds with a stream tag sent to Server1:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
from='example.com'
id='s2s_234'
version='1.0'>
Step 3: Server2 informs Server1 of available authentication mechanisms:
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
</mechanisms>
</stream:features>
Step 4: Server1 selects an authentication mechanism:
<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl'
mechanism='DIGEST-MD5'/>
Step 5: Server2 sends a base64[14]-encoded challenge to Server1:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9InNvbWVkb21haW4iLHJlYWxtPSJzb21lcmVhbG0iLG5vbmNl PSJPQTZNRzl0RVFHbTJoaCIscW9wPSJhdXRoIixjaGFyc2V0PXV0Zi04LGFs Z29yaXRobT1tZDUtc2Vzcwo= </challenge>
The decoded challenge is:
username="somedomain",realm="somerealm",\
nonce="OA6MG9tEQGm2hh",qop="auth",\
charset=utf-8,algorithm=md5-sess
Step 5 (alt): Server2 returns error to Server1:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<incorrect-encoding/>
</failure>
</stream:stream>
Step 6: Server1 sends a base64[14]-encoded response to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9InNvbWVkb21haW4iLHJlYWxtPSJzb21lcmVhbG0iLG5vbmNl PSJPQTZNRzl0RVFHbTJoaCIsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsbmM9 MDAwMDAwMDEscW9wPWF1dGgsZGlnZXN0LXVyaT0ieG1wcC9leGFtcGxlLmNv bSIscmVzcG9uc2U9ZDM4OGRhZDkwZDRiYmQ3NjBhMTUyMzIxZjIxNDNhZjcs Y2hhcnNldD11dGYtOAo= </response>
The decoded response is:
username="somedomain",realm="somerealm",\
nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\
nc=00000001,qop=auth,digest-uri="xmpp/example.com",\
response=d388dad90d4bbd760a152321f2143af7,charset=utf-8
Step 7: Server2 sends another base64[14]-encoded challenge to Server1:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZAo= </challenge>
The decoded challenge is:
rspauth=ea40f60335c427b5527b84dbabcdfffd
Step 7 (alt): Server2 returns error to Server1:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<invalid-authzid/>
</failure>
</stream:stream>
Step 8: Server1 responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 8 (alt): Server1 aborts negotiation:
<abort xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9: Server2 informs Server1 of successful authentication:
<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9 (alt): Server2 informs Server1 of failed authentication:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<aborted/>
</failure>
</stream:stream>
Step 10: Server1 initiates a new stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
to='example.com'
version='1.0'>
Step 11: Server2 responds by sending a stream header to Server1 along with any additional features (or an empty features element):
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
from='example.com'
id='s2s_345'
version='1.0'>
<stream:features/>
| TOC |
After Stream Authentication with the receiving entity, the initiating entity MAY want or need to bind a specific resource to that stream. In general this applies only to clients: in order to conform to the addressing format and stanza delivery rules specified herein, there MUST be a resource identifier associated with the <node@domain> of the client (which is either generated by the server or provided by the client application); this ensures that the address for use over that stream is a "full JID" of the form <node@domain/resource>.
Upon receiving a success indication within the SASL negotiation, the client MUST send a new stream header to the server, to which the server MUST respond with a stream header as well as a list of available stream features. Specifically, if the server requires the client to bind a resource to the stream after successful stream authentication, it MUST include an empty <bind/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-bind' namespace in the stream features list it presents to the client upon sending the header for the response stream sent after successful stream authentication (but not before):
Server advertises resource binding feature to client:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='c2s_345'
from='example.com'
version='1.0'>
<stream:features>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
</stream:features>
Upon being so informed that resource binding is required, the client MUST bind a resource to the stream by sending to the server an IQ stanza of type "set" (see IQ Semantics) containing data qualified by the 'urn:ietf:params:xml:ns:xmpp-bind' namespace.
If the client wishes to allow the server to generate the resource identifier on its behalf, it sends an IQ stanza of type "set" that contains an empty <bind/> element:
Client asks server to bind a resource:
<iq type='set' id='bind_1'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/>
</iq>
A server that supports resource binding MUST be able to generate a resource identifier on behalf of a client. A resource identifier generated by the server MUST be unique for that <node@domain>.
If the client wishes to specify the resource identifier, it sends an IQ stanza of type "set" that contains the desired resource identifier as the CDATA of a <resource/> element that is a child of the <bind/> element:
Client binds a resource:
<iq type='set' id='bind_2'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<resource>someresource</resource>
</bind>
</iq>
Once the server has generated a resource identifier for the client or accepted the resource identifier provided by the client, it MUST return an IQ stanza of type "result" to the client, which MUST include a <jid/> child element that specifies the full JID for the client as determined by the server:
Server informs client of successful resource binding:
<iq type='result' id='bind_2'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<jid>somenode@somedomain/someresource</jid>
</bind>
</iq>
A server is NOT REQUIRED to accept the resource identifier provided by the client, and MAY override it with a resource identifier that the server generates; in this case, the server SHOULD NOT return a stanza error (e.g., <forbidden/>) to the client but instead SHOULD communicate the generated resource identifier to the client in the IQ result as shown above.
When a client supplies a resource identifier, the following stanza error conditions may occur (see Stanza Errors):
The protocol for these error conditions is shown below.
Resource identifier cannot be processed:
<iq type='error' id='bind_2'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<resource>someresource</resource>
</bind>
<error type='modify'>
<bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
</error>
</iq>
Client is not allowed to bind a resource:
<iq type='error' id='bind_2'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<resource>someresource</resource>
</bind>
<error type='cancel'>
<not-allowed xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
</error>
</iq>
Resource identifier is in use:
<iq type='error' id='bind_2'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<resource>someresource</resource>
</bind>
<error type='cancel'>
<conflict xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
</error>
</iq>
| TOC |
The Jabber protocol from which XMPP was adapted includes a "server dialback" method for protecting against domain spoofing, thus making it more difficult to spoof XML stanzas (see Server-to-Server Communications regarding this method's security characteristics). Server dialback also makes it easier to deploy systems in which outbound messages and inbound messages are handled by different machines for the same domain. The server dialback method is made possible by the existence of the Domain Name System (DNS), since one server can (normally) discover the authoritative server for a given domain.
Because dialback depends on DNS, inter-domain communications MUST NOT proceed until the DNS hostnames asserted by the servers have been resolved (see Server-to-Server Communications).
The method for generating and verifying the keys used in server dialback MUST take into account the hostnames being used, the random ID generated for the stream, and a secret known by the authoritative server's network.
Any error that occurs during dialback negotiation MUST be considered a stream error, resulting in termination of the stream and of the underlying TCP connection. The possible error conditions are specified in the protocol description below.
The following terminology applies:
The following is a brief summary of the order of events in dialback:
We can represent this flow of events graphically as follows:
Originating Receiving
Server Server
----------- ---------
| |
| establish connection |
| ----------------------> |
| |
| send stream header |
| ----------------------> |
| |
| send stream header |
| <---------------------- |
| | Authoritative
| send dialback key | Server
| ----------------------> | -------------
| | |
| establish connection |
| ----------------------> |
| |
| send stream header |
| ----------------------> |
| |
| establish connection |
| <---------------------- |
| |
| send stream header |
| <---------------------- |
| |
| send dialback key |
| ----------------------> |
| |
| validate dialback key |
| <---------------------- |
|
| report dialback result |
| <---------------------- |
| |
The detailed protocol interaction between the servers is as follows:
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'>
Note: the 'to' and 'from' attributes are NOT REQUIRED on the root stream element. The inclusion of the xmlns:db namespace declaration with the name shown indicates to Receiving Server that Originating Server supports dialback. If the namespace name is incorrect, then Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'
id='457F9224A0...'>
Note: The 'to' and 'from' attributes are NOT REQUIRED on the root stream element. If the namespace name is incorrect, then Originating Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection. Note well that Receiving Server is NOT REQUIRED to reply and MAY silently terminate the XML stream and underlying TCP connection depending on security policies in place.
<db:result
to='Receiving Server'
from='Originating Server'>
98AF014EDC0...
</db:result>
Note: this key is not examined by Receiving Server, since Receiving Server does not keep information about Originating Server between sessions. The key generated by Originating Server MUST be based in part on the value of the ID provided by Receiving Server in the previous step, and in part on a secret shared by Originating Server and Authoritative Server. If the value of the 'to' address does not match a hostname recognized by Receiving Server, then Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address matches a domain with which Receiving Server already has an established connection, then Receiving Server MUST maintain the existing connection until it validates whether the new connection is legitimate; additionally, Receiving Server MAY choose to generate a <not-authorized/> stream error condition for the new connection and then terminate both the XML stream and the underlying TCP connection related to the new request.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'>
Note: the 'to' and 'from' attributes are NOT REQUIRED on the root stream element. If the namespace name is incorrect, then Authoritative Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'
id='1251A342B...'>
Note: if the namespace name is incorrect, then Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection between it and Authoritative Server. If a stream error occurs between Receiving Server and Authoritative Server, then Receiving Server MUST generate a <remote-connection-failed/> stream error condition and terminate both the XML stream and the underlying TCP connection between it and Originating Server.
<db:verify
from='Receiving Server'
to='Originating Server'
id='457F9224A0...'>
98AF014EDC0...
</db:verify>
Note: passed here are the hostnames, the original identifier from Receiving Server's stream header to Originating Server in Step 3, and the key that Originating Server sent to Receiving Server in Step 4. Based on this information as well as shared secret information within the Authoritative Server's network, the key is verified. Any verifiable method MAY be used to generate the key. If the value of the 'to' address does not match a hostname recognized by Authoritative Server, then Authoritative Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by Receiving Server when opening the TCP connection (or any validated domain), then Authoritative Server MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection.
<db:verify
from='Originating Server'
to='Receiving Server'
type='valid'
id='457F9224A0...'/>
or
<db:verify
from='Originating Server'
to='Receiving Server'
type='invalid'
id='457F9224A0...'/>
Note: if the ID does not match that provided by Receiving Server in Step 3, then Receiving Server MUST generate an <invalid-id/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'to' address does not match a hostname recognized by Receiving Server, then Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by Originating Server when opening the TCP connection (or any validated domain), then Receiving Server MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection.
<db:result
from='Receiving Server'
to='Originating Server'
type='valid'/>
Note: At this point the connection has either been validated via a type='valid', or reported as invalid. If the connection is invalid, then Receiving Server MUST terminate both the XML stream and the underlying TCP connection. If the connection is validated, data can be sent by Originating Server and read by Receiving Server; before that, all data stanzas sent to Receiving Server SHOULD be silently dropped.
Even if dialback negotiation is successful, a server MUST verify that all XML stanzas received from the other server include a 'from' attribute and a 'to' attribute; if a stanza does not meet this restriction, the server that receives the stanza MUST generate an <improper-addressing/> stream error condition and terminate both the XML stream and the underlying TCP connection. Furthermore, a server MUST verify that the 'from' attribute of stanzas received from the other server includes a validated domain for the stream; if a stanza does not meet this restriction, the server that receives the stanza MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection. Both of these checks help to prevent spoofing related to particular stanzas.
| TOC |
After Stream Encryption if desired, Stream Authentication, and Resource Binding if necessary, XML stanzas can be sent over the streams. Three kinds of XML stanza are defined for the 'jabber:client' and 'jabber:server' namespaces: <message/>, <presence/>, and <iq/>. In addition, there are five common attributes for these kinds of stanza. These common attributes, as well as the basic semantics of the three stanza kinds, are defined herein; more detailed information regarding the syntax of XML stanzas in relation to instant messaging and presence applications is provided in XMPP IM[21].
The following five attributes are common to message, presence, and IQ stanzas:
The 'to' attribute specifies the JID of the intended recipient for the stanza.
In the 'jabber:client' namespace, a stanza SHOULD possess a 'to' attribute, although a stanza sent from a client to a server for handling by that server (e.g., presence sent to the server for broadcasting to other entities) SHOULD NOT possess a 'to' attribute.
In the 'jabber:server' namespace, a stanza MUST possess a 'to' attribute; if a server receives a stanza that does not meet this restriction, it MUST generate an <improper-addressing/> stream error condition and terminate both the XML stream and the underlying TCP connection with the offending server.
If the value of the 'to' attribute is invalid or cannot be contacted, the entity discovering that fact (usually the sender's or recipient's server) MUST return an appropriate error to the sender, setting the 'from' attribute of the error stanza to the value provided in the 'to' attribute of the offending stanza.
The 'from' attribute specifies the JID of the sender.
When a server receives an XML stanza within the context of an authenticated stream qualified by the 'jabber:client' namespace, it MUST do one of the following:
If a client attempts to send an XML stanza for which the value of the 'from' attribute does not match one of the connected resources for that entity, the server SHOULD return an <invalid-from/> stream error to the client. If a client attempts to send an XML stanza over a stream that is not yet authenticated, the server SHOULD return a <not-authorized/> stream error to the client. If generated, both of these conditions MUST result in closing of the stream and termination of the underlying TCP connection; this helps to prevent a denial of service attack launched from a rogue client.
In the 'jabber:server' namespace, a stanza MUST possess a 'from' attribute; if a server receives a stanza that does not meet this restriction, it MUST generate an <improper-addressing/> stream error condition. Furthermore, the domain identifier portion of the JID contained in the 'from' attribute MUST match the hostname (or any validated domain) of the sending server as communicated in the SASL negotiation or dialback negotiation; if a server receives a stanza that does not meet this restriction, it MUST generate an <invalid-from/> stream error condition. Both of these conditions MUST result in closing of the stream and termination of the underlying TCP connection; this helps to prevent a denial of service attack launched from a rogue server.
The optional 'id' attribute MAY be used by a sending entity for internal tracking of stanzas that it sends and receives (especially for tracking the request-response interaction inherent in the semantics of IQ stanzas). The value of the 'id' attribute is NOT REQUIRED to be unique either globally, within a domain, or within a stream. The semantics of IQ stanzas impose additional restrictions; see IQ Semantics.
The 'type' attribute specifies detailed information about the purpose or context of the message, presence, or IQ stanza. The particular allowable values for the 'type' attribute vary depending on whether the stanza is a message, presence, or IQ; the values for message and presence stanzas are specific to instant messaging and presence applications and therefore are defined in XMPP IM[21], whereas the values for IQ stanzas specify the role of an IQ stanza in a structured request-response "conversation" and thus are defined under IQ Semantics below. The only 'type' value common to all three stanzas is "error", for which see Stanza Errors.
A stanza SHOULD possess an 'xml:lang' attribute (as defined in Section 2.12 of the XML specification[1]) if the stanza contains XML character data that is intended to be presented to a human user (as explained in RFC 2277[15], "internationalization is for humans"). The value of the 'xml:lang' attribute specifies the default language of any such human-readable XML character data, which MAY be overridden by the 'xml:lang' attribute of a specific child element. If a stanza does not possess an 'xml:lang' attribute, an implementation MUST assume that the default language is that specified for the stream as defined under Stream Attributes above. The value of the 'xml:lang' attribute MUST be an NMTOKEN and MUST conform to the format defined in RFC 3066[16].
The <message/> stanza kind can be seen as a "push" mechanism whereby one entity pushes information to another entity, similar to the communications that occur in a system such as email. All message stanzas SHOULD possess a 'to' attribute that specifies the intended recipient of the message; upon receiving such a stanza, a server SHOULD route or deliver it to the intended recipient (see Server Rules for Handling XML Stanzas for general routing and delivery rules related to XML stanzas).
The <presence/> element can be seen as a basic broadcast or "publish-subscribe" mechanism, whereby multiple entities receive information (in this case, presence information) about an entity to which they have subscribed. In general, a publishing entity SHOULD send a presence stanza with no 'to' attribute, in which case the server to which the entity is connected SHOULD broadcast or multiplex that stanza to all subscribing entities. However, a publishing entity MAY also send a presence stanza with a 'to' attribute, in which case the server SHOULD route or deliver that stanza to the intended recipient. See Server Rules for Handling XML Stanzas for general routing and delivery rules related to XML stanzas, and XMPP IM[21] for presence-specific rules in the context of an instant messaging and presence application.
Info/Query, or IQ, is a request-response mechanism, similar in some ways to HTTP[22]. The semantics of IQ enable an entity to make a request of, and receive a response from, another entity. The data content of the request and response is defined by the namespace declaration of a direct child element of the IQ element, and the interaction is tracked by the requesting entity through use of the 'id' attribute. Thus IQ interactions follow a common pattern of structured data exchange such as get/result or set/result (although an error may be returned in reply to a request if appropriate):
Requesting Responding
Entity Entity
---------- ----------
| |
| <iq type='get' id='1'> |
| ------------------------> |
| |
| <iq type='result' id='1'> |
| <------------------------ |
| |
| <iq type='set' id='2'> |
| ------------------------> |
| |
| <iq type='error' id='2'> |
| <------------------------ |
| |
In order to enforce these semantics, the following rules apply:
Stanza-related errors are handled in a manner similar to stream errors. However, stanza errors are not unrecoverable, as stream errors are; therefore error stanzas include hints regarding actions that the original sender can take in order to remedy the error.
The following rules apply to stanza-related errors:
The syntax for stanza-related errors is as follows:
<stanza-name to='sender' type='error'>
[RECOMMENDED to include sender XML here]
<error type='error-type'>
<defined-condition xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
<text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>
OPTIONAL descriptive text
</text>
[OPTIONAL application-specific condition element]
</error>
</stanza-name>
The stanza-name is one of message, presence, or iq.
The value of the <error/> element's 'type' attribute MUST be one of the following:
The <error/> element: