This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 16, 2003.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document describes the core features of the Extensible Messaging and Presence Protocol (XMPP), a protocol for streaming XML elements in order to exchange messages and presence information in close to real time. XMPP is used mainly for the purpose of building instant messaging (IM) and presence applications, such as the servers and clients that comprise the Jabber network.
The Extensible Messaging and Presence Protocol (XMPP) is an open XML[1] protocol for near-real-time messaging, presence, and request-response services. The basic syntax and semantics were developed originally within the Jabber open-source community, mainly in 1999. In 2002, the XMPP WG was chartered with developing an adaptation of the Jabber protocol that would be suitable as an IETF instant messaging and presence technology. As a result of work by the XMPP WG, the current document defines the core features of XMPP; XMPP IM[22] defines the extensions required to provide the instant messaging (IM) and presence functionality defined in RFC 2779[2].
The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119[3].
The authors welcome discussion and comments related to the topics presented in this document. The preferred forum is the <xmppwg@jabber.org> mailing list, for which archives and subscription information are available at <http://www.jabber.org/cgi-bin/mailman/listinfo/xmppwg/>.
This document is in full compliance with all provisions of Section 10 of RFC 2026. Parts of this specification use the term "jabber" for identifying namespaces and other protocol syntax. Jabber[tm] is a registered trademark of Jabber, Inc. Jabber, Inc. grants permission to the IETF for use of the Jabber trademark in association with this specification and its successors, if any.
Although XMPP is not wedded to any specific network architecture, to this point it has usually been implemented via a typical client-server architecture, wherein a client utilizing XMPP accesses a server over a TCP[4] socket.
The following diagram provides a high-level overview of this architecture (where "-" represents communications that use XMPP and "=" represents communications that use any other protocol).
C1 -  S1 - S2 - C3
     /  \
C2 -     G1 = FN1 = FC1
The symbols are as follows:
A server acts as an intelligent abstraction layer for XMPP communications. Its primary responsibilities are to manage connections from or sessions for other entities (in the form of XML streams to and from authorized clients, servers, and other entities) and to route appropriately-addressed XML data "stanzas" among such entities over XML streams. Most XMPP-compliant servers also assume responsibility for the storage of data that is used by clients (e.g., contact lists for users of XMPP-based IM applications); in this case, the XML data is processed directly by the server itself on behalf of the client and is not routed to another entity. Compliant server implementations MUST ensure in-order processing of XML stanzas between any two entities.
Most clients connect directly to a server over a TCP socket and use XMPP to take full advantage of the functionality provided by a server and any associated services. Although there is no necessary coupling of an XML stream to a TCP socket (e.g., a client could connect via HTTP polling or some other mechanism), this specification defines a binding for XMPP to TCP only. Multiple resources (e.g., devices or locations) MAY connect simultaneously to a server on behalf of each authorized client, with each resource connecting over a discrete TCP socket and differentiated by the resource identifier of a JID (e.g., user@domain/home vs. user@domain/work). The port registered with the IANA[5] for connections between a Jabber client and a Jabber server is 5222.
A gateway is a special-purpose server-side service whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging system, as well as to translate the return data back into XMPP. Examples are gateways to SIMPLE, Internet Relay Chat (IRC), Short Message Service (SMS), SMTP, and legacy instant messaging networks such as AIM, ICQ, MSN Messenger, and Yahoo! Instant Messenger. Communications between gateways and servers, and between gateways and the foreign messaging system, are not defined in this document.
Because each server is identified by a network address (typically a DNS hostname) and because server-to-server communications are a straightforward extension of the client-to-server protocol, in practice the system consists of a network of servers that inter-communicate. Thus user-a@domain1 is able to exchange messages, presence, and other information with user-b@domain2. This pattern is familiar from messaging protocols (such as SMTP) that make use of network addressing standards. Upon opening a TCP socket on the IANA-registered port 5269, there are two methods for negotiating a connection between any two servers: primarily SASL authentication and secondarily server dialback.
An entity is anything that can be considered a network endpoint (i.e., an ID on the network) and that can communicate using XMPP. All such entities are uniquely addressable in a form that is consistent with RFC 2396[23]. In particular, a valid Jabber Identifier (JID) contains a set of ordered elements formed of a domain identifier, node identifier, and resource identifier in the following format: [node@]domain[/resource].
All JIDs are based on the foregoing structure. The most common use of this structure is to identify an IM user, the server to which the user connects, and the user's active session or connection (e.g., a specific client) in the form of user@domain/resource. However, node types other than clients are possible; for example, a specific chat room offered by a multi-user chat service could be addressed as <room@service> (where "room" is the name of the chat room and "service" is the hostname of the multi-user chat service) and a specific occupant of such a room could be addressed as <room@service/nick> (where "nick" is the occupant's room nickname). Many other JID types are possible (e.g., <domain/resource> could be a server-side script or service).
The domain identifier is the primary identifier and is the only REQUIRED element of a JID (a mere domain identifier is a valid JID). It usually represents the network gateway or "primary" server to which other entities connect for XML routing and data management capabilities. However, the entity referenced by a domain identifier is not always a server, and may be a service that is addressed as a subdomain of a server and that provides functionality above and beyond the capabilities of a server (a multi-user chat service, a user directory, a gateway to a foreign messaging system, etc.).
The domain identifier for every server or service that will communicate over a network SHOULD resolve to a Fully Qualified Domain Name. A domain identifier MUST conform to RFC 952[6] and RFC 1123[7]. A domain identifier MUST be no more than 1023 bytes in length and MUST conform to the nameprep[8] profile of stringprep[9].
The node identifier is an optional secondary identifier. It usually represents the entity requesting and using network access provided by the server or gateway (i.e., a client), although it can also represent other kinds of entities (e.g., a multi-user chat room associated with a multi-user chat service). The entity represented by a node identifier is addressed within the context of a specific domain; within IM applications of XMPP this address is called a "bare JID" and is of the form <user@domain>.
A node identifier MUST be no more than 1023 bytes in length and MUST conform to the nodeprep[10] profile of stringprep[9].
The resource identifier is an optional tertiary identifier, which may modify either a "user@domain" or mere "domain" address. It usually represents a specific session, connection (e.g., a device or location), or object (e.g., a participant in a multi-user chat room) belonging to the entity associated with a node identifier. A resource identifier is typically defined by a client implementation and is opaque to both servers and other clients. An entity may maintain multiple resources simultaneously.
A resource identifier MUST be no more than 1023 bytes in length and MUST conform to the resourceprep[11] profile of stringprep[9].
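The [node@]domain[/resource] format and the 1023-byte limits described above can be illustrated with a minimal Python sketch. The names JID, parse_jid, and MAX_PART_BYTES are hypothetical and are not defined by this specification. The sketch assumes that the first "/" starts the resource identifier, measures length in UTF-8 bytes, and does not apply the nameprep, nodeprep, or resourceprep profiles, which a compliant implementation would also need.

```python
from typing import NamedTuple, Optional

MAX_PART_BYTES = 1023  # limit stated for each identifier (hypothetical constant name)


class JID(NamedTuple):
    node: Optional[str]
    domain: str
    resource: Optional[str]


def parse_jid(text: str) -> JID:
    """Split [node@]domain[/resource] into its three parts (no stringprep applied)."""
    # Assumption: the first "/" starts the resource identifier.
    rest, slash, resource = text.partition("/")
    if "@" in rest:
        node, _, domain = rest.partition("@")
    else:
        node, domain = None, rest
    if not domain:
        raise ValueError("the domain identifier is REQUIRED")
    if node == "":
        raise ValueError("empty node identifier")
    if slash and not resource:
        raise ValueError("empty resource identifier")
    parts = (("node", node), ("domain", domain), ("resource", resource if slash else None))
    for name, part in parts:
        if part is not None and len(part.encode("utf-8")) > MAX_PART_BYTES:
            raise ValueError(name + " identifier exceeds 1023 bytes")
    return JID(node, domain, resource if slash else None)
```

For example, parse_jid("user@domain/home") yields the three parts separately, while parse_jid("domain") yields a JID whose node and resource are None.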
Two fundamental concepts make possible the rapid, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and XML stanzas. The terms may be defined as follows:
- Definition of XML stream:
- An XML stream is a container for the exchange of XML elements between any two entities over a network. An XML stream is negotiated from an initiating entity (usually a client or server) to a receiving entity (usually a server), normally over a TCP socket, and corresponds to the initiating entity's "session" with the receiving entity. The start of the XML stream is denoted unambiguously by an opening XML <stream> tag with appropriate attributes and namespace declarations, and the end of the XML stream is denoted unambiguously by a closing XML </stream> tag. An XML stream is unidirectional; in order to enable bidirectional information exchange, the initiating entity and receiving entity must negotiate one stream in each direction, normally over the same TCP connection.
- Definition of XML stanza:
- An XML stanza is a discrete semantic unit of structured information that is sent from one entity to another over an XML stream. An XML stanza exists at the direct child level of the root <stream/> element and is said to be well-balanced if it matches production [43] content of the XML specification[1]. The start of any XML stanza is denoted unambiguously by the element start tag at depth=1 (e.g., <presence>), and the end of any XML stanza is denoted unambiguously by the corresponding close tag at depth=1 (e.g., </presence>). An XML stanza MAY contain child elements (with accompanying attributes, elements, and CDATA) as necessary in order to convey the desired information.
Consider the example of a client's session with a server. In order to connect to a server, a client must initiate an XML stream by sending an opening <stream> tag to the server, optionally preceded by a text declaration specifying the XML version supported and the character encoding. The server SHOULD then reply with a second XML stream back to the client, again optionally preceded by a text declaration. Once the client has authenticated with the server (see Stream Authentication), the client MAY send an unlimited number of XML stanzas over the stream to any recipient on the network. When the client desires to close the stream, it simply sends a closing </stream> tag to the server (alternatively, the session may be closed by the server), after which both the client and server SHOULD close the underlying TCP connection as well.
Those who are accustomed to thinking of XML in a document-centric manner may wish to view a client's session with a server as consisting of two open-ended XML documents: one from the client to the server and one from the server to the client. From this perspective, the root <stream/> element can be considered the document entity for each "document", and the two "documents" are built up through the accumulation of XML stanzas sent over the two XML streams. However, this perspective is a convenience only, and XMPP does not deal in documents but in XML streams and XML stanzas.
In essence, then, an XML stream acts as an envelope for all the XML stanzas sent during a session. We can represent this graphically as follows:
|-------------------|
| <stream> |
|-------------------|
| <message to=''> |
| <body/> |
| </message> |
|-------------------|
| <presence to=''> |
| <show/> |
| </presence> |
|-------------------|
| <iq to=''> |
| <query/> |
| </iq> |
|-------------------|
| ... |
|-------------------|
| </stream> |
|-------------------|
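The envelope model above, in which each stanza is a direct child of the root stream element, can be sketched with an incremental parser. The following Python fragment is an illustration only and is not part of the protocol; the class name StanzaReader is hypothetical. It counts open elements and treats an element as a complete stanza when its close tag leaves only the root element open.

```python
from xml.etree.ElementTree import XMLPullParser


class StanzaReader:
    """Collect the direct children of the stream root as they are completed."""

    def __init__(self):
        self._parser = XMLPullParser(events=("start", "end"))
        self._open = 0  # number of elements currently open, root included

    def feed(self, data):
        """Feed a chunk of received text and return any stanzas completed so far."""
        self._parser.feed(data)
        return self._drain()

    def close(self):
        """Signal the end of input and return any stanzas not yet reported."""
        self._parser.close()
        return self._drain()

    def _drain(self):
        stanzas = []
        for event, elem in self._parser.read_events():
            if event == "start":
                self._open += 1
            else:
                self._open -= 1
                if self._open == 1:
                    # Only the root is still open, so a depth=1 element just ended.
                    stanzas.append(elem)
        return stanzas
```

Depending on the underlying parser, events for a chunk may be reported on a later call to feed() or on close(), so a caller should gather the results of every call.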
The attributes of the stream element are as follows:
We can summarize these values as follows:
        | initiating to receiving  | receiving to initiating
------------------------------------------------------------
to      | hostname of receiver     | silently ignored
from    | silently ignored         | hostname of receiver
id      | silently ignored         | session key
version | signals XMPP 1.0 support | signals XMPP 1.0 support
The stream element MAY contain namespace declarations as defined in the XML namespaces specification[12].
A stream namespace declaration (e.g., 'xmlns:stream') is REQUIRED in both XML streams. A compliant entity SHOULD accept any namespace prefix on the <stream/> element; however, for historical reasons some entities MAY accept only a 'stream' prefix, resulting in the use of a <stream:stream/> element as the stream root. The name of the stream namespace MUST be "http://etherx.jabber.org/streams".
A default namespace declaration ('xmlns') is REQUIRED and is used in both XML streams in order to define the allowable first-level children of the root stream element for both streams. This namespace declaration MUST be the same for the initiating stream and the responding stream so that both streams are scoped consistently. The default namespace declaration applies to the stream and all stanzas sent within a stream (unless explicitly scoped by another namespace).
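As an illustration of the attributes and namespace declarations described above, the following Python sketch assembles an opening stream tag. The function name stream_header and its parameters are hypothetical. The sketch assumes the historical 'stream' prefix and always includes version='1.0'; it is not a complete implementation of stream negotiation.

```python
from xml.sax.saxutils import quoteattr

STREAM_NS = "http://etherx.jabber.org/streams"  # stream namespace name required above


def stream_header(default_ns, to=None, from_=None, stream_id=None):
    """Return an opening <stream:stream> tag as text (the tag is left unclosed)."""
    attrs = []
    if to is not None:
        attrs.append(("to", to))              # sent by the initiating entity
    if from_ is not None:
        attrs.append(("from", from_))         # sent by the receiving entity
    if stream_id is not None:
        attrs.append(("id", stream_id))       # session key, receiving entity only
    attrs.append(("xmlns", default_ns))       # default namespace, same in both streams
    attrs.append(("xmlns:stream", STREAM_NS))
    attrs.append(("version", "1.0"))
    rendered = " ".join(name + "=" + quoteattr(value) for name, value in attrs)
    return "<stream:stream " + rendered + ">"
```

An initiating client might call stream_header("jabber:client", to="shakespeare.lit"), and the receiving server might answer with stream_header("jabber:client", from_="shakespeare.lit", stream_id="id_123456789").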
Since XML streams function as containers for any XML stanzas sent asynchronously between network endpoints, it should be possible to scope an XML stream with any default namespace declaration (i.e., it should be possible to send any properly-namespaced XML stanza over an XML stream). At a minimum, a compliant implementation MUST support the following two namespaces (for historical reasons, some implementations MAY support only these two default namespaces):
The jabber:client and jabber:server namespaces are nearly identical but are used in different contexts (client-to-server communications for jabber:client and server-to-server communications for jabber:server). The only difference between the two is that the 'to' and 'from' attributes are OPTIONAL on stanzas sent within jabber:client, whereas they are REQUIRED on stanzas sent within jabber:server. If a compliant implementation accepts a stream that is scoped by the 'jabber:client' or 'jabber:server' namespace, it MUST support all three core stanza types (message, presence, and IQ) as described herein and defined in the schema.
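The difference between the two namespaces can be expressed as a simple check. The following Python sketch is illustrative only; the function name missing_addresses is hypothetical. It reports which of the 'to' and 'from' attributes are absent from a stanza when the stream is scoped by jabber:server and reports nothing for jabber:client, where both attributes are OPTIONAL.

```python
import xml.etree.ElementTree as ET


def missing_addresses(stanza, default_ns):
    """Return the REQUIRED address attributes that the stanza lacks."""
    if default_ns != "jabber:server":
        return []  # 'to' and 'from' are OPTIONAL within jabber:client
    return [name for name in ("to", "from") if name not in stanza.attrib]


# Usage: a server-to-server message that lacks a 'from' attribute.
example = ET.fromstring("<message xmlns='jabber:server' to='romeo@shakespeare.lit'/>")
problems = missing_addresses(example, "jabber:server")
```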
The root stream element MAY contain a features child element (e.g., <stream:features/> if the stream namespace prefix is 'stream'). This is used to communicate generic stream-level capabilities including stream-level features that can be negotiated as the streams are set up. If the initiating entity sends a "version='1.0'" flag in its initiating stream element, the receiving entity MUST send a features child element to the initiating entity if there are any capabilities that need to be advertised or features that can be negotiated for the stream. Currently this is used for SASL and TLS negotiation only, but it could be used for other negotiable features in the future (usage is defined under Stream Encryption and Stream Authentication below). If an entity does not understand or support some features, it SHOULD silently ignore them.
The root stream element MAY contain an error child element (e.g., <stream:error/> if the stream namespace prefix is 'stream'). The error child MUST be sent by a compliant entity (usually a server rather than a client) if it perceives that a stream-level error has occurred.
The following rules apply to stream-level errors:
The syntax for stream errors is as follows:
<stream:error class='error-class'>
<stream-condition xmlns='urn:ietf:params:xml:ns:xmpp-streams'>
<descriptive-element-name/>
</stream-condition>
</stream:error>
The value of the 'class' attribute must be one of the following:
The <stream-condition/> element MUST contain a child element that specifies a particular stream-level error condition, as defined in the next section. (Note: the XML namespace name 'urn:ietf:params:xml:ns:xmpp-streams' that scopes the <stream-condition/> element adheres to the format defined in The IETF XML Registry[24].)
The following stream-level error conditions are defined:
If desired, an XMPP application MAY provide custom error information; this MUST be contained in a properly-namespaced child of the <stream-condition/> element (i.e., the namespace name MUST NOT be one of the namespace names defined herein).
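The stream error syntax shown above can be read with a few lines of code. The following Python sketch is illustrative only; the function name describe_stream_error is hypothetical. It assumes the error element has already been parsed with the 'stream' prefix bound to the stream namespace, and it returns the value of the 'class' attribute together with the local name of the first child of <stream-condition/>.

```python
import xml.etree.ElementTree as ET

STREAM_NS = "http://etherx.jabber.org/streams"
CONDITION_NS = "urn:ietf:params:xml:ns:xmpp-streams"


def describe_stream_error(error):
    """Return (class, condition) for a parsed stream error element."""
    if error.tag != "{%s}error" % STREAM_NS:
        raise ValueError("not a stream error element")
    wrapper = error.find("{%s}stream-condition" % CONDITION_NS)
    if wrapper is None or len(wrapper) == 0:
        raise ValueError("stream error carries no condition")
    # Drop the "{namespace}" part that ElementTree prepends to the tag name.
    condition = wrapper[0].tag.rpartition("}")[2]
    return error.get("class"), condition
```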
The following is a stream-based session of a client on a server (where the "C" lines are sent from the client to the server, and the "S" lines are sent from the server to the client):
A basic session:
C: <?xml version='1.0'?>
<stream:stream
to='shakespeare.lit'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
S: <?xml version='1.0'?>
<stream:stream
from='shakespeare.lit'
id='id_123456789'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
... authentication ...
C: <message from='juliet@shakespeare.lit'
to='romeo@shakespeare.lit'>
C: <body>Art thou not Romeo, and a Montague?</body>
C: </message>
S: <message from='romeo@shakespeare.lit'
to='juliet@shakespeare.lit'>
S: <body>Neither, fair saint, if either thee dislike.</body>
S: </message>
C: </stream:stream>
S: </stream:stream>
A session gone bad:
C: <?xml version='1.0'?>
<stream:stream
to='shakespeare.lit'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
S: <?xml version='1.0'?>
<stream:stream
from='shakespeare.lit'
id='id_123456789'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
... authentication ...
C: <message><body>Bad XML, no closing body tag!</message>
S: <stream:error class='client'>
<stream-condition xmlns='urn:ietf:params:xml:ns:xmpp-streams'>
<xml-not-well-formed/>
</stream-condition>
</stream:error>
S: </stream:stream>
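The failure case above can be reproduced offline. The following Python sketch is illustrative only; the function name reply_to_transcript and the constant NOT_WELL_FORMED are hypothetical. It checks a captured client transcript for well-formedness and, if parsing fails, returns the error and stream close shown in the example. It assumes the transcript is complete (including the closing stream tag when the session ended normally); a live server would instead parse incrementally.

```python
import xml.etree.ElementTree as ET

# Reply taken from the "session gone bad" example; class='client' is assumed
# because the malformed data came from the client.
NOT_WELL_FORMED = (
    "<stream:error class='client'>"
    "<stream-condition xmlns='urn:ietf:params:xml:ns:xmpp-streams'>"
    "<xml-not-well-formed/>"
    "</stream-condition>"
    "</stream:error>"
    "</stream:stream>"
)


def reply_to_transcript(transcript):
    """Return the stream error text for a malformed transcript, else None."""
    try:
        ET.fromstring(transcript)
    except ET.ParseError:
        return NOT_WELL_FORMED
    return None
```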
XMPP includes a method for securing the stream from tampering and eavesdropping. This channel encryption method makes use of the Transport Layer Security (TLS)[13] protocol, along with a "STARTTLS" extension that is modelled on similar extensions for the IMAP[25], POP3[26], and ACAP[27] protocols as described in RFC 2595[28]. The namespace identifier for the STARTTLS extension is 'urn:ietf:params:xml:ns:xmpp-tls'.
TLS SHOULD be used between any initiating entity and any receiving entity (e.g., a stream from a client to a server or from one server to another). An administrator of a given domain MAY require use of TLS for either or both client-to-server communications and server-to-server communications. Servers SHOULD use TLS between two domains for the purpose of securing server-to-server communications. When the remote domain is already known, the server can verify the credentials of the known domain by comparing known keys or certificates. When the remote domain is not recognized, it may still be possible to verify a certificate if it is signed by a common trusted authority. Even if there is no way to verify certificates (e.g., an unknown domain with a self-signed certificate, or a certificate signed by an unrecognized authority), if the servers choose to communicate despite the lack of verified credentials, TLS still SHOULD be used to provide encryption.
The following business rules apply:
If the above methods fail, the certificate MAY be presented to a user for approval; the user SHOULD be given the option to store the certificate and not ask again for at least some reasonable period of time.
When an initiating entity secures a stream with a receiving entity, the steps involved are as follows:
The following example shows the data flow for a client securing a stream using STARTTLS.
Step 1: Client initiates stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='capulet.com'
version='1.0'>
Step 2: Server responds by sending a stream tag to the client:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
Step 3: Server sends the STARTTLS extension to the client along with authentication mechanisms and any other stream features (if TLS is required for interaction with this server, the server SHOULD signal that fact by including a <required/> element as a child of the <starttls/> element):
<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
</mechanisms>
</stream:features>
Step 4: Client sends the STARTTLS command to the server:
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5: Server informs client to proceed:
<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5 (alt): Server informs client that TLS negotiation has failed and closes stream:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
</stream:stream>
Step 6: Client and server complete TLS negotiation over the existing TCP connection.
Step 7: Client initiates a new stream to the server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='capulet.com'
version='1.0'>
Step 8: Server responds by sending a stream header to the client along with any remaining negotiable stream features:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
<mechanism>EXTERNAL</mechanism>
</mechanisms>
</stream:features>
Step 9: Client SHOULD continue with stream authentication.
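The decision a client faces after Step 3 can be sketched as follows. This Python fragment is illustrative only; the function name starttls_offer is hypothetical. It inspects a parsed features element and reports whether the STARTTLS extension is offered and whether the <required/> child is present. The TLS handshake itself is outside the scope of the sketch.

```python
import xml.etree.ElementTree as ET

TLS_NS = "urn:ietf:params:xml:ns:xmpp-tls"


def starttls_offer(features):
    """Return (offered, required) for a parsed stream features element."""
    starttls = features.find("{%s}starttls" % TLS_NS)
    if starttls is None:
        return (False, False)
    required = starttls.find("{%s}required" % TLS_NS) is not None
    return (True, required)
```

When the result is (True, True), a client that does not send the STARTTLS command cannot expect to proceed with this server; when the result is (False, False), as in Step 8, the client continues directly with stream authentication.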
The following example shows the data flow for two servers securing a stream using STARTTLS.
Step 1: Server1 initiates stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
Step 2: Server2 responds by sending a stream tag to Server1:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
Step 3: Server2 sends the STARTTLS extension to Server1 along with authentication mechanisms and any other stream features (if TLS is required for interaction with Server2, it SHOULD signal that fact by including a <required/> element as a child of the <starttls/> element):
<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
</mechanisms>
</stream:features>
Step 4: Server1 sends the STARTTLS command to Server2:
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5: Server2 informs Server1 to proceed:
<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5 (alt): Server2 informs Server1 that TLS negotiation has failed and closes stream:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
</stream:stream>
Step 6: Server1 and Server2 complete TLS negotiation via TCP.
Step 7: Server1 initiates a new stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
Step 8: Server2 responds by sending a stream header to Server1 along with any remaining negotiable stream features:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
<mechanism>EXTERNAL</mechanism>
</mechanisms>
</stream:features>
Step 9: Server1 SHOULD continue with stream authentication.
XMPP includes two methods for enforcing authentication at the level of XML streams. The secure and preferred method for authenticating streams between two entities uses an XMPP adaptation of the Simple Authentication and Security Layer (SASL)[14]. If SASL negotiation is not possible, some level of trust MAY be established based on existing trust in DNS; the authentication method used in this case is the server dialback protocol that is native to XMPP (no such ad-hoc method is defined between a client and a server). If SASL is used for server-to-server authentication, the servers MUST NOT use dialback. For further information about the relative merits of these two methods, consult Security Considerations.
Stream authentication is REQUIRED for all direct communications between two entities; if an entity sends a stanza to an unauthenticated stream, the receiving entity SHOULD silently drop the stanza and MUST NOT process it.
The Simple Authentication and Security Layer (SASL) provides a generalized method for adding authentication support to connection-based protocols. XMPP uses a generic XML namespace profile for SASL that conforms to section 4 ("Profiling Requirements") of RFC 2222[14] (the XMPP-specific namespace identifier is 'urn:ietf:params:xml:ns:xmpp-sasl').
The following business rules apply:
The following syntax rules apply:
When an initiating entity authenticates with a receiving entity, the steps involved are as follows:
This series of challenge/response pairs continues until one of three things happens:
Any character data contained within these elements MUST be encoded using base64.
Section 4 of the SASL specification[14] requires that the following information be supplied by a protocol definition:
- service name:
- "xmpp"
- initiation sequence:
- After the initiating entity provides an opening XML stream header and the receiving entity replies in kind, the receiving entity provides a list of acceptable authentication methods. The initiating entity chooses one method from the list and sends it to the receiving entity as the value of the 'mechanism' attribute possessed by an <auth/> element, optionally including an initial response to avoid a round trip.
- exchange sequence:
- Challenges and responses are carried through the exchange of <challenge/> elements from receiving entity to initiating entity and <response/> elements from initiating entity to receiving entity. The receiving entity reports failure by sending a <failure/> element and success by sending a <success/> element; the initiating entity aborts the exchange by sending an <abort/> element. (All of these elements are scoped by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace.)
- security layer negotiation:
- If a security layer is negotiated, both sides consider the original stream closed and new <stream/> headers are sent by both entities. The security layer takes effect immediately following the closing ">" character of the <response/> element for the client and immediately following the closing ">" character of the <success/> element for the server. (Both of these elements are scoped by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace.)
- use of the authorization identity:
- The authorization identity is used by XMPP only in negotiation between a client and a server, and denotes the "full JID" (user@host/resource) requested by the user or application associated with the client.
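The initiation sequence described above can be sketched in Python. The function names choose_mechanism and auth_element are hypothetical, and the preference order passed to choose_mechanism is a local policy decision that this specification does not define. The sketch selects the first locally preferred mechanism that the receiving entity offers and builds the corresponding <auth/> element, encoding any initial response in base64.

```python
import base64
from xml.sax.saxutils import quoteattr

SASL_NS = "urn:ietf:params:xml:ns:xmpp-sasl"


def choose_mechanism(offered, preferred):
    """Return the first locally preferred mechanism that was offered, or None."""
    for name in preferred:
        if name in offered:
            return name
    return None


def auth_element(mechanism, initial_response=None):
    """Build an <auth/> element; initial_response, if given, is raw bytes."""
    open_tag = "<auth xmlns=%s mechanism=%s" % (quoteattr(SASL_NS), quoteattr(mechanism))
    if initial_response is None:
        return open_tag + "/>"
    # Character data inside SASL elements is base64-encoded.
    payload = base64.b64encode(initial_response).decode("ascii")
    return open_tag + ">" + payload + "</auth>"
```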
The following example shows the data flow for a client authenticating with a server using SASL.
Step 1: Client initiates stream to server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='domain'
version='1.0'>
Step 2: Server responds with a stream tag sent to the client:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
from='domain'
version='1.0'>
Step 3: Server informs client of available authentication mechanisms:
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>PLAIN</mechanism>
</mechanisms>
</stream:features>
Step 4: Client selects an authentication mechanism:
<auth
xmlns='urn:ietf:params:xml:ns:xmpp-sasl'
mechanism='DIGEST-MD5'/>
Step 5: Server sends a base64-encoded challenge to the client:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
</challenge>
The decoded challenge is:
realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",\
qop="auth",charset=utf-8,algorithm=md5-sess
Step 6: Client responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
dXNlcm5hbWU9InJvYiIscmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik
9BNk1HOXRFUUdtMmhoIixcIGNub25jZT0iT0E2TUhYaDZWcVRyUmsiLG5j
PTAwMDAwMDAxLHFvcD1hdXRoLFwgZGlnZXN0LXVyaT0ieG1wcC9jYXRhY2
x5c20uY3giLFwgcmVzcG9uc2U9ZDM4OGRhZDkwZDRiYmQ3NjBhMTUyMzIxZ
jIxNDNhZjcsY2hhcnNldD11dGYtOA==
</response>
The decoded response is:
username="rob",realm="cataclysm.cx",\
nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\
nc=00000001,qop=auth,digest-uri="xmpp/cataclysm.cx",\
response=d388dad90d4bbd760a152321f2143af7,charset=utf-8,\
authzid="rob@cataclysm.cx/myResource"
Step 7: Server sends another challenge to the client:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA==
</challenge>
The decoded challenge is:
rspauth=ea40f60335c427b5527b84dbabcdfffd
Step 8: Client responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9: Server informs client of successful authentication:
<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9 (alt): Server informs client of failed authentication:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<resource-conflict/>
</failure>
Step 10: Client initiates a new stream to the server:
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
to='domain'
version='1.0'>
Step 11: Server responds by sending a stream header to the client, with the stream already authenticated (not followed by further stream features):
<stream:stream
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
from='domain'
version='1.0'>
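The base64 handling in Steps 5 through 7 can be checked with a short program. The following Python sketch is illustrative only; the function name decode_challenge is hypothetical. It removes the line breaks used for presentation in the example, decodes the base64 text, and splits the resulting DIGEST-MD5 directive list into name/value pairs. Computing the digest response itself is not shown.

```python
import base64
import re

# name=value pairs, where the value is either a quoted string or a bare token
_PAIR = re.compile(r'([A-Za-z-]+)=(?:"((?:[^"\\]|\\.)*)"|([^,]*))')


def decode_challenge(b64_text):
    """Decode a base64 challenge and return its directives as a dict."""
    compact = "".join(b64_text.split())  # drop whitespace and line breaks
    raw = base64.b64decode(compact).decode("utf-8")
    fields = {}
    for match in _PAIR.finditer(raw):
        quoted, bare = match.group(2), match.group(3)
        fields[match.group(1)] = quoted if quoted is not None else bare
    return fields


# The challenge from Step 5 of the example above.
STEP5 = """
cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
"""
directives = decode_challenge(STEP5)
```

The same function applies to the second challenge in Step 7, which carries a single rspauth directive.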
The following example shows the data flow for a server authenticating with another server using SASL.
Step 1: Server1 initiates stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
Step 2: Server2 responds with a stream tag sent to Server1:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
Step 3: Server2 informs Server1 of available authentication mechanisms:
<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
<mechanism>KERBEROS_V4</mechanism>
</mechanisms>
</stream:features>
Step 4: Server1 selects an authentication mechanism:
<auth
xmlns='urn:ietf:params:xml:ns:xmpp-sasl'
mechanism='DIGEST-MD5'/>
Step 5: Server2 sends a base64-encoded challenge to Server1:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
</challenge>
The decoded challenge is:
realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",\
qop="auth",charset=utf-8,algorithm=md5-sess
Step 6: Server1 responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
xjbm9uY2U9Ik9BNk1IWGg2VnFUclJrIixuYz0wMDAwMDAwMSxxb3A9YXV0
aCxkaWdlc3QtdXJpPSJ4bXBwL2NhdGFjbHlzbS5jeCIscmVzcG9uc2U9ZD
M4OGRhZDkwZDRiYmQ3NjBhMTUyMzIxZjIxNDNhZjcsY2hhcnNldD11dGYt
OAo=
</response>
The decoded response is:
realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\
nc=00000001,qop=auth,digest-uri="xmpp/cataclysm.cx",\
response=d388dad90d4bbd760a152321f2143af7,charset=utf-8
Step 7: Server2 sends another challenge to Server1:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA==
</challenge>
The decoded challenge is:
rspauth=ea40f60335c427b5527b84dbabcdfffd
Step 8: Server1 responds to the challenge:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9: Server2 informs Server1 of successful authentication:
<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
Step 9 (alt): Server2 informs Server1 of failed authentication:
<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<temporary-auth-failure/>
</failure>
Step 10: Server1 initiates a new stream to Server2:
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
Step 11: Server2 responds by sending a stream header to Server1, with the stream already authenticated (not followed by further stream features):
<stream:stream
xmlns='jabber:server'
xmlns:stream='http://etherx.jabber.org/streams'
id='12345678'
version='1.0'>
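The base64 payloads in the SASL examples above can be checked mechanically. The following Python sketch is illustrative only: the helper names decode_sasl_payload and parse_directives are assumptions made for this example and are not defined by XMPP or SASL, and the directive parser is a simplification that does not handle escaped quotes inside quoted values.

```python
import base64
import re

# name=value pairs, where the value is either quoted or runs to the next comma
_DIRECTIVE = re.compile(r'([A-Za-z][\w-]*)=(?:"([^"]*)"|([^,]*))')


def decode_sasl_payload(text):
    """Decode the character data of a <challenge/> or <response/> element."""
    # The examples wrap long payloads across lines, so drop all whitespace first.
    compact = "".join(text.split())
    return base64.b64decode(compact).decode("utf-8")


def parse_directives(decoded):
    """Split a decoded DIGEST-MD5 string into a dict of directives."""
    result = {}
    for match in _DIRECTIVE.finditer(decoded):
        quoted, bare = match.group(2), match.group(3)
        result[match.group(1)] = quoted if quoted is not None else bare
    return result


# The Step 7 challenge from the examples above.
step7 = "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA=="
print(decode_sasl_payload(step7))  # rspauth=ea40f60335c427b5527b84dbabcdfffd
```

Decoding before parsing keeps the transport encoding separate from the mechanism-specific syntax, which is defined by the SASL mechanism rather than by XMPP.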
XMPP includes a protocol-level method for verifying that a connection between two servers can be trusted as much as the DNS can be trusted. The method is called dialback and is used only within XML streams that are declared under the "jabber:server" namespace.
The purpose of the dialback protocol is to make server spoofing more difficult, and thus to make it more difficult to forge XML stanzas. Dialback is decidedly not intended as a mechanism for securing or encrypting the streams between servers as is done via SASL and TLS, only for helping to prevent the spoofing of a server and the sending of false data from it. In particular, dialback authentication is susceptible to DNS poisoning attacks unless DNSSec[29] is used. Furthermore, even if the DNS information is accurate, dialback authentication cannot protect from attacks where the attacker is capable of hijacking the IP address of the remote domain. Domains requiring more robust security SHOULD use TLS and SASL as defined above.
Server dialback is made possible by the existence of DNS, since one server can verify that another server which is connecting to it is authorized to represent a given hostname. All DNS hostname resolutions MUST first resolve the hostname using an SRV[17] record of _jabber._tcp.server. If the SRV lookup fails, the fallback is a normal A lookup to determine the IP address, using the jabber-server port of 5269 assigned by the Internet Assigned Numbers Authority[5].
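The lookup order described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the Python standard library has no SRV resolver, so srv_lookup is a hypothetical callable supplied by the caller that returns (target, port) pairs and raises LookupError on failure; the sketch also reads "server" in the SRV name as the hostname being resolved. The fallback A lookup itself is left to the socket layer when a connection to the returned host is opened.

```python
JABBER_SERVER_PORT = 5269  # jabber-server port named in the text


def resolve_targets(hostname, srv_lookup):
    """Return (host, port) pairs to try: SRV results first, else the fallback."""
    try:
        # srv_lookup is a hypothetical resolver hook, not a library function.
        targets = list(srv_lookup("_jabber._tcp." + hostname))
    except LookupError:
        targets = []
    if targets:
        return targets
    # SRV lookup failed or returned nothing: fall back to the hostname itself
    # on the registered server-to-server port.
    return [(hostname, JABBER_SERVER_PORT)]
```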
The method for generating and verifying the keys used in the dialback protocol MUST take into account the hostnames being used, the random ID generated for the stream, and a secret known by the authoritative server's network. Generating unique but verifiable keys is important to prevent common man-in-the-middle attacks and server spoofing.
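One way to satisfy these inputs is a keyed hash over the two hostnames and the stream ID. The Python sketch below uses HMAC-SHA256, which is a choice made for this example and is not mandated by this document; the function names and the NUL separator are likewise assumptions.

```python
import hashlib
import hmac


def make_dialback_key(secret, receiving, originating, stream_id):
    """Derive a dialback key from the hostnames, the stream ID, and a secret."""
    # A separator that cannot appear in hostnames keeps the fields unambiguous.
    message = "\x00".join([receiving, originating, stream_id]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_dialback_key(secret, receiving, originating, stream_id, key):
    """Recompute the key on the authoritative side and compare in constant time."""
    expected = make_dialback_key(secret, receiving, originating, stream_id)
    return hmac.compare_digest(expected.encode("ascii"), key.encode("utf-8"))
```

Because the key depends on the stream ID chosen by the receiving server, a key captured on one stream does not verify on another.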
Any error that occurs during dialback negotiation MUST be considered a stream error, resulting in termination of the stream and of the underlying TCP connection. The possible error conditions are specified in the protocol description below.
The following terminology applies:
The following is a brief summary of the order of events in dialback:
We can represent this flow of events graphically as follows:
Originating                Receiving
  Server                    Server
-----------                ---------
     |                         |
     |  establish connection   |
     | ----------------------> |
     |                         |
     |   send stream header    |
     | ----------------------> |
     |                         |
     |  establish connection   |
     | <---------------------- |
     |                         |
     |   send stream header    |
     | <---------------------- |
     |                         |                   Authoritative
     |   send dialback key     |                      Server
     | ----------------------> |                   -------------
     |                         |                         |
                               |  establish connection   |
                               | ----------------------> |
                               |                         |
                               |   send stream header    |
                               | ----------------------> |
                               |                         |
                               |  establish connection   |
                               | <---------------------- |
                               |                         |
                               |   send stream header    |
                               | <---------------------- |
                               |                         |
                               |   send dialback key     |
                               | ----------------------> |
                               |                         |
                               |  validate dialback key  |
                               | <---------------------- |
                               |
     |  report dialback result |
     | <---------------------- |
     |                         |
The interaction between the servers is as follows:
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'>
Note: the 'to' and 'from' attributes are NOT REQUIRED on the root stream element. The inclusion of the xmlns:db namespace declaration with the name shown indicates to Receiving Server that Originating Server supports dialback. If the namespace name is incorrect, then Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the stream and the underlying TCP connection.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'
id='457F9224A0...'>
Note: Receiving Server is NOT REQUIRED to reply, and SHOULD NOT reply if there exists an established session between Receiving Server and the hostname asserted by Originating Server. The 'to' and 'from' attributes are NOT REQUIRED on the root stream element. If the namespace name is incorrect, then Originating Server MUST generate an <invalid-namespace/> stream error condition and terminate both the stream and the underlying TCP connection.
<db:result
to='Receiving Server'
from='Originating Server'>
98AF014EDC0...
</db:result>
Note: this key is not examined by Receiving Server, since Receiving Server does not keep information about Originating Server between sessions. The key generated by Originating Server must be based in part on the value of the ID provided by Receiving Server in the previous step, and in part on a secret shared by Originating Server and Authoritative Server. If the value of the 'to' address does not match a hostname recognized by Receiving Server, then Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by Originating Server when opening the TCP connection (or any validated domain), then Receiving Server MUST generate a <nonmatching-hosts/> stream error condition and terminate both the stream and the underlying TCP connection.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'>
Note: the 'to' and 'from' attributes are NOT REQUIRED on the root stream element. If the namespace name is incorrect, then Authoritative Server MUST generate an <invalid-namespace/> stream error condition and terminate both the stream and the underlying TCP connection.
<stream:stream
xmlns:stream='http://etherx.jabber.org/streams'
xmlns='jabber:server'
xmlns:db='jabber:server:dialback'
id='1251A342B...'>
Note: if the namespace name is incorrect, then Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the stream and the underlying TCP connection between it and Authoritative Server. If the ID does not match that provided by Receiving Server in Step 3, then Receiving Server MUST generate an <invalid-id/> stream error condition and terminate both the stream and the underlying TCP connection between it and Authoritative Server. If either of the foregoing stream errors occurs between Receiving Server and Authoritative Server, then Receiving Server MUST generate a <remote-connection-failed/> stream error condition and terminate both the stream and the underlying TCP connection between it and Originating Server.
<db:verify
from='Receiving Server'
to='Originating Server'
id='457F9224A0...'>
98AF014EDC0...
</db:verify>
Note: passed here are the hostnames, the original identifier from Receiving Server's stream header to Originating Server in Step 3, and the key that Originating Server sent to Receiving Server in Step 4. Based on this information and shared secret information within the Authoritative Server's network, the key is verified. Any verifiable method MAY be used to generate the key. If the value of the 'to' address does not match a hostname recognized by Authoritative Server, then Authoritative Server MUST generate a <host-unknown/> stream error condition and terminate both the stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by Receiving Server when opening the TCP connection (or any validated domain), then Authoritative Server MUST generate a <nonmatching-hosts/> stream error condition and terminate both the stream and the underlying TCP connection.
<db:verify
from='Originating Server'
to='Receiving Server'
type='valid'
id='457F9224A0...'/>
or
<db:verify
from='Originating Server'
to='Receiving Server'
type='invalid'
id='457F9224A0...'/>
Note: if the ID does not match that provided by Receiving Server in Step 3, then Receiving Server MUST generate an <invalid-id/> stream error condition and terminate both the stream and the underlying TCP connection. If the value of the 'to' address does not match a hostname recognized by Receiving Server, then Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by Originating Server when opening the TCP connection (or any validated domain), then Receiving Server MUST generate a <nonmatching-hosts/> stream error condition and terminate both the stream and the underlying TCP connection.
<db:result
from='Receiving Server'
to='Originating Server'
type='valid'/>
Note: At this point the connection has either been validated via a type='valid', or reported as invalid. If the connection is invalid, then Receiving Server MUST terminate both the stream and the underlying TCP connection. If the connection is validated, data can be sent by Originating Server and read by Receiving Server; before that, all data stanzas sent to Receiving Server SHOULD be silently dropped.
Even if dialback negotiation is successful, a server MUST verify that all XML stanzas received from the other server include a 'from' attribute and a 'to' attribute; if a stanza does not meet this restriction, the server that receives the stanza MUST generate an <invalid-xml/> stream error condition and terminate both the stream and the underlying TCP connection. Furthermore, a server MUST verify that the 'from' attribute of stanzas received from the other server includes the validated domain (or any validated domain); if a stanza does not meet this restriction, the server that receives the stanza MUST generate a <nonmatching-hosts/> stream error condition and terminate both the stream and the underlying TCP connection. Both of these checks help to prevent spoofing related to particular stanzas.
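The two per-stanza checks above can be sketched in Python as follows. This is a minimal illustration: the function names are assumptions, validated_domains is assumed to be a set of lowercase hostnames established by dialback or SASL, and JID parsing is reduced to splitting on "/" and "@".

```python
import xml.etree.ElementTree as ET


def domain_of(jid):
    """Return the domain identifier of a JID of the form [user@]domain[/resource]."""
    bare = jid.split("/", 1)[0]
    return bare.split("@", 1)[-1].lower()


def check_server_stanza(stanza, validated_domains):
    """Return the stream error condition to generate, or None if the stanza passes."""
    if not stanza.get("to") or not stanza.get("from"):
        return "invalid-xml"
    if domain_of(stanza.get("from")) not in validated_domains:
        return "nonmatching-hosts"
    return None


stanza = ET.fromstring("<message to='a@example.org' from='b@example.com/home'/>")
print(check_server_stanza(stanza, {"example.com"}))  # None
```

A caller that receives a non-None result is expected to send the named stream error and then close the stream and the TCP connection.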
Once the XML streams in each direction have been authenticated and (if desired) encrypted, XML stanzas can be sent over the streams. Three XML stanza types are defined for the 'jabber:client' and 'jabber:server' namespaces: <message/>, <presence/>, and <iq/>.
In essence, the <message/> stanza type can be seen as a "push" mechanism whereby one entity pushes information to another entity, similar to the communications that occur in a system such as email. The <presence/> element can be seen as a basic broadcast or "publish-subscribe" mechanism, whereby multiple entities receive information (in this case, presence information) about an entity to which they have subscribed. The <iq/> element can be seen as a "request-response" mechanism similar to HTTP, whereby two entities can engage in a structured conversation using 'get' or 'set' requests and 'result' or 'error' responses.
The syntax for these stanza types is defined below.
Five attributes are common to message, presence, and IQ stanzas. These are defined below.
The 'to' attribute specifies the JID of the intended recipient for the stanza.
In the 'jabber:client' namespace, a stanza SHOULD possess a 'to' attribute, although a stanza sent from a client to a server for handling by that server (e.g., presence sent to the server for broadcasting to other entities) MAY legitimately lack a 'to' attribute.
In the 'jabber:server' namespace, a stanza MUST possess a 'to' attribute; if a server receives a stanza that does not meet this restriction, it MUST generate an <invalid-xml/> stream error condition and terminate both the stream and the underlying TCP connection.
The 'from' attribute specifies the JID of the sender.
In the 'jabber:client' namespace, a client MUST NOT include a 'from' attribute on the stanzas it sends to a server; if a server receives a stanza from a client and the stanza possesses a 'from' attribute, it MUST ignore the value of the 'from' attribute and MAY return an error to the sender. In addition, a server MUST stamp stanzas received from a client with the user@domain/resource (full JID) of the connected resource that generated the stanza.
In the 'jabber:server' namespace, a stanza MUST possess a 'from' attribute; if a server receives a stanza that does not meet this restriction, it MUST generate an <invalid-xml/> stream error condition. Furthermore, the domain identifier portion of the JID contained in the 'from' attribute MUST match the hostname of the sending server (or any validated domain) as communicated in the SASL negotiation or dialback negotiation; if a server receives a stanza that does not meet this restriction, it MUST generate a <nonmatching-hosts/> stream error condition. Both of these conditions MUST result in closing of the stream and termination of the underlying TCP connection.
The optional 'id' attribute MAY be used to track stanzas sent and received. The 'id' attribute is generated by the sender. An 'id' attribute included in an IQ request of type "get" or "set" SHOULD be returned to the sender in any IQ response of type "result" or "error" generated by the recipient of the request. A recipient of a message or presence stanza MAY return that 'id' in any replies, but is NOT REQUIRED to do so.
The value of the 'id' attribute is not intended to be unique -- globally, within a domain, or within a stream. It is generated by a sender only for internal tracking of information within the sending application.
The 'type' attribute specifies detailed information about the purpose or context of the message, presence, or IQ stanza. The particular allowable values for the 'type' attribute vary depending on whether the stanza is a message, presence, or IQ, and thus are specified in the following sections.
Any message or presence stanza MAY possess an 'xml:lang' attribute specifying the default language of any CDATA sections of the stanza or its child elements. An IQ stanza SHOULD NOT possess an 'xml:lang' attribute, since it is merely a vessel for data in other namespaces and does not itself contain children that have CDATA. The value of the 'xml:lang' attribute MUST be an NMTOKEN and MUST conform to the format defined in RFC 3066[16].
Message stanzas in the 'jabber:client' or 'jabber:server' namespace are used to "push" information to another entity. Common uses in the context of instant messaging include single messages, messages sent in the context of a chat conversation, messages sent in the context of a multi-user chat room, headlines, and errors. These message types are identified more fully below.
The 'type' attribute of a message stanza is OPTIONAL; if included, it specifies the conversational context of the message. The sending of a message stanza without a 'type' attribute signals that the message stanza is a single message. However, the 'type' attribute MAY also have one of the following values:
For information about the meaning of these message types, refer to XMPP IM[22].
As described under extended namespaces, a message stanza MAY contain any properly-namespaced child element as long as the namespace name is not "jabber:client", "jabber:server", or "http://etherx.jabber.org/streams".
In accordance with the default namespace declaration, by default a message stanza is in the 'jabber:client' or 'jabber:server' namespace, which defines certain allowable children of message stanzas. If the message stanza is of type "error", it MUST include an <error/> child; for details, see Stanza Errors. If the message stanza has no 'type' attribute or has a 'type' attribute with a value of "chat", "groupchat", or "headline", it MAY contain any of the following child elements without an explicit namespace declaration:
The <body/> element contains the textual contents of the message; it is normally included but is NOT REQUIRED. The <body/> element SHOULD NOT possess any attributes, with the exception of the 'xml:lang' attribute. Multiple instances of the <body/> element MAY be included but only if each instance possesses an 'xml:lang' attribute with a distinct language value. The <body/> element MUST NOT contain mixed content.
The <subject/> element specifies the topic of the message. The <subject/> element SHOULD NOT possess any attributes, with the exception of the 'xml:lang' attribute. Multiple instances of the <subject/> element MAY be included for the purpose of providing alternate versions of the same subject, but only if each instance possesses an 'xml:lang' attribute with a distinct language value. The <subject/> element MUST NOT contain mixed content.
The <thread/> element contains a random string that is generated by the sender and that SHOULD be copied back in replies; it is used for tracking a conversation thread (sometimes referred to as an "IM session") between two entities. If used, it MUST be unique to that conversation thread within the stream and MUST be consistent throughout that conversation. The use of the <thread/> element is optional and is not used to identify individual messages, only conversations. Only one <thread/> element MAY be included in a message stanza, and it MUST NOT possess any attributes. The <thread/> element MUST be treated as an opaque string by entities; no semantic meaning may be derived from it, and only exact, case-insensitive comparisons may be made against it. The <thread/> element MUST NOT contain mixed content.
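The rule shared by <body/> and <subject/> above (multiple instances only when each carries a distinct 'xml:lang' value) can be checked as in the Python sketch below. The function name is an assumption, the stanza is assumed to be in the 'jabber:client' namespace, and comparing language tags case-insensitively is a choice made for this example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def localized_children_ok(stanza, name, ns="jabber:client"):
    """Check the multiple-instance rule for <body/> or <subject/> children."""
    children = stanza.findall("{%s}%s" % (ns, name))
    if len(children) <= 1:
        return True  # a single instance needs no 'xml:lang'
    langs = [child.get(XML_LANG) for child in children]
    if None in langs:
        return False  # every instance must carry 'xml:lang'
    lowered = [lang.lower() for lang in langs]
    return len(set(lowered)) == len(lowered)


message = ET.fromstring(
    "<message xmlns='jabber:client'>"
    "<body xml:lang='en'>Hello</body>"
    "<body xml:lang='cs'>Ahoj</body>"
    "</message>"
)
print(localized_children_ok(message, "body"))  # True
```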
The method for generating thread IDs SHOULD be as follows:
Presence stanzas are used in the 'jabber:client' or 'jabber:server' namespace to express an entity's current availability status (offline or online, along with various sub-states of the latter and optional user-defined descriptive text) and to communicate that status to other entities. Presence stanzas are also used to negotiate and manage subscriptions to the presence of other entities.
The 'type' attribute of a presence stanza is optional. A presence stanza that does not possess a 'type' attribute is used to signal to the server that the sender is online and available for communication. If included, the 'type' attribute specifies a lack of availability, a request to manage a subscription to another entity's presence, a request for another entity's current presence, or an error related to a previously-sent presence stanza. The 'type' attribute MAY have one of the following values:
Information about the subscription model used within XMPP can be found in XMPP IM[22].
As described under extended namespaces, a presence stanza MAY contain any properly-namespaced child element as long as the namespace name is not "jabber:client", "jabber:server", or "http://etherx.jabber.org/streams".
In accordance with the default namespace declaration, by default a presence stanza is in the 'jabber:client' or 'jabber:server' namespace, which defines certain allowable children of presence stanzas. If the presence stanza is of type "error", it MUST include an <error/> child; for details, see Stanza Errors. If the presence stanza possesses no 'type' attribute, it MAY contain any of the following child elements (note that the <status/> child MAY be sent in a presence stanza of type "unavailable" or, for historical reasons, "subscribe"):
The optional <show/> element specifies a particular availability status of an entity or specific resource (if a <show/> element is not provided, default availability is assumed). Only one <show/> element MAY be included in a presence stanza, and it SHOULD NOT possess any attributes. The CDATA value SHOULD be one of the following (values other than these four SHOULD be ignored; additional availability types could be defined through a properly-namespaced child element of the presence stanza):
For information about the meaning of these values, refer to XMPP IM[22].
The optional <status/> element contains a natural-language description of availability status. It is normally used in conjunction with the show element to provide a detailed description of an availability state (e.g., "In a meeting"). The <status/> element SHOULD NOT possess any attributes, with the exception of the 'xml:lang' attribute. Multiple instances of the <status/> element MAY be included but only if each instance possesses an 'xml:lang' attribute with a distinct language value.
The optional <priority/> element specifies the priority level of the connected resource. The value may be any integer between -128 and 127. Only one <priority/> element MAY be included in a presence stanza, and it MUST NOT possess any attributes. For information regarding the use of priority values in stanza routing within IM applications, see XMPP IM[22].
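A receiving application might validate the <priority/> value as in the Python sketch below. The function name is an assumption, and the range is treated as inclusive at both ends (the bounds of a signed 8-bit integer).

```python
def parse_priority(text):
    """Parse the character data of <priority/>; raise ValueError if out of range."""
    value = int(text.strip())  # int() raises ValueError for non-integer text
    if not -128 <= value <= 127:
        raise ValueError("priority out of range: %d" % value)
    return value
```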
Info/Query, or IQ, is a request-response mechanism, similar in some ways to HTTP[30]. IQ stanzas in the 'jabber:client' or 'jabber:server' namespace enable an entity to make a request of, and receive a response from, another entity. The data content of the request and response is defined by the namespace declaration of a direct child element of the IQ element, and the interaction is tracked by the requesting entity through use of the 'id' attribute, which responding entities SHOULD return in any response.
Most IQ interactions follow a common pattern of structured data exchange such as get/result or set/result (although an error may be returned in response to a request if appropriate):
Requesting                 Responding
  Entity                     Entity
----------                 ----------
    |                           |
    | <iq type='get' id='1'>    |
    | ------------------------> |
    |                           |
    | <iq type='result' id='1'> |
    | <------------------------ |
    |                           |
    | <iq type='set' id='2'>    |
    | ------------------------> |
    |                           |
    | <iq type='result' id='2'> |
    | <------------------------ |
    |                           |
An entity that receives an IQ request of type 'get' or 'set' MUST reply with an IQ response of type 'result' or 'error' (which response MUST preserve the 'id' attribute of the request). An entity that receives a stanza of type 'result' or 'error' MUST NOT respond to the stanza by sending a further IQ response of type 'result' or 'error'; however, as shown above, the requesting entity MAY send another request (e.g., an IQ of type 'set' in order to provide required information discovered through a get/result pair).
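The reply rules above can be sketched in Python as follows. This is an illustration only: the function name is an assumption, the stanza is parsed without a default namespace for brevity, and swapping the 'to' and 'from' addresses in the reply is an assumption of this example rather than a rule stated in this section.

```python
import xml.etree.ElementTree as ET


def make_iq_result(request):
    """Build an IQ 'result' for a 'get' or 'set'; return None if no reply is allowed."""
    if request.get("type") not in ("get", "set"):
        return None  # 'result' and 'error' stanzas MUST NOT be answered
    reply = ET.Element("iq", {"type": "result"})
    if request.get("id") is not None:
        reply.set("id", request.get("id"))  # the reply preserves the request 'id'
    if request.get("from") is not None:
        reply.set("to", request.get("from"))
    if request.get("to") is not None:
        reply.set("from", request.get("to"))
    return reply


request = ET.fromstring("<iq type='get' id='1' from='a@example.com/x' to='example.com'/>")
print(make_iq_result(request).get("id"))  # 1
```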
The 'type' attribute of an IQ stanza is REQUIRED. The 'type' attribute specifies a distinct step within a request-response interaction. The value SHOULD be one of the following (all other values SHOULD be ignored):
As described under extended namespaces, an IQ stanza MAY contain any properly-namespaced child element as long as the namespace name is not "jabber:client", "jabber:server", or "http://etherx.jabber.org/streams". However, an IQ stanza contains no children in the 'jabber:client' or 'jabber:server' namespace since it is a vessel for XML in another namespace.
If the IQ stanza is of type "error", it MUST include an <error/> child; for details, see Stanza Errors.
While the core data elements in the "jabber:client" or "jabber:server" namespace (along with their attributes and child elements) provide a basic level of functionality for messaging and presence, XMPP uses XML namespaces to extend the core data elements for the purpose of providing additional functionality. Thus a message, presence, or IQ stanza MAY house one or more optional child elements containing content that extends the meaning of the message (e.g., an encrypted form of the message body). This child element MAY have any name and MUST possess an 'xmlns' namespace declaration (other than "jabber:client", "jabber:server", or "http://etherx.jabber.org/streams") that defines all data contained within the child element.
Support for any given extended namespace is OPTIONAL on the part of any implementation. If an entity does not understand such a namespace, the entity's expected behavior depends on whether the entity is (1) the recipient or (2) an entity that is routing the stanza to the recipient. In particular:
- Recipient:
- If a recipient receives a stanza that contains a child element it does not understand, it SHOULD ignore that specific XML data, i.e., it SHOULD NOT process it or present it to a user or associated application (if any). In particular:
- If an entity receives a message or presence stanza that contains XML data in an extended namespace it does not understand, the portion of the stanza that is in the unknown namespace SHOULD be ignored.
- If an entity receives a message stanza without a <body/> element but containing only a child element bound by a namespace it does not understand, it MUST ignore the entire stanza.
- If an entity receives an IQ stanza in a namespace it does not understand, the entity SHOULD return an IQ stanza of type "error" with an error condition of <feature-not-implemented/>.
- Router:
- If a routing entity (usually a server) handles a stanza that contains a child element it does not understand, it SHOULD ignore the associated XML data by passing it on untouched to the recipient.
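The recipient rules above can be sketched in Python as follows. The function names and the returned action strings are assumptions made for this example, 'urn:example:unknown' is a hypothetical namespace, and the set of understood namespaces is assumed to include the default stream namespace. The error reply is limited to IQ requests of type 'get' or 'set'.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace name of an ElementTree element ('' if none)."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def recipient_action(stanza, understood):
    """Return 'process', 'ignore-stanza', or 'feature-not-implemented'."""
    kind = stanza.tag.rsplit("}", 1)[-1]
    children = list(stanza)
    unknown = [c for c in children if namespace_of(c) not in understood]
    if kind == "iq":
        if unknown and stanza.get("type") in ("get", "set"):
            return "feature-not-implemented"
        return "process"
    if kind == "message" and children and len(unknown) == len(children):
        return "ignore-stanza"  # no <body/>, only unknown extensions
    return "process"  # the caller skips children in unknown namespaces


stanza = ET.fromstring(
    "<message xmlns='jabber:client'><x xmlns='urn:example:unknown'/></message>"
)
print(recipient_action(stanza, {"jabber:client"}))  # ignore-stanza
```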
As defined below, stanza-related errors are handled in a manner similar to stream errors.
The following rules apply to stanza-related errors:
The syntax for stanza-related errors is as follows:
<stanza-name to='sender' type='error'>
[include sender XML here]
<error class='error-class'>
<stanza-condition xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>
<descriptive-element-name/>
</stanza-condition>
</error>
</stanza-name>
The stanza-name is one of message, presence, or iq.
The value of the 'class' attribute MUST be one of the following:
The <stanza-condition/> element MUST contain a child element that specifies a particular stanza-related error condition, as defined in the next section. (Note: the XML namespace name 'urn:ietf:params:xml:ns:xmpp-stanzas' that scopes the <stanza-condition/> element adheres to the format defined in The IETF XML Registry[24].)
The following stanza-related error conditions are defined:
If desired, an XMPP application MAY provide custom error information; the error class MUST be "app" and the data MUST be contained in a properly-namespaced child of the <stanza-condition/> element (i.e., the namespace name MUST NOT be one of namespace names defined herein).
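The error syntax shown above can be assembled as in the Python sketch below. The function name is an assumption, the 'class' value is passed through unchecked because the permitted values are specified in the text rather than in this sketch, and copying the request 'id' and addressing the reply to the original 'from' are conveniences of this example.

```python
import copy
import xml.etree.ElementTree as ET

STANZA_ERROR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"


def make_error_reply(original, error_class, condition):
    """Build a type='error' stanza that echoes the sender's XML."""
    tag = original.tag
    # Reuse the namespace of the original stanza, if it carries one.
    ns = tag[: tag.index("}") + 1] if tag.startswith("{") else ""
    reply = ET.Element(tag, {"type": "error"})
    if original.get("from") is not None:
        reply.set("to", original.get("from"))
    if original.get("id") is not None:
        reply.set("id", original.get("id"))
    for child in original:
        reply.append(copy.deepcopy(child))  # [include sender XML here]
    error = ET.SubElement(reply, ns + "error", {"class": error_class})
    holder = ET.SubElement(error, "{%s}stanza-condition" % STANZA_ERROR_NS)
    ET.SubElement(holder, "{%s}%s" % (STANZA_ERROR_NS, condition))
    return reply
```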
XMPP is a simplified and specialized protocol for streaming XML elements in order to exchange messages and presence information in close to real time. Because XMPP does not require the parsing of arbitrary and complete XML documents, there is no requirement that XMPP must support the full XML specification[1]. In particular, the following restrictions apply:
With regard to XML generation, an XMPP implementation MUST NOT inject into an XML stream any of the following:
With regard to XML processing, if an XMPP implementation receives such restricted XML data, it MUST ignore the data.
XML Namespaces[12] are used within all XMPP-compliant XML to create strict boundaries of data ownership. The basic function of namespaces is to separate different vocabularies of XML elements that are structurally mixed together. Ensuring that XMPP-compliant XML is namespace-aware enables any XML to be structurally mixed with any data element within XMPP.
Additionally, XMPP is more strict about namespace prefixes than the XML namespace specification requires.
Except as noted with regard to 'to' and 'from' addresses for stanzas within the 'jabber:server' namespace, a server is not responsible for validating the XML elements forwarded to a client or another server; an implementation MAY choose to provide only validated data elements but is NOT REQUIRED to do so. Clients SHOULD NOT rely on the ability to send data which does not conform to the schemas, and SHOULD ignore any non-conformant elements or attributes on the incoming XML stream. Validation of XML streams and stanzas is NOT REQUIRED or recommended, and schemas are included herein for descriptive purposes only.
Software implementing XML streams MUST support the UTF-8 (RFC 2279[18]) and UTF-16 (RFC 2781[19]) transformations of Universal Character Set (ISO/IEC 10646-1[20]) characters. Software MUST NOT attempt to use any other encoding for transmitted data. The encodings of the transmitted and received streams are independent. Software MAY select either UTF-8 or UTF-16 for the transmitted stream, and SHOULD deduce the encoding of the received stream as described in the XML specification[1]. For historical reasons, existing implementations MAY support UTF-8 only.
An application MAY send a text declaration. Applications MUST follow the rules in the XML specification[1] regarding the circumstances under which a text declaration is included.
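Deducing the encoding of a received stream can be sketched in Python as follows. The sketch is loosely modelled on the autodetection guidance in the XML specification and covers only the two encodings permitted here; the function name is an assumption, and the return values are Python codec names.

```python
import codecs


def sniff_stream_encoding(first_bytes):
    """Guess the codec of a received stream from its first few bytes."""
    if first_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # UTF-8 with a byte order mark
    if first_bytes.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"  # the byte order mark selects the byte order
    if first_bytes.startswith(b"\x00<"):
        return "utf-16-be"  # '<' encoded big-endian without a mark
    if first_bytes.startswith(b"<\x00"):
        return "utf-16-le"  # '<' encoded little-endian without a mark
    return "utf-8"
```

Because every stream begins with "<" (a text declaration or the stream header), the first two bytes are enough to distinguish the permitted encodings.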
A URN sub-namespace for TLS-related data in XMPP is defined as follows.
- URI:
- urn:ietf:params:xml:ns:xmpp-tls
- Specification:
- [RFCXXXX]
- Description:
- This is the XML namespace name for TLS-related data in XMPP as defined by [RFCXXXX].
- Registrant Contact:
- IETF, XMPP Working Group, <xmppwg@jabber.org>
A URN sub-namespace for SASL-related data in XMPP is defined as follows.
- URI:
- urn:ietf:params:xml:ns:xmpp-sasl
- Specification:
- [RFCXXXX]
- Description:
- This is the XML namespace name for SASL-related data in XMPP as defined by [RFCXXXX].
- Registrant Contact:
- IETF, XMPP Working Group, <xmppwg@jabber.org>
A URN sub-namespace for XMPP stream error tags is defined as follows.
- URI:
- urn:ietf:params:xml:ns:xmpp-streams
- Specification:
- [RFCXXXX]
- Description:
- This is the XML namespace name for XMPP stream errors as defined by [RFCXXXX].
- Registrant Contact:
- IETF, XMPP Working Group, <xmppwg@jabber.org>
A URN sub-namespace for XMPP stanza error tags is defined as follows.
- URI:
- urn:ietf:params:xml:ns:xmpp-stanzas
- Specification:
- [RFCXXXX]
- Description:
- This is the XML namespace name for XMPP stanza errors as defined by [RFCXXXX].
- Registrant Contact:
- IETF, XMPP Working Group, <xmppwg@jabber.org>
The IANA registers "xmpp" as a GSSAPI[21] service name, as specified in SASL Definition.
Additionally, the IANA registers "jabber-client" and "jabber-server" as keywords for TCP ports 5222 and 5269 respectively.
Usage of the 'xml:lang' attribute is described above. If a client includes an 'xml:lang' attribute in a stanza, a server MUST NOT modify or delete it.
For the purposes of XMPP communications (client-to-server and server-to-server), the term "high security" refers to the use of security technologies that provide both mutual authentication and integrity-checking; in particular, when using certificate-based authentication to provide high security, a chain-of-trust SHOULD be established out-of-band, although a shared certificate authority signing certificates could allow a previously unknown certificate to establish trust in-band.
Self-signed certificates MAY be used but pose a problem for administrators the first time such a certificate is seen. An entity that accepts a self-signed certificate MUST store it so that the certificate can be verified in future communications. A server that replaces its self-signed certificate with another self-signed certificate (or with a certificate signed by an unrecognized authority) therefore creates administration problems for all entities with which it has communicated before and will communicate again. In particular, those entities have no reason to believe that the new self-signed certificate was not generated by an attacker in order to impersonate the previously trusted server.
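The storage requirement for accepted self-signed certificates can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name CertificateStore, its method names, the three result strings, and the choice to keep certificates in memory as raw bytes are all assumptions made for this example.

```python
class CertificateStore:
    """Remembers accepted self-signed certificates, keyed by server hostname.

    Assumption: certificates are held in memory as raw encoded bytes.
    A real deployment would persist them across restarts.
    """

    def __init__(self):
        self._accepted = {}

    def check(self, hostname, certificate):
        """Compare a presented certificate with the stored one, if any."""
        stored = self._accepted.get(hostname)
        if stored is None:
            # First time this certificate is seen: an administrator decides.
            return "unknown"
        if stored == certificate:
            return "match"
        # The certificate differs from the stored one; nothing in-band
        # distinguishes a legitimate replacement from an impersonation.
        return "changed"

    def accept(self, hostname, certificate):
        """Store a certificate once it has been accepted."""
        self._accepted[hostname] = certificate


store = CertificateStore()
first_result = store.check("example.com", b"certificate-one")
store.accept("example.com", b"certificate-one")
second_result = store.check("example.com", b"certificate-one")
third_result = store.check("example.com", b"certificate-two")
print(first_result, second_result, third_result)
```

The "changed" result corresponds to the administration problem described above: the entity cannot tell from the certificate alone whether the replacement is legitimate.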
Implementations MUST support high security. Service provisioning SHOULD use high security, subject to local security policies.
The TLS protocol for encrypting XML streams (defined under Stream Encryption) provides a reliable mechanism for helping to ensure the confidentiality and data integrity of data exchanged between two entities.
The SASL protocol for authenticating XML streams (defined under SASL Authentication) provides a reliable mechanism for validating that a client connecting to a server is who it claims to be.
A server MUST NOT make the IP address or method of access of a client available to other entities, and no connection other than the client's original connection to its server is required. This helps protect the client's server from direct attack or identification by third parties.
End-to-end encryption of message bodies and presence status information MAY be effected through use of the methods defined in End-to-End Object Encryption in XMPP[31].
A compliant implementation MUST support both TLS and SASL for inter-domain communications. For historical reasons, a compliant implementation SHOULD also support the lower-security Dialback Protocol, which provides a mechanism for helping to prevent the spoofing of domains.
Because service provisioning is a matter of policy, it is OPTIONAL for any given domain to communicate with other domains, and server-to-server communications MAY be disabled by the administrator of any given deployment. If a particular domain enables inter-domain communications, it SHOULD enable high security. In the absence of high security, a domain MAY use server dialback for inter-domain communications.
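The provisioning choices described above can be sketched as a small decision function. This is an illustrative reading of the text, not a normative algorithm: the function name, its parameters, and the returned strings are assumptions, as is the final "refuse" branch, which covers a case the text does not address.

```python
def choose_s2s_method(s2s_enabled, high_security_available, dialback_available):
    """Pick how a domain handles a server-to-server connection.

    All parameter names and return values are hypothetical labels.
    """
    if not s2s_enabled:
        # Inter-domain communications are OPTIONAL and MAY be disabled.
        return "disabled"
    if high_security_available:
        # A domain that enables inter-domain communications SHOULD
        # enable high security.
        return "high-security"
    if dialback_available:
        # In the absence of high security, server dialback MAY be used.
        return "dialback"
    # Assumption: with neither option available, the connection is refused.
    return "refuse"


print(choose_s2s_method(True, False, True))
```

High security is tested before dialback so that the lower-security mechanism is selected only when high security is unavailable.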
Administrators may want to require use of SASL for server-to-server communications in order to ensure authentication and confidentiality (e.g., on an organization's private network). Compliant implementations SHOULD support SASL for this purpose.
Communications using XMPP normally occur over TCP sockets on port 5222 (client-to-server) or port 5269 (server-to-server), as registered with the IANA[5]. Use of these well-known ports allows administrators to easily enable or disable XMPP activity through existing and commonly-deployed firewalls.
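The port assignments can be captured in a small lookup, sketched below. The keywords and port numbers come from this document; the names DEFAULT_PORTS, CONNECTION_KEYWORDS, and default_port are assumptions made for this example.

```python
# Keywords and TCP ports as registered with the IANA.
DEFAULT_PORTS = {
    "jabber-client": 5222,
    "jabber-server": 5269,
}

# Hypothetical mapping from connection type to registered keyword.
CONNECTION_KEYWORDS = {
    "client-to-server": "jabber-client",
    "server-to-server": "jabber-server",
}


def default_port(connection_type):
    """Return the well-known TCP port for the given connection type."""
    return DEFAULT_PORTS[CONNECTION_KEYWORDS[connection_type]]


print(default_port("client-to-server"), default_port("server-to-server"))
```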
At a minimum, all implementations MUST support the following mechanisms:
- for authentication: the SASL DIGEST-MD5 mechanism
- for confidentiality: TLS (using the TLS_RSA_WITH_3DES_EDE_CBC_SHA cipher)
- for both: TLS (using the TLS_RSA_WITH_3DES_EDE_CBC_SHA cipher supporting client-side certificates)
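The minimum requirements listed above can be expressed as a simple conformance check. This is a sketch under stated assumptions: the function name, its parameters, and the wording of the reported items are hypothetical, and the inputs are plain sets of names rather than the output of any real TLS or SASL library.

```python
REQUIRED_SASL_MECHANISM = "DIGEST-MD5"
REQUIRED_TLS_CIPHER = "TLS_RSA_WITH_3DES_EDE_CBC_SHA"


def missing_requirements(sasl_mechanisms, tls_ciphers, client_certificates):
    """Return the mandatory-to-implement items that are not supported.

    sasl_mechanisms and tls_ciphers are collections of names;
    client_certificates says whether TLS client-side certificates
    are supported. All three parameters are hypothetical inputs.
    """
    missing = []
    if REQUIRED_SASL_MECHANISM not in sasl_mechanisms:
        missing.append("authentication: SASL " + REQUIRED_SASL_MECHANISM)
    has_cipher = REQUIRED_TLS_CIPHER in tls_ciphers
    if not has_cipher:
        missing.append("confidentiality: TLS with " + REQUIRED_TLS_CIPHER)
    if not (has_cipher and client_certificates):
        missing.append("both: TLS with " + REQUIRED_TLS_CIPHER
                       + " and client-side certificates")
    return missing


report = missing_requirements({"PLAIN"}, {REQUIRED_TLS_CIPHER}, False)
for item in report:
    print(item)
```

An empty result indicates that the three listed mechanisms are all present; it says nothing about any other requirement in this document.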
| [1] | World Wide Web Consortium, "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C xml, October 2000. |
| [2] | Day, M., Aggarwal, S., Mohr, G. and J. Vincent, "A Model for Presence and Instant Messaging", RFC 2779, February 2000. |
| [3] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
| [4] | University of Southern California, " |