XEP-0313: Message Archive Management

Abstract: This document defines a protocol to query and control an archive of messages stored on a server.
Author: Matthew Wild
Copyright: © 1999 - 2014 XMPP Standards Foundation. SEE LEGAL NOTICES.
Status: Experimental
Type: Standards Track
Version: 0.2
Last Updated: 2013-05-31

WARNING: This Standards-Track document is Experimental. Publication as an XMPP Extension Protocol does not imply approval of this proposal by the XMPP Standards Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems are advised to carefully consider whether it is appropriate to deploy implementations of this protocol before it advances to a status of Draft.


Table of Contents


1. Introduction
2. Requirements
3. Message archives
    3.1. Archiving messages
4. Querying the archive
    4.1. Filtering results
       4.1.1. Filtering by JID
       4.1.2. Filtering by time received
       4.1.3. Limiting results
       4.1.4. Paging through results
    4.2. Query results
5. Archiving Preferences
    5.1. Simple configuration
       5.1.1. Default behaviour
       5.1.2. Always archive
       5.1.3. Never archive
    5.2. Advanced configuration
    5.3. JID matching
       5.3.1. General rules
       5.3.2. Outgoing messages
       5.3.3. Incoming messages
6. Determining support
7. Security Considerations
    7.1. Spoofing of 'archived'
    7.2. Data privacy
8. XMPP Registrar Considerations
9. Acknowledgements

Appendices
    A: Document Information
    B: Author Information
    C: Legal Notices
    D: Relation to XMPP
    E: Discussion Venue
    F: Requirements Conformance
    G: Notes
    H: Revision History


1. Introduction

Users of XMPP for IM commonly want to store their messages in a central archive on their server. This feature allows them to record conversations that take place on clients that do not support local history storage, and also to synchronise their conversation history seamlessly between multiple clients.

2. Requirements

As this extension aims to make things easy for client developers, some research was carried out into the way clients handle history today. The resulting protocol was designed to allow for the following primary usage scenarios:

Another extension for archiving already exists in XMPP, Message Archiving (XEP-0136) [1]. However, implementation experience has shown that the protocol defined therein supports rather more functionality than is typically needed for the above uses, and is significantly more effort to implement.

This specification aims to define a much simpler and more modular protocol for working with a server-side message store. Through this it is hoped to boost implementation and deployment of archiving in XMPP. It should be noted that (although not required) a server is free to implement XEP-0136 alongside this protocol if it so chooses, though a mapping between both protocols is beyond the scope of this specification.

Notable functionality in XEP-0136 that is intentionally not defined by this specification for simplicity:

3. Message archives

An archive is a collection of messages stored on a user's server. Messages sent to or from a user's account are generally automatically added to a user's archive by the server. The collection is ordered chronologically by the time each message was sent/received.

Exactly which messages a server archives is left up to implementation and deployment policy, but as a minimum servers SHOULD NOT archive messages that do not have a <body/> child tag.

A stored message consists of at least the following pieces of information:

A server MAY impose limits on the size of a user's archive. For example, a server might begin to discard old messages once the archive reaches a certain size, or only keep messages until they reach a certain age. The UIDs of deleted messages MUST NOT be reused for new messages.

There is no restriction on where an archive may be hosted. Servers that archive messages on behalf of local users SHOULD expose archives to the user on the user's bare JID, while a MUC service might allow MAM queries to be sent to the room's bare JID.

3.1 Archiving messages

When an incoming message is archived, the server SHOULD add an <archived/> element to the message, which informs the client of where the message is stored. The element MUST contain a 'by' attribute giving the JID of the archive (i.e. where the client would send queries to) and an 'id' attribute giving the message's UID within the archive.

Servers MUST NOT include the <archived/> element in messages addressed to JIDs that do not have permission to access the archive, such as a user's outgoing messages to their contacts.

Example 1. Client receives a message that has been archived

<message to='juliet@capulet.lit/balcony'
  from='romeo@montague.lit/orchard'
  type='chat'>
  <body>Call me but love, and I'll be new baptized; Henceforth I never will be Romeo.</body>
  <archived by='juliet@capulet.lit' id='28482-98726-73623' />
</message>

Naturally a message might be archived in multiple places, and include multiple <archived/> elements with different 'by' attributes. Clients MUST be prepared to handle this situation, and MUST ignore additional elements with 'by' attributes from entities they don't recognise, or that have not been determined to have MAM support (see Determining support). Archiving servers supporting MAM MUST strip any existing <archived/> element with a 'by' attribute equal to an archive that they provide.
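
As an illustration of the client-side handling described above, the following Python sketch keeps only those <archived/> elements whose 'by' attribute names an archive the client already trusts. The function names, the set of trusted archives and the sample stanza (including the second, forged element) are assumptions made for this example. The element is matched by local name because the example stanzas in this section show no namespace on <archived/>.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit('}', 1)[-1]


def trusted_archive_ids(message_xml, trusted_archives):
    """Return {archive JID: UID} for <archived/> children with a trusted 'by'.

    'trusted_archives' is a set of JIDs that the client has already
    determined to support MAM (hypothetical input for this sketch).
    """
    message = ET.fromstring(message_xml.strip())
    ids = {}
    for child in message:
        if local_name(child.tag) != 'archived':
            continue
        by, uid = child.get('by'), child.get('id')
        # Ignore elements from unknown archives or with a missing UID.
        if by in trusted_archives and uid:
            ids[by] = uid
    return ids


# Sample stanza: the second <archived/> element is invented for illustration.
EXAMPLE = """
<message to='juliet@capulet.lit/balcony'
  from='romeo@montague.lit/orchard'
  type='chat'>
  <body>Call me but love</body>
  <archived by='juliet@capulet.lit' id='28482-98726-73623'/>
  <archived by='mallory@evil.example' id='forged-id'/>
</message>
"""

print(trusted_archive_ids(EXAMPLE, {'juliet@capulet.lit'}))
```

Keeping the result keyed by archive JID lets a client track a separate "last seen" UID for each archive it queries.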

4. Querying the archive

A client is able to query the archive for all messages within a certain timespan, optionally restricting results to those to/from a particular JID. To allow limiting the results or paging through them a client may use Result Set Management (XEP-0059) [2], which MUST be supported by servers.

A query consists of an <iq/> stanza addressed to the account or server entity hosting the archive, with a 'query' payload. On receiving the query, the server pushes to the client a series of messages from the archive that match the client's given criteria, and finally returns the <iq/> result.

Example 2. Querying the archive for messages

<iq type='get' id='juliet1'>
  <query xmlns='urn:xmpp:mam:tmp' queryid='f27' />
</iq>

[... server sends matching messages ...]

<iq type='result' id='juliet1'/>
    

To ensure that the client knows when the results are complete, the server MUST delay the result <iq/> until after it has pushed all the results to the client. An optional 'queryid' attribute allows the client to match results to a certain query.

4.1 Filtering results

By default all messages match a query, and filters are used to request a subset of the archived messages. The query can contain any combination of three filtering tags: <with/>, <start/> and <end/>. However, each of these tags MUST NOT be specified more than once in a query.

4.1.1 Filtering by JID

If a <with/> element is present in the <query/>, it contains a JID against which to match messages. The server MUST only return messages if they match the supplied JID.

If <with/> is omitted, the server SHOULD return all messages in the selected timespan, regardless of the to/from addresses on each message.

Example 3. Querying for all messages to/from a particular JID

<iq type='get' id='juliet1'>
  <query xmlns='urn:xmpp:mam:tmp'>
    <with>juliet@capulet.lit</with>
  </query>
</iq>
    

If (and only if) the supplied JID is a bare JID (i.e. no resource is present), then the server SHOULD return messages if their bare to/from address would match it. For example, if the client supplies a 'with' of "juliet@capulet.lit" the query would also match messages to or from "juliet@capulet.lit/balcony" and "juliet@capulet.lit/chamber".
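
The matching rule above might be sketched in Python as follows. This is a simplified illustration, not a complete implementation: the function names are hypothetical, a full JID in <with/> is assumed to require an exact match, and JIDs are compared as plain strings without the normalisation a real server would apply.

```python
def bare_jid(jid):
    """Return the JID without its resource (the part after the first '/')."""
    return jid.split('/', 1)[0]


def matches_with(with_jid, message_to, message_from):
    """Decide whether a stored message matches a <with/> filter value."""
    if '/' in with_jid:
        # Full JID supplied: assumed to match only the exact to/from address.
        return with_jid in (message_to, message_from)
    # Bare JID supplied: compare against the bare to/from addresses.
    return with_jid in (bare_jid(message_to), bare_jid(message_from))


print(matches_with('juliet@capulet.lit',
                   'juliet@capulet.lit/balcony',
                   'romeo@montague.lit/orchard'))
```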

4.1.2 Filtering by time received

The <start/> and <end/> elements, if provided, MUST contain timestamps formatted according to the DateTime profile defined in XMPP Date and Time Profiles (XEP-0082) [3].

The <start/> element is used to filter out messages before a certain date/time. If specified, a server MUST only return messages whose timestamp is equal to or later than the given timestamp.

If omitted, the server SHOULD assume the value of <start/> to be equal to the date/time of the earliest message stored in the archive.

Conversely, the <end/> element is used to exclude from the results messages after a certain point in time. If specified, a server MUST only return messages whose timestamp is equal to or earlier than the timestamp given in the <end/> element.

If omitted, the server SHOULD assume the value of <end/> to be equal to the date/time of the most recent message stored in the archive.
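
Both bounds are inclusive, which the following Python sketch of a server-side check tries to make concrete. The function names are hypothetical, and the parser only accepts the UTC form used in this document's examples ("YYYY-MM-DDTHH:MM:SSZ"); fractional seconds and numeric offsets are deliberately left out of this sketch.

```python
from datetime import datetime, timezone


def parse_stamp(text):
    """Parse a UTC timestamp such as '2010-06-07T00:00:00Z'."""
    parsed = datetime.strptime(text, '%Y-%m-%dT%H:%M:%SZ')
    return parsed.replace(tzinfo=timezone.utc)


def in_timespan(stamp, start=None, end=None):
    """Return True if 'stamp' lies within [start, end], both inclusive.

    A missing bound leaves that side of the range open, which has the same
    effect as defaulting to the earliest or most recent stored message.
    """
    moment = parse_stamp(stamp)
    if start is not None and moment < parse_stamp(start):
        return False
    if end is not None and moment > parse_stamp(end):
        return False
    return True


print(in_timespan('2010-07-07T13:23:54Z',
                  start='2010-06-07T00:00:00Z',
                  end='2010-07-07T13:23:54Z'))
```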

Example 4. Querying the archive for all messages in a certain timespan

<iq type='get' id='juliet1'>
  <query xmlns='urn:xmpp:mam:tmp'>
      <start>2010-06-07T00:00:00Z</start>
      <end>2010-07-07T13:23:54Z</end>
  </query>
</iq>
    

Example 5. Querying the archive for all messages after a certain time

<iq type='get' id='juliet1'>
  <query xmlns='urn:xmpp:mam:tmp'>
    <start>2010-08-07T00:00:00Z</start>
  </query>
</iq>
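
A client might assemble such queries programmatically. The Python sketch below builds the <iq/> with the standard library's ElementTree; the function names and parameters are assumptions made for this example, and serialising the element tree onto an XMPP stream is left to whatever XMPP library is in use.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

MAM_NS = 'urn:xmpp:mam:tmp'


def format_stamp(moment):
    """Format a timezone-aware datetime as a UTC timestamp ending in 'Z'."""
    if moment.tzinfo is None:
        raise ValueError('timezone-aware datetime required')
    return moment.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')


def build_query(iq_id, with_jid=None, start=None, end=None, queryid=None):
    """Build an <iq type='get'/> carrying a MAM <query/> with optional filters."""
    iq = ET.Element('iq', {'type': 'get', 'id': iq_id})
    query = ET.SubElement(iq, '{%s}query' % MAM_NS)
    if queryid is not None:
        query.set('queryid', queryid)
    # Each filter appears at most once, matching the rule in section 4.1.
    for name, value in (('with', with_jid), ('start', start), ('end', end)):
        if value is None:
            continue
        child = ET.SubElement(query, '{%s}%s' % (MAM_NS, name))
        child.text = value if isinstance(value, str) else format_stamp(value)
    return iq


iq = build_query(
    'juliet1',
    start=datetime(2010, 6, 7, tzinfo=timezone.utc),
    end=datetime(2010, 7, 7, 13, 23, 54, tzinfo=timezone.utc),
)
for child in iq[0]:
    print(child.tag, child.text)
```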
    

4.1.3 Limiting results

Finally, in order for the client or server to limit the number of results transmitted at a time, a server MUST support Result Set Management (XEP-0059) [4] and SHOULD support the paging mechanism defined therein. A client MAY include a <set/> element in its query.

For the purposes of this protocol, the UIDs used by RSM correspond with the UIDs of the stanzas stored in the archive.

Example 6. A query using Result Set Management

<iq type='get' id='q29302'>
  <query xmlns='urn:xmpp:mam:tmp'>
      <start>2010-08-07T00:00:00Z</start>
      <set xmlns='http://jabber.org/protocol/rsm'>
         <max>10</max>
      </set>
  </query>
</iq>
    

To conserve resources, a server MAY place a reasonable limit on how many stanzas may be pushed to a client in one request. If a query returns more stanzas than this limit and the client did not specify a limit using RSM, then the server should return a policy-violation error to the client.

Example 7. Server responds to client that requests too many results without RSM

<iq type='error' id='q29302'>
  <error type='modify'>
    <policy-violation xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>Too many results</text>
  </error>
</iq>
    
If the query did include a <set/> element then the server SHOULD simply return its limited results and in its <iq/> result adjust the <start/> and <end/> to reflect the timestamps of the first and last message it is returning to the client. This allows clients to page through results by timestamp.

The result response MUST also include an RSM <set/> element indicating the UID of the first and last message of the (possibly limited) result set. This allows clients to accurately page through messages.

Example 8. Server responds to client with limited results using RSM

<iq type='result' id='q29302'>
  <query xmlns='urn:xmpp:mam:tmp'>
    <start>2010-06-07T00:00:00Z</start>
    <end>2010-07-07T05:03:27Z</end>
    <set xmlns='http://jabber.org/protocol/rsm'>
       <first index='0'>28482-98726-73623</first>
       <last>09af3-cc343-b409f</last>
       <count>20</count>
    </set>
  </query>
</iq>
    

The <first> and <last> elements specify the UID of the first and last returned results (not of the results that matched the query).

The RSM <count> element and the 'index' attribute on the RSM <first> element are optional, but servers SHOULD include them. Please refer to the RSM specification for more information surrounding their meaning and use.

4.1.4 Paging through results

Having previously made a query that returned results limited by the server (as described above), a client can re-send the same request and receive the next 'page' of results. It does this by including a <set> element with its request, containing an <after/> with the UID of the last message it received from the previous query.

Example 9. A page query using Result Set Management

<iq type='get' id='q29303'>
  <query xmlns='urn:xmpp:mam:tmp'>
      <start>2010-08-07T00:00:00Z</start>
      <set xmlns='http://jabber.org/protocol/rsm'>
         <max>10</max>
         <after>09af3-cc343-b409f</after>
      </set>
  </query>
</iq>
    

Note: There is no concept of an "open query", and servers MUST be prepared to receive arbitrary page requests at any time.
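
The paging flow can be illustrated with a small in-memory simulation in Python. Everything here is a toy model built on assumptions: the archive is a list of (UID, body) pairs in chronological order, `query_page` stands in for the server, and no XMPP traffic is involved. A real server would also need to handle an <after/> UID that no longer exists in the archive, which this sketch does not.

```python
def query_page(archive, max_results, after=None):
    """Toy server: return one page plus the UIDs of its first and last items.

    Each call is independent of earlier ones, mirroring the absence of any
    "open query" state on the server.
    """
    start = 0
    if after is not None:
        uids = [uid for uid, _body in archive]
        start = uids.index(after) + 1  # raises ValueError for unknown UIDs
    page = archive[start:start + max_results]
    first = page[0][0] if page else None
    last = page[-1][0] if page else None
    return page, first, last


def fetch_all(archive, page_size):
    """Toy client: repeat the query, passing the previous <last/> as <after/>."""
    collected, after = [], None
    while True:
        page, _first, last = query_page(archive, page_size, after)
        if not page:
            break
        collected.extend(page)
        after = last
    return collected


ARCHIVE = [('a', 'one'), ('b', 'two'), ('c', 'three'), ('d', 'four'), ('e', 'five')]
print(fetch_all(ARCHIVE, 2) == ARCHIVE)
```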

4.2 Query results

The server responds to the archive query by transmitting to the client all the messages that match the criteria the client requested. The results are sent as individual stanzas, with the original message encapsulated in a <forwarded/> element as described in Message Forwarding (XEP-0297) [5].

The result messages MUST contain a <result/> element with an 'id' attribute that gives the current message's archive UID. If the client gave a 'queryid' attribute in its initial query, the server MUST also include that in this result element.

The <result/> element contains a <forwarded/> element which SHOULD contain the original message as it was received, and SHOULD also contain a <delay/> element qualified by the 'urn:xmpp:delay' namespace specified in Delayed Delivery (XEP-0203) [6]. The value of the 'stamp' attribute MUST be the time the message was originally received by the forwarding entity.

Example 10. Server returns two matching messages

<message id='aeb213' to='juliet@capulet.lit/chamber'>
  <result xmlns='urn:xmpp:mam:tmp' queryid='f27' id='28482-98726-73623'>
    <forwarded xmlns='urn:xmpp:forward:0'>
      <delay xmlns='urn:xmpp:delay' stamp='2010-07-10T23:08:25Z'/>
      <message to='juliet@capulet.lit/balcony'
        from='romeo@montague.lit/orchard'
        type='chat'>
        <body>Call me but love, and I'll be new baptized; Henceforth I never will be Romeo.</body>
      </message>
    </forwarded>
  </result>
</message>

<message id='aeb214' to='juliet@capulet.lit/chamber'>
  <result xmlns='urn:xmpp:mam:tmp' queryid='f27' id='5d398-28273-f7382'>
    <forwarded xmlns='urn:xmpp:forward:0'>
      <delay xmlns='urn:xmpp:delay' stamp='2010-07-10T23:09:32Z'/>
      <message to='romeo@montague.lit/orchard'
         from='juliet@capulet.lit/balcony'
         type='chat' id='8a54s'>
        <body>What man art thou that thus bescreen'd in night so stumblest on my counsel?</body>
      </message>
    </forwarded>
  </result>
</message>
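
On the receiving side, a client needs to pull the archive UID, the optional 'queryid', the original timestamp and the forwarded message out of each result. The following Python sketch shows one way to do this with the standard library; the function names and the returned dictionary layout are assumptions for illustration. The inner <message/> and <body/> are located by local name because, parsed outside a stream, they inherit whatever default namespace is in scope.

```python
import xml.etree.ElementTree as ET

NS = {
    'mam': 'urn:xmpp:mam:tmp',
    'fwd': 'urn:xmpp:forward:0',
    'delay': 'urn:xmpp:delay',
}


def child_by_local_name(parent, name):
    """Return the first child whose tag, ignoring namespace, equals 'name'."""
    for child in parent:
        if child.tag.rsplit('}', 1)[-1] == name:
            return child
    return None


def parse_result(message_xml):
    """Extract archive metadata from one result message, or None if absent."""
    outer = ET.fromstring(message_xml.strip())
    result = outer.find('mam:result', NS)
    if result is None:
        return None
    forwarded = result.find('fwd:forwarded', NS)
    if forwarded is None:
        return None
    delay = forwarded.find('delay:delay', NS)
    original = child_by_local_name(forwarded, 'message')
    body = child_by_local_name(original, 'body') if original is not None else None
    return {
        'uid': result.get('id'),
        'queryid': result.get('queryid'),
        'stamp': delay.get('stamp') if delay is not None else None,
        'from': original.get('from') if original is not None else None,
        'to': original.get('to') if original is not None else None,
        'body': body.text if body is not None else None,
    }


EXAMPLE_RESULT = """
<message id='aeb213' to='juliet@capulet.lit/chamber'>
  <result xmlns='urn:xmpp:mam:tmp' queryid='f27' id='28482-98726-73623'>
    <forwarded xmlns='urn:xmpp:forward:0'>
      <delay xmlns='urn:xmpp:delay' stamp='2010-07-10T23:08:25Z'/>
      <message to='juliet@capulet.lit/balcony'
        from='romeo@montague.lit/orchard'
        type='chat'>
        <body>Call me but love, and I'll be new baptized</body>
      </message>
    </forwarded>
  </result>
</message>
"""

print(parse_result(EXAMPLE_RESULT))
```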
    

5. Archiving Preferences

Depending on implementation and deployment policies, a server MAY allow the user to have control over the server's archiving behaviour. This specification defines a basic protocol for this, and also allows a server to offer more advanced configuration to a user.

5.1 Simple configuration

If the server supports and allows configuration then it SHOULD implement the protocol defined in this section. This allows the user to configure the following preferences:

Example 11. Updating archiving preferences

  <iq type='set' id='juliet2'>
    <prefs xmlns='urn:xmpp:mam:tmp' default='roster'>
      <always>
        <jid>romeo@montague.lit</jid>
      </always>
      <never>
        <jid>montague@montague.lit</jid>
      </never>
    </prefs>
  </iq>

The server then replies with the applied preferences (note that due to server policies these MAY be different to the preferences sent by the client):

Example 12. Server responds with updated preferences

  <iq type='result' id='juliet2'>
    <prefs xmlns='urn:xmpp:mam:tmp' default='roster'>
      <always>
        <jid>romeo@montague.lit</jid>
      </always>
      <never>
        <jid>montague@montague.lit</jid>
      </never>
    </prefs>
  </iq>
      

5.1.1 Default behaviour

If a JID is in neither the 'always archive' nor the 'never archive' list then whether it is archived depends on this setting, the default.

The 'default' attribute of the 'prefs' element MUST be one of the following values:

5.1.2 Always archive

The <prefs/> element MAY contain an <always/> child element. If present, it contains a list of <jid/> elements, each containing a single JID. The server SHOULD archive any messages to/from any of these JIDs (see 'JID matching').

If missing from the preferences, <always/> SHOULD be assumed by the server to be an empty list.

5.1.3 Never archive

The <prefs/> element MAY contain a <never/> child element. If present, it contains a list of <jid/> elements, each containing a single JID. The server SHOULD NOT archive any messages to/from any of these JIDs (see 'JID matching').

If missing from the preferences, <never/> SHOULD be assumed by the server to be an empty list.

5.2 Advanced configuration

In addition to this protocol, a server MAY offer more advanced configuration to the user through Ad-Hoc Commands (XEP-0050) [7]. Such an interface might, for example, allow the user to configure what types of messages to store, or set a limit on how long messages should remain in the archive.

If supported, such a configuration command SHOULD be presented on the well-defined command node of "urn:xmpp:mam#configure".

5.3 JID matching

5.3.1 General rules

When comparing the message target JID against the user's roster (i.e. when the user has set default='roster') the comparison MUST use the bare target JID (that is, stripped of any resource).

For matching against entries in either the 'always' or 'never' lists, for each listed JID:

5.3.2 Outgoing messages

For outgoing messages, the server MUST use the value of the 'to' attribute as the target JID.

5.3.3 Incoming messages

For incoming messages, the server MUST use the value of the 'from' attribute as the target JID.
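
The selection of the target JID and the roster comparison can be sketched in Python as follows. The function names, the 'incoming'/'outgoing' direction strings and the representation of the roster as a set of bare JIDs are all assumptions for this example; matching against the <always/> and <never/> lists is not covered here.

```python
def bare_jid(jid):
    """Return the JID without its resource (the part after the first '/')."""
    return jid.split('/', 1)[0]


def target_jid(direction, to_jid, from_jid):
    """Pick the target JID: 'to' for outgoing, 'from' for incoming messages."""
    if direction == 'outgoing':
        return to_jid
    if direction == 'incoming':
        return from_jid
    raise ValueError('direction must be "incoming" or "outgoing"')


def target_in_roster(direction, to_jid, from_jid, roster):
    """Roster check for default='roster', always using the bare target JID."""
    return bare_jid(target_jid(direction, to_jid, from_jid)) in roster


ROSTER = {'romeo@montague.lit'}
print(target_in_roster('incoming',
                       'juliet@capulet.lit/balcony',
                       'romeo@montague.lit/orchard',
                       ROSTER))
```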

6. Determining support

If a server or other entity hosts archives and supports MAM queries, it MUST advertise the 'urn:xmpp:mam:tmp' feature in response to Service Discovery (XEP-0030) [8] requests made to archiving JIDs (i.e. JIDs hosting an archive, such as users' bare JIDs):

Example 13. Client queries for server features

  <iq type='get' id='disco1' to='juliet@capulet.lit' from='juliet@capulet.lit/balcony'>
      <query xmlns='http://jabber.org/protocol/disco#info'/>
  </iq>

Example 14. Server responds with features

  <iq type='result' id='disco1' from='juliet@capulet.lit' to='juliet@capulet.lit/balcony'>
      <query xmlns='http://jabber.org/protocol/disco#info'>
          ...
          <feature var='urn:xmpp:mam:tmp'/>
          ...
      </query>
  </iq>
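
A client could check for support by inspecting the disco#info result. The Python sketch below does this with the standard library; the function name and the sample stanza are assumptions for illustration, and the feature string defaults to the namespace used throughout this document.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = 'http://jabber.org/protocol/disco#info'


def supports_mam(disco_result_xml, feature='urn:xmpp:mam:tmp'):
    """Return True if a disco#info result advertises the given feature."""
    iq = ET.fromstring(disco_result_xml.strip())
    if iq.get('type') != 'result':
        return False
    query = iq.find('{%s}query' % DISCO_INFO_NS)
    if query is None:
        return False
    features = query.findall('{%s}feature' % DISCO_INFO_NS)
    return any(item.get('var') == feature for item in features)


DISCO_RESULT = """
<iq type='result' id='disco1' from='juliet@capulet.lit' to='juliet@capulet.lit/balcony'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='urn:xmpp:mam:tmp'/>
  </query>
</iq>
"""

print(supports_mam(DISCO_RESULT))
```

A client would typically run this check against each archiving JID before trusting <archived/> elements that name it.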

7. Security Considerations

7.1 Spoofing of 'archived'

Clients and servers may receive messages containing <archived/> elements that have not been verified. If proper handling of received <archived/> elements is not followed, an attacker could disrupt a client's cache of archived message UIDs, and prevent the client from fetching future messages correctly (by using an 'id' that doesn't exist in the archive).

7.2 Data privacy

An archive generally consists of private conversations, and so a server MUST adequately protect an archive from unauthorized third-party access. For example, authorized parties for a user's archive would include just the user, and a MUC archive for a private room might be restricted to room members. An implementation MAY choose to allow access to any archive by server administrators.

A server SHOULD provide a mechanism for a user to disable archiving of messages with all or specific contacts, such as via the configuration protocol described in this document. This allows the user to prevent the archiving of potentially sensitive messages in the first place.

A server MAY automatically prevent certain sensitive messages from being archived. How such messages are identified is beyond the scope of this specification, but technologies such as Security Labels in XMPP (XEP-0258) [9] may be used, for example.

8. XMPP Registrar Considerations

9. Acknowledgements

Many thanks to Kevin Smith, Dave Cridland, Kim Alvefur, Yann Leboulanger and Lance Stout for their input and feedback on this specification.


Appendices


Appendix A: Document Information

Series: XEP
Number: 0313
Publisher: XMPP Standards Foundation
Status: Experimental
Type: Standards Track
Version: 0.2
Last Updated: 2013-05-31
Approving Body: XMPP Council
Dependencies: XMPP Core, XEP-0030, XEP-0059, XEP-0297
Supersedes: None
Superseded By: None
Short Name: mam
Schema: <http://www.xmpp.org/schemas/archive-management.xsd>
Source Control: HTML


Appendix B: Author Information

Matthew Wild

Email: me@matthewwild.co.uk
JabberID: me@matthewwild.co.uk


Appendix C: Legal Notices

Copyright

This XMPP Extension Protocol is copyright © 1999 - 2014 by the XMPP Standards Foundation (XSF).

Permissions

Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.

Disclaimer of Warranty

## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##

Limitation of Liability

In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.

IPR Conformance

This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <http://xmpp.org/about-xmpp/xsf/xsf-ipr-policy/> or obtained by writing to XMPP Standards Foundation, 1899 Wynkoop Street, Suite 600, Denver, CO 80202 USA).

Appendix D: Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.


Appendix E: Discussion Venue

The primary venue for discussion of XMPP Extension Protocols is the <standards@xmpp.org> discussion list.

Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.

Errata can be sent to <editor@xmpp.org>.


Appendix F: Requirements Conformance

The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".


Appendix G: Notes

1. XEP-0136: Message Archiving <http://xmpp.org/extensions/xep-0136.html>.

2. XEP-0059: Result Set Management <http://xmpp.org/extensions/xep-0059.html>.

3. XEP-0082: XMPP Date and Time Profiles <http://xmpp.org/extensions/xep-0082.html>.

4. XEP-0059: Result Set Management <http://xmpp.org/extensions/xep-0059.html>.

5. XEP-0297: Message Forwarding <http://xmpp.org/extensions/xep-0297.html>.

6. XEP-0203: Delayed Delivery <http://xmpp.org/extensions/xep-0203.html>.

7. XEP-0050: Ad-Hoc Commands <http://xmpp.org/extensions/xep-0050.html>.

8. XEP-0030: Service Discovery <http://xmpp.org/extensions/xep-0030.html>.

9. XEP-0258: Security Labels in XMPP <http://xmpp.org/extensions/xep-0258.html>.


Appendix H: Revision History

Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/

Version 0.2 (2013-05-31)

Document the ability to page through results by message UIDs, define the <archived/> element, and various minor improvements.

(mw)

Version 0.1 (2012-04-18)

Initial version, to much rejoicing.

(mw)
