Clarify terminology around aggregations

I've done my best to remove the word "bundle", because I feel like it causes
more confusion than it provides. Instead I have favoured "aggregated child
events" which I think is clearer.

Some general clarification around these parts of the spec.
This commit is contained in:
Richard van der Hoff 2023-01-31 19:53:24 +00:00
parent d55af02738
commit 3d3f28c848
7 changed files with 96 additions and 79 deletions

View file

@ -0,0 +1 @@
Clarify the sections of the specification concerning aggregation of child events.

View file

@ -1979,7 +1979,7 @@ This specification describes the following relationship types:
* [Threads](#threading). * [Threads](#threading).
* [References](#reference-relations) * [References](#reference-relations)
#### Aggregations #### Aggregations of child events
{{% added-in v="1.3" %}} {{% added-in v="1.3" %}}
@ -1993,14 +1993,12 @@ of times that `key` was used by child events.
The actual aggregation format depends on the `rel_type`. The actual aggregation format depends on the `rel_type`.
Aggregations are sometimes automatically included by a server alongside the parent When an event is served to the client through the APIs listed below, a
event. This is known as a "bundled aggregation" or "bundle" for simplicity. The `m.relations` property is included under `unsigned` if the event has child
act of doing this is "bundling". events which can be aggregated and point at it. The `m.relations` property is
an object keyed by `rel_type` and value being the type-specific aggregated
When an event is served to the client through the APIs listed below, a `m.relations` property format for that `rel_type`. This `m.relations` property is known as a "bundled
is included under `unsigned` if the event has child events which can be aggregated and point aggregation".
at it. The `m.relations` property is an object keyed by `rel_type` and value being the type-specific
aggregated format for that `rel_type`, also known as the bundle.
For example (unimportant fields not included): For example (unimportant fields not included):
@ -2036,10 +2034,10 @@ For example (unimportant fields not included):
} }
``` ```
Note how the `org.example.possible_annotations` bundle is an array compared to the Note how the `org.example.possible_annotations` aggregation is an array, while in the
`org.example.possible_thread` bundle where the server is summarising the state of `org.example.possible_thread` aggregation where the server is summarising the state of
the relationship in a single object. Both are valid ways to aggregate, and their the relationship in a single object. Both are valid ways to aggregate: the format of an
exact types depend on the `rel_type`. aggregation depends on the `rel_type`.
{{% boxes/warning %}} {{% boxes/warning %}}
State events do not currently receive bundled aggregations. This is not State events do not currently receive bundled aggregations. This is not
@ -2066,11 +2064,11 @@ such as `/initialSync`.
While this functionality allows the client to see what was known to the server at the While this functionality allows the client to see what was known to the server at the
time of handling, the client should continue to aggregate locally if it is aware of time of handling, the client should continue to aggregate locally if it is aware of
the relationship type's behaviour. For example, a client might increment a `count` the relationship type's behaviour. For example, a client might internally increment a `count`
on a parent event's bundle if it saw a new child event which referenced that parent. in a parent event's aggregation data if it saw a new child event which referenced that parent.
The bundle provided by the server only includes child events which were known at the The aggregation provided by the server only includes child events which were known at the
time the client would receive the bundle. For example, in a single `/sync` response time the client would receive the aggregation. For example, in a single `/sync` response
with the parent and multiple child events the child events would have already been with the parent and multiple child events the child events would have already been
included on the parent's `m.relations` field. Events received in future syncs would included on the parent's `m.relations` field. Events received in future syncs would
need to be aggregated manually by the client. need to be aggregated manually by the client.
@ -2080,7 +2078,7 @@ Events from [ignored users](#ignoring-users) do not appear in the aggregation
from the server, however clients might still have events from ignored users cached. Like from the server, however clients might still have events from ignored users cached. Like
with normal events, clients will need to de-aggregate child events sent by ignored users to with normal events, clients will need to de-aggregate child events sent by ignored users to
avoid them being considered in counts. Servers must additionally ensure they do not avoid them being considered in counts. Servers must additionally ensure they do not
consider child events from ignored users when preparing a bundle for the client. consider child events from ignored users when preparing an aggregation for the client.
{{% /boxes/note %}} {{% /boxes/note %}}
When a parent event is redacted, the child events which pointed to that parent remain, however When a parent event is redacted, the child events which pointed to that parent remain, however
@ -2089,7 +2087,7 @@ to de-aggregate or disassociate the event once the relationship is lost. Clients
aggregation or which handle redactions locally should do the same. aggregation or which handle redactions locally should do the same.
It is suggested that clients perform local echo on aggregations — for instance, aggregating It is suggested that clients perform local echo on aggregations — for instance, aggregating
a new child event into a bundle optimistically until the server returns a failure or the client a new child event into a parent event optimistically until the server returns a failure or the client
gives up on sending the event, at which point the event should be de-aggregated and an gives up on sending the event, at which point the event should be de-aggregated and an
error or similar shown. The client should be cautious to not aggregate an event twice if error or similar shown. The client should be cautious to not aggregate an event twice if
it has already optimistically aggregated the event. Clients are encouraged to take this it has already optimistically aggregated the event. Clients are encouraged to take this
@ -2098,7 +2096,7 @@ likely using the transaction ID as a temporary event ID until a proper event ID
{{% boxes/warning %}} {{% boxes/warning %}}
Due to history visibility restrictions, child events might not be visible to the user Due to history visibility restrictions, child events might not be visible to the user
if they are in a section of history the user cannot see. This means any bundles which would if they are in a section of history the user cannot see. This means any aggregations which would
normally include those events will be lacking them and the client will not be able to normally include those events will be lacking them and the client will not be able to
locally aggregate the events either — relating events of importance (such as votes) should locally aggregate the events either — relating events of importance (such as votes) should
take into consideration history visibility. take into consideration history visibility.

View file

@ -197,7 +197,7 @@ replacement event.
Note that there can be multiple events with an `m.replace` relationship to a Note that there can be multiple events with an `m.replace` relationship to a
given event (for example, if an event is edited multiple times). These should given event (for example, if an event is edited multiple times). These should
be [aggregated](#aggregations) by the homeserver. be [aggregated](#aggregations-of-child-events) by the homeserver.
The aggregation format of `m.replace` relationships gives the `event_id`, The aggregation format of `m.replace` relationships gives the `event_id`,
`origin_server_ts`, and `sender` of the **most recent** replacement event. The `origin_server_ts`, and `sender` of the **most recent** replacement event. The
@ -205,8 +205,9 @@ most recent event is determined by comparing `origin_server_ts`; if two or more
replacement events have identical `origin_server_ts`, the event with the replacement events have identical `origin_server_ts`, the event with the
lexicographically largest `event_id` is treated as more recent. lexicographically largest `event_id` is treated as more recent.
This aggregation is bundled under the `unsigned` property as `m.relations` for any As with any other aggregation of child events, the `m.replace` aggregation is
event that is the target of an `m.replace` relationship. For example: included under the `m.relations` property in `unsigned` for any event that is
the target of an `m.replace` relationship. For example:
```json ```json
{ {
@ -224,22 +225,22 @@ event that is the target of an `m.replace` relationship. For example:
} }
``` ```
If the original event is However, if the original event is [redacted](#redactions), any replacement
[redacted](#redactions), any events are *not* aggregated and `m.replace` is omitted from the aggregation
`m.replace` relationship should **not** be bundled with it (whether or not any returned under `m.relations` (whether or not any subsequent replacements are
subsequent replacements are themselves redacted). Note that this behaviour is themselves redacted). Note that this behaviour is specific to the `m.replace`
specific to the `m.replace` relationship. See also [redactions of edited relationship. See also [redactions of edited
events](#redactions-of-edited-events) below. events](#redactions-of-edited-events) below.
##### Server-side replacement of content ##### Server-side replacement of content
Whenever an `m.replace` is to be bundled with an event as above, the server Whenever an `m.replace` is to be bundled with its parent event as above, the server
should also modify the content of the original event according to the should also modify the content of the original event according to the
`m.new_content` of the most recent replacement event (determined as above). `m.new_content` of the most recent replacement event (determined as above).
An exception applies to [`GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}`](#get_matrixclientv3roomsroomideventeventid), An exception applies to [`GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}`](#get_matrixclientv3roomsroomideventeventid),
which should return the unmodified event (though the relationship should still which should return the unmodified event (though the replacement event should still
be bundled, as described above). be included under `m.relations`, as described above).
#### Client behaviour #### Client behaviour
@ -280,8 +281,9 @@ subsequent edits, from the visible timeline. In this situation, homeservers
will return an empty `content` for the original event as with any other will return an empty `content` for the original event as with any other
redacted event, and as redacted event, and as
[above](#server-side-aggregation-of-mreplace-relationships) the replacement [above](#server-side-aggregation-of-mreplace-relationships) the replacement
events will not be bundled with the original event. Note that the subsequent edits are events will not be included in the aggregation bundled with the original
not actually redacted themselves: they simply serve no purpose within the visible timeline. event. Note that the subsequent edits are not actually redacted themselves:
they simply serve no purpose within the visible timeline.
#### Edits of replies #### Edits of replies

View file

@ -19,11 +19,12 @@ messages.
##### Server-side aggregation of `m.reference` ##### Server-side aggregation of `m.reference`
The aggregation format of `m.reference` relations consists of a single `chunk` property, The [aggregation](#aggregations-of-child-events) format of `m.reference`
which lists all the events which `m.reference` the event (the parent). Currently, relations consists of a single `chunk` property, which lists all the events
only a single `event_id` field is present on the events in the `chunk`. which `m.reference` the event (the parent). Currently, only a single `event_id`
field is present on the events in the `chunk`.
An example `m.reference` would be: For example, given an event with the following `m.reference` relationship:
```json ```json
{ {
@ -38,7 +39,7 @@ An example `m.reference` would be:
} }
``` ```
The [bundle](#aggregations) under `m.relations` would appear similar to the following: The aggregation would appear similar to the following:
```json ```json
{ {

View file

@ -11,12 +11,14 @@ Clients SHOULD render threads differently to regular messages or replies in the
as by providing some context to what is going on in the thread but keeping the full conversation as by providing some context to what is going on in the thread but keeping the full conversation
history behind a disclosure. history behind a disclosure.
Threads are established using a `rel_type` of `m.thread` and reference the *thread root* (the Threads are established using a `rel_type` of `m.thread` and reference the
first event in a thread). It is not possible to create a thread from an event with a `rel_type`, *thread root* (the first event in a thread). It is not possible to create a
which includes not being able to nest threads. All conversation in a thread reference the thread thread from an event which itself is the child of an event relationship (i.e.,
root instead of the most recent message, unlike rich reply chains. one with an `m.relates_to` property). It is therefore also not possible to nest
threads. All events in a thread reference the thread root instead of the
most recent message, unlike rich reply chains.
As a worked example, the following represents a thread and how it'd be formed: As a worked example, the following represents a thread and how it would be formed:
```json ```json
{ {
@ -128,11 +130,11 @@ clients is used to create a reply within a thread: clients should render the eve
##### Validation of `m.thread` relationships ##### Validation of `m.thread` relationships
Servers SHOULD reject client requests which attempt to start a thread off an event with a Servers SHOULD reject client requests which attempt to start a thread off an
`rel_type`. If the client attempts to target an event which already has an `m.thread`, event with an `m.relates_to` property. If the client attempts to target an event which itself
`m.reference`, or any other `rel_type` then it should receive a HTTP 400 error response has an `m.relates_to` property, then it should receive a HTTP 400 error
with appropriate error message, as per the [standard error response](#standard-error-response) response with appropriate error message, as per the [standard error
structure. response](#standard-error-response) structure.
{{% boxes/note %}} {{% boxes/note %}}
A specific error code is not currently available for this case: servers should use `M_UNKNOWN` A specific error code is not currently available for this case: servers should use `M_UNKNOWN`
@ -141,12 +143,16 @@ alongside the HTTP 400 status code.
##### Server-side aggregation of `m.thread` relationships ##### Server-side aggregation of `m.thread` relationships
Given threads always reference the thread root, an event can have multiple "child" events which Given threads always reference the thread root, an event can have multiple
then form the thread itself. These events should be [aggregated](#aggregations) by the server. "child" events which then form the thread itself. These events should be
[aggregated](#aggregations-of-child-events) by the server.
The aggregation for threads includes some information about the user's participation in the thread, The aggregation for threads includes some information about the user's participation in the thread,
the approximate number of events in the thread (as known to the server), and the most recent event the approximate number of events in the thread (as known to the server), and the most recent event
in the thread (topologically). This is then bundled into the event as `m.thread`: in the thread (topologically).
As with any other aggregation of child events, the `m.thread` aggregation is
included under the `m.relations` property in `unsigned` for the thread root. For example:
```json ```json
{ {
@ -165,6 +171,11 @@ in the thread (topologically). This is then bundled into the event as `m.thread`
"content": { "content": {
"msgtype": "m.text", "msgtype": "m.text",
"body": "Woo! Threads!" "body": "Woo! Threads!"
},
"unsigned": {
"m.relations": {
// ...
}
} }
}, },
"count": 7, "count": 7,
@ -178,15 +189,17 @@ in the thread (topologically). This is then bundled into the event as `m.thread`
`latest_event` is the most recent event (topologically to the server) in the thread sent by an `latest_event` is the most recent event (topologically to the server) in the thread sent by an
un-[ignored user](#ignoring-users). un-[ignored user](#ignoring-users).
Note that any bundled aggregations on `latest_event` should also be present. The server should be Note that, as in the example above, child events of the `latest_event` should
careful to avoid loops, though loops are not currently possible due to `m.thread` not being possible themselves be aggregated and included under `m.relations` for that event. The
to target an event with a `rel_type` already. server should be careful to avoid loops, though loops are not currently
possible due to `m.thread` not being permitted to target an event with an
`m.relates_to` property.
`count` is simply the number of events using `m.thread` as a `rel_type` pointing to the target event. `count` is simply the number of events using `m.thread` as a `rel_type` pointing to the target event.
It does not include events sent by [ignored users](#ignoring-users). It does not include events sent by [ignored users](#ignoring-users).
`current_user_participated` is `true` when the authenticated user is either: `current_user_participated` is `true` when the authenticated user is either:
1. The `sender` of the event receiving the bundle (they sent the thread root). 1. The `sender` of the thread root event.
2. The `sender` of an event which references the thread root with a `rel_type` of `m.thread`. 2. The `sender` of an event which references the thread root with a `rel_type` of `m.thread`.
#### Querying threads in a room #### Querying threads in a room

View file

@ -32,8 +32,7 @@ paths:
x-addedInMatrixVersion: "1.4" x-addedInMatrixVersion: "1.4"
summary: Retrieve a list of threads in a room, with optional filters. summary: Retrieve a list of threads in a room, with optional filters.
description: |- description: |-
Paginates over the thread roots in a room, ordered by the `latest_event` of each thread root Paginates over the thread roots in a room.
in its bundle.
operationId: getThreadRoots operationId: getThreadRoots
security: security:
- accessToken: [] - accessToken: []
@ -85,8 +84,8 @@ paths:
chunk: chunk:
type: array type: array
description: |- description: |-
The thread roots, ordered by the `latest_event` in each event's aggregation bundle. All events The thread roots, ordered by the `latest_event` in each event's aggregated children. All events
returned include bundled [aggregations](/client-server-api/#aggregations). returned include bundled [aggregations](/client-server-api/#aggregations-of-child-events).
If the thread root event was sent by an [ignored user](/client-server-api/#ignoring-users), the If the thread root event was sent by an [ignored user](/client-server-api/#ignoring-users), the
event is returned redacted to the caller. This is to simulate the same behaviour of a client doing event is returned redacted to the caller. This is to simulate the same behaviour of a client doing

View file

@ -44,33 +44,36 @@ Stylistic notes
General General
~~~~~~~ ~~~~~~~
Try to write clearly and unambiguously. Remember that many readers will not * Try to write clearly and unambiguously. Remember that many readers will not
have English as their first language. have English as their first language.
Prefer British English (colour, -ise) to American English. * Prefer British English (colour, -ise) to American English.
The word "homeserver" is spelt thus (rather than "home server", "Homeserver", * The word "homeserver" is spelt thus (rather than "home server", "Homeserver",
or (argh) "Home Server"). However, an identity server is two words. or (argh) "Home Server"). However, an identity server is two words.
An "identity server" (spelt thus) implements the Identity Service API (also spelt * An "identity server" (spelt thus) implements the Identity Service API (also spelt
thus). However, "Application Services" (spelt thus) implement the Application Service thus). However, "Application Services" (spelt thus) implement the Application Service
API. Application Services should not be called "appservices" in documentation. API. Application Services should not be called "appservices" in documentation.
.. Rationale: "homeserver" distinguishes from a "home server" which is a server .. Rationale: "homeserver" distinguishes from a "home server" which is a server
you have at home. "identity server" is clear, whereas "identityserver" is you have at home. "identity server" is clear, whereas "identityserver" is
horrible. horrible.
Lists should: * Lists should:
* Be introduced with a colon. * Be introduced with a colon.
* Be used where they provide clarity. * Be used where they provide clarity.
* Contain entries which start with a capital and end with a full stop. * Contain entries which start with a capital and end with a full stop.
When talking about properties in JSON objects, prefer the word "property" to "field", * When talking about properties in JSON objects, prefer the word "property" to "field",
"member", or various other alternatives. For example: "this property will be set to "member", or various other alternatives. For example: "this property will be set to
X if ...". Also avoid the term "key" unless you are specifically talking about the X if ...". Also avoid the term "key" unless you are specifically talking about the
*name* of a property - and be mindful of the scope for confusion with cryptographic *name* of a property - and be mindful of the scope for confusion with cryptographic
keys. keys.
* "i.e." (*id est*) is an abbreviation and hence is written with two full
stops. So is "e.g." (*exempli gratia*).
Changes between spec versions Changes between spec versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~