mirror of
https://github.com/matrix-org/matrix-spec
synced 2026-05-16 23:10:43 +02:00
Improve "Voice over IP" module of CS API (#2374)
* Convert m.call.* schemas syntax to YAML

  For consistency.

* Clarify user ID format in m.call.invite
* Add links to definitions
* Improve call schemas

  To look more consistent with other schemas.

* Clarify URI format in GET /voip/turnServer
* Add changelog
* Fix regex

---------

Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
parent 8bedf3882c
commit ba960f8d32
|
|
@ -0,0 +1 @@
|
||||||
|
Clarify formats of string types.
|
||||||
|
|
@ -1,20 +1,20 @@
|
||||||
|
|
||||||
### Voice over IP
|
### Voice over IP
|
||||||
|
|
||||||
This module outlines how two users in a room can set up a Voice over IP
|
This module outlines how two users in a room can set up a Voice over IP (VoIP)
|
||||||
(VoIP) call to each other. Voice and video calls are built upon the
|
call to each other. Voice and video calls are built upon the [WebRTC 1.0
|
||||||
WebRTC 1.0 standard. Call signalling is achieved by sending [message
|
standard](https://www.w3.org/TR/webrtc/). Call signalling is achieved by sending
|
||||||
events](#events) to the room. In this version of the spec, only two-party
|
[message events](#events) to the room. In this version of the spec, only
|
||||||
communication is supported (e.g. between two peers, or between a peer
|
two-party communication is supported (e.g. between two peers, or between a peer
|
||||||
and a multi-point conferencing unit). Calls can take place in rooms with
|
and a multi-point conferencing unit). Calls can take place in rooms with
|
||||||
multiple members, but only two devices can take part in the call.
|
multiple members, but only two devices can take part in the call.
|
||||||
|
|
||||||
All VoIP events have a `version` field. This is used to determine whether
|
All VoIP events have a `version` field. This is used to determine whether
|
||||||
devices support this new version of the protocol. For example, clients can use
|
devices support this new version of the protocol. For example, clients can use
|
||||||
this field to know whether to expect an `m.call.select_answer` event from their
|
this field to know whether to expect an [`m.call.select_answer`](#mcallselect_answer)
|
||||||
opponent. If clients see events with `version` other than `0` or `"1"`
|
event from their opponent. If clients see events with `version` other than `0`
|
||||||
(including, for example, the numeric value `1`), they should treat these the
|
or `"1"` (including, for example, the numeric value `1`), they should treat
|
||||||
same as if they had `version` == `"1"`.
|
these the same as if they had `version` == `"1"`.
|
||||||
|
|
||||||
Note that this implies any and all future versions of VoIP events should be
|
Note that this implies any and all future versions of VoIP events should be
|
||||||
backwards-compatible. If it does become necessary to introduce a non
|
backwards-compatible. If it does become necessary to introduce a non
|
||||||
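The `version` rule in the hunk above (anything other than `0` or `"1"` is treated as `"1"`) can be sketched as a small normalisation step. This is only an illustration: the function name `normalize_voip_version` is hypothetical, and the passage does not say how to handle a missing field, so that case is left out.

```python
def normalize_voip_version(version):
    """Return the protocol version a client should apply for a VoIP event.

    Hypothetical helper: 0 (the legacy integer) stays 0; every other value,
    including the string "1", the number 1 and any unknown future value,
    is handled as "1".
    """
    # bool is excluded explicitly because Python treats False == 0.
    if isinstance(version, int) and not isinstance(version, bool) and version == 0:
        return 0
    return "1"


if __name__ == "__main__":
    for value in (0, "1", 1, "2"):
        print(repr(value), "->", repr(normalize_voip_version(value)))
```

Normalising once, at the point where the event is received, keeps later checks (for example, whether to expect `m.call.select_answer`) down to a comparison against two known values.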
|
|
@ -29,10 +29,10 @@ lowercase alphanumeric characters is recommended. Parties in the call are identi
|
||||||
`(user_id, party_id)`.
|
`(user_id, party_id)`.
|
||||||
|
|
||||||
The client adds a `party_id` field containing this ID to the top-level of the content of all VoIP events
|
The client adds a `party_id` field containing this ID to the top-level of the content of all VoIP events
|
||||||
it sends on the call, including `m.call.invite`. Clients use this to identify remote echo of their own
|
it sends on the call, including [`m.call.invite`](#mcallinvite). Clients use this to identify remote echo
|
||||||
events: since a user may call themselves, they cannot simply ignore events from their own user. This
|
of their own events: since a user may call themselves, they cannot simply ignore events from their own
|
||||||
field also identifies different answers sent by different clients to an invite, and matches `m.call.candidates`
|
user. This field also identifies different answers sent by different clients to an invite, and matches
|
||||||
events to their respective answer/invite.
|
[`m.call.candidates`](#mcallcandidates) events to their respective answer/invite.
|
||||||
|
|
||||||
A client implementation may choose to use the device ID used in end-to-end cryptography for this purpose,
|
A client implementation may choose to use the device ID used in end-to-end cryptography for this purpose,
|
||||||
or it may choose, for example, to use a different one for each call to avoid leaking information on which
|
or it may choose, for example, to use a different one for each call to avoid leaking information on which
|
||||||
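Because a party is identified by the pair `(user_id, party_id)`, detecting the remote echo of a client's own events means comparing both values, not just the sender. A minimal sketch, assuming the event is available as a dictionary with the usual `sender` and `content` keys; the helper name `is_remote_echo` and the example IDs are hypothetical.

```python
def is_remote_echo(event, own_user_id, own_party_id):
    """Return True if a VoIP event was sent by this very client.

    Comparing only the sender is not enough: a user may call themselves
    from another device, so the party_id has to match as well.
    """
    content = event.get("content", {})
    return (
        event.get("sender") == own_user_id
        and content.get("party_id") == own_party_id
    )


if __name__ == "__main__":
    own = {"sender": "@alice:example.org", "content": {"party_id": "abc123"}}
    other_device = {"sender": "@alice:example.org", "content": {"party_id": "xyz789"}}
    print(is_remote_echo(own, "@alice:example.org", "abc123"))
    print(is_remote_echo(other_device, "@alice:example.org", "abc123"))
```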
|
|
@ -44,15 +44,16 @@ A grammar for `party_id` is defined [below](#grammar-for-voip-ids).
|
||||||
#### Politeness
|
#### Politeness
|
||||||
In line with [WebRTC perfect negotiation](https://w3c.github.io/webrtc-pc/#perfect-negotiation-example)
|
In line with [WebRTC perfect negotiation](https://w3c.github.io/webrtc-pc/#perfect-negotiation-example)
|
||||||
there are rules to establish which party is polite in the process of renegotiation. The callee is
|
there are rules to establish which party is polite in the process of renegotiation. The callee is
|
||||||
always the polite party. In a glare situation, the politeness of a party is therefore determined by
|
always the polite party. In a [glare](#glare) situation, the politeness of a party is therefore
|
||||||
whether the inbound or outbound call is used: if a client discards its outbound call in favour of
|
determined by whether the inbound or outbound call is used: if a client discards its outbound call
|
||||||
an inbound call, it becomes the polite party.
|
in favour of an inbound call, it becomes the polite party.
|
||||||
|
|
||||||
#### Call Event Liveness
|
#### Call Event Liveness
|
||||||
`m.call.invite` contains a `lifetime` field that indicates how long the offer is valid for. When
|
[`m.call.invite`](#mcallinvite) contains a `lifetime` field that indicates how long the offer is
|
||||||
a client receives an invite, it should use the event's `age` field in the sync response plus the
|
valid for. When a client receives an invite, it should use the event's `age` field in the
|
||||||
time since it received the event from the homeserver to determine whether the invite is still valid.
|
[`GET /sync`](#get_matrixclientv3sync) response plus the time since it received the event from the
|
||||||
The use of the `age` field ensures that incorrect clocks on client devices don't break calls.
|
homeserver to determine whether the invite is still valid. The use of the `age` field ensures that
|
||||||
|
incorrect clocks on client devices don't break calls.
|
||||||
|
|
||||||
If the invite is still valid *and will remain valid for long enough for the user to accept the call*,
|
If the invite is still valid *and will remain valid for long enough for the user to accept the call*,
|
||||||
it should signal an incoming call. The amount of time allowed for the user to accept the call may
|
it should signal an incoming call. The amount of time allowed for the user to accept the call may
|
||||||
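The liveness check described in this hunk can be sketched as follows: the elapsed time is the event's `age` plus the time since the client received it, compared against `lifetime`. The function names and the `min_ring_ms` threshold are assumptions made for illustration; the spec text shown here does not fix a value for how long the user needs to accept. A monotonic clock is used for the local part so that wall-clock adjustments on the device do not affect the result.

```python
import time


def invite_remaining_ms(lifetime_ms, age_ms, received_at, now=None):
    """Milliseconds of validity left on an invite (negative if expired).

    received_at and now are time.monotonic() readings in seconds;
    age_ms is the event's `age` as reported by the homeserver.
    """
    now = time.monotonic() if now is None else now
    elapsed_ms = age_ms + (now - received_at) * 1000
    return lifetime_ms - elapsed_ms


def should_ring(lifetime_ms, age_ms, received_at, now=None, min_ring_ms=0):
    """Ring only if the invite is valid and leaves at least min_ring_ms to answer.

    min_ring_ms is a hypothetical, client-chosen threshold.
    """
    remaining = invite_remaining_ms(lifetime_ms, age_ms, received_at, now)
    return remaining > 0 and remaining >= min_ring_ms


if __name__ == "__main__":
    received = time.monotonic()
    print(should_ring(60_000, 5_000, received, min_ring_ms=10_000))
```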
|
|
@ -83,7 +84,7 @@ Clients should aim to send a small number of candidate events, with guidelines:
|
||||||
|
|
||||||
#### End-of-candidates
|
#### End-of-candidates
|
||||||
An ICE candidate whose value is the empty string means that no more ICE candidates will
|
An ICE candidate whose value is the empty string means that no more ICE candidates will
|
||||||
be sent. Clients must send such a candidate in an `m.call.candidates` message.
|
be sent. Clients must send such a candidate in an [`m.call.candidates`](#mcallcandidates) message.
|
||||||
The WebRTC spec requires browsers to generate such a candidate, however note that at time of writing,
|
The WebRTC spec requires browsers to generate such a candidate, however note that at time of writing,
|
||||||
not all browsers do (Chrome does not, but does generate an `icegatheringstatechange` event). The
|
not all browsers do (Chrome does not, but does generate an `icegatheringstatechange` event). The
|
||||||
client should send any remaining candidates once candidate generation finishes, ignoring timeouts above.
|
client should send any remaining candidates once candidate generation finishes, ignoring timeouts above.
|
||||||
|
|
@ -130,36 +131,48 @@ or not there have been any changes to the Matrix spec.
|
||||||
A call is set up with message events exchanged as follows:
|
A call is set up with message events exchanged as follows:
|
||||||
|
|
||||||
```nohighlight
|
```nohighlight
|
||||||
Caller Callee
|
+---------+ +---------+
|
||||||
[Place Call]
|
| Caller | | Callee |
|
||||||
m.call.invite ----------->
|
+---------+ +---------+
|
||||||
m.call.candidate -------->
|
| |
|
||||||
[..candidates..] -------->
|
(Places Call) |
|
||||||
[Answers call]
|
|------- m.call.invite ------->|
|
||||||
<--------------- m.call.answer
|
|----- m.call.candidate ------>|
|
||||||
m.call.select_answer ----------->
|
|----- [..candidates..] ------>|
|
||||||
[Call is active and ongoing]
|
| |
|
||||||
<--------------- m.call.hangup
|
| (Answers call)
|
||||||
|
|<------ m.call.answer --------|
|
||||||
|
|--- m.call.select_answer --->|
|
||||||
|
. .
|
||||||
|
. (Call is active and ongoing) .
|
||||||
|
. .
|
||||||
|
| (Ends call)
|
||||||
|
|<------ m.call.hangup --------|
|
||||||
```
|
```
|
||||||
|
|
||||||
Or a rejected call:
|
Or a rejected call:
|
||||||
|
|
||||||
```nohighlight
|
```nohighlight
|
||||||
Caller Callee
|
+---------+ +---------+
|
||||||
m.call.invite ------------>
|
| Caller | | Callee |
|
||||||
m.call.candidate --------->
|
+---------+ +---------+
|
||||||
[..candidates..] --------->
|
| |
|
||||||
[Rejects call]
|
(Places Call) |
|
||||||
<-------------- m.call.hangup
|
|------- m.call.invite ------->|
|
||||||
|
|----- m.call.candidate ------>|
|
||||||
|
|----- [..candidates..] ------>|
|
||||||
|
| |
|
||||||
|
| (Rejects call)
|
||||||
|
|<------ m.call.reject --------|
|
||||||
```
|
```
|
||||||
|
|
||||||
Calls are negotiated according to the WebRTC specification.
|
Calls are negotiated according to the WebRTC specification.
|
||||||
|
|
||||||
In response to an incoming invite, a client may do one of several things:
|
In response to an incoming invite, a client may do one of several things:
|
||||||
* Attempt to accept the call by sending an `m.call.answer`.
|
* Attempt to accept the call by sending an [`m.call.answer`](#mcallanswer).
|
||||||
* Actively reject the call everywhere: send an `m.call.reject` as per above, which will stop the call from
|
* Actively reject the call everywhere: send an [`m.call.reject`](#mcallreject) as per above, which
|
||||||
ringing on all the user's devices and the caller's client will inform them that the user has
|
will stop the call from ringing on all the user's devices and the caller's client will inform
|
||||||
rejected their call.
|
them that the user has rejected their call.
|
||||||
* Ignore the call: send no events, but stop alerting the user about the call. The user's other
|
* Ignore the call: send no events, but stop alerting the user about the call. The user's other
|
||||||
devices will continue to ring, and the caller's device will continue to indicate that the call
|
devices will continue to ring, and the caller's device will continue to indicate that the call
|
||||||
is ringing, and will time the call out in the normal way if no other device responds.
|
is ringing, and will time the call out in the normal way if no other device responds.
|
||||||
|
|
@ -224,8 +237,8 @@ As calls are "placed" to rooms rather than users, the glare resolution
|
||||||
algorithm outlined below is only considered for calls which are to the
|
algorithm outlined below is only considered for calls which are to the
|
||||||
same room. The algorithm is as follows:
|
same room. The algorithm is as follows:
|
||||||
|
|
||||||
- If an `m.call.invite` to a room is received whilst the client is
|
- If an [`m.call.invite`](#mcallinvite) to a room is received whilst the
|
||||||
**preparing to send** an `m.call.invite` to the same room:
|
client is **preparing to send** an `m.call.invite` to the same room:
|
||||||
- the client should cancel its outgoing call and instead
|
- the client should cancel its outgoing call and instead
|
||||||
automatically accept the incoming call on behalf of the user.
|
automatically accept the incoming call on behalf of the user.
|
||||||
- If an `m.call.invite` to a room is received **after the client has
|
- If an `m.call.invite` to a room is received **after the client has
|
||||||
|
|
|
||||||
|
|
@ -44,6 +44,8 @@ paths:
|
||||||
type: array
|
type: array
|
||||||
items:
|
items:
|
||||||
type: string
|
type: string
|
||||||
|
format: uri
|
||||||
|
pattern: "^turns?:"
|
||||||
description: A list of TURN URIs
|
description: A list of TURN URIs
|
||||||
ttl:
|
ttl:
|
||||||
type: integer
|
type: integer
|
||||||
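The `pattern` added to the `uris` items can be exercised on the client side with an equivalent regular expression. The sketch below is an assumption about how a client might filter the list; `filter_turn_uris` and the example hostnames are hypothetical. `search` is used because JSON Schema patterns are not implicitly anchored, and the `^` in the pattern already pins the match to the start of the string.

```python
import re

# Same expression as the schema's `pattern` for each entry of `uris`.
TURN_URI_PATTERN = re.compile(r"^turns?:")


def filter_turn_uris(uris):
    """Keep only the entries that look like turn: or turns: URIs."""
    return [
        uri for uri in uris
        if isinstance(uri, str) and TURN_URI_PATTERN.search(uri)
    ]


if __name__ == "__main__":
    candidates = [
        "turn:turn.example.com:3478?transport=udp",
        "turns:turn.example.com:5349?transport=tcp",
        "stun:stun.example.com:3478",
    ]
    print(filter_turn_uris(candidates))
```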
|
|
|
||||||
|
|
@ -1,44 +1,37 @@
|
||||||
{
|
$schema: https://json-schema.org/draft/2020-12/schema
|
||||||
"$schema": "https://json-schema.org/draft/2020-12/schema",
|
|
||||||
"type": "object",
|
type: object
|
||||||
"description": "This event is sent by the callee when they wish to answer the call.",
|
description: This event is sent by the callee when they wish to answer the call.
|
||||||
"x-weight": 40,
|
x-weight: 40
|
||||||
"allOf": [{
|
allOf:
|
||||||
"$ref": "core-event-schema/room_event.yaml"
|
- $ref: core-event-schema/room_event.yaml
|
||||||
}],
|
properties:
|
||||||
"properties": {
|
content:
|
||||||
"content": {
|
type: object
|
||||||
"type": "object",
|
allOf:
|
||||||
"allOf": [{
|
- $ref: core-event-schema/call_event.yaml
|
||||||
"$ref": "core-event-schema/call_event.yaml"
|
properties:
|
||||||
}],
|
answer:
|
||||||
"properties": {
|
type: object
|
||||||
"answer": {
|
title: Answer
|
||||||
"type": "object",
|
description: The session description object
|
||||||
"title": "Answer",
|
properties:
|
||||||
"description": "The session description object",
|
type:
|
||||||
"properties": {
|
type: string
|
||||||
"type": {
|
enum:
|
||||||
"type": "string",
|
- answer
|
||||||
"enum": ["answer"],
|
description: The type of session description.
|
||||||
"description": "The type of session description."
|
sdp:
|
||||||
},
|
type: string
|
||||||
"sdp": {
|
description: The SDP text of the session description.
|
||||||
"type": "string",
|
required:
|
||||||
"description": "The SDP text of the session description."
|
- type
|
||||||
}
|
- sdp
|
||||||
},
|
sdp_stream_metadata:
|
||||||
"required": ["type", "sdp"]
|
$ref: components/sdp_stream_metadata.yaml
|
||||||
},
|
required:
|
||||||
"sdp_stream_metadata": {
|
- answer
|
||||||
"$ref": "components/sdp_stream_metadata.yaml"
|
type:
|
||||||
}
|
type: string
|
||||||
},
|
enum:
|
||||||
"required": ["answer"]
|
- m.call.answer
|
||||||
},
|
|
||||||
"type": {
|
|
||||||
"type": "string",
|
|
||||||
"enum": ["m.call.answer"]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
|
||||||
|
|
@ -1,53 +1,47 @@
|
||||||
{
|
$schema: https://json-schema.org/draft/2020-12/schema
|
||||||
"$schema": "https://json-schema.org/draft/2020-12/schema",
|
|
||||||
"type": "object",
|
type: object
|
||||||
"description": "This event is sent by the caller when they wish to establish a call.",
|
description: This event is sent by the caller when they wish to establish a call.
|
||||||
"x-weight": 10,
|
x-weight: 10
|
||||||
"allOf": [{
|
allOf:
|
||||||
"$ref": "core-event-schema/room_event.yaml"
|
- $ref: core-event-schema/room_event.yaml
|
||||||
}],
|
properties:
|
||||||
"properties": {
|
content:
|
||||||
"content": {
|
type: object
|
||||||
"type": "object",
|
allOf:
|
||||||
"allOf": [{
|
- $ref: core-event-schema/call_event.yaml
|
||||||
"$ref": "core-event-schema/call_event.yaml"
|
properties:
|
||||||
}],
|
offer:
|
||||||
"properties": {
|
type: object
|
||||||
"offer": {
|
title: Offer
|
||||||
"type": "object",
|
description: The session description object
|
||||||
"title": "Offer",
|
properties:
|
||||||
"description": "The session description object",
|
type:
|
||||||
"properties": {
|
type: string
|
||||||
"type": {
|
enum:
|
||||||
"type": "string",
|
- offer
|
||||||
"enum": ["offer"],
|
description: The type of session description.
|
||||||
"description": "The type of session description."
|
sdp:
|
||||||
},
|
type: string
|
||||||
"sdp": {
|
description: The SDP text of the session description.
|
||||||
"type": "string",
|
required:
|
||||||
"description": "The SDP text of the session description."
|
- type
|
||||||
}
|
- sdp
|
||||||
},
|
lifetime:
|
||||||
"required": ["type", "sdp"]
|
type: integer
|
||||||
},
|
description: The time in milliseconds that the invite is valid for. Once the invite age exceeds this value, clients should discard it. They should also no longer show the call as awaiting an answer in the UI.
|
||||||
"lifetime": {
|
invitee:
|
||||||
"type": "integer",
|
type: string
|
||||||
"description": "The time in milliseconds that the invite is valid for. Once the invite age exceeds this value, clients should discard it. They should also no longer show the call as awaiting an answer in the UI."
|
description: The ID of the user being called. If omitted, any user in the room can answer.
|
||||||
},
|
x-addedInMatrixVersion: '1.7'
|
||||||
"invitee": {
|
format: mx-user-id
|
||||||
"type": "string",
|
pattern: "^@"
|
||||||
"description": "The ID of the user being called. If omitted, any user in the room can answer.",
|
sdp_stream_metadata:
|
||||||
"x-addedInMatrixVersion": "1.7"
|
$ref: components/sdp_stream_metadata.yaml
|
||||||
},
|
required:
|
||||||
"sdp_stream_metadata": {
|
- offer
|
||||||
"$ref": "components/sdp_stream_metadata.yaml"
|
- lifetime
|
||||||
}
|
type:
|
||||||
},
|
type: string
|
||||||
"required": ["offer", "lifetime"]
|
enum:
|
||||||
},
|
- m.call.invite
|
||||||
"type": {
|
|
||||||
"type": "string",
|
|
||||||
"enum": ["m.call.invite"]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
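The `invitee` description in the schema above implies a simple check on the receiving side: an invite without `invitee` may be answered by any user in the room, otherwise only by the named user. A minimal sketch under that reading; the helper name `invite_is_for_user` and the example user IDs are hypothetical.

```python
def invite_is_for_user(invite_content, own_user_id):
    """Return True if this user may answer the invite.

    invite_content is the `content` of an m.call.invite event. If
    `invitee` is omitted, any user in the room can answer.
    """
    invitee = invite_content.get("invitee")
    return invitee is None or invitee == own_user_id


if __name__ == "__main__":
    print(invite_is_for_user({"lifetime": 60000}, "@bob:example.org"))
    print(invite_is_for_user({"invitee": "@carol:example.org"}, "@bob:example.org"))
```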
|
|
|
||||||
|
|
@ -1,29 +1,23 @@
|
||||||
{
|
$schema: https://json-schema.org/draft/2020-12/schema
|
||||||
"$schema": "https://json-schema.org/draft/2020-12/schema",
|
|
||||||
"type": "object",
|
type: object
|
||||||
"description": "This event is sent by the caller's client once it has decided which other client to talk to, by selecting one of multiple possible incoming `m.call.answer` events. Its `selected_party_id` field indicates the answer it's chosen. The `call_id` and `party_id` of the caller is also included. If the callee's client sees a `select_answer` for an answer with party ID other than the one it sent, it ends the call and informs the user the call was answered elsewhere. It does not send any events. Media can start flowing before this event is seen or even sent. Clients that implement previous versions of this specification will ignore this event and behave as they did before.",
|
description: This event is sent by the caller's client once it has decided which other client to talk to, by selecting one of multiple possible incoming `m.call.answer` events. Its `selected_party_id` field indicates the answer it's chosen. The `call_id` and `party_id` of the caller is also included. If the callee's client sees a `select_answer` for an answer with party ID other than the one it sent, it ends the call and informs the user the call was answered elsewhere. It does not send any events. Media can start flowing before this event is seen or even sent. Clients that implement previous versions of this specification will ignore this event and behave as they did before.
|
||||||
"x-addedInMatrixVersion": "1.7",
|
x-addedInMatrixVersion: '1.7'
|
||||||
"x-weight": 50,
|
x-weight: 50
|
||||||
"allOf": [{
|
allOf:
|
||||||
"$ref": "core-event-schema/room_event.yaml"
|
- $ref: core-event-schema/room_event.yaml
|
||||||
}],
|
properties:
|
||||||
"properties": {
|
content:
|
||||||
"content": {
|
type: object
|
||||||
"type": "object",
|
allOf:
|
||||||
"allOf": [{
|
- $ref: core-event-schema/call_event.yaml
|
||||||
"$ref": "core-event-schema/call_event.yaml"
|
properties:
|
||||||
}],
|
selected_party_id:
|
||||||
"properties": {
|
type: string
|
||||||
"selected_party_id": {
|
description: The `party_id` field from the answer event that the caller chose.
|
||||||
"type": "string",
|
required:
|
||||||
"description": "The `party_id` field from the answer event that the caller chose."
|
- selected_party_id
|
||||||
},
|
type:
|
||||||
},
|
type: string
|
||||||
"required": ["selected_party_id"]
|
enum:
|
||||||
},
|
- m.call.select_answer
|
||||||
"type": {
|
|
||||||
"type": "string",
|
|
||||||
"enum": ["m.call.select_answer"]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
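The callee-side behaviour described for `m.call.select_answer` comes down to comparing `selected_party_id` with the party ID the client put in its own answer. A minimal sketch, assuming the event content is available as a dictionary; the helper name `answered_elsewhere` is hypothetical.

```python
def answered_elsewhere(select_answer_content, own_party_id):
    """Return True if the caller selected a different client's answer.

    When this returns True, the callee's client ends the call locally,
    tells the user it was answered elsewhere, and sends no events.
    """
    # selected_party_id is a required field of the event content.
    return select_answer_content["selected_party_id"] != own_party_id


if __name__ == "__main__":
    content = {"call_id": "c1", "party_id": "caller1", "selected_party_id": "dev2"}
    print(answered_elsewhere(content, "dev1"))
    print(answered_elsewhere(content, "dev2"))
```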
|
|
|
||||||