Clarify that arbitrary unicode is allowed in user/room IDs and room aliases.

Signed-off-by: Tulir Asokan <tulir@maunium.net>
Tulir Asokan 2023-05-01 13:18:59 +03:00
parent cdbf44eef0
commit 6215072fd3
2 changed files with 17 additions and 2 deletions


@ -0,0 +1 @@
Clarify that arbitrary unicode is allowed in user/room IDs and room aliases.


@ -598,6 +598,13 @@ character set:
extended_user_id_char = %x21-39 / %x3B-7E ; all ASCII printing chars except :
##### User IDs over federation
Due to a lack of validation in original Matrix homeserver implementations,
the localpart of user IDs over federation may contain any valid unicode
codepoints except `:`. A future spec change may create a new room version
to disallow such user IDs.
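
As an illustration only (not part of the change), the difference between the `extended_user_id_char` grammar and the looser federation rule can be sketched in Python. The function name and the three labels it returns are hypothetical, and the sketch checks the character set only, not length or emptiness.

```python
def classify_localpart(localpart: str) -> str:
    """Classify a user ID localpart by character set (hypothetical helper)."""
    # ':' separates localpart from domain, so it is never allowed.
    if ":" in localpart:
        return "invalid"
    # extended_user_id_char = %x21-39 / %x3B-7E
    if all(0x21 <= ord(c) <= 0x39 or 0x3B <= ord(c) <= 0x7E for c in localpart):
        return "extended"
    # Anything else may still be seen over federation.
    return "federation_only"
```

Under these assumptions, `alice` falls in the extended set, while `älice` is only acceptable as a user ID received over federation.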
##### Mapping from other character sets
In certain circumstances it will be desirable to map from a wider
@ -645,6 +652,10 @@ Room IDs are case-sensitive. They are not meant to be
human-readable. They are intended to be treated as fully opaque strings
by clients.
The localpart of a room ID (`opaque_id` above) may contain any valid
unicode codepoints except `:`, but it is recommended to only include
ASCII letters and digits when generating them.
#### Room Aliases
A room may have zero or more aliases. A room alias has the format:
@ -655,8 +666,11 @@ The `domain` of a room alias is the [server name](#server-name) of the
homeserver which created the alias. Other servers may contact this
homeserver to look up the alias.
-Room aliases MUST NOT exceed 255 bytes (including the `#` sigil and the
-domain).
+The localpart of a room alias may contain any valid unicode codepoints
+except `:`.
+
+Room aliases MUST NOT exceed 255 bytes as UTF-8 (including the `#` sigil
+and the domain).
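
Since aliases may now contain non-ASCII characters, the "as UTF-8" wording matters: the limit applies to encoded bytes, not to characters. A minimal sketch of such a check, where the constant and function names are hypothetical and not taken from the spec:

```python
MAX_ALIAS_BYTES = 255  # limit stated in the spec text above


def alias_within_limit(alias: str) -> bool:
    """Return True if the full alias (sigil, localpart, domain) fits the limit."""
    # Measure the UTF-8 encoding, not the number of code points.
    return len(alias.encode("utf-8")) <= MAX_ALIAS_BYTES
```

For example, an alias of 143 characters whose localpart is 130 copies of `é` encodes to 273 bytes and would be rejected, even though its character count is well under 255.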
#### Event IDs