Test Matrix
Open, Needs Triage, Public

Description

Draft hypothesis: If we set up a public Matrix server, bridge it to select Slack and IRC channels, and invite staff and community to test it, we’ll know whether Matrix is a viable solution for incident response and understand the associated costs.

Test plan:

  • Clients:
    • All the use cases below should be tested on the web interface (Element), and the Android and iOS apps (Element X, presumably - it's slightly less feature-complete than the old Element app, but that one is abandonware).
    • The more fundamental use cases should also be tested with the Windows / Mac / Linux desktop apps (not a dealbreaker if those don't work well, they are basically just browser-in-a-box, but nice to know)
  • Basic UX:
    • Is the interface in general intuitive and easy to use?
    • Does it meet basic accessibility checks?
    • Is it easy to format rich text, add images etc?
    • Is the thread implementation decent?
    • Can users see messages from before they joined (if the channel is configured accordingly)?
    • Is it reasonably fast?
    • Is it easy to create a new room? Is it easy to bridge it?
    • Is the Slack bridge reasonable, in either direction? (See here for more specific questions.)
    • Is the IRC bridge reasonable, in either direction? (Can it handle long messages? Does it degrade rich text etc. reasonably? Does it degrade complex actions like message editing reasonably?) Past pain points have been netsplits, flood control on the IRC side, and side effects of IRC anti-abuse functionality (especially setting +r).
    • Stretch goals: bridge to Discord, Telegram, Mattermost (WMDE instance)
  • Inviting users to a public channel:
    • Ask someone with a Wikimedia SUL account to join a channel (this assumes we'll use SUL login as the only Matrix authentication method - we could have multiple, but that brings its own mess) and see if the UX is reasonable
      • Check especially how the username discrepancy between Wikimedia (basically any character, can be very long) and Matrix (basically ASCII) is handled
    • Same for someone who doesn't have a SUL account
    • Test how easy it is to use a wiki page link to join if you already have Matrix set up
    • Test specifically the user experience for someone who is already using Matrix but on a different home server (this is probably a pain point at least on apps which don't support multiple accounts, so we need to know how to work around that)
    • Ask someone already using Matrix to find a given topic via channel search
  • Inviting users to a private channel:
    • A user who is on Matrix gets invited to a private channel
    • A user who is not on Matrix gets invited to a private channel (is there some built-in workflow to invite them before their account gets created?)
    • A user who is on Matrix requests to be invited to a private channel (is there a "knock" feature?)
  • Pings / presence
    • Is there a way to configure the client to show when you are online? What if there are multiple clients (e.g. browser + a mobile phone)?
    • Is there a way to configure the client to not show when you are online? (We probably want that to be possible for privacy reasons.)
    • Can you publicly ping (@-mention) someone who is using Matrix? Is the phone notification reasonable? Is the desktop notification reasonable? How easy is it to find out their username?
    • Same for private ping (direct message)
    • Is it possible to watch (get notified for all comments in) specific channels / threads?
    • Are there reasonable notification controls (work time, priority channels)?
    • Can you publicly + privately ping someone who is using Slack? How easy is it to find out their username?
    • Can you publicly + privately ping someone who is using IRC? How easy is it to find out their username?
  • Bot API
    • Port 1-2 IRC bots over to Matrix. Should involve some complex formatted output, and controls (e.g. the bot reacting to ! tags). Test whether it works reliably, can take advantage of the richer format of Matrix, still appears reasonably on IRC, and the DX of the porting process (is the Matrix documentation good? community support? are there good libraries? is it easy to debug?)
    • Port over a Slack bot that uses features unavailable in IRC (e.g. reacji controls) or if there isn't anything, come up with a mock use case.
  • Moderation tools
    • Do basic moderation options (deleting another user's message, voicing/silencing, kicking/banning, opping/deopping) work well and are they easy to use?
    • Do they carry over reasonably across the IRC bridge?
    • Is there a reasonable flagging/reporting functionality? How flexibly can reports be routed?
    • Can users protect themselves on their own (e.g. use blocklists)?
    • (more privacy/legal than moderation) Is it possible to limit how long messages are retained? Can it be done per channel?
    • Maybe we should try out Mjolnir or Draupnir, although at our scale probably not needed.

(This only includes user testing criteria. SRE and ITS will probably have their own criteria about evaluating self-hosting and compatibility with the existing office infrastructure.)
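
For the username discrepancy item in the test plan, it may help to know what a lossless mapping would look like before testing. The Matrix spec's appendix on user identifiers suggests one way to map arbitrary strings into the restricted localpart character set: encode as UTF-8, escape uppercase letters with a leading underscore, and hex-escape everything else with `=`. The sketch below follows that suggestion. Whether a SUL login integration would actually use this mapping is an assumption to verify, and `matrix.example.org` is a placeholder server name.

```python
import string

# Bytes passed through unchanged. "_" and "=" are also legal in a localpart,
# but they are used as escape characters here, so they get escaped themselves.
# "+" is escaped too, to stay conservative.
_PASS_THROUGH = set(string.ascii_lowercase + string.digits + ".-/")

# The Matrix spec limits a full user ID ("@localpart:server") to 255 bytes.
MAX_USER_ID_BYTES = 255


def to_localpart(username: str) -> str:
    """Map an arbitrary username to a Matrix localpart, reversibly."""
    out = []
    for byte in username.encode("utf-8"):
        ch = chr(byte)
        if ch in _PASS_THROUGH:
            out.append(ch)
        elif ch == "_":
            out.append("__")              # literal underscore
        elif "A" <= ch <= "Z":
            out.append("_" + ch.lower())  # uppercase letter: "_" + lowercase
        else:
            out.append(f"={byte:02x}")    # any other byte: "=" + two hex digits
    return "".join(out)


def fits_in_user_id(localpart: str, server_name: str) -> bool:
    """Check the full user ID against the 255-byte limit."""
    return len(f"@{localpart}:{server_name}".encode("utf-8")) <= MAX_USER_ID_BYTES


if __name__ == "__main__":
    for name in ["Alice", "Jo Bloggs", "Ünal", "я" * 100]:
        localpart = to_localpart(name)
        print(name[:20], "->", localpart[:40],
              fits_in_user_id(localpart, "matrix.example.org"))
```

Every non-ASCII byte triples in length under this scheme, so long usernames in non-Latin scripts can exceed the user ID limit. That is probably the case most worth including in the test accounts.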

ITS would suggest testing the following items about Administration and governance:

  • Are there approval controls for third party extensions (allowlists, blocklists, admin approval triggers, audit trails, etc.)?
  • Are there data retention controls?
  • Can we use SCIM to automate user creation, update, and deactivation/deletion?
  • Can we set up and enforce an Okta SSO integration (SAML or OIDC) for our email domain (wikimedia.org)?
  • Can we configure authentication policies for native logins (e.g., password requirements, MFA enforcement)?
  • Can admin roles with customized privileges be created and applied to groups of users?
  • What kind of reporting metrics can be accessed via the admin user interface or the API?

Details

Due Date
Mar 31 2025, 12:00 AM

Event Timeline

leila updated the task description.
leila set Due Date to Mar 31 2025, 12:00 AM. Dec 13 2024, 4:47 PM

These are requirements we have stated in the WE3.3.1 recommendations doc ("C" being must-have and "NTH" being nice to have):

  • C1: Global channels - the system should support open channels that anyone can join, without having to wait for WMF action.
  • C2: Private channels - the system should support private channels, and it should be easy and fast to invite non-WMF users to them.
  • C3: Moderation possible - the system should provide tools to handle spam, abuse and misbehavior.
  • C4: Encryption of low level channels - for legal and security reasons, all information needs to be encrypted in transit.
  • C5: Usability - the system should meet expectations for modern software and not be a major productivity or efficiency bottleneck.
  • C6: Ability to ping offline people - the system should allow messaging people who are not connected to the internet or not running their chat software at the moment.
  • C7: Data retention policy flexibility - it is sometimes important for legal reasons to limit how long the chat history of some specific channel is retained on the server.
  • C8: Open well-documented API for bots - we have lots of bots that do crucial incident response related work (e.g. alerting, notifying about code changes, logging) and these need to stay operational.
  • NTH1: Emojis - emojis and reaction emojis can compress a lot of meaning into easy-to-scan images, allow complex workflows, and can be used as a primitive UI.

These are the draft evaluation criteria from the same document (skipped one that ended up not being relevant):

  • Usability:
    • Is the interface intuitive and easy to use?
    • Is it reasonably fast?
    • Is there a dedicated mobile app for Android and iOS, and if so, how much of the web client’s functionality does it cover?
    • Is there a desktop client (and for which OSes)?
    • Are notifications manageable? (E.g. is it easy to find new comments in threads? Can you follow/unfollow threads?)
    • Does it meet common accessibility criteria?
    • Is the bridge capable of matching actions to the equivalent action in the other system?
  • Security & safety
  • Moderation support:
    • Can users protect themselves on their own (e.g. use blocklists)?
    • Is it possible to flag offending comments?
  • Enterprise functionality:
    • Can the system use an external identity provider (e.g. via OAuth)?
    • If we consider moving all communications to this platform in the future → does it support multiple external identity providers (e.g. Okta for staff vs. Wikimedia SUL for volunteers)?
    • If we consider moving all communications to this platform in the future → does it support automatic account creation / deactivation via the identity provider (e.g. SCIM)?
    • If there is some sort of app system where apps can be granted permissions, are there policy controls for that?
  • Complexity of self-hosting:
    • Is it widely adopted by others (especially other open communities)?
    • Is it well-documented? Is the architecture understandable and reasonable?
    • Is the developer community (or organization) supportive and easy to approach?
    • What are their release practices? How do they handle security releases?
    • How resource-intensive is it to operate it?
    • What is the state of orchestration and monitoring tooling?

Based on those, here is a rough outline of what we should test IMO:

  • Clients:
    • All the use cases below should be tested on the web interface (Element), and the Android and iOS apps (Element X, presumably - it's slightly less feature-complete than the old Element app, but that one is abandonware).
    • The more fundamental use cases should also be tested with the Windows / Mac / Linux desktop apps (not a dealbreaker if those don't work well, they are basically just browser-in-a-box, but nice to know)
  • Basic UX:
    • Is the interface in general intuitive and easy to use?
    • Does it meet basic accessibility checks?
    • Is it easy to format rich text, add images etc?
    • Is the thread implementation decent?
    • Can users see messages from before they joined (if the channel is configured accordingly)?
    • Is it reasonably fast?
    • Is it easy to create a new room? Is it easy to bridge it?
    • Is the Slack bridge reasonable, in either direction? (See here for more specific questions.)
    • Is the IRC bridge reasonable, in either direction? (Can it handle long messages? Does it degrade rich text etc. reasonably? Does it degrade complex actions like message editing reasonably?) Past pain points have been netsplits, flood control on the IRC side, and side effects of IRC anti-abuse functionality (especially setting +r).
    • Stretch goals: bridge to Discord, Telegram, Mattermost (WMDE instance)
  • Inviting users to a public channel:
    • Ask someone with a Wikimedia SUL account to join a channel (this assumes we'll use SUL login as the only Matrix authentication method - we could have multiple, but that brings its own mess) and see if the UX is reasonable
      • Check especially how the username discrepancy between Wikimedia (basically any character, can be very long) and Matrix (basically ASCII) is handled
    • Same for someone who doesn't have a SUL account
    • Test how easy it is to use a wiki page link to join if you already have Matrix set up
    • Test specifically the user experience for someone who is already using Matrix but on a different home server (this is probably a pain point at least on apps which don't support multiple accounts, so we need to know how to work around that)
    • Ask someone already using Matrix to find a given topic via channel search
  • Inviting users to a private channel:
    • A user who is on Matrix gets invited to a private channel
    • A user who is not on Matrix gets invited to a private channel (is there some built-in workflow to invite them before their account gets created?)
    • A user who is on Matrix requests to be invited to a private channel (is there a "knock" feature?)
  • Pings / presence
    • Is there a way to configure the client to show when you are online? What if there are multiple clients (e.g. browser + a mobile phone)?
    • Is there a way to configure the client to not show when you are online? (We probably want that to be possible for privacy reasons.)
    • Can you publicly ping (@-mention) someone who is using Matrix? Is the phone notification reasonable? Is the desktop notification reasonable? How easy is it to find out their username?
    • Same for private ping (direct message)
    • Is it possible to watch (get notified for all comments in) specific channels / threads?
    • Are there reasonable notification controls (work time, priority channels)?
    • Can you publicly + privately ping someone who is using Slack? How easy is it to find out their username?
    • Can you publicly + privately ping someone who is using IRC? How easy is it to find out their username?
  • Bot API
    • Port 1-2 IRC bots over to Matrix. Should involve some complex formatted output, and controls (e.g. the bot reacting to ! tags). Test whether it works reliably, can take advantage of the richer format of Matrix, still appears reasonably on IRC, and the DX of the porting process (is the Matrix documentation good? community support? are there good libraries? is it easy to debug?)
    • Port over a Slack bot that uses features unavailable in IRC (e.g. reacji controls) or if there isn't anything, come up with a mock use case.
  • Moderation tools
    • Do basic moderation options (deleting another user's message, voicing/silencing, kicking/banning, opping/deopping) work well and are they easy to use?
    • Do they carry over reasonably across the IRC bridge?
    • Is there a reasonable flagging/reporting functionality? How flexibly can reports be routed?
    • Can users protect themselves on their own (e.g. use blocklists)?
    • (more privacy/legal than moderation) Is it possible to limit how long messages are retained? Can it be done per channel?
    • Maybe we should try out Mjolnir or Draupnir, although at our scale probably not needed.

(This doesn't include most of the Enterprise functionality and Complexity of self-hosting items, I don't feel competent to suggest specifics for those.)
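
For the Bot API item, a rough idea of what a ported bot would emit. In the Matrix client-server API, a formatted message is an `m.room.message` event whose content carries both a plain-text `body` and an HTML `formatted_body` (with `format` set to `org.matrix.custom.html`), and `m.notice` is the message type intended for automated senders. The sketch below only builds that content; sending it would need a client library or an HTTP call, which is left out. The `!status` command and its reply text are made up for illustration, and how the IRC bridge renders the result is exactly what the test should find out.

```python
import html
from typing import Optional


def build_notice(plain: str, formatted_html: str) -> dict:
    """Content for an m.room.message event with HTML plus plain-text fallback."""
    return {
        "msgtype": "m.notice",               # message type meant for bots
        "body": plain,                       # plain-text fallback
        "format": "org.matrix.custom.html",
        "formatted_body": formatted_html,
    }


def handle_command(message: str) -> Optional[dict]:
    """Answer a hypothetical '!status <service>' command; ignore anything else."""
    parts = message.strip().split()
    if len(parts) != 2 or parts[0] != "!status":
        return None
    service = parts[1]
    plain = f"{service}: OK"
    # Escape user-supplied text before embedding it in HTML.
    rich = f"<strong>{html.escape(service)}</strong>: OK"
    return build_notice(plain, rich)


if __name__ == "__main__":
    print(handle_command("!status gerrit"))
    print(handle_command("just chatting"))
```

Keeping `body` meaningful on its own matters for this evaluation, since it is the part a client or bridge without HTML support can fall back to.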

T186061: Evaluate Matrix / Element as the recommended chat system for Wikimedia and T222458: Evaluate Element as recommended IRC client have some comments on past pain points, and for T341762: Provide chat support for WCNA in Matrix there is a Telegram thread somewhere. It is probably worth re-checking the issues that came up back then.

ITS would suggest testing the following items:

Administration and governance:

  • Are there approval controls for third party extensions (allowlists, blocklists, admin approval triggers, audit trails, etc.)?
  • Are there data retention controls?
  • Can we use SCIM to automate user creation, update, and deactivation/deletion?
  • Can we set up and enforce an Okta SSO integration (SAML or OIDC) for our email domain (wikimedia.org)?
  • Can we configure authentication policies for native logins (e.g., password requirements, MFA enforcement)?
  • Can admin roles with customized privileges be created and applied to groups of users?
  • What kind of reporting metrics can be accessed via the admin user interface or the API?
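
For the SCIM item, the request bodies a test would need are defined by the SCIM 2.0 standard (RFC 7643 and RFC 7644) and are independent of the chat server. A minimal sketch of the two most relevant ones follows: creating a user and deactivating one. Whether the homeserver or an add-on exposes a SCIM endpoint at all is the open question here; the username and email values are placeholders.

```python
import json

USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def scim_create_user(user_name: str, email: str) -> str:
    """JSON body for a SCIM 'POST /Users' request."""
    return json.dumps({
        "schemas": [USER_SCHEMA],
        "userName": user_name,
        "emails": [{"value": email, "primary": True}],
        "active": True,
    })


def scim_deactivate_user() -> str:
    """JSON body for a SCIM 'PATCH /Users/{id}' request that deactivates the user."""
    return json.dumps({
        "schemas": [PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    })


if __name__ == "__main__":
    # Placeholder identity; a real test would use an Okta-provisioned account.
    print(scim_create_user("example.user", "example.user@wikimedia.org"))
    print(scim_deactivate_user())
```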