
Reply v2.0: conduct usability tests (usertesting.com)
Closed, Resolved · Public

Description

This task represents the work involved with evaluating the usability of Version 2.0 of the Reply tool with people who have little-to-no-experience using Wikipedia talk pages.

This task is a follow-up to the usability test we ran with Version 1.0 of the Reply tool.

Test goals

  • To understand whether the new features introduced in version 2.0 of the Replying tool are intuitive for people who have little-to-no-experience using Wikipedia talk pages.
  • To understand whether the Reply tool as a whole is intuitive for people who have little-to-no-experience using Wikipedia talk pages.
  • To identify parts of the Replying tool that could be improved to make it easier for people to start using and to complete what they set out to accomplish.

Test details

  • Platforms: desktop
  • Account type: logged in and logged out
  • Text input default: source and visual
    • Variant A: test participants will be shown the tool's source mode when they click the "Reply" link for the first time
    • Variant B: test participants will be shown the tool's visual mode when they click the "Reply" link for the first time
  • Test pages: each test participant should be linked to a separate talk page.
    • Each talk page link should include ?dtvisual=1 at the end of the URL to ensure the visual mode is available to test participants.
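
As a small illustration of the link requirement above, the sketch below appends `dtvisual=1` to a talk page URL while keeping any query parameters that are already present. The helper name `with_visual_mode` is hypothetical; only the `dtvisual=1` parameter and the example URL come from this task.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_visual_mode(url: str) -> str:
    """Return `url` with dtvisual=1 set in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier dtvisual value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dtvisual"
    ]
    query.append(("dtvisual", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_visual_mode("https://en.wikipedia.beta.wmflabs.org/wiki/User_talk:Ppelberg-test"))
```

Generating the links this way, rather than editing them by hand, would make it harder to send a participant a link that is missing the parameter.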

Workflow to test

In addition to the Reply tool's core workflows (e.g. draft and publish a response in a discussion, locate the published response on the talk page), we would like to test these new workflows that Version 2.0 introduces:

  • Style the content of the comment they are writing (e.g. bold and italicize).
  • Ping another user. Scenarios:
    • Ping someone who has commented in the discussion the tester is commenting in
    • Ping someone who has commented in a different discussion on the talk page the tester is commenting on
    • Ping someone who has not commented on the talk page the tester is commenting on
  • Change the user the tester had selected to ping, to a different user
    • Read: if they initially had selected USER A to ping have them ping USER B
  • Delete the ping they had entered into the text input
  • Insert a link to another Wikipedia page into the comment they are writing
  • Be made aware when someone responds to something they've said

Test script

To be added by @iamjessklein

  • Each test participant should be explicitly asked what they understand the visual and source affordances to mean and do.

Done

  • Draft test script
  • Run test
  • Document test findings and recommendations on ticket, complete with standout quotes and links to the parts of the tests where the quotes occur.

Event Timeline

ppelberg created this task. Feb 26 2020, 3:50 AM
JTannerWMF updated the task description. Mar 2 2020, 11:03 AM
JTannerWMF added a comment (edited). Mar 2 2020, 11:05 AM

This task can't be picked up by @iamjessklein until the prototype is done, due to the team's desire to repeat the Replying 1.0 testing strategy. Once the prototype is complete, testing on usertesting.com can happen, in parallel with T246191.

ppelberg updated the task description. Apr 27 2020, 9:48 PM
ppelberg updated the task description. Apr 27 2020, 11:32 PM
ppelberg updated the task description. May 5 2020, 4:49 PM

Updating the task description with the specifics below in response to the feedback we mentioned receiving in T246191#6107400 and T246191#6091905:

  • Ping/@ mention another user. Scenarios:
    • Ping someone who has commented in the discussion the tester is commenting in
    • Ping someone who has commented in a different discussion on the talk page the tester is commenting on
    • Ping someone who has not commented on the talk page the tester is commenting on

cc @iamjessklein

Note on timing based on the conversation @iamjessklein had today:

  • This work starting depends on us having a prototype available for test participants to use.
ppelberg updated the task description. May 9 2020, 1:34 AM

Note on timing based on the conversation @iamjessklein had today:

  • This work starting depends on us having a prototype available for test participants to use.

The prototype is ready for testing. See: https://en.wikipedia.beta.wmflabs.org/wiki/User_talk:Ppelberg-test?dtvisual=1.

I've also updated the task description with more details about the test. I am assigning this task over to @iamjessklein now.

ppelberg reassigned this task from ppelberg to iamjessklein. May 9 2020, 1:36 AM
Alsee added a comment (edited). May 10 2020, 1:01 AM

@ppelberg when I edit an article and toggle back and forth between wikitext and visual editors, it does not corrupt the wikitext.

When I toggle back and forth between wikitext and visual modes in Flow, it corrupts the wikitext. In some cases the corruption expands like a cancer tumor, growing larger and larger each time you toggle back and forth between the two modes. It also grows each time you edit the comment, functionally adding another round-trip.

I tested your new feature. Sadly, I was unsurprised by the result. You built Flow again.

Simple test case:

normal '''bold ''bold-italics''' italics'' normal

(On a round trip the markup is altered, expanded, and a nowiki is inserted.)

I also tested by grabbing a random table and copy-pasting it in. (Sorry, I didn't invest any time trying to minimize the test case.) It demonstrates the more severe version of the problem. If you paste the following example and toggle twenty times between Source and Visual, it will expand and expand and expand and expaaaaaaaaand with an extra sixty lines added at the top and twenty lines added at the bottom. (Each round trip progressively adds three lines at the top and one at the bottom:)

{| class="wikitable" style="float: right; clear: right; margin-left: 2em;"
|+ Chemical composition of the crust<ref name=brown_mussett1981/>
!rowspan="2"|Compound
!rowspan="2"|Formula
!colspan="2"|Composition
|-
!style="font-size: smaller;"|Continental
!style="font-size: smaller;"|Oceanic
|-
|[[silica]]
|style="text-align: center;"|SiO<sub>2</sub>
|style="text-align: right;"|60.2%
|style="text-align: right;"|48.6%
|-
|[[Aluminum oxide|alumina]]
|style="text-align: center;"|Al<sub>2</sub>O<sub>3</sub>
|style="text-align: right;"|15.2%
|style="text-align: right;"|16.5%
|-
|[[Calcium oxide|lime]]
|style="text-align: center;"|CaO
|style="text-align: right;"|5.5%
|style="text-align: right;"|12.3%
|-
|[[Magnesium oxide|magnesia]]
|style="text-align: center;"|MgO
|style="text-align: right;"|3.1%
|style="text-align: right;"|6.8%
|-
|[[iron(II) oxide]]
|style="text-align: center;"|FeO
|style="text-align: right;"|3.8%
|style="text-align: right;"|6.2%
|-
|[[sodium oxide]]
|style="text-align: center;"|Na<sub>2</sub>O
|style="text-align: right;"|3.0%
|style="text-align: right;"|2.6%
|-
|[[potassium oxide]]
|style="text-align: center;"|K<sub>2</sub>O
|style="text-align: right;"|2.8%
|style="text-align: right;"|0.4%
|-
|[[iron(III) oxide]]
|style="text-align: center;"|Fe<sub>2</sub>O<sub>3</sub>
|style="text-align: right;"|2.5%
|style="text-align: right;"|2.3%
|-
|[[water (molecule)|water]]
|style="text-align: center;"|H<sub>2</sub>O
|style="text-align: right;"|1.4%
|style="text-align: right;"|1.1%
|-
|[[carbon dioxide]]
|style="text-align: center;"|CO<sub>2</sub>
|style="text-align: right;"|1.2%
|style="text-align: right;"|1.4%
|-
|[[titanium dioxide]]
|style="text-align: center;"|TiO<sub>2</sub>
|style="text-align: right;"|0.7%
|style="text-align: right;"|1.4%
|-
|[[phosphorus pentoxide]]
|style="text-align: center;"|P<sub>2</sub>O<sub>5</sub>
|style="text-align: right;"|0.2%
|style="text-align: right;"|0.3%
|-
!colspan="2"|Total
!style="text-align: right;"|99.6%
!style="text-align: right;"|99.9%
|}
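
One way to make the reported growth measurable is to count lines after each Source/Visual round trip. The sketch below is only an illustration: `toy_round_trip` is a hypothetical stand-in that mimics the reported symptom (three extra lines at the top, one at the bottom) and is not the Reply tool's converter, which would need to be plugged in instead. The shortened `sample` table is likewise just a placeholder.

```python
from typing import Callable, List


def toy_round_trip(wikitext: str) -> str:
    """Hypothetical stand-in for one Source -> Visual -> Source round trip.

    It only mimics the reported symptom: 3 lines added on top, 1 at the bottom.
    """
    return "\n" * 3 + wikitext + "\n"


def line_counts(wikitext: str, round_trip: Callable[[str], str], trips: int) -> List[int]:
    """Return the line count before any trip and after each of `trips` round trips."""
    counts = [len(wikitext.split("\n"))]
    for _ in range(trips):
        wikitext = round_trip(wikitext)
        counts.append(len(wikitext.split("\n")))
    return counts


sample = '{| class="wikitable"\n|-\n|[[silica]]\n|}'
counts = line_counts(sample, toy_round_trip, 20)
print(counts[-1] - counts[0])  # lines gained after 20 round trips
```

A converter that round-trips cleanly would give the same count at every step, so any upward trend in the list points to the kind of growth described above.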

Every time I test the product throwing random wikitext at it, it literally takes me seconds to find new ways wikitext support is completely broken. It's like you didn't even bother testing it with anything other than basic text-comments. Flow team made the same mistake - thinking they were building a chat-board. Big error. Our talk pages are a wiki workplace. As part of that work we post any and all wikitext in our comments. We expect genuine and correct wikitext support - not corruption or other breakage.

@Aklapper, please review.
@Alsee, thanks for testing. It looks like the Reply tool without the visual mode, as piloted on nl.wp and 3 other wikis, currently does not support tables, and probably does not support much other wikitext either, apart from bold, italic, underline, wiki links, and templates. No wonder the unsupported markup doesn't do well in a round trip through the visual mode.
The source mode of editing a talk page still exists. Senior editors like you can easily switch to the source mode of a talk page without using the Reply tool.
The developers are developing a new tool, which is available on a test wiki. Reply 1.0 is in pilot mode on 4 wikis, as a way of finding out what works and what does not.
Your wikitext:
normal '''bold ''bold-italics''' italics'' normal
is not correct wiki markup.

In reply to T246190#6121892: @Alsee: https://www.mediawiki.org/wiki/Bug_management/Phabricator_etiquette asks you to criticize ideas instead of people if you'd like to be active here. If not getting personal is too hard then I suggest that you may want to spend your time somewhere else. Thanks for your understanding!

Alsee added a comment. May 13 2020, 5:37 PM

@Aklapper the word "you" was intended as a collective noun for the Foundation. If you maintain that raising concern with the Foundation's design and testing workflow is somehow "personal" I would be concerned with your definitions.

@ppelberg as the recipient of my comment, I ask that you inform both me and Aklapper whether you felt my comment was a problematic personal attack so that one of us can adjust our behavior accordingly.

@Alsee: It's an interesting idea to start a comment with an individual name (and explicitly calling an individual "the recipient of your comment") but then state that "you" was intended as a plural. In that case I may assume that any use of the word "we" in your comments translates to "Alsee"? :)
Apart from that, please see again my previous comment how to bring up criticism and input in a constructive and acceptable way. Thanks a lot!

Updates
These are notes from this morning's standup and the chat Jess and I had in Slack on Friday:

  • Each test participant should be linked to a different talk page
  • Each talk page link that is shared with test participants should include ?dtvisual=1 at the end to ensure the visual mode is available to them considering T253668 is not likely to have landed before the tests begin.
  • Each test participant should be asked explicitly what they understand the two modes – source and visual – to mean and do.

The points above have been incorporated into the task description.

ppelberg updated the task description. May 26 2020, 7:36 PM

Updates
During today's team meeting, we decided to run two variations of this test:

  • Variant A: test participants will be shown the tool's source mode when they click the "Reply" link for the first time
  • Variant B: test participants will be shown the tool's visual mode when they click the "Reply" link for the first time

The above has been added to the task description.

ppelberg updated the task description. May 28 2020, 2:14 AM
iamjessklein added a comment (edited). May 28 2020, 4:54 PM

In order to do this, we need:

  • 5 unique test accounts with username/pw
  • 5 test user pages set up with a talk page where the Reply tool shows the test participant the tool's source mode by default
  • 5 test user pages set up with a talk page where the Reply tool shows the test participant the tool's visual mode by default
  • protocol
  • run test 1
  • run test 2
  • log results

@Esanders can you help me with the first three items of the list?

update: We need 6 unique test accounts so that we can use one to simulate a user who posted on Alice's page (with a proper username)

matmarex removed a subscriber: matmarex. May 28 2020, 7:46 PM

Shared the simple protocol with @ppelberg via a google doc, just waiting for a general thumbs up before moving on to running the tests.

Demian removed a subscriber: Demian. May 29 2020, 1:10 AM
ppelberg added a comment (edited). Jun 1 2020, 5:23 PM

Shared the simple protocol with @ppelberg via a google doc, just waiting for a general thumbs up before moving on to running the tests.

Great. As of 29-May the comments I have should be in the testing doc.

Update: 5 tests are running now on usertesting.com.
These tests default to the visual mode of the editor.

I will post the results after I log the tests, then I will run the next round where the default is source mode.

All the tests are logged - you can read the write up here. I will be running the second round of tests (with source mode as default) next.

Findings

We ran a test on usertesting.com on June 1, 2020. The test recruited 5 random, technically advanced web users. Participants were directed to an article page set up on the prototype server. The detailed findings can be found in the limited-access test log.

  • 5 tests were conducted
  • Participation on desktop: 3 male; 2 female
  • All participants had familiarity with Wikipedia and some participants had previously edited an article.

Clicking reply
✅ 5 out of 5 successfully completed the task
✏️ 1 user had difficulty identifying the reply button and took quite a while to start the task.

Writing comment
✅ 5 out of 5 successfully completed the task

Adding a link
✅ 5 out of 5 successfully completed the task

Formatting comment
✅ 5 out of 5 successfully completed the task

@-mentioning the user they are replying to
✅ 3 out of 5 successfully completed the task
"so it automatically brings it in, that's useful." UT-A
🚫 2 out of 5 did not complete the task successfully
✏️ 2 users typed out the username in full, so the tool never recognized the user or linked to their talk page.
✏️ One participant correctly typed the @ sign and spelled the username correctly but did not capitalize the first letter, so it was not converted into a user link. This user did not notice and believed they had done it correctly.

@-mentioning a user not in the conversation
✅ 4 out of 5 successfully completed the task
“I like that it suggests the user name and you don't have to type it out.” UT-A
🚫 1 out of 5 did not complete the task successfully

Deleting an @-mention
✅ 4 out of 5 successfully completed the task
🚫 1 out of 5 did not complete the task successfully
✏️ The user who could not delete the mention had not added it successfully to begin with.

Posting the comment
✅ 5 out of 5 successfully completed the task
“When you say post it - does that mean it is going to be visible to everyone?” - UT-B

Locating the posted comment
✅ 5 out of 5 successfully completed the task
“It would be better if the date and time stamps were first so you could sort them in that order instead of having to look to the end.” UT-D

Miscellaneous Notes:

On page appearance:

  • "Doesn't look like the sort of usual messaging interface you might see on a website." - UT-A
  • "'Watch this page' - I'm assuming it means getting notifications, I'm not sure." UT-A
  • "I don't know what difference between visual and source is." UT-E
  • "Can I click this? I don't know if this will [erase my visual content that I already wrote] if I click source, I'm afraid, so I won't." UT-B (referring to clicking source mode)
  • “Maybe change the font of the reply because I can't see the difference.” UT-B
  • One user (UT-B) was confused and afraid to click the "typewriter button" because it overlapped with the reply button
  • One user (UT-B) referred to this page as their "edit home page"

On accessibility issues:

  • One user (UT-B) has trouble identifying any links on page due to accessibility issues with color.

On overall design:
"This is a crude way to interact with people compared to some other messaging system like Slack. But within the context of Wikipedia I guess it's within their toolset and something to get used to. It would help if there was a wiki on how to use this wiki." - UT-D

Update: I've posted the second round of tests on usertesting.com (this round has source mode as default).

Findings (Part 2)

We ran a test on usertesting.com on June 3, 2020. The test recruited 5 random, technically advanced web users. Participants were directed to an article page set up on the prototype server. The detailed findings can be found in the limited-access test log.

  • 5 tests were conducted
  • Participation on desktop: 2 male; 3 female
  • 5 participants were desktop web users
  • All participants had familiarity with Wikipedia and some participants had previously edited an article.

Clicking reply
✅ 4 out of 5 successfully completed the task
🚫 One user looked everywhere, found the user via the revision history, and then edited the commenter's (Blathersfan) user page instead.

Writing comment
✅ 4 out of 5 successfully completed the task

Adding a link
✅ 3 out of 5 successfully completed the task
✏️ One participant explained how he was hunting for tips on how to do markdown

Formatting comment
✅ 3 out of 5 successfully completed the task
“I think the one thing that was difficult was the formatting of the text. I like using markup languages because they make editing quick and simple, however I did not know the syntax required for the markup I was attempting which made it difficult to know if I would be displaying my intended message.” UT-F

@-mentioning the user they are replying to
✅ 3 out of 5 successfully completed the task
🚫 1 out of 5 did not complete the task successfully
✏️ 1 user typed out the username in full, so the tool never recognized the user or linked to their talk page.

@-mentioning a user not in the conversation
✅ 3 out of 5 successfully completed the task
🚫 1 out of 5 did not complete the task successfully

Deleting an @-mention
✅ 3 out of 5 successfully completed the task
🚫 2 out of 5 did not complete the task successfully
✏️ The users who could not delete the mention had not added it successfully to begin with.

Posting the comment
✅ 4 out of 5 successfully completed the task

Locating the posted comment
✅ 4 out of 5 successfully completed the task

Miscellaneous Notes:
On page appearance:

  • "One thing I'd like was if there was some sort of vertical line that separated some of the blocks in the thread. because sometimes these threads get long and indentation can get confusing." UT-F
  • "I'm a little confused about what User talk: Alice is and who "devwiki" is." UT-F
  • Several testers thought that the part of the interface that says “from devwiki” indicated that the message was sent through dev wiki.

On interface design:

  • One tester was confused by there being both an edit and reply button - UT-G
  • Several users tried to link by highlighting and left clicking
  • Several testers did not know what “watch this page” meant but eventually were able to intuit the affordance
  • "I didn't expect to see "Alice1" but it makes sense that it automatically inserts your username." - UT-G
  • One tester thought that "source mode" meant adding references or sources.

On noticing "visual mode":

  • “I didn't notice that I had the Visual tools available at first. I don't think I would have clicked on the "Visual" tab if I hadn't been asked about the tool.” UT-H
  • One tester assumed "visual mode" meant adding a picture, but then clicked on it and understood that source is code and visual is where she should write. - UT-H

On having a good experience:

  • "it was pretty straightforward. streamlined process here i like the fact that it's very stripped down and doesn't feel data intense.so not overwhelming by any stretch of the imagination. the colors are pleasing to the eye... simple to navigate interface. Overall it works the way i'd expect it to." UT-I
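
The per-task completion counts reported in the two rounds above can be put side by side with a few lines of Python. This is only a restatement of the numbers in the findings, not additional analysis; the variable and function names are illustrative, and with five participants per round the totals are too small to support statistical claims.

```python
# Successful completions out of 5 participants per task, as reported in the findings.
TASKS = [
    "Clicking reply",
    "Writing comment",
    "Adding a link",
    "Formatting comment",
    "@-mentioning the user they are replying to",
    "@-mentioning a user not in the conversation",
    "Deleting an @-mention",
    "Posting the comment",
    "Locating the posted comment",
]
VISUAL_DEFAULT = [5, 5, 5, 5, 3, 4, 4, 5, 5]  # round 1 (June 1, 2020)
SOURCE_DEFAULT = [4, 4, 3, 3, 3, 3, 3, 4, 4]  # round 2 (June 3, 2020)
PARTICIPANTS = 5


def overall_rate(successes):
    """Share of successful task attempts across all tasks in one round."""
    return sum(successes) / (PARTICIPANTS * len(successes))


for task, visual, source in zip(TASKS, VISUAL_DEFAULT, SOURCE_DEFAULT):
    print(f"{task}: visual default {visual}/5, source default {source}/5")

print(f"Overall: visual default {overall_rate(VISUAL_DEFAULT):.0%}, "
      f"source default {overall_rate(SOURCE_DEFAULT):.0%}")
```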
iamjessklein updated the task description. Jun 8 2020, 4:10 PM
ppelberg closed this task as Resolved. Jun 15 2020, 10:10 PM