We probably have some old tasks related to this, but I wanted to use this task as an ideas dumping ground, separate from the parent task, for ideas that come up while I review research papers.
The value of performance stability versus average performance
We've floated the idea a few times in the past that we could study performance perception in the real world by making the site slower on purpose for a group of users and studying the effect on their behavior. A refinement on that idea would be to compare two conditions: high random variance in performance versus bad but consistent performance.
It's possible that a website that runs fast 99% of the time but has a very slow page load every now and then is more frustrating to the user than one whose performance is consistently average.
In the context of what our team is doing, if one phenomenon is a lot worse than the other, this would dramatically shift our focus. If consistency is the most important thing, tackling high percentiles should be our main focus. If faster response is the most important factor, it confirms that our current approach of focusing on making things faster across the board is the right one.
This could be studied in a controlled environment or "in the wild" by intentionally slowing down page loads. The challenge lies in how we measure that users are more satisfied with one scenario than another. By asking them? By measuring session length?
This might work as an opt-in study, with a browser plugin that either doesn't affect load time, inserts randomly slow pageloads, or slows down every pageload by whatever amount is needed to make page load time very consistent. We would then measure time spent on wikis over a long period of time.
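As a rough illustration of what such a plugin could look like, here is a minimal sketch of a Firefox-style WebExtension background script that holds back top-level page loads. Everything in it is an assumption for illustration: the group assignment, the delay values, the URL pattern, and the use of a Promise-returning blocking webRequest listener (a Firefox capability) rather than whatever mechanism the real study would use.

```typescript
// background.ts — minimal sketch, not a real study implementation.
// Assumes "webRequest" and "webRequestBlocking" permissions plus host
// permissions for the wikis under study.
declare const browser: any;

type Group = 'control' | 'random' | 'consistent';

// Hypothetical: the group would be assigned once per participant at enrolment.
declare function assignedGroup(): Group;
const group: Group = assignedGroup();

const CONSISTENT_DELAY_MS = 1500;      // every pageload gets this extra latency
const RANDOM_SLOW_PROBABILITY = 0.05;  // ~5% of pageloads are made very slow
const RANDOM_SLOW_DELAY_MS = 6000;

function delayForThisLoad(): number {
  switch (group) {
    case 'control':
      return 0;
    case 'consistent':
      return CONSISTENT_DELAY_MS;
    case 'random':
      return Math.random() < RANDOM_SLOW_PROBABILITY ? RANDOM_SLOW_DELAY_MS : 0;
  }
}

// In Firefox, a blocking webRequest listener may return a Promise; the
// request is held until the promise resolves, which adds the artificial delay.
browser.webRequest.onBeforeRequest.addListener(
  (details: any) => {
    if (details.type !== 'main_frame') {
      return {};
    }
    const delay = delayForThisLoad();
    if (delay === 0) {
      return {};
    }
    return new Promise((resolve) => setTimeout(() => resolve({}), delay));
  },
  { urls: ['*://*.wikipedia.org/*'] },  // placeholder URL pattern
  ['blocking']
);
```

Note that adding a constant delay for the "consistent" group is a simplification; making total load time truly consistent would require measuring each load and padding it up to a target. Delaying only the main_frame request is also a simplification, since a real study would have to decide whether subresources and API requests get slowed as well.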
How results might be actionable: if we find that stability is more important than average performance, it might encourage us to focus on improving high percentiles and extreme cases more than performance across the board. If average performance matters more, this would reinforce our current focus.
Performance perception thresholds and granularity
In a controlled environment, it would be interesting to identify the latency threshold for the core mechanics we want to study (e.g. reading, editing).
There is a limit to what humans can perceive as being instantaneous, and it seems to depend on context (since studies suggest that audio and haptic latency thresholds are different). It might also depend on age and background, again with studies suggesting that younger people have lower latency thresholds.
This would tell us the limit beyond which optimizing is pointless, as people can't tell the difference. We could also use the same setup to measure how satisfied people are at different thresholds once they start perceiving latency. Taking a pessimistic example, if moving the needle from, say, 100ms to 30ms response time increases user satisfaction only from 80% to 82%, it might not justify some of the budget allocated to a project aiming to achieve such performance improvements.
Studying this would require a lab setup and a big enough cohort of participants.
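One common way to run such a lab study is an adaptive staircase procedure from psychophysics. Below is a minimal sketch, assuming a hypothetical presentTrial() helper that performs the same page action with an extra amount of latency and returns whether the participant reported noticing a delay.

```typescript
// staircase.ts — minimal sketch of a 1-up/1-down adaptive staircase for
// estimating a latency-perception threshold in a lab setting.
// presentTrial() is hypothetical: it would perform the same page action
// with `delayMs` of added latency and return whether the participant
// reported noticing the delay.
declare function presentTrial(delayMs: number): Promise<boolean>;

async function estimateThresholdMs(
  startMs = 300,
  stepMs = 20,
  reversalsWanted = 8
): Promise<number> {
  let delay = startMs;
  let lastNoticed: boolean | null = null;
  const reversalDelays: number[] = [];

  while (reversalDelays.length < reversalsWanted) {
    const noticed = await presentTrial(delay);
    if (lastNoticed !== null && noticed !== lastNoticed) {
      reversalDelays.push(delay); // the staircase changed direction here
    }
    lastNoticed = noticed;
    // Noticed the delay: shrink it. Missed it: grow it.
    delay = Math.max(0, delay + (noticed ? -stepMs : stepMs));
  }

  // Average the delays at the reversal points as the threshold estimate.
  return reversalDelays.reduce((sum, d) => sum + d, 0) / reversalDelays.length;
}
```

A simple 1-up/1-down rule like this converges on the delay participants detect about 50% of the time; other rules (e.g. 2-down/1-up) target different detection rates and could be swapped in depending on how the study defines "threshold".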
How results might be actionable: knowing what users consider to be an "instant wiki pageload" would tell us the point beyond which further optimization is futile. Furthermore, identifying the granularity of a perceivable performance difference would inform decisions about whether to pursue a given optimization, for instance when the expected savings fall below what users can perceive as a difference.
Measure user interaction with JS-enhanced elements
Some of the above ideas assume that visual progress is king. If we run a survey and ask people to tell us how fast the page loaded, they will probably assume that we're talking about visual progress, not about page interaction. However, on pages where people's main task is to interact with JS-enhanced elements, fast visual progression might not be the main source of frustration if the interactive elements people are waiting for aren't usable once visual progress is complete. The most obvious example is the visual editor, which is highly interactive and enhanced by JS.
Therefore I think it would be interesting to measure the user interaction rate with JS-enhanced features on all pages, to see if any patterns emerge. This might help us distinguish pages where optimizing for faster interactivity is the most important factor for user satisfaction from those where faster visual completion matters most.
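A minimal sketch of what that instrumentation could look like, assuming a hypothetical data-js-enhanced attribute marking the relevant elements and a hypothetical logInteraction() beacon (real code would hook into the existing event-logging pipeline instead):

```typescript
// instrumentation.ts — minimal sketch, not production code.
// Assumes a hypothetical data-js-enhanced attribute on JS-enhanced elements
// and a hypothetical logInteraction() beacon.
declare function logInteraction(event: Record<string, unknown>): void;

const ENHANCED_SELECTOR = '[data-js-enhanced]'; // placeholder marker

// Approximate "visually complete" with the window load event; a real
// implementation would reuse whatever visual-completion metric we already collect.
let visuallyCompleteAt: number | null = null;
window.addEventListener('load', () => {
  visuallyCompleteAt = performance.now();
});

// Log clicks on enhanced elements, relative to visual completion.
document.addEventListener(
  'click',
  (e) => {
    const target = (e.target as Element | null)?.closest(ENHANCED_SELECTOR);
    if (!target) {
      return;
    }
    logInteraction({
      page: location.pathname,
      feature: target.getAttribute('data-js-enhanced'),
      msAfterVisualComplete:
        visuallyCompleteAt === null
          ? null
          : Math.round(performance.now() - visuallyCompleteAt),
    });
  },
  true
);
```

Aggregating events like these per page would give both the interaction rate with JS-enhanced features and how long after visual completion those interactions happen, which is the signal needed to separate the two optimization targets.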