
Add instrumentation to submit report page
Closed, Declined | Public | 3 Estimated Story Points

Description

This task involves instrumenting various elements of the submit report page so that we can evaluate how people are engaging with it.

Requirements

We should instrument this such that we know:

  • The type of harassment that is being reported
  • How many people submit a report
  • How many people go back or abandon the flow at this point
  • How long it takes to file a report (from clicking the "report" button to submitting the report)

We should also include

  • metadata (e.g. character count of the submitted report)
  • server-side instrumentation
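
As a rough illustration of the requirements above, a submit event could carry the incident type, the character count, and the time-on-task. This is only a sketch: the class name, field names, and millisecond timestamps are assumptions made for the example, not the schema tracked in T343601.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportSubmitEvent:
    """Hypothetical event record; field names are illustrative only."""
    action: str
    harassment_type: str
    char_count: int
    time_on_task_ms: int


def build_submit_event(opened_at_ms: int, submitted_at_ms: int,
                       harassment_type: str, report_text: str) -> ReportSubmitEvent:
    """Build the event emitted when a report is submitted.

    opened_at_ms is when the "report" button was clicked and
    submitted_at_ms is when the report was submitted.
    """
    if submitted_at_ms < opened_at_ms:
        raise ValueError("submission cannot precede opening the form")
    return ReportSubmitEvent(
        action="submit",
        harassment_type=harassment_type,
        # Only the length is recorded, never the report text itself.
        char_count=len(report_text),
        time_on_task_ms=submitted_at_ms - opened_at_ms,
    )
```

Recording only the length keeps the report text out of the analytics data while still indicating how much the user wrote.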

Event Timeline

Do we also want to include metadata like the character count of the submitted report? Should we also consider reporting the time-on-task (from clicking the "report" button to filing the report, how long did it take)?

I also think we should add server-side instrumentation to the API submission endpoint. The reason for this is that something like 10-20% of users with ad blockers will block transmission of client-side event logging requests. If we emit an event on the server side at the point of submission, we'll have an accurate count of how many users are utilizing the tool.

Agreed. I have updated the description accordingly.

Do we also want to include metadata like the character count of the submitted report? Should we also consider reporting the time-on-task (from clicking the "report" button to filing the report, how long did it take)?

If we do want metadata that isn't already in the list of dimensions in the metrics and instrumentation plan, we should add it there. @kostajh, I know you're quite familiar with the Event Platform, so maybe I'm missing something, but it seems like discussing the dimensions ad-hoc like this will make it hard to create the schema (T343601).

As a side note, I don't really see the value of collecting the character count, although there's no harm in it either.

I also think we should add server-side instrumentation to the API submission endpoint. The reason for this is that something like 10-20% of users with ad blockers will block transmission of client-side event logging requests. If we emit an event on the server side at the point of submission, we'll have an accurate count of how many users are utilizing the tool.

That's a valid point. At first glance, I am neutral about whether this is worth the added complexity. The raw number of events submitted is not one of our success metrics (there is metric 2, the ratio of certain abandoned reports to successful reports, but it's a ratio of two numbers equally affected by ad block). Even among things we want to check just from curiosity, we are mostly interested in ratios (the report completion rate or the proportion of reports for the different types of incidents).
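
To make the ratio argument concrete, here is a small sketch. It assumes ad blockers drop every client-side event type at the same rate; the counts and the 15% loss rate are made up for the example.

```python
import math


def observed_count(true_count: int, loss_rate: float) -> float:
    """Expected number of events that reach the logs when a fraction
    loss_rate of them is blocked on the client."""
    return true_count * (1.0 - loss_rate)


# Made-up numbers, for illustration only.
true_abandoned = 300
true_submitted = 700
loss_rate = 0.15  # assumed to be identical for both event types

seen_abandoned = observed_count(true_abandoned, loss_rate)
seen_submitted = observed_count(true_submitted, loss_rate)

true_ratio = true_abandoned / true_submitted
seen_ratio = seen_abandoned / seen_submitted

# The common factor (1 - loss_rate) cancels, so the ratio is unchanged,
# while the raw submission count is undercounted.
assert math.isclose(true_ratio, seen_ratio)
assert seen_submitted < true_submitted
```

If the loss rate differed between event types, the factor would no longer cancel and the ratio would be biased too.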

I too have an instinctive desire to know the true number of submissions, but as a data scientist I've had to come to terms with the fact that all our numbers are more or less affected by noise (undetected bots in the pageview data, bizarrely suspicious event data that we decide to exclude in the analysis, mysterious validation errors, plain ol' bit flips somewhere along the line, etc.) and that as long as the noise is consistent, it generally doesn't affect the insights.

If we do this, we would still want the client-side instrumentation of report submission, so that we have numbers that are comparable to the other client-side events.

cc @Iflorez; once she's back from leave in mid-August, she will be taking over my role in these discussions.

As a side note, I don't really see the value of collecting the character count, although there's no harm in it either.

On second thought, I do see a bit of value in it, as a rough indicator of how much information the user entered. I've added it to the metrics and instrumentation plan.

I also think we should add server-side instrumentation to the API submission endpoint. The reason for this is that something like 10-20% of users with ad blockers will block transmission of client-side event logging requests. If we emit an event on the server side at the point of submission, we'll have an accurate count of how many users are utilizing the tool.

I discussed this issue with Madalina and for now I think we'll use client-side instrumentation only. We can easily add server-side submit events later if we feel a real need for them.

@Madalina @kostajh
A quick metric update note which impacts instrumentation:
We've adjusted metric 2 and thus have also updated the basic instrumentation to add one more item.

Metric 2
previously: the ratio of certain abandoned reports to successful reports
now: the submission rate of reports that users have started to write

With the language adjustment, measuring metric 2 will now be clearer and more direct; the measurement will pull data points from step 3 and step 4 in the now updated basic instrumentation.
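
A minimal sketch of how the revised metric 2 could be computed from logged events. The action names "start_writing" and "submit" and the per-flow identifier are hypothetical stand-ins for steps 3 and 4 of the basic instrumentation; the real step definitions are in the instrumentation plan and may differ.

```python
from typing import Iterable, Tuple


def submission_rate(events: Iterable[Tuple[str, str]]) -> float:
    """Share of started reports that were eventually submitted.

    Each event is a (flow_id, action) pair. A flow counts as started
    once it logs "start_writing" and as submitted once it logs "submit".
    """
    started = set()
    submitted = set()
    for flow_id, action in events:
        if action == "start_writing":
            started.add(flow_id)
        elif action == "submit":
            submitted.add(flow_id)
    if not started:
        raise ValueError("no started reports in the event data")
    # Only count submissions from flows that also logged a start.
    return len(started & submitted) / len(started)


events = [
    ("a", "start_writing"), ("a", "submit"),
    ("b", "start_writing"),
    ("c", "start_writing"), ("c", "submit"),
    ("d", "open"),  # never started writing, so not in the denominator
]
rate = submission_rate(events)
```

Here three flows started writing and two of them submitted, so the rate is two thirds.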

This is obsolete; new instrumentation tickets will be added.