Under 1% of VirtualPageView events are being dropped because their URIs are too long. We aim to shorten the URL length by removing source_url from the events.
This seems to happen because the encoded source_title is far too long and pushes the request past the Varnish URL size limit. In the log line below, you can see that the event data is truncated before it is finished.
Jun 11 14:05:05 eventlog1002 eventlogging-processor@client-side-06[11969]: 2018-06-11 14:05:05,376 [11969] (MainThread) Unable to process: ?%7B%22event%22%3A%7B%22source_page_id%22%3A346904%2C%22source_namespace%22%3A4%2C%22source_title%22%3A%22%E1%83%95%E1%83%98%E1%83%99%E1%83%98%E1%83%A1_%E1%83%A3%E1%83%A7%E1%83%95%E1%83%90%E1%83%A0%E1%83%A1_%E1%83%AB%E1%83%94%E1%83%92%E1%83%9A%E1%83%94%E1%83%91%E1%83%98%2F%E1%83%AB%E1%83%94%E1%83%92%E1%83%9A%E1%83%94%E1%83%91%E1%83%98%E1%83%A1_%E1%83%A1%E1%83%98%E1%83%90%2F%E1%83%99%E1%83%90%E1%83%AE%E1%83%94%E1%83%97%E1%83%98%2F%E1%83%A1%E1%83%90%E1%83%92%E1%83%90%E1%83%A0%E1%83%94%E1%83%AF%E1%83%9D%E1%83%A1_%E1%83%9B%E1%83%A3%E1%83%9C%E1%83%98%E1%83%AA%E1%83%98%E1%83%9E%E1%83%90%E1%83%9A%E1%83%98%E1%83%A2%E1%83%94%E1%83%A2%E1%83%98%22%2C%22source_url%22%3A%22https%3A%2F%2Fka.wikipedia.org%2Fwiki%2F%25E1%2583%2595%25E1%2583%2598%25E1%2583%2599%25E1%2583%2598%25E1%2583%259E%25E1%2583%2594%25E1%2583%2593%25E1%2583%2598%25E1%2583%2590%3A%25E1%2583%2595%25E1%2583%2598%25E1%2583%2599%25E1%2583%2598%25E1%2583%25A1_%25E1%2583%25A3%25E1%2583%25A7%25E1%2583%2595%25E1%2583%2590%25E1%2583%25A0%25E1%2583%25A1_%25E1%2583%25AB%25E1%2583%2594%25E1%2583%2592%25E1%2583%259A%25E1%2583%2594%25E1%2583%2591%25E1%2583%2598%2F%25E1%2583%25AB%25E1%2583%2594%25E1%2583%2592%25E1%2583%259A%25E1%2583%2594%25E1%2583%2591%25E1%2583%2598%25E1%2583%25A1_%25E1%2583%25A1%25E1%2583%2598%25E1%2583%2590%2F%25E1%2583%2599%25E1%2583%2590%25E1%2583%25AE%25E1%2583%2594%25E1%2583%2597%25E1%2583%2598%2F%25E1%2583%25A1%25E1%2583%2590%25E1%2583%2592%25E1%2583%2590%25E1%2583%25A0%25E1%2583%2594%25E1%2583%25AF%25E1%2583%259D%25E1%2583%25A1_%25E1%2583%259B%25E1%2583%25A3%25E1%2583%259C%25E1%2583%2598%25E1%2583%25AA%25E1%2583%2598%25E1%2583%259E%25E1%2583%2590%25E1%2583%259A%25E1%2583%2598%25E1%2583%25A2%25E1%2583%2594%25E1%2583%25A2%25E1%2583%2598%22%2C%22page_title%22%3A%22%E1%83%92%E1%83%98%E1%83%9D%E1%83%A0%E1%83%92%E1%83%98%E1%83%AC%E1%83%9B%E1%83%98%E1%83%9C%E1%83%93%E1%
83%90%22%2C%22page_id%22%3A35638%2C%22page_namespace%22%3A0%7D%2C%22revision%22%3A17780078%2C%22schema%22%3A%22VirtualPageV
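To make the overflow concrete, here is a small sketch of how percent-encoding inflates a non-ASCII title. In the log line above, source_title is encoded once (each Georgian letter becomes three `%XX` escapes), while source_url already contains escapes that are encoded a second time (`%` becomes `%25`). The function name `encodedLengths` is hypothetical and only illustrates the arithmetic; it is not part of Popups or EventLogging.

```typescript
// Hypothetical helper: compares the raw length of a title with its length
// after one and two rounds of percent-encoding.
function encodedLengths(title: string): { raw: number; once: number; twice: number } {
  const once = encodeURIComponent(title);   // how source_title appears in the beacon
  const twice = encodeURIComponent(once);   // how an already-encoded URL path appears in source_url
  return { raw: title.length, once: once.length, twice: twice.length };
}

// The Georgian letter "ვ" (U+10D5) is three bytes in UTF-8.
const sample = encodedLengths("ვ");
console.log(sample); // { raw: 1, once: 9, twice: 15 }
```

Under these assumptions a single Georgian letter costs 9 characters in source_title and 15 in source_url, so a title of a few dozen letters can consume most of the available URL length on its own.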
Acceptance criteria
- Inside the Popups code, before sending a VirtualPageView event, ensure the source_url field is limited to 1400 characters.
- N.B. In rEPOP4e43f0cf9e94 ("Truncate source_url to max 1000 characters"), we opted to truncate the source_url field to 1000 characters rather than 1400.
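The truncation described above could look roughly like the following. This is a minimal sketch, not the actual Popups change: the names `MAX_SOURCE_URL_LENGTH` and `truncateSourceUrl` are hypothetical, and the limit of 1000 is taken from the N.B. above.

```typescript
// Hypothetical constant; 1000 is the limit mentioned in the N.B. above.
const MAX_SOURCE_URL_LENGTH = 1000;

// Returns a copy of the event whose source_url is at most maxLength characters.
// All other fields are left untouched and the input object is not mutated.
function truncateSourceUrl<T extends { source_url: string }>(
  event: T,
  maxLength: number = MAX_SOURCE_URL_LENGTH
): T {
  return { ...event, source_url: event.source_url.slice(0, maxLength) };
}
```

Note that a plain character cut can split a percent-escape such as `%E1` in half, so consumers of the truncated field should not assume it decodes cleanly.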
Developer notes
The character limit is calculated in https://phabricator.wikimedia.org/T196904#4303878
Sign off steps
- After deployment we'll want to make sure the error rate is negligible. If not, we may want to consider relying on page ids and dropping the title as well (will not be done as part of this task).
- As @mforns says in T196904#4422959, the error rate will never be 0 as other potentially-long properties may cause the encoded event to overflow the limit.
QA steps
On the beta cluster with page previews enabled, please visit https://en.wikipedia.beta.wmflabs.org/wiki/Special:AllPages. When a popup is visible for more than 1 second, a VirtualPageView event should appear in the network tab (https://meta.wikimedia.org/wiki/Schema:VirtualPageView).
Verify that "en.wikipedia.beta.wmflabs.org%2Fwiki%2FSpecial%3AAllPages" appears in the request string
Search for "1ვიკის უყვარს ძეგლები/ძეგლების სია/კახეთი/საგარეჯოს მუნიციპალიტეტი"
Hover over this result and check the above
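For the checks above, it can help to decode the query string copied from the network tab instead of reading the percent-encoded text by eye. The sketch below assumes the query has the shape seen in the log line earlier in this task (a `?` followed by URI-encoded JSON); stripping a trailing `;` is an additional assumption about the beacon format. The helper names `decodeBeaconQuery` and `readSourceUrl` are hypothetical.

```typescript
// Hypothetical helper: decodes "?<URI-encoded JSON>" into an object.
// Throws (URIError or SyntaxError) if the query was cut off mid-event.
function decodeBeaconQuery(query: string): unknown {
  let encoded = query.charAt(0) === "?" ? query.slice(1) : query;
  // Assumption: some beacons end with a ";" terminator, which is not part of the JSON.
  if (encoded.slice(-1) === ";") {
    encoded = encoded.slice(0, -1);
  }
  return JSON.parse(decodeURIComponent(encoded));
}

// Hypothetical helper: returns event.source_url if it is present and a string.
function readSourceUrl(query: string): string | undefined {
  const parsed = decodeBeaconQuery(query) as { event?: { source_url?: unknown } } | null;
  const value = parsed && parsed.event ? parsed.event.source_url : undefined;
  return typeof value === "string" ? value : undefined;
}
```

A query that decodes without throwing and yields the expected source_url indicates the event arrived intact; a thrown error corresponds to the truncated case shown in the log.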
When this is in production, we will look at the error logs to determine whether this is having the desired effect.