Page MenuHomePhabricator

Create views to simplify access to renamed columns on NavigationTiming and Quicksurveys schemas
Closed, ResolvedPublic

Description

With the standardization of the "pageviewToken" column, views could make it easier to query across timespans before/after the renaming occured.

Event Timeline

Gilles created this task.
Gilles renamed this task from Create views to simplify access to renamed columns to Create views to simplify access to renamed columns on NavigationTiming and Quicksurveys schemas.Sep 26 2018, 11:45 AM
Johan subscribed.

Removing tag as I suppose it was kept from the parent task rather than because this deserves being mentioned in Tech News. Please re-add if I'm mistaken.

First try making a view, doesn't include recent additions that haven't been deployed yet (CentralNoticeTiming, CpuBenchmark, deviceMemory). Right now it's only taking care of one column remaining, when there is likely another one coming up for QuickSurveysResponses.

CREATE VIEW IF NOT EXISTS unifiedPerformanceSurvey AS SELECT

CASE
WHEN qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-positive') THEN 1
WHEN qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-negative') THEN -1
END AS response,
nt.event.connectStart AS page_connectStart,
nt.event.connectEnd AS page_connectEnd,
nt.event.domComplete AS page_domComplete,
nt.event.domInteractive AS page_domInteractive,
nt.event.fetchStart AS page_fetchStart,
nt.event.loadEventStart AS page_loadEventStart,
nt.event.loadEventEnd AS page_loadEventEnd,
nt.event.redirecting AS page_redirecting,
nt.event.requestStart AS page_requestStart,
nt.event.responseStart AS page_responseStart,
nt.event.responseEnd AS page_responseEnd,
nt.event.secureConnectionStart AS page_secureConnectionStart,
nt.event.mediaWikiLoadEnd AS page_mediaWikiLoadEnd,
nt.event.firstPaint AS page_firstPaint,
nt.event.transferSize AS page_transferSize,
nt.event.RSI AS page_RUMSpeedIndex,
rt.event.startTime AS topimage_startTime,
rt.event.workerStart AS topimage_workerStart,
rt.event.redirectStart AS topimage_redirectEnd,
rt.event.fetchStart AS topimage_fetchStart,
rt.event.domainLookupStart AS topimage_domainLookupStart,
rt.event.domainLookupEnd AS topimage_domainLookupEnd,
rt.event.connectStart AS topimage_connectStart,
rt.event.secureConnectionStart AS topimage_secureConnectionStart,
rt.event.connectEnd AS topimage_connectEnd,
rt.event.requestStart AS topimage_requestStart,
rt.event.responseStart AS topimage_responseStart,
rt.event.responseEnd AS topimage_responseEnd,
rt.event.encodedBodySize AS topimage_encodedBodySize,
rt.event.decodedBodySize AS topimage_decodedBodySize,
rt.event.transferSize AS topimage_transferSize,
qsi.event.performanceNow - nt.event.loadEventEnd AS survey_viewTime,
qsr.event.editCountBucket AS user_editCountBucket,
nt.event.netinfoEffectiveConnectionType AS effectiveConnectionType,
nt.event.originCountry AS country,
nt.useragent.browser_family AS browserFamily,
nt.useragent.browser_major AS browserMajor,
nt.useragent.browser_minor AS browserMinor,
nt.useragent.device_family AS deviceFamily,
nt.useragent.os_family AS osFamily,
nt.useragent.os_major AS osMajor,
nt.useragent.os_minor AS osMinor,
nt.ip AS ip,
nt.recvfrom AS recvfrom,
nt.wiki AS wiki

FROM event.quicksurveysresponses AS qsr

LEFT OUTER JOIN event.resourcetiming rt ON qsr.event.surveyInstanceToken = rt.event.pageviewToken AND rt.year = 2018

INNER JOIN event.quicksurveyinitiation qsi ON qsr.event.surveyInstanceToken = qsi.event.surveyInstanceToken

INNER JOIN event.navigationtiming nt ON qsr.event.surveyInstanceToken = nt.event.stickyRandomSessionId

WHERE qsr.year = 2018 AND qsi.year = 2018 AND nt.year = 2018
AND qsr.event.surveyCodeName = "perceived-performance-survey"
AND qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-positive', 'ext-quicksurveys-example-internal-survey-answer-negative')
AND qsr.event.namespaceid = 0
AND qsi.event.performanceNow - nt.event.loadEventEnd < 10000

UNION ALL

SELECT

CASE
WHEN qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-positive') THEN 1
WHEN qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-negative') THEN -1
END AS response,
nt.event.connectStart AS page_connectStart,
nt.event.connectEnd AS page_connectEnd,
nt.event.domComplete AS page_domComplete,
nt.event.domInteractive AS page_domInteractive,
nt.event.fetchStart AS page_fetchStart,
nt.event.loadEventStart AS page_loadEventStart,
nt.event.loadEventEnd AS page_loadEventEnd,
nt.event.redirecting AS page_redirecting,
nt.event.requestStart AS page_requestStart,
nt.event.responseStart AS page_responseStart,
nt.event.responseEnd AS page_responseEnd,
nt.event.secureConnectionStart AS page_secureConnectionStart,
nt.event.mediaWikiLoadEnd AS page_mediaWikiLoadEnd,
nt.event.firstPaint AS page_firstPaint,
nt.event.transferSize AS page_transferSize,
nt.event.RSI AS page_RUMSpeedIndex,
rt.event.startTime AS topimage_startTime,
rt.event.workerStart AS topimage_workerStart,
rt.event.redirectStart AS topimage_redirectEnd,
rt.event.fetchStart AS topimage_fetchStart,
rt.event.domainLookupStart AS topimage_domainLookupStart,
rt.event.domainLookupEnd AS topimage_domainLookupEnd,
rt.event.connectStart AS topimage_connectStart,
rt.event.secureConnectionStart AS topimage_secureConnectionStart,
rt.event.connectEnd AS topimage_connectEnd,
rt.event.requestStart AS topimage_requestStart,
rt.event.responseStart AS topimage_responseStart,
rt.event.responseEnd AS topimage_responseEnd,
rt.event.encodedBodySize AS topimage_encodedBodySize,
rt.event.decodedBodySize AS topimage_decodedBodySize,
rt.event.transferSize AS topimage_transferSize,
qsi.event.performanceNow - nt.event.loadEventEnd AS survey_viewTime,
qsr.event.editCountBucket AS user_editCountBucket,
nt.event.netinfoEffectiveConnectionType AS effectiveConnectionType,
nt.event.originCountry AS country,
nt.useragent.browser_family AS browserFamily,
nt.useragent.browser_major AS browserMajor,
nt.useragent.browser_minor AS browserMinor,
nt.useragent.device_family AS deviceFamily,
nt.useragent.os_family AS osFamily,
nt.useragent.os_major AS osMajor,
nt.useragent.os_minor AS osMinor,
nt.ip AS ip,
nt.recvfrom AS recvfrom,
nt.wiki AS wiki

FROM event.quicksurveysresponses AS qsr

LEFT OUTER JOIN event.resourcetiming rt ON qsr.event.surveyInstanceToken = rt.event.pageviewToken AND rt.year = 2018

INNER JOIN event.quicksurveyinitiation qsi ON qsr.event.surveyInstanceToken = qsi.event.surveyInstanceToken

INNER JOIN event.navigationtiming nt ON qsr.event.surveyInstanceToken = nt.event.pageviewToken

WHERE qsr.year = 2018 AND qsi.year = 2018 AND nt.year = 2018
AND qsr.event.surveyCodeName = "perceived-performance-survey"
AND qsr.event.surveyResponseValue IN ('ext-quicksurveys-example-internal-survey-answer-positive', 'ext-quicksurveys-example-internal-survey-answer-negative')
AND qsr.event.namespaceid = 0
AND qsi.event.performanceNow - nt.event.loadEventEnd < 10000;

It's already extremely slow to query it (35 minutes per query!).

At this point, given the data size (10s of thousands of rows), it seems like I'm better off exporting the rows of each table of interest, and then doing the joins offline, in an sqlite database for example.