As part of the webrequest refine process, we need to calculate the tags (e.g. "portal", "wikidata") that will later be used to split the webrequest dataset into smaller, more queryable sets.
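A minimal sketch of the tagging idea, in Python for illustration only. The actual change lives in analytics/refinery and is not reproduced here; the function name `compute_tags`, the `TAG_RULES` table, and the host-matching rules are all assumptions made for this example, not the rules from the patch.

```python
from typing import Callable, Dict, List

# Hypothetical rules: each tag maps to a predicate over one request record.
# The host values are illustrative assumptions, not the refinery's real rules.
TAG_RULES: Dict[str, Callable[[Dict[str, str]], bool]] = {
    "portal": lambda req: req.get("uri_host") == "www.wikipedia.org",
    "wikidata": lambda req: req.get("uri_host") == "query.wikidata.org",
}


def compute_tags(request: Dict[str, str]) -> List[str]:
    """Return the sorted list of tags whose rule matches the request."""
    return sorted(tag for tag, rule in TAG_RULES.items() if rule(request))
```

A request can match zero, one, or several rules, which is why the result is a list rather than a single value.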
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Add tagging as part of webrequest refine process | analytics/refinery | master | +4 -2 |
Status | Assigned | Task
---|---|---
Duplicate | None | T143819 Data request for logs from SparQL interface at query.wikidata.org
Declined | None | T169798 Create UDFs for analyzing SPARQL queries
Declined | None | T164019 Webrequest tagging and distribution. Measuring non-pageview requests
Resolved | Nuria | T171760 Add tagging to webrequest refine process
Event Timeline
The ALTER statement we need to run:
ALTER TABLE webrequest ADD COLUMNS (tags array<string> COMMENT 'List containing tags qualifying the request, ex: [portal, wikidata]. Will be used to split webrequest into smaller subsets.');
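To illustrate how an array-valued `tags` column could later drive the split into smaller subsets, here is a small Python sketch. Rows are modeled as plain dictionaries, and `split_by_tag` is a hypothetical helper written for this example; it is not part of refinery.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_by_tag(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group rows by each tag they carry.

    A row with several tags lands in several groups; a row with no tags
    (missing, None, or empty list) lands in none.
    """
    subsets: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        for tag in row.get("tags") or []:
            subsets[tag].append(row)
    return dict(subsets)
```

Because the column is an array, one request may belong to more than one subset, so the subsets are not guaranteed to be disjoint.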
Change 367940 had a related patch set uploaded (by Nuria; owner: Nuria):
[analytics/refinery@master] Add tagging as part of webrequest refine process
Tested this code with some fake inserts on 1002. I will test with a bit more data, since I only used 1 hour.
Change 367940 merged by Joal:
[analytics/refinery@master] Add tagging as part of webrequest refine process