JAllemandou (joal)
Data Engineer

User Details

User Since
Feb 11 2015, 6:02 PM (158 w, 1 d)
Availability
Available
IRC Nick
joal
LDAP User
Unknown
MediaWiki User
JAllemandou (WMF)

Recent Activity

Today

JAllemandou updated the task description for T159962: Spark 2.2.1 as cluster default (working with oozie).
Fri, Feb 23, 8:12 AM · Analytics-Kanban

Yesterday

JAllemandou renamed T159962: Spark 2.2.1 as cluster default (working with oozie) from Spark 2.x as cluster default (working with oozie) to Spark 2.2.1 as cluster default (working with oozie).
Thu, Feb 22, 10:25 AM · Analytics-Kanban
JAllemandou moved T159962: Spark 2.2.1 as cluster default (working with oozie) from Next Up to In Progress on the Analytics-Kanban board.
Thu, Feb 22, 10:25 AM · Analytics-Kanban
JAllemandou claimed T159962: Spark 2.2.1 as cluster default (working with oozie).
Thu, Feb 22, 10:25 AM · Analytics-Kanban
JAllemandou added a comment to T159962: Spark 2.2.1 as cluster default (working with oozie).

Can we try this: https://community.hortonworks.com/questions/114243/oozie-spark2-compatibility.html ?

Thu, Feb 22, 10:24 AM · Analytics-Kanban
JAllemandou renamed T159962: Spark 2.2.1 as cluster default (working with oozie) from Spike: Spark 2.x as cluster default (working with oozie) to Spark 2.x as cluster default (working with oozie).
Thu, Feb 22, 10:24 AM · Analytics-Kanban

Tue, Feb 20

JAllemandou added a comment to T184095: Understand Android app monthly active users and daily active users.

Some ideas for improvement:

  • Use the Parquet file format instead of the default Hive format
  • Store data daily instead of hourly (~10 MB per hour in Hive format, meaning ~250 MB per day, less after Parquet compaction, so this should be fine)
  • Add a request_count instead of keeping distincts
  • Prevent errors when appending data by using OVERWRITE
  • Enforce the number of files Hive outputs (the default is far too many, therefore small, therefore inefficient)
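
The ideas above could be combined roughly as in the sketch below. It is only an illustration, not the updated code this comment refers to: the table names, the `app_install_id` column and the year/month/day partition layout are hypothetical, and the target table is assumed to have been created with `STORED AS PARQUET`. The Python helper only renders a HiveQL statement as text.

```python
def build_daily_insert(source_table, target_table, year, month, day, reducers=1):
    """Render HiveQL that rewrites one daily partition.

    All table and column names are hypothetical placeholders.
    """
    partition = f"year = {year}, month = {month}, day = {day}"
    where = f"year = {year} AND month = {month} AND day = {day}"
    lines = [
        # Cap the reducers: the GROUP BY below adds a reduce stage, so the
        # job should write at most this many files.
        f"SET mapreduce.job.reduces={reducers};",
        # OVERWRITE replaces the partition, so a rerun does not append duplicates.
        f"INSERT OVERWRITE TABLE {target_table}",
        f"PARTITION ({partition})",
        "SELECT",
        "    app_install_id,",
        # Keep a count per key instead of deduplicating rows with DISTINCT.
        "    COUNT(1) AS request_count",
        f"FROM {source_table}",
        f"WHERE {where}",
        "GROUP BY app_install_id;",
    ]
    return "\n".join(lines)


print(build_daily_insert("source_db.hourly_requests", "target_db.daily_requests", 2018, 2, 20))
```

Writing one partition per day with a fixed reducer count is what keeps the files few and reasonably large; the exact count would need tuning against the real data volume.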

Updated version of the code below:

Tue, Feb 20, 7:47 PM · Discovery-Analysis (Current work)
JAllemandou added a comment to T187806: Beta: Provide easier way of accessing metrics as defined in Wikistats 1.

@Lydia: You should split by editor type. The editors you are talking about are, I think, what we call "registered-users" editors in Wikistats 2.
Please let us know if I'm wrong!

Tue, Feb 20, 5:00 PM · Analytics, Analytics-Wikistats

Mon, Feb 19

JAllemandou created T187723: Give 'sudo -u yarn' access to joal on analytics-hadoop-workers nodes.
Mon, Feb 19, 1:53 PM · Patch-For-Review, Ops-Access-Requests, Operations
JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

@bmansurov No worries :) The whole point of these two things is to work for 'every' wiki :)

Mon, Feb 19, 12:26 PM · Analytics
JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

@bmansurov and @diego : Data is available up to and including 2018-01 at hdfs:///user/joal/wmf/data/wmf/mediawiki/wikitext/snapshot=2018-01.
I think we're not going to put more effort into productionization for now, but new imports can be done.

Mon, Feb 19, 12:22 PM · Analytics
JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

@bmansurov : There is an example command line in the header-comment of the XmlConverter file.
Little reminder: these two patches deal with huge datasets (2TB of bz2 compressed XML and 18TB of snappy compressed parquet). My wish is really for them to be productionized so that the data they import/compute is not duplicated.

Mon, Feb 19, 12:01 PM · Analytics
JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

Took longer than I remembered, but it's done: /user/joal/wmf/data/wmf/mediawiki/wikitext/snapshot=2018-01
Looks like the patches work :)

Mon, Feb 19, 8:32 AM · Analytics

Thu, Feb 15

JAllemandou added a comment to T186602: Monitor and alert if no new data from JsonRefine jobs.

After more thought, it looks like the current need is only a cron check that new data flows in regularly, with an email if it does not.
Accumulators and execution reports might come in a second round (after using spark2 and having better understood some of its benefits).
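
A cron-style freshness check of this kind could look like the sketch below. It is an illustration under assumptions, not the team's job: how the timestamp of the latest data is obtained and how the email is sent are left out, and the function name, dataset name and two-hour threshold are made up.

```python
from datetime import datetime, timedelta
from typing import Optional


def stale_data_alert(dataset: str, latest_data: datetime, now: datetime,
                     max_delay: timedelta) -> Optional[str]:
    """Return an alert message if no data arrived within max_delay, else None."""
    lag = now - latest_data
    if lag <= max_delay:
        return None
    return (f"No new data for {dataset} since {latest_data.isoformat()} "
            f"(lag: {lag}, allowed: {max_delay})")


# Example run, as a cron job would do it; the values are invented.
now = datetime(2018, 2, 15, 21, 0)
message = stale_data_alert("example_dataset", datetime(2018, 2, 15, 16, 0),
                           now, timedelta(hours=2))
if message is not None:
    print(message)  # a real job would email this instead of printing it
```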

Thu, Feb 15, 9:22 PM · Patch-For-Review, Analytics-Kanban, Analytics-EventLogging
JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

@bmansurov : This doc is a bit outdated. It should work, but there is better tooling now.

Thu, Feb 15, 8:38 PM · Analytics
JAllemandou added a comment to T177965: Beta Release: Resiliency, Rollback and Deployment of Data.

First round of discussion with the team:

  • Things we agree on:
    • using multiple datasources in druid (snapshots) seems the way to go to facilitate rollbacks (naming convention could follow our snapshots: YYYY-MM)
    • Data quality checks using old/new datasources in druid also seem interesting, for both data quality and cache warming.
  • Thing still to be discussed: How do we swap from the old datasource to the new datasource in AQS when we think it's ready (or the other way around when we roll back)? Multiple ideas:
    • Use cassandra as a key/value config store (no deploy needed, change can be pushed via API, but we would be using a "data" store for config)
    • Use a dedicated file with its own repo (deploy needed)
    • Use another conf system (etcd, zookeeper...) -- Again another tool ....

TBD !
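
Whichever store ends up holding the setting, the swap and rollback discussed above amount to moving a pointer between snapshot-named datasources. Here is a minimal in-memory sketch of that idea; the class, the `example_datasource` base name and the underscore naming are all hypothetical, and a plain list stands in for cassandra, a config file, or etcd/zookeeper.

```python
import re
from typing import List, Optional


class DatasourceSwitch:
    """Tracks which snapshot datasource is currently served (hypothetical helper)."""

    def __init__(self, base_name: str) -> None:
        self.base_name = base_name
        self._history: List[str] = []  # promoted datasources, oldest first

    def name_for(self, snapshot: str) -> str:
        """Build a datasource name from a YYYY-MM snapshot identifier."""
        if not re.fullmatch(r"\d{4}-\d{2}", snapshot):
            raise ValueError(f"snapshot must look like YYYY-MM, got {snapshot!r}")
        return f"{self.base_name}_{snapshot.replace('-', '_')}"

    def promote(self, snapshot: str) -> None:
        """Start serving the datasource for this snapshot."""
        self._history.append(self.name_for(snapshot))

    def rollback(self) -> None:
        """Go back to the previously served datasource."""
        if len(self._history) < 2:
            raise RuntimeError("no previous datasource to roll back to")
        self._history.pop()

    @property
    def current(self) -> Optional[str]:
        return self._history[-1] if self._history else None


switch = DatasourceSwitch("example_datasource")
switch.promote("2018-01")
switch.promote("2018-02")
switch.rollback()
print(switch.current)  # example_datasource_2018_01
```

Keeping the older datasource loaded is what makes the rollback a pointer change rather than a reload.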

Thu, Feb 15, 6:35 PM · Analytics, Analytics-Wikistats
JAllemandou moved T179976: Create scala-spark job to ingest simple data sets from Hive-EventLogging to Druid to Pivot from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Analytics-Kanban
JAllemandou moved T185100: Make banner-activity success file cleaner not fail when there's nothing to be cleaned from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186155: Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban, Analytics-EventLogging
JAllemandou moved T186541: Make sqoop cron job report errors if success flags are not written from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186542: Make sqoop python code write success flags for each table that's fully imported for all wikis from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T167907: Incorporate data from the GeoIP2 ISP database to webrequest from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T183682: Hindi Wikiversity is not showing in Wikimedia Stats from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Hindi-Sites
JAllemandou moved T186180: Move non-critical monthly jobs to the nice queue from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou moved T171099: CamusPartitionChecker does not work when topic names have '.' or '-' in them. from Ready to Deploy to Done on the Analytics-Kanban board.
Thu, Feb 15, 5:01 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou moved T171099: CamusPartitionChecker does not work when topic names have '.' or '-' in them. from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Thu, Feb 15, 10:41 AM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou moved T186180: Move non-critical monthly jobs to the nice queue from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Thu, Feb 15, 10:40 AM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster

Mon, Feb 12

mpopov awarded T186180: Move non-critical monthly jobs to the nice queue a Like token.
Mon, Feb 12, 5:47 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou moved T186180: Move non-critical monthly jobs to the nice queue from Next Up to In Code Review on the Analytics-Kanban board.
Mon, Feb 12, 5:40 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou edited projects for T186180: Move non-critical monthly jobs to the nice queue, added: Analytics-Kanban; removed Analytics.
Mon, Feb 12, 5:40 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou renamed T186180: Move non-critical monthly jobs to the nice queue from Move non-critical monthly jobs to the nice queue to Move Clickstream job to later in the month.
Mon, Feb 12, 5:39 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou moved T186542: Make sqoop python code write success flags for each table that's fully imported for all wikis from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Mon, Feb 12, 1:15 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186541: Make sqoop cron job report errors if success flags are not written from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Mon, Feb 12, 1:15 PM · Patch-For-Review, Analytics-Kanban

Fri, Feb 9

JAllemandou added a comment to T186559: Upload XML dumps to hdfs.

@bmansurov: The last snapshot I made was at the beginning of 2017-06 (named 2017-05, since the last full month is May 2017). It's available in two formats:

  • hdfs:///user/joal/wmf/data/raw/mediawiki/xmldumps/20170601 in XML (files are stored in one folder per wiki, formatted as Hive partitions)
  • hdfs:///user/joal/wmf/data/wmf/mediawiki/wikitext/snapshot=2017-05 in Parquet (same here: files are stored in folders so they are accessible as partitions).
Fri, Feb 9, 8:11 AM · Analytics

Thu, Feb 8

JAllemandou moved T186541: Make sqoop cron job report errors if success flags are not written from Ready to Deploy to In Code Review on the Analytics-Kanban board.
Thu, Feb 8, 9:39 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186541: Make sqoop cron job report errors if success flags are not written from Next Up to Ready to Deploy on the Analytics-Kanban board.
Thu, Feb 8, 9:39 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186542: Make sqoop python code write success flags for each table that's fully imported for all wikis from Next Up to In Code Review on the Analytics-Kanban board.
Thu, Feb 8, 9:39 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T186155: Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Thu, Feb 8, 9:39 AM · Patch-For-Review, Analytics-Kanban, Analytics-EventLogging
JAllemandou moved T186155: Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF from Next Up to In Code Review on the Analytics-Kanban board.
Thu, Feb 8, 9:39 AM · Patch-For-Review, Analytics-Kanban, Analytics-EventLogging
JAllemandou set the point value for T186542: Make sqoop python code write success flags for each table that's fully imported for all wikis to 3.
Thu, Feb 8, 9:38 AM · Patch-For-Review, Analytics-Kanban
JAllemandou set the point value for T186541: Make sqoop cron job report errors if success flags are not written to 3.
Thu, Feb 8, 9:38 AM · Patch-For-Review, Analytics-Kanban

Tue, Feb 6

JAllemandou claimed T186155: Provide MediaWiki timestamps in Hive-refined EventLogging tables via UDF.
Tue, Feb 6, 6:48 PM · Patch-For-Review, Analytics-Kanban, Analytics-EventLogging

Mon, Feb 5

JAllemandou moved T185419: Mediacounts missing top1000 files after 2018-01-01 from Next Up to Paused on the Analytics-Kanban board.
Mon, Feb 5, 1:12 PM · Patch-For-Review, Analytics-Kanban, Datasets-Webstatscollector, Datasets-Archiving, Analytics-Cluster

Wed, Jan 24

JAllemandou added a comment to T185419: Mediacounts missing top1000 files after 2018-01-01.

Fixed as of 2018-01-23.
@ezachte : Could you launch a backfill of 2018-01-01 to 2018-01-22 ?
Many thanks !

Wed, Jan 24, 6:55 PM · Patch-For-Review, Analytics-Kanban, Datasets-Webstatscollector, Datasets-Archiving, Analytics-Cluster

Jan 23 2018

JAllemandou moved T171099: CamusPartitionChecker does not work when topic names have '.' or '-' in them. from Next Up to In Code Review on the Analytics-Kanban board.
Jan 23 2018, 1:53 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou claimed T171099: CamusPartitionChecker does not work when topic names have '.' or '-' in them..
Jan 23 2018, 1:52 PM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou added a comment to T171099: CamusPartitionChecker does not work when topic names have '.' or '-' in them..

Found the precise line: https://github.com/wikimedia/analytics-camus/blob/master/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/partitioner/DefaultPartitioner.java#L67
Will patch partition checker accordingly.

Jan 23 2018, 11:41 AM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster
JAllemandou added a comment to T184011: Confusing abbreviation on Wikistats 2.0 Alpha.

@Pine: We discussed this naming convention within the team a while ago, and decided to go for "g" instead of "b" for internationalization reasons.
"b" stands for billion, which is very English-centric. In French, for instance, the word billion stands for a million of millions (see https://en.wikipedia.org/wiki/Billion for more details).
To mitigate this issue, we decided to use the prefixes for orders of magnitude (see https://en.wikipedia.org/wiki/Order_of_magnitude#Uses), where no such difference exists.
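
As a rough illustration of that convention (not the Wikistats code), a number can be abbreviated with magnitude prefixes as sketched below. Only the "g" suffix comes from the discussion above; the lowercase "m" and "k" suffixes and the function name are assumptions.

```python
def abbreviate(value: float) -> str:
    """Abbreviate a count with a magnitude suffix: k (1e3), m (1e6), g (1e9)."""
    for threshold, suffix in ((1e9, "g"), (1e6, "m"), (1e3, "k")):
        if abs(value) >= threshold:
            # ':g' drops trailing zeros, so 8.0 is rendered as '8'.
            return f"{value / threshold:g}{suffix}"
    return f"{value:g}"


print(abbreviate(8_000_000_000))  # 8g
print(abbreviate(1500))           # 1.5k
```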

Jan 23 2018, 9:15 AM · Analytics-Wikistats, Analytics

Jan 22 2018

JAllemandou moved T185344: Add wikis to clickstream generation from In Code Review to Done on the Analytics-Kanban board.
Jan 22 2018, 5:40 PM · Patch-For-Review, Analytics-Kanban

Jan 19 2018

JAllemandou moved T185344: Add wikis to clickstream generation from Next Up to In Code Review on the Analytics-Kanban board.
Jan 19 2018, 8:30 PM · Patch-For-Review, Analytics-Kanban
JAllemandou renamed T185344: Add wikis to clickstream generation from Add french wiki to clickstream generation to Add wikis to clickstream generation.
Jan 19 2018, 8:16 PM · Patch-For-Review, Analytics-Kanban
JAllemandou created T185344: Add wikis to clickstream generation.
Jan 19 2018, 8:15 PM · Patch-For-Review, Analytics-Kanban
JAllemandou merged T160822: Filter local IPs before checking for geo info into T167907: Incorporate data from the GeoIP2 ISP database to webrequest.
Jan 19 2018, 5:04 PM · Patch-For-Review, Analytics-Kanban
JAllemandou merged task T160822: Filter local IPs before checking for geo info into T167907: Incorporate data from the GeoIP2 ISP database to webrequest.
Jan 19 2018, 5:04 PM · Analytics-Kanban, Analytics-Cluster
JAllemandou set the point value for T167907: Incorporate data from the GeoIP2 ISP database to webrequest to 8.
Jan 19 2018, 5:03 PM · Patch-For-Review, Analytics-Kanban

Jan 18 2018

JAllemandou moved T167907: Incorporate data from the GeoIP2 ISP database to webrequest from In Progress to In Code Review on the Analytics-Kanban board.
Jan 18 2018, 7:33 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T184541: Update AQS pageview-top definition from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Jan 18 2018, 7:33 PM · Services (done), Patch-For-Review, Analytics-Kanban, RESTBase-API
JAllemandou moved T163933: Investigate oozie suspended workflows from Ready to Deploy to Done on the Analytics-Kanban board.
Jan 18 2018, 7:32 PM · Analytics-Kanban

Jan 12 2018

JAllemandou added a comment to T177965: Beta Release: Resiliency, Rollback and Deployment of Data.
Jan 12 2018, 7:31 PM · Analytics, Analytics-Wikistats
faidon awarded T167907: Incorporate data from the GeoIP2 ISP database to webrequest a Love token.
Jan 12 2018, 3:59 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T167907: Incorporate data from the GeoIP2 ISP database to webrequest from Next Up to In Progress on the Analytics-Kanban board.
Jan 12 2018, 12:58 PM · Patch-For-Review, Analytics-Kanban
JAllemandou claimed T167907: Incorporate data from the GeoIP2 ISP database to webrequest.
Jan 12 2018, 12:58 PM · Patch-For-Review, Analytics-Kanban
JAllemandou edited projects for T167907: Incorporate data from the GeoIP2 ISP database to webrequest, added: Analytics-Kanban; removed Analytics.
Jan 12 2018, 12:57 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T160822: Filter local IPs before checking for geo info from Next Up to In Code Review on the Analytics-Kanban board.
Jan 12 2018, 12:56 PM · Analytics-Kanban, Analytics-Cluster
JAllemandou moved T177965: Beta Release: Resiliency, Rollback and Deployment of Data from Next Up to In Progress on the Analytics-Kanban board.
Jan 12 2018, 10:05 AM · Analytics, Analytics-Wikistats
JAllemandou claimed T177965: Beta Release: Resiliency, Rollback and Deployment of Data.
Jan 12 2018, 10:04 AM · Analytics, Analytics-Wikistats
JAllemandou added a comment to T177965: Beta Release: Resiliency, Rollback and Deployment of Data.

There are plenty of different possible ways here. Listing the two that make the most sense to me:

Jan 12 2018, 10:04 AM · Analytics, Analytics-Wikistats
JAllemandou moved T163933: Investigate oozie suspended workflows from In Progress to Ready to Deploy on the Analytics-Kanban board.
Jan 12 2018, 9:13 AM · Analytics-Kanban
JAllemandou moved T184541: Update AQS pageview-top definition from Next Up to In Code Review on the Analytics-Kanban board.
Jan 12 2018, 8:37 AM · Services (done), Patch-For-Review, Analytics-Kanban, RESTBase-API
JAllemandou claimed T184541: Update AQS pageview-top definition.
Jan 12 2018, 8:37 AM · Services (done), Patch-For-Review, Analytics-Kanban, RESTBase-API
JAllemandou added a comment to T184541: Update AQS pageview-top definition.

Also submitted a PR to restbase: https://github.com/wikimedia/restbase/pull/941

Jan 12 2018, 8:36 AM · Services (done), Patch-For-Review, Analytics-Kanban, RESTBase-API

Jan 11 2018

JAllemandou closed T183951: Document mediawiki history reduced table as Declined.
Jan 11 2018, 8:47 PM · Analytics-Kanban
JAllemandou moved T163933: Investigate oozie suspended workflows from Next Up to In Progress on the Analytics-Kanban board.
Jan 11 2018, 8:45 PM · Analytics-Kanban
JAllemandou moved T176983: Productionize streaming jobs from In Progress to Done on the Analytics-Kanban board.
Jan 11 2018, 11:44 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T168550: Make tranquility work with Spark from In Code Review to Done on the Analytics-Kanban board.
Jan 11 2018, 11:44 AM · Analytics-Kanban, Patch-For-Review, Analytics-Cluster

Jan 9 2018

JAllemandou created T184541: Update AQS pageview-top definition.
Jan 9 2018, 5:39 PM · Services (done), Patch-For-Review, Analytics-Kanban, RESTBase-API

Jan 5 2018

JAllemandou added a comment to T143819: Data request for logs from SparQL interface at query.wikidata.org.

@Nuria , @Smalyshev : Given that all wikidata-query tagged rows belong in misc, which is super small, I have no objection to running jobs either hourly or daily.

Jan 5 2018, 4:51 PM · Analytics, Discovery, Wikidata-Query-Service, Wikidata

Jan 4 2018

JAllemandou added a comment to T183263: "Thank You" campaign .

Also, it's a bit of a mystery to me how do the banner impression data get to Pivot, if they're not found in the wmf.webrequest Hive table first? They should be extracted from that table, then sent to Druid, and only then served to Pivot, if I understand the process correctly? @JAllemandou?

Jan 4 2018, 7:12 PM · WMDE-Fundraising-Funban-2, WMDE-Fundraising-Tech, WMDE-Fun-Team
JAllemandou merged task T183975: continue to improve computation for pages, deletion/restores into T179692: Enhance mediawiki-history page reconstruction with best historical information possible.
Jan 4 2018, 6:05 PM · Analytics
JAllemandou merged T183975: continue to improve computation for pages, deletion/restores into T179692: Enhance mediawiki-history page reconstruction with best historical information possible.
Jan 4 2018, 6:05 PM · Analytics
JAllemandou moved T179692: Enhance mediawiki-history page reconstruction with best historical information possible from Q4 (April 2018) to Q3 (january 2018) on the Analytics board.
Jan 4 2018, 6:04 PM · Analytics

Jan 3 2018

JAllemandou added a comment to T184011: Confusing abbreviation on Wikistats 2.0 Alpha.

Hi!
Eyeballing pageviews for ENWP over the past 2 years (2016 and 2017) in the original wikistats and in wikistats v2, they seem to be coherent: ~8g pageviews per month (8 billion, g standing for giga == billion). And 8 * 23 = 184g, which is not far from 176.
Can you tell us more precisely which discrepancies you saw in ENWP?
Thanks !

Jan 3 2018, 2:05 PM · Analytics-Wikistats, Analytics
JAllemandou updated the task description for T181703: Implement digest-only mediawiki_history_reduced dataset in spark.
Jan 3 2018, 12:44 PM · Analytics
JAllemandou updated the task description for T179692: Enhance mediawiki-history page reconstruction with best historical information possible.
Jan 3 2018, 12:39 PM · Analytics
JAllemandou updated subscribers of T183951: Document mediawiki history reduced table.

Actually I made a mistake yesterday: this table is not available in hive. It is temporarily created, loaded into Druid, then deleted. I don't think documentation is needed the way it would be if the table were available. Do you agree @Nuria and @Milimetric ?

Jan 3 2018, 9:02 AM · Analytics-Kanban

Dec 21 2017

JAllemandou added a comment to T183188: Link to 'more info' doesn't always work.

Metric name links and 'More info' links only work for the Reading section. The other sections don't have pages as of now.
Do we create a section in our wiki with a page for each metric? Or do we gather all of them in a specific page?

Dec 21 2017, 1:20 PM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats

Dec 19 2017

JAllemandou added a comment to T183208: New page stats are inaccurate for fawiki.

Hi @Huji,
Thanks for your ticket, it is very clear and well documented :)
I'll try to give you answers to some of the things you pointed out:

  • We know about the chart misalignment (T182817), we will work on correcting this (this is a real bug !).
  • The data mismatch with https://stats.wikimedia.org/EN/ChartsWikipediaFA.htm is because stats.wikimedia.org doesn't include non-content pages in its stats. You can tick the checkbox for splitting by page type, untick the 'non-content' box, and the values should match the ones in wikistats a lot more closely :)
  • The difference in page numbers you observe is due to redirects: the new-page metric (as well as the edited-pages one) doesn't include redirect pages. I have checked in the databases: for fa-wiki, our last import had ~3.8M pages, including ~1.5M redirects (leaving ~2.3M pages with text, whether in content or non-content namespaces).
Dec 19 2017, 1:59 PM · Analytics, Analytics-Wikistats

Dec 18 2017

JAllemandou moved T178478: Check data from new API endpoints against existing sources from Paused to Done on the Analytics-Kanban board.
Dec 18 2017, 2:24 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T175844: Provide oozie job running ClickStream spark job regularly from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 18 2017, 2:24 PM · Patch-For-Review, Analytics-Kanban

Dec 15 2017

JAllemandou added a comment to T182954: Wikistats Bug : wrong data in Top viewed articles (about frwiki).

Hi @Manu1400
Thanks for this ticket.
As most data in wikistats-v2 is updated monthly, we decided to show only the last month's top pageviews.
This might however change in the future :)

Dec 15 2017, 2:27 PM · Pageviews-API, Analytics

Dec 14 2017

JAllemandou added a comment to T182859: Wikistats 2.0 Bug at swwiki .

a history page 0.2 instead of 2.0?

I don't understand this bit ...
The time selector works well on any other page (with charts). The "top" page is special in that it doesn't show time, just a list - I assume this is why.
You can however trust the other charts :)

Dec 14 2017, 12:13 PM · Analytics, Analytics-Wikistats
JAllemandou added a comment to T182859: Wikistats 2.0 Bug at swwiki .

Thanks @Kipala :), good catch ! Seems related to the dates of the data. In Wikistats 2, data is pulled from 2015-10. The time selector at the back doesn't actually mean anything for that specific page ...

Dec 14 2017, 11:31 AM · Analytics, Analytics-Wikistats

Dec 12 2017

JAllemandou added a comment to T166689: Productionize Superset .

@Ottomata : Super cool ! Many thanks :)

Dec 12 2017, 10:13 AM · Analytics-Kanban, Patch-For-Review

Dec 10 2017

JAllemandou moved T179689: Rename historical fields in mediawiki-history from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 10 2017, 10:30 AM · Analytics-Kanban
JAllemandou moved T179074: Fix mediawiki history page reconstruction bug (similar timestamps) from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 10 2017, 10:30 AM · Analytics-Kanban
JAllemandou moved T179690: Fix mediawiki-history page reconstruction bug (deletes and restores) - simple patch from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 10 2017, 10:29 AM · Analytics-Kanban
JAllemandou moved T178504: Update mediawiki_history_reduced oozie job loading AQS druid backend from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 10 2017, 10:29 AM · Analytics-Kanban

Dec 8 2017

JAllemandou added a comment to T176785: Add action api counts to graphite-restbase job.

@Pchelolo: It has indeed happened.
The task has been moved to done on our kanban, we'll resolve it after we finalize the discussion :)
Thanks !

Dec 8 2017, 10:38 AM · Patch-For-Review, Services (watching), Analytics-Kanban

Dec 7 2017

JAllemandou moved T175844: Provide oozie job running ClickStream spark job regularly from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Dec 7 2017, 9:10 PM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T176785: Add action api counts to graphite-restbase job from Ready to Deploy to Done on the Analytics-Kanban board.
Dec 7 2017, 8:49 PM · Patch-For-Review, Services (watching), Analytics-Kanban