JAllemandou (joal)
Data Engineer

User Details

User Since
Feb 11 2015, 6:02 PM (227 w, 6 d)
Availability
Available
IRC Nick
joal
LDAP User
Unknown
MediaWiki User
JAllemandou (WMF) [ Global Accounts ]

Recent Activity

Today

Samat awarded T226338: Drop of editor numbers for earlier months a Like token.
Tue, Jun 25, 5:27 PM · Analytics-Kanban, Analytics
JAllemandou added a comment to T226338: Drop of editor numbers for earlier months.

Thanks a lot @Samat for the details.
Indeed, you were right: the difference is accounted for by a methodological change. I'm sorry I didn't notice it right away.
From the month 2019-05 onward, we have changed the way editors are computed by excluding edits made on deleted pages.
We did this for consistency, as other metrics (edits and edited-pages for instance) were already computed with deleted edits removed.
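
To illustrate the effect of that change, here is a minimal sketch on made-up data. The records, editor names, and the `count_editors` helper are hypothetical and are not the production computation; the sketch only shows why excluding edits on deleted pages can lower an editor count.

```python
# Toy edit records: (month, editor, page_is_deleted). All values are made up.
EDITS = [
    ("2019-04", "alice", False),
    ("2019-04", "bob", True),    # bob only edited a page that was later deleted
    ("2019-04", "carol", False),
    ("2019-04", "carol", True),  # carol also edited a surviving page
]


def count_editors(edits, month, exclude_deleted):
    """Count distinct editors for a month, optionally ignoring edits on deleted pages."""
    editors = {
        editor
        for (edit_month, editor, deleted) in edits
        if edit_month == month and not (exclude_deleted and deleted)
    }
    return len(editors)


old_method = count_editors(EDITS, "2019-04", exclude_deleted=False)
new_method = count_editors(EDITS, "2019-04", exclude_deleted=True)
print(old_method, new_method)  # 3 2
```

An editor whose only edits were on deleted pages (bob here) disappears from the count under the new method, while an editor with at least one edit on a surviving page (carol) is still counted.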

Tue, Jun 25, 1:54 PM · Analytics-Kanban, Analytics

Yesterday

JAllemandou added a comment to T226227: Keep webrequest_sampled_128's druid segments for more than a week.

Correct (see https://druid.apache.org/docs/latest/tutorials/tutorial-delete-data.html, paragraph "How to permanently delete data"). We can also use API calls to mark segments as unused if we prefer not to use rules.

Mon, Jun 24, 6:52 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou added a comment to T226227: Keep webrequest_sampled_128's druid segments for more than a week.

With a better, more precise explanation:

  • In order for data to be dropped from deep storage, it first needs to be unloaded from historical nodes. This can be done in 2 ways: disabling a full datasource, or disabling segments using rules.
  • Once segments are disabled, you can run the kill task to drop them.

Given the need to use rules to disable segments on historical nodes, I'd rather keep the max data in hadoop (no storage issue so far).
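
The two steps above can be sketched as the JSON payloads one might send to Druid. This is an illustration under assumptions, not the configuration actually deployed: the host names, the replicant count, and the kill interval are hypothetical, and the sketch only builds and prints the requests instead of sending them. The rule types (`loadByPeriod`, `dropForever`), the `kill` task type, and the two endpoint paths come from the Druid documentation.

```python
import json

# Hypothetical values for illustration; not taken from the production setup.
DATASOURCE = "webrequest_sampled_128"
COORDINATOR = "http://druid-coordinator.example:8081"  # assumed host
OVERLORD = "http://druid-overlord.example:8090"        # assumed host


def retention_rules(keep_period="P1M", replicants=2):
    """Rules are evaluated in order: keep recent data loaded, drop the rest."""
    return [
        {
            "type": "loadByPeriod",
            "period": keep_period,
            "tieredReplicants": {"_default_tier": replicants},
        },
        # Segments not matched above are unloaded from historical nodes,
        # but they stay in deep storage until a kill task removes them.
        {"type": "dropForever"},
    ]


def kill_task(datasource, interval):
    """Kill task: permanently deletes unused segments in the given interval."""
    return {"type": "kill", "dataSource": datasource, "interval": interval}


def planned_requests(datasource, kill_interval):
    """Return (method, url, body) tuples without sending anything."""
    return [
        ("POST", f"{COORDINATOR}/druid/coordinator/v1/rules/{datasource}",
         retention_rules()),
        ("POST", f"{OVERLORD}/druid/indexer/v1/task",
         kill_task(datasource, kill_interval)),
    ]


if __name__ == "__main__":
    # The interval below is a made-up example.
    for method, url, body in planned_requests(DATASOURCE, "2019-04-01/2019-05-01"):
        print(method, url)
        print(json.dumps(body, indent=2))
```

Keeping the rule period (what stays loaded in druid) separate from the kill interval (what is removed from deep storage) is what allows the two retention windows to differ.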

Mon, Jun 24, 6:24 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou added a comment to T226227: Keep webrequest_sampled_128's druid segments for more than a week.

@Nuria: We set it up this way on purpose, to make it easy to load data into druid if needed (data stays in deep-storage for 60 days) while still saving space on druid.
Having agreed we should keep 1 month of data in druid, I still recommend using rules to unload data after 1 month and keeping 60 days in deep storage, as 2 months means 2TB per server in druid, which is probably too much.

Mon, Jun 24, 5:39 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou added a comment to T226338: Drop of editor numbers for earlier months.

Hi @Samat, thanks for reaching out.
It would be interesting if you could upload the files again, and also possibly confirm the URL you downloaded data from, as my tests/checks don't show differences that big.
I have checked the number of user-only editors for huwiki over 4 years, looking for differences across our last 3 snapshots (we call the monthly recomputations snapshots). While there is a very small deletion-drift (a difference due to pages being deleted, since deleted pages are excluded from the statistics computation), it is nowhere near a 5%/10% change: more like -0.05% to -0.10%, and only for the 3 to 4 months before last month.
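
For reference, the drift percentages above are relative changes between two snapshots of the same month. A minimal sketch, with made-up editor counts (the numbers and the `relative_change` helper are hypothetical, not the huwiki figures):

```python
def relative_change(previous, current):
    """Relative change from an older snapshot value to a newer one, in percent."""
    return (current - previous) / previous * 100.0


# Made-up counts for the same month in two consecutive snapshots.
older_snapshot = 20000
newer_snapshot = 19985  # a few editors vanish because their pages were deleted

print(round(relative_change(older_snapshot, newer_snapshot), 3))  # -0.075
```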

Mon, Jun 24, 7:57 AM · Analytics-Kanban, Analytics

Fri, Jun 21

JAllemandou moved T225247: Update bot user check in mediawiki-user-history-checker to use historical bot values from Next Up to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:21 AM · Analytics-Kanban, Analytics
JAllemandou set the point value for T225247: Update bot user check in mediawiki-user-history-checker to use historical bot values to 3.
Fri, Jun 21, 8:21 AM · Analytics-Kanban, Analytics
JAllemandou moved T205594: mediawiki_history missing page events from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:19 AM · Analytics-Kanban, Analytics-Data-Quality, Analytics, Contributors-Analysis, Product-Analytics
JAllemandou moved T214490: page_creation_timestamp not always correct in mediawiki_history from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:19 AM · Analytics-Kanban, Product-Analytics, Analytics-Data-Quality, Analytics
JAllemandou moved T190434: Issues with page deleted dates on data lake from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:19 AM · Patch-For-Review, Analytics, Analytics-Kanban
JAllemandou moved T221338: Many revision events in mediawiki_history have missing page and namespace information from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:18 AM · Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics
JAllemandou moved T221825: Mediawiki-history release - Snapshot 2019-05 from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:18 AM · Analytics-Kanban, Analytics
JAllemandou moved T220507: Decide: start_timestamp for mediawiki history from In Progress to In Code Review on the Analytics-Kanban board.
Fri, Jun 21, 8:18 AM · Analytics-Kanban, Analytics

Tue, Jun 18

JAllemandou moved T225178: New directories created under /wmf/data/event_sanitized and /wmf/data/event_sanitized are owned by yarn:analytics from Ready to Deploy to Done on the Analytics-Kanban board.
Tue, Jun 18, 4:03 PM · Analytics-Kanban, Patch-For-Review, Analytics

Mon, Jun 17

JAllemandou added a comment to T225786: Investigate varnish behavior change since new ATS-change in webrequest upload.

We can easily get data for older days if needed (we don't drop statistic-data).

Mon, Jun 17, 11:36 AM · Traffic, Analytics, Operations

Fri, Jun 14

JAllemandou added a comment to T225538: Request for a large request data set for caching research and tuning.

Hi @Nuria - Can you confirm the above request is correct for generating the data?

Fri, Jun 14, 12:47 PM · Analytics
JAllemandou updated the task description for T225786: Investigate varnish behavior change since new ATS-change in webrequest upload.
Fri, Jun 14, 10:59 AM · Traffic, Analytics, Operations
JAllemandou renamed T225786: Investigate varnish behavior change since new ATS-change in webrequest upload from Investigate varnish behavior change since new ATS-change in upload to Investigate varnish behavior change since new ATS-change in webrequest upload.
Fri, Jun 14, 10:58 AM · Traffic, Analytics, Operations
Restricted Application added a project to T225786: Investigate varnish behavior change since new ATS-change in webrequest upload: Operations.
Fri, Jun 14, 8:53 AM · Traffic, Analytics, Operations

Thu, Jun 13

JAllemandou moved T225342: Empty hostnames trigger Refine eventlogging failures from In Code Review to Ready to Deploy on the Analytics-Kanban board.
Thu, Jun 13, 4:19 PM · Analytics-Kanban, Patch-For-Review, Analytics

Tue, Jun 11

JAllemandou added a comment to P8605 SWAP incorrect AQS results.

I found a workaround URL:

"http://aqs1004.eqiad.wmnet:7232/analytics.wikimedia.org/v1/edited-pages/new/all-projects/all-editor-types/content/monthly/20190501/20190601"
Tue, Jun 11, 6:40 PM
JAllemandou renamed T225247: Update bot user check in mediawiki-user-history-checker to use historical bot values from Remove bot user check from userHistory in mediawiki-history-checker to Update bot user check in mediawiki-user-history-checker to use historical bot values.
Tue, Jun 11, 12:32 PM · Analytics-Kanban, Analytics

Mon, Jun 10

Groceryheist awarded T186559: Provide data dumps in the Analytics Data Lake a Love token.
Mon, Jun 10, 7:52 PM · Research, Analytics
JAllemandou added a comment to T221338: Many revision events in mediawiki_history have missing page and namespace information.

Thanks for offering @Neil_P._Quinn_WMF :)
I'm still working on changing the algorithm, so no need from you as of now.
I'll let you know once I have a test dataset.

Mon, Jun 10, 4:53 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics

Sat, Jun 8

JAllemandou created T225343: Refine failure alert seems broken - No alert email sent while jobs were failing.
Sat, Jun 8, 8:22 AM · Analytics-Kanban, Analytics
JAllemandou moved T225342: Empty hostnames trigger Refine eventlogging failures from Next Up to In Code Review on the Analytics-Kanban board.
Sat, Jun 8, 8:14 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou claimed T225342: Empty hostnames trigger Refine eventlogging failures.
Sat, Jun 8, 8:13 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou added a comment to T225342: Empty hostnames trigger Refine eventlogging failures.

Issue pinpointed in the new TransformFunction applied to drop non-mediawiki data: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/refine/TransformFunctions.scala#L105

Sat, Jun 8, 7:58 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou added a comment to T209655: Copy Wikidata dumps to HDFs.

@GoranSMilovanovic : You're welcome :) At some point I'll manage to have that productionized ;)

Sat, Jun 8, 7:21 AM · Wikidata, Research, Analytics

Fri, Jun 7

JAllemandou added a comment to T224957: Pyspark shell shut down automatically.

The Spark driver is not launched from the notebook but from the kernel, and its configuration cannot be updated on the fly, so I'm not surprised it doesn't work.
The solution is to bump driver-memory at the kernel level (see my ping to Andrew and Luca in the previous comment).
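
As a rough sketch of what setting driver memory at launch time looks like: the driver JVM reads this setting when it starts, so it has to be passed on the command line or through the environment of the process that starts the driver. The memory value and master below are hypothetical, and using `PYSPARK_SUBMIT_ARGS` for a notebook kernel is an assumption about how the kernel is configured, not a description of the actual SWAP setup.

```python
import shlex


def pyspark_command(driver_memory="4g", master="yarn"):
    """Command line for a CLI pyspark shell with an explicit driver memory."""
    return ["pyspark", "--master", master, "--driver-memory", driver_memory]


def kernel_env(driver_memory="4g", master="yarn"):
    """Environment a kernel could export so the driver starts with this memory.

    Assumption: the kernel launches Spark through PYSPARK_SUBMIT_ARGS.
    """
    args = ["--master", master, "--driver-memory", driver_memory, "pyspark-shell"]
    return {"PYSPARK_SUBMIT_ARGS": " ".join(args)}


print(shlex.join(pyspark_command("8g")))
print(kernel_env("8g"))
```

Changing the setting from inside an already running notebook session has no effect on the driver's heap, because the driver process exists before any notebook cell runs.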

Fri, Jun 7, 5:46 PM · Analytics
JAllemandou added a comment to T224957: Pyspark shell shut down automatically.

I have reproduced the error. The problem comes from driver-memory I think. I have been able to make the computation succeed for 1 day in python-notebook, and for 1 month in CLI with higher driver memory.

Fri, Jun 7, 11:41 AM · Analytics
JAllemandou moved T225178: New directories created under /wmf/data/event_sanitized and /wmf/data/event_sanitized are owned by yarn:analytics from Next Up to In Code Review on the Analytics-Kanban board.
Fri, Jun 7, 10:58 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou added a project to T225178: New directories created under /wmf/data/event_sanitized and /wmf/data/event_sanitized are owned by yarn:analytics: Analytics-Kanban.
Fri, Jun 7, 10:57 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou claimed T225178: New directories created under /wmf/data/event_sanitized and /wmf/data/event_sanitized are owned by yarn:analytics.
Fri, Jun 7, 10:57 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou added a comment to T225178: New directories created under /wmf/data/event_sanitized and /wmf/data/event_sanitized are owned by yarn:analytics.

Issue found by manual test of DataFrameToHive (I added logging and created a small class using DataFrameToHive to test) on that line: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-spark/src/main/scala/org/wikimedia/analytics/refinery/spark/connectors/DataFrameToHive.scala#L234

Fri, Jun 7, 10:57 AM · Analytics-Kanban, Patch-For-Review, Analytics

Thu, Jun 6

JAllemandou claimed T224957: Pyspark shell shut down automatically.
Thu, Jun 6, 5:56 PM · Analytics
JAllemandou created T225232: Backfill EL new schemas sanitization after ownership issue fixed.
Thu, Jun 6, 4:44 PM · Analytics-Kanban, Analytics

May 23 2019

JAllemandou created T224221: Contributor ID field has empty instances in 2019-05-01 dumps (was 0 in previous month).
May 23 2019, 1:09 PM · MW-1.34-notes (1.34.0-wmf.8; 2019-06-04), Dumps-Generation

May 21 2019

JAllemandou updated subscribers of T223929: Wikistats Bug: Top editors counts and time selection are not displayed correctly.

Following your path, I confirm I have the same problem you do.
Thanks a lot for reporting @Formatierer!

May 21 2019, 10:11 AM · Analytics-Kanban, Analytics, Analytics-Wikistats

May 20 2019

JAllemandou added a comment to T223929: Wikistats Bug: Top editors counts and time selection are not displayed correctly.

Hi @Formatierer - While I definitely see the snapshot, I can't reproduce on wikistats :(

May 20 2019, 6:49 PM · Analytics-Kanban, Analytics, Analytics-Wikistats

May 17 2019

JAllemandou added a comment to T222254: Pyspark on SWAP: Py4JJavaError: Import Error: no module named pyarrow.

NO WAY !!!! I'm super sorry for having derailed that :(

May 17 2019, 6:48 PM · Analytics, Analytics-Cluster
JAllemandou moved T222603: Fix oozie banner_impression monthly job from Ready to Deploy to Done on the Analytics-Kanban board.
May 17 2019, 6:06 PM · Analytics-Kanban, Analytics
JAllemandou claimed T223653: Fix mediawiki_wikitext_history SLA.
May 17 2019, 6:06 PM · Analytics-Kanban, Analytics
JAllemandou set the point value for T223653: Fix mediawiki_wikitext_history SLA to 1.
May 17 2019, 6:06 PM · Analytics-Kanban, Analytics
JAllemandou moved T223653: Fix mediawiki_wikitext_history SLA from Next Up to In Code Review on the Analytics-Kanban board.
May 17 2019, 6:06 PM · Analytics-Kanban, Analytics
JAllemandou created T223653: Fix mediawiki_wikitext_history SLA.
May 17 2019, 6:05 PM · Analytics-Kanban, Analytics

May 16 2019

JAllemandou added a comment to T220977: Investigate surprising rise in mobile page views for wikidata.

A lot trickier :)
We have the wmf_raw.mediawiki_private_cu_changes table in hive, which allows us to compute geo-editors (editors-by-country, aggregated). This table only contains 3 months of data, for PII removal reasons. It's probably not enough for what you're after, but I have nothing better (see https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/geoeditors/monthly/insert_geoeditors_monthly_data.hql for an example).
I've just created T223444 to submit the general idea of having geo-editors stats split by desktop/mobile.

May 16 2019, 1:09 PM · User-GoranSMilovanovic, Wikidata, WMDE-Analytics-Engineering
JAllemandou created T223444: Update geo-editors job to use tags and report desktop/mobile edits.
May 16 2019, 1:09 PM · Product-Analytics, Analytics
JAllemandou updated the task description for T218819: Investigate discrepancies in editor metrics between Data Lake and MediaWiki replica pipelines .
May 16 2019, 7:48 AM · Product-Analytics
JAllemandou added a parent task for T220456: Many small wikis missing from mediawiki_history dataset: T221825: Mediawiki-history release - Snapshot 2019-05.
May 16 2019, 7:47 AM · Patch-For-Review, Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics
JAllemandou added a subtask for T221825: Mediawiki-history release - Snapshot 2019-05: T220456: Many small wikis missing from mediawiki_history dataset.
May 16 2019, 7:47 AM · Analytics-Kanban, Analytics
JAllemandou added a comment to T221824: Mediawiki History Release - 2019-04 snapshot.

Ping @JAllemandou the tasks not closed on 2019-04 snapshot should probably be moved to 2019-05 snapshot cc @fdans

May 16 2019, 7:46 AM · Patch-For-Review, Product-Analytics, Analytics-Kanban, Analytics
JAllemandou removed a parent task for T220456: Many small wikis missing from mediawiki_history dataset: T221824: Mediawiki History Release - 2019-04 snapshot.
May 16 2019, 7:46 AM · Patch-For-Review, Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics
JAllemandou removed a subtask for T221824: Mediawiki History Release - 2019-04 snapshot: T220456: Many small wikis missing from mediawiki_history dataset.
May 16 2019, 7:46 AM · Patch-For-Review, Product-Analytics, Analytics-Kanban, Analytics

May 14 2019

JAllemandou added a comment to T220977: Investigate surprising rise in mobile page views for wikidata.

Hi @Lea_WMDE and @GoranSMilovanovic - I think your problem is solved in this month's snapshot by the revision_tags field of mediawiki_history:

May 14 2019, 4:03 PM · User-GoranSMilovanovic, Wikidata, WMDE-Analytics-Engineering

May 13 2019

JAllemandou moved T220111: Refactor druid data deletion script from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 13 2019, 2:54 PM · Analytics-Kanban, Analytics
JAllemandou moved T222603: Fix oozie banner_impression monthly job from Done to Ready to Deploy on the Analytics-Kanban board.
May 13 2019, 9:22 AM · Analytics-Kanban, Analytics

May 7 2019

JAllemandou moved T220507: Decide: start_timestamp for mediawiki history from Ready to Deploy to In Progress on the Analytics-Kanban board.
May 7 2019, 5:33 PM · Analytics-Kanban, Analytics
JAllemandou updated the task description for T222425: Fix jobs after mediawiki-history refactor.
May 7 2019, 5:33 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T213770: Remove Zero support in analytics from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Patch-For-Review, Analytics-Kanban, Technical-Debt, Analytics
JAllemandou moved T222422: Mandatory success_email_to parameter in mediawiki_history_check coordinator from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T222378: The three sqoop jobs that scoop mediawiki history should do in sequence from Ready to Deploy to Done on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T222378: The three sqoop jobs that scoop mediawiki history should do in sequence from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T222603: Fix oozie banner_impression monthly job from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Analytics-Kanban, Analytics
JAllemandou moved T222425: Fix jobs after mediawiki-history refactor from In Code Review to Ready to Deploy on the Analytics-Kanban board.
May 7 2019, 5:28 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou added a comment to T222460: 15.wikipedia.org missclassified as a pageview, same for query.wikidata.org.

Another row:

spark.sql("select uri_host, uri_path, uri_query from wmf.webrequest where webrequest_source = 'text' and year = 2019 and month = 5 and day = 6 and hour = 16 and is_pageview and pageview_info['project'] = '15.wikipedia'").show(10, false)
May 7 2019, 7:30 AM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou added a subtask for T221825: Mediawiki-history release - Snapshot 2019-05: T220507: Decide: start_timestamp for mediawiki history.
May 7 2019, 7:29 AM · Analytics-Kanban, Analytics
JAllemandou added a parent task for T220507: Decide: start_timestamp for mediawiki history: T221825: Mediawiki-history release - Snapshot 2019-05.
May 7 2019, 7:29 AM · Analytics-Kanban, Analytics

May 6 2019

JAllemandou moved T220507: Decide: start_timestamp for mediawiki history from Next Up to In Progress on the Analytics-Kanban board.
May 6 2019, 4:31 PM · Analytics-Kanban, Analytics
JAllemandou moved T218824: A few alterblocks events have event_timestamps from before 2001 from Next Up to In Progress on the Analytics-Kanban board.
May 6 2019, 4:31 PM · Analytics-Kanban, Analytics, Analytics-Data-Quality, Product-Analytics
JAllemandou moved T221338: Many revision events in mediawiki_history have missing page and namespace information from Paused to In Progress on the Analytics-Kanban board.
May 6 2019, 4:30 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics
JAllemandou renamed T221825: Mediawiki-history release - Snapshot 2019-05 from Mediawiki-history release - WIP for future to Mediawiki-history release - Snapshot 2019-05.
May 6 2019, 4:28 PM · Analytics-Kanban, Analytics
JAllemandou moved T221825: Mediawiki-history release - Snapshot 2019-05 from Next Up to In Progress on the Analytics-Kanban board.
May 6 2019, 4:28 PM · Analytics-Kanban, Analytics
JAllemandou moved T222378: The three sqoop jobs that scoop mediawiki history should do in sequence from Next Up to In Code Review on the Analytics-Kanban board.
May 6 2019, 4:27 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T222603: Fix oozie banner_impression monthly job from Next Up to In Code Review on the Analytics-Kanban board.
May 6 2019, 4:15 PM · Analytics-Kanban, Analytics
JAllemandou claimed T222603: Fix oozie banner_impression monthly job.
May 6 2019, 4:14 PM · Analytics-Kanban, Analytics
JAllemandou added a comment to T222460: 15.wikipedia.org missclassified as a pageview, same for query.wikidata.org.

Here are the faulty lines:

spark.sql("select uri_host, uri_path, uri_query from wmf.webrequest where webrequest_source = 'text' and year = 2019 and month = 4 and day = 29 and hour = 6 and is_pageview and pageview_info['project'] = '15.wikipedia'").show(10, false)
May 6 2019, 4:06 PM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou updated the task description for T222603: Fix oozie banner_impression monthly job.
May 6 2019, 11:23 AM · Analytics-Kanban, Analytics
JAllemandou added a comment to T222603: Fix oozie banner_impression monthly job.

A manual fix has been applied to 2018 jobs.

May 6 2019, 11:23 AM · Analytics-Kanban, Analytics
JAllemandou created T222603: Fix oozie banner_impression monthly job.
May 6 2019, 11:22 AM · Analytics-Kanban, Analytics

May 3 2019

JAllemandou claimed T191964: Clickstream dataset for Persian Wikipedia only includes external values.
May 3 2019, 11:23 AM · Analytics-Kanban, Analytics
JAllemandou moved T191964: Clickstream dataset for Persian Wikipedia only includes external values from Paused to Ready to Deploy on the Analytics-Kanban board.
May 3 2019, 11:23 AM · Analytics-Kanban, Analytics
JAllemandou added a comment to T191964: Clickstream dataset for Persian Wikipedia only includes external values.

Hi @Ladsgroup - I'm extremely sorry for not having taken the time to answer you faster :(
I've quickly tested your patch and it seems to work.
I have run it on fawiki and frwiki to compare proportions of link vs other-* link types:

wiki_dblinkother-*
frwiki22460152309938
fawiki437332438971

It looks super good :)
Merging for a deploy next week.

May 3 2019, 11:09 AM · Analytics-Kanban, Analytics
JAllemandou moved T213770: Remove Zero support in analytics from In Progress to In Code Review on the Analytics-Kanban board.
May 3 2019, 9:35 AM · Patch-For-Review, Analytics-Kanban, Technical-Debt, Analytics
JAllemandou moved T213770: Remove Zero support in analytics from Next Up to In Progress on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Patch-For-Review, Analytics-Kanban, Technical-Debt, Analytics
JAllemandou moved T219177: Add user_is_bot_by to MediaWiki history from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Patch-For-Review, Analytics-Kanban, Analytics-Wikistats, Analytics
JAllemandou moved T211950: Add partial blocks to mediawiki history tables from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Analytics-Kanban, Product-Analytics, Anti-Harassment, Analytics
JAllemandou moved T218463: Some registered users have null values for event_user_text and event_user_text_historical in mediawiki_history from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Analytics-Kanban, Patch-For-Review, Analytics, Analytics-Data-Quality, Product-Analytics
JAllemandou moved T178587: Update wikimedia-history revision data with deleted field (and find it a new name?) from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Analytics-Kanban, Patch-For-Review, Analytics
JAllemandou moved T161149: Provide edit tags in the Data Lake edit data from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Analytics-Kanban, Analytics
JAllemandou moved T206883: mediawiki_history datasets have null user_text for IP edits from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:01 AM · Analytics-Kanban, Product-Analytics, Analytics-Data-Quality, Analytics
JAllemandou moved T219484: Fix mediawiki-history-checker after field rename from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T167608: Add caused_by_user_text to mediawiki_page_history from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Analytics-Kanban, Analytics
JAllemandou moved T213603: Coordinate work on minor changes for Edit Data Quality from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Patch-For-Review, Analytics-Kanban
JAllemandou moved T221460: Remove dead code from refinery/oozie folders from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou moved T221824: Mediawiki History Release - 2019-04 snapshot from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Patch-For-Review, Product-Analytics, Analytics-Kanban, Analytics
JAllemandou moved T222141: Mediawiki-History fixes before deploy from Ready to Deploy to Done on the Analytics-Kanban board.
May 3 2019, 9:00 AM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou merged T222294: refinery-sqoop-mediawiki-production on an-coord1001 needed restart into T222378: The three sqoop jobs that scoop mediawiki history should do in sequence.
May 3 2019, 8:59 AM · Patch-For-Review, Analytics-Kanban, Analytics
JAllemandou merged task T222294: refinery-sqoop-mediawiki-production on an-coord1001 needed restart into T222378: The three sqoop jobs that scoop mediawiki history should do in sequence.
May 3 2019, 8:59 AM · Analytics-Kanban, Analytics-Cluster