
CristianCantoro (Cristian Consonni)
User

User Details

User Since
Oct 22 2014, 1:44 AM (260 w, 6 d)
Availability
Available
LDAP User
CristianCantoro
MediaWiki User
CristianCantoro [ Global Accounts ]

Recent Activity

Sep 13 2019

CristianCantoro added a comment to T232755: Visiting proxied desktop version on mobile gets you redirected to mobile (*.m.wikipedia.org).

It seems that I have had a problem with Caddy's cache plugin. Now, the server is back up.

Sep 13 2019, 10:18 PM · Wikimedia-Apache-configuration, Mobile

Sep 12 2019

CristianCantoro created T232755: Visiting proxied desktop version on mobile gets you redirected to mobile (*.m.wikipedia.org).
Sep 12 2019, 4:02 PM · Wikimedia-Apache-configuration, Mobile
CristianCantoro added a comment to T232213: [[Special:Random]] when used in a proxied website redirects to the main domain wikipedia.org.

I've fixed this. It seems that Caddy implemented a directive to handle this case; here's the pull request: caddy#2144.

Sep 12 2019, 12:49 PM

Sep 7 2019

CristianCantoro added a comment to T232213: [[Special:Random]] when used in a proxied website redirects to the main domain wikipedia.org.

Have you considered asking Caddy for support with their software? :)

Sep 7 2019, 10:56 AM

Sep 6 2019

CristianCantoro added a comment to T232213: [[Special:Random]] when used in a proxied website redirects to the main domain wikipedia.org.

https://github.com/caddyserver/caddy/issues/1011 looks a lot like the issue you're experiencing

Sep 6 2019, 2:48 PM
CristianCantoro added a comment to T232213: [[Special:Random]] when used in a proxied website redirects to the main domain wikipedia.org.

I could reproduce this bug on multiple browsers and I have no reason to believe this is a client-related issue, nor do I think this is a configuration issue of my proxy.

What does your proxy do with the Location: header? As it kinda seems like it's not capturing it and potentially just passing it through

Sep 6 2019, 1:06 PM
CristianCantoro added a comment to T232188: "Proxy mode" for third-party proxies (to disable login or editing buttons).

There are many other small quirks, such as the fact that [[Special:Random]] redirects automatically to the main domain (wikipedia.org), maybe I should file a separate bug for this.

Yes, please do, with good steps to reproduce. :)

Sep 6 2019, 12:27 PM · MediaWiki-General
CristianCantoro created T232213: [[Special:Random]] when used in a proxied website redirects to the main domain wikipedia.org.
Sep 6 2019, 12:26 PM
CristianCantoro updated the task description for T232188: "Proxy mode" for third-party proxies (to disable login or editing buttons).
Sep 6 2019, 12:16 PM · MediaWiki-General
CristianCantoro created T232188: "Proxy mode" for third-party proxies (to disable login or editing buttons).
Sep 6 2019, 9:04 AM · MediaWiki-General

Apr 30 2019

CristianCantoro added a comment to T222151: Reset my (User:CristianCantoro) Two-Factor Authentication.

Which website is this about? I assume LDAP on wikitech.wikimedia.org and not SUL on mediawiki.org?

Apr 30 2019, 1:16 PM · wikitech.wikimedia.org, Trust-and-Safety
CristianCantoro created T222151: Reset my (User:CristianCantoro) Two-Factor Authentication.
Apr 30 2019, 9:55 AM · wikitech.wikimedia.org, Trust-and-Safety
CristianCantoro claimed T204527: cloudvps: osmit project trusty deprecation.

I am taking on, together with @Geofrizz and @Alessandro.palmas, the administration of these machines.

Apr 30 2019, 9:39 AM · Cloud-VPS (Ubuntu Trusty Deprecation)

Dec 8 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I have also published the pagecounts-ez files for the period from 2007-12-09 to 2011-11-15. These are the same files that are available through the Google Drive link above, but hosted by my University.
http://cricca.disi.unitn.it/datasets/pagecounts-ez/.

Dec 8 2018, 10:40 PM · Analytics
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

It may be of interest that I have published the sorted pagecounts-raw dataset. You can find it at: http://cricca.disi.unitn.it/datasets/pagecounts-raw-sorted/.
There is more info on this page: http://disi.unitn.it/~consonni/datasets/wikipedia-pagecounts-raw-sorted/.

Dec 8 2018, 3:46 PM · Analytics

Nov 14 2018

CristianCantoro created T209469: Score generation errors.
Nov 14 2018, 11:22 AM · tool-wscontest

Jul 23 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

That works, let me know if you need to take them down before I get to copy them, and I'll try to squeeze it in.

Jul 23 2018, 6:42 PM · Analytics

Jul 22 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

@CristianCantoro where do you have access? The files have to end up on terbium at some point, but I can move them to the right place if you put them anywhere.

Jul 22 2018, 3:28 PM · Analytics

Jul 12 2018

CristianCantoro added a comment to T199461: Add checksums pageviews dataset.

This isn't a regression, it's a requested "feature", right? I.e. there weren't checksums before that are now gone.

Jul 12 2018, 9:16 PM · good first bug, Analytics, Datasets-General-or-Unknown
CristianCantoro created T199461: Add checksums pageviews dataset.
Jul 12 2018, 4:35 PM · good first bug, Analytics, Datasets-General-or-Unknown
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I am done with the computation, I have processed all pages until 2011-11-15. I have 1432 files averaging ~400MB in size, for a total of 581GB. I can transfer them to a WMF server if you tell me where.

Jul 12 2018, 1:22 PM · Analytics

Jul 3 2018

CristianCantoro added a comment to T198692: Android mobile App needs a way to implement a community blocking as the one currently ongoin on the web version of itwiki. .

Sibling iOS bug T198693

Jul 3 2018, 10:06 AM · Wikipedia-Android-App-Backlog
CristianCantoro added a comment to T198693: iOS mobile App needs a way to implement a community sitewide block or protest blackout.
Jul 3 2018, 10:06 AM · Wikipedia-iOS-App-Backlog
CristianCantoro created T198693: iOS mobile App needs a way to implement a community sitewide block or protest blackout.
Jul 3 2018, 10:05 AM · Wikipedia-iOS-App-Backlog
CristianCantoro renamed T198692: Android mobile App needs a way to implement a community blocking as the one currently ongoin on the web version of itwiki. from The mobile App need a way to implement a community blocking as the one currently ongoin on the web version of itwiki. to Android mobile App needs a way to implement a community blocking as the one currently ongoin on the web version of itwiki. .
Jul 3 2018, 10:04 AM · Wikipedia-Android-App-Backlog
CristianCantoro created T198692: Android mobile App needs a way to implement a community blocking as the one currently ongoin on the web version of itwiki. .
Jul 3 2018, 9:58 AM · Wikipedia-Android-App-Backlog

Jun 19 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

This is great, thanks very much Cristian. I've been out with an injury for a few weeks. Now I have to catch up with other stuff but I'll get back to this not too far in the future.

Jun 19 2018, 6:16 PM · Analytics

Jun 14 2018

CristianCantoro awarded T163060: Create the front-end of Tools for the Wikisouce Anniversary Proofreading Contest a Party Time token.
Jun 14 2018, 9:34 AM · Indic-TechCom, tool-wscontest, Tools, Wikisource

May 28 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

Ok, I have written a script that uses GNU Parallel to process multiple days at the same time. Using 6 cores I was able to process 23 days' worth of data in a little more than 4 hours, as expected.
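
As a rough illustration of the same idea, here is a minimal sketch that fans per-day jobs out over 6 workers. It is not the script from the comment: it swaps GNU Parallel for Python's `concurrent.futures`, and `process_day`, `process_days`, `day_range` and the output file name are hypothetical placeholders. With ~70 minutes per day on one core (the figure reported for the streaming script), 23 days on 6 cores comes to roughly 23 × 70 / 6 ≈ 268 minutes, which matches "a little more than 4 hours".

```python
from concurrent.futures import ProcessPoolExecutor
from datetime import date, timedelta
from typing import Dict, List


def day_range(start: date, n_days: int) -> List[date]:
    """Return n_days consecutive dates beginning at start."""
    return [start + timedelta(days=i) for i in range(n_days)]


def process_day(day: date) -> str:
    """Hypothetical per-day job; here it only builds an output file name."""
    return f"pagecounts-{day.isoformat()}.out"


def process_days(days: List[date], workers: int = 6,
                 executor_cls=ProcessPoolExecutor) -> Dict[date, str]:
    """Run process_day on every day, at most `workers` days at a time."""
    with executor_cls(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each day with its result.
        return dict(zip(days, pool.map(process_day, days)))


if __name__ == "__main__":
    # Illustrative dates only: 23 days starting at 2007-12-09.
    results = process_days(day_range(date(2007, 12, 9), 23))
    print(len(results), "days processed")
```

Each day is independent of the others, so no coordination between workers is needed beyond collecting the results.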

May 28 2018, 2:17 PM · Analytics

May 26 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

Ok, I am done writing the new "streaming" script. It takes ~70 minutes on a single core to process one day. About the RAM, it takes 20GB at peak (when reading the input data and sorting the rows), but then it uses ~4GB, and it is using just one core.

May 26 2018, 4:04 PM · Analytics

May 25 2018

CristianCantoro added a comment to T163060: Create the front-end of Tools for the Wikisouce Anniversary Proofreading Contest.

@Samwilson, thanks for the heads up. I have added you as a maintainer of the tool wscontest.

May 25 2018, 10:43 AM · Indic-TechCom, tool-wscontest, Tools, Wikisource
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

So, I am basically writing another script that does not use Spark but simply processes the data in a streaming fashion (the basic idea of the algorithm is: take one day's worth of data, sort it by page and then process the data stream one line at a time).
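
A minimal sketch of that sort-then-stream idea follows. It is not the actual script: the `page hour views` line layout and the names `parse` and `daily_totals` are assumptions made for illustration. Once the lines are sorted by page, all rows for one page are adjacent, so only one page's rows need to be held in memory at a time.

```python
import itertools
from typing import Iterable, Iterator, Tuple


def parse(line: str) -> Tuple[str, int, int]:
    """Parse a hypothetical 'page hour views' line into typed fields."""
    page, hour, views = line.split()
    return page, int(hour), int(views)


def daily_totals(sorted_lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (page, total views) from lines already sorted by page."""
    records = (parse(line) for line in sorted_lines)
    # groupby only merges adjacent items, which is why the input must be sorted.
    for page, group in itertools.groupby(records, key=lambda r: r[0]):
        yield page, sum(views for _, _, views in group)


# Tiny in-memory example; a real run would sort on disk and stream the result.
lines = sorted(["B 0 3", "A 1 2", "A 0 5", "B 23 1"])
result = list(daily_totals(lines))
```

In this example `result` holds one entry per page, with the hourly rows summed.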

May 25 2018, 10:38 AM · Analytics
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I have run other tests and they took between 8 and 9.5 h, using between 34GB and 36.5GB on a single machine with 8 cores. Also, I narrowed down the problem with the data from 2007-12-10 to a few files. (I suspect the root of the problem may be some corrupt file.)

May 25 2018, 12:21 AM · Analytics

May 23 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I have run the script over 1 day's worth of data (2007-12-11); it took a little more than 8 hours (484 minutes) and around 34GB of RAM on a single machine with 8 cores. I am testing on another day (2007-12-12).

May 23 2018, 11:39 AM · Analytics

May 22 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I worked on this during the Wikimedia hackathon and now I have a final version of the code that computes the daily total and the compact string representation for hourly views from the pagecounts-raw data.
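
To make "daily total and compact string representation" concrete, here is a small sketch. The encoding used (one letter per hour, A for hour 0 through X for hour 23, followed by that hour's count, with zero-view hours omitted) is an assumption for illustration, and `encode_day` is a hypothetical helper, not a function from the actual code.

```python
import string
from typing import Dict, Tuple


def encode_day(hourly: Dict[int, int]) -> Tuple[int, str]:
    """Return (daily total, compact string) for a map of hour -> views.

    Assumed encoding: hour 0 is 'A', hour 23 is 'X', each letter is followed
    by the view count for that hour, and hours with zero views are skipped.
    """
    for hour in hourly:
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
    total = sum(hourly.values())
    compact = "".join(
        f"{string.ascii_uppercase[hour]}{hourly[hour]}"
        for hour in sorted(hourly)
        if hourly[hour] > 0
    )
    return total, compact


total, compact = encode_day({0: 5, 1: 2, 23: 10})
# total is 17 and compact is "A5B2X10"
```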

May 22 2018, 5:38 PM · Analytics

May 16 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

The code is in this repo if anybody cares to take a look:
https://github.com/CristianCantoro/merge-pageviews

I only took a very quick look so apologies if I missed some nuance, but I think you might be trying to crunch all the data in memory before outputting, which would definitely use up all your memory (pandas is not very light and the files are big). So here's a complicated writeup https://indico.io/blog/fast-method-stream-data-from-big-data-sources/ that essentially boils down to: "use the yield concept" to process only what you need in memory and dump it back out to your output before going on to the next thing.
But of course, this is mostly solved for you in Hadoop where you can just write some logic and the parallelism is HDFS's job, with no work from you. You can even put a Hive table on top of all the source files and then do the transformation with a simple SQL statement.
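
A minimal sketch of the "yield" approach described above, assuming a hypothetical `project page views` line layout; `read_records` and `filter_project` are illustrative names, not part of the merge-pageviews code. Each generator hands over one record at a time, so nothing is accumulated in memory before being written out.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str, int]


def read_records(stream: TextIO) -> Iterator[Record]:
    """Lazily parse 'project page views' lines, one record at a time."""
    for line in stream:
        line = line.strip()
        if line:
            project, page, views = line.split()
            yield project, page, int(views)


def filter_project(records: Iterable[Record], project: str) -> Iterator[Record]:
    """Pass through only the records that belong to the given project."""
    for record in records:
        if record[0] == project:
            yield record


# In-memory stand-ins for the (large) input and output files.
source = io.StringIO("en Main_Page 10\nit Pagina_principale 4\nen Rome 2\n")
out = io.StringIO()
for _, page, views in filter_project(read_records(source), "en"):
    out.write(f"{page}\t{views}\n")
```

The same pipeline works unchanged on real file objects, since any text stream can be iterated line by line.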

May 16 2018, 9:18 PM · Analytics
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

Thanks so much Cristian, I'll wait for your update. Now, I thought of another thing. If you transform the data on your machine, you'd have to then upload it which may take a really long time.
Instead of that, do you want to run the script(s) on our machines? It would duplicate the processing but eliminate the transfer, which is more of a bottleneck. Either way is ok with me, whatever you prefer. If you do want to run it on our machines, I can help debug the logic and launch it myself, or we can get you access to do it yourself.

May 16 2018, 3:29 PM · Analytics

May 15 2018

CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

@Milimetric, no problem.

May 15 2018, 3:15 PM · Analytics

May 8 2018

CristianCantoro updated the task description for T193759: Add legacy per-article pagecounts data (prior to 2015).
May 8 2018, 2:32 PM · Analytics
CristianCantoro added a comment to T193759: Add legacy per-article pagecounts data (prior to 2015).

@CristianCantoro: I'm sorry I didn't think of this, but isn't this what pagecounts-ez already did? https://dumps.wikimedia.org/other/pagecounts-ez/merged/
Oh but you have them going back to 2007. Hm, maybe using your data we can resolve this task: https://phabricator.wikimedia.org/T188041 and then it would be available in a fairly compressed way for everyone.

May 8 2018, 2:19 PM · Analytics
CristianCantoro added a comment to T188041: Generate pagecounts-ez data back to 2008.

I'm coming from T193759, I can help with this. Is the script doing the merge available? I can run it on one of my machines and let it run even for several days.

May 8 2018, 2:18 PM · Analytics
CristianCantoro added a comment to T193759: Add legacy per-article pagecounts data (prior to 2015).

Anyway, I am totally ok with uploading these data, I think I just need a server on which to save them.

May 8 2018, 1:25 PM · Analytics
CristianCantoro added a comment to T193759: Add legacy per-article pagecounts data (prior to 2015).

In the meantime, maybe we can host the data on dumps as files?

May 8 2018, 11:12 AM · Analytics

May 3 2018

CristianCantoro created T193759: Add legacy per-article pagecounts data (prior to 2015).
May 3 2018, 4:09 PM · Analytics
CristianCantoro added a watcher for Pageviews-API: CristianCantoro.
May 3 2018, 9:54 AM

May 2 2018

CristianCantoro added a comment to T193647: Lost 2FA token for Wikitech wiki.

Thank you! I was able to login!

May 2 2018, 6:05 PM · cloud-services-team (Kanban), wikitech.wikimedia.org, Trust-and-Safety
CristianCantoro added a comment to T193647: Lost 2FA token for Wikitech wiki.

[...] Can you create a text file in your $HOME on Toolforge that references this bug and then ping us here?

May 2 2018, 5:54 PM · cloud-services-team (Kanban), wikitech.wikimedia.org, Trust-and-Safety
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

I was able to:

May 2 2018, 5:52 PM · Phabricator
CristianCantoro created T193647: Lost 2FA token for Wikitech wiki.
May 2 2018, 2:58 PM · cloud-services-team (Kanban), wikitech.wikimedia.org, Trust-and-Safety
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

That's what I tried to explain in T193616#4174283. The workaround would be creating yet another LDAP account, link to that LDAP account from your Phab account, then unlink the SUL account from your Phab account, then link that (now free) SUL account to your other Phab account that you want to keep.
[...]
I already explained above that this is not possible.

May 2 2018, 12:48 PM · Phabricator
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

However, that requires that the SUL or LDAP account is not yet connected to another existing Phab account, which is the case here. :-/

May 2 2018, 12:15 PM · Phabricator
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

Ok, if I go (logged in as CristianCantoro_SUL) to https://phabricator.wikimedia.org/settings/user/CristianCantoro_SUL/page/external/ and I try to disconnect it I get the following message:

May 2 2018, 12:04 PM · Phabricator
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

I see that most of my activity is done with CristianCantoro, I would like to keep this. I think I can disconnect CristianCantoro_SUL and re-connect my Mediawiki accounts to CristianCantoro. I will try it now.

May 2 2018, 12:01 PM · Phabricator
CristianCantoro added a comment to T193616: [REQUEST] Merge Phabricator Users.

Commenting here with the other account to confirm the request.

May 2 2018, 10:40 AM · Phabricator

Nov 28 2017

CristianCantoro added a comment to T163747: Add support for Tor or other proxy support to the Wikipedia Android App.

I second what Legoktm is saying: adding the option (or default) of using Orbot to route the traffic of the Wikipedia is completely independent of creating a Tor hidden service.

Nov 28 2017, 2:44 AM · Patch-For-Review, Tor, Wikipedia-Android-App-Backlog

Oct 31 2017

CristianCantoro created T179370: Add MANIFEST.in.
Oct 31 2017, 10:59 AM · OABot
CristianCantoro updated the task description for T179367: Add a testing infrastructure.
Oct 31 2017, 10:53 AM · OABot
CristianCantoro updated the task description for T179367: Add a testing infrastructure.
Oct 31 2017, 10:53 AM · OABot
CristianCantoro created T179369: Add setup.py.
Oct 31 2017, 10:51 AM · OABot
CristianCantoro created T179368: New directory structure for the project.
Oct 31 2017, 10:26 AM · OABot
CristianCantoro created T179367: Add a testing infrastructure.
Oct 31 2017, 10:19 AM · OABot

Oct 29 2017

CristianCantoro created T179254: Create a sample config.yaml for reference.
Oct 29 2017, 6:15 PM · OABot
CristianCantoro added a comment to T166308: Provide a way to reject links (instead of skip).

I would also suggest adding the possibility of having a field or a drop-down menu with the motivation for rejection, I think it would be useful to know.
Some reasons that I could think of:

Oct 29 2017, 10:05 AM · OABot
CristianCantoro added a member for OABot: CristianCantoro.
Oct 29 2017, 9:29 AM
CristianCantoro added a watcher for OABot: CristianCantoro.
Oct 29 2017, 9:29 AM

Aug 9 2017

CristianCantoro triaged T172859: Allow free-text editing for better links to add as Normal priority.
Aug 9 2017, 9:44 AM · OABot
CristianCantoro created T172859: Allow free-text editing for better links to add.
Aug 9 2017, 9:44 AM · OABot

Jun 19 2017

CristianCantoro added a comment to T168218: Tor hidden service for WMF websites.

Thanks Tim for filing this bug report. Two further considerations:

Jun 19 2017, 6:57 AM · Wikimedia-General-or-Unknown, Tor
CristianCantoro added a comment to T156847: Core should be aware of the domain it is running on and render mobile domains where necessary.

I would also add that this feature would be particularly useful when visiting from mobile. At the moment, users visiting a mirror of Wikipedia (say en.vikiansiklopedi.org) get redirected to en.m.wikipedia.org regardless of the original domain (instead of en.m.vikiansiklopedi.org).

Jun 19 2017, 1:12 AM · Developer-Wishlist (2017), MediaWiki-General

Jun 17 2017

CristianCantoro added a comment to T156847: Core should be aware of the domain it is running on and render mobile domains where necessary.

This bug affects the project wikimirror.

Jun 17 2017, 9:29 PM · Developer-Wishlist (2017), MediaWiki-General

Jun 6 2017

CristianCantoro created T167154: Delete wikidata-ldf project on wmflabs.
Jun 6 2017, 3:35 PM · cloud-services-team (Kanban), Cloud-VPS (Project-requests), Wikidata

Nov 3 2015

CristianCantoro added a comment to T114010: Track email clickthroughs on donate wiki.

Today we received some questions from Italian WMF donors who noticed the links. They were asking whether the emails were real, and they thought the email was a phishing attempt.

Nov 3 2015, 7:06 PM · Fundraising Sprint Dirt Farming, Fundraising Sprint Cat Herding, Fundraising Sprint Bloodletting 2016, Fundraising Sprint Asbestos Removal 2016, Fundraising Sprint Zapp, Patch-For-Review, Fundraising Sprint Yo La Tengo, Fundraising Sprint X-Ray Spex, Fundraising Sprint William Shatner, Fundraising-Backlog

Oct 20 2015

CristianCantoro added a comment to T115282: Two small instances: for WikiToLearn development.

any updates about this?

Oct 20 2015, 11:55 AM · Cloud-Services

Oct 15 2015

CristianCantoro added a comment to T115282: Two small instances: for WikiToLearn development.

Hi yuvipanda,

Oct 15 2015, 3:47 PM · Cloud-Services

Oct 14 2015

CristianCantoro reopened T115282: Two small instances: for WikiToLearn development as "Open".

Sorry for re-opening but I wanted to keep track of requests here.

Oct 14 2015, 8:03 PM · Cloud-Services
CristianCantoro reopened T115282: Two small instances: for WikiToLearn development, a subtask of T76375: [DO NOT USE] New Labs project requests (tracking) [superseded by #cloud-vps-project-requests], as Open.
Oct 14 2015, 8:03 PM · User-bd808, Tracking-Neverending, Cloud-Services

Oct 12 2015

CristianCantoro created T115282: Two small instances: for WikiToLearn development.
Oct 12 2015, 9:15 PM · Cloud-Services

Jul 10 2015

CristianCantoro created T105457: New project: WikidataLDF.
Jul 10 2015, 9:11 AM · Cloud-Services

Dec 31 2014

CristianCantoro committed rTWCA5001abb4f80c: deleting .pyc files (authored by CristianCantoro).
deleting .pyc files
Dec 31 2014, 6:27 PM
CristianCantoro committed rTWCA672b03888e76: added ORM (authored by CristianCantoro).
added ORM
Dec 31 2014, 6:27 PM
CristianCantoro committed rTWCAda17361d603c: Commented out installation check (authored by CristianCantoro).
Commented out installation check
Dec 31 2014, 6:27 PM
CristianCantoro committed rTWCA113b259e5102: Ignoring .pyc files (authored by CristianCantoro).
Ignoring .pyc files
Dec 31 2014, 6:27 PM
CristianCantoro committed rTWCA3e1567157a5a: fixed comments, added CHANGELOG, produced TIFF (authored by CristianCantoro).
fixed comments, added CHANGELOG, produced TIFF
Dec 31 2014, 6:27 PM
CristianCantoro committed rTWCA551be02a29cc: first commit (authored by CristianCantoro).
first commit
Dec 31 2014, 6:27 PM