Data collection has started.
Just turned off the survey.
Tue, Mar 19
Talked to Miriam, and she made an announcement today. We'll wait two days and deploy on Thursday if everything is fine by then.
Mon, Mar 18
The survey has been deployed.
No deployment took place at 2pm because Gerrit was down. I'll try and deploy later today.
@Usmanmuhd I don't think you should be overly specific in the proposal. You can start by saying that the goal of the project is to improve the article recommendation pipeline. Also, don't be tricked by the number of tasks. Are you sure you can get those tasks done and pushed to production in 12 weeks? Just doing the coding part doesn't mean your changes will automatically be enabled in production. You'll learn all this as we go along.
@Usmanmuhd saw your IRC message about the template and the specs. I don't think we have a template. Let me know your questions about the specs and I'll try to fill in the task description.
Fri, Mar 15
@leila we did staging because we wanted to make sure that the back end can handle the load. Now that we know it can, we can safely use the intended sampling rates. I'm not sure of other reasons why staging is needed. Maybe @Miriam knows?
@Ottomata a heads-up that we'll be collecting citation data for one month, starting March 20th. The sampling rate is 100% for the schema CitationUsage and 33.3% for the schema CitationUsagePageLoad, as before (similar to the second round of data collection: T203253).
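For illustration only, here is a minimal sketch of how per-schema sampling at these rates could be decided deterministically. The token-hashing approach, the `in_sample` helper, and the idea of sampling per token are assumptions made for the example; this is not the actual instrumentation code.

```python
import hashlib

# Rates quoted in the comment above; 33.3% is approximated as 1/3.
SAMPLING_RATES = {
    "CitationUsage": 1.0,
    "CitationUsagePageLoad": 1 / 3,
}


def in_sample(schema: str, token: str) -> bool:
    """Decide whether a (hypothetical) token is sampled for a schema.

    The token is hashed into a 64-bit bucket, and the bucket is compared
    against the schema's rate scaled to the same range, so the decision
    is stable for a given schema and token.
    """
    digest = hashlib.sha256(f"{schema}:{token}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < SAMPLING_RATES[schema] * 2**64
```

With a rate of 1.0 every token is in the sample, and with a rate of 1/3 roughly a third of the tokens are.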
@Capt_Swing I'll update the code.
Thu, Mar 14
Wed, Mar 13
A candidate task for later.
@leila I'm afraid I cannot manage more than two students. According to The DOs and DON’Ts of Google Summer of Code: Mentor Edition, even in my second year of mentoring, it's not advised to mentor more than one student.
Tue, Mar 12
@Dantraztrev thanks for the interest, but we already have two students for this project. If a task is assigned to someone, you cannot work on that task.
@Subhang65523 thanks for the interest. We already have two students who want to work on this project. Please check out other projects.
Fri, Mar 8
@Ottomata I need a little help. So here's the situation. The code to generate python whl files is in research/article-recommender/deploy. I can generate those files and upload them to Archiva no problem; see this for example.
Thu, Mar 7
I was not aware of the DOI case. Thanks for bringing it up. I think in that case it makes sense to use the URL only and ignore the external flag. It will probably take a couple of weeks to get this adjustment made in code and shipped to production. If time is of concern, then let's derive whether a link is external or not during analysis. Please let me know what you prefer.
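A rough sketch of what deriving external-ness from the URL alone could look like during analysis. The `is_external` helper, the `page_host` default, and the list of internal domain suffixes are all assumptions for the example, not the shipped logic.

```python
from urllib.parse import urlsplit

# Hypothetical list of domains treated as internal; the real set is an assumption.
INTERNAL_SUFFIXES = ("wikipedia.org", "wikimedia.org", "wikidata.org")


def is_external(href: str, page_host: str = "en.wikipedia.org") -> bool:
    """Classify a link as external using only its href, ignoring any markup flag."""
    host = urlsplit(href).hostname
    if host is None:
        # Relative links (/wiki/Foo) and fragments (#cite_note-1) have no host,
        # so they stay on the current site.
        return False
    host = host.lower()
    if host == page_host:
        return False
    # A host is internal if it equals, or is a subdomain of, a listed suffix.
    return not any(
        host == suffix or host.endswith("." + suffix)
        for suffix in INTERNAL_SUFFIXES
    )
```

Under these assumptions a protocol-relative link such as `//doi.org/...` is classified by its host just like an absolute URL, so the result does not depend on how the link was marked up.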
I was relying on a local virtual environment on stat1007. I have refactored the code into a package and uploaded it to PyPI so that I can make it a dependency of the Oozie script. This way I can have a simple entry point for Oozie that depends on this external package. If we want to add more recommendation types, or improve the article recommendation code, then all I have to do is update the package and point Oozie to the new version.
Wed, Mar 6
If so, perhaps instead of relying on markup for this notion of external-ness, we should consider calculating it based on each link's href attribute? @bmansurov, what do you think?
I think we should use both signals (i.e. the 'external' flag and the link URL) because other than the bug mentioned (T217567), the 'external' flag is pretty accurate. It's unfortunate that a related bug (T13477) has been open for many years and won't be fixed any time soon.
@Nuria having thought about your comment, I think I misunderstood you the first time. I think you mean I should explore whether the way Discovery is doing this is applicable to the research's use case. Please let me know if I got it wrong this time too.
The survey has been undeployed.
Tue, Mar 5
@Capt_Swing yep, 3/18 sounds good.
Yes, that's correct.
@Shivanshbindal9 awesome. Feel free to assign it to yourself.
@Shivanshbindal9 welcome! Usmanmuhd is starting with T216721: Remove duplicate Wikidata items from article recommendations, so perhaps you can start with T216750: Article recommendation API: replace WDQS with MW API? Take a look and let me know if that's something you want to work on. Alternatively, take a look at T215222: Recommendation API translation endpoint stopped working too.
@Usmanmuhd sure! Depending on how far we go, I can add more tasks in the future.
@Usmanmuhd on IRC you mentioned that you'd submit a patch for this task. If you started working on the task, feel free to assign it to yourself. Also feel free to ask questions here, on IRC, or via email.
Turns out a wikilink cannot be used in this case; that's why the template was forced to use the external link syntax.
Mon, Mar 4
@Aklapper thanks for the links. For posterity, I've submitted my request at https://en.wikipedia.org/w/index.php?title=Template_talk:No_article_text&type=revision&diff=886164298&oldid=863150939&diffmode=source.
Hi @RyanSteinberg. Thanks for the analysis.
It's bmansurov at wikimedia dot org.
@Usmanmuhd sure, email is fine. You can also join us at #wikimedia-research on freenode. It's probably best to keep these tasks separate and create a parent task.
@Stabgan would you be interested in working on this task too?
@Usmanmuhd would you be interested in working on this task too?
@Stabgan take a look at the task and let me know if you're interested in working on it.
Fri, Mar 1
I've updated the readme. Here's the file: https://analytics.wikimedia.org/datasets/one-off/article-recommender/20181130.tar.gz
The patch's been merged. Thanks for the fix, Jon!
@Usmanmuhd, welcome! The link's been fixed. There are no micro tasks for this project, unless you want to split up the work into meaningful parts and work on them separately. But I think the project is self-contained.
Thu, Feb 28
@Isaac OK, thanks for looking into this. I think it's worth keeping in mind when doing analysis. If we want an event triggered on right-click + "open in new tab", then we should create a task for it.
@srishakatux thanks for the feedback. Please let me know if anything else is needed for this task to be considered ready for work.
Wed, Feb 27
Interestingly, opening the survey in a new tab did not trigger the QuickSurveysResponses schema, as far as I could tell from the client side (by watching the network tab in the web console). That's odd, though not a blocker: the QuickSurveyInitiation schema gives not all, but enough, information in most cases.
It may be worth creating a separate task for this.
Yes, the message was wrong initially and we fixed it later. Thanks for helping debug the issue, Jon!
That's correct, we're talking about the reader demographics survey. Strangely, I don't see the error in the console and mw.msg('Reader-demographics-1-privacy') is returning something that looks like this: