RyanSteinberg
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 10 2018, 4:22 AM (189 w, 5 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
RyanSteinberg

Recent Activity

Sep 5 2019

RyanSteinberg added a comment to T232068: notebook1004 - /srv is full.

I just deleted some files and I'm compressing others. I didn't realize space was so tight ... my apologies.

Sep 5 2019, 1:45 PM · SRE, Analytics-Clusters

Mar 22 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

@Lauren.maggio can you make sure someone on your team is ready to start assessing the quality of the data when we start the staging and data collection on 2019-03-20? We should aim to catch any major issues in the first 24 hours, please.

Mar 22 2019, 6:49 AM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Mar 13 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Hi @leila. I don't think we ever collectively defined what an external link was in our schema. Using the external class, in my opinion, is a large problem that negatively impacts the strength of our research. I'm not sure how to account for it in analysis since the data for extClick events would miss a considerable number of links that editors clearly intend to be external reference links.

Mar 13 2019, 3:16 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Mar 8 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

My team discussed this today and reached consensus that comparing links with the document's hostname is preferred. This new definition of external seems cleaner and well worth the wait. Of course I'm happy to hear from others if there are objections to this change. Thank you!

Mar 8 2019, 4:41 AM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Mar 7 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

The instrumentation code only reports extClick events on links explicitly coded with class external. It's simple to exclude internal links that were miscoded as external, but what about the reverse? Links that are coded as internal but are really external won't be represented in click data at all. It looks like interwiki links are a potential problem here. For example, interwiki doi links get the class extiw, not external, so they would be missed. See ref 5 on Diamantane for an example. Interwiki doi alone represents a good number of links that my team would surely think of as external: see first 500. And reviewing the interwikimap, I see other base interwiki hostnames that seem "external": merriam-webster.com, handle.net, google.com, etc.

Mar 7 2019, 2:04 AM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic
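
Editor's sketch: the two definitions of "external" discussed in the Mar 7 and Mar 8 comments can be compared side by side. This is a minimal illustration, not the actual instrumentation code. The function names, the example DOI, and the page URL are hypothetical; only the class names external and extiw and the idea of comparing a link's hostname with the document's hostname come from the comments.

```python
from urllib.parse import urlparse


def is_external_by_class(css_classes):
    """Class-based rule: only links carrying the `external` class count."""
    return "external" in css_classes


def is_external_by_hostname(href, document_url):
    """Hostname-based rule: external when the link's host differs from the page's host.

    Assumes `document_url` is an absolute URL (hypothetical helper).
    """
    link_host = urlparse(href).hostname
    page_host = urlparse(document_url).hostname
    # A relative link has no hostname, so it stays on the page's own host.
    if link_host is None:
        return False
    return link_host != page_host


# Hypothetical interwiki doi link: it carries `extiw`, not `external`.
page = "https://en.wikipedia.org/wiki/Diamantane"
doi_href = "https://doi.org/10.1234/example"  # made-up DOI for illustration
doi_classes = ["extiw"]

missed_by_class_rule = not is_external_by_class(doi_classes)
caught_by_hostname_rule = is_external_by_hostname(doi_href, page)
```

Under these assumptions the class-based rule skips the doi link while the hostname comparison flags it, which is the gap the comments describe.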

Mar 6 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Sorry guys, it's hard for me to find time for this project during the week.
New review: https://github.com/ryanmax/wiki-citation-usage/blob/master/data-regression-2019-03-05.ipynb

Mar 6 2019, 2:05 AM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Mar 1 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Hi @bmansurov. I reviewed data today, specifically looking at section_id and freely_accessible elements.

Mar 1 2019, 9:34 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Feb 25 2019

RyanSteinberg reopened T209298: access to analytics-privatedata-users for @toddleroux, @Afandian, & @RyanSteinberg as "Open".

I'm reopening this so someone can take a look at @toddleroux's access issue. See above. His ssh-key entry appears to be missing a key type (ssh-rsa) at the beginning of this line: https://github.com/wikimedia/puppet/blob/production/modules/admin/data/data.yaml#L3123

Feb 25 2019, 5:26 AM · Patch-For-Review, SRE-Access-Requests, SRE, Research-Programs
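
Editor's sketch: the problem described in the comment above, a public key entry that lacks its leading key type, can be caught with a simple check. This is an illustrative helper only, not part of the puppet repository. The function name and the key material are placeholders, and the list of key types is not exhaustive.

```python
# Common OpenSSH public key type prefixes (non-exhaustive).
KNOWN_KEY_TYPES = (
    "ssh-rsa",
    "ssh-dss",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
)


def has_key_type(entry):
    """Return True when the entry starts with a known key type followed by key data."""
    fields = entry.split()
    return len(fields) >= 2 and fields[0] in KNOWN_KEY_TYPES


# Placeholder key material, not a real key.
good_entry = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC user@example"
bad_entry = "AAAAB3NzaC1yc2EAAAADAQABAAABAQC user@example"  # key type missing

good_ok = has_key_type(good_entry)
bad_ok = has_key_type(bad_entry)
```

An entry like bad_entry, which begins directly with the base64 data, fails the check because its first field is not a recognized key type.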
RyanSteinberg reopened T209298: access to analytics-privatedata-users for @toddleroux, @Afandian, & @RyanSteinberg, a subtask of T171231: [Objective 11.1.2] Research on citation/external link usage, as Open.
Feb 25 2019, 5:25 AM · Epic, Research-Programs

Feb 12 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Hi @Miriam. This sounds good to me. I should have some time to look at the sampled data this Friday. Thank you!

Feb 12 2019, 7:56 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Feb 6 2019

RyanSteinberg added a comment to T214727: LDAP nda access for afandian2 and toddleroux.

Excellent! Thank you, @Dzahn. I think this ticket can be closed.

Feb 6 2019, 7:35 PM · LDAP-Access-Requests
RyanSteinberg added a comment to T214727: LDAP nda access for afandian2 and toddleroux.

Thank you for your help, @Dzahn. Like me, these two users are both formal collaborators and should already have NDAs on file as per T209298. Again, thank you for your help.

Feb 6 2019, 6:38 PM · LDAP-Access-Requests

Jan 25 2019

RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

@bmansurov I don't think I have access to deployment-eventlog05.deployment-prep.eqiad.wmflabs or any of the wmflabs machines.

Jan 25 2019, 9:29 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic
RyanSteinberg created T214727: LDAP nda access for afandian2 and toddleroux.
Jan 25 2019, 9:03 PM · LDAP-Access-Requests
RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Hi @bmansurov, I interacted with a beta cluster page and expected to see usage data flow into event.citationusage. Am I looking in the right place, or do I just need to be more patient?

Jan 25 2019, 8:31 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Jan 24 2019

RyanSteinberg added a comment to T212937: Citation Usage instrumentation issues.

@Lauren.maggio and I just discussed citation_identifier_label. Using a more comprehensive list of identifiers from citation style 1 would improve the data quality of this element. That said, we don't think we'll be able to make meaningful use of it and recommend dropping the element altogether. @bmansurov can you remove citation_identifier_label? Thank you for your patience on this one.

Jan 24 2019, 6:08 PM · MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Research
RyanSteinberg added a comment to T213969: Citation Usage: run third round of data collection.

Thank you for the testing instructions @bmansurov. I will plan to review beta cluster pages and test data tomorrow. I'm still waiting to hear back from my team on T212937#4893106 and whether or not citation_identifier_label data is useful. In the meantime, I will update that task with a more comprehensive list of identifiers that I think should be used if citation_identifier_label remains. Sorry this has taken so long.

Jan 24 2019, 5:42 PM · MW-1.36-notes (1.36.0-wmf.9; 2020-09-15), Performance-Team, Analytics-Radar, Research, Patch-For-Review, Epic

Jan 18 2019

RyanSteinberg updated subscribers of T212937: Citation Usage instrumentation issues.

@bmansurov
Thank you for the fix to *freely_accessible*

Jan 18 2019, 6:33 PM · MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Research
RyanSteinberg added a comment to T212937: Citation Usage instrumentation issues.

@RyanSteinberg re: *citation_in_text_refs*,

As for the implementation, we're looking at the number of backlinks preceding the external link.

Jan 18 2019, 6:14 PM · MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Research

Jan 15 2019

RyanSteinberg updated the task description for T213768: LDAP nda access for RyanSteinberg.
Jan 15 2019, 4:32 PM · LDAP-Access-Requests

Jan 14 2019

RyanSteinberg created T213768: LDAP nda access for RyanSteinberg.
Jan 14 2019, 10:53 PM · LDAP-Access-Requests

Nov 20 2018

RyanSteinberg added a comment to T209298: access to analytics-privatedata-users for @toddleroux, @Afandian, & @RyanSteinberg.

wikitech info for @RyanSteinberg

Nov 20 2018, 10:01 PM · Patch-For-Review, SRE-Access-Requests, SRE, Research-Programs

Nov 14 2018

RyanSteinberg added a comment to T209298: access to analytics-privatedata-users for @toddleroux, @Afandian, & @RyanSteinberg.

public key:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBN1rS7OObcft7lDa9+H45kLfkdGHwlJ6rL2Fm2IPsMB

Nov 14 2018, 5:26 PM · Patch-For-Review, SRE-Access-Requests, SRE, Research-Programs

Nov 13 2018

RyanSteinberg added a comment to T209298: access to analytics-privatedata-users for @toddleroux, @Afandian, & @RyanSteinberg.

I don't seem to have access to Office Wiki and I don't see an option to create an account. Should I share my public SSH key here or wait for Office Wiki access?

Nov 13 2018, 10:32 PM · Patch-For-Review, SRE-Access-Requests, SRE, Research-Programs

Apr 10 2018

RyanSteinberg updated subscribers of T191086: Instrument and collect data via CitationUsage schema.

@leila, a few questions:

  1. Is sectionID the ID of an HTML element? For example, for a link appearing on https://en.wikipedia.org/wiki/Book#History, is the ID "History"?
Apr 10 2018, 5:55 AM · Research, Discovery-Search, MW-1.32-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), Research-Archive, Performance-Team (Radar), Research-2017-18-Q4