
LostEnchanter (Liudmila Kalina)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 24 2020, 10:10 AM (185 w, 3 d)
Availability
Available
IRC Nick
lost_enchanter
LDAP User
LostEnchanter
MediaWiki User
LostEnchanter

Recent Activity

Mar 7 2021

LostEnchanter closed T273767: Detect "data" modules, a subtask of T270827: Detect similarity in Lua sourcecodes, as Resolved.
Mar 7 2021, 7:56 PM · Abstract Wikipedia team
LostEnchanter closed T273767: Detect "data" modules as Resolved.
Mar 7 2021, 7:56 PM · Abstract Wikipedia team

Mar 2 2021

LostEnchanter added a comment to T274787: Create web service for providing results of analysis work.

@tanny411 Thanks for the help with Dataframes, they are way faster when used like that.

Mar 2 2021, 11:17 AM · Abstract Wikipedia team
LostEnchanter added a comment to T274787: Create web service for providing results of analysis work.

Instead of hardcoding the list of families, maybe you can use the data acquired from meta_table to get a list of all families as well as languages and display them. To avoid repeated calls, saving them in an array when the app initializes should work.
You understand the app's performance issues better, so I'm leaving the decision to you; let me know where I can jump in.

Mar 2 2021, 8:46 AM · Abstract Wikipedia team
LostEnchanter updated subscribers of T274787: Create web service for providing results of analysis work.

Today was the second day of fighting with rolling out the production version, but I believe I have finally fixed everything for the current version, so the main functionality is working. You can test it at https://abstract-wiki-ds.toolforge.org/
I'm planning to fix the CSS tomorrow, and maybe add a language filter, which is left out for now because it takes too long to check all the entries.

Mar 2 2021, 1:12 AM · Abstract Wikipedia team

Feb 16 2021

LostEnchanter updated the task description for T274787: Create web service for providing results of analysis work.
Feb 16 2021, 9:24 PM · Abstract Wikipedia team

Feb 15 2021

LostEnchanter lowered the priority of T273000: Change database access code to work with replicas redesign from High to Low.
Feb 15 2021, 1:46 PM · Abstract Wikipedia team
LostEnchanter raised the priority of T273000: Change database access code to work with replicas redesign from Low to Medium.
Feb 15 2021, 1:46 PM · Abstract Wikipedia team
LostEnchanter triaged T274787: Create web service for providing results of analysis work as High priority.
Feb 15 2021, 1:46 PM · Abstract Wikipedia team
LostEnchanter moved T274787: Create web service for providing results of analysis work from To triage to Data Science work on the Abstract Wikipedia team board.
Feb 15 2021, 1:46 PM · Abstract Wikipedia team
LostEnchanter created T274787: Create web service for providing results of analysis work.
Feb 15 2021, 1:45 PM · Abstract Wikipedia team
LostEnchanter added a comment to T273000: Change database access code to work with replicas redesign.

Update status:

Feb 15 2021, 1:20 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T273000: Change database access code to work with replicas redesign.
Feb 15 2021, 1:02 PM · Abstract Wikipedia team

Feb 9 2021

LostEnchanter updated the task description for T273000: Change database access code to work with replicas redesign.
Feb 9 2021, 7:56 PM · Abstract Wikipedia team
LostEnchanter claimed T273000: Change database access code to work with replicas redesign.
Feb 9 2021, 5:45 PM · Abstract Wikipedia team

Feb 3 2021

LostEnchanter triaged T273767: Detect "data" modules as Medium priority.
Feb 3 2021, 2:37 PM · Abstract Wikipedia team
LostEnchanter moved T273767: Detect "data" modules from To triage to Data Science work on the Abstract Wikipedia team board.
Feb 3 2021, 2:37 PM · Abstract Wikipedia team
LostEnchanter created T273767: Detect "data" modules.
Feb 3 2021, 2:36 PM · Abstract Wikipedia team

Feb 1 2021

LostEnchanter added a comment to T273000: Change database access code to work with replicas redesign.

The update is live! See T272523 for the new connection scheme.

Feb 1 2021, 2:36 PM · Abstract Wikipedia team

Jan 30 2021

LostEnchanter added a comment to T272003: Analysis of data collected from databases to identify priority modules.

@Quiddity thanks for the info, it was really interesting to find out how this problem was handled. It really looks like encountering errors of this type should not be unusual at all, considering the previous way of storing interwiki links.

Jan 30 2021, 1:23 AM · Abstract Wikipedia team

Jan 29 2021

LostEnchanter added a comment to T272003: Analysis of data collected from databases to identify priority modules.

@Quiddity thank you for this interesting observation! Do you know whether the initial linking to Wikidata pages was done by users or by bots? I'm curious because querying through the API correctly shows that w:tr:Modül:Konum haritası/veri/Polonya is Scribunto-type and belongs to namespace 828.

Jan 29 2021, 11:35 PM · Abstract Wikipedia team

Jan 27 2021

LostEnchanter added a comment to T272003: Analysis of data collected from databases to identify priority modules.

@tanny411 you did a great job creating this report!

Jan 27 2021, 4:10 PM · Abstract Wikipedia team
LostEnchanter added a comment to T272822: Mysql connection lost during query from toolforge.

I've run this script from my local PC using ssh tunnels. I tried different variations:

  1. Connecting through ssh to the meta database and connecting to the enwiki database;
  2. Using pandas to fetch the result and using basic pymysql cursor.fetchall();
  3. Using LIMIT 500 and LIMIT 2 OFFSET 100.
Jan 27 2021, 2:45 PM · Data-Services, Toolforge, Abstract Wikipedia team

Jan 26 2021

LostEnchanter added a comment to T270827: Detect similarity in Lua sourcecodes.

Currently the idea is to use a metric based on the Levenshtein distance as the distance between texts (examples in this notebook or this notebook for bigger cases). The current idea for the closeness detection algorithm looks like this:
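
The algorithm itself is not reproduced in this preview. Purely as an illustration of the kind of metric mentioned, here is a minimal sketch of a Levenshtein distance normalized by the length of the longer text. The function names `levenshtein` and `normalized_distance` and the choice of normalization are assumptions made for this example, not the project's actual code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Hypothetical normalization: 0.0 for identical texts, 1.0 for fully different."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0
```

Dividing by the longer length keeps the score comparable between short and long module sources, which matters when one threshold is applied across modules of very different sizes.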

Jan 26 2021, 8:30 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T273000: Change database access code to work with replicas redesign.
Jan 26 2021, 6:26 PM · Abstract Wikipedia team
LostEnchanter triaged T273000: Change database access code to work with replicas redesign as High priority.
Jan 26 2021, 6:26 PM · Abstract Wikipedia team
LostEnchanter moved T273000: Change database access code to work with replicas redesign from To triage to Data Science work on the Abstract Wikipedia team board.
Jan 26 2021, 6:26 PM · Abstract Wikipedia team
LostEnchanter created T273000: Change database access code to work with replicas redesign.
Jan 26 2021, 6:25 PM · Abstract Wikipedia team
LostEnchanter added a comment to T272003: Analysis of data collected from databases to identify priority modules.

@tanny411 So, yes, my logic is something like that: they are all different, and it looks like all of them have "?" in the title. Can we drop them, or is there something I'm missing?

Jan 26 2021, 4:45 PM · Abstract Wikipedia team
LostEnchanter added a comment to T272003: Analysis of data collected from databases to identify priority modules.

@tanny411 I've been looking through your notebook and there are things I've seen previously too. Modules like Module:inc-ash/dial/data/?? and Module:zh/data/ltc-pron/? all refer to different translation and/or pronunciation information for the word. They would have different source code, that's logical, and I'm not sure we want to analyze them at all. At the same time, in my tests they are usually detected by Levenshtein distance quite easily, so it might not be worth the work to drop them.

Jan 26 2021, 4:32 PM · Abstract Wikipedia team

Jan 20 2021

LostEnchanter closed T271957: Transclusion in Lua modules might not always show up as Resolved.
Jan 20 2021, 9:37 PM · MediaWiki-Templates
LostEnchanter updated subscribers of T271957: Transclusion in Lua modules might not always show up.

@LostEnchanter I think I understand. I believe what you may be witnessing here is that no page uses Module:yesno (lowercase 'y') indirectly (e.g., a page using a template #invokeing some module which in turn requires yesno) on hiwiktionary, and therefore no MediaWiki parser hooks (and manually executed or scheduled link refresh jobs) run in a context where a determination is made to insert an entry into templatelinks.

Jan 20 2021, 9:37 PM · MediaWiki-Templates

Jan 14 2021

LostEnchanter triaged T272003: Analysis of data collected from databases to identify priority modules as High priority.
Jan 14 2021, 11:15 AM · Abstract Wikipedia team
LostEnchanter moved T272003: Analysis of data collected from databases to identify priority modules from To triage to Data Science work on the Abstract Wikipedia team board.
Jan 14 2021, 11:14 AM · Abstract Wikipedia team
LostEnchanter moved T271382: Request creation of annotation VPS project from Data Science work to To triage on the Abstract Wikipedia team board.
Jan 14 2021, 11:14 AM · cloud-services-team (Kanban), Wikidata, Abstract Wikipedia team, Wikidata Lexicographical data, Cloud-VPS (Project-requests)

Jan 13 2021

LostEnchanter added a comment to T271957: Transclusion in Lua modules might not always show up.

Hi, where exactly can the Templatelinks table be seen (URL)? Please follow https://www.mediawiki.org/wiki/How_to_report_a_bug whenever possible - thanks a lot!

Jan 13 2021, 6:25 PM · MediaWiki-Templates
LostEnchanter created T271957: Transclusion in Lua modules might not always show up.
Jan 13 2021, 4:51 PM · MediaWiki-Templates

Jan 6 2021

LostEnchanter updated the task description for T270827: Detect similarity in Lua sourcecodes.
Jan 6 2021, 2:26 PM · Abstract Wikipedia team

Dec 31 2020

LostEnchanter added a comment to T270492: Collect relevant data about the Modules for analysis.

My idea was that some pages are highly protected and this may mean they are important modules (therefore also used in a lot of places). Those can be prioritized to be centralized.

Dec 31 2020, 3:44 PM · Abstract Wikipedia team
LostEnchanter closed T270826: [Abstract Wikipedia data science] Unification and code commentary as Resolved.
Dec 31 2020, 3:40 PM · Abstract Wikipedia team
LostEnchanter closed T270826: [Abstract Wikipedia data science] Unification and code commentary, a subtask of T270824: [Abstract Wikipedia data science] "Quality-of-life" project modifications, as Resolved.
Dec 31 2020, 3:40 PM · Abstract Wikipedia team
LostEnchanter added a comment to T270492: Collect relevant data about the Modules for analysis.

@tanny411 you did a really good job!

Dec 31 2020, 3:23 PM · Abstract Wikipedia team

Dec 30 2020

LostEnchanter updated the task description for T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 30 2020, 9:22 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 30 2020, 7:46 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 30 2020, 5:37 PM · Abstract Wikipedia team

Dec 29 2020

LostEnchanter updated the task description for T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 29 2020, 12:40 PM · Abstract Wikipedia team
LostEnchanter claimed T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 29 2020, 12:37 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 29 2020, 12:36 PM · Abstract Wikipedia team
LostEnchanter closed T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment, a subtask of T270824: [Abstract Wikipedia data science] "Quality-of-life" project modifications, as Resolved.
Dec 29 2020, 12:24 PM · Abstract Wikipedia team
LostEnchanter closed T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment as Resolved.
Dec 29 2020, 12:24 PM · Abstract Wikipedia team

Dec 25 2020

LostEnchanter claimed T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment.
Dec 25 2020, 4:43 PM · Abstract Wikipedia team
LostEnchanter triaged T270826: [Abstract Wikipedia data science] Unification and code commentary as Low priority.
Dec 25 2020, 4:15 PM · Abstract Wikipedia team
LostEnchanter triaged T270827: Detect similarity in Lua sourcecodes as High priority.
Dec 25 2020, 4:15 PM · Abstract Wikipedia team
LostEnchanter triaged T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment as Medium priority.
Dec 25 2020, 4:15 PM · Abstract Wikipedia team
LostEnchanter moved T270827: Detect similarity in Lua sourcecodes from To triage to Data Science work on the Abstract Wikipedia team board.
Dec 25 2020, 4:05 PM · Abstract Wikipedia team
LostEnchanter created T270827: Detect similarity in Lua sourcecodes.
Dec 25 2020, 4:04 PM · Abstract Wikipedia team
LostEnchanter moved T270826: [Abstract Wikipedia data science] Unification and code commentary from To triage to Data Science work on the Abstract Wikipedia team board.
Dec 25 2020, 2:58 PM · Abstract Wikipedia team
LostEnchanter created T270826: [Abstract Wikipedia data science] Unification and code commentary.
Dec 25 2020, 2:57 PM · Abstract Wikipedia team
LostEnchanter moved T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment from To triage to Data Science work on the Abstract Wikipedia team board.
Dec 25 2020, 2:38 PM · Abstract Wikipedia team
LostEnchanter created T270825: [Abstract Wikipedia data science] Modify code to make it usable from both Toolforge and outside environment.
Dec 25 2020, 2:38 PM · Abstract Wikipedia team
LostEnchanter triaged T270492: Collect relevant data about the Modules for analysis as High priority.
Dec 25 2020, 2:07 PM · Abstract Wikipedia team
LostEnchanter triaged T270824: [Abstract Wikipedia data science] "Quality-of-life" project modifications as Low priority.
Dec 25 2020, 2:07 PM · Abstract Wikipedia team
LostEnchanter triaged T270494: [Abstract Wikipedia data science] Create scripts to fetch Module contents as High priority.
Dec 25 2020, 2:07 PM · Abstract Wikipedia team
LostEnchanter moved T270824: [Abstract Wikipedia data science] "Quality-of-life" project modifications from To triage to Data Science work on the Abstract Wikipedia team board.
Dec 25 2020, 2:07 PM · Abstract Wikipedia team
LostEnchanter created T270824: [Abstract Wikipedia data science] "Quality-of-life" project modifications.
Dec 25 2020, 2:06 PM · Abstract Wikipedia team

Dec 24 2020

LostEnchanter closed T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge, a subtask of T263678: Analyze community authored functions that build Wikipedia infoboxes and more, as Resolved.
Dec 24 2020, 7:04 PM · Abstract Wikipedia team, Outreach-Programs-Projects, Outreachy (Round 21)
LostEnchanter closed T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge as Resolved.
Dec 24 2020, 7:04 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 24 2020, 7:04 PM · Abstract Wikipedia team

Dec 23 2020

LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 23 2020, 11:07 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 23 2020, 11:07 PM · Abstract Wikipedia team
LostEnchanter added a comment to T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.

We could use dbname, but that wasn't saved by the content fetcher. When loading from the database I guess that won't matter, so yes, we can use dbname for sure.

Dec 23 2020, 12:33 PM · Abstract Wikipedia team

Dec 22 2020

LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 22 2020, 8:55 PM · Abstract Wikipedia team
LostEnchanter added a comment to T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.

Faulty Toolforge update today slows things down, sadly...

Dec 22 2020, 7:51 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 22 2020, 7:46 PM · Abstract Wikipedia team
LostEnchanter updated the task description for T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 22 2020, 7:43 PM · Abstract Wikipedia team

Dec 21 2020

LostEnchanter added a comment to T270494: [Abstract Wikipedia data science] Create scripts to fetch Module contents.

I've tried to compare the pages collected by the API and the db (ids and titles only) by ids. I had to go through a LOT of memory errors to run this script.
This is the output:

Length of db pages: 275154
Length of api pages: 274543
Length of unique pages in db: 740 # pages not found from API calls
Length of unique pages in api: 129 # pages not listed from db queries
Ok

It seems there are some discrepancies. I am looking into what these files are and if there's any pattern here.
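
A comparison like the one above can be sketched with set differences over page ids. This is a minimal illustration, not the script that produced the output; the function name `compare_page_ids` and the small sample id lists are hypothetical.

```python
from typing import Dict, Iterable, Set


def compare_page_ids(db_ids: Iterable[int], api_ids: Iterable[int]) -> Dict[str, Set[int]]:
    """Split page ids into those found only in the db, only via the API, or in both."""
    db_set, api_set = set(db_ids), set(api_ids)
    return {
        "only_in_db": db_set - api_set,   # pages not found from API calls
        "only_in_api": api_set - db_set,  # pages not listed from db queries
        "shared": db_set & api_set,       # pages present in both sources
    }


# Hypothetical sample ids, for illustration only.
result = compare_page_ids(db_ids=[1, 2, 3, 4], api_ids=[3, 4, 5])
print(len(result["only_in_db"]), len(result["only_in_api"]), len(result["shared"]))
```

Holding only ids in memory rather than full page contents keeps the footprint small. As a sanity check on the reported counts, assuming ids are unique within each source, both imply the same number of shared pages: 275154 - 740 = 274543 - 129 = 274414.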

@LostEnchanter I think it's a good idea to save data into databases and process it from there. Loading contents gives a couple of errors due to the presence of all kinds of symbols in the code (quotes and commas). Since we are going to use the db anyway, I think it's best not to try to solve all these errors now. (I did spend a good amount of time trying to load the csv to compare page entries with the db, but then I went with a workaround for now.)

Dec 21 2020, 2:29 PM · Abstract Wikipedia team

Dec 19 2020

LostEnchanter added a comment to T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.

For file downloading: scp seems to be working just fine, but I wasn't able to make ssh tunneling work from here.

Dec 19 2020, 6:41 PM · Abstract Wikipedia team

Dec 18 2020

LostEnchanter created T270500: [Abstract Wikipedia data science] Move data storage to database which can be accessed from outside of Toolforge.
Dec 18 2020, 3:26 PM · Abstract Wikipedia team
LostEnchanter created T270494: [Abstract Wikipedia data science] Create scripts to fetch Module contents.
Dec 18 2020, 2:36 PM · Abstract Wikipedia team
LostEnchanter closed T270493: [Abstract Wikipedia data science] Create parser for list of all existing wikis, a subtask of T263678: Analyze community authored functions that build Wikipedia infoboxes and more, as Resolved.
Dec 18 2020, 2:32 PM · Abstract Wikipedia team, Outreach-Programs-Projects, Outreachy (Round 21)
LostEnchanter closed T270493: [Abstract Wikipedia data science] Create parser for list of all existing wikis as Resolved.
Dec 18 2020, 2:32 PM · Abstract Wikipedia team
LostEnchanter created T270493: [Abstract Wikipedia data science] Create parser for list of all existing wikis.
Dec 18 2020, 2:31 PM · Abstract Wikipedia team
LostEnchanter removed a project from T270491: [Abstract Wikipedia data science] Collect relevant data about the Modules for analysis: Abstract Wikipedia team.
Dec 18 2020, 2:24 PM
LostEnchanter closed T270491: [Abstract Wikipedia data science] Collect relevant data about the Modules for analysis as Invalid.
Dec 18 2020, 2:24 PM
LostEnchanter created T270492: Collect relevant data about the Modules for analysis.
Dec 18 2020, 2:24 PM · Abstract Wikipedia team
LostEnchanter created T270491: [Abstract Wikipedia data science] Collect relevant data about the Modules for analysis.
Dec 18 2020, 2:15 PM

Nov 25 2020

LostEnchanter updated LostEnchanter.
Nov 25 2020, 6:51 PM