AikoChou
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 2 2019, 10:06 AM (173 w, 2 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
AikoChou [ Global Accounts ]

Recent Activity

Apr 25 2022

AikoChou added a comment to T287056: Deploy Outlinks topic model to production.

Hi @Isaac, glad to start working on this. :) I am currently working on a sub-task to complete the HTTP error handling code. After that, we should be ready to deploy it. I will let you know if we see any issues.

Apr 25 2022, 7:50 PM · Machine-Learning-Team (Active Tasks), Lift-Wing

Mar 21 2022

AikoChou added a comment to T300270: Return meaningful HTTP responses in Lift Wing's revscoring backends.

Hmm, editquality image is also not the latest version. Need to deploy as well.

Mar 21 2022, 6:31 PM · Patch-For-Review, Machine-Learning-Team (Active Tasks), Lift-Wing
AikoChou added a comment to T300270: Return meaningful HTTP responses in Lift Wing's revscoring backends.

Checked deployment-chart, only editquality has been deployed.

Mar 21 2022, 6:20 PM · Patch-For-Review, Machine-Learning-Team (Active Tasks), Lift-Wing

Aug 26 2021

AikoChou added a comment to T287317: Add an image: count image suggestions without infoboxes.

The following files are samples of the image suggestions for articles that have no infoboxes for each of the wikis we counted. :)

Aug 26 2021, 10:38 AM · Image-Suggestions, Growth-Team (Current Sprint), Growth-Structured-Tasks

Aug 19 2021

AikoChou added a comment to T287317: Add an image: count image suggestions without infoboxes.

The data I posted is not inclusive of those changes. These numbers were calculated based on an older set of image recommendations from 2021-04.

Aug 19 2021, 6:28 AM · Image-Suggestions, Growth-Team (Current Sprint), Growth-Structured-Tasks

Aug 9 2021

AikoChou added a comment to T287317: Add an image: count image suggestions without infoboxes.

Update -- we excluded all kinds of infobox by filtering using Q19887878. For cebwiki, the count drops from 99% to 3% of unillustrated articles that have no infobox, and the count for other wikis has also dropped.

Aug 9 2021, 5:38 AM · Image-Suggestions, Growth-Team (Current Sprint), Growth-Structured-Tasks

Jul 30 2021

AikoChou added a comment to T287317: Add an image: count image suggestions without infoboxes.

The following table is the preliminary result:
https://docs.google.com/spreadsheets/d/1JGDOmZ16L3La-l82rhKAD2IhoaCQfcNICNQ90paH54U

Jul 30 2021, 6:36 PM · Image-Suggestions, Growth-Team (Current Sprint), Growth-Structured-Tasks

Jul 29 2021

AikoChou added a comment to T287317: Add an image: count image suggestions without infoboxes.

The task is ongoing, but it may take longer than expected.

Jul 29 2021, 5:57 PM · Image-Suggestions, Growth-Team (Current Sprint), Growth-Structured-Tasks

Jul 2 2021

AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Weekly update:

Jul 2 2021, 2:43 PM · Research, Structured-Data-Backlog, MachineVision

Jun 18 2021

AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Weekly updates:

Jun 18 2021, 3:06 PM · Research, Structured-Data-Backlog, MachineVision

Jun 11 2021

AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Weekly updates:
We confirmed that (1) how the input data is formatted and (3) the function used to transform the Keras model to an Estimator are not the cause of the poor Estimator performance, as we trained a CNN model from scratch that reaches the same performance in both Keras and Estimator.

Jun 11 2021, 4:24 PM · Research, Structured-Data-Backlog, MachineVision

May 30 2021

AikoChou awarded T283980: Phacility (Maintainer of Phabricator) is winding down. Upstream support ending. a Burninate token.
May 30 2021, 5:12 PM · Release-Engineering-Team (Seen), User-Matthewrbowker, Phabricator

May 21 2021

AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Weekly updates:
We wrote documentation for the distributed image inference workflow in the GitHub repo and provided three tasks as examples: image quality inference, face detection, and ResNet feature extraction. With regard to distributed training using tf-yarn, we are looking for an alternative to wrapping a Keras model in an Estimator to solve the accuracy issue.

May 21 2021, 4:21 PM · Research, Structured-Data-Backlog, MachineVision

May 18 2021

Ladsgroup awarded T276407: An End-to-End Image Classification Pipeline a Yellow Medal token.
May 18 2021, 5:39 AM · Research, Structured-Data-Backlog, MachineVision

May 3 2021

elukey awarded T276407: An End-to-End Image Classification Pipeline a Party Time token.
May 3 2021, 10:29 AM · Research, Structured-Data-Backlog, MachineVision
AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Weekly update:

May 3 2021, 9:16 AM · Research, Structured-Data-Backlog, MachineVision

Apr 29 2021

AikoChou created P15638 image classification gpu.
Apr 29 2021, 6:44 AM

Apr 6 2021

AikoChou added a comment to T277828: Investigate placeholder image recommendation.

For point 1, I calculated the number of overlapping images in allowed_images and image_placeholders as follows:

Apr 6 2021, 8:12 AM · Growth-Team-Filtering, Image-Suggestions, Growth-Team
AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

I want to use tf-yarn to train a simple model on the cluster, but I found that some environment variables need to be set up, which are described in this doc:

  • JAVA_HOME: /usr/bin/java
  • HADOOP_HDFS_HOME: /usr/bin/hdfs
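
A minimal sketch of setting these variables from Python before launching a job, assuming the two paths listed above are correct for the cluster. The helper name `apply_env` is hypothetical and is not part of tf-yarn:

```python
import os

# Values copied from the list above; adjust them to the cluster's actual layout.
REQUIRED_ENV = {
    "JAVA_HOME": "/usr/bin/java",
    "HADOOP_HDFS_HOME": "/usr/bin/hdfs",
}


def apply_env(defaults, environ=os.environ):
    """Set each variable only if it is not already defined.

    Returns a dict of the variables that were actually set, so the caller
    can log what changed. `apply_env` is a hypothetical helper.
    """
    applied = {}
    for name, value in defaults.items():
        if name not in environ:
            environ[name] = value
            applied[name] = value
    return applied
```

Calling `apply_env(REQUIRED_ENV)` at the top of a launcher script leaves any values already exported in the shell untouched.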
Apr 6 2021, 5:14 AM · Research, Structured-Data-Backlog, MachineVision

Mar 27 2021

AikoChou added a comment to T277828: Investigate placeholder image recommendation.

I updated the code in the GitHub repo (in the branch) that improves filtering out placeholders. The workflow is as follows - first use PetScan to search all the subcategories from Category:Image_placeholders (https://petscan.wmflabs.org/?psid=18699732). Next, query for all images from those categories in Hive. Then, exclude these images when querying for candidates in both wikidata commons category (fewer cases) and other wikis (many cases).
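
The final exclusion step of the workflow above could look roughly like the following. This is only an illustrative sketch, not the code in the repo: the function name, the dictionary layout of the candidates, and the file names are all assumptions, and the real filtering is described as happening in Hive queries.

```python
def exclude_placeholders(candidates, placeholder_images):
    """Drop candidate images that appear in the placeholder set.

    candidates: dict mapping an article title to a list of image file names
                (hypothetical layout for illustration).
    placeholder_images: iterable of file names collected from the
                        Category:Image_placeholders subcategories.
    """
    blocked = set(placeholder_images)  # set lookup keeps the filter linear
    return {
        article: [image for image in images if image not in blocked]
        for article, images in candidates.items()
    }


if __name__ == "__main__":
    sample = {"Some article": ["Photo.jpg", "Placeholder.svg"]}
    print(exclude_placeholders(sample, ["Placeholder.svg"]))
```

Articles whose candidates are all placeholders end up with an empty list, which makes it easy to count how many recommendations the filter removes.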

Mar 27 2021, 3:12 PM · Growth-Team-Filtering, Image-Suggestions, Growth-Team

Mar 23 2021

AikoChou added a comment to T274225: Multivariate logistic regression on search scores.

Hi @Cparle - yes of course, there you go:

Mar 23 2021, 2:42 PM · SDAW-MediaSearch (MediaSearch-ImageRecs), Structured-Data-Backlog (Current Work), Image-Suggestions, Structured Data Engineering, WikibaseMediaInfo

Mar 10 2021

AikoChou added a comment to T274878: Estimate the number of images added to each Wiki in a month.

@MMiller_WMF -- here are the results computed using unillustrated articles for which the algorithm has at least one recommendation. Since illustrated articles for February are available to query, I added results for January. Most of them fall within the range of 0.1% ~ 8%. There are two very high numbers, 21.46% and 31.62%, in arzwiki (in previous results, these two months also had relatively high percentages). A scatter plot is shown below that excludes the two outliers, showing the distribution for most wikis.

Mar 10 2021, 9:02 AM · Research (FY2020-21-Research-January-March), Image-Suggestions, Growth-Team

Mar 4 2021

AikoChou updated the task description for T276407: An End-to-End Image Classification Pipeline.
Mar 4 2021, 1:35 AM · Research, Structured-Data-Backlog, MachineVision
AikoChou added a comment to T276407: An End-to-End Image Classification Pipeline.

Summary of the work done so far:

  • Imported the image data locally and saved it to TFRecords files
  • Finetuned an Xception model to classify images between 'sculptures' and 'maiolica'
  • Ran inference on test data locally
Mar 4 2021, 1:27 AM · Research, Structured-Data-Backlog, MachineVision
AikoChou created T276407: An End-to-End Image Classification Pipeline.
Mar 4 2021, 1:08 AM · Research, Structured-Data-Backlog, MachineVision

Mar 2 2021

AikoChou added a comment to T274878: Estimate the number of images added to each Wiki in a month.

Here are estimates of the percentage of unillustrated articles that become illustrated after one month for each target wiki.

Mar 2 2021, 5:08 AM · Research (FY2020-21-Research-January-March), Image-Suggestions, Growth-Team

Feb 22 2021

AikoChou added a comment to T272109: Assess prevalence of Wikidata infoboxes.

Hi all,

Feb 22 2021, 8:02 AM · Research (FY2020-21-Research-April-June), Growth-Team-Filtering, Image-Suggestions, Growth-Team, Wikipedia-Android-App-Backlog

Feb 15 2021

AikoChou added a comment to T272109: Assess prevalence of Wikidata infoboxes.

Hi @MMiller_WMF @Tgr -- it's very nice to meet you too. I'm really happy to have the opportunity to help :D

Feb 15 2021, 8:26 AM · Research (FY2020-21-Research-April-June), Growth-Team-Filtering, Image-Suggestions, Growth-Team, Wikipedia-Android-App-Backlog

Feb 11 2021

AikoChou added a comment to T274225: Multivariate logistic regression on search scores.

Hi all,

Feb 11 2021, 6:00 AM · SDAW-MediaSearch (MediaSearch-ImageRecs), Structured-Data-Backlog (Current Work), Image-Suggestions, Structured Data Engineering, WikibaseMediaInfo

Feb 10 2021

AikoChou added a comment to T274225: Multivariate logistic regression on search scores.

@Miriam Yeah if there is no maximum, it's not appropriate to use normalization. I'll update the result of the non-normalization one.

Feb 10 2021, 1:31 PM · SDAW-MediaSearch (MediaSearch-ImageRecs), Structured-Data-Backlog (Current Work), Image-Suggestions, Structured Data Engineering, WikibaseMediaInfo

Feb 9 2021

AikoChou added a comment to T272109: Assess prevalence of Wikidata infoboxes.

Hi all!
Here are the results after removing icons (.svg). Overall, these numbers drop slightly but do not change much.

Feb 9 2021, 8:43 AM · Research (FY2020-21-Research-April-June), Growth-Team-Filtering, Image-Suggestions, Growth-Team, Wikipedia-Android-App-Backlog
AikoChou created T274225: Multivariate logistic regression on search scores.
Feb 9 2021, 7:18 AM · SDAW-MediaSearch (MediaSearch-ImageRecs), Structured-Data-Backlog (Current Work), Image-Suggestions, Structured Data Engineering, WikibaseMediaInfo

Feb 8 2021

AikoChou added a comment to T272109: Assess prevalence of Wikidata infoboxes.

Hi all,

Feb 8 2021, 8:27 AM · Research (FY2020-21-Research-April-June), Growth-Team-Filtering, Image-Suggestions, Growth-Team, Wikipedia-Android-App-Backlog
AikoChou reopened T273602: Access to analytics-privatedata-users for Research contractor AikoChou as "Open".
Feb 8 2021, 12:05 AM · Research, SRE, SRE-Access-Requests
AikoChou added a comment to T273602: Access to analytics-privatedata-users for Research contractor AikoChou.

Could you double-check that I have LDAP access? I'm not able to access the notebooks.

Feb 8 2021, 12:05 AM · Research, SRE, SRE-Access-Requests

Feb 3 2021

AikoChou added a comment to T273602: Access to analytics-privatedata-users for Research contractor AikoChou.

Hi @CDanis,
My wikitech username: AikoChou
Preferred shell username: aikochou
SSH public key: https://phabricator.wikimedia.org/P14137
I have read and signed the L3 Wikimedia Server Access Responsibilities document.
Thanks! :)

Feb 3 2021, 6:14 AM · Research, SRE, SRE-Access-Requests
AikoChou created P14137 Ai-Jou Chou (AikoChou) production SSH public key.
Feb 3 2021, 6:00 AM

Mar 9 2020

Pavithraes awarded T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia a Love token.
Mar 9 2020, 7:41 AM · Outreachy (Round 19)
AikoChou closed T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia, a subtask of T199190: [2.4] Improve unsourced statement identification tools and algorithms, as Resolved.
Mar 9 2020, 4:54 AM · Knowledge-Integrity, Epic
AikoChou closed T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia as Resolved.

Completed the wrap-up steps:

Mar 9 2020, 4:54 AM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou closed T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia, a subtask of T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia, as Resolved.
Mar 9 2020, 4:53 AM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou closed T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia as Resolved.

Completed the wrap-up steps:

Mar 9 2020, 4:53 AM · Outreachy (Round 19)

Mar 2 2020

srishakatux awarded T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia a Love token.
Mar 2 2020, 11:25 PM · User-ArielGlenn, Research, Outreachy (Round 19)
Pavithraes awarded T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia a Love token.
Mar 2 2020, 2:54 PM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou added a comment to T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

In the last week of the internship, I've been working on:

Mar 2 2020, 7:59 AM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou added a comment to T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

Week 9-10

Mar 2 2020, 7:59 AM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou added a comment to T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

Week 1-8 Summary

Mar 2 2020, 7:58 AM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou added a comment to T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

In the last week of the internship, I've been working on:

Mar 2 2020, 7:56 AM · Outreachy (Round 19)
AikoChou added a comment to T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

Week 9-10

Mar 2 2020, 7:53 AM · Outreachy (Round 19)
AikoChou added a comment to T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

Week 1-8 Summary

Mar 2 2020, 7:49 AM · Outreachy (Round 19)

Jan 22 2020

AikoChou added a comment to T233707: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

Weekly update

  • Modified the input pipeline and the format written to the database.
  • Worked on a script to ingest data into Citation Hunt.
  • Created a pull request for items 1 and 2 for Guilherme to review
Jan 22 2020, 1:57 PM · User-ArielGlenn, Research, Outreachy (Round 19)
AikoChou added a comment to T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.

We are in week 8 now but have moved to week 9 work. Just swapped Citation Hunt work with testing/regular job work. Swapped 9-12 with 6-8. :)

Jan 22 2020, 10:04 AM · Outreachy (Round 19)

Dec 30 2019

AikoChou closed T241585: Facing an issue when loading a Tensorflow model as Resolved.
Dec 30 2019, 9:12 PM · Toolforge
AikoChou added a comment to T241585: Facing an issue when loading a Tensorflow model.

Thank you all for your help!
The issue was solved when I ran it using the grid engine and set the -mem option. :)

Dec 30 2019, 9:11 PM · Toolforge
AikoChou added a comment to T241585: Facing an issue when loading a Tensorflow model.
Dec 30 2019, 1:53 PM · Toolforge
AikoChou created T241585: Facing an issue when loading a Tensorflow model.
Dec 30 2019, 11:10 AM · Toolforge

Dec 28 2019

AikoChou updated the task description for T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.
Dec 28 2019, 8:21 PM · Outreachy (Round 19)
AikoChou created T241518: Outreachy Proposal: A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia.
Dec 28 2019, 8:20 PM · Outreachy (Round 19)

Nov 28 2019

AikoChou updated AikoChou.
Nov 28 2019, 12:52 AM

Oct 14 2019

AikoChou added a comment to T234606: Your second task: classify statements within an article.

@Surlycyborg @Miriam @Samwalton9 I am confused about something and hope you can clear it up.

First, could you define "statement" and "sentence" in the scope of this project?

I see in your paper that you trained on random sentences. But the example in the GitHub repo is doing the prediction on statements, which could be composed of multiple sentences.

Hi @Ghassanmas, I am also confused about the distinction between statement and sentence in this project. Thanks for pointing it out.

Oct 14 2019, 7:33 PM · Outreachy (Round 19)
AikoChou added a comment to T234606: Your second task: classify statements within an article.

Here is my repo:
https://github.com/AikoChou/wikimedia-outreachy-2019

Oct 14 2019, 6:49 PM · Outreachy (Round 19)

Oct 8 2019

AikoChou added a comment to T234519: Your first task: classify sample statements using Citation Needed Models.

Hello! I found it useful to install the packages in a virtual environment, had some issues probably with packages from before, and having a virtual environment solved the "No module found" errors. Here is some info: https://docs.python-guide.org/dev/virtualenvs/

Yes, thanks for the suggestion! By the way, I filed a similar issue in the repository itself a few days ago: https://github.com/mirrys/citation-needed-paper/issues/2. If you (or anyone else reading this) would like to send a Pull Request to update the docs, that would also be a nice little contribution :)

Oct 8 2019, 2:20 AM · Outreachy (Round 19)

Oct 4 2019

AikoChou added a comment to T234606: Your second task: classify statements within an article.

Do we have a deadline for this task? Thanks!

Oct 4 2019, 11:32 AM · Outreachy (Round 19)
AikoChou added a comment to T234519: Your first task: classify sample statements using Citation Needed Models.

I am using :
Python 3.7
Keras 2.2.4
Tensorflow 2.0.0

Oct 4 2019, 3:00 AM · Outreachy (Round 19)

Oct 3 2019

AikoChou added a comment to T234519: Your first task: classify sample statements using Citation Needed Models.

Hello everyone! I'm also participating in this Outreachy round and I'm very interested in working on this Wikipedia project :)
Thanks for the opportunity, @Miriam!
I'm getting the following error when I run the script:

2019-10-03 19:05:10.628372: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found
Traceback (most recent call last):
  File "run_citation_need_model.py", line 17, in <module>
    K.set_session(K.tf.Session(config=k.tf.ConfigProto(intra_op_parallelism_threads=10, inter_op_parallelism_threads=10)))
AttributeError: module 'keras.backend' has no attribute 'tf'

I'm also using Python 3.7, and I have Keras 2.3.0. Maybe this is my issue?
Thanks in advance!

Oct 3 2019, 11:51 PM · Outreachy (Round 19)