
aetilley (aetilley)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 10 2015, 4:56 PM (222 w, 2 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Aetilley [ Global Accounts ]

Recent Activity

Jun 30 2016

aetilley committed rTESTREVSCORINGAGAINdbac5d07c0a4: Added docopt.py module to revscoring/revscoring. (authored by aetilley).
Jun 30 2016, 1:22 AM
aetilley committed rTESTREVSCORINGAGAIN5733347aa99c: Merge dbac5d07c0a4783a74d50c1fb3e2f071e5edf5e9 into… (authored by aetilley).
Jun 30 2016, 1:22 AM

Jan 17 2016

aetilley moved T123759: Create Rule and Symbol objects in pcfg.py. Generalize types of rules that can be read into PCFG object. from Backlog to Done on the Scoring-platform-team (Current) board.
Jan 17 2016, 12:17 AM · Scoring-platform-team (Current)
aetilley added a comment to T123759: Create Rule and Symbol objects in pcfg.py. Generalize types of rules that can be read into PCFG object..

Implemented.

Jan 17 2016, 12:16 AM · Scoring-platform-team (Current)

Jan 15 2016

aetilley added a comment to T122728: Determine how to build WP phrase-structure tree-bank..

Redirecting into Project

Jan 15 2016, 6:10 PM · Scoring-platform-team (Current)
aetilley moved T123759: Create Rule and Symbol objects in pcfg.py. Generalize types of rules that can be read into PCFG object. from Active to Backlog on the Scoring-platform-team (Current) board.
Jan 15 2016, 6:04 PM · Scoring-platform-team (Current)
aetilley created T123759: Create Rule and Symbol objects in pcfg.py. Generalize types of rules that can be read into PCFG object..
Jan 15 2016, 6:03 PM · Scoring-platform-team (Current)

Jan 8 2016

aetilley moved T122728: Determine how to build WP phrase-structure tree-bank. from Review to Backlog on the Scoring-platform-team (Current) board.
Jan 8 2016, 5:47 PM · Scoring-platform-team (Current)
aetilley moved T122728: Determine how to build WP phrase-structure tree-bank. from Active to Review on the Scoring-platform-team (Current) board.
Jan 8 2016, 5:47 PM · Scoring-platform-team (Current)

Jan 1 2016

aetilley created T122728: Determine how to build WP phrase-structure tree-bank..
Jan 1 2016, 6:43 PM · Scoring-platform-team (Current)

Dec 18 2015

aetilley moved T121258: Complete beta version of pcfg_scorer and approximate overhead from Backlog to Done on the Scoring-platform-team (Current) board.
Dec 18 2015, 5:39 PM · Scoring-platform-team (Current)
aetilley added a comment to T121258: Complete beta version of pcfg_scorer and approximate overhead.

PCFG object beta complete

Dec 18 2015, 5:39 PM · Scoring-platform-team (Current)

Dec 11 2015

aetilley moved T121258: Complete beta version of pcfg_scorer and approximate overhead from Active to Backlog on the Scoring-platform-team (Current) board.
Dec 11 2015, 7:11 PM · Scoring-platform-team (Current)
aetilley created T121258: Complete beta version of pcfg_scorer and approximate overhead.
Dec 11 2015, 7:06 PM · Scoring-platform-team (Current)
aetilley added a comment to T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies..

Looked at two more papers.

Dec 11 2015, 7:00 PM · Scoring-platform-team (Current)
aetilley moved T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies. from Backlog to Done on the Scoring-platform-team (Current) board.
Dec 11 2015, 6:39 PM · Scoring-platform-team (Current)

Dec 3 2015

aetilley moved T118730: Flake8 of aetilley/sigclust from Review to Done on the Scoring-platform-team (Current) board.
Dec 3 2015, 6:49 PM · User-Ladsgroup, Scoring-platform-team (Current)
aetilley added a comment to T118730: Flake8 of aetilley/sigclust.

Sorry, I just saw this. All done.

Dec 3 2015, 6:49 PM · User-Ladsgroup, Scoring-platform-team (Current)

Nov 28 2015

aetilley added a comment to T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies..

Using large feature sets requires very large datasets to be effective, and the more subtle the content that you're trying to extract (e.g. "sneaky vandalism") the more difficult it is to extract this content from an editor's word choice.

Nov 28 2015, 11:46 PM · Scoring-platform-team (Current)

Nov 20 2015

aetilley moved T118593: Spike -- methods for identifying overfitting/bias/whatever problems in prediction models. from Active to Paused on the Scoring-platform-team (Current) board.
Nov 20 2015, 6:23 PM · Spike, Scoring-platform-team
aetilley moved T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies. from Paused to Backlog on the Scoring-platform-team (Current) board.
Nov 20 2015, 6:23 PM · Scoring-platform-team (Current)
aetilley renamed T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies. from [Spike] Experiment with using bag-of-words badwords features to [Spike] Experiment with using bag-of-words badwords features and general NLP strategies..
Nov 20 2015, 6:22 PM · Scoring-platform-team (Current)

Nov 13 2015

aetilley added a comment to T118004: Compare R sigclust to python sigclust implementation.

Python and R sigclusts giving similar results on enwiki data. See R_read.R in /tests.

Nov 13 2015, 6:11 PM · Scoring-platform-team (Current)

Nov 11 2015

aetilley added a comment to T116403: Testing python sigclust (relationship between full cluster & damaging clusters).

Introducing soft thresholding in python sigclust:

Nov 11 2015, 7:59 AM · Scoring-platform-team (Current)

Nov 6 2015

aetilley added a comment to T118003: [Spike] Figure out why clustering is behaving weird. .

An important realization was that default pre-scaling of the input data (mean centering and normalizing variance to 1) did away with the strange behavior of the simulated CIs being so much lower than the input data CI. The scaling has taken us from always getting a p-value of 1 for the main dataset to always getting a p-value of 0.
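The pre-scaling mentioned in this comment (mean centering each feature and normalizing its variance to 1) can be sketched as follows. This is a minimal illustration, not the code from the sigclust repository: the function name `standardize_columns` and the toy data are hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Mean-center each column and scale it to unit variance.

    `rows` is a list of equal-length lists of floats (one list per sample).
    Columns with zero variance are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Assumption: population standard deviation (divide by n, not n - 1).
    stds = [pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy data: two features on very different scales.
samples = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize_columns(samples)
```

After scaling, every feature contributes on the same footing, so no single large-scale column dominates the variance that a clustering statistic is computed from.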

Nov 6 2015, 6:24 PM · Scoring-platform-team (Current)
aetilley added a comment to T118004: Compare R sigclust to python sigclust implementation.

Python Sigclust and R sigclust gave similar results on enwiki_data.

Nov 6 2015, 6:23 PM · Scoring-platform-team (Current)
aetilley added a comment to T116403: Testing python sigclust (relationship between full cluster & damaging clusters).
Nov 6 2015, 5:06 PM · Scoring-platform-team (Current)
Restricted Application updated subscribers of T116403: Testing python sigclust (relationship between full cluster & damaging clusters).

An important realization this week was that default pre-scaling of the input data (mean centering and normalizing variance to 1) did away with the strange behavior of the simulated CIs being so much lower than the input data CI. The scaling has taken us from always getting a p-value of 1 for the main dataset to always getting a p-value of 0. Thus, we begin clustering.

Nov 6 2015, 5:03 PM · Scoring-platform-team (Current)

Nov 3 2015

aetilley added a comment to T117253: Duplicate clustering with old kmeans strategy.

I had understood that we were interested in clustering edits generally. Thus I just dropped the last column. Aaron, which did you have in mind?

Nov 3 2015, 12:07 AM · User-Ladsgroup, Scoring-platform-team (Current)

Nov 2 2015

aetilley added a comment to T117253: Duplicate clustering with old kmeans strategy.

The file data2.tsv has 19863 samples, but your clusters sum to only 802 samples. Let me look at the code you sent and get back to you.

Nov 2 2015, 10:36 PM · User-Ladsgroup, Scoring-platform-team (Current)

Oct 30 2015

aetilley renamed T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies. from Experiment with using bag-of-words badwords features to [Spike] Experiment with using bag-of-words badwords features.
Oct 30 2015, 5:51 PM · Scoring-platform-team (Current)
aetilley moved T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies. from Paused to Backlog on the Scoring-platform-team (Current) board.
Oct 30 2015, 5:51 PM · Scoring-platform-team (Current)
aetilley claimed T102343: [Spike] Experiment with using bag-of-words badwords features and general NLP strategies..
Oct 30 2015, 5:51 PM · Scoring-platform-team (Current)
aetilley moved T116403: Testing python sigclust (relationship between full cluster & damaging clusters) from Done to Backlog on the Scoring-platform-team (Current) board.
Oct 30 2015, 5:26 PM · Scoring-platform-team (Current)
aetilley moved T116403: Testing python sigclust (relationship between full cluster & damaging clusters) from Backlog to Done on the Scoring-platform-team (Current) board.
Oct 30 2015, 5:26 PM · Scoring-platform-team (Current)

Oct 28 2015

aetilley added a comment to T116403: Testing python sigclust (relationship between full cluster & damaging clusters).
  1. See recently added script R_read.R in sigclust/enwiki_data (see https://github.com/aetilley/sigclust). Call source("R_read.R") in R (inside the "enwiki_data" directory) to apply sigclust to the enwiki data (now titled "data2.tsv") as well as some other artificial data.
Oct 28 2015, 10:15 PM · Scoring-platform-team (Current)

Oct 23 2015

aetilley moved T116403: Testing python sigclust (relationship between full cluster & damaging clusters) from Active to Backlog on the Scoring-platform-team (Current) board.
Oct 23 2015, 5:50 PM · Scoring-platform-team (Current)
aetilley added a project to T116403: Testing python sigclust (relationship between full cluster & damaging clusters): Scoring-platform-team (Current).
Oct 23 2015, 5:50 PM · Scoring-platform-team (Current)
aetilley created T116403: Testing python sigclust (relationship between full cluster & damaging clusters).
Oct 23 2015, 5:46 PM · Scoring-platform-team (Current)
aetilley moved T113761: Draft implementation SigClust in python from Backlog to Done on the Scoring-platform-team (Current) board.
Oct 23 2015, 5:45 PM · Scoring-platform-team (Current)

Oct 16 2015

aetilley added a comment to T113761: Draft implementation SigClust in python.

"Hard Thresholding" variant implemented.
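For context, in the SigClust literature "hard thresholding" usually refers to how the eigenvalues of the sample covariance matrix are adjusted before simulating from the Gaussian null: any eigenvalue below the estimated background-noise variance is raised to that variance, and the rest are left unchanged. The sketch below assumes that meaning. The function name and the example numbers are hypothetical and are not taken from the sigclust repository.

```python
def hard_threshold(eigenvalues, noise_variance):
    """Raise every eigenvalue below `noise_variance` up to `noise_variance`.

    Eigenvalues at or above the noise level are returned unchanged.
    """
    return [max(ev, noise_variance) for ev in eigenvalues]


# Hypothetical sample-covariance eigenvalues and background-noise variance.
sample_eigenvalues = [5.0, 2.0, 0.3, 0.1]
adjusted = hard_threshold(sample_eigenvalues, noise_variance=0.5)
```

The adjusted eigenvalues would then serve as the variances of the null Gaussian from which simulated datasets are drawn.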

Oct 16 2015, 4:55 PM · Scoring-platform-team (Current)
aetilley set Security to default on T113761: Draft implementation SigClust in python.
Oct 16 2015, 4:54 PM · Scoring-platform-team (Current)

Oct 9 2015

aetilley added a comment to T113761: Draft implementation SigClust in python.

Converting algorithm summary into pseudo-code.

Oct 9 2015, 5:23 PM · Scoring-platform-team (Current)

Sep 25 2015

aetilley reopened T113057: Prepare summary of SigClust and other methods for choosing number of clusters. as "Open".
Sep 25 2015, 4:36 PM · Scoring-platform-team (Current)
aetilley closed T113057: Prepare summary of SigClust and other methods for choosing number of clusters. as Resolved.
Sep 25 2015, 4:35 PM · Scoring-platform-team (Current)

Sep 18 2015

aetilley created T113057: Prepare summary of SigClust and other methods for choosing number of clusters..
Sep 18 2015, 5:04 PM · Scoring-platform-team (Current)

Sep 11 2015

aetilley renamed T112303: Review of papers by Tufekci and Sandvig et. al. from Review of papers by Tufekci and Saldvig et. al. to Review of papers by Tufekci and Sandvig et. al..
Sep 11 2015, 10:20 PM · Scoring-platform-team (Current)
aetilley moved T112303: Review of papers by Tufekci and Sandvig et. al. from Review to Done on the Scoring-platform-team (Current) board.
Sep 11 2015, 7:49 PM · Scoring-platform-team (Current)
aetilley moved T112303: Review of papers by Tufekci and Sandvig et. al. from Active to Review on the Scoring-platform-team (Current) board.
Sep 11 2015, 7:49 PM · Scoring-platform-team (Current)
aetilley added a comment to T112303: Review of papers by Tufekci and Sandvig et. al..

The Sandvig paper did make brief mention of feedback mechanisms which seem to be pertinent to our considerations.

Sep 11 2015, 7:39 PM · Scoring-platform-team (Current)
aetilley added a comment to T112303: Review of papers by Tufekci and Sandvig et. al..

Tufekci's paper is mostly expository of other studies, but the studies that she mentions are truly fascinating. aetilley has never had a Facebook account, but was intrigued by the possibilities that Tufekci mentions.
Sandvig et al. seem to be a diverse group of experts taking many pages to say something which is more or less obvious, but perhaps it bears repeating. There is a distinction between a function, an algorithm for computing a function, and a specific implementation of an algorithm. Racism, and bias in general, can creep in at more than one level.
A mantra that kept coming to mind while reading these was "strive for open algorithms and open training sets." The principal barrier here is in determining the level of detail at which to describe an algorithm or dataset to a user who is most likely non-technical, or at which to let that user specify their own personal algorithm.

Sep 11 2015, 7:12 PM · Scoring-platform-team (Current)
aetilley created T112303: Review of papers by Tufekci and Sandvig et. al..
Sep 11 2015, 5:25 PM · Scoring-platform-team (Current)

Sep 1 2015

aetilley updated the task description for T107599: Arthur's init for revscoring.
Sep 1 2015, 7:49 AM · Scoring-platform-team (Current)
aetilley closed T107599: Arthur's init for revscoring as Resolved.
Sep 1 2015, 7:48 AM · Scoring-platform-team (Current)