See @aetilley's repo
Description
Event Timeline
The file data2.tsv has 19863 samples, but your clusters sum to only 802 samples. Let me look at the code you sent and get back to you.
That's because we only test on reverted edits, and the last column is the reverted status (not a feature). I made this mistake initially too :)
I had understood that we were interested in clustering edits generally, so I just dropped the last column. Aaron, which did you have in mind?
Responded in IRC. Do both! Cluster the entire set and also cluster just the damaging set and compare the difference.
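A rough sketch of the "do both" suggestion, in case it helps. It assumes each row is an already tab-split list of strings whose last column is the reverted status ("True" or "1" for reverted) and whose other columns are numeric features; that layout is a guess, not something confirmed for data2.tsv. The k-means here is a tiny stand-in, not the clustering code from the repo, and all function names are made up for illustration.

```python
import random
from collections import Counter


def split_features(rows):
    """Separate numeric feature columns from the final reverted-status column.

    Assumption: the status is encoded as "True"/"1" for reverted edits.
    """
    features = [[float(v) for v in row[:-1]] for row in rows]
    reverted = [row[-1] in ("True", "1") for row in rows]
    return features, reverted


def kmeans(points, k, iterations=20, seed=0):
    """Minimal Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_sizes(labels):
    """Cluster sizes, largest first."""
    return sorted(Counter(labels).values(), reverse=True)


def compare_clusterings(rows, k=2):
    """Cluster the entire set and the reverted (damaging) subset separately."""
    features, reverted = split_features(rows)
    damaging = [f for f, r in zip(features, reverted) if r]
    return {
        "all": cluster_sizes(kmeans(features, k)),
        "damaging": cluster_sizes(kmeans(damaging, k)),
    }
```

The sizes under "all" should sum to the full sample count and the sizes under "damaging" to the number of reverted rows, which makes a mismatch like 802 versus 19863 easy to spot.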