Create a one-off dump of enwiki user table
Closed, Declined (Public)

Description

As part of a sockpuppet investigation, I need a list of all registered usernames on enwiki (excluding any that have been suppressed). This is essentially what T51132 is about. Would it be possible to get a one-off export in a file somewhere that's visible on spi-tools-host-1.spi-tools.eqiad1.wikimedia.cloud? For my immediate purposes, I only really need the names, so one name per line in a text file would be ideal, but a SQL or XML dump would probably be OK too.

I have a vague recollection that I've seen a username a long time ago which is similar to some of the usernames in a current SPI report. My plan is to use standard unix command-line tools to slice and dice the username list in various ways hoping to find it.

Event Timeline

Hi, who is asked to do this / which project tag should this task have so someone could find it? Thanks!

@Cryptic showed me how to do this on my own:

for i in $(seq 0 42); do
  echo "$i"
  echo "SELECT user_name FROM user WHERE user_id BETWEEN ${i}000000 AND ${i}999999;" \
    | sql enwiki_p | tail -n +2 | gzip -c > "usernames-$i.gz"
done

so no longer needed.
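
For the "slice and dice" step, here is a minimal sketch of searching the compressed chunks. It assumes the files are named usernames-N.gz as in the loop above and hold one name per line. The function name search_usernames and its arguments are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: search every usernames-*.gz chunk in a directory
# for names matching an extended regular expression, ignoring case.
# $1 = regex, $2 = directory holding the chunks (defaults to the current one).
search_usernames() {
  pattern=$1
  dir=${2:-.}
  for f in "$dir"/usernames-*.gz; do
    # Skip the unexpanded glob when no chunk files exist.
    [ -e "$f" ] || continue
    # grep exits 1 when a chunk has no match; that is not an error here.
    gzip -dc "$f" | grep -Ei -- "$pattern" || true
  done
}
```

For example, search_usernames '^bob[0-9]+$' would list every name in the current directory's chunks that consists of "bob" (in any case) followed only by digits.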

Legoktm subscribed.

I have a vague recollection that I've seen a username a long time ago which is similar to some of the usernames in a current SPI report. My plan is to use standard unix command-line tools to slice and dice the username list in various ways hoping to find it.

Surely there are better ways to filter the user table through MariaDB queries with LIKE and RLIKE instead of dumping the entire table to filter through it...
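
A rough sketch of what that server-side filtering could look like, assuming the same user table and user_name column used in the loop earlier in this thread. The helper name username_query is hypothetical, and the patterns are placeholders rather than anything from the actual investigation.

```shell
# Hypothetical helper: print a query that filters user_name on the server.
# $1 = LIKE pattern (% matches any run of characters, _ matches one)
# $2 = optional RLIKE regular expression (RLIKE is a synonym for REGEXP)
# Patterns are inserted verbatim, so they must not contain single quotes.
username_query() {
  like=$1
  rlike=${2:-}
  printf "SELECT user_name FROM user WHERE user_name LIKE '%s'" "$like"
  if [ -n "$rlike" ]; then
    printf " AND user_name RLIKE '%s'" "$rlike"
  fi
  printf ';\n'
}
```

The output could then be piped to the same sql wrapper as before, for example: username_query 'Example%' '[0-9]{4}$' | sql enwiki_p | tail -n +2. That returns only the matching names instead of the whole table.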

But since you already figured out how to make your own dump, will decline.