Deduplicate projects into their own table

Authored by Tgr on Nov 21 2017, 9:16 AM.


Create a reading_list_project table for projects (domains).
Use it to normalize the project column, so that a value such as
'en.wikipedia.org' is not stored a million times, and to validate
projects: the table needs to be pregenerated, and projects that
are not found in it are rejected.
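
The normalize-and-validate idea above can be sketched as follows.
This is only an illustration in Python with an in-memory SQLite
database, not the extension's actual code. Apart from the
reading_list_project table name, every identifier here (the
rlp_*/rle_* columns, the reading_list_entry layout, the helper
functions) is an assumption made for the example.

```python
import sqlite3


def create_schema(conn):
    """Create a lookup table for projects and an entry table that refers to it.

    Column names and the entry table layout are hypothetical.
    """
    conn.executescript(
        """
        CREATE TABLE reading_list_project (
            rlp_id      INTEGER PRIMARY KEY,
            rlp_project TEXT NOT NULL UNIQUE
        );
        CREATE TABLE reading_list_entry (
            rle_id     INTEGER PRIMARY KEY,
            rle_rlp_id INTEGER NOT NULL
                       REFERENCES reading_list_project (rlp_id),
            rle_title  TEXT NOT NULL
        );
        """
    )


def lookup_project_id(conn, project):
    """Return the id of a known project; reject anything not pregenerated."""
    row = conn.execute(
        "SELECT rlp_id FROM reading_list_project WHERE rlp_project = ?",
        (project,),
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown project: {project!r}")
    return row[0]


def add_entry(conn, project, title):
    """Store an entry with the small integer id instead of the project string."""
    project_id = lookup_project_id(conn, project)
    cur = conn.execute(
        "INSERT INTO reading_list_entry (rle_rlp_id, rle_title) VALUES (?, ?)",
        (project_id, title),
    )
    return cur.lastrowid
```

In this sketch each entry row holds only an integer key, and a
project string that is missing from the lookup table never reaches
the entry table.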

The extension is agnostic about what values are supposed to be
used as projects, but a maintenance script is provided which
puts the wikifarm's wgCanonicalServer values (things like
'https://en.wikipedia.org') into the new table.
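
What such a population step might look like is sketched below,
again in Python with SQLite rather than as the real maintenance
script. The populate_projects helper, the rlp_* column names, and
the hard-coded server list are assumptions for illustration; a real
script would read the wgCanonicalServer values from the wikifarm's
configuration.

```python
import sqlite3


def create_project_table(conn):
    """Create the project lookup table (column names are hypothetical)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS reading_list_project (
            rlp_id      INTEGER PRIMARY KEY,
            rlp_project TEXT NOT NULL UNIQUE
        )
        """
    )


def count_projects(conn):
    """Return how many projects are currently in the table."""
    return conn.execute(
        "SELECT COUNT(*) FROM reading_list_project"
    ).fetchone()[0]


def populate_projects(conn, canonical_servers):
    """Insert every server not yet present; return how many rows were added.

    INSERT OR IGNORE together with the UNIQUE constraint makes the step
    safe to re-run, for example after new wikis are created.
    """
    before = count_projects(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO reading_list_project (rlp_project) VALUES (?)",
        [(server,) for server in canonical_servers],
    )
    conn.commit()
    return count_projects(conn) - before


if __name__ == "__main__":
    # Stand-in values; not taken from any real configuration.
    servers = ["https://en.wikipedia.org", "https://de.wikipedia.org"]
    conn = sqlite3.connect(":memory:")
    create_project_table(conn)
    print(populate_projects(conn, servers), "projects added")
```

Because the extension itself does not care what the project values
are, any list of strings could be passed in place of the server
URLs used here.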

Change-Id: I047d1493f1d9f51d733c1925a38c080829589f35
(cherry picked from commit a02d6eba878ef95c69f08cfe45164096e69ccbad)