import all lists with the script we wrote for that
Closed, Invalid (Public)

Description

after fermium is reinstalled:

import all lists with the script we wrote for that

Details

Related Gerrit Patches:
operations/puppet : production : mailman: remove import scripts
operations/puppet : production : mailman: import even unknown lists
operations/puppet : production : mailman: also import held messages and qfiles
operations/puppet : production : mailman: Don't store bad messages in qfiles
operations/puppet : production : mailman: adjust import_list.sh for private lists

Event Timeline

Dzahn created this task. Aug 24 2015, 11:27 PM
Dzahn claimed this task.
Dzahn raised the priority of this task to High.
Dzahn updated the task description.
Dzahn added subscribers: ori, MZMcBride, Dzahn and 5 others.

Change 234032 had a related patch set uploaded (by Dzahn):
mailman: adjust import_list.sh for private lists

https://gerrit.wikimedia.org/r/234032

Change 234032 merged by Dzahn:
mailman: adjust import_list.sh for private lists

https://gerrit.wikimedia.org/r/234032

Change 234043 had a related patch set uploaded (by Dzahn):
mailman: Don't store bad messages in qfiles

https://gerrit.wikimedia.org/r/234043

Change 234043 merged by Dzahn:
mailman: Don't store bad messages in qfiles

https://gerrit.wikimedia.org/r/234043

Change 234138 had a related patch set uploaded (by Dzahn):
mailman: also import held messages and qfiles

https://gerrit.wikimedia.org/r/234138

sizes after considerable cleanup (qfiles/bad, shunt, Gigabytes deleted):

17:54 <mutante> 91G archives
17:54 <mutante> 4.4G data
17:54 <mutante> 552M lists
17:54 <mutante> 68M qfiles

Change 234138 merged by Dzahn:
mailman: also import held messages and qfiles

https://gerrit.wikimedia.org/r/234138

Dzahn added a comment. Aug 27 2015, 4:04 AM

a full import with ./import_all_lists.sh took: real 115m26.589s

Dzahn added a comment. Aug 27 2015, 4:19 AM (edited)

./bin/list_lists says

556 matching mailing lists found on fermium. But the same command says 558 (!) on sodium. Figure out the diff!

The diff, i.e. the two lists that sodium reports and fermium does not:

Licom-l - Wikimedia licensing update committee (CLOSED)
WLM-CN - Wiki Loves Monuments In China
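
A comparison like this can be scripted rather than done by eye. The following is a minimal sketch, assuming the list names from each host have been captured as plain text, one name per entry (for example with the bare-output option of `list_lists`). The function name `missing_lists` and the sample data are hypothetical and only illustrate the idea.

```python
def missing_lists(source_names, target_names):
    """Return names present on the source host but absent on the target."""
    # Lowercase the names so capitalisation differences between the two
    # outputs do not show up as false positives.
    source = {name.strip().lower() for name in source_names if name.strip()}
    target = {name.strip().lower() for name in target_names if name.strip()}
    return sorted(source - target)


# Hypothetical sample data; real input would be the names captured on each host.
sodium = ["Licom-l", "WLM-CN", "wikitech-l", "foundation-l"]
fermium = ["wikitech-l", "foundation-l"]

print(missing_lists(sodium, fermium))
```

With the sample data above, the two lists that exist only on the old host are reported.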

Dzahn added a comment. Aug 27 2015, 4:42 AM

it probably has to do with the way these have been disabled in the past.

i see:

wlm-cn.disabled.rt8567 and licom-l.disabled.rt7307 on sodium in ./lists/. These have been rsynced but not imported. The import script reports "list not found" for them. Because of the dots in the names?
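
To check the dot hypothesis, the directory names under ./lists/ could be split into plain names and names that contain a dot, and the second group compared with the lists that failed to import. This is only a sketch; `split_by_dot` is a hypothetical helper, and the sample names are taken from this task.

```python
def split_by_dot(dir_names):
    """Partition directory names into plain names and names containing a dot."""
    plain, dotted = [], []
    for name in dir_names:
        # A dot suggests a directory that was renamed by hand,
        # e.g. to disable or back up a list.
        (dotted if "." in name else plain).append(name)
    return plain, dotted


names = [
    "wlm-cn.disabled.rt8567",
    "licom-l.disabled.rt7307",
    "wikitech-announce",
    "analytics.backup",
]
plain, dotted = split_by_dot(names)
print("plain: ", plain)
print("dotted:", dotted)
```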

Dzahn added a comment. Aug 27 2015, 4:49 AM

Importing fix_url...
Running fix_url.fix_url()...
Loading list foundation-l (locked)
Unknown list: foundation-l
Traceback (most recent call last):
  File "/var/lib/mailman/bin/withlist", line 299, in <module>
    main()
  File "/var/lib/mailman/bin/withlist", line 277, in main
    r = do_list(listname, args, func)
  File "/var/lib/mailman/bin/withlist", line 202, in do_list
    return func(m, *args)
  File "/usr/lib/mailman/bin/fix_url.py", line 73, in fix_url
    if not mlist.Locked():
AttributeError: 'NoneType' object has no attribute 'Locked'
--------------------------
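
The traceback suggests that the list could not be loaded ("Unknown list: foundation-l"), so the function received None in place of a list object and failed on `mlist.Locked()`. A guard against that case might look like the sketch below. `FakeList` and `guarded_fix` are hypothetical stand-ins for illustration; this is not the code of fix_url.py or the real Mailman list class.

```python
class FakeList:
    """Hypothetical stand-in for a Mailman list object (not the real class)."""

    def __init__(self, locked=True):
        self._locked = locked
        self.saved = False

    def Locked(self):
        return self._locked

    def Save(self):
        self.saved = True


def guarded_fix(mlist, listname):
    """Skip lists that could not be loaded instead of raising AttributeError."""
    if mlist is None:
        # This is the case from the traceback: the list is unknown.
        return "skipped unknown list: %s" % listname
    if not mlist.Locked():
        return "list not locked: %s" % listname
    mlist.Save()
    return "fixed: %s" % listname


print(guarded_fix(None, "foundation-l"))
print(guarded_fix(FakeList(), "wikitech-l"))
```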
Dzahn added a comment. Aug 27 2015, 4:55 AM

Lists to check for import issues (because there are still files left that should have been deleted if the import was successful).

advocacy_advisors
analytics.backup
boardexec
checkuser-l.backup
chip-l
clevel
education-coop
foundation-news-l
fundraising.backup
fundraising-de
gendergap.backup.20110209
hiphop
juribak
langcom-observers
licom-l
mk-edu
moderators-nl.save
officeit
private-l
research-team
tools-wmt-staff
unblock-en-l.backup
wikibooksde-l
wikidata-l.backup
wikidata.old.backup
wikide-l.backup
wikimedia-commits
wikimediacz-oldprivate
wikimedia-de-by-ltp
wikimedia-in
wikimedia-l.old
wikimedia-ve
wikimedia-ve-afiliados
wikimedia-ve-entusiastas
wikimedical-l
wikipedia-ensino
wiki-research-l.backup
wikitech-announce
wikitech-announce.disabled.t100503
wlm-cn
wmf-care-bears
wmfresearch
wmfsocial
xmldatadumps-admin-l

Glancing over the above list and recalling memories related to them (but not tickets right now; perhaps will track these tomorrow if needed), they're not real lists. Some are either deleted (tools-wmt-staff as I requested it) or disabled in a bad way (Wikitech-announce as an example, renamed twice by directory).

We should just blindly rsync it all so it's exactly like before and if in doubt we can deal with it any time on the new server. That means changing the script to not even attempt checking whether a list exists from MM's point of view.

> We should just blindly rsync it all so it's exactly like before and if in doubt we can deal with it any time on the new server. That means changing the script to not even attempt checking whether a list exists from MM's point of view.

+1 I don't see why a special-purpose script is even needed.

Also, please check that non-ASCII characters are ok in the new mailman. We had charset problems there when upgrading WM-ES lists from wheezy to jessie, to the point that some lists had special chars in some fields that made Python raise an encoding exception (other fields like the list footer just happily sent the garbage).

Dzahn added a comment. Sep 2 2015, 10:07 PM

> We should just blindly rsync it all so it's exactly like before and if in doubt we can deal with it any time on the new server. That means changing the script to not even attempt checking whether a list exists from MM's point of view.

> +1 I don't see why a special-purpose script is even needed.

It started as "for the very first tests i just want to export public data and not mess with the private things before i'm more sure it even works".

> Also, please check that non-ASCII characters are ok in the new mailman. We had charset problems there when upgrading WM-ES lists from wheezy to jessie, to the point that some lists had special chars in some fields that made Python raise an encoding exception (other fields like the list footer just happily sent the garbage).

We even had a puppet failure "failed: invalid byte sequence in UTF-8" because of one charset issue in one template, the French listinfo template.

It was: listinfo.html: HTML document, ISO-8859 text
and i ran iconv -f ISO-8859-1 -t UTF8
and then it was: listinfo.html: HTML document, UTF-8 Unicode text
https://gerrit.wikimedia.org/r/#/c/234589/

and that fixed the puppet run on jessie.
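
The same conversion can be sketched in Python. This is an illustration under assumptions, not the command that was actually run: the helper `to_utf8` is hypothetical, and it assumes that anything which is not already valid UTF-8 is ISO-8859-1. That assumption is the weak point, because ISO-8859-1 decoding never fails; a file in some other legacy encoding would be converted without error and come out mangled.

```python
def to_utf8(raw, assumed_encoding="iso-8859-1"):
    """Return UTF-8 bytes, converting only if the input is not valid UTF-8."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Assumption: the legacy encoding is the one named in assumed_encoding.
        return raw.decode(assumed_encoding).encode("utf-8")


latin1_bytes = "Français".encode("iso-8859-1")
print(to_utf8(latin1_bytes).decode("utf-8"))
```

Leaving already valid UTF-8 input untouched avoids double-encoding files that were fine to begin with.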

But that's the only file i touched encoding-wise; i left all the other language templates alone because they did not cause puppet issues.

Even though they look like they are in different formats (see P1944).

At first i planned to convert them all to UTF-8, like here: https://gerrit.wikimedia.org/r/#/c/234565/ but that looked like it did more harm than good, for example see Korean.

I then decided to leave them untouched except the one that caused actual issues. So, French fixed, all others as on sodium, for now.

Change 237011 had a related patch set uploaded (by Dzahn):
mailman: import even unknown lists

https://gerrit.wikimedia.org/r/237011

Change 237011 merged by Dzahn:
mailman: import even unknown lists

https://gerrit.wikimedia.org/r/237011

Change 237315 had a related patch set uploaded (by Dzahn):
mailman: remove import scripts

https://gerrit.wikimedia.org/r/237315

Change 237315 merged by Dzahn:
mailman: remove import scripts

https://gerrit.wikimedia.org/r/237315

Dzahn added a comment. Sep 10 2015, 2:10 AM

we won't use that script anymore. instead we adjusted the rsyncd config and the script running on the source side to directly sync into /var/lib/mailman in the right places
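
The direct sync could be pictured roughly as follows. This is a sketch under stated assumptions, not the script that was deployed: the rsync module name `mailman`, the use of /var/lib/mailman as the path on the source host, and the helper `rsync_commands` are all hypothetical. The subdirectory names are the ones listed in the size report earlier in this task. The sketch only builds the command lines and does not run them.

```python
SOURCE_ROOT = "/var/lib/mailman"  # assumption: same layout on the source host
SUBDIRS = ["archives", "data", "lists", "qfiles"]


def rsync_commands(dest_module, source_root=SOURCE_ROOT, dry_run=True):
    """Build one rsync push command per Mailman subdirectory."""
    commands = []
    for sub in SUBDIRS:
        cmd = ["rsync", "-a"]
        if dry_run:
            cmd.append("--dry-run")
        # Trailing slashes copy directory contents, not the directory itself.
        cmd += ["%s/%s/" % (source_root, sub), "%s/%s/" % (dest_module, sub)]
        commands.append(cmd)
    return commands


# "fermium::mailman" is a hypothetical rsync daemon module on the new host.
for cmd in rsync_commands("fermium::mailman"):
    print(" ".join(cmd))
```

Defaulting to `--dry-run` keeps the sketch safe to try; the flag would be dropped for the real transfer.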

Dzahn closed this task as Invalid. Sep 10 2015, 2:10 AM