Page MenuHomePhabricator

Wikisource: Implement adding Symfony 5 to WSExport
Closed, ResolvedPublic8 Estimated Story PointsOct 21 2020

Description

As a Wikisource user, I want the team to add Symfony 5 to WSExport, so that WSExport can be managed in a more standardized fashion.

Background: In the course of discussing potential improvements to WSExport reliability, an idea was brought forth to add Symfony 5 to WSExport. Symfony 5 is a PHP framework to create websites and web applications.

Acceptance Criteria:

  • Implement adding Symfony 5 to WSExport
  • If this is not possible to do, share findings on why this was not possible and potential next steps

Details

Due Date
Oct 21 2020, 4:00 AM

Event Timeline

Restricted Application added a subscriber: Aklapper. ยท View Herald TranscriptSep 22 2020, 6:42 PM
ifried renamed this task from Watchlist Expiry: Can we easily add Symfony 5 to WSExport? [placeholder] to Watchlist Expiry: Can we easily add Symfony 5 to WSExport?.Sep 24 2020, 3:29 PM
ifried updated the task description. (Show Details)
ifried renamed this task from Watchlist Expiry: Can we easily add Symfony 5 to WSExport? to Watchlist Expiry: Investigate adding Symfony 5 to WSExport.Sep 24 2020, 4:21 PM
ARamirez_WMF set the point value for this task to 8.Sep 24 2020, 6:07 PM
ifried renamed this task from Watchlist Expiry: Investigate adding Symfony 5 to WSExport to Watchlist Expiry: Implement adding Symfony 5 to WSExport.Sep 24 2020, 6:07 PM
ifried updated the task description. (Show Details)
ifried removed the point value for this task.
ARamirez_WMF set the point value for this task to 8.
ifried renamed this task from Watchlist Expiry: Implement adding Symfony 5 to WSExport to Wikisource: Implement adding Symfony 5 to WSExport.Sep 24 2020, 6:09 PM
ifried removed the point value for this task.
ARamirez_WMF set the point value for this task to 8.Sep 24 2020, 6:18 PM
ARamirez_WMF changed the subtype of this task from "Task" to "Deadline".

@Samwilson I saw that you moved this ticket to 'In Development.' Should it be assigned to you?

Yes, oops! I meant to do that.

PR: https://github.com/wsexport/tool/pull/244/

A few things that might need some discussion:

  • How do we want to handle docker-compose? I've made it a bit clunky in the above patch, with the addition of --env-file=.env.local to every call. I feel like having PHP and the symfony tool installed locally is pretty useful and likely to be common; having them makes it easier to develop with Symfony. Doing everything in Docker (i.e. not having symfony or composer installed locally) is a few more keystrokes.
  • I've set up a dev branch, for the new work.
  • Do we need to put in a placeholder for the old bin/book.php CLI entry point, or is it okay to break BC for people who might be running scripts with this?
  • There is lots of further clean-up that can be done after this patch, but I've tried to keep this as small as I can. I'll make separate issues for each thing I can think of.
ARamirez_WMF changed Due Date from Oct 7 2020, 4:00 AM to Oct 21 2020, 4:00 AM.Oct 8 2020, 10:09 PM

I've switched the test site to use the dev branch, and updated the docs on Wikitech for the new Apache config.

This is ready for QA, and there should be nothing different about how it works. I think the main things to check are that URLs are unchanged.

There's a (uncontrolled) deployment script at /var/www/tool/upgrade.sh which I've also modified to use the dev branch now (so upgrades are just sudo upgrade.sh). It should be possible to switch instead to the deploy script in the ToolforgeBundle (which we'll definitely want to do before we add DB migrations), but at the moment that script only supports deploying from master (or a tag); I've created wikimedia/ToolforgeBundle#35 to change this (if y'all think that's a good idea?).

@Samwilson I find a lot of Wikisource urls are of the form https://tools.wmflabs.org/wsexport/tool/book.php?page=.... I think this will be a 404 once this has gone to production. E.g. https://tools.wmflabs.org/wsexport-test/tool/book.php?lang=en&format=epub-3&page=Strange_Case_of_Dr_Jekyll_and_Mr_Hyde

Does this need to be changed on the Wikis, or in WSExport?

@dom_walden that is part of each wiki's MediaWiki:Gadget-WSexport.js and in production at enWS it is now

//wsexport.wmflabs.org/wsexport/tool/book.php

There are redirects in place from tools.wmflabs.org to the correct place. If it is being set up as a gadget then I am guessing that it will be rolled out differently. If it is utilising a new concept then it will be some server redirects and we will just need to speak to someone like @bd808 to hack the redirects.

Good point. I've added an alias for /tool/book.php (plain /book.php was already supported). Easier than trying to update all links on wikis (although it might be nice to do that in the gadgets anyway, at some point).

PR: https://github.com/wsexport/tool/pull/247

@Samwilson I am seeing more errors on the test site compared to production.

At the moment, I am also finding it returning "502 Bad Gateway" from time to time.

Errors below, with examples of books which have produced them (sometimes the same book/format will produce two different errors at different times).

  1. The filename and the fallback cannot contain the "/" and "\" characters.
  1. The process has been signaled with signal "5".
  1. Unable to launch a new process.
  1. This long error when trying to load https://wsexport-test.toolforge.org/book.php?lang=fr&page=Robinson_Cruso%C3%A9&format=pdf-a5&fonts=:
The command "'ebook-convert' '/mnt/nfs/labstore-secondary-tools-project/wsexport-test/tool/temp/wsexport-test/ws-c0_Robinson_Crusoe-95811663003713.epub' '/mnt/nfs/labstore-secondary-tools-project/wsexport-test/tool/temp/wsexport-test/ws-c0_Robinson_Crusoe-95811291001983.pdf' '--page-breaks-before' '/' '--paper-size' 'a5' '--pdf-page-margin-bottom' '32' '--pdf-page-margin-top' '40' '--pdf-page-margin-left' '24' '--pdf-page-margin-right' '24' '--pdf-page-numbers' '--preserve-cover-aspect-ratio'" failed.

Exit Code: 2(Misuse of shell builtins)

Working directory: /mnt/nfs/labstore-secondary-tools-project/wsexport-test/tool/public

Output:
================


Error Output:
================
Usage: ebook-convert input_file output_file [options]

Convert an ebook from one format to another.

input_file is the input and output_file is the output. Both must be specified as the first two arguments to the command.

The output ebook format is guessed from the file extension of output_file. output_file can also be of the special format .EXT where EXT is the output file extension. In this case, the name of the output file is derived from the name of the input file. Note that the filenames must not start with a hyphen. Finally, if output_file has no extension, then it is treated as a directory and an "open ebook" (OEB) consisting of HTML files is written to that directory. These files are the files that would normally have been passed to the output plugin.

After specifying the input and output file you can customize the conversion by specifying various options. The available options depend on the input and output file types. To get help on them specify the input and output file and then use the -h option.

For full documentation of the conversion system see
https://manual.calibre-ebook.com/conversion.html

Whenever you pass arguments to ebook-convert that have spaces in them, enclose the arguments in quotation marks. For example "C:\some path with spaces"

ebook-convert: error: no such option: --pdf-page-margin-bottom

(I could not find anymore debug output in the error.log file, which does not appear to have had anything new written to it in the last 2 hours).

Let me know if you want me to raise these as separate bugs.

Thanks @dom_walden that's super useful!

I've made a patch for the first issue (the slashes): https://github.com/wsexport/tool/pull/249

But the others should probably be looked into in more detail, so should be separate tasks.

The slash issue should be fixed now.

I can no longer reproduce on wsexport-test.

I am going to move this along. The specific errors can be dealt with in separate tasks.

The branch of the code which contains these changes is only on wsexport-test. That branch contains lots of other changes as well. I don't know at which point we want to merge that branch into the main branch and put it on production.

  1. The process has been signaled with signal "5".

I haven't been able to reproduce this more recently...

  1. Unable to launch a new process.

Raised as T266854.

  1. This long error when trying to load https://wsexport-test.toolforge.org/book.php?lang=fr&page=Robinson_Cruso%C3%A9&format=pdf-a5&fonts=:

Raised as T266757.

The issues described above have been documented in separate tickets, and this work has been released to production. For this reason, I'm marking it as Done.