
RuntimeError: The string to escape is not a valid UTF-8 string.
Closed, Resolved, Public

Description

Links:

{
    "class": "Twig\\Error\\RuntimeError",
    "message": "The string to escape is not a valid UTF-8 string.",
    "code": 0,
    "file": "/var/www/tool/templates/export.html.twig:22",
    "trace": [
        "/var/www/tool/var/cache/prod/twig/87/87b7d9e7411c00f8f4132b5ecd9979cc165e3e31b67492860f24de86f614ae6f.php:76",
        "/var/www/tool/vendor/twig/twig/src/Template.php:171",
        "/var/www/tool/var/cache/prod/twig/6f/6f4caec2ab13b38c62d05cac171aa5410a5d82d73c0834f92edf11819e319fdb.php:110",
        "/var/www/tool/vendor/twig/twig/src/Template.php:394",
        "/var/www/tool/vendor/twig/twig/src/Template.php:367",
        "/var/www/tool/var/cache/prod/twig/87/87b7d9e7411c00f8f4132b5ecd9979cc165e3e31b67492860f24de86f614ae6f.php:42",
        "/var/www/tool/vendor/twig/twig/src/Template.php:394",
        "/var/www/tool/vendor/twig/twig/src/Template.php:367",
        "/var/www/tool/vendor/twig/twig/src/Template.php:379",
        "/var/www/tool/vendor/twig/twig/src/TemplateWrapper.php:40",
        "/var/www/tool/vendor/twig/twig/src/Environment.php:277",
        "/var/www/tool/vendor/symfony/framework-bundle/Controller/AbstractController.php:249",
        "/var/www/tool/vendor/symfony/framework-bundle/Controller/AbstractController.php:257",
        "/var/www/tool/src/Controller/ExportController.php:232",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:157",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:79",
        "/var/www/tool/vendor/symfony/http-kernel/EventListener/ErrorListener.php:60",
        "/var/www/tool/vendor/symfony/event-dispatcher/EventDispatcher.php:270",
        "/var/www/tool/vendor/symfony/event-dispatcher/EventDispatcher.php:230",
        "/var/www/tool/vendor/symfony/event-dispatcher/EventDispatcher.php:59",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:218",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:90",
        "/var/www/tool/vendor/symfony/http-kernel/Kernel.php:196",
        "/var/www/tool/public/index.php:35"
    ],
    "previous": {
        "class": "Symfony\\Component\\HttpKernel\\Exception\\NotFoundHttpException",
        "message": "Page not found for: 1911_Encyclop�dia_Britannica/Dock_(structure)",
        "code": 0,
        "file": "/var/www/tool/src/Util/Api.php:252",
        "trace": [
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:204",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:153",
            "/var/www/tool/vendor/guzzlehttp/promises/src/TaskQueue.php:48",
            "/var/www/tool/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:118",
            "/var/www/tool/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:145",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:248",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:224",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:269",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:226",
            "/var/www/tool/vendor/guzzlehttp/promises/src/Promise.php:62",
            "/var/www/tool/src/BookProvider.php:174",
            "/var/www/tool/src/BookProvider.php:39",
            "/var/www/tool/src/BookCreator.php:42",
            "/var/www/tool/src/Controller/ExportController.php:130",
            "/var/www/tool/src/Controller/ExportController.php:90",
            "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:157",
            "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:79",
            "/var/www/tool/vendor/symfony/http-kernel/Kernel.php:196",
            "/var/www/tool/public/index.php:35"
        ]
    }
}

Event Timeline

I'm not sure where the link that generates this error is coming from.

The error comes from this title:

1911_Encyclop%E6dia_Britannica%2FDock_%28structure%29

but the link on the Wikisource page is to this title:

1911_Encyclop%C3%A6dia_Britannica/Dock_(structure)

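where æ is correctly encoded in UTF-8 as the two bytes C3 A6. The single byte E6 in the failing title is the Latin-1 encoding of æ, which is not valid UTF-8 in that position.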
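To illustrate the difference between the two titles, here is a minimal Python sketch (not code from the tool, which is written in PHP). The helper name `decode_title` is hypothetical. It percent-decodes each title to raw bytes and then attempts a strict UTF-8 decode.

```python
from urllib.parse import unquote_to_bytes


def decode_title(percent_encoded: str) -> str:
    """Percent-decode a title, then decode the resulting bytes as strict UTF-8.

    Hypothetical helper for illustration only.
    """
    raw = unquote_to_bytes(percent_encoded)
    # Strict decoding raises UnicodeDecodeError on any invalid byte sequence.
    return raw.decode("utf-8")


# Title as linked from the Wikisource page: æ is the two bytes C3 A6.
good = "1911_Encyclop%C3%A6dia_Britannica/Dock_(structure)"
# Title from the failing request: æ is the single byte E6.
bad = "1911_Encyclop%E6dia_Britannica%2FDock_%28structure%29"

print(decode_title(good))

try:
    decode_title(bad)
except UnicodeDecodeError as err:
    # E6 announces a three-byte sequence, but the next byte ("d") is not
    # a continuation byte, so the input is rejected.
    print("not valid UTF-8:", err.reason)

# The same bytes are a valid Latin-1 string, which suggests where the
# bad link's encoding may have come from.
print(unquote_to_bytes(bad).decode("latin-1"))
```

The last line only shows that the bytes are readable as Latin-1; it does not establish which client or page actually produced the bad link.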

We can avoid this error by removing the html_attr escaping from the title param, but that doesn't fix the underlying problem: the export then tries to fetch 1911 Encyclop�dia Britannica/Dock (structure) and fails. Still, failing with "Page not found for: 1911_Encyclop�dia_Britannica/Dock_(structure)" is better than the existing unhelpful error.
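As a rough illustration of why the resulting message contains �, the Python sketch below decodes the bad title leniently, so the invalid byte becomes U+FFFD (the replacement character), and only then applies HTML escaping. This is not the patch itself and not the tool's PHP/Twig code; `displayable_title` is a hypothetical name, and the approach is one assumed way to keep the escaping step from rejecting the input.

```python
import html
from urllib.parse import unquote_to_bytes


def displayable_title(percent_encoded: str) -> str:
    """Return a title that is always safe to escape and show in an error page.

    Hypothetical helper for illustration only.
    """
    raw = unquote_to_bytes(percent_encoded)
    # errors="replace" substitutes U+FFFD for each undecodable byte sequence
    # instead of raising, so the result is always a valid Unicode string.
    text = raw.decode("utf-8", errors="replace")
    # Escaping can no longer fail on invalid UTF-8 at this point.
    return html.escape(text, quote=True)


title = "1911_Encyclop%E6dia_Britannica%2FDock_%28structure%29"
print("Page not found for: " + displayable_title(title))
```

The lookup still fails, because no page exists under the mangled title, but the user sees which title was requested rather than an escaping error.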

I've made a patch for that: https://github.com/wsexport/tool/pull/320

dom_walden subscribed.

The link in the description now returns Page not found for: 1911_Encyclop�dia_Britannica/Dock_(structure).

As Sam points out, wsexport links on https://en.wikisource.org/wiki/1911_Encyclop%C3%A6dia_Britannica/Dock_(structure) export fine (they wouldn't have had this error in the first place).

Test Environment: wsexport-test and wsexport production, WS Export version 2.2.3.

ifried subscribed.

This is now on production, and I'm seeing the "Page not found for" text (see screenshot example below). This behavior is preferable to the previous behavior, since it gives users more useful information. For this reason, I'm marking this as Done.

Screen Shot 2021-02-02 at 5.55.48 PM.png (602×958 px, 65 KB)