User Details
- User Since
- Jan 11 2021, 3:17 PM (269 w, 3 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- Denis Gagne52
Dec 3 2025
- Very similar indeed but T408915 was registered on October 31 while the problem appeared on ws-export only on November 28.
- Books downloaded before November 28 are well constructed. Afterwards, several [ '&gt;', '&lt;' ] need to be replaced with [ '>', '<' ] to reactivate the link tags in each chapter of an epub file.
- This is the opposite of the replacement found here: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/993051/5/includes/Parser/CoreTagHooks.php#85
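To illustrate the manual repair described above, here is a minimal Python sketch. It assumes the broken chapters contain link tags whose angle brackets were HTML-escaped; the function name `restore_link_tags` is hypothetical and this is not code from ws-export or MediaWiki.

```python
import re

# Hypothetical helper: only escaped <link ...> tags are restored, so other
# escaped text (for example "a &lt; b" in prose) is left untouched.
ESCAPED_LINK_TAG = re.compile(r"&lt;(link\b[^<>]*?)&gt;", re.IGNORECASE)


def restore_link_tags(xhtml: str) -> str:
    """Turn '&lt;link ... /&gt;' back into '<link ... />'."""
    return ESCAPED_LINK_TAG.sub(r"<\1>", xhtml)


if __name__ == "__main__":
    broken = '&lt;link rel="stylesheet" href="main.css" /&gt;'
    print(restore_link_tags(broken))
```

Restricting the pattern to link tags is a deliberate choice here: a blanket replacement of every escaped bracket could turn escaped text inside the book into markup.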
Nov 30 2025
May be related to T411280
Similar issue with the link tag (T411303):
Same problem on books exported from French Wikisource. Some tags are urlencoded and therefore not recognised.
Nov 20 2025
Thanks @Samwilson! Surprisingly, I get the same error on the test server when I try to download a PDF from a page with no image in it. Was it working before your update, or did you test somewhere else?
Anyway, it also works for me if I download the epub file (from the test server), strip the image and use the same command line locally.
I can see the new image in the PDF with Acrobat Reader. Great!
Nov 19 2025
Additional information:
- Original file from Commons: 24-bit image with sRGB ColorSpace
- Extracted file from Wikisource: 8-bit grayscale image with insufficient data for sRGB ColorSpace
- 2 solutions from https://github.com/ImageMagick/ImageMagick/issues/2070: 1) -strip or Imagick::stripImage on all grayscale images; 2) upgrade the Wikimedia servers
- ref. : https://phabricator.wikimedia.org/T254937
Jun 18 2025
Confirmation that both the CSS style including the px unit and the documented (W3 xhtml1-strict.dtd) width and height attributes work on Calibre 8.4 (the latest version of Calibre). Thanks to all contributors. :)
Jun 9 2025
Hoping this is helpful, here is some additional information :
Jun 8 2025
@Samwilson @Tpt The img content is corrupted before ws-export begins the conversion with calibre.
This may occur through the deprecatedAttributes function in the PageParser module.
Jun 2 2025
Hi @Samwilson, upgrading Calibre won’t help, but you may change << style="width:150; height:165; " >> to << width="150" height="165" >> for XHTML conformity. Those larger images will increase the epub file size: 504px becomes 960px. Arghhh! Many books won’t be downloadable anymore!
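The change suggested above could be sketched as follows in Python. This is only an illustration under the assumption that the style attribute contains nothing but unitless width and height values; the name `style_to_attributes` is hypothetical and is not part of ws-export.

```python
import re

# Matches a style attribute that holds only unitless width and height values,
# e.g. style="width:150; height:165; " (the pattern quoted above).
UNITLESS_STYLE = re.compile(
    r'style="\s*width:\s*(\d+)\s*;\s*height:\s*(\d+)\s*;?\s*"'
)


def style_to_attributes(img_tag: str) -> str:
    """Replace a unitless width/height style with width/height attributes."""
    return UNITLESS_STYLE.sub(r'width="\1" height="\2"', img_tag)


if __name__ == "__main__":
    tag = '<img src="cover.jpg" style="width:150; height:165; " />'
    print(style_to_attributes(tag))
```

A style that already carries units (for example width:150px) does not match the pattern and is left unchanged.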
Dec 30 2023
You may also consider disabling HTMLZ. I noticed a massive download from ws.en and ws.fr in the past two months. I don’t think this format is very useful for individuals. Adding --disable-font-rescaling could help with pdfs.
Dec 27 2023
It was duplicated on ws.fr... and was deleted a few hours ago.
Dec 24 2023
Tested on 3 different books with 8-bit grayscale covers, and the image is displayed in the exported PDF files.
Thank you very much, @Samwilson!
Dec 17 2023
Hello @Aklapper, if you're more comfortable on ws.en, you’ll observe the same issue with the "Template:Clickable button 2".
Dec 15 2023
Mar 18 2023
@Aklapper The same update is needed on Ubuntu, Windows... isn’t it? As soon as I got the information 4 months ago, I upgraded Calibre to 6.7.1 and tested with images downloaded from Commons. I informed @Tpt that a fix was available and that it solved the cover page bug. Now I know that someone at WMF is notified: files downloaded from Commons/ws-export can be corrupted, and a fix was released by ImageMagick.
Mar 16 2023
This bug affects images issued by Commons. An upgrade of Calibre on the ws-export server will fix the cover page bug with grayscale images. Can we expect this upgrade in the near future?
Dec 8 2022
Feb 27 2021
Assuming the link is prefixed with the document name, why can’t we link to that document whether it is the same one or not?
Feb 26 2021
If the page name is added to the link, I think the first elseif would catch it and build a new link as expected. Could you investigate whether the problem is with the node? I don’t see any title attribute in the neighborhood of href="#foo" and one is needed :
Feb 24 2021
This was working before and was broken during this project. I don’t understand the difficulty in identifying such links. As mentioned earlier, any link built this way [[Les pères du système taoïste/Tao-Tei-King#CHAP1]] will work, but not with only #CHAP1 inside the brackets. There are many links built that way. I think the conversion was taken care of in this function of BookCleanerEpub.php, but the first part does not trap #mylink any more :
Feb 4 2021
The same behaviour was observed on French Wikisource with links pointing to the same page (first char=#).
Example: https://wsexport.wmflabs.org/?lang=fr&page=Les_p%C3%A8res_du_syst%C3%A8me_tao%C3%AFste/Tao-Tei-King&format=epub-3&fonts=
The links in the summary will take you back to the Wikisource website.
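As a rough illustration of the expected behaviour, the Python sketch below keeps a fragment-only link inside the book by prefixing it with the current chapter file. The function name `resolve_fragment_link` and the file name `chapter1.xhtml` are hypothetical; this is not the logic of BookCleanerEpub.php, only a sketch of the idea that a link whose first character is # should be treated like a link to the same page.

```python
def resolve_fragment_link(href: str, page_file: str) -> str:
    """Prefix a fragment-only href with the current chapter file name.

    Any other href is returned unchanged, on the assumption that it is
    handled elsewhere.
    """
    if href.startswith("#"):
        # Same-page anchor: keep it inside the exported book.
        return page_file + href
    return href


if __name__ == "__main__":
    print(resolve_fragment_link("#CHAP1", "chapter1.xhtml"))
```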
Feb 3 2021
Sam,
I chose that large pseudo-book only to show that the limitation was not coming from Calibre.
The proposal sounds okay to me.
