planet: Image in blog post appear missing if the <img> tag includes many characters before the actual "src" parameter
Open, Needs Triage · Public

Description

My latest blog post: https://addshore.com/2018/04/wikidata-map-march-2018/

Appears to have images missing when shown on https://en.planet.wikimedia.org

Where there should be an image tag, I instead just see an empty <img>:

<p><a href="https://addshore.com/2018/04/wikidata-map-march-2018/wikidata_map_comparison_nov2017_march2018/" rel="attachment wp-att-1231"><img></a></p>

Is there some sort of filtering of img tags done by planet?
Have I perhaps changed something on my WordPress install that has started causing this? It is not happening for older posts.

Event Timeline

Restricted Application added a subscriber: Aklapper. Apr 6 2018, 10:23 AM

Note that looking at the source of https://addshore.com/2018/04/wikidata-map-march-2018/ I see for that image:

<p><a href="https://addshore.com/2018/04/wikidata-map-march-2018/wikidata_map_comparison_nov2017_march2018/" rel="attachment wp-att-1231"><img data-attachment-id="1231" data-permalink="https://addshore.com/2018/04/wikidata-map-march-2018/wikidata_map_comparison_nov2017_march2018/" data-orig-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?fit=4000%2C2000&amp;ssl=1" data-orig-size="4000,2000" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="wikidata_map_comparison_nov2017_march2018" data-image-description="" data-medium-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?fit=300%2C150&amp;ssl=1" data-large-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?fit=676%2C338&amp;ssl=1" class="alignnone wp-image-1231 size-full" src="https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?resize=676%2C338&#038;ssl=1" alt="" width="676" height="338" srcset="https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?w=4000&amp;ssl=1 4000w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?resize=300%2C150&amp;ssl=1 300w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?resize=768%2C384&amp;ssl=1 768w, 
https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?resize=1024%2C512&amp;ssl=1 1024w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?resize=676%2C338&amp;ssl=1 676w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?w=1352&amp;ssl=1 1352w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/04/wikidata_map_comparison_nov2017_march2018.png?w=2028&amp;ssl=1 2028w" sizes="(max-width: 676px) 100vw, 676px" data-recalc-dims="1" /></a></p>

I wonder if there are any kind of logs on the planet host that might help figure out what happened.

Maybe it is just a one off thing?

I suspect that, when looking for the src attribute, it examines at most N characters of the tag, while in your post there are many characters before it.

Dzahn added a subscriber: Dzahn. Apr 9 2018, 2:23 PM

I have the same suspicion. There are a lot of characters between the start of the image tag and the point where 'src=' begins. That seems to be the uncommon part about it. While the software does log which feeds it is parsing and whether there is a general error, it does not log details about the parsing/image stripping part.
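
One way to check this suspicion against a feed, as a minimal sketch: measure how many characters precede the src attribute in each <img> tag. The function name src_offsets and the regular expressions are illustrative assumptions, not part of planet-venus or rawdog, and the sketch assumes attribute values contain no raw ">" character.

```python
import re

# Matches a whole <img ...> tag (assumption: no raw ">" inside attribute values).
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Matches the src attribute itself; "srcset" and "data-...-src" style names do not match.
SRC_ATTR = re.compile(r"\ssrc\s*=", re.IGNORECASE)


def src_offsets(html):
    """For each <img> tag, return how many characters precede its src
    attribute, counted from the "<" of the tag (None if there is no src)."""
    offsets = []
    for tag in IMG_TAG.finditer(html):
        match = SRC_ATTR.search(tag.group(0))
        # +1 skips the whitespace character that the pattern matches before "src".
        offsets.append(match.start() + 1 if match else None)
    return offsets


if __name__ == "__main__":
    simple = '<img src="a.png" alt="">'
    padded = '<img data-x="' + "y" * 1000 + '" src="a.png">'
    print(src_offsets(simple + padded))
```

Running this over the HTML of a feed entry would show whether the images that go missing are the ones with unusually large offsets.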

Addshore added a comment. Edited May 25 2018, 3:18 PM

it does not log details about the parsing/image stripping part.

Would that be easy to add?

Could you point to the bit of code this happens in? 💃

Dzahn added a comment. May 25 2018, 6:52 PM

@Addshore The current planet-venus software is not maintained and got dropped from stretch, so once we upgrade Planet servers to stable we will have to replace it with something new. That replacement is called "rawdog". The puppet work has also already been done to support rawdog and there is a testing instance for it.

All that is part of T180498. Given that situation, I think it makes more sense to focus on making that switch happen and, before that, to see whether image handling in rawdog is OK for us.

Dzahn added a comment. May 25 2018, 6:56 PM

Can you find your blog post in http://planet-hotdog.wmflabs.org/ and see how / if it handles images?

Can you find your blog post in http://planet-hotdog.wmflabs.org/ and see how / if it handles images?

Right now http://planet-hotdog.wmflabs.org/ only goes back to 20180410, but the blog post is from 20180406. Is there a link to "Older posts" that I cannot find?

Hi, I’ve been improving the UI of rawdog by using the Planet KDE theme, which is open source. It now includes an "Older posts" button that works.

Addshore added a comment. Edited Sep 11 2018, 6:48 AM

I just spotted this again on my last blog post:

On my actual site there is an image tag inside the figure tag:

It looks fine in the RSS feed too:

https://addshore.com/feed/

<figure class="wp-block-image alignwide"><a href="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?ssl=1"><img data-attachment-id="1872" data-permalink="https://addshore.com/2018/09/grafana-graphite-and-maxdatapoints-confusion-for-totals/image-3/" data-orig-file="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?fit=1392%2C712&amp;ssl=1" data-orig-size="1392,712" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-medium-file="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?fit=300%2C153&amp;ssl=1" data-large-file="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?fit=676%2C346&amp;ssl=1" src="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?w=676&#038;ssl=1" alt="" class="wp-image-1872" srcset="https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?w=1392&amp;ssl=1 1392w, https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?resize=300%2C153&amp;ssl=1 300w, https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?resize=768%2C393&amp;ssl=1 768w, https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?resize=1024%2C524&amp;ssl=1 1024w, https://i2.wp.com/addshore.com/wp-content/uploads/2018/09/image-2.png?resize=676%2C346&amp;ssl=1 676w" sizes="(max-width: 676px) 100vw, 676px" data-recalc-dims="1" /></a></figure>

The same thing also happens on https://planet-hotdog.wmflabs.org/

Addshore added a comment. Edited Sep 22 2018, 8:27 AM

It seemed to happen again on my last post, although one of the images is there.....

Actual post: https://addshore.com/2018/09/wikibase-extensions-on-wikidata-org/
(I don't think I can link to it on planet...)

The images that are not there do still have the caption:

<figure class="wp-block-image">
      <img class="img img-fluid imcent">
      <figcaption>
        A diagram of current dependencies between the various Wikibase extensions running on wikidata.org
      </figcaption>
    </figure>

The one image that is there has both the caption and image:

<figure class="wp-block-image">
      <img alt="File:Douglas Adams Constraint Violation.png" src="https://i0.wp.com/upload.wikimedia.org/wikipedia/commons/d/da/Douglas_Adams_Constraint_Violation.png?w=676&amp;ssl=1" class="img img-fluid imcent">
      <figcaption>
        An example on page constraint violation (by&nbsp;<br>
        <a href="https://commons.wikimedia.org/wiki/User:Lucas_Werkmeister_(WMDE)">Lucas Werkmeister</a>&nbsp;CC BY-SA 4.0)
      </figcaption>
    </figure>

On my RSS feed https://addshore.com/feed/ :

<figure class="wp-block-image"><img data-attachment-id="1957" data-permalink="https://addshore.com/2018/09/wikibase-extensions-on-wikidata-org/wikibase-extension-dependancies/" data-orig-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?fit=960%2C540&amp;ssl=1"
    data-orig-size="960,540" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}"
    data-image-title="Wikibase extension dependancies" data-image-description="" data-medium-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?fit=300%2C169&amp;ssl=1" data-large-file="https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?fit=676%2C380&amp;ssl=1"
    src="https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?w=676&#038;ssl=1" alt="" class="wp-image-1957" srcset="https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?w=960&amp;ssl=1 960w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?resize=300%2C169&amp;ssl=1 300w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?resize=768%2C432&amp;ssl=1 768w, https://i0.wp.com/addshore.com/wp-content/uploads/2018/09/Wikibase-extension-dependancies.png?resize=676%2C380&amp;ssl=1 676w"
    sizes="(max-width: 676px) 100vw, 676px" data-recalc-dims="1" />
    <figcaption>A diagram of current dependencies between the various Wikibase extensions running on wikidata.org</figcaption>
</figure>

and

<figure class="wp-block-image"><img src="https://i0.wp.com/upload.wikimedia.org/wikipedia/commons/d/da/Douglas_Adams_Constraint_Violation.png?w=676&#038;ssl=1" alt="File:Douglas Adams Constraint Violation.png" data-recalc-dims="1" />
    <figcaption>An example on page constraint violation (by
        <br/><a href="https://commons.wikimedia.org/wiki/User:Lucas_Werkmeister_(WMDE)">Lucas Werkmeister</a> CC BY-SA 4.0)</figcaption>
</figure>

The figure tag for the image that ended up being rendered on planet seems much simpler.
Now to figure out why WordPress does this!

I suspect that, when looking for the src attribute, it examines at most N characters of the tag, while in your post there are many characters before it.

I have the same suspicion. There are a lot of characters between the start of the image tag and the point where 'src=' begins. That seems to be the uncommon part about it. While the software does log which feeds it is parsing and whether there is a general error, it does not log details about the parsing/image stripping part.

Now that I have compared a working image and a non-working image from the same blog post, it looks like that could be the case.
In the non-working image, the src attribute is over 1000 chars in.
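
To narrow down where such a cutoff might lie, one could publish a test entry containing images whose src starts at known offsets and see which ones survive on planet. The sketch below builds such tags. The helper padded_img and the data-pad attribute are hypothetical, the URL is a placeholder, and whether planet actually applies a length limit is unconfirmed.

```python
def padded_img(src, offset):
    """Build an <img> tag whose src attribute starts exactly `offset`
    characters into the tag, using a filler data-pad attribute (hypothetical)."""
    prefix = '<img data-pad="'
    suffix = '" '
    filler = offset - len(prefix) - len(suffix)
    if filler < 0:
        raise ValueError("offset too small to fit the padding attribute")
    return f'{prefix}{"x" * filler}{suffix}src="{src}" alt="">'


if __name__ == "__main__":
    # Placeholder image URL; replace with a real image when testing.
    url = "https://example.org/test.png"
    for offset in (100, 500, 1000, 2000):
        tag = padded_img(url, offset)
        # Sanity check: the src attribute really starts at the requested offset.
        print(offset, tag.index("src="), len(tag))
```

If the images up to some offset render and the ones beyond it come out as an empty <img>, that would support the length-limit explanation. If all of them render, the cause is more likely something else in the tag, such as one of the data-* attributes.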

Aklapper renamed this task from planet: Image in blog post appear missing to planet: Image in blog post appear missing if the <img> tag includes many characters before the actual "src" parameter. Sep 22 2018, 9:34 AM

Still happens with my latest blog post which is now on planet https://addshore.com/2019/07/wikidata-map-july-2019/ :(

This comment was removed by Addshore.