I finally got @ssastry's Parsoid patch tested with MCS. With that I see similar behavior in MCS, in that it also generates an extra empty section at the end with just an id after avoiding the 'indexOf' of undefined exception, like
Wed, Sep 20
The fix for T173384 is deployed. Now we could add Polish, too.
Looks like this has been deployed today. I still don't see the class attribute for the <li> in the example page, even after a purge.
Mon, Sep 18
@Jdlrobson @ovasileva I've compiled some files that help compare the extract implementations, old vs new. (Old = MW API TextExtract, New = summary endpoint on MCS master as of today).
I've done this for many languages (roughly the top 30-40 languages by number of users). The script is still running for a few more, but this should be good enough for an initial comparison. If you want this run for another language, let me know.
Thu, Sep 14
Wed, Sep 13
Some more info on what this entails and/or why RI is tagged would be appreciated.
MCS is not used for zhwiki until Parsoid and RESTBase can handle language variants.
Yes, there might be other MCS users, and there certainly will be in the future (the page content portion of MCS will likely be renamed to Page Content Service, by the way).
While I can still repro the behavior on the sandbox page in the Android app when RESTBase/MCS is enabled, I don't see the error output on the [[Iain Banks]] page anymore. Not sure what's changed.
Tue, Sep 12
I guess it's time to merge in the latest template changes to MCS. Before I do so, though, I was going to update the template to reduce the number of conflicts in the future (mainly around test files not using ES6 yet).
In the patch ^ I've changed the parsoid-access code to use the old code for sectioning via <div> tags instead of <section> tags. It'll automatically take advantage of <section> tags once Parsoid produces them. This should give us some time to come up with a proper solution if T114072 is taking too long.
Mon, Sep 11
@GWicke Not to the featured articles. This was for pages linked from In the News.
Sat, Sep 9
Fri, Sep 8
When encountering a heading of a higher level (higher number, lower prominence), the sectioning code I wrote in parsoid-utils creates a nested section.
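To illustrate the nesting rule described above, here is a minimal sketch in Python. It is not the parsoid-utils code; the `Section` class and `nest_sections` function are hypothetical names, and the input is assumed to be a flat, document-ordered list of (heading level, title) pairs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical section node; level is 2 for <h2>, 3 for <h3>, and so on."""
    level: int
    title: str
    children: list[Section] = field(default_factory=list)


def nest_sections(headings: list[tuple[int, str]]) -> list[Section]:
    """Build a tree of sections from a flat list of (level, title) pairs."""
    roots: list[Section] = []
    stack: list[Section] = []  # chain of sections that are still open
    for level, title in headings:
        section = Section(level, title)
        # A heading of the same or a lower number closes the open sections
        # at that depth; a higher number nests under the current section.
        while stack and stack[-1].level >= level:
            stack.pop()
        if stack:
            stack[-1].children.append(section)
        else:
            roots.append(section)
        stack.append(section)
    return roots


tree = nest_sections([(2, "History"), (3, "Early life"), (3, "Career"), (2, "Works")])
```

In this sketch the two <h3> headings end up as children of "History", and "Works" starts a new top-level section.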
Thu, Sep 7
(e.g. you make an edit and want to show people the result).
Ok, I think the real issue is with the sectioning code, which behaves inconsistently. If the current section is at the level of an <h2> and the text contains an <h2> inside a <div>, it starts a new section. OTOH, if the text contains an <h3> inside a <div> instead, it continues the current section.
The issue is that for some sections text is not set. After avoiding this issue, a follow-up is needed to investigate why there is an extra section at the end with just an id but no text or line.
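A rough sketch of that kind of guard, in Python rather than the service's own code. The field names `id`, `text`, and `line` follow the description above; the helper names `section_text` and `id_only_sections` are made up for illustration.

```python
def section_text(section: dict) -> str:
    """Return the section text, treating a missing or None value as empty."""
    return section.get("text") or ""


def id_only_sections(sections: list[dict]) -> list:
    """List the ids of sections that have neither text nor a heading line."""
    return [
        s["id"]
        for s in sections
        if not section_text(s) and not s.get("line")
    ]


# Example data shaped like the case described: the last section has only an id.
sections = [
    {"id": 0, "text": "<p>Lead paragraph</p>"},
    {"id": 1, "line": "History", "text": "<p>Body</p>"},
    {"id": 2},
]
suspicious = id_only_sections(sections)
```

Defaulting to an empty string keeps later substring checks from failing on an unset value, while the second helper makes the id-only sections easy to spot for the follow-up investigation.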
Not sure why the RB parts were deployed already either. I did see @Pchelolo 's PR, and thought we would test it in beta cluster first. I also wanted to run some tests comparing the output of the top 1000 pages in a few wikis.
@JMinor or others, any guidance about how long is acceptable? Is 10 minutes max without intervention good enough?
Wed, Sep 6
@Pchelolo What can we do to reduce this time? Where is the bottleneck, or where is most of the time spent? I thought it would only be on the order of minutes.
@Pchelolo What about images in page summaries (for news items)?
@GWicke This is not about trending-edits. The Explore feed content does not include trending-edits information yet.
Adding the Services team to see if they have any more input on this or to correct anything I write here if necessary.
The examples mentioned in the description are variations of navboxes.
To use 1er instead of 1 for the monthNumber is not a surprise to me. The inconsistency between the page names for the selected event pages and the regular day pages is surprising to me:
Tue, Sep 5
The first one is a table.vertical-navbox. The other screenshots are from a div.navbox. MCS currently removes table.navbox. I could easily add those selectors to the transformation list if that is what is desired, and I'm leaning towards doing so for consistency's sake.
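As a simplified sketch of what extending the list would mean, written in Python and not taken from MCS: the selector list and helper names below are hypothetical, and only plain `tag.class` selectors are handled, whereas the real transformation would work on the DOM.

```python
# table.navbox is the selector mentioned as already removed; the other two
# are the candidates discussed above, added here as an assumption.
REMOVE_SELECTORS = ["table.navbox", "table.vertical-navbox", "div.navbox"]


def matches(tag: str, classes: set[str], selector: str) -> bool:
    """Check an element against a simple 'tag.class' selector."""
    sel_tag, _, sel_class = selector.partition(".")
    return tag == sel_tag and sel_class in classes


def should_remove(tag: str, classes: set[str], selectors=REMOVE_SELECTORS) -> bool:
    """True if the element matches any selector in the removal list."""
    return any(matches(tag, classes, s) for s in selectors)


removed = should_remove("table", {"vertical-navbox", "nowraplinks"})
kept = not should_remove("table", {"wikitable"})
```

With all three selectors in one list, the table and div variants of a navbox are treated the same way.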
@Mhurd a separate issue with details would be better.
@Jdlrobson One more question. In the code I only see one type of standard. Shouldn't there be more than that? I thought there was a different one for disambig pages.
@Jdlrobson I'm going to push a couple of changes soon.
- I think we should get rid of the -html in the name since the response is now a JSON object. Why not just call it summary?
- Update Content-Type to application/json.
- Add spec.yaml entries.