Page MenuHomePhabricator

QA html previews on beta cluster
Closed, ResolvedPublic

Description

Background

The beta cluster is currently using html previews and the new summary endpoint. We would like a full round of QA on articles in the beta cluster with specific highlight to the changes made in the requirements in https://www.mediawiki.org/wiki/User_talk:Phuedx_(WMF)/Reading/Web/Page_Preview_API

Acceptance Criteria

Pre Sign Off Actions

Event Timeline

ovasileva triaged this task as Medium priority.Dec 8 2017, 10:51 AM
ovasileva created this task.

@ABorbaWMF - here's a task to track the QA results.

@phuedx - yup :) this one is more up to date, so I'll close the other one.

Page previews looks good so far. I noticed two small issues: the empty preview and the list preview don't seem to be working (no preview at all). It's hard to tell from the screenshots, but the pointer is hovering over the Empty and List previews on this article:
https://en.wikipedia.beta.wmflabs.org/wiki/Category:Page_Previews

Screen Shot 2017-12-08 at 8.16.03 AM.png (1×2 px, 1 MB)

Screen Shot 2017-12-08 at 8.16.20 AM.png (1×2 px, 1 MB)

Everything else looks fine. Emphasis text looks just like italics, but it looks the same in the article itself.

Thank you @ABorbaWMF! The first issue is now tracked under T182596: Empty previews not appearing on beta cluster. @Jdlrobson, @phuedx - based on https://www.mediawiki.org/wiki/User:Phuedx_(WMF)/Reading/Web/Page_Preview_API#Intros we should currently be displaying lists, right? I want to double-check before opening that task.

I'm not sure that the PreviewsPopupsList/sandbox test case is constructed correctly. The spec says that the intro should consist of

  • The first paragraph from the introductory section.
  • The first ordered, unordered, or definition list that is the next sibling of the first paragraph.

On that test page we have no lead paragraph, so the intro logic is correctly extracting nothing.

If you were to wrap the list in <p></p>, then I think the correct behavior would be to return the first child list (which is what the test case seems to be after).

It looks like the new Summary API endpoint is failing here. Both of the following URLs return 500s ("internal server error"):

It looks like HyperSwitch is transforming 204s from MCS into 500s. Better track down why that's happening...

I'm not sure that the PreviewsPopupsList/sandbox test case is constructed correctly. The spec says that the intro should consist of

  • The first paragraph from the introductory section.
  • The first ordered, unordered, or definition list that is the next sibling of the first paragraph.

On that test page we have no lead paragraph, so the intro logic is correctly extracting nothing.

If you were to wrap the list in <p></p>, then I think the correct behavior would be to return the first child list (which is what the test case seems to be after).

Or, for a more true-to-life approach, I think if you add some text above the lists, the parser will then <p>-wrap the works. I think it's not doing that now becuase it doesn't see a text node.

@Mholloway - let me try that, seems like we might just have a faulty test case. Thanks!

FYI https://phabricator.wikimedia.org/T182399#3828497 is tracked in T182596.

@Jdlrobson, @phuedx - based on https://www.mediawiki.org/wiki/User:Phuedx_(WMF)/Reading/Web/Page_Preview_API#Intros we should currently be displaying lists, right? I want to double-check before opening that task.

No. This looks like it's behaving correctly to me. A paragraph of text must be associated with the list for it to be included. Otherwise there is a danger you are using a list of say 1000 elements as a preview (e.g. a list page). The list requirement was added only for lists where the list is part of the paragraph e.g. http://en.m.wikipedia.org/wiki/Planet
e.g.

* Foo
* Bar
* Test

This is the lead paragraph that will be used for the summary and it will NOT include the list.
This is the lead paragraph that will be used for the summary and it will include the list:
* Foo
* Bar
* Test

bingo!

Screen Shot 2017-12-11 at 5.56.40 PM.png (254×786 px, 85 KB)

although it seems that we now have an edge case where if an article begins with a list, nothing happens. Would probably be difficult to catch the edge case on a real wiki, but we should either:

  • display the list either way or
  • display the empty preview

I would prefer the former, if possible - we could restructure the definition of an intro a bit to fit this.

display the empty preview

This will be fixed by T182596 - the content is being treated as empty and @Mholloway is on it.

Would probably be difficult to catch the edge case on a real wiki, but we should either:

Should we be optimising for edge cases? Note this has repercussions on how content is displayed in the app. It would potentially reposition the list in certain articles above the infobox.

It does seem highly unlikely that an IRL article would begin with an <ol> or <ul>. And the apps would indeed have no use for that. We could special-case it to return an extract for extract_html only, but it seems much cleaner just to be consistent and return no extract. +1 from me FWIW to the empty preview in this case.

Empty is fine then, it's definitely a small edge case.

What's the expected summary response for a project Main Page? A 204, or a 200 with an empty extract_html? We're currently returning a 204 in this case (subject to the HyperSwitch rewriting to 500 mentioned above). It's not explicitly covered in the spec.

Let me know if it should be a 200 with empty extract, and I'll quickly update before this afternoon's deployment if so.

Also: are you prepared to handle intros from pages with a wikibase-item or wikibase-property content model? It's in the spec but never been enabled.

What's the expected summary response for a project Main Page? A 204, or a 200 with an empty extract_html? We're currently returning a 204 in this case (subject to the HyperSwitch rewriting to 500 mentioned above). It's not explicitly covered in the spec.

Let me know if it should be a 200 with empty extract, and I'll quickly update before this afternoon's deployment if so.

I'm making an executive decision and doing this since I think I added the special treatment for main pages based on a misreading of an earlier patch. On the content model question I'll wait to hear back before doing anything.

Given for disambiguation pages we send a flag we'd probably do something
similar here but it's such an edge case I'd guess it doesnt matter too much
.

Also: are you prepared to handle intros from pages with a wikibase-item or wikibase-property content model? It's in the spec but never been enabled.

It looks like Wikidata previews are stalled: T111231: Page previews for Wikidata

@Mholloway - you made the right call. The empty extract is the best way to go here.

This appears to be fixed on Beta now.

Screen Shot 2017-12-14 at 1.00.15 PM.png (1×2 px, 1 MB)

Screen Shot 2017-12-14 at 12.59.56 PM.png (1×2 px, 1 MB)

^ The above is meant to reflect that the fact that we shouldn't be deploying the Page Summary API until the major client (Page Previews) has been QA'd.

All test cases on the beta cluster have been QA'd. Resolving.