Make RESTBase tests not depend on production enwiki pages
Closed, Resolved (Public)

Description

Currently, RESTBase tests depend on pages in production wikis. Local and CI RESTBase tests run against other services in the Beta Cluster. JS Parsoid in beta used to be able to fetch content for production enwiki, while PHP Parsoid will not.

We need to remove all dependencies on production wiki pages from the tests.

Event Timeline

The service-checker tests should be rewritten to depend on beta pages in testing and production pages in production.

This is not needed because the checks are not run in beta (neither for deployment nor during operation).

But they're run in local development. So this is needed.
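
One way to read the suggestion above is as a small fixture table keyed by environment, so that testing uses beta pages and production uses production pages. A minimal sketch, assuming hypothetical names (`PageFixture`, `TEST_PAGES`, `page_for`) and placeholder domains and titles; the real service-checker configuration is not shown in this thread and may look quite different:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageFixture:
    """A page that a check is allowed to depend on (hypothetical type)."""
    domain: str
    title: str


# Hypothetical fixture table: the domains and titles are placeholders
# chosen for illustration, not values taken from the RESTBase repository.
TEST_PAGES = {
    "beta": PageFixture("en.wikipedia.beta.wmflabs.org", "Main_Page"),
    "production": PageFixture("en.wikipedia.org", "Main_Page"),
}


def page_for(environment: str) -> PageFixture:
    """Return the fixture page for the given environment name."""
    try:
        return TEST_PAGES[environment]
    except KeyError:
        # Fail loudly rather than silently falling back to production.
        raise ValueError(f"unknown environment: {environment!r}") from None
```

With a table like this, a local or CI run would look up `page_for("beta")`, and only production checks would ever reference a production page.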

JS Parsoid in beta used to be able to fetch content for production enwiki, while PHP parsoid will not.

Why is this the case? Won't we be able to point the tests to the production deployment of the parsoid API instead?

We don't expose Parsoid itself to the internet. So even if we could make CI work, local testing would still fail.

Will we expose it on an API we could query in the future?

If it is only going to be hidden temporarily, we could disable the tests until the API is ready and then re-enable them.

Just to clarify one point: it's common in Product Infrastructure team discussions to refer to the REST /page/html endpoint informally as "Parsoid," although that's not technically correct. When you see us discussing consuming Parsoid, think /page/html.

I think I see what's going on here: the external JS Parsoid service was configured to be able to retrieve content from production Wikipedias via public API even while running on the Beta Cluster, but Parsoid/PHP is internal to a MediaWiki instance and consumes only content in the instance's own DB. Theoretically we could write some code to have Parsoid/PHP grab Wikitext for parsing from an external wiki's public API, but that's contrary to the overall architecture of the system. Is this correct?

See T231569#5462660. If we really wanted to run the REST APIs (vs. scripts like bin/parse.php) in this mode, we would have to add this ability, but only if it is really needed.

JS Parsoid in beta used to be able to fetch content for production enwiki, while PHP parsoid will not.

Why is this the case? Won't we be able to point the tests to the production deployment of the parsoid API instead?

We don't expose Parsoid itself to the internet. So if we could potentially make CI work, local testing would fail anyway.

@Jhernandez @Mholloway Do the tests need to actually query Parsoid, or can they hit RESTBase instead? If the latter, then I don't think anything will be affected. Parsoid/JS wasn't exposed to the public internet either.

They hit RB, but it does matter because RB can only access the Parsoid instance local to it. This used to work because, as said above, Parsoid from beta was set up to also respond to requests for production wikis, but without that ability, RB will not be able to produce a meaningful response.

What I meant was: the MCS tests can hit the production RESTBase URLs. That is, if you want cross-cluster access for tests, make it transparent and have the MCS tests that need this feature do the cross-cluster access themselves, instead of hiding it behind two layers (MCS -> RB -> Parsoid).
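
To illustrate the "transparent" option, a test that needs production content could build the public REST URL itself rather than relying on the beta stack to reach production. A minimal sketch, assuming the public `/api/rest_v1/page/html/{title}` path on the wiki's own domain; the helper name `page_html_url` is hypothetical, and no request is actually sent here:

```python
from urllib.parse import quote


def page_html_url(domain: str, title: str) -> str:
    """Build the public REST URL for a page's HTML (illustrative helper).

    The path layout is an assumption based on the public REST API, not
    something taken from the MCS or RESTBase test suites.
    """
    # MediaWiki titles conventionally use underscores instead of spaces.
    normalized = title.replace(" ", "_")
    # safe="" percent-encodes "/" too, so a subpage title stays a single
    # path segment.
    return f"https://{domain}/api/rest_v1/page/html/{quote(normalized, safe='')}"
```

A test would then name the cluster it depends on explicitly, for example `page_html_url("en.wikipedia.org", "Main Page")`, which makes the cross-cluster dependency visible in the test itself.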

Pchelolo renamed this task from Make services tests not depend on production enwiki pages to Make RESTBase tests not depend on production enwiki pages. Jan 2 2020, 6:51 PM
Pchelolo updated the task description.

It seems this task is only about RESTBase tests, since RESTBase should be the only service accessing JS Parsoid in beta directly. I updated the task description to reflect that.

Change 561900 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/restbase@master] Tests: Remove dependency from production wikis

https://gerrit.wikimedia.org/r/561900

Change 561900 abandoned by Ppchelko:
Tests: Remove dependency from production wikis

Reason:
Accident

https://gerrit.wikimedia.org/r/561900

Pchelolo claimed this task.

Change 561900 merged by Ppchelko:
[mediawiki/services/restbase@master] Tests: Remove dependency from production wikis

https://gerrit.wikimedia.org/r/561900