Page MenuHomePhabricator

Contributor creates a page
Closed, ResolvedPublic2 Estimated Story Points

Description

As a Contributor, I want to create a new page, so that I can add information to the project.

POST /page

Create a new page.

Payload: JSON
source: source of new page
title: title of new page, like "Talk:Main Page"
comment: edit summary, required, null means it can be filled in if needed by the server
content_model: Optional content model for the main slot of the page; defaults to 'wikitext'
csrf_token: Optional CSRF token; required when using Cookie authentication; otherwise forbidden

Status:
201 – created
401 – not authenticated
403 – not authorized
409 – a page with that title already exists

Headers:
Location: new page API endpoint location ("rest.php/v1/page/The_New_Page")

Body: JSON

  • id: numeric id of the page
  • key: prefixed DB key of the page, like "Talk:Main_Page"
  • title: title for display, like "Talk:Main Page"
  • latest: latest revision of the page, object with these properties
    • id: revision ID
    • timestamp: revision timestamp
  • license: Object for the preferred license of the page, including these properties:
    • spdx: SPDX code
    • url: URL for the license
    • title: title of the license
  • other_licenses: array of objects with {spdx, url, title} for other licenses the page is available under
  • content_model: Content model of the main slot of the page
  • source: preferred editable source of the main slot of the page; usually wikitext but depends on content_model

Event Timeline

I think it would be better to avoid enclosing the payload in JSON. Instead, we could have POST /page/{title} with the body being the payload, i.e. the contents of the page. Moreover, my personal preference for creating new documents is to use PUT instead of POST. The reason for it is that POST is ubiquitous and is usually used for both create and update operations; changing the method makes it clearer to clients they can't update an existing page.

I think we also need to discuss the format that clients send. According to the architecture principles, they should send HTML by default, but I think accepting wikitext too here is acceptable. Clients should indicate which variant are they sending by setting the appropriate Content-Type header.

Wrt the return blob, if 201 is returned, then only the metadata should be returned, there is no need to return the text itself.

As I said in a comment on the design principles doc, like @mobrovac, I think the PUT endpoint should allow creation, if we do go ahead and implement PUT. The only reason to use POST for creation is if the target URL is not known to the client.

I'll put comments which apply equally to create and update on the update task at T230843.

@tstarling I've updated this endpoint to have the same output as T229663. I tried to think of which title format to use in the input, and decided that client developers were more likely to have a space-separated title than a pdbk. We could maybe allow both, make them optional, prefer the pdbk version if both are provided?

I realize that you prefer PUT-to-create, but POST-to-create is such a dominant pattern in APIs that I'm really reluctant to remove it. Can you live with having both?

I went ahead and added both display_title and title_key. Also, I added a field for the edit summary.

I agree with @tstarling

Additionally,

display_title: title of new page, like "Talk:Main Page"

  1. The display title has a different semantics in MW: https://www.mediawiki.org/wiki/Extension:Display_Title
  2. title and 'display_title' contradicts 'Object composition' design principle. They should be composed into a sub-object
title:
  dbkey: "Bla_Bla"
  text: "Bla Bla"
  display: "User_overridden_bla_bla"

However, if we switch to 'PUT' this will not be relevant anymore.

summary: optional edit summary

I thought we agreed it's called a comment?

I realize that you prefer PUT-to-create, but POST-to-create is such a dominant pattern in APIs that I'm really reluctant to remove it. Can you live with having both?

Yes.

  1. The display title has a different semantics in MW: https://www.mediawiki.org/wiki/Extension:Display_Title
  2. title and 'display_title' contradicts 'Object composition' design principle. They should be composed into a sub-object
title:
  dbkey: "Bla_Bla"
  text: "Bla Bla"
  display: "User_overridden_bla_bla"

I agree that the title forms should be in a sub-object and that these are the correct names of the various title forms. But note that the display title depends on the user's language, including the Accept-Language header on variant wikis. The display title is only available after parsing the page in the specified language. So I don't support having any mention of the display title in the spec of this endpoint.

The text and dbkey forms are trivially related — the only difference is that one has spaces and the other has underscores.

summary: optional edit summary

I thought we agreed it's called a comment?

In the Action API and the web UI it is called a summary. In EditPage and WikiPage it is called a summary. In the DB schema and the storage classes it is called a comment. PageUpdater straddles the boundary with code like:

class PageUpdater {
...
	public function saveRevision( CommentStoreComment $summary, $flags = 0 ) {
...
	private function makeNewRevision(
		CommentStoreComment $comment,

My preference is to call it a summary, for consistency with the existing APIs, the UI, and on-wiki documentation such as https://en.wikipedia.org/wiki/Help:Edit_summary .

@tstarling we've used comment in other API endpoints already. Is your preference strong enough that we need to change it, or can you live with comment?

Do I need to do some kind of pros and cons list and we can come to a ruling on it?

Honestly, I'm not sure I can come up with an objective criterion for this kind of issue, except maybe "use the property names developers will expect". It feels like summary is right here, but I'm not sure it's worth the trouble of changing it.

eprodromou updated the task description. (Show Details)
eprodromou updated the task description. (Show Details)

Does the "license" field need updated to match recent changes to T229663: Contributor gets page source? If so, presumably T230843: Contributor updates a page would also need a similar update.

eprodromou updated the task description. (Show Details)

@BPirkle great point! I updated the schema on the wiki and the read-only endpoints, but I forgot to hit these. I think it's up-to-date now.

Implementation note: the ApiEditPage class handles both edits and creates in a single function: https://gerrit.wikimedia.org/g/mediawiki/core/+/1f77235c14ab05ceb46f9a6ab64b718f0560a95a/includes/api/ApiEditPage.php#36

I agree with having both POST and PUT available from a REST standpoint. But on the Handler side, there is likely room for massive code sharing between the two implementations.

Likely we will use ApiEditPage as inspiration, with EditPage being the behind-the-scenes workhorse: https://gerrit.wikimedia.org/g/mediawiki/core/+/1f77235c14ab05ceb46f9a6ab64b718f0560a95a/includes/EditPage.php#46

Given that EditPage is undergoing massive refactoring, are we not going to be writing code to immediately through it away soon?

Given that EditPage is undergoing massive refactoring, are we not going to be writing code to immediately through it away soon?

Hmmm ... maybe? I'd rather not write throwaway code, but EditPage refactor isn't a small task. I saw T20654: EditPage.php needs rewrite: Separate DB and UI logic (no gerrits), with a bunch of related tasks. I didn't go through them all, but I saw T157658: Factor out a backend from EditPage (no gerrits) and T208801: Support slots other than the main slot in EditPage - backend support (one gerrit that says it is a very rough draft). From what I can tell, that refactoring effort won't be done any time soon.

So we may have to work with what we have now and rewrite it later. ApiEditPage is going to need a similar refactor, so if the same developer takes on both, they might be able to use what they learn on one to speed up the other, and it won't be as bad as it sounds.

These CRUD endpoints are scheduled for "next sprint", which at this point I guess means early January 2020. Do you have any additional information about the EditPage refactor, or when it might be complete? In other words, do you have a reason to be optimistic that waiting on the EditPage refactor to finish wouldn't push the implementation of these endpoints super far into the future?

WDoranWMF changed the point value for this task from 6 to 5.Feb 26 2020, 7:38 PM
WDoranWMF changed the point value for this task from 5 to 2.Mar 4 2020, 7:46 PM

Change 577591 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Define POST handler for /page/: create page

https://gerrit.wikimedia.org/r/577591

Change 579260 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] REST: page/ endpoints: don't use tokens with OAuth

https://gerrit.wikimedia.org/r/579260

POST /page/

Shouldn't this be POST /page? The trailing / seems odd. I didn't see an easy way to be oblivious to it.

comment: optional edit summary

IMHO this should not be optional. The UI let's you submit an edit with an edit summary, bit it's generally discouraged, and (depending on settings) even triggers a prompt. I don't think we want to encourage people to do edits without supplying an edit comment. IF they really want to, they can send an empty string.

@daniel The design principles have POST /<type> with no trailing slash, so I updated this endpoint. Thanks.

@daniel I'm reluctant to introduce a requirement for the edit summary at the API layer.

I believe it's possible to save a revision with no summary within our internal code and storage layer, and it's possible to create a page with no revision summary in the UIs we have, right? I understand the principle, but it feels like overstepping our bounds to make that decision at the REST interface layer.

@apaskulin can we manage this problem with documentation? Is there a good way for us to say, "Edit summaries are optional, but it's good practice to include them for automated updates, and if your client has a user interface, you should prompt for an edit summary if it's not provided."?

I believe it's possible to save a revision with no summary within our internal code and storage layer, and it's possible to create a page with no revision summary in the UIs we have, right? I understand the principle, but it feels like overstepping our bounds to make that decision at the REST interface layer.

It is possible to create a revision with an *empty* edit summary, and I'm not saying the API does not allow this. The UI always asks for a summary, often pre-fills it, and warns if it's left empty. Under some circumstances, the summary is actually required, see EditPage::AS_SUMMARY_NEEDED.

In action=edit, the summary argument is optional. This is needed because that API is also used to add sections (in which case, the summary defaults to the new section title), and undo (which also generates a useful summary if none is given explicitly).

@apaskulin can we manage this problem with documentation? Is there a good way for us to say, "Edit summaries are optional, but it's good practice to include them for automated updates, and if your client has a user interface, you should prompt for an edit summary if it's not provided."?

We should indeed suggest that a summary is generated, or the user is prompted for one, if none is provided by the user. This in my mind is saying "the summary should not be empty". I still think the field should always be set, even if it is empty.

Conceptually, and also in the data model, a revision always has a comment. It may be empty, it may be generated automatically, but it always exist. Requiring it as input doesn't seem problematic, as long as we also accept an empty value there.
Alternatively, we could specify that a comment will be generated if none is supplied, but then we'd have to make sure that we can always do this. Currently, we have auto-comments for some situations, but not for most.

@apaskulin can we manage this problem with documentation? Is there a good way for us to say, "Edit summaries are optional, but it's good practice to include them for automated updates, and if your client has a user interface, you should prompt for an edit summary if it's not provided."?

Yes, we can definitely provide some guidance like this in the docs.

@daniel I would prefer that we make as little as possible required; everything that's required is one more bit of effort on the part of the client developer.

That said, I want to clear this blocker. I've marked it as required, but null is a possible value, and if null is provided the default behaviour happens (either it's empty, or it gets auto-generated). Let me know if that's what you were thinking. I'd prefer to keep the empty string "" meaning "make the comment literally an empty string".

I've also created a ticket T248331 to make the parameter optional, and we can continue the discussion there. Making a required parameter optional is backwards-compatible for semver, but making an optional parameter required is not. So this is probably the right direction to go.

Change 577591 merged by jenkins-bot:
[mediawiki/core@master] Define POST handler for /page/: create page

https://gerrit.wikimedia.org/r/577591

Change 579260 merged by jenkins-bot:
[mediawiki/core@master] REST: page/ endpoints: don't use tokens with OAuth

https://gerrit.wikimedia.org/r/579260

Documentation is live here. Reviewed and approved by Daniel on April 9 via email.