
Expose revisions.rev_sha1 field through revision query API
Open, LowPublic

Description

I would like revision.rev_sha1 exposed in the API, so that I can retrieve revisions which contain the same content. (I assume rev_sha1 is the SHA-1 of the content and is therefore the same for different revisions with the same content.)


Version: unspecified
Severity: enhancement

Details

Reference
bz49136

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 1:48 AM
bzimport added a project: MediaWiki-API.
bzimport set Reference to bz49136.
bzimport added a subscriber: Unknown Object (MLST).
AzaToth created this task.Jun 4 2013, 3:49 PM

I meant: expose it as a field for the query, so you can limit the query to a given sha1.

This would probably require adding an index covering it, as it's currently not indexed at all (although img_sha1 is).

Anomie added a comment.Jun 4 2013, 3:57 PM

(In reply to comment #1)

I meant: expose it as a field for the query, so you can limit the query to a given sha1.

Good thing you clarified, I was about to close this as a dup of bug 21860.

What exactly is the use case for this, that isn't served by processing the output of prop=revisions&rvprop=sha1 on the client?

(In reply to comment #2)

This would probably require adding an index covering it, as it's currently
not indexed at all (although img_sha1 is).

That is correct.

Sometimes there are far too many revisions to practically loop through them all to find out which one is the oldest. For example, in an edit war/revert war, I would want to be able to quickly get the original revision which started it all, and if someone reverts to a really old revision, it's good to be able to quickly find out who made that revision, in case of sockery.
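For comparison, the client-side approach asked about above (processing the output of prop=revisions&rvprop=sha1) might look roughly like the sketch below. It assumes the revisions have already been fetched and decoded into a list of dicts with "revid", "user" and "sha1" keys, and it treats a lower revid as older; the function names and the sample data are hypothetical, not part of the API. It also shows the cost being described: every revision of the page has to be fetched before the lookup can run.

```python
from collections import defaultdict


def group_by_sha1(revisions):
    """Map each sha1 to the revisions that share it, oldest first.

    Assumption: a lower revid means an older revision.
    """
    groups = defaultdict(list)
    for rev in sorted(revisions, key=lambda r: r["revid"]):
        # Entries without a "sha1" key cannot be compared; skip them.
        if "sha1" in rev:
            groups[rev["sha1"]].append(rev)
    return dict(groups)


def oldest_with_same_content(revisions, revid):
    """Return the oldest revision whose content hash matches that of revid."""
    for revs in group_by_sha1(revisions).values():
        if any(r["revid"] == revid for r in revs):
            return revs[0]
    return None


# Hypothetical data shaped like decoded prop=revisions output.
sample = [
    {"revid": 30, "user": "Reverter", "sha1": "aaaa"},
    {"revid": 10, "user": "Original", "sha1": "aaaa"},
    {"revid": 20, "user": "Other", "sha1": "bbbb"},
]
original = oldest_with_same_content(sample, 30)
print(original["revid"], original["user"])
```

A server-side filter on sha1, as requested in this task, would avoid downloading the whole history just to do this grouping.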

afeldman wrote:

If rev_sha1 will always be queried with a page id, a (rev_page, rev_sha1(5)) index should be fine, even on enwiki.

Restricted Application added a subscriber: Aklapper. Sep 18 2015, 3:11 PM

Note that the sha1 returned by the api is not the rev_sha1 field of the revision table, but a sha1 of the sha1 :facepalm: ... see T75411 for all its glory.

So I think that would be really confusing, as the sha1 output by prop=revisions couldn't be used to query for that particular sha1 again.

I think it is the same value, just written differently. A SHA-1 hash is just a 160-bit number, which is usually displayed as 40 digits in base 16 (e.g. 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12), but it can be written in any base. MediaWiki internally stores it as a human-readable value in base 36, presumably as a compromise between space taken and readability.
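To illustrate that the two forms are the same number, here is a minimal sketch of the conversion in both directions. The helper names are hypothetical, and the padding width of 31 base-36 digits is an assumption (31 digits are enough to hold any 160-bit value); this is not MediaWiki's own code.

```python
import hashlib

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def hex_to_base36(hex_digest, width=31):
    """Rewrite a base-16 SHA-1 digest in base 36, zero-padded to width."""
    n = int(hex_digest, 16)
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out.rjust(width, "0")


def base36_to_hex(b36, width=40):
    """Rewrite a base-36 value as a zero-padded base-16 digest."""
    return format(int(b36, 36), "x").rjust(width, "0")


# The example hash quoted above is the SHA-1 of this well-known sentence.
digest = hashlib.sha1(b"The quick brown fox jumps over the lazy dog").hexdigest()
stored = hex_to_base36(digest)
# Converting back recovers the original hexadecimal digest.
print(base36_to_hex(stored) == digest)
```

With a helper like this, a client could translate between the hexadecimal sha1 it sees in API output and a base-36 value, whichever form a query parameter ended up expecting.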

Anomie added a comment.Oct 5 2015, 2:01 PM

I think it is the same value, just written differently.

You are correct.

Krinkle added a subscriber: Krinkle.

Merging T51138 into here, as it was created as a subtask a bit too early. If and when this is accepted as a change, and it turns out that an index is indeed needed, and the team working on it wants a separate task for that part of the work, feel free to re-open/parent it.

Krinkle renamed this task from expose revisions.rev_sha1 as query field to Expose revisions.rev_sha1 field through revision query API.Jul 18 2019, 2:50 PM
Krinkle removed a subscriber: wikibugs-l-list.

If and when this is accepted as a change,

That's the real open question here.

and it turns out that an index is indeed needed,

It will be, there's no question about that. That's why this task is marked as "blocked" on the MediaWiki-API board.