expose revisions.rev_sha1 as query field
Open, Low, Public

Description

I would like revision.rev_sha1 exposed in the API, so that I can retrieve revisions that contain the same content. (I assume rev_sha1 is the SHA-1 of the content and should therefore be identical for different revisions with the same content.)


Version: unspecified
Severity: enhancement

Details

Reference
bz49136
bzimport raised the priority of this task from to Low.
bzimport set Reference to bz49136.
bzimport added a subscriber: Unknown Object (MLST).
AzaToth created this task. Jun 4 2013, 3:49 PM

I meant expose it as field for query so you can limit the query to sha1

This would probably require adding an index covering it, as it's currently not indexed at all (although img_sha1 is).

Anomie added a comment. Jun 4 2013, 3:57 PM

(In reply to comment #1)

I meant expose it as field for query so you can limit the query to sha1

Good thing you clarified, I was about to close this as a dup of bug 21860.

What exactly is the use case for this, that isn't served by processing the output of prop=revisions&rvprop=sha1 on the client?
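For reference, the client-side processing mentioned here could look roughly like the sketch below. It assumes each revision has already been fetched with prop=revisions&rvprop=ids|timestamp|sha1 and decoded into a dict with "revid", "timestamp" and "sha1" keys; the function name and the sample data are hypothetical, and fetching or paging through the results is left out.

```python
from collections import defaultdict


def group_by_sha1(revisions):
    """Group revision dicts by their sha1 and keep only hashes seen more than once.

    `revisions` is assumed to be a list of dicts with "revid", "timestamp"
    (ISO 8601 string) and "sha1" keys. Revisions without a "sha1" key
    (for example, hidden content) are skipped.
    """
    groups = defaultdict(list)
    for rev in revisions:
        sha1 = rev.get("sha1")
        if sha1 is None:
            continue
        groups[sha1].append(rev)
    # ISO 8601 timestamps in the same timezone sort correctly as plain strings,
    # so the first entry of each group is the oldest revision with that content.
    return {
        sha1: sorted(revs, key=lambda r: r["timestamp"])
        for sha1, revs in groups.items()
        if len(revs) > 1
    }


# Hypothetical sample data: revision 3 reverts back to the content of revision 1.
sample = [
    {"revid": 1, "timestamp": "2013-06-01T10:00:00Z", "sha1": "aaa"},
    {"revid": 2, "timestamp": "2013-06-02T10:00:00Z", "sha1": "bbb"},
    {"revid": 3, "timestamp": "2013-06-03T10:00:00Z", "sha1": "aaa"},
]
duplicates = group_by_sha1(sample)
oldest_revid = duplicates["aaa"][0]["revid"]
```

The cost of this approach is that every revision of the page has to be downloaded before any grouping can happen.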

(In reply to comment #2)

This would probably require adding an index covering it, as it's currently
not indexed at all (although img_sha1 is).

That is correct.

Sometimes there are far too many revisions to practically loop through them all to find the oldest one. For example, in an edit war or revert war, I would want to be able to quickly get the original revision that started it all. And if someone reverts to a really old revision, it is good to be able to quickly find out who made that revision, in case of sockpuppetry.

afeldman wrote:

If rev_sha1 will always be queried with a page id, a (rev_page, rev_sha1(5)) index should be fine, even on enwiki.

Restricted Application added a subscriber: Aklapper. Sep 18 2015, 3:11 PM

Note that the sha1 returned by the api is not the rev_sha1 field of the revision table, but a sha1 of the sha1 :facepalm: ... see T75411 for all its glory.

So I think that would be really confusing, as the sha1 output by prop=revisions couldn't be used to query for that particular sha1 again.

I think it is the same value, just written differently. A SHA-1 hash is just a 160-bit number. It is usually displayed as 40 digits in base 16 (e.g. 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12), but it can be written in any base. MediaWiki internally stores it as a human-readable value in base 36, presumably as a compromise between space taken and readability.
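To illustrate, here is a minimal sketch of converting between the two notations. The function names are made up for this example, and the zero-padding to 31 characters is my assumption about how the stored value is laid out (31 base-36 digits are enough to hold any 160-bit number); this is not MediaWiki's own code.

```python
import string

# Digit alphabet for base 36: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase


def hex_to_base36(hex_digest, pad_to=31):
    """Rewrite a base-16 SHA-1 digest in base 36, left-padded with zeros."""
    n = int(hex_digest, 16)
    out = ""
    while n:
        n, remainder = divmod(n, 36)
        out = DIGITS[remainder] + out
    return out.rjust(pad_to, "0")


def base36_to_hex(b36, pad_to=40):
    """Rewrite a base-36 value as the usual 40-digit base-16 SHA-1 string."""
    return format(int(b36, 36), "x").rjust(pad_to, "0")


hex_value = "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
b36_value = hex_to_base36(hex_value)
# Converting back gives the original digits, so no information is lost.
round_trip = base36_to_hex(b36_value)
```

Both strings denote the same number, so a client could convert in either direction before comparing or querying.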

Anomie added a comment. Oct 5 2015, 2:01 PM

I think it is the same value, just written differently.

You are correct.