
Add a hash or CRC module to Scribunto
Closed, Resolved · Public

Description

This is needed to generate unique reference names for references created from Wikidata. Because the functionality is general-purpose, I think the module should live in Scribunto rather than in ArticlePlaceholder or even Wikibase.

I suggest using md5, which is fast and good enough for the use cases at hand.
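
As a rough illustration of the idea (not the Scribunto API itself), the Python sketch below derives a stable reference name from the md5 digest of a reference's content. The function name `reference_name`, the `ref-` prefix, and the sample reference strings are assumptions made up for this example.

```python
import hashlib


def reference_name(content: str, prefix: str = "ref-") -> str:
    """Return a deterministic name for a reference, derived from its content.

    Hypothetical helper: the same content always yields the same name,
    so identical references can share one name.
    """
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    return prefix + digest


# Identical content maps to the same name; different content almost surely does not.
a = reference_name("stated in: Example Source, retrieved 2016-08-10")
b = reference_name("stated in: Example Source, retrieved 2016-08-10")
c = reference_name("stated in: Another Source")
print(a == b, a == c)
```

md5 is used here only as a non-cryptographic fingerprint; nothing in this use case depends on it being collision-resistant against a deliberate attacker.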

Related change: https://gerrit.wikimedia.org/r/296177

Event Timeline

hoo created this task. Aug 10 2016, 4:23 PM
Restricted Application added a subscriber: Aklapper. Aug 10 2016, 4:23 PM
hoo updated the task description. Aug 10 2016, 5:10 PM
Lucie added a subscriber: Lucie. Aug 12 2016, 4:11 PM

Seems sane enough. It could easily be done with a simple wrapper to call PHP's hash() (and maybe related functions) rather than implementing hashing in pure Lua.

Change 304933 had a related patch set uploaded (by Hoo man):
Add mw.hash to Scribunto

https://gerrit.wikimedia.org/r/304933

hoo added a comment. Aug 15 2016, 10:21 PM

Seems sane enough. It could easily be done with a simple wrapper to call PHP's hash() (and maybe related functions) rather than implementing hashing in pure Lua.

That sounds good to me, I've taken a shot at implementing this.

Change 304933 merged by jenkins-bot:
Add mw.hash to Scribunto

https://gerrit.wikimedia.org/r/304933

hoo closed this task as Resolved. Aug 18 2016, 3:37 PM
hoo claimed this task.
hoo removed a project: Patch-For-Review.
hoo moved this task from Incoming to Done on the ArticlePlaceholder board.

Hi @hoo, on ruwiki we have been implementing the functionality mentioned in the task description, namely generating unique reference names for references created from Wikidata (and solving some adjacent problems). We started on this after T175725: Deploy HTML5 sections to WMF production was done for ruwiki and some things broke. Do you have any thoughts on which algorithm is best suited for generating hash values for references? After some modest research I settled on fnv164, since it is fast and a 64-bit hash reduces the collision probability to a negligibly small value. But maybe I am missing something.

hash_algos() offers the following choices:

"adler32", "crc32", "crc32b", "fnv132", "fnv164", "fnv1a32", "fnv1a64", "gost", "haval128,3", "haval128,4", "haval128,5", "haval160,3", "haval160,4", "haval160,5", "haval192,3", "haval192,4", "haval192,5", "haval224,3", "haval224,4", "haval224,5", "haval256,3", "haval256,4", "haval256,5", "joaat", "md2", "md4", "md5", "ripemd128", "ripemd160", "ripemd256", "ripemd320", "sha1", "sha224", "sha256", "sha3-224", "sha3-256", "sha3-384", "sha3-512", "sha384", "sha512", "snefru", "tiger128,3", "tiger128,4", "tiger160,3", "tiger160,4", "tiger192,3", "tiger192,4", "whirlpool"
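
To make the collision argument concrete, here is a minimal Python sketch of 64-bit FNV-1 (the variant that should correspond to `fnv164` in the list above) together with a birthday-bound estimate of the collision probability. The function names and the figure of 1000 references per page are assumptions chosen for illustration, not measurements from any wiki.

```python
import math

FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = (1 << 64) - 1


def fnv1_64(data: bytes) -> str:
    """64-bit FNV-1: multiply by the prime, then XOR in each byte."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h = (h * FNV64_PRIME) & MASK64  # keep the product within 64 bits
        h ^= byte
    return format(h, "016x")


def collision_probability(n: int, bits: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2^bits))."""
    return -math.expm1(-n * (n - 1) / (2.0 * 2.0 ** bits))


# Assumed workload: 1000 distinct references on a single page.
p32 = collision_probability(1000, 32)
p64 = collision_probability(1000, 64)
print(fnv1_64(b"a"), p32, p64)
```

Under that assumed workload, a 32-bit hash gives a collision chance on the order of one in ten thousand per page, while a 64-bit hash pushes it many orders of magnitude lower. That supports the intuition that any 64-bit option is ample, provided the inputs are not adversarial.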
Anomie added a comment. Nov 3 2017, 1:46 PM

This task is not the place to ask such questions. Your best bet would be to contact hoo on a wiki, on IRC, or via email. Even better would be to take advantage of a public mailing list, wiki help page, or Technical Advice IRC Meeting to ask your questions.

Thank you, @Anomie, I will choose another place. Actually, the lack of a place where local wikis' technicians can consult with MediaWiki developers seems to me to be a real problem; I am not sure the places you listed fully meet this need.