
Outreachy microtask: collect captcha data from signup page (#2)
Closed, Resolved; Public

Description

This is a microtask for Outreachy applicants for T158909: Automatically detect spambot registration using machine learning (like invisible reCAPTCHA).

  • Set up local development environment. (You'll probably want to use MediaWiki-Vagrant)
  • Set up EventLogging.
  • Create an EventLogging schema (on your own machine, not meta.wikimedia.org! see docs) for data to be collected for the captcha. (Doesn't have to be realistic, just put in some fields.)
  • Make MediaWiki log some data on registration. For example, you could use the LocalUserCreated hook and record the IP of the user.
  • Submit the code you wrote to Gerrit. Make sure to start the commit message with [DO NOT MERGE] and explain in the commit description that this is an Outreachy microtask and not intended to be merged. (Eventually the code should go into a new extension, but setting that up is too much work for a microtask.) Add mentors as reviewers.
  • Verify that event logging works - the records should show up in /vagrant/logs/eventlogging.log.
  • Share the schema and some logged data (e.g. in the form of a gist or a screenshot).
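
To make the schema and verification steps a bit more concrete, here is a rough Python sketch (not MediaWiki code, and not how the extension itself does validation). It assumes a made-up schema named SignupCaptchaData with invented fields, and it assumes each line of the log is a JSON object with "schema" and "event" keys; check your own eventlogging.log before relying on that.

```python
import json

# Hypothetical schema in the JSON-schema style used for EventLogging schemas.
# The schema name and every field below are invented for illustration.
SCHEMA_NAME = "SignupCaptchaData"
SIGNUP_SCHEMA = {
    "description": "Data collected on account creation (illustrative only)",
    "properties": {
        "ip": {"type": "string", "required": True},
        "userAgent": {"type": "string", "required": False},
        "formFillTimeMs": {"type": "integer", "required": False},
    },
}

TYPE_MAP = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_event(event, schema):
    """Return a list of problems; an empty list means the event fits the schema."""
    problems = []
    props = schema["properties"]
    for name, spec in props.items():
        if name not in event:
            if spec.get("required"):
                problems.append(f"missing required field: {name}")
            continue
        value = event[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        wrong_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if wrong_bool or not isinstance(value, TYPE_MAP[spec["type"]]):
            problems.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
    for name in event:
        if name not in props:
            problems.append(f"unexpected field: {name}")
    return problems


def events_for_schema(lines, schema_name):
    """Yield the 'event' payload of every JSON log line for the given schema."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON record
        if isinstance(record, dict) and record.get("schema") == schema_name:
            yield record.get("event", {})


# Stand-in for the lines of /vagrant/logs/eventlogging.log (assumed format).
sample_log = [
    '{"schema": "SignupCaptchaData", "revision": 1, '
    '"event": {"ip": "127.0.0.1", "formFillTimeMs": 5321}}',
    '{"schema": "OtherSchema", "revision": 2, "event": {}}',
    "not a json line",
]

for event in events_for_schema(sample_log, SCHEMA_NAME):
    print(event, validate_event(event, SIGNUP_SCHEMA))
```

To run it against real data, replace sample_log with an open file handle for the log. Only records for the named schema are checked; everything else is ignored.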

Please don't hesitate to ask questions if something is not clear. The documentation for EventLogging is particularly horrible :( Don't spend too much time trying to understand it, just ask. #wikimedia-devrel or {Z610} is a good starting point.

Event Timeline

Tgr renamed this task from Outreachy microtask: collect captcha data from signup page (#1) to Outreachy microtask: collect captcha data from signup page (#2). Sep 8 2017, 4:16 AM
Tgr updated the task description.

Hi all - I tried to create a schema for EventLogging. Not sure if it makes much sense or is correct, but just wanted to try my hand at it. Added it here. Please provide feedback and let me know how to proceed. Thanks.

@Veenasankar you should create the schema on your local MediaWiki installation. Sorry, that wasn't very clear in the instructions. You need to set up MediaWiki locally (probably via Vagrant, linked above), install the eventlogging role as described, create the schema in your wiki's Schema namespace, and use that for testing. I tried to clarify the extension installation docs; hope that helps.

(Our docs are not very good and definitely not written or reviewed with a newcomer's eye, so please ask wherever you think they don't make sense or are missing something.)

Tgr updated the task description.

Hi @Tgr,
I have done the above steps and a POST request is being sent with the logged data, but the request fails because no server is listening for it. How do I set this up?

@Groovier the logging is done by the systemd service eventlogging-devserver, so you should debug that - systemctl status eventlogging-devserver should tell you whether it's running, and systemctl restart eventlogging-devserver might fix problems. Re-running vagrant provision or restarting the system via vagrant reload might also help if it ended up in some weird state.

@Tgr, vagrant provision solved the issue. Thanks. Can you please look at my code review?
https://gerrit.wikimedia.org/r/#/c/385845/

Tgr claimed this task.