This is a microtask for Outreachy applicants for T158909: Automatically detect spambot registration using machine learning (like invisible reCAPTCHA).
- Set up a Jupyter notebook to work in.
- Clone a sample data set, https://github.com/balabit/Mouse-Dynamics-Challenge
- Come up with several features that can be calculated from the mouse movement data, for example time between movements, movement arc curvature, speed of movement. These features will be refined later, so don't worry about choosing the perfect features.
- Extract feature vectors for at least one of the mouse movement histories in the training data, and store them as a NumPy array or in any other format that sklearn classifiers can consume.
- Display a sample of the feature vectors inside the notebook, either as a table or graphically.
- Publish to GitHub or another publicly accessible Git server.
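
As a starting point for the feature extraction step, here is a minimal sketch, not a reference solution. It assumes you have already loaded one session into three equal-length sequences: timestamps in seconds and cursor x/y positions in pixels. How you load them depends on the data set's file layout, so check the column names in the actual session files. The function name `extract_features` and the choice of turning angle as a rough stand-in for arc curvature are illustrative assumptions.

```python
import numpy as np


def extract_features(t, x, y):
    """Return one feature row per movement step (hypothetical helper).

    Columns: time since previous event, speed (px/s), turning angle (rad).
    Assumes t is in seconds and x, y are pixel coordinates.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Differences between consecutive events.
    dt = np.diff(t)
    dx = np.diff(x)
    dy = np.diff(y)

    # Drop steps with non-positive time deltas to avoid division by zero.
    keep = dt > 0
    dt, dx, dy = dt[keep], dx[keep], dy[keep]

    speed = np.hypot(dx, dy) / dt

    # Direction of each step, then the change in direction between steps,
    # wrapped into [-pi, pi) so a small left turn and a small right turn
    # have similar magnitude.
    heading = np.arctan2(dy, dx)
    turn = np.diff(heading)
    turn = (turn + np.pi) % (2 * np.pi) - np.pi

    # The turning angle needs two steps, so skip the first step in the
    # other columns to keep every column the same length.
    return np.column_stack([dt[1:], speed[1:], turn])


# Synthetic example: the cursor traces three sides of a unit square.
features = extract_features([0, 1, 2, 3], [0, 1, 1, 0], [0, 0, 1, 1])
print(features.shape)
```

The result is a plain 2-D array with one row per step, which sklearn classifiers accept directly. To display a sample in the notebook, wrapping it in a `pandas.DataFrame` with named columns and calling `.head()` is one simple option.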