- DrTrigon (1st development phase, around 2013)
[ ] https://commons.wikimedia.org/wiki/User:DrTrigonBot/doc - contains old/most recent TODO list, original opencv bot proposal, etc.
[ ] alternative to pHash (which has not been developed since 2013): http://blockhash.io/ (or just create an icon by averaging over pixels, i.e. by reducing resolution/scale/zoom)
[ ] wavelet decompositions for peak detection, color regions, fingerprinting/hashing, frequency decomposition, denoising, compression, etc.
- code/software:
- http://www.pybytes.com/pywavelets/regression/wp2d.html (supports 2D data, mature, see WaveletPacket2D.get_leaf_nodes() and store as xml/json)
- http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/
- literature/paper:
- https://www.researchgate.net/post/How_wavelet_transform_coefficient_used_for_image_classification
- http://www.cmapx.polytechnique.fr/~yu/publications/ICPR08Final.pdf <- **implement this as it supports object recognition, texture and satellite image classification, text/image language identification, sound classification**
- patch transformation: http://people.csail.mit.edu/taegsang/Documents/CVPRPatch.pdf
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.684.5988&rep=rep1&type=pdf
- http://ac.els-cdn.com/S0377042706006431/1-s2.0-S0377042706006431-main.pdf?_tid=dca58f04-2708-11e6-9c23-00000aab0f27&acdnat=1464683149_3f76b534460f9c998182d86714c80597 (watermarking)
- http://soundlab.cs.princeton.edu/publications/2001_amta_aadwt.pdf
- **Text Classification by Aggregation of SVD Eigenvectors**: http://delab.csd.auth.gr/papers/ADBIS2012skm.pdf (might not be very useful... how many texts do we need to categorize?)
- Chapter 15 - BLIND SOURCE SEPARATION: http://www.mit.edu/~gari/teaching/6.555/LECTURE_NOTES/ch15_bss.pdf
[ ] head pose estimation
- see: http://rpg.ifi.uzh.ch/software_datasets.html (Perspective 3-Point (P3P) Algorithm)
- http://rpg.ifi.uzh.ch/software/p3p_code_final.zip
- http://rpg.ifi.uzh.ch/docs/CVPR11_kneip.pdf
[ ] T137558: render error detection (see T136934)
- using `convert` (ImageMagick), the Commons default, allows comparing Commons results against other libraries and e.g. finding rendering errors, see https://github.com/AbdealiJK/file-metadata/issues/37
- possible categories to check for testing are:
- https://commons.wikimedia.org/wiki/Category:PDF_files_affected_by_MediaWiki_restrictions
- https://commons.wikimedia.org/wiki/Category:Images_without_thumbnails
- https://commons.wikimedia.org/wiki/Category:Images_with_render_problem
[ ] {T61499}
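A minimal sketch of the "icon by averaging over pixels" idea from the pHash alternative item above. It is not the blockhash.io algorithm (which this sketch does not try to reproduce); it only assumes a grayscale image given as a list of pixel rows. The names `average_hash` and `hamming_distance` are made up for illustration.

```python
def average_hash(pixels, hash_size=8):
    """Reduce a grayscale image (list of rows) to a hash_size x hash_size icon
    by block averaging, then threshold every icon cell against the icon mean."""
    height, width = len(pixels), len(pixels[0])
    if height < hash_size or width < hash_size:
        raise ValueError("image is smaller than the hash grid")
    icon = []
    for row in range(hash_size):
        y0, y1 = row * height // hash_size, (row + 1) * height // hash_size
        for col in range(hash_size):
            x0, x1 = col * width // hash_size, (col + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            icon.append(sum(block) / len(block))
    mean = sum(icon) / len(icon)
    bits = 0
    for value in icon:
        # One bit per icon cell: 1 if brighter than the icon mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes (small means similar)."""
    return bin(a ^ b).count("1")


# Horizontal gradient, a uniformly brightened copy, and a mirrored copy.
gradient = [[x * 10 for x in range(16)] for _ in range(16)]
brighter = [[value + 5 for value in row] for row in gradient]
mirrored = [list(reversed(row)) for row in gradient]
```

Because each cell is compared with the icon mean, a uniform brightness change leaves the hash unchanged, while mirroring the gradient flips every bit. Two files would be treated as near-duplicates when the Hamming distance is below some threshold that still has to be chosen.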
- AbdealiJK, jayvdb, DrTrigon (2nd development phase, GSoC 2016)
[ ] T135836#2314683, T135836#2314835: face recognition (e.g. like Facebook) as well as age and gender estimation - needs some kind of DB (e.g. Commons)
[ ] T135836#2314683: facial landmarks
[ ] audio fingerprinting and recognition
- https://news.ycombinator.com/item?id=8303713
- http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/
- https://github.com/worldveil/dejavu
- @DrTrigon had a nice IRC chat with rillke, very supportive and inspiring:
[ ] https://acoustid.org/fingerprinter
[ ] https://bitbucket.org/acoustid/profile/repositories
[ ] docker container and/or puppet script for vagrant (labs, VM e.g. for Windows users)
[ ] http://echoprint.me/
[ ] https://github.com/spotify/echoprint-codegen (this could be a game changer)
[ ] https://github.com/AbdealiJK/file-metadata/issues/15
- Detect line drawings
- Detect pie charts
- Detect line charts
- Detect if SVG
[ ] {T134644}
[ ] {T137558}
[ ] {T138119}
[ ] learning? how time-consuming? (so as not to spend too much time on something that we cannot finish - though it should actually be quite easy to get a first theoretically working script)
[ ] train the bot with images of persons we know in advance will appear in a dataset (e.g. generals or politicians during wars, etc.)
[ ] train the bot on the dataset itself, at least after humans have gone over it
[ ] Z441#5618: what happens if you take an image flandmark cannot detect, add some amount of random noise, resize and rotate it a bit and retry - as if you were sitting in front of the cam and moved and tilted your head a bit until it gets the detection
[ ] https://pypi.python.org/pypi/tesserocr
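The noise/resize/rotate retry idea from Z441#5618 above could be sketched roughly as follows. Everything here is an assumption for illustration: `detect` stands for some wrapper around the landmark detector (e.g. flandmark) that returns `None` on failure, and `perturb` stands for a helper that rotates, rescales and adds noise to an image. Neither is an existing API.

```python
import random


def jittered_variants(attempts, max_angle=10.0, max_scale=0.1, max_noise=5.0, seed=0):
    """Yield (angle_degrees, scale_factor, noise_sigma) tuples.

    The first tuple is the identity (the unmodified image); the rest are
    small random perturbations drawn from a seeded generator.
    """
    rng = random.Random(seed)
    yield 0.0, 1.0, 0.0
    for _ in range(attempts - 1):
        yield (rng.uniform(-max_angle, max_angle),
               1.0 + rng.uniform(-max_scale, max_scale),
               rng.uniform(0.0, max_noise))


def detect_with_jitter(image, detect, perturb, attempts=10, **jitter):
    """Try detect() on the original image, then on perturbed copies.

    detect(image) -> landmarks or None        (hypothetical detector wrapper)
    perturb(image, angle, scale, sigma) -> image   (hypothetical helper)
    Returns (landmarks, params) for the first success, else (None, None).
    """
    for params in jittered_variants(attempts, **jitter):
        is_identity = params == (0.0, 1.0, 0.0)
        candidate = image if is_identity else perturb(image, *params)
        landmarks = detect(candidate)
        if landmarks is not None:
            return landmarks, params
    return None, None
```

Note that landmarks found on a perturbed copy are in the coordinates of that copy, so the returned `params` would be needed to map them back onto the original image.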
Categories to assign (see https://etherpad.wikimedia.org/p/Zl7V7KuK7J):
- [[ https://commons.wikimedia.org/wiki/Category:Portraits | Category:Portraits ]] -> size of face (ratio compared to picture size - kind of coverage) and orientation (head pose)
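A rough sketch of the face-size criterion for Category:Portraits: the ratio of the face bounding box to the picture area, combined with a head-pose limit. The box format `(x, y, width, height)`, the function names, and both thresholds (`min_coverage`, `max_yaw`) are assumptions for illustration, not values taken from the etherpad.

```python
def face_coverage(face_box, image_size):
    """Fraction of the image area covered by a face bounding box.

    face_box   -- (x, y, width, height) in pixels (assumed format)
    image_size -- (image_width, image_height) in pixels
    The box is clipped to the image before the area is computed.
    """
    x, y, w, h = face_box
    img_w, img_h = image_size
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + w), min(img_h, y + h)
    area = max(0, x1 - x0) * max(0, y1 - y0)
    return area / (img_w * img_h)


def is_portrait_candidate(face_box, image_size, yaw_degrees=0.0,
                          min_coverage=0.1, max_yaw=30.0):
    """Suggest Category:Portraits if the face is large and roughly frontal.

    min_coverage and max_yaw are placeholder thresholds that would need
    tuning against manually categorized images.
    """
    large_enough = face_coverage(face_box, image_size) >= min_coverage
    frontal_enough = abs(yaw_degrees) <= max_yaw
    return large_enough and frontal_enough
```

For example, a 50x50 face in a 100x100 image covers a quarter of the picture and would be suggested, while the same face turned 60 degrees away would not.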