|Open||Miriam||T155538 General image classifier for commons|
|Open||Miriam||T215413 Image Classification Working Group|
|Invalid||Miriam||T228441 Design a pipeline for image classification|
|Resolved||Miriam||T250150 Improve prototypes of image classifiers trained on images from Commons Categories|
- Retrained the model with the new, polished data.
- Improvements are +7% overall and +15% for the classes where we modified the data! https://docs.google.com/spreadsheets/d/18Er84wdWIme_KMOrOYZZQxq5z0d9O4L0nZMMibzQ_rc/edit?usp=sharing
- Noticed another possible minor data improvement: some concepts draw their data from ambiguous Commons categories. My plan is to remove those and retrain the model on the cleaner data. Will try to do this next week.
- Refined the data, results are similar.
- Computed top-5 accuracy as the final metric for the classifiers. This metric is widely used in image classification competitions such as the ImageNet Large Scale Visual Recognition Challenge. It measures how often the correct label appears among the classifier's top-5 predictions.
- Top-5 accuracy is around 80% for the first version, and 81.5% for the improved one, with major gains on classes we have worked on this quarter. https://docs.google.com/spreadsheets/d/18Er84wdWIme_KMOrOYZZQxq5z0d9O4L0nZMMibzQ_rc/edit?usp=sharing
- I could close this task, but I still hope to train a network from scratch by the end of the quarter :)
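
For reference, the top-5 accuracy described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used for the evaluation in the spreadsheet; the function name `top_k_accuracy` and the toy scores are made up for the example, and it assumes each sample comes with one score per class and a single correct class index.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose correct label is among the k highest-scoring classes.

    `scores[i][c]` is the classifier's score for class `c` on sample `i`
    (hypothetical layout, chosen for this sketch); `labels[i]` is the index
    of the correct class for sample `i`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy example with 6 classes and 2 samples (made-up numbers).
toy_scores = [
    [0.10, 0.20, 0.30, 0.40, 0.50, 0.00],  # class 5 is the only one outside the top 5
    [0.90, 0.80, 0.70, 0.60, 0.50, 0.10],  # class 5 is the only one outside the top 5
]
toy_labels = [0, 5]  # first sample counts as a hit, second as a miss
print(top_k_accuracy(toy_scores, toy_labels, k=5))
```

With `k=1` the same function gives ordinary (top-1) accuracy, which makes it easy to report both numbers side by side.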