Let's get some useful thresholds for the models. Generally, these thresholds are going to look a lot worse than they really are -- mostly because the labels we used for training are messy and incomplete. We're targeting at least 70% precision, but we're likely to actually achieve that even when we ask for only 50% precision -- and in some cases, we'll still get it when we target even lower precision.
So! We're going to use ORES's "threshold optimization" querying system. We'll need to make one call per topic in order to get an appropriate threshold:
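A minimal sketch of what one of these calls might look like. The endpoint shape follows ORES's documented v3 `model_info` API, but the exact quoting of labels and optimization strings here is an assumption, as are the example wiki/model names:

```python
from urllib.parse import quote

ORES_HOST = "https://ores.wikimedia.org"  # public ORES endpoint

def threshold_query_url(context, model, label, min_precision):
    """Build an ORES v3 model_info query asking for the threshold that
    maximizes recall subject to a minimum precision, for one topic label.
    (Quoting conventions here are an assumption -- check against a live
    response before relying on them.)"""
    optimization = 'maximum recall @ precision >= {}'.format(min_precision)
    model_info = 'statistics.thresholds."{}"."{}"'.format(label, optimization)
    return "{}/v3/scores/{}/?models={}&model_info={}".format(
        ORES_HOST, context, model, quote(model_info, safe='.":=@'))

# One call per topic label -- e.g. the biography topic at 70% precision:
url = threshold_query_url("enwiki", "articletopic",
                          "Culture.Biography.Biography*", 0.7)
# The URL can then be fetched with any HTTP client and the threshold
# read out of the returned JSON.
```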
Here, we can see some diversity. Culture.Biography.Biography* is easy to model and very common in the labeled data, so we can get both very high precision and very high recall with a strict threshold. STEM.Mathematics is on the other end of the spectrum: there are very few math-related articles at all, so I've relaxed the minimum precision to 0.3 in order to get a usable threshold.
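The per-topic relaxation described above can be sketched as a simple fallback: query at the strictest precision target first, and step down only when the model can't meet it. This helper (and the precision ladder in it) is illustrative, not part of ORES:

```python
def pick_threshold(results):
    """Given {min_precision: threshold_or_None} collected from successive
    ORES queries, return (precision, threshold) for the strictest
    precision target that actually yielded a threshold, or None if
    every target failed.  The 0.7 -> 0.5 -> 0.3 ladder is our choice,
    not anything ORES prescribes."""
    for target in sorted(results, reverse=True):
        if results[target] is not None:
            return target, results[target]
    return None

# Easy topic: meets the strict target, so we keep it.
pick_threshold({0.7: 0.91, 0.5: 0.64, 0.3: 0.41})   # -> (0.7, 0.91)

# Sparse topic like STEM.Mathematics: only the relaxed target works.
pick_threshold({0.7: None, 0.5: None, 0.3: 0.12})   # -> (0.3, 0.12)
```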