Having a well-cited page with a diverse set of sources unsurprisingly raises readers' trust. The purpose of this signal is to let reusers surface the number of citations to build confidence in the content. The listening tour conducted by the design team earlier in 5.3 confirmed this as one of the most persuasive signals for building reader trust in Wikimedia content.
Conditions of acceptance
- Create a technical design document that proposes at least one approach for tallying references within an article.
- Verify the design approach with MWI, MWP, and Amir S.
- Implement the recommended method for counting the number of references per page.
- The reference count should reflect the number of unique sources on the page; if the same source is used for multiple references, count it only once.
- Reference count is not returned on media files; it is only used for articles.
- Add a new field to the "trust_and_relevance" object returned by the signals endpoint to return this value: "reference_count": "string"
- [Stretch] Audit existing English Wikipedia pages for the distribution of references. Knowing the average, max, and distribution of the number of references will help design determine appropriate thresholds for returning opaque strings instead of explicit values.
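As a starting point for the design document, the unique-source rule above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the input is raw wikitext, that citations appear as <ref> tags, and that two references are "the same source" when they share a name attribute or have identical text after whitespace normalization. The function name count_unique_references is hypothetical, and references generated by templates or matching a named reference to an unnamed duplicate are not handled.

```python
import re

# Matches <ref ...>body</ref> and self-closing <ref ... />, but not <references />.
REF_PATTERN = re.compile(
    r"<ref(?P<attrs>\s[^>]*?)?\s*(?:/>|>(?P<body>.*?)</ref\s*>)",
    re.IGNORECASE | re.DOTALL,
)

# Extracts the value of a name attribute (double-quoted, single-quoted, or bare).
NAME_PATTERN = re.compile(
    r"""\bname\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s/>]+))""",
    re.IGNORECASE,
)


def count_unique_references(wikitext: str) -> int:
    """Count distinct references in wikitext (hypothetical helper).

    Assumption: a reference is identified by its name attribute when present,
    otherwise by its whitespace-normalized body text.
    """
    seen = set()
    for match in REF_PATTERN.finditer(wikitext):
        attrs = match.group("attrs") or ""
        name_match = NAME_PATTERN.search(attrs)
        name = ""
        if name_match:
            # Exactly one of the three capture groups is populated.
            name = next(g for g in name_match.groups() if g is not None).strip()
        if name:
            seen.add(("name", name))
            continue
        body = " ".join((match.group("body") or "").split())
        if body:
            seen.add(("body", body))
    return len(seen)


sample = (
    'Foo<ref name="a">Smith 2020</ref> bar<ref name="a" /> '
    "baz<ref>Jones  2019</ref> qux<ref>Jones 2019</ref>\n"
    "<references />"
)
print(count_unique_references(sample))
```

In the sample, the named reference is reused once and the unnamed reference is repeated once, so four <ref> tags collapse to two unique sources. Running this per page on the fly may be expensive, which is one reason a precomputed approach might be preferable.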
Implementation details
For now, return the value as the actual reference count, as a string. We should implement it as a string because we may later update the logic to conditionally return alternative messages at certain count thresholds (for example, returning "more than 5" or "5+" if there are 5 or more, but fewer than 10 references).
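A minimal sketch of how the string value could be produced, so that thresholds can be added later without changing the field type. The function name format_reference_count and the thresholds parameter are hypothetical, and the "N+" label format is only one of the options mentioned above; the real thresholds and wording are still pending design.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def format_reference_count(
    count: int, thresholds: Optional[Sequence[int]] = None
) -> str:
    """Return the reference count as a string (hypothetical helper).

    With no thresholds, this returns the exact count, matching the current
    plan. With thresholds (an assumption, pending design), counts at or above
    a threshold are reported as "<threshold>+" using the highest one reached.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if not thresholds:
        return str(count)
    ordered = sorted(thresholds)
    # Number of thresholds that are <= count.
    reached = bisect_right(ordered, count)
    if reached == 0:
        return str(count)
    return f"{ordered[reached - 1]}+"


# Current behavior: exact count as a string.
print(format_reference_count(7))
# Possible future behavior with illustrative thresholds of 5 and 10.
print(format_reference_count(7, thresholds=[5, 10]))
```

The returned string would then be placed under "reference_count" in the "trust_and_relevance" object, so consumers see the same field type whether the value is exact or opaque.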
Notes
- We are waiting for design to confirm this approach and the specific thresholds they would like to see.
- DPE would like to be involved in the design of this work. Because reference count may be expensive to calculate on the fly, it might make more sense to set up a derived data pipeline in partnership with DPE.
Open questions:
If reference count is hard to get, do we need to keep it as a trust_and_relevance property? Maybe this could be another signal.