RE redirects - We don't reuse the redirect tables that @ArielGlenn dumps every few weeks because we need more up-to-date data, but we do something similar to what he described. We typically parse article pages for redirect templates, and add the information we extract about the redirect pages to the pages they redirect to.
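As a rough illustration of the approach described above, here is a minimal Python sketch. It assumes redirects are marked with the standard `#REDIRECT [[Target]]` wikitext syntax and that pages are already available as a title-to-wikitext mapping. The names `redirect_target` and `attach_redirects` are hypothetical, not the actual pipeline.

```python
import re
from collections import defaultdict

# Assumption: a redirect page starts with "#REDIRECT [[Target]]" (case-insensitive).
# The capture group stops at "|", "#" or "]" so section anchors are dropped.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def redirect_target(wikitext):
    """Return the redirect target title, or None if the page is not a redirect."""
    match = REDIRECT_RE.match(wikitext)
    if match is None:
        return None
    # Underscores and spaces are interchangeable in titles; full MediaWiki
    # title normalization (e.g. first-letter capitalization) is omitted here.
    return match.group(1).strip().replace("_", " ")


def attach_redirects(pages):
    """Map each target title to the sorted titles of pages redirecting to it.

    `pages` is a hypothetical dict of {title: wikitext}.
    """
    incoming = defaultdict(list)
    for title, text in pages.items():
        target = redirect_target(text)
        if target is not None:
            incoming[target].append(title)
    return {target: sorted(titles) for target, titles in incoming.items()}
```

Calling `attach_redirects` on a freshly parsed set of pages yields, for each target page, the list of titles that redirect to it, which can then be merged into whatever record is kept for that page.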
Jul 15 2020
Jul 10 2020
@RBrounley_WMF:
+1 on publishing the dataset as a small number of large splittable files compressed with a splittable format. That helps both downloading and distributed data processing.
Apr 1 2019
Nicolastorzec added a comment to T216160: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday.
Thanks for the summary Ariel.
Feb 20 2019
Nicolastorzec added a comment to T216160: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday.
I'm also interested in the specific reasons why the update frequency needs to be changed, i.e., besides streamlining the monthly workload on the Wikimedia machines.
Feb 18 2019
Nicolastorzec added a comment to T216160: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday.
Hi Ariel et al.,