Ok. So I should record a contribution on this page.
Oct 30 2020
In T263874#6589856, @Isaac wrote:
Hey everyone! A few days left to get in those final contributions on the Outreachy site. Make sure you complete your final application there (you can do this today and still edit it up until the deadline). Diego also posted some good general feedback about notebooks at T263860#6589759 that I wanted everyone to see:
I have a general recommendation for all of you: keep the notebook easy to read. That means:
- Explain each piece of code that you are running. The idea is to make the notebook easy to understand. Don't make the reader guess what you were trying to do.
- Describe your motivation and conclusions for every statistic you show. For example, why are you plotting variable X or Y, and what is your takeaway?
- Avoid long or repetitive code outputs that don't provide relevant information. For example, if you are applying a model that runs 1000 epochs, avoid printing 1000 lines, one per epoch, because that makes the notebook difficult to read. If you think there is relevant information in those outputs, think about how to show it in a way that is compact and easy to understand (for example, a plot).
Also, I know the timeline part of the application can be confusing. Some general points about it:
- This is an opportunity for you to indicate whether there are any components of the project that are more interesting to you (spend more time on them) or where you feel you would need to learn some skills in advance. We don't expect anyone to know everything they need to do these projects, so don't hesitate to explain where you'd want to do some additional learning etc.
- Note if you have any previous commitments that would prevent you from working a given week.
- We know you won't have a perfect plan for the project as you only know as much as we've said on the tasks about them. Do your best but we'll be more interested in the other questions in the application and Jupyter notebook submission.
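Diego's point above about long, repetitive outputs can be sketched in a few lines. This is only an illustration: the "loss" is a made-up decaying number rather than a real model, and the names `train`, `log_every`, and `history` are hypothetical. The idea is to keep every value in a list (so it can be plotted later) while printing only a handful of progress lines.

```python
def train(num_epochs=1000, log_every=100):
    """Toy loop: the 'loss' below is a placeholder, not a real training step."""
    history = []   # one value per epoch, kept for a later plot
    logged = []    # the few lines that actually get printed
    for epoch in range(1, num_epochs + 1):
        loss = 1.0 / epoch  # stand-in for a real training step
        history.append(loss)
        if epoch % log_every == 0:
            line = f"epoch {epoch:4d}  loss {loss:.4f}"
            logged.append(line)
            print(line)
    return history, logged


history, logged = train()
# `history` can now be handed to any plotting library to give one compact
# figure instead of 1000 printed lines.
```

With the defaults this prints 10 lines instead of 1000, and the full curve is still available in `history`.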
Oct 27 2020
Yes, now it seems to be okay. Thank you so much @Isaac, I'm really relieved now.
Hello @Isaac and everyone.
Oct 26 2020
Oct 25 2020
In T263874#6576411, @Amamgbu wrote:
In T263874#6576410, @SafiaKhaleel wrote:
Hi @Isaac and everyone,
Can anyone give me an idea to find if a page is about a human or not?

You could reference the tutorial given to us by @Isaac. There’s a segment on that in it.
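The tutorial's own approach isn't reproduced in this thread, so the following is only one possible sketch. It assumes you already have the Wikidata entity JSON for the page's item (for instance from the `wbgetentities` API module) and checks whether any "instance of" (P31) statement points at the item for "human" (Q5). The function name `is_human` and the two sample entities are hypothetical, hand-trimmed illustrations.

```python
HUMAN_QID = "Q5"      # Wikidata item for "human"
INSTANCE_OF = "P31"   # Wikidata property "instance of"


def is_human(entity):
    """Return True if the entity has an 'instance of' statement with value Q5."""
    for statement in entity.get("claims", {}).get(INSTANCE_OF, []):
        # Statements with "no value" / "unknown value" carry no datavalue.
        datavalue = statement.get("mainsnak", {}).get("datavalue")
        if datavalue and datavalue.get("value", {}).get("id") == HUMAN_QID:
            return True
    return False


# Hand-written, heavily trimmed sample entities (illustrative only).
sample_person = {
    "claims": {"P31": [{"mainsnak": {"datavalue": {"value": {"id": "Q5"}}}}]}
}
sample_place = {
    "claims": {"P31": [{"mainsnak": {"datavalue": {"value": {"id": "Q515"}}}}]}
}

print(is_human(sample_person), is_human(sample_place))
```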
Hi @Isaac and everyone,
Can anyone give me an idea to find if a page is about a human or not?
Oct 19 2020
In T263874#6560006, @Amamgbu wrote:
In T263874#6558281, @SafiaKhaleel wrote:
@Isaac and everyone,
Can we access the 'page_counter' field from the page table, given that it was removed completely in MediaWiki 1.25? Is there any other method to get the views of each page?

You can query for page views. You can reference the MediaWiki query API documentation to get this info, though I think it returns a maximum of 60 days.
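As a rough sketch of the reply above: on Wikimedia wikis the query API offers a `pageviews` property (from the PageViewInfo extension) whose `pvipdays` parameter is capped at 60, which matches the limit mentioned. The helper names `pageview_params` and `total_views` are hypothetical, and the sample response is hand-written to mimic the `formatversion=2` shape rather than copied from a real call, so treat the details as assumptions to check against the API documentation.

```python
API_URL = "https://en.wikipedia.org/w/api.php"  # assumed target wiki


def pageview_params(title, days=60):
    """Build query parameters for prop=pageviews (at most 60 days)."""
    if not 1 <= days <= 60:
        raise ValueError("days must be between 1 and 60")
    return {
        "action": "query",
        "prop": "pageviews",
        "titles": title,
        "pvipdays": days,
        "format": "json",
        "formatversion": 2,
    }


def total_views(response):
    """Sum the daily counts per page, skipping days with no data (None)."""
    totals = {}
    for page in response["query"]["pages"]:
        daily = page.get("pageviews", {})
        totals[page["title"]] = sum(v for v in daily.values() if v is not None)
    return totals


# Hand-written sample in the expected response shape (not real data).
sample_response = {
    "query": {
        "pages": [
            {
                "pageid": 1,
                "title": "Example",
                "pageviews": {"2020-10-16": 10, "2020-10-17": None, "2020-10-18": 5},
            }
        ]
    }
}

params = pageview_params("Example", days=3)
print(total_views(sample_response))
```

To run it against the live API, the parameters would be passed to an HTTP client, e.g. `requests.get(API_URL, params=params).json()`.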
Oct 18 2020
@Isaac and everyone,
Can we access the 'page_counter' field from the page table, given that it was removed completely in MediaWiki 1.25? Is there any other method to get the views of each page?
@Isaac and everyone,
Is there any way to get the reason for protection of a given page? If yes, where do we get that data from?
Oct 11 2020
In T263874#6534884, @Amamgbu wrote:
In T263874#6534882, @SafiaKhaleel wrote:
In T263874#6534745, @Amamgbu wrote:
In T263874#6534735, @SafiaKhaleel wrote:
@Isaac
The protection data that I got from the MediaWiki dump and the API results seem to be in different formats.
For a particular page, we get data in the following formats:
In MediaWiki dump: (3664672,'edit','autoconfirmed',0,NULL,'infinity',717409)
In API result: {'pageid': 3664672, 'ns': 10, 'title': 'Template:Cyclopaedia 1728', 'contentmodel': 'wikitext', 'pagelanguage': 'en', 'pagelanguagehtmlcode': 'en', 'pagelanguagedir': 'ltr', 'touched': '2020-10-10T18:45:26Z', 'lastrevid': 952065611, 'length': 3572, 'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}], 'restrictiontypes': ['edit', 'move']}
Isn't that a problem? How are we supposed to check the discrepancies between the two if they are in different formats? Moreover, the protection data in the API results doesn't show user-specific restrictions or sysop permissions. Is that OK?

Hi @SafiaKhaleel, it is possible to compare those tuples with the JSON objects through indexing. You can check the tutorial notebook @Isaac shared with us for more details.
@Isaac mentioned that the user-specific restrictions field is obsolete and we should disregard it. The sysop permissions are stored in the JSON data as "level".

Thanks @Amamgbu. I understood that. But the protection type also seems to be different in the two cases. In the API result, all the pages have both edit and move protection, as you can see here:
'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}]
whereas in the MediaWiki dump, there is only one type out of the two: (39620487,'edit','autoconfirmed',0,NULL,'infinity',692808)

There are actually two if you inspect the data well. I had that same issue till I inspected the IDs.
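The "compare through indexing" suggestion could look roughly like the sketch below. It reuses the dump tuple and API result quoted above, and it assumes (from that one sample) that the page ID, type, level, and expiry sit at tuple positions 0, 1, 2, and 5. The helper names `from_dump` and `from_api` are hypothetical, and only the single dump row quoted in the thread is included here.

```python
# Dump row as quoted in the thread (SQL NULL written as Python None).
dump_rows = [
    (3664672, 'edit', 'autoconfirmed', 0, None, 'infinity', 717409),
]

# API result as quoted in the thread, trimmed to the fields used here.
api_page = {
    'pageid': 3664672,
    'title': 'Template:Cyclopaedia 1728',
    'protection': [
        {'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'},
        {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'},
    ],
}


def from_dump(rows):
    """Normalise dump tuples to (page_id, type, level, expiry) by position."""
    return {(row[0], row[1], row[2], row[5]) for row in rows}


def from_api(page):
    """Normalise one API page object to the same (page_id, type, level, expiry) shape."""
    return {
        (page['pageid'], p['type'], p['level'], p['expiry'])
        for p in page.get('protection', [])
    }


dump_set = from_dump(dump_rows)
api_set = from_api(api_page)

only_in_api = api_set - dump_set    # present in the API, missing from these dump rows
only_in_dump = dump_set - api_set   # present in these dump rows, missing from the API
print(only_in_api, only_in_dump)
```

With just this one dump row, the move protection shows up as "only in API". As noted in the reply above, the dump stores each protection type as its own row, so the matching move row has to be collected before the two sets are compared.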
@Isaac
The protection data that I got from the MediaWiki dump and the API results seem to be in different formats.
For a particular page, we get data in the following formats:
In MediaWiki dump: (3664672,'edit','autoconfirmed',0,NULL,'infinity',717409)
Oct 10 2020
In T263874#6534125, @Arfat2396 wrote:
I'm trying to read the uncompressed data using the gzip library, but every time I run the cell I get this error. Does anyone know what's causing this?
IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)
Update: I found a workaround by iterating and displaying selective parts of data.
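The workaround mentioned in the update (iterating and showing only selected parts) might look something like this. It is a sketch rather than the poster's actual code: `preview_gzip`, `max_lines`, and `max_chars` are hypothetical names, and the demo writes its own small temporary file instead of touching a real dump.

```python
import gzip
import itertools
import os
import tempfile


def preview_gzip(path, max_lines=5, max_chars=200):
    """Read only the first few lines of a gzip file, truncating each one."""
    preview = []
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        # islice stops after max_lines, so the rest of the file is never read.
        for line in itertools.islice(handle, max_lines):
            preview.append(line.rstrip("\n")[:max_chars])
    return preview


# Demo on a throwaway file with long lines, standing in for a real dump.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.sql.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as out:
        for i in range(1000):
            out.write(f"row {i} " + "x" * 500 + "\n")
    demo_preview = preview_gzip(demo_path, max_lines=3, max_chars=20)

print(demo_preview)
```

Because only a few short strings reach the notebook output, the cell stays well under the IOPub data rate limit however large the underlying file is.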
In T263874#6533330, @Thulieblack wrote:
Perfect evening
I'm really confused and need someone to clarify cell 5 for me.
For the example that loops through all pages and extracts data, how do I do that? The Python docs don't really give an example.
Oct 8 2020
Hi everyone! I'm Safia, another Outreachy applicant. I'm kinda new to open source but really interested in this project. Let's all do our best!