Amamgbu (Jesse Amamgbu)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 8 2020, 8:38 AM (187 w, 5 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Amamgbu [ Global Accounts ]

Recent Activity

Mar 9 2021

Amamgbu claimed T276834: Include more languages in the selector.
Mar 9 2021, 9:33 PM · Research
Amamgbu claimed T276833: Replace jquery.dataTables with a native Vue library.
Mar 9 2021, 2:01 AM · Research
Amamgbu claimed T276831: Loading indicator during inference API call.
Mar 9 2021, 2:00 AM · Research

Mar 8 2021

Amamgbu added a comment to T276834: Include more languages in the selector.

Nice question. There is a French ground truth available. Subsequently, the model will accommodate several ground truths, which will be used alongside the language picker.

Mar 8 2021, 10:37 PM · Research

Oct 26 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hello everyone,
I'd like to raise a question concerning the predictive model we are to build in the later part of the notebook.
Are we predicting the protection types, or are we predicting whether a page is protected or not?

Oct 26 2020, 2:39 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Thank you so much @Isaac, @Amamgbu and @Tambe. You guys have been really helpful.
However, when I use the value wbgetentities for the action parameter, I'm getting the following error:

APIError: badvalue: Unrecognized value for parameter "action": wbgetentities. -- See https://en.wikipedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes.
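The error occurs because wbgetentities is a Wikibase action served by the Wikidata API endpoint (www.wikidata.org/w/api.php), not by en.wikipedia.org. A minimal sketch of a working call, assuming the requests library and using Q42 purely as an example item ID:

import requests

response = requests.get(
    "https://www.wikidata.org/w/api.php",
    params={
        "action": "wbgetentities",
        "ids": "Q42",        # Q42 (Douglas Adams) is used here purely as an example
        "props": "claims",   # only fetch the claims (statements)
        "format": "json",
    },
)
entity = response.json()["entities"]["Q42"]
print(sorted(entity["claims"].keys())[:5])   # first few property IDs, e.g. P31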

Oct 26 2020, 9:02 AM · Outreachy (Round 21)

Oct 25 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi @Isaac and everyone,
Can anyone give me an idea of how to find whether a page is about a human or not?

You could reference the tutorial given to us by @Isaac. There’s a segment on that in it.

Yes, but that's for checking whether an individual "item" is human or not. I was wondering whether we could apply that to pages too.
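One common way to bridge the two, sketched below on the assumption that the requests library is available (this is not necessarily what the tutorial prescribes): map the page title to its Wikidata item via prop=pageprops, then check whether the item's P31 ("instance of") claims include Q5 ("human").

import requests

def page_is_human(title, lang="en"):
    """Rough check: map a Wikipedia page to its Wikidata item and test P31 = Q5 (human)."""
    # Step 1: page title -> Wikidata item ID via the pageprops property
    r = requests.get(
        f"https://{lang}.wikipedia.org/w/api.php",
        params={"action": "query", "prop": "pageprops", "ppprop": "wikibase_item",
                "titles": title, "format": "json"},
    )
    page = next(iter(r.json()["query"]["pages"].values()))
    qid = page.get("pageprops", {}).get("wikibase_item")
    if qid is None:
        return False  # page has no linked Wikidata item
    # Step 2: fetch the item's P31 ("instance of") claims and look for Q5 (human)
    r = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbgetclaims", "entity": qid, "property": "P31", "format": "json"},
    )
    claims = r.json().get("claims", {}).get("P31", [])
    return any(
        c["mainsnak"].get("datavalue", {}).get("value", {}).get("id") == "Q5"
        for c in claims
    )

print(page_is_human("Ada Lovelace"))                    # expected True
print(page_is_human("Python (programming language)"))   # expected False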

Oct 25 2020, 5:47 AM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi @Isaac and everyone,
Can anyone give me an idea of how to find whether a page is about a human or not?

Oct 25 2020, 5:08 AM · Outreachy (Round 21)

Oct 24 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi @Isaac and everyone

Oct 24 2020, 8:24 AM · Outreachy (Round 21)

Oct 22 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Welcome to all the new applicants since I last posted a welcome! One request for everyone working on this task:

  • To get a good sense of how many people are intending to apply to each project (T263646 and/or T263860), I'd ask that you make an initial contribution on the Outreachy site with a link to your current progress in the next two days (so by end-of-day October 23rd).

Thanks! Keep the questions / collaboration coming!

Oct 22 2020, 7:52 AM · Outreachy (Round 21)

Oct 19 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Thanks @Amamgbu. But a lot of pages seem to have a 'null' value for the pageviews variable.
Another problem I encountered is the limit up to which we can query page information using title/pageid. Getting only 50 instances at a time is not enough to work with, right? Does anyone know how to increase the limit up to 500? (It says "500 for clients allowed higher limits".)
@Isaac
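For context: the 500-value limit applies to accounts with the apihighlimits right (for example, bots); without it, the usual workaround is to batch requests in groups of at most 50 titles. A rough sketch, assuming the requests library and a hypothetical page_titles list:

import requests

API_URL = "https://en.wikipedia.org/w/api.php"   # adjust to your language edition
page_titles = ["Earth", "Mars", "Venus"]         # hypothetical list of titles to look up

def chunks(seq, size=50):
    # Yield successive batches of at most `size` titles (the limit for regular users).
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

results = {}
for batch in chunks(page_titles, 50):
    r = requests.get(API_URL, params={
        "action": "query",
        "prop": "info",
        "inprop": "protection",
        "titles": "|".join(batch),   # up to 50 titles joined with '|'
        "format": "json",
    })
    results.update(r.json()["query"]["pages"])

print(len(results), "pages fetched")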

Oct 19 2020, 2:58 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Amamgbu thanks, I have sorted it out 🤗🤗

Oct 19 2020, 11:53 AM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Isaac and everyone,
Can we still access the 'page_counter' variable from the page table, given that it was removed completely in MediaWiki 1.25? Is there any other method to get the views of each page?
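One alternative, sketched below under the assumption that the requests library is available (the tutorial may intend a different route), is the Wikimedia Pageviews REST API, which serves per-article view counts since mid-2015:

import requests

def daily_pageviews(article, project="en.wikipedia", start="20201001", end="20201015"):
    """Fetch daily pageview counts from the Wikimedia REST API (dates are YYYYMMDD)."""
    url = (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        f"{project}/all-access/user/{article}/daily/{start}/{end}"
    )
    r = requests.get(url, headers={"User-Agent": "page-protection-tutorial-example"})
    r.raise_for_status()
    return [(item["timestamp"], item["views"]) for item in r.json()["items"]]

print(daily_pageviews("Ada_Lovelace")[:3])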

Oct 19 2020, 11:52 AM · Outreachy (Round 21)

Oct 18 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Amamgbu inprop gave me an error. I passed prop=info; that's how I got all the queries.

Oct 18 2020, 10:41 AM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Amamgbu take a look: {F32397114}

Oct 18 2020, 9:10 AM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Perfect day. I need to ask: my API response doesn't show the protection type, unlike the dump data. Is it something I need to worry about?

Oct 18 2020, 7:19 AM · Outreachy (Round 21)

Oct 17 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Amamgbu that is correct and @Vanevela pointed to the appropriate prior discussion about this. More details: the restrictiontypes field is just what restrictions could be applied to the page, not which ones are applied -- a fuller description of what you could find in that field can be found here. For most pages, you'll see edit and move and can verify this by choosing a random page without restrictions and querying the API. I'd suggest ignoring the field as it won't tell you much.
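To see this for yourself, a quick sketch (assuming the requests library) that pulls one random main-namespace article and prints both fields; for an unprotected page, protection is typically an empty list while restrictiontypes still lists edit and move:

import requests

API_URL = "https://en.wikipedia.org/w/api.php"

# Pick one random main-namespace article ...
r = requests.get(API_URL, params={
    "action": "query", "list": "random", "rnnamespace": 0, "rnlimit": 1, "format": "json",
})
title = r.json()["query"]["random"][0]["title"]

# ... and inspect its protection-related fields.
r = requests.get(API_URL, params={
    "action": "query", "prop": "info", "inprop": "protection",
    "titles": title, "format": "json",
})
page = next(iter(r.json()["query"]["pages"].values()))
print(title)
print("protection:      ", page.get("protection"))        # usually [] for a random page
print("restrictiontypes:", page.get("restrictiontypes"))  # usually ['edit', 'move']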

Oct 17 2020, 7:54 PM · Outreachy (Round 21)

Oct 16 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi @Amamgbu
I think that this issue was addressed here; I hope that this helps you.

Oct 16 2020, 11:23 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Isaac and everyone.

Oct 16 2020, 10:17 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi,
I am extremely sorry for this stupid question, but can anyone please guide me a bit about what is happening in "example of working with Wikidata data" in cell 5? Basically, I am having difficulty understanding the format of the data extraction from the JSON dump. What checks are being performed by the if and for statements?

Thanks
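I can't speak for the exact code in cell 5, but the usual pattern for the Wikidata JSON dump is: the file is one huge JSON array with one entity per line, so you stream it line by line, strip the trailing comma, skip the opening and closing bracket lines, parse each line with json.loads, and then use if checks to keep only the entities or claims you care about. A generic sketch; the file name is a placeholder:

import bz2
import json

DUMP_PATH = "latest-all.json.bz2"   # hypothetical path; use gzip.open for the .json.gz variant

with bz2.open(DUMP_PATH, "rt") as fin:
    for line in fin:
        line = line.strip().rstrip(",")      # one entity per line, comma-terminated
        if line in ("[", "]") or not line:   # skip the array brackets and blank lines
            continue
        entity = json.loads(line)
        if entity.get("type") != "item":     # keep only items (skip properties etc.)
            continue
        claims = entity.get("claims", {})
        if "P31" in claims:                  # example check: has an "instance of" statement
            print(entity["id"], sorted(claims.keys())[:3])
            break                            # remove this to process the whole dump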

Oct 16 2020, 9:46 PM · Outreachy (Round 21)

Oct 15 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi,
I am unable to understand "head -46" in the line of code below.

!zcat "{DUMP_DIR}{DUMP_FN}" | head -46 | cut -c1-1000
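On the first point: head -46 keeps only the first 46 lines of the decompressed stream, and cut -c1-1000 truncates each of those lines to its first 1,000 characters, so the cell just previews the start of the dump without printing everything. A rough Python equivalent, assuming the dump is gzip-compressed as the zcat call implies (the path is a placeholder):

import gzip
import itertools

DUMP_PATH = "dump.sql.gz"   # hypothetical; stands in for DUMP_DIR + DUMP_FN

with gzip.open(DUMP_PATH, "rt", errors="replace") as fin:
    # Same idea as `head -46 | cut -c1-1000`: first 46 lines, at most 1000 characters each.
    for line in itertools.islice(fin, 46):
        print(line[:1000])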

Secondly, I am getting this error:

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
--NotebookApp.iopub_data_rate_limit.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)

How do I solve this error?
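The message itself points at the fix: raise iopub_data_rate_limit. You can pass --NotebookApp.iopub_data_rate_limit=10000000 when starting the notebook server, or set it once in jupyter_notebook_config.py, which is a plain Python file; a sketch of the config-file route (the 10 MB/s value is an arbitrary choice):

# jupyter_notebook_config.py (generate it once with `jupyter notebook --generate-config`)
# Raise the IOPub data rate limit from the ~1 MB/s default; 10 MB/s here is arbitrary.
c.NotebookApp.iopub_data_rate_limit = 10_000_000   # bytes/sec

Keeping the head/cut truncation in the cell also avoids the problem, since it limits how much output is streamed to the browser.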

Oct 15 2020, 2:55 PM · Outreachy (Round 21)

Oct 14 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Does anyone know if we can obtain the timestamp for when a protection was added for a page? It seems to be an option for protected titles, but I haven't had any luck in finding out how to do this for protected pages.
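Protection actions are recorded in the protection log, so one way to recover the timestamp (a sketch assuming the requests library, not necessarily the intended approach) is to query list=logevents with letype=protect for the page title; the most recent protect or modify entry tells you when the current protection was applied:

import requests

def protection_log(title, lang="en"):
    """Return (timestamp, action, comment) tuples from the page's protection log."""
    r = requests.get(
        f"https://{lang}.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "logevents",
            "letype": "protect",     # only protection-related log entries
            "letitle": title,
            "lelimit": 50,
            "format": "json",
        },
    )
    return [(e["timestamp"], e["action"], e.get("comment", ""))
            for e in r.json()["query"]["logevents"]]

for entry in protection_log("Barack Obama"):
    print(entry)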

Oct 14 2020, 2:06 PM · Outreachy (Round 21)

Oct 12 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Hi everyone, I've been able to decompress the file, but please, how do I clean the data?

Oct 12 2020, 7:11 AM · Outreachy (Round 21)

Oct 11 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Isaac
The protection data that I got from the MediaWiki dump and the API results seem to be in different formats.
For a particular page, we get data in the following formats:
In the MediaWiki dump: (3664672,'edit','autoconfirmed',0,NULL,'infinity',717409)

In the API result: {'pageid': 3664672, 'ns': 10, 'title': 'Template:Cyclopaedia 1728', 'contentmodel': 'wikitext', 'pagelanguage': 'en', 'pagelanguagehtmlcode': 'en', 'pagelanguagedir': 'ltr', 'touched': '2020-10-10T18:45:26Z', 'lastrevid': 952065611, 'length': 3572, 'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}], 'restrictiontypes': ['edit', 'move']}

Isn't that a problem? Like, how are we supposed to check the discrepancies between the two if they are in different formats? Moreover, the protection data in the API results doesn't show user-specific restrictions or sysop permissions. Is that OK?

Hi @SafiaKhaleel, it is possible to compare those tuples with the JSON objects through indexing. You can check the tutorial notebook @Isaac shared with us for more details.

@Isaac mentioned that the user-specific restrictions field is obsolete and we should disregard it. The sysop permissions are stored in the JSON data as “level”.

Thanks @Amamgbu. I understood that. But the protection type in the two cases also seems to be different. In the API result, all the pages have both edit and move protection, as you can see here:
'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}]
whereas in the MediaWiki dump, there is only one of the two types: (39620487,'edit','autoconfirmed',0,NULL,'infinity',692808)

There are actually two if you inspect the data closely. I had that same issue until I inspected the IDs.

Oh, so do you mean there are two page protection entries for a single page in the MediaWiki dumps? One for edit and another for move?
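To make the indexing idea concrete, a small self-contained sketch; the column order assumed for the dump row is pr_page, pr_type, pr_level, pr_cascade, pr_user, pr_expiry, pr_id, matching the example quoted above:

# A sketch of the "compare by indexing" idea. The dump row mirrors the
# page_restrictions example quoted above; the API dict mirrors the API example.
dump_row = (3664672, 'edit', 'autoconfirmed', 0, None, 'infinity', 717409)
# Assumed column order: pr_page, pr_type, pr_level, pr_cascade, pr_user, pr_expiry, pr_id

api_page = {
    'pageid': 3664672,
    'protection': [
        {'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'},
        {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'},
    ],
}

page_id, pr_type, pr_level, pr_expiry = dump_row[0], dump_row[1], dump_row[2], dump_row[5]

# Does the API report a protection entry matching this dump row?
match = any(
    p['type'] == pr_type and p['level'] == pr_level and p['expiry'] == pr_expiry
    for p in api_page['protection']
)
print(page_id, 'matches an API entry:', match)   # True: the 'edit' row lines up
# The 'move' entry in the API corresponds to a second dump row with the same pr_page.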

Oct 11 2020, 4:12 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Isaac
The protection data that I got from the MediaWiki dump and the API results seem to be in different formats.
For a particular page, we get data in the following formats:
In the MediaWiki dump: (3664672,'edit','autoconfirmed',0,NULL,'infinity',717409)

In the API result: {'pageid': 3664672, 'ns': 10, 'title': 'Template:Cyclopaedia 1728', 'contentmodel': 'wikitext', 'pagelanguage': 'en', 'pagelanguagehtmlcode': 'en', 'pagelanguagedir': 'ltr', 'touched': '2020-10-10T18:45:26Z', 'lastrevid': 952065611, 'length': 3572, 'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}], 'restrictiontypes': ['edit', 'move']}

Isn't that a problem? Like, how are we supposed to check the discrepancies between the two if they are in different formats? Moreover, the protection data in the API results doesn't show user-specific restrictions or sysop permissions. Is that OK?

Hi @SafiaKhaleel, it is possible to compare those tuples with the JSON objects through indexing. You can check the tutorial notebook @Isaac shared with us for more details.

@Isaac mentioned that the user-specific restrictions field is obsolete and we should disregard it. The sysop permissions are stored in the JSON data as “level”.

Thanks @Amamgbu. I understood that. But the protection type in the two cases also seems to be different. In the API result, all the pages have both edit and move protection, as you can see here:
'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}]
whereas in the MediaWiki dump, there is only one of the two types: (39620487,'edit','autoconfirmed',0,NULL,'infinity',692808)

Oct 11 2020, 3:25 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Isaac
The protection data that I got from the MediaWiki dump and the API results seem to be in different formats.
For a particular page, we get data in the following formats:
In the MediaWiki dump: (3664672,'edit','autoconfirmed',0,NULL,'infinity',717409)

In the API result: {'pageid': 3664672, 'ns': 10, 'title': 'Template:Cyclopaedia 1728', 'contentmodel': 'wikitext', 'pagelanguage': 'en', 'pagelanguagehtmlcode': 'en', 'pagelanguagedir': 'ltr', 'touched': '2020-10-10T18:45:26Z', 'lastrevid': 952065611, 'length': 3572, 'protection': [{'type': 'edit', 'level': 'autoconfirmed', 'expiry': 'infinity'}, {'type': 'move', 'level': 'autoconfirmed', 'expiry': 'infinity'}], 'restrictiontypes': ['edit', 'move']}

Isn't that a problem? Like, how are we supposed to check the discrepancies between the two if they are in different formats? Moreover, the protection data in the API results doesn't show user-specific restrictions or sysop permissions. Is that OK?

Oct 11 2020, 8:46 AM · Outreachy (Round 21)

Oct 10 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.
# Wikidata JSON dump we'll start processing (56 GB in size, compressed), so far too large to process the whole thing right now
!ls -shH "{WIKIDATA_DIR}{WIKIDATA_DUMP_FN}"

I'm still going through the Wikidata example. Do you know what the -shH options might mean? I can't find them online.

I am not too sure, but I think in ls -shH, -s lists the size of each file, -h prints the sizes in human-readable form, and -H follows symbolic links given on the command line.

Oct 10 2020, 6:08 PM · Outreachy (Round 21)

Oct 9 2020

Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

Perfect evening.
I'm really confused and need someone to clarify cell 5 for me.
To write an example that loops through all pages and extracts the data, how do I do that? The Python docs don't really give an example.

Oct 9 2020, 8:31 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Arfat2396 zcat is not a Jupyter built-in; it is a Linux command, and the ! prefix just tells the notebook to run it as a shell command. It is mainly used for viewing compressed files without decompressing them.

So, I can use it in Jupyter with the same rules as in Linux! https://www.howtoforge.com/linux-zcat-command/ Thanks!

Oct 9 2020, 3:00 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

@Arfat2396 zcat is not a Jupyter built-in; it is a Linux command, and the ! prefix just tells the notebook to run it as a shell command. It is mainly used for viewing compressed files without decompressing them.

Oct 9 2020, 2:55 PM · Outreachy (Round 21)
Amamgbu added a comment to T263874: Outreachy Application Task: Tutorial for Wikipedia Page Protection Data.

After changing the LANGUAGE variable, you should leave the rest as it is. Just as @Arfat2396 rightly said, you can get the list of supported db names for your language from the link shared.

Oct 9 2020, 2:34 PM · Outreachy (Round 21)

Oct 8 2020

Amamgbu added a comment to T263646: Develop an approach to infer which countries are associated with a given Wikipedia article.

Hi Shamima, I'm Jesse. Welcome! To answer your question: yes, we are to start with T263874.

Oct 8 2020, 10:56 AM · Outreachy (Round 21), Outreach-Programs-Projects