
PatsonJay (Dingani Muzimba)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 1 2021, 2:41 PM (160 w, 2 h)
Availability
Available
LDAP User
Unknown
MediaWiki User
PatsonJay [ Global Accounts ]

Recent Activity

May 3 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

I would like to thank this great community for the constructive advice and help that I received during the contribution stage ending today. It was a learning phase for me, an interesting period in which I got to learn from this great community. I really enjoyed working through the problems. Thanks to everyone.

May 3 2021, 9:16 AM · Outreachy (Round 22)

May 2 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.
May 2 2021, 7:38 AM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

@Ahn-nath thanks so much it helped

May 2 2021, 7:17 AM · Outreachy (Round 22)

May 1 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Outreach application form:

May 1 2021, 10:18 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello @Isaac, I would like to know how we can answer this question on the Outreachy application form:

May 1 2021, 10:17 PM · Outreachy (Round 22)

Apr 30 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

ok thanks let me try it.

Apr 30 2021, 1:10 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

yes it is

Apr 30 2021, 12:23 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello everyone, I have tried different articles that appear across languages, but I still can't find the article in any language other than English. If I query the langlinks API for the article "Westdeutscher_Rundfunk", I can see that the article exists in German with 80k-plus pageviews. But when I look for it in the German clickstream using its German name, 'Westdeutscher_Rundfunk_Köln', I cannot find it when I do something like this on my German clickstream dataframe:

Apr 30 2021, 7:59 AM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Yes, that's it. I had made a mistake there.

No problem!

Another question: doesn't the mwapi query (session.request(method="Get", params)) show data for the most recent month, by which I mean April?

It depends on the API. If you are making a GET request to the langlinks API, then you can't, because it is used to "get a list of all language links from the provided pages to other languages"; it is not specific to the month.
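To illustrate what a langlinks lookup gives back, here is a minimal sketch. It assumes the default JSON layout of the MediaWiki Action API (`query`, then `pages`, then `langlinks`, with each title under the `*` key). The helper names `langlinks_params` and `titles_by_language`, the page id, and the sample response are hypothetical and made up for illustration; no request is actually sent.

```python
def langlinks_params(title):
    """Build query parameters for a langlinks lookup (same fields as the URL form)."""
    return {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "redirects": "",
        "format": "json",
    }


def titles_by_language(response):
    """Map language code -> clickstream-style title (underscores instead of spaces)."""
    result = {}
    for page in response.get("query", {}).get("pages", {}).values():
        for link in page.get("langlinks", []):
            result[link["lang"]] = link["*"].replace(" ", "_")
    return result


# Illustrative response only; a real one comes from the API.
sample_response = {
    "query": {
        "pages": {
            "1": {
                "title": "Westdeutscher Rundfunk",
                "langlinks": [{"lang": "de", "*": "Westdeutscher Rundfunk Köln"}],
            }
        }
    }
}

german_title = titles_by_language(sample_response)["de"]
```

The result carries only language codes and titles, with no month or date attached, which is consistent with the explanation above. The parameter dictionary could be handed to whichever HTTP client is in use.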

Apr 30 2021, 6:59 AM · Outreachy (Round 22)

Apr 29 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Yes, that's it. I had made a mistake there.

Apr 29 2021, 10:07 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

I really do not know where I'm going wrong, but I can't seem to find the article when I search in the clickstream. I have tried all sorts of methods. For example, I would love to search for this article: "NHK_Educational_TV". But I can't find it in the clickstream when I use

df.loc[(df[0] == NHK_Educational_TV)| (df[1] == NHK_Educational_TV)]

where df is my dataframe. But when I take a quick look at this https://pageviews.toolforge.org/langviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2021-01-01&end=2021-01-31&sort=views&direction=1&view=list&page=NHK%20Educational%20TV I can see that there are lots of languages where NHK_Educational_TV appears. What might be the problem?

Hello, @PatsonJay.

Pre-conditions:
You are confident that:
(1) the article (its data) is present in the clickstream dataset you are using; you have verified it by the langlinks API.
(2) 'NHK_Educational_TV' is the page title of the article in that clickstream dataset.

What might be happening:

Syntax
I would take a look at the syntax of that statement. For example, df.loc() is used to access a group of rows and columns by label(s) or a boolean array. If you want to select the columns with the 'source' and 'destination' articles you should use the column names instead. To access the group of rows by integer position, change the method to 'iloc' and be careful with the syntax. The second thing I noticed is that the page title is not enclosed in single or double quotation marks. If NHK_Educational_TV is not a variable, then use quotation marks for the program to recognize it as a string. 

You are not loading the full dataset and the article is not present in the subset you have
If your syntax is correct, that is, no error is being thrown by the interpreter (and that was not the exact statement you used), then it could have something to do with the subset of the data you have. Are you loading the full dataset? If not, then your 'article' may not be present in the data you loaded, especially if you have less than 70%.
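A small sketch of the syntax point, assuming a headerless clickstream frame whose integer column labels 0 and 1 hold the source and destination titles. The rows are made up for illustration. Note that `.loc` accepts a boolean mask, so the original form of the statement works once the title is a quoted string.

```python
import pandas as pd

# Made-up rows in the clickstream layout: source, destination, link type, count.
df = pd.DataFrame(
    [
        ["other-search", "NHK_Educational_TV", "external", 120],
        ["NHK_Educational_TV", "NHK", "link", 45],
        ["Main_Page", "Chris_Ferguson", "other", 30],
    ]
)

title = "NHK_Educational_TV"  # quoted, so Python treats it as a string

# Boolean mask: keep rows where the title is the source or the destination.
matches = df.loc[(df[0] == title) | (df[1] == title)]
```

If `matches` comes back empty on a partial load, that points to the subset issue rather than the syntax.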

Apr 29 2021, 10:01 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

If you want to select the columns with the 'source' and 'destination' articles you should use the column names instead.

From the dataset, I think 1 represents the source and 3 represents the destination; that's why I was doing something like this:

Apr 29 2021, 9:49 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Ok, thanks so much, let me try df.iloc instead. My code was correct; I just forgot to put the single or double quotes when I brought the question here. As for the dataset, I was loading something like 20k of the data.

Apr 29 2021, 9:43 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Another question: doesn't the mwapi query (session.request(method="Get", params)) show data for the most recent month, by which I mean April?

Apr 29 2021, 9:02 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

I really do not know where I'm going wrong, but I can't seem to find the article when I search in the clickstream. I have tried all sorts of methods. For example, I would love to search for this article: "NHK_Educational_TV". But I can't find it in the clickstream when I use

df.loc[(df[0] == NHK_Educational_TV)| (df[1] == NHK_Educational_TV)]

where df is my dataframe. But when I take a quick look at this https://pageviews.toolforge.org/langviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2021-01-01&end=2021-01-31&sort=views&direction=1&view=list&page=NHK%20Educational%20TV I can see that there are lots of languages where NHK_Educational_TV appears. What might be the problem?

Apr 29 2021, 8:59 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello, @PatsonJay. I forgot to mention that in this API request:
https://en.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=Chris_Ferguson&lllimit=max&redirects=
a German version of Chris Ferguson appears. We also have a January 2021 German clickstream available in the notebook, and in Toolforge that German version of Chris Ferguson has 397 pageviews.

# TODO: for at least one language the article exists in that has a corresponding clickstream dataset,
# loop through that clickstream dataset and gather all the relevant data
# (as you did in English for your visualization above)

Above is a TODO. For example, the article that I got from the English clickstream I was working on is Chris Ferguson. When I do a query using the mwapi library, I notice that it appears in Spanish, but when I look in the clickstream I do not find it. How, then, can I answer this TODO if the article doesn't exist in any other language according to the clickstream?

Apr 29 2021, 7:38 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

@PatsonJay The name you're using might have spaces in it. The clickstream data does not store names with spaces. For example, Chris Ferguson might be stored as Chris_Ferguson in the dataset. I suggest you try using the name.replace() method to replace spaces with underscores.
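For instance, a short normalization along those lines, where `to_clickstream_title` is a hypothetical helper name:

```python
def to_clickstream_title(name):
    """Replace spaces with underscores, as titles appear in the clickstream files."""
    return name.strip().replace(" ", "_")


normalized = to_clickstream_title("Chris Ferguson")
```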

Apr 29 2021, 7:36 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

@rachita_saha thanks

Apr 29 2021, 3:03 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.
# TODO: for at least one language the article exists in that has a corresponding clickstream dataset,
# loop through that clickstream dataset and gather all the relevant data
# (as you did in English for your visualization above)

Above is a TODO. For example, the article that I got from the English clickstream I was working on is Chris Ferguson. When I do a query using the mwapi library, I notice that it appears in Spanish, but when I look in the clickstream I do not find it. How, then, can I answer this TODO if the article doesn't exist in any other language according to the clickstream?
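One way to approach the TODO, sketched with the standard library so that the whole file is scanned rather than a partial load. It assumes the tab-separated clickstream layout of source, destination, link type, and count. `gather_rows` is a hypothetical helper, and the inline sample rows are made up to stand in for a real dump file.

```python
import csv
import io


def gather_rows(lines, title):
    """Return (incoming, outgoing) dicts of click counts for one article title."""
    incoming, outgoing = {}, {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip malformed lines
        source, dest, _link_type, count = row
        if dest == title:
            incoming[source] = incoming.get(source, 0) + int(count)
        if source == title:
            outgoing[dest] = outgoing.get(dest, 0) + int(count)
    return incoming, outgoing


# Made-up sample in place of a real clickstream file.
sample = io.StringIO(
    "other-search\tChris_Ferguson\texternal\t200\n"
    "Chris_Ferguson\tWorld_Series_of_Poker\tlink\t50\n"
    "Main_Page\tPoker\tother\t10\n"
)
incoming, outgoing = gather_rows(sample, "Chris_Ferguson")
```

For a real dump, an open text-mode file handle (for example from `gzip.open(path, "rt", encoding="utf-8")`) can be passed in place of `sample`, using the title as it is spelled in that language's edition.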

Apr 29 2021, 3:02 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

What if it doesn't

Apr 29 2021, 2:53 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello everyone, I have an article, say with the name Chris Ferguson, that I got from the English clickstream. When I do a query using the mwapi library, I can see that this article also appears in Spanish. But when I try to find it in the Spanish clickstream ("eswiki"), I do not find it, even if I search for its name in Spanish. Yet when I look at this article using langviews (https://pageviews.toolforge.org/langviews/), I can see that it does exist. What might be the problem I'm facing?

Apr 29 2021, 2:39 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

I solved the problem by doing a request command.

Apr 29 2021, 8:19 AM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello. While trying to do a query using the mwapi library I'm getting the error below. What might be the problem?

Apr 29 2021, 8:14 AM · Outreachy (Round 22)

Apr 26 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

ok thanks

Apr 26 2021, 2:11 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

"Visualize the data to show what the common pathways to and from the article are." This TODO isn't clear to me. Can someone please explain?

Apr 26 2021, 2:05 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Any advice on how I can visualize the data for a TODO that I do not quite understand?

Apr 26 2021, 2:00 PM · Outreachy (Round 22)

Apr 21 2021

PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

thanks understood

Apr 21 2021, 2:59 PM · Outreachy (Round 22)
PatsonJay added a comment to T276315: Outreachy Application Task: Tutorial for Wikipedia Clickstream data.

Hello. I would like to ask a question. In the first TODO, I do not understand what they mean by 'destination'. Can you please help?

Apr 21 2021, 2:22 PM · Outreachy (Round 22)