
Andreas_Sune (Andreas Sune)
Data Scientist

Projects

User does not belong to any projects.

User Details

User Since
Mar 4 2024, 10:16 PM (98 w, 2 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Andreas Sune [ Global Accounts ]

Recent Activity

Mar 14 2024

Andreas_Sune added a comment to T358412: Create tool for informative infographics from structured information from Wikimedia projects - Task A.

@Aixvik, @Aananditaa001, @Anju_Maurya, @ElvisGicharu, @Abishekdascs, @Damiodunuga, @Abishek_Das, @Keamybams, @MahimaSinghal, @Andreas_Sune, @Udonels, @Omolade1414, @Chimezee, @BruceMahagwa, @GonzaGertrude, @Anachimuco, @DevJames1 and @Sheilakaruku

I have been absent from answering questions here and through email because I'm mostly interested in learning how you code and how you approach development challenges (they will appear in this and other projects in your journey).

But some of the questions and challenges I saw might need clarification; others have been resolved by kind and thoughtful individuals in this thread. Let's tackle the exercise here, and keep in mind that you will all receive personal feedback on your submissions next week.

About the task instructions

  1. You should create a Wikimedia account, if you don't already have one. You can do so at https://meta.wikimedia.org/w/index.php?title=Special:CreateAccount;
  2. Log in to the PAWS service with your Wikimedia account: https://paws.wmflabs.org/paws/hub;

These two do not seem to be a challenge.

  3. Fork this notebook in your repository in PAWS (see the instructions here). Name your file as "T357409 - YourWikiUsername";
  4. Follow the specific directions in the notebook. If you have questions or need assistance, comment your inquiry in this subtask and make sure to ping @Ederporto;

You are not required to code in PAWS; feel free to work locally, for example in your own Jupyter Notebook instance.
If the Data visualization section doesn't work properly in your PAWS instance but works locally, that's fine with me.
The Completing the gaps section needs to work in PAWS.

  5. Once you feel you have completed your task, generate the public link of your notebook on PAWS (see the instructions here) and send it through email to @Ederporto (you can find his email in Outreachy);

If you are developing locally, you can upload your notebook to PAWS, generate the public link, and send it to me.

  6. You can request feedback on your task until March 17, and we will reply by March 22, in order to give you ample time to work on the feedback and your final submission;

That will happen next week, as announced.

  7. Make sure to register the public link as a contribution on the Outreachy website. Your final contribution has to be submitted before April 2, at 4pm UTC.

You all have to submit at least a draft proposal on the Outreachy platform by March 17th, because we will close new applications on the 18th. You can update it later, but your draft needs to be up by then.

About the notebook instructions:

  • Your first function is supposed to return a list of the most viewed articles in ptwiki for January;
    • You can check if your function is returning the articles in the correct order by looking at this page.
  • Your second function is supposed to return a dataframe of the most viewed articles in ptwiki for January and February;
  • You are free to use auxiliary functions and libraries as you wish;
  • Your third function, in the Data visualization section, is supposed to get the dataframe you generated in the second function and make a bar chart visualization of it.
    • You can pivot the table.
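
As a rough illustration of the first function, here is a minimal sketch. It assumes the Wikimedia Pageviews REST API "top" route is the data source and that "January" means January 2024; neither is stated in the task, and the notebook may expect something different. The helper names (build_top_url, parse_top_articles) and the SAMPLE payload are hypothetical, made up only to show the shape of the data.

```python
# Assumption: the Pageviews REST API "top" route, with "all-days" for a monthly total.
TOP_URL = (
    "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
    "{project}/{access}/{year}/{month:02d}/all-days"
)


def build_top_url(year, month, project="pt.wikipedia.org", access="all-access"):
    """Build the request URL for the monthly most-viewed list (hypothetical helper)."""
    return TOP_URL.format(project=project, access=access, year=year, month=month)


def parse_top_articles(payload, limit=10):
    """Return article titles ordered by rank from a decoded JSON response."""
    articles = payload["items"][0]["articles"]
    ranked = sorted(articles, key=lambda entry: entry["rank"])
    return [entry["article"] for entry in ranked[:limit]]


# Made-up payload that mimics the assumed response layout; not real data.
SAMPLE = {
    "items": [
        {
            "articles": [
                {"article": "Article_B", "views": 80, "rank": 2},
                {"article": "Article_A", "views": 100, "rank": 1},
            ]
        }
    ]
}

print(build_top_url(2024, 1))
print(parse_top_articles(SAMPLE, limit=2))
```

In a notebook, the decoded JSON from an HTTP request to that URL would take the place of SAMPLE; keeping the parsing separate from the request makes the ordering easy to check against the reference page.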

I hope this helps everyone!

Mar 14 2024, 12:28 AM · Outreach-Programs-Projects, Outreachy (Round 28)

Mar 7 2024

Andreas_Sune added a comment to T358412: Create tool for informative infographics from structured information from Wikimedia projects - Task A.

Hello, @Ederporto

I have a question about the visualisation step. In the notebook task we have to build our graphics from the previous result dataframe (top_view_dataframe). According to the bar_chart_race documentation, the dataframe should have dates on the rows and the article categories on the columns, but top_view_dataframe is the opposite. My question is: can we use a different approach instead of top_view_dataframe, for example preparing our dataset with the built-in method that bar_chart_race provides for that purpose?

Thank you

Yes, same issue. I find it hard to visualize it with the current state of our data frame. The documentation specifies that every row must represent a single period, which is the exact opposite of ours.

@Ederporto may be able to offer more insight.

I think to address the requirement of using the bar_chart_race library, we can reshape the DataFrame so that dates are on the rows and articles are on the columns. This way, it aligns with the expected format for the library.
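
The reshaping described above could look roughly like the sketch below. The exact layout of top_view_dataframe is not shown in this thread, so both frames here are hypothetical stand-ins with made-up numbers: one with articles on the rows and periods on the columns, and one in long format with one row per (month, article) pair.

```python
import pandas as pd

# Hypothetical frame with articles on the rows and periods on the columns,
# i.e. the "opposite" orientation described in the thread.
by_article = pd.DataFrame(
    {"2024-01": [100, 80], "2024-02": [90, 120]},
    index=["Article_A", "Article_B"],
)

# A transpose is enough to put one period per row and one article per column.
by_period = by_article.T

# Hypothetical long-format frame: one row per (month, article) pair.
long_df = pd.DataFrame(
    {
        "month": ["2024-01", "2024-01", "2024-02", "2024-02"],
        "article": ["Article_A", "Article_B", "Article_A", "Article_B"],
        "views": [100, 80, 90, 120],
    }
)

# pivot() turns the long format into the same wide layout; articles missing
# from a month become NaN, which is filled with 0 here.
wide_df = (
    long_df.pivot(index="month", columns="article", values="views")
    .fillna(0)
    .sort_index()
)

print(by_period)
print(wide_df)
```

Either result has dates on the rows and articles on the columns, which is the orientation the comments above say the library expects.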

Hello @Ederporto ,
I have tried using the "bar_chart_race" library, but to no avail. I tried the examples in the documentation using its built-in dataset, and it says that ffmpeg must be installed. Is there a way around this in the PAWS notebook?

If any other intern has gotten past this, please help me. I've spent hours debugging without results.

You'll need to install ffmpeg on your machine. I've already done that, but unfortunately, I'm still unable to visualize the bar chart. I tested it on my machine using vscode, and initially encountered an error. However, after installing ffmpeg, I managed to generate the video successfully. I suspect the issue might be with the jupyter environment. If anyone has successfully addressed the problem, could you please help us?

Mar 7 2024, 7:58 PM · Outreach-Programs-Projects, Outreachy (Round 28)
Andreas_Sune added a comment to T358412: Create tool for informative infographics from structured information from Wikimedia projects - Task A.

I think you need to install ffmpeg on your system first.
This tutorial shows how to do it:
https://phoenixnap.com/kb/ffmpeg-windows
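
Before debugging the library itself, it may help to confirm that the notebook's Python process can actually see ffmpeg. The sketch below uses only the standard library; ffmpeg_available is a hypothetical helper name, and the check only tells you whether an executable is on PATH, not whether the rendering will succeed.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on PATH (hypothetical helper)."""
    return shutil.which(executable) is not None


if ffmpeg_available():
    print("ffmpeg found at", shutil.which("ffmpeg"))
else:
    print("ffmpeg not found on PATH; install it or add its folder to PATH")
```

If this prints a path in a terminal but not inside Jupyter, the notebook kernel is probably running with a different PATH than your shell.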

Mar 7 2024, 7:41 PM · Outreach-Programs-Projects, Outreachy (Round 28)
Andreas_Sune added a comment to T358095: Outreachy Application Task: Tutorial for Wikipedia language-agnostic article quality modeling data.
Mar 7 2024, 7:27 PM · Outreachy (Round 28)
Andreas_Sune added a comment to T358412: Create tool for informative infographics from structured information from Wikimedia projects - Task A.

I have a question about the visualisation step. In the notebook task we have to build our graphics from the previous result dataframe (top_view_dataframe). According to the bar_chart_race documentation, the dataframe should have dates on the rows and the article categories on the columns, but top_view_dataframe is the opposite. My question is: can we use a different approach instead of top_view_dataframe, for example preparing our dataset with the built-in method that bar_chart_race provides for that purpose?

Mar 7 2024, 2:21 PM · Outreach-Programs-Projects, Outreachy (Round 28)

Mar 4 2024

Andreas_Sune updated Andreas_Sune.
Mar 4 2024, 10:21 PM