FatimaArshad-DS
User

Projects

User does not belong to any projects.

User Details

User Since: Mar 29 2022, 7:15 PM (108 w, 3 d)
Availability: Available
LDAP User: Unknown
MediaWiki User: FatimaArshad-DS

Recent Activity

Apr 17 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

Text coming from Wikitext is in pretty format. Was anyone able to pretty print HTML text?

Apr 17 2022, 3:46 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

Apr 14 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

I still don't understand the concept of templates. What are they?

Apr 14 2022, 4:40 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

Apr 12 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

Does it happen to anyone else... PAWS stops saving notebook after a while?

Apr 12 2022, 9:07 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

Apr 10 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

@Appledora There is no need to go inside tags manually. You can extract all the visible text very easily using BS4 :)
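
For illustration, here is a minimal sketch of what extracting the visible text with BS4 might look like. The helper name `visible_text` and the sample HTML are hypothetical and are not taken from the task notebook; dropping `script` and `style` elements first is an assumption about what counts as "visible". Real Wikipedia article HTML would probably need more custom handling than this.

```python
from bs4 import BeautifulSoup


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML string (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: script and style contents are never visible, so drop them.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Join the remaining text nodes with single spaces, trimming whitespace.
    return soup.get_text(separator=" ", strip=True)


# Made-up sample input, not real article HTML.
sample = "<p>Hello <b>world</b></p><script>var x = 1;</script>"
print(visible_text(sample))
```

`get_text()` walks every nested tag on its own, which is why there is no need to descend into tags manually.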

Apr 10 2022, 7:33 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

Apr 9 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

"# TODO: write a function for extracting the article text

It doesn't have to look the same as the output of wt.strip_code() above (in fact, it likely won't)
but it should be very similar in that you should aim for something
that captures the text of the article without a lot of markup etc.
NOTE: straightforward HTML -> text functions likely won't perform well here and you'll probably
want to write something more custom to handle the specifics of Wikipedia articles"

Apr 9 2022, 3:56 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

Apr 8 2022

FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.

My question is related to this page: https://en.wikipedia.org/wiki/Chang_Gum-chol

Apr 8 2022, 4:41 PM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)

FatimaArshad-DS added a comment to T302237: Outreachy Project (Round 24): Build Python library to work with html-dumps.

Hi Everyone,

Apr 8 2022, 8:41 AM · Research (FY2021-22-Research-April-June), Outreach-Programs-Projects, Outreachy (Round 24)