User Details
- User Since: Mar 29 2022, 7:15 PM (108 w, 3 d)
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: FatimaArshad-DS
Apr 17 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
Text extracted from wikitext is already nicely formatted. Has anyone been able to pretty-print the HTML text?
Apr 14 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
I still don't understand the concept of templates. What are they?
Apr 12 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
Does this happen to anyone else: PAWS stops saving the notebook after a while?
Apr 10 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
@Appledora There is no need to go inside tags manually. You can extract all the visible text very easily using BS4 :)
Apr 9 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
"# TODO: write a function for extracting the article text
- It doesn't have to look the same as the output of wt.strip_code() above (in fact, it likely won't)
- but it should be very similar in that you should aim for something
- that captures the text of the article without a lot of markup etc.
- NOTE: straightforward HTML -> text functions likely won't perform well here and you'll probably
- want to write something more custom to handle the specifics of Wikipedia articles"
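For anyone wondering what "something more custom" might look like, here is a minimal sketch using only Python's standard-library `html.parser`. It is not the notebook's reference solution: the names `ArticleTextExtractor` and `extract_article_text`, and the choice of which elements to skip (tables, `<sup>` reference markers, figures, scripts, styles), are assumptions made for illustration. Real Wikipedia HTML dumps would probably need more rules than this.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    # Assumption: elements whose contents are dropped entirely.
    SKIP_TAGS = {"script", "style", "table", "sup", "figure"}
    # Assumption: elements after which a separating space is inserted.
    BLOCK_TAGS = {"p", "li", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside any skipped element
        self._chunks = []      # collected pieces of visible text

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS and self._skip_depth == 0:
            self._chunks.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def extract_article_text(html):
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<p>First sentence<sup><a href='#cite_note-1'>[1]</a></sup>.</p>"
    "<table><tr><td>Infobox data</td></tr></table>"
    "<p>Second sentence.</p>"
)
print(extract_article_text(sample))
```

On the made-up sample, the citation marker and the table contents are dropped and only the two sentences remain. The same skip-list idea could also be expressed with BS4 by removing the unwanted elements before collecting the text.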
Apr 8 2022
FatimaArshad-DS added a comment to T302242: Outreachy Application Task (Round 24): Build Python library to work with html-dumps.
My question is related to this page: https://en.wikipedia.org/wiki/Chang_Gum-chol