User Details
- User Since
- Jan 16 2023, 7:16 PM (45 w, 2 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- JEbe-WMF
Today
Wed, Nov 22
Mon, Nov 20
Wed, Nov 15
Thu, Nov 9
Thu, Nov 2
Oct 12 2023
Oct 4 2023
- Step 1: Figure out which tests are useful for PySpark jobs
- Step 2: Decide on acceptable parameters for the tests
- Step 3: Create/script the tests locally
- Step 4: Move the tests to prod (GitLab)
Oct 3 2023
Oct 2 2023
XML Schema Validation (Dan is already doing this using IntelliJ):
- If your XML files adhere to a predefined XML schema (XSD), you can validate them against the schema to identify structural differences.
- Any non-conformance with the schema will be flagged as a difference.
Size and Visual Comparison of the XML:
- Open the two XML files in a text editor or XML viewer that supports syntax highlighting for easier readability.
- Manually review the size of the files side by side.
Size and Visual Random Spot Comparison of the tables in HDFS:
- Use a diff or a MINUS query to compare the Hive table (MediaWiki wikitext history) with the Iceberg table (wikitext_raw_rc1).
- Manually review the sizes of the same partitions (if they exist).
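As a rough illustration of what a two-way "minus" comparison reports, here is a minimal plain-Python sketch. It is not Spark or Hive code: the function names and the sample row tuples are hypothetical, and the real comparison would run as a query against the two tables. It treats each side as a multiset, so duplicate rows are counted rather than collapsed.

```python
from collections import Counter


def minus(left_rows, right_rows):
    """Return rows of left_rows that have no matching row in right_rows.

    This is a multiset difference: a row that appears twice on the left
    and once on the right is reported once.
    """
    return list((Counter(left_rows) - Counter(right_rows)).elements())


def compare_samples(hive_rows, iceberg_rows):
    """Run the difference in both directions so neither side's extras are missed."""
    return {
        "only_in_hive": minus(hive_rows, iceberg_rows),
        "only_in_iceberg": minus(iceberg_rows, hive_rows),
    }


# Hypothetical sample rows, e.g. (revision_id, content hash) pulled from one partition.
hive_sample = [(1, "a"), (2, "b"), (2, "b")]
iceberg_sample = [(1, "a"), (2, "b"), (3, "c")]
print(compare_samples(hive_sample, iceberg_sample))
```

Running the difference in both directions matters because a one-way minus returns nothing when the second table is a strict superset of the first.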
Stream Parsing:
- Compare both files by parsing them with a streaming XML processor.
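A minimal sketch of the stream-parsing idea, using Python's standard `xml.etree.ElementTree.iterparse`. The element name `revision`, the helper names, and the sample documents are assumptions for illustration; real dump files would also carry XML namespaces that this sketch ignores. Both files are read record by record and the first mismatch is reported, so neither file has to be loaded in full.

```python
import io
import xml.etree.ElementTree as ET
from itertools import zip_longest


def iter_records(source, tag):
    """Yield one dict (child tag -> stripped text) per closed `tag` element."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: (child.text or "").strip() for child in elem}
            elem.clear()  # release the element's children once it has been compared


def first_difference(left, right, tag):
    """Return (index, left_record, right_record) for the first mismatch, or None.

    If one file has fewer records, the missing side is reported as None.
    """
    pairs = zip_longest(iter_records(left, tag), iter_records(right, tag))
    for index, (a, b) in enumerate(pairs):
        if a != b:
            return index, a, b
    return None


# Hypothetical in-memory documents standing in for the two XML files.
old = io.BytesIO(b"<dump><revision><id>1</id><text>a</text></revision>"
                 b"<revision><id>2</id><text>b</text></revision></dump>")
new = io.BytesIO(b"<dump><revision><id>1</id><text>a</text></revision>"
                 b"<revision><id>2</id><text>B</text></revision></dump>")
print(first_difference(old, new, "revision"))
```

This assumes both files list records in the same order; if the order can differ, the records would need to be keyed (for example by id) before comparing.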
Sep 13 2023
Sep 12 2023
Sep 6 2023
Sep 5 2023
Sep 4 2023
Aug 22 2023
Aug 7 2023
Aug 2 2023
Jul 25 2023
Jul 24 2023
7am UTC on Thursday (2023-07-27) works for me.
Jul 19 2023
Jul 14 2023
Jul 12 2023
Jul 11 2023
Jul 6 2023
@ArielGlenn I used my Wikimedia email.
Jun 30 2023
The data engineering team had a meeting, and the conclusion was to capture tags based on:
- Frequency
- Ownership
- Criticality
- Whether a certain table is required, e.g. Webrequest
- Destination of the data source, e.g. Iceberg, Hive
Tags that do not meet these criteria should be removed.
May 8 2023
Apr 13 2023
Mar 27 2023
Mar 16 2023
Mar 1 2023
Feb 9 2023
Documented the following datasets and added Wikitech links where applicable:
- unique devices
- banner-activity (druid)
- mobile_apps_session_metrics_by_os
Feb 8 2023
I have documented the following datasets and I am awaiting feedback.
- mediawiki_api_request
- mobile apps session metrics
- mobile apps uniques
Jan 23 2023
Jan 20 2023
Jan 19 2023
I will also need access to a Kerberos principal. cc @odimitrijevic @Snwachukwu
Jan 18 2023
Ticket to grant Jennifer Ebe LDAP access for onboarding.