diegodlh (Diego de la Hera)
User

User Details

User Since
Sep 23 2020, 8:24 PM (97 w, 5 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Diegodlh [ Global Accounts ]

Recent Activity

Today

diegodlh created T314806: Do not import JSDOM if using w2c-core from a browser environment.
Mon, Aug 8, 5:50 PM · Web2Cit-Editor, Web2Cit-Core

Mon, Aug 1

diegodlh closed T303294: Publish Web2Cit core library as NPM package as Resolved.

Done, and updated Web2Cit-Server to use this.

Mon, Aug 1, 11:21 PM · Web2Cit-Core
diegodlh added a comment to T314107: Error 429 after a few queries.

This one should be solved. @Superzerocool, could you please confirm? Thanks!

Mon, Aug 1, 9:43 PM · Web2Cit-Server
diegodlh closed T302591: Add custom user agent to http requests, a subtask of T314107: Error 429 after a few queries, as Resolved.
Mon, Aug 1, 9:43 PM · Web2Cit-Server
diegodlh closed T302591: Add custom user agent to http requests as Resolved.

Addressed both in Web2Cit-Core and Web2Cit-Server.

Mon, Aug 1, 9:43 PM · Web2Cit-Core
diegodlh closed T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit as Resolved.

Unordered array comparisons disabled in 1d643ece.
Pending merge into Web2Cit-Core's main, and updating Web2Cit-Server.

Mon, Aug 1, 9:38 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
diegodlh triaged T303294: Publish Web2Cit core library as NPM package as Medium priority.
Mon, Aug 1, 3:49 PM · Web2Cit-Core
diegodlh moved T303294: Publish Web2Cit core library as NPM package from v0.0.1 to To do on the Web2Cit-Core board.
Mon, Aug 1, 3:49 PM · Web2Cit-Core
diegodlh added a comment to T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit.

One alternative to compare arrays independently of the order of their items could be:

  1. sort both arrays alphabetically
  2. if one array is shorter, equate its length to that of the longest array by adding undefined items; create different versions of this extended array with the undefined items in different positions
  3. compare one array vs all versions of the other array, 1st item vs 1st item, etc; return the highest score
Mon, Aug 1, 12:20 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
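
The three steps proposed in the comment above could be sketched roughly as follows. This is only an illustration, not Web2Cit-Core's implementation: `itemScore`, `paddedVersions` and `unorderedScore` are hypothetical names, and the item-level score (1 for identical strings, 0 otherwise) is an assumption made to keep the example small.

```typescript
type Item = string | undefined;

// Hypothetical item-level score: 1 for identical strings, 0 otherwise.
// The scoring actually used by Web2Cit is not described in the comment.
function itemScore(a: Item, b: Item): number {
  return a !== undefined && a === b ? 1 : 0;
}

// Step 2: every way of inserting `gaps` undefined items into `items`
// while keeping the defined items in their (sorted) order.
function paddedVersions(items: string[], gaps: number): Item[][] {
  if (gaps === 0) return [[...items]];
  if (items.length === 0) return [new Array<Item>(gaps).fill(undefined)];
  const [head, ...tail] = items;
  // The next slot holds either the head item or an undefined gap.
  const withHead = paddedVersions(tail, gaps).map((v): Item[] => [head, ...v]);
  const withGap = paddedVersions(items, gaps - 1).map((v): Item[] => [undefined, ...v]);
  return [...withHead, ...withGap];
}

function unorderedScore(a: string[], b: string[]): number {
  const shorter = a.length <= b.length ? a : b;
  const longer = a.length <= b.length ? b : a;
  if (longer.length === 0) return 1; // assumption: two empty arrays match
  // Step 1: sort both arrays (default sort compares UTF-16 code units).
  const sortedShort = [...shorter].sort();
  const sortedLong = [...longer].sort();
  // Step 3: compare position by position against every padded version
  // and keep the highest average score.
  let best = 0;
  for (const version of paddedVersions(sortedShort, longer.length - shorter.length)) {
    let total = 0;
    version.forEach((item, i) => {
      total += itemScore(item, sortedLong[i]);
    });
    best = Math.max(best, total / sortedLong.length);
  }
  return best;
}
```

For arrays of lengths n and m (n ≥ m), step 2 produces C(n, n − m) padded versions, so the cost still grows combinatorially when the lengths differ a lot.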

Fri, Jul 29

diegodlh moved T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit from To do to Doing on the Web2Cit-Core board.
Fri, Jul 29, 10:37 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
diegodlh moved T304772: We should follow redirections when fetching configuration files from Doing to To do on the Web2Cit-Core board.
Fri, Jul 29, 10:37 PM · Web2Cit-Core
diegodlh added a comment to T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit.

Unordered array comparisons disabled in 1d643ece.

Fri, Jul 29, 10:36 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
diegodlh triaged T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit as High priority.
Fri, Jul 29, 9:53 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
diegodlh created T314198: Array vs array comparisons may lead to combinatorial explosion in Web2Cit.
Fri, Jul 29, 9:53 PM · Web2Cit-Monitor, Web2Cit-Server, Web2Cit-Core
diegodlh added a parent task for T302591: Add custom user agent to http requests: T314107: Error 429 after a few queries.
Fri, Jul 29, 1:28 PM · Web2Cit-Core
diegodlh added a subtask for T314107: Error 429 after a few queries: T302591: Add custom user agent to http requests.
Fri, Jul 29, 1:28 PM · Web2Cit-Server
diegodlh added a comment to T314107: Error 429 after a few queries.

Thanks for pointing this out! I'll work on T302591 as soon as possible then. Marking this one as depending on that one.

Fri, Jul 29, 1:28 PM · Web2Cit-Server
diegodlh added a comment to T314106: Tests would fail if not tests is not set.

The server does fail with a 404 error if neither the tests file nor the templates file exists, or if no paths have been configured in either of them, as shown in this example.

Fri, Jul 29, 1:25 PM · Web2Cit-Server
diegodlh added a comment to T314106: Tests would fail if not tests is not set.

Probably this is because Web2Cit-Core automatically includes a test case for any paths for which a template has been configured in the corresponding templates.json file.

Fri, Jul 29, 1:22 PM · Web2Cit-Server

Tue, Jul 26

diegodlh closed T313758: Consider adding an average score property to Web2Cit server's JSON response as Resolved.

Implemented in 89b9f932.

Tue, Jul 26, 8:44 PM · Web2Cit-Server
diegodlh closed T313722: Consider including API version in Web2Cit server JSON response as Resolved.

Implemented in 89b9f932.

Tue, Jul 26, 8:44 PM · Web2Cit-Server
diegodlh closed T313757: Prepare JSON response for consumption from Web2Cit monitor as Resolved.

Implemented in 89b9f932 and deployed.

Tue, Jul 26, 8:43 PM · Web2Cit-Monitor, Web2Cit-Server
diegodlh added a comment to T313757: Prepare JSON response for consumption from Web2Cit monitor.

make sure that for each field we always return the output, test and score properties, even if they are undefined

Tue, Jul 26, 8:05 PM · Web2Cit-Monitor, Web2Cit-Server
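
One thing to keep in mind for the note above is that `JSON.stringify` omits properties whose value is `undefined`. A minimal sketch of one possible workaround, mapping missing values to `null`, is shown below. The `FieldResult` shape and the `toSerializable` helper are hypothetical; the actual response schema is not shown on this page.

```typescript
// Hypothetical shape of one field in the server response.
interface FieldResult {
  output?: string[];
  test?: string[];
  score?: number;
}

// Map undefined to null so that all three keys survive JSON serialization.
function toSerializable(field: FieldResult) {
  return {
    output: field.output ?? null,
    test: field.test ?? null,
    score: field.score ?? null,
  };
}
```

With this, a field that only has an output would still serialize with explicit `test` and `score` keys set to `null`.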

Mon, Jul 25

diegodlh created T313760: Consider having Web2Cit server cache Citoid and web source responses.
Mon, Jul 25, 7:23 PM · Web2Cit-Server
diegodlh updated the task description for T313757: Prepare JSON response for consumption from Web2Cit monitor.
Mon, Jul 25, 7:18 PM · Web2Cit-Monitor, Web2Cit-Server
diegodlh created T313758: Consider adding an average score property to Web2Cit server's JSON response.
Mon, Jul 25, 7:16 PM · Web2Cit-Server
diegodlh created T313757: Prepare JSON response for consumption from Web2Cit monitor.
Mon, Jul 25, 7:12 PM · Web2Cit-Monitor, Web2Cit-Server
diegodlh created T313722: Consider including API version in Web2Cit server JSON response.
Mon, Jul 25, 12:45 PM · Web2Cit-Server

Mon, Jul 18

diegodlh created T313236: Consider revising title validation to accept empty strings.
Mon, Jul 18, 2:37 PM · Web2Cit-Core
diegodlh moved T308452: Improve automatic citation of web sources in Wikipedia from Backlog to Doing on the Web2Cit board.
Mon, Jul 18, 2:30 PM · good first task, Web2Cit, Wikimedia-Hackathon-2022

Jul 6 2022

diegodlh added a comment to T312110: Web2Cit score should handle equivalent language field values.

I think that the best we could do is convert things like spanish to its canonical form es.

Jul 6 2022, 8:12 PM · Web2Cit-Core
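
As a rough illustration of the idea in the comment above, language values could be mapped to a canonical code before being compared. The lookup table and the function names below are hypothetical; a real implementation would need a complete language-name dataset.

```typescript
// Tiny illustrative lookup table (assumption, not Web2Cit data).
const LANGUAGE_CODES: Record<string, string> = {
  spanish: "es",
  english: "en",
};

// Return the canonical code for a known language name, or the
// normalized input if the value is not in the table.
function canonicalLanguage(value: string): string {
  const key = value.trim().toLowerCase();
  return LANGUAGE_CODES[key] ?? key;
}

// Two language values are equivalent if their canonical forms match.
function sameLanguage(a: string, b: string): boolean {
  return canonicalLanguage(a) === canonicalLanguage(b);
}
```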

Jul 5 2022

diegodlh created T312111: Web2Cit editor should help users choose a value for the language field.
Jul 5 2022, 2:35 PM · Web2Cit-Editor
diegodlh created T312110: Web2Cit score should handle equivalent language field values.
Jul 5 2022, 2:26 PM · Web2Cit-Core
diegodlh added a comment to T308328: Revise "language" field validation.

Note that there are some values that, although valid for CS1/2 templates' language parameter, would not pass Citoid validation. For example, abq-latn, or es-419. We may have to report this to Citoid.

Jul 5 2022, 2:19 PM · Documentation, Web2Cit-Core

Jul 2 2022

diegodlh created T311925: "button" tag not working in Web2Cit XPath expression.
Jul 2 2022, 4:59 PM · Web2Cit-Core

Jun 28 2022

diegodlh created T311519: Web2Cit fails on invalid itemType output instead of proceeding with fallback template.
Jun 28 2022, 2:24 PM · Web2Cit-Server, Web2Cit-Core

Jun 24 2022

GFontenelle_WMF awarded T311317: Consider adding an edit icon next to Web2Cit credit label a Love token.
Jun 24 2022, 7:56 PM · Web2Cit-Gadget
diegodlh created T311318: Web2Cit not working for revistacontinente.com.br.
Jun 24 2022, 4:55 PM · Web2Cit-Server
diegodlh created T311317: Consider adding an edit icon next to Web2Cit credit label.
Jun 24 2022, 4:51 PM · Web2Cit-Gadget

Jun 14 2022

diegodlh added a comment to T310653: Web2Cit user script's unpatch methods may conflict with other monkey-patchers.

In addition, consider checking this enabled flag from within the credit's labelChange event handler, as it could be the case that it remained set from a previous call which did not trigger it (see T310656).

Jun 14 2022, 9:04 PM · Web2Cit-Gadget
diegodlh created T310656: Credit label update deferred if only Web2Cit citation available (no Citoid citation).
Jun 14 2022, 9:02 PM · Web2Cit-Gadget
diegodlh added a comment to T308323: Web2Cit enable/disable checkbox stops working after switching between visual and code editing.

I wonder whether addressing T310653 would solve this.

Jun 14 2022, 8:36 PM · Web2Cit-Gadget
diegodlh closed T308342: Consider using Gitlab to version control the Web2Cit user script as Resolved.

Done: https://gitlab.wikimedia.org/diegodlh/w2c-gadget

Jun 14 2022, 8:32 PM · Web2Cit-Gadget
diegodlh changed Source Repo from https://en.wikipedia.org/wiki/User:Diegodlh/Web2Cit/script.js to https://gitlab.wikimedia.org/diegodlh/w2c-gadget on Web2Cit-Gadget.
Jun 14 2022, 8:31 PM
diegodlh created T310653: Web2Cit user script's unpatch methods may conflict with other monkey-patchers.
Jun 14 2022, 8:24 PM · Web2Cit-Gadget

Jun 13 2022

diegodlh created T310519: tests=false should hide expected output column from Web2Cit translation results.
Jun 13 2022, 3:56 PM · Web2Cit-Server
diegodlh created T310518: Add a shortcut to translate all template + test paths to Web2Cit homepage.
Jun 13 2022, 3:53 PM · Web2Cit-Server
diegodlh closed T302724: Include translation test results in server response as Resolved.

Implemented on the translation-tests branch.

Jun 13 2022, 2:57 PM · Web2Cit-Server
diegodlh closed T302722: Add test configuration support, a subtask of T302724: Include translation test results in server response, as Resolved.
Jun 13 2022, 2:55 PM · Web2Cit-Server
diegodlh closed T302722: Add test configuration support as Resolved.

Implemented in test-configuration branch and merged into main.

Jun 13 2022, 2:55 PM · Web2Cit-Core
diegodlh closed T306555: Consider adding a "fetchAndLoadConfig" method to the "Domain" objects as Resolved.

Fixed in test-configuration branch. Added a fetchAndLoadConfigs method to the Domain object prototype.

Jun 13 2022, 2:06 PM · Web2Cit-Core
diegodlh closed T302239: Consider creating new Webpage objects through a method that maintains a cache as Resolved.

Fixed in test-configuration branch. Now Webpage objects are created via the getWebpage method of a WebpageFactory object included in the Domain objects.

Jun 13 2022, 2:04 PM · Web2Cit-Core
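
A minimal sketch of what a caching factory of this kind might look like. Only the names `Webpage`, `WebpageFactory`, `getWebpage` and `Domain` come from the comment above; the constructor arguments, the `webpages` property and the use of the path as cache key are assumptions for illustration.

```typescript
class Webpage {
  constructor(readonly domain: string, readonly path: string) {}
}

class WebpageFactory {
  // Cache keyed by path (assumption) so each page is created only once.
  private readonly cache = new Map<string, Webpage>();

  constructor(private readonly domain: string) {}

  getWebpage(path: string): Webpage {
    let webpage = this.cache.get(path);
    if (webpage === undefined) {
      webpage = new Webpage(this.domain, path);
      this.cache.set(path, webpage);
    }
    return webpage;
  }
}

class Domain {
  // Hypothetical property name for the factory included in the Domain object.
  readonly webpages: WebpageFactory;

  constructor(readonly name: string) {
    this.webpages = new WebpageFactory(name);
  }
}
```

Repeated calls to `getWebpage` with the same path then return the same object, so anything already fetched for that page can be reused.
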
diegodlh closed T302589: Consider creating new Webpage objects via the Domain object as Resolved.

Fixed in test-configuration branch. The Domain object now includes a WebpageFactory object, with a getWebpage method.

Jun 13 2022, 2:02 PM · Web2Cit-Core
diegodlh added a comment to T310439: Citoid duplicates part of url for fina.org.

Noting that this also happens when importing the item into Zotero using the Zotero browser connector.

Jun 13 2022, 12:15 PM · Citoid

Jun 6 2022

diegodlh added a comment to T310001: Bloomberg is redirecting us to "Are you a robot?" captcha.

This feels related to T290834. Both have in common that we are getting an intermediary page, which not only breaks translation but also, more significantly, loses the original URL. That is, if the user does not spot the error, they will insert a citation with a completely useless URL.

Jun 6 2022, 5:28 PM · Web2Cit-Core
diegodlh added a comment to T302492: Update output URL in case of URL redirection or canonical URLs.

Alternatively, we may also support defining translation procedures for the URL field.

Jun 6 2022, 5:28 PM · Web2Cit-Core
diegodlh added a comment to T290834: Citoid fails to properly process references to zeit.de, picks cookie banner instead.

This also affects Web2Cit (a tool to collaboratively work around automatic citation problems), both where it relies on Citoid (i.e., Citoid selection steps) and where it relies on the webpage's HTML (i.e., XPath selection steps). I hope it's OK that I add the Web2Cit-Core tag too.

Jun 6 2022, 4:47 PM · Web2Cit-Core, VisualEditor, The-Wikipedia-Library, Citoid
diegodlh added a project to T290834: Citoid fails to properly process references to zeit.de, picks cookie banner instead: Web2Cit-Core.
Jun 6 2022, 4:46 PM · Web2Cit-Core, VisualEditor, The-Wikipedia-Library, Citoid
diegodlh updated the task description for T310001: Bloomberg is redirecting us to "Are you a robot?" captcha.
Jun 6 2022, 4:39 PM · Web2Cit-Core
diegodlh moved T310001: Bloomberg is redirecting us to "Are you a robot?" captcha from To do to Backlog on the Web2Cit-Core board.
Jun 6 2022, 4:20 PM · Web2Cit-Core
diegodlh created T310001: Bloomberg is redirecting us to "Are you a robot?" captcha.
Jun 6 2022, 4:20 PM · Web2Cit-Core
diegodlh created T309991: Consider supporting JavaScript rendering on target webpages.
Jun 6 2022, 3:01 PM · Web2Cit
diegodlh created T309989: Consider providing a way to reload page with JavaScript disabled.
Jun 6 2022, 2:54 PM · Web2Cit-Editor

Jun 3 2022

diegodlh added a comment to T305574: Consider using a hierarchy of configuration files.

Another point is that having separate templates for separate wikis would allow wikilinking to pages in the fields.

Jun 3 2022, 2:09 PM · Web2Cit
diegodlh created T309869: Support wikilinks in Web2Cit citation fields.
Jun 3 2022, 2:08 PM · Web2Cit

Jun 1 2022

diegodlh moved T309708: Escape special regex characters in non-regex "Match" transformation configs from To do to Done on the Web2Cit-Core board.

Fixed in 12da4a7a. Pending deployment.

Jun 1 2022, 1:34 PM · Web2Cit-Core
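
For context, escaping special characters before building a regular expression from a plain-text target is commonly done as sketched below. `escapeRegExp` follows the widely used pattern from the MDN regular expressions guide; `buildMatchPattern` and its `isRegex` flag are hypothetical and do not reflect the actual "Match" configuration fields or the fix in 12da4a7a.

```typescript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: treat the target literally unless it is flagged as a regex.
function buildMatchPattern(target: string, isRegex: boolean): RegExp {
  return new RegExp(isRegex ? target : escapeRegExp(target));
}
```

Without the escaping step, a plain-text target such as `a.b` would also match `axb`, because `.` matches any character.
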
diegodlh created T309708: Escape special regex characters in non-regex "Match" transformation configs.
Jun 1 2022, 1:20 PM · Web2Cit-Core
diegodlh added a comment to T309706: Support partial ISO dates in Web2Cit date transformation.

Because a date transformation step is included in the fallback template (see T308354 for a discussion of whether this should continue to be the case), partial dates returned by Citoid, some of which may be incompatible with English Wikipedia's citation templates (see T132308), are (incorrectly) force-converted to full dates. For example, 2020-12 is converted to 2020-12-01, and 2020 is converted to 2020-01-01.

Jun 1 2022, 1:05 PM · Web2Cit-Core
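
One way a date transformation could keep partial dates at their original precision is sketched below. This is an assumption-laden illustration, not the Web2Cit-Core date transformation: `toPartialIsoDate` is a hypothetical name, and only the YYYY, YYYY-MM and YYYY-MM-DD forms are handled.

```typescript
// Return the input unchanged if it is a valid YYYY, YYYY-MM or YYYY-MM-DD
// date, or undefined otherwise. Missing parts are never filled in.
function toPartialIsoDate(value: string): string | undefined {
  const trimmed = value.trim();
  const match = /^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(trimmed);
  if (match === null) return undefined;
  const year = Number(match[1]);
  // Use 1 as a placeholder for validation only; it is not added to the output.
  const month = match[2] === undefined ? 1 : Number(match[2]);
  const day = match[3] === undefined ? 1 : Number(match[3]);
  // Date.UTC rolls over out-of-range values, so a mismatch means an invalid date.
  const date = new Date(Date.UTC(year, month - 1, day));
  if (date.getUTCMonth() !== month - 1 || date.getUTCDate() !== day) {
    return undefined;
  }
  return trimmed;
}
```

Here `2020-12` and `2020` would be returned as they are, rather than being expanded to `2020-12-01` and `2020-01-01`.
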
diegodlh created T309706: Support partial ISO dates in Web2Cit date transformation.
Jun 1 2022, 1:03 PM · Web2Cit-Core

May 31 2022

diegodlh added a comment to T132308: Internationalise citoid dates.

What about adding a $wgCitoidEDTF boolean option (just as we have $wgCitoidFullRestbaseURL) to configure on a per-wiki basis whether we want 2010-12 (false) or 2010-12-XX (true) dates? Then, inside CitoidInspector's populateTemplate function, if $wgCitoidEDTF = true, we may append -XX to YYYY-MM values in any field mapping to the date base field (i.e., date, dateDecided, filingDate, issueDate and dateEnacted).

May 31 2022, 10:50 PM · User-notice, User-Josve05a, VisualEditor, Citoid
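
The core of this proposal could be sketched as a small function like the one below. It is only an illustration of the suggested behavior: `formatDateForWiki` is a hypothetical name, the `citoidEDTF` parameter stands in for the proposed $wgCitoidEDTF option, and none of this is existing Citoid code.

```typescript
// Append -XX to YYYY-MM values only when the (proposed) per-wiki option is on.
function formatDateForWiki(date: string, citoidEDTF: boolean): string {
  const isYearMonth = /^\d{4}-\d{2}$/.test(date);
  return citoidEDTF && isYearMonth ? `${date}-XX` : date;
}
```

With the option enabled, `2010-12` becomes `2010-12-XX`, while full dates and year-only values are left untouched.
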
diegodlh created T309658: Implement text fragment selection in Web2Cit.
May 31 2022, 8:38 PM · Web2Cit-Core
diegodlh created T309654: Consider improving Web2Cit innerText support in XPath selection.
May 31 2022, 8:06 PM · Web2Cit-Core

May 30 2022

diegodlh added a comment to T309310: Web2Cit configuration for www.independent.ie.

Thanks again for your feedback, @AlexisJazz!

May 30 2022, 1:47 PM · Web2Cit

May 27 2022

diegodlh updated subscribers of T304332: Implement JSON-LD selection.

I've been experimenting with JSON-LD selection. What I'm doing so far is:

  1. Concatenate multiple JSON-LD objects in a webpage into a single array (some webpages may have more than one JSON-LD object)
  2. Use JSONPath to select nodes.
May 27 2022, 1:42 AM · Web2Cit-Core
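
The first step above might look roughly like the sketch below, which takes the raw text of each JSON-LD block and concatenates the parsed results into one array. Flattening blocks that are themselves arrays is an assumption. For the second step, a JSONPath library would normally be used; to keep the example self-contained, `selectProperty` is a simplified stand-in that behaves roughly like the JSONPath expression `$[*].key`. All names here are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Step 1: parse each JSON-LD block and concatenate everything into one array.
function concatJsonLd(blocks: string[]): Json[] {
  const all: Json[] = [];
  for (const block of blocks) {
    try {
      const parsed = JSON.parse(block) as Json;
      if (Array.isArray(parsed)) {
        all.push(...parsed); // assumption: top-level arrays are flattened
      } else {
        all.push(parsed);
      }
    } catch {
      // Skip blocks that are not valid JSON.
    }
  }
  return all;
}

// Step 2 (simplified stand-in for JSONPath): collect the value of `key`
// from every top-level object in the concatenated array.
function selectProperty(nodes: Json[], key: string): Json[] {
  const results: Json[] = [];
  for (const node of nodes) {
    if (node !== null && typeof node === "object" && !Array.isArray(node)) {
      const value = node[key];
      if (value !== undefined) results.push(value);
    }
  }
  return results;
}
```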

May 26 2022

diegodlh closed T309321: Web2Cit server and config editor do not handle user names with spaces as Resolved.

Thanks, @AlexisJazz. I've just fixed this in 59e0858e. It should be working now. I tried it with the four types of usernames that you mentioned.

May 26 2022, 6:31 PM · Web2Cit-Server, Web2Cit-Editor
diegodlh added a comment to T309310: Web2Cit configuration for www.independent.ie.

Try adding itemType and title fields to your template. Use the default procedure for both. This should fix the "I don't see a difference in the output" part of the problem.

May 26 2022, 4:34 PM · Web2Cit
diegodlh closed T309320: Web2Cit configuration file editor fails to save as Resolved.

Fixed in ee84866d.

May 26 2022, 4:17 PM · Web2Cit-Editor, Web2Cit-Server
diegodlh added a comment to T309251: PubMed source dates with no day.

I think this is related to this bug here: T132308. In short, the English Wikipedia (and apparently *only* the English Wikipedia) rejects dates in the YYYY-MM format (when the day is unknown or unspecified) because, they argue, such dates could be mistaken for a date range. For example, 2010-12 could either mean "December 2010" or "2010-2012". As mentioned in that (long) thread, the Citoid team tried to fix it with 2010-12-XX (for example); but although it works in the English Wikipedia, it fails in all others.

May 26 2022, 3:09 PM · Citoid, VisualEditor
diegodlh added projects to T309321: Web2Cit server and config editor do not handle user names with spaces: Web2Cit-Editor, Web2Cit-Server.
May 26 2022, 2:54 PM · Web2Cit-Server, Web2Cit-Editor
diegodlh created T309321: Web2Cit server and config editor do not handle user names with spaces.
May 26 2022, 2:53 PM · Web2Cit-Server, Web2Cit-Editor
diegodlh created T309320: Web2Cit configuration file editor fails to save.
May 26 2022, 2:43 PM · Web2Cit-Editor, Web2Cit-Server
diegodlh added a comment to T298427: Zotero translator needed to get correct author for Condé Nast requests.

Thank you, @AlexisJazz! I've opened a separate task to discuss this, so we don't continue detouring from the topic of this task (i.e., fixing Citoid response for Condé Nast publications). I know it was me who started it, sorry: T309310

May 26 2022, 2:32 PM · VisualEditor, Citoid
diegodlh added a comment to T309310: Web2Cit configuration for www.independent.ie.

Thank you very much for your interest in Web2Cit, for helping us test it, and for reporting the issues you found! It's very helpful for us.

May 26 2022, 2:27 PM · Web2Cit
diegodlh created T309310: Web2Cit configuration for www.independent.ie.
May 26 2022, 1:15 PM · Web2Cit

May 22 2022

Pcoombe awarded T308668: Consider implementing CSS selection in Web2Cit a Like token.
May 22 2022, 3:38 PM · Web2Cit-Core

May 20 2022

diegodlh added a comment to T308452: Improve automatic citation of web sources in Wikipedia.

At Wikimedia Hackathon 2022's Web2Cit session (T308449) we configured Web2Cit translation for news.yahoo.com domain. Using our user script, Wikipedia's automatic citation tool now shows correct citations for webpages from that domain.

May 20 2022, 8:49 PM · good first task, Web2Cit, Wikimedia-Hackathon-2022
diegodlh added a comment to T305574: Consider using a hierarchy of configuration files.

At Wikimedia Hackathon 2022's Web2Cit session (T308449), @Mvolz commented that having separate Web2Cit configurations per Wikipedia may be useful for the specific case described in T132308. That is, incomplete dates returned by Citoid (e.g., 2010-12, meaning December 2010) throw an error in English Wikipedia citation templates, which reject them to avoid confusion with date ranges (i.e., 2010-2012). As described there, appending -XX (i.e., 2010-12-XX) was tried, but although it was accepted by English Wikipedia, it was rejected by other Wikipedias.

May 20 2022, 7:42 PM · Web2Cit
diegodlh added a comment to T132308: Internationalise citoid dates.

Hi! I just learned today (at Wikimedia Hackathon 2022's Web2Cit session, T308449) about the problem of incomplete dates (e.g., 2010-12, meaning December 2010) being rejected in English Wikipedia because they could be confused with date ranges (i.e., 2010-2012). From what @Mvolz commented at the session, and from what I managed to read above, there was an attempt to make Citoid return 2010-12-XX instead, but that fails in other Wikipedias.

May 20 2022, 7:28 PM · User-notice, User-Josve05a, VisualEditor, Citoid
diegodlh closed T308449: [Session] Web2Cit early adoption to improve automatic citations in Wikipedia as Resolved.
May 20 2022, 5:44 PM · good first task, Web2Cit, Wikimedia-Hackathon-2022
diegodlh updated the task description for T308449: [Session] Web2Cit early adoption to improve automatic citations in Wikipedia.
May 20 2022, 1:48 PM · good first task, Web2Cit, Wikimedia-Hackathon-2022

May 19 2022

diegodlh added a comment to T308666: Web2Cit should not fail on misconfigured XPath selection.

These invalid XPath expressions are not being caught at config validation because for some reason jsdom's document.createExpression() is not failing on them. Reported to jsdom here.

May 19 2022, 11:07 PM · Web2Cit-Core, Web2Cit-Server
diegodlh closed T297951: Web2Cit should reject top-level domains as Invalid.

Given a Target URL, the 3-file set defined for the most-specific subdomain is used

May 19 2022, 8:38 PM · Web2Cit
diegodlh moved T302697: Return only one "container-title" field from To do to Backlog on the Web2Cit-Core board.
May 19 2022, 8:33 PM · Web2Cit-Core
diegodlh triaged T302697: Return only one "container-title" field as Lowest priority.
May 19 2022, 8:33 PM · Web2Cit-Core
diegodlh closed T303331: Consider making the "itemwise" property of TransformationDefinition optional as Resolved.

Because transformation step types are defined within a oneOf object in the templates json schema, the default value of the itemwise property is not being used

May 19 2022, 8:28 PM · Web2Cit-Core
diegodlh moved T303331: Consider making the "itemwise" property of TransformationDefinition optional from To do to Backlog on the Web2Cit-Core board.
May 19 2022, 8:19 PM · Web2Cit-Core
diegodlh added a comment to T302696: Add logging middleware.

Logging both Citoid raw and Web2Cit citations (returned when the server is used with option citoid=true, as is the case with the Web2Cit-Gadget) would help us evaluate if Citoid's coverage gap is narrowing with Web2Cit (Web2Cit-Research).

May 19 2022, 3:57 PM · Web2Cit-Server
diegodlh updated the task description for T308449: [Session] Web2Cit early adoption to improve automatic citations in Wikipedia.
May 19 2022, 2:00 PM · good first task, Web2Cit, Wikimedia-Hackathon-2022

May 18 2022

diegodlh added subtasks for T308712: Make Web2Cit support web servers which addapt to "Accept-Language" request headers: T308711: Support sending "Accept-Language" headers to external services, T308710: Handle "Accept-Language" headers sent by Web2Cit clients, T304773: Consider providing an option to make a HEAD request to the target URL to confirm target availability and follow redirects, T304333: Implement Header selection, T304326: Implement URL selection step.
May 18 2022, 11:48 PM · Web2Cit
diegodlh added a parent task for T308710: Handle "Accept-Language" headers sent by Web2Cit clients: T308712: Make Web2Cit support web servers which addapt to "Accept-Language" request headers.
May 18 2022, 11:48 PM · Web2Cit-Server