Perform code steward process for #geodata extension
Open, Normal, Public

Description

A succinct problem statement to give context for why the review was initiated.

No one seems to be responsible for the GeoData extension, despite it being very actively in use. Unlike other codebases that have gone through this process, which were limited to a niche or a few communities, this one is in use on millions of pages. It also has important outstanding bugs, which are making the database degrade and blocking MCR/Commons structured data deployment:

The GeoData extension creates metadata from the coordinates set on wikis by parsing page content and storing the result in a table. My guess is that it is the base for GeoHack, "nearby" articles, community-made maps of multimedia, monuments (including WLM, etc.), natural locations, etc.

Original reporter's ignorant comment: It seems that structured coordinates are going to be taken over by Wikibase (both on Commons and on articles, maybe?), so my proposal would be to slowly decommission it and let the Wikidata and MCR teams and the community build better, up-to-date tools as needed? In other words, the code could be deprecated, but the functionality must be maintained, even if slightly modified.

Entry in Developers/Maintainers (It was emptied)

After digging and asking around, there doesn't seem to be anyone responsible right now; the last news is that the search team is going to look into it. Even if search takes ownership of the existing code, there needs to be a non-trivial commitment to maintain it at first because of long-overdue issues.

I personally suggested involving the community and the MCR people (structured Commons), as there could be an overlap in functionality (not sure), e.g. coordinates for images being stored on Wikibase instead. Both could live together, but there is certainly some coordination that could happen there. On that point it is especially important to hear from the Commons contributors. Other wikis will most likely need to maintain GeoData as it is.

Number, severity, and age of known and confirmed security issues

See above.

Was it a cause of production outages or incidents? List them.

Not yet, but I predict those will happen due to differences between master and replicas on the production tables. #DBAs can help clean that up, but need someone to fix the code so it does not happen again (check duplicates). If this happened, it would break not only GeoData but all wiki content edits for all wikis in the same section.

Does it have sufficient hardware resources for now and the near future (to take into account expected usage growth)?
Hardware should not be an issue.

Is it a frequent cause of monitoring alerts that need action, and are they addressed timely and appropriately?

I created a high-priority ticket regarding a database degradation; nobody attended to it for several months. No alerts, as it is part of the parser.

When it was first deployed to Wikimedia production

Can't say.

Usage statistics based on audience(s) served

Can't say; someone else can tell, but my estimate is millions (mainstream feature), including mobile and desktop users. I predict huge user outrage if this functionality stops working.

Changes committed in last 1, 3, 6, and 12 months
https://phabricator.wikimedia.org/diffusion/EGDA/history/master/

I'd say there is ~1 functionality-related commit each month: January: 2, February: 1, March: 1, April: 0. It is difficult to say, though, as many are probably automatic things like translations, etc.

Reliance on outdated platforms (e.g. operating systems)

I don't think it is relevant; it should be platform-agnostic as long as MediaWiki is running.

Number of developers who committed code in the last 1, 3, 6, and 12 months
https://phabricator.wikimedia.org/diffusion/EGDA/history/master/
1 non-bot in the last month, 7 non-bots in the last 3 months?
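
For whoever fills in the 1, 3, 6, and 12 month figures for commits and developers, here is a minimal sketch of how they could be tallied once the history has been exported. Everything in it is an assumption for illustration: the sample commits, the author names, the `is_bot` flag, and the 30-day approximation of a month are hypothetical and not taken from the GeoData repository.

```python
from datetime import date, timedelta

# Hypothetical sample data: (author, commit date, is_bot).
# These are NOT real commits from the GeoData repository.
COMMITS = [
    ("alice", date(2018, 4, 2), False),
    ("l10n-bot", date(2018, 4, 1), True),
    ("bob", date(2018, 2, 15), False),
    ("alice", date(2018, 1, 20), False),
    ("carol", date(2017, 9, 5), False),
]


def rubric_counts(commits, today, months=(1, 3, 6, 12)):
    """Return {months: (non_bot_commits, distinct_non_bot_authors)}.

    A month is approximated as 30 days, which is an assumption of this sketch.
    """
    result = {}
    for m in months:
        cutoff = today - timedelta(days=30 * m)
        # Keep only human commits that fall inside the window.
        recent = [(author, day) for author, day, is_bot in commits
                  if not is_bot and cutoff <= day <= today]
        result[m] = (len(recent), len({author for author, _ in recent}))
    return result


counts = rubric_counts(COMMITS, today=date(2018, 4, 10))
for m, (n_commits, n_devs) in counts.items():
    print(f"last {m:>2} months: {n_commits} commits by {n_devs} non-bot developers")
```

Filtering bots out before counting matters here because, as noted above, many commits are probably automatic things like translations.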

Number and age of open patches
Number and age of open bugs
See above

Number of known dependencies?
No idea.

Is there a replacement/alternative for the feature? Is there a plan for a replacement?
No, but I mentioned that some functionality, on Commons, could be overridden by Wikibase usage. Speak with MCR to see what the plans are about that; maybe it should be interoperable. No plans that I know of for all other wikis, except I guess Wikidata itself.

Submitter's recommendation (what do you propose be done?)

Search for a maintainer while we do the database fixes so the extension can survive. Talk to the MCR team to understand their plans. If possible, maintain it as is, but with a responsive team behind it (once the immediate fixes are done, it should not need a lot of work: as long as bugs are fixed, no new functionality or large commitments would be needed).

Event Timeline

jcrespo created this task. Feb 27 2018, 8:22 PM
Restricted Application added a subscriber: Aklapper. Feb 27 2018, 8:22 PM

Hmm, mw:Developers/Maintainers lists Discovery (which no longer exists) as the responsible team and @MaxSem (now in Community-Tech) as maintainer.

MaxSem added a comment. Mar 1 2018, 1:40 AM

With all due respect, T35704 is a feature request rather than an "important bug", and T55539 breaking replication is {{citation needed}}.

@MaxSem I was told this extension was orphaned. If you tell me you are in charge or interested (the level of involvement or time available doesn't matter for that), I will mark the task as invalid right now.

To be clear, I am *not* interested in destroying it; in fact, I wanted to help fix T55539 and T143366 but was told nobody cared :-(. I am happy to be proved wrong.

mark added a subscriber: mark. Mar 13 2018, 12:08 PM
Jrbranaa added a comment. Edited Apr 6 2018, 9:34 PM

@jcrespo if you could add the remaining portions of the Rubric[0], that will help us once we start reviewing. Although sunsetting is one of the potential outcomes, we'd like to investigate the funding alternatives as well.

[0] https://www.mediawiki.org/wiki/Code_stewardship_reviews#Rubric

jcrespo renamed this task from Sunset #geodata extension to Perform code steward process for #geodata extension. Apr 9 2018, 7:52 AM

I've changed the title to better reflect that I don't want to remove this. In fact, what I want is better support for it, since the lack of support is affecting me right now.

MaxSem added a subscriber: EBjune. Apr 9 2018, 6:28 PM

I believe GD is now the search team's responsibility, adding @EBjune.

EBjune added a comment. Apr 9 2018, 7:49 PM

Well, I'm not sure how that becomes the search team's responsibility when all the map team people moved to audiences, switched teams (which includes @MaxSem, the previously listed maintainer), or are no longer at WMF. I'll talk with the team to help figure out where it really belongs, but at this point, we have pretty much zero familiarity or time to put into it should anything be required.

jcrespo updated the task description. Apr 10 2018, 6:39 AM
jcrespo updated the task description. Apr 10 2018, 7:18 AM

Search Platform discussed this. We can probably deal with the Elasticsearch integration part of the extension, but there are other parts that deal with coordinate management, map display, and the integration with parsing hooks that we are completely unfamiliar with and not really staffed to deal with.

That said, we'll take a deeper look at the extension to identify what, specifically, we can and can't support and then figure out where to go from there and who else we might be able to loop in.

jcrespo added a comment. Edited Apr 11 2018, 7:18 AM

@EBjune The largest issue right now, from the reporter's point of view, and the one that would threaten the stability of the site, is some database-related work. #DBAs want to take care of that, but may need some code maintenance. Is that something that your team could help with? It should be a one-time thing, as far as the database bugs are concerned.

We could also have some discussion with Wikidata/Structured Data Commons to see if they can take care of the others.

Thanks for the additional information @jcrespo, it will help as we review this extension's stewardship/funding moving forward. @EBjune, any additional insight regarding Search Platform team's potential role will be appreciated. In the coming weeks we'll also be soliciting broader feedback on this extension from the community.

Just to close the loop on this, Search Platform will be the code stewards for the Geodata extension. For any code support needs around database updates, please open tickets and tag them Discovery-Search.

@EBjune: Thanks. Should H33 be adjusted to add the Discovery-Search tag automatically when the GeoData project tag is added? Should the "GeoData" entry on https://www.mediawiki.org/wiki/Developers/Maintainers get updated?

EBjune added a comment. Jun 1 2018, 6:26 PM

Thanks @Aklapper, I went ahead and updated the maintainers page, and yes, Discovery-Search can be added automatically to GeoData tagged issues.

phuedx added a subscriber: phuedx. Jun 7 2018, 8:21 AM
EBjune triaged this task as Normal priority. Jun 7 2018, 5:16 PM
EBjune moved this task from needs triage to watching / waiting on the Discovery-Search board.
Aklapper added a comment. Edited Jun 19 2018, 8:58 PM

yes, Discovery-Search can be added automatically to GeoData tagged issues

I realized that H33 is not about Discovery-Search but about Discovery so my question did not make sense.
@EBjune: Question: I could update H143 to make Herald add Discovery-Search automatically to [new] tasks under GeoData though, if that is wanted?
(Note that Herald will not apply rules retroactively; use batch-editing if you want to edit project tags of existing tasks.)

Edit: Had not realized that #Discovery-Search-Backlog is an alias for #Discovery-Search so I went ahead and edited H143 as that's the outcome you originally asked for. Done!