Publish Source Code for wikimediafoundation.org
Closed, Resolved, Public

Authored by Reedy, Aug 9 2018, 12:39 AM

Description

Can we publish the source code for wikimediafoundation.org to gerrit?

Event Timeline

Varnent changed the task status from Open to Stalled (Aug 9 2018, 1:13 AM).
Varnent claimed this task.
Varnent triaged this task as Low priority.
Varnent subscribed.

I will check on this after the site launches and code changes for this round are finalized.

Just wanted to say I think doing this is very much part of our core values of openness and transparency. Several days ago I found a typo on the website but couldn't find the place to fix it, and now I've forgotten where it was...

It’s possible some typos might not be fixable like that, depending on how the site is set up: the pages are dynamic-ish and based on WordPress.

I, personally, would change the CMS to make it happen (from WordPress to anything that works), as I hold these values very dear, but I can understand that people might have different opinions or that their cost-benefit analysis gives them different results.

Indeed, all of the options available were database-driven, and managing the content via code would not work well for things like the blog. Typos and similar issues can be fixed faster in a dynamic system, and should be reported to Communications or posted on the Meta-Wiki talk page for this site. Additionally, Gerrit would be a mirror and not the primary location of the code. It is more to make the code available for use and review than for tweaking the live site. This is true for the former blog and other Foundation-managed (vs. community-managed) sites. When code for those sites was hosted primarily outside Gerrit (which in general is decided by the Technology department), any code that was made available was published after the site was completed and fully launched (at least to my knowledge). I hope that helps clear up any confusion about the use of the code and expectations around it being made available on Gerrit in the future.

Any Wikimedia-hosted website (regardless of community vs. Foundation, like the annual plan) must have the code in the git repo as its source of truth for auditability, for reproducibility (e.g. so we can move to a new server), and as the most basic of basic best practices. In my mind this is a hard requirement, no ifs, ands, or buts. That this is optional for the externally hosted Foundation sites is, in my opinion, deplorable in the extreme (but surely you are still storing the code in version control somewhere, even if not in a public repo?), but I guess most of the concerns that make this a hard requirement for Wikimedia-hosted sites make it somebody else's problem for externally hosted sites.

Edit: I may have misinterpreted @Varnent's comment and jumped to conclusions, which fueled anger that wasn't entirely well founded. I originally interpreted the statement about the code not being for tweaking as meaning the code would be intended as interesting background, might not be up to date, and is generally seen as a nice FYI and not important. There is a more charitable interpretation: that he maybe simply meant that most of the content is controlled by the CMS, and you wouldn't patch the code to fix a typo any more than you would patch MediaWiki to alter Wikipedia's main page. I had also jumped to conclusions upon seeing T95129 still open, assuming that the code for the blog theme had not been disclosed despite 3 years of asking, and from there concluding that there was a pattern of using the hosting-but-not-distributing GPL loophole to use WMF resources to fund the development of what is effectively proprietary software. This is not the case; it has been disclosed for a while now, albeit the licensing of non-code assets is still unclear. I still stand by the substance of my post, however. I am still extremely disappointed at what appears to be an attitude of viewing public source code as a nice-to-have and not a core requirement. I also dislike that the plan seems to be a public mirror of an official private repo instead of having the official repo be public (I don't care if the official repo is in Gerrit. Making some WordPress SVN repo public would be sufficient in my mind. Having our own copy would also be nice in that scenario for backup purposes). That said, my original comment came from a place of anger which was informed by incorrect assumptions, so I am sorry about that.

Dumb question, but where exactly is this being developed? Does the foundation actually have access to the version control where it is, or is all of that being handled on the contractor end?

Yes - the version control for all Wikimedia Foundation sites based on WordPress is managed externally and is accessible to the Foundation. Code revisions are reviewed by Automattic (who also maintain the security patches for the site) before being published to the production site. More information from when this was once done for the blog: T95129

This task is effectively a blocker for nearly all the other bugs in this component, so it's hard to understand how it can possibly be considered lower priority.

@Nemo_bis: That's a misleading generalization as this task 'only' blocks contributors from proposing patches. Missing source code does not generally block tasks as those folks who 'own' that website can of course access and fix issues described in tasks.

Even though it might not be the "preferred" representation of our content, as a matter of good faith we could publish a dump of the WordPress CMS content, either as a raw database dump or in whatever "better" format WordPress may support.

It happens to all of us. :) I believe the private nature of the host's repo is partly a security issue that applies to all clients (as most do not even have a public mirror available), although I obviously cannot speak for them on this. I can say that it has indeed always been the intention to make the code publicly available. I have already followed up with the host about making the code publicly available, and they are looking into it. Indeed, the content is controlled by the CMS and not the codebase, so typo corrections cannot be made via the code (public or not). Publishing the content dump will make the content available in one code-like place, but will not provide a method of suggesting content updates. The best method for that remains the site's talk page on Meta-Wiki.

Updates to the code will be managed similarly to code updates for other externally hosted Foundation websites (again, the blog is an example where this has been done multiple times, both before and after the mirror was available). Changes can be proposed here; the ones the Foundation decides to act on are then submitted by staff with access to the master repo, and reviewed by the host before being uploaded to the production repo.

More detailed technical documentation on how the mirroring was implemented for the blog (see T95129): https://meta.wikimedia.org/wiki/Wikimedia_Blog/SVN-GitHub_mirror_of_the_WordPress_theme
I and/or @Tgr (who was instrumental in figuring this out at the time) might be able to help with adapting it to the new site.

Thank you! Are GitHub pull requests the best way for people to submit patches?

It is a mirror, so, as with other sites hosted this way, patches cannot be submitted via this repo (or rather, they are not monitored and are overwritten when the mirror is updated). Any code changes will need to be handled by the internal team with access to the private repo.

I meant that if I want to propose a change to the code, and have created a git patch, then what's the best way to do that?

I am open to ideas, but for now I suppose you will need to send them to me as project lead. My understanding is that the mirror may overwrite any commits before they are seen, as it's a mirror and not a fork.

Issues and pull requests aren't overwritten. They should be able to cherry-pick them into their private repo from there.

If they do any force pushes, that'll remove any commits only in that repo.
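The two points above (pull requests survive, but a force push drops commits that exist only on the mirror's branch) can be illustrated with a small local simulation. This is only a sketch under stated assumptions: every repository, path, and commit below is hypothetical and created in a scratch directory, and the ref name refs/pull/1/head merely stands in for the way GitHub keeps pull request heads under their own refs. It says nothing about how the actual mirror job is configured.

```shell
#!/bin/sh
set -eu

# Scratch area; all repositories below are hypothetical stand-ins.
work=$(mktemp -d)

# The bare repo plays the public mirror, "private" plays the source of truth.
git init -q --bare "$work/mirror.git"
git init -q "$work/private"
cd "$work/private"
git config user.name "Maintainer"
git config user.email "maintainer@example.org"
echo one > file.txt
git add file.txt
git commit -q -m "Upstream commit 1"
branch=$(git symbolic-ref --short HEAD)
git push -q "$work/mirror.git" "$branch"

# A visitor commits directly to the mirror's branch and also publishes the
# same commit under a separate ref (stand-in for a pull request head).
git clone -q "$work/mirror.git" "$work/visitor"
cd "$work/visitor"
git config user.name "Visitor"
git config user.email "visitor@example.org"
echo patch > extra.txt
git add extra.txt
git commit -q -m "Mirror-only commit"
git push -q origin "$branch"
git push -q origin "HEAD:refs/pull/1/head"

# The mirror job force-pushes the private history over the mirror's branch.
cd "$work/private"
echo two >> file.txt
git commit -q -am "Upstream commit 2"
git push -q --force "$work/mirror.git" "$branch"

# The branch now holds only upstream history...
git -C "$work/mirror.git" log --format=%s "$branch"
# ...while the commit is still reachable through the separate ref.
git -C "$work/mirror.git" log -1 --format=%s refs/pull/1/head
```

In this simulation the mirror-only commit disappears from the branch after the force push but can still be fetched or cherry-picked from the separate ref.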

People are welcome to try and make commits to the public mirror, I am simply sharing what I was told while setting it up. Happy to be proven wrong, but also do not want to give false impressions of how I understand it will work moving forward. If people submit via the public mirror, I cannot promise we will see it or it will ever make its way to the private repo. If people ping me, I can at least let them know that we are aware of it and have it.

I put up a very small PR at https://github.com/wikimedia/wikimediafoundation-org/pull/1 as a demo. The raw patch file can be accessed at https://patch-diff.githubusercontent.com/raw/wikimedia/wikimediafoundation-org/pull/1.patch (run git am ~/wherever/1.patch to import it). I don't see any good documentation online about how to merge pull requests across repositories; I can definitely help with writing some, though.
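As a rough illustration of the git am workflow mentioned above, here is a minimal local sketch. It is an assumption-laden stand-in, not the process the site's maintainers actually use: both repositories, the file footer.txt, and the commit are hypothetical, and the patch is produced locally with git format-patch (the same mailbox format as a pull request's .patch file) so the example runs without network access. With a real pull request, the patch file would instead be downloaded from the .patch URL given above.

```shell
#!/bin/sh
set -eu

# Scratch area; every path and repository here is hypothetical.
work=$(mktemp -d)

# 1. The "private" repository that holds the real history.
git init -q "$work/private"
cd "$work/private"
git config user.name "Maintainer"
git config user.email "maintainer@example.org"
printf 'Wikimedia Fundation\n' > footer.txt
git add footer.txt
git commit -q -m "Initial import"

# 2. A contributor works on a copy of the public history
#    (here simply a clone of the same repository).
git clone -q "$work/private" "$work/contributor"
cd "$work/contributor"
git config user.name "Contributor"
git config user.email "contributor@example.org"
printf 'Wikimedia Foundation\n' > footer.txt
git commit -q -am "Fix typo in footer"

# 3. Export the change as a mailbox-style patch file.
git format-patch -1 --stdout > "$work/1.patch"

# 4. The maintainer applies it in the private repository; git am keeps
#    the original author and commit message.
cd "$work/private"
git am -q "$work/1.patch"
git log -1 --format='%an: %s'
```

Because git am preserves authorship, the contributor stays credited even though someone else lands the change in the private repository.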