
System documentation integrated in source code
Open, Lowest, Public, Feature

Description

Proposed by Aaron Schulz in 2013, extracted from old Mentorship_programs/Possible_projects.

"It would be really nice if inline comments, README files, and special documentation files could exist in the source code but be exported into a formatted, navigable system (maybe wiki pages or maybe something else).It could be something like doxygen, except better and orientated to admins and not developers. Of course it should integrate with mediawiki.org and https://doc.wikimedia.org. The idea would be that one could:

  • Keep documentation close to the code and thus far more up to date
  • Even enforce documentation updates to it with new commits sometimes
  • Reduce the tedium of making documentation by using minimal markup to specify tables, lists, hierarchy, and so on, and let a tool deal with generating the html (or wikitext). This could allow for a more consistent appearance to documentation.
  • When things are removed from the code (along with the docs in the repo), if mw.org pages are used, they can be tagged with a warning box and placed in a maintenance category."
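The last point in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the mapping from wiki page titles to source paths, the function names, and the template name `{{Documentation source removed}}` are all hypothetical and do not refer to anything that exists on mediawiki.org.

```python
# Hypothetical warning template; the name is an assumption for this sketch.
WARNING_TEMPLATE = "{{Documentation source removed}}"


def pages_to_flag(published, current_sources):
    """Return the titles of published pages whose source file is gone.

    published: dict mapping wiki page title -> path of its source file
    current_sources: set of documentation paths still present in the repo
    """
    return sorted(
        title for title, source in published.items()
        if source not in current_sources
    )


def flag_page(wikitext):
    """Prepend the warning template unless the page already carries it."""
    if wikitext.startswith(WARNING_TEMPLATE):
        return wikitext
    return WARNING_TEMPLATE + "\n" + wikitext
```

A bot or CI step could call `pages_to_flag` after each merge and apply `flag_page` to every title it returns; placing the page in a maintenance category would then be the job of the template itself.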

Submitted here for feedback. If we can have a rough plan and at least one potential mentor perhaps we could push this project idea to https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas


Version: 1.21.x
Severity: enhancement

Details

Reference
bz46526

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:16 AM
bzimport set Reference to bz46526.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

http://svn.wikimedia.org/doc

https://doc.wikimedia.org/ now.

I hope it's clear to potential contributors that we're not after writing a new documentation (deployment) system from scratch, but first want to come up with a rough plan with executable, manageable subtasks that are (hopefully) based on existing tools and infrastructure (e.g. improving Doxygen). :)

We need to make this proposal clear to ourselves before attempting to propose it to contributors.

When I read the proposal I see an extension able to go through the source code of MediaWiki core and extensions, capture the documentation and export it to structured MediaWiki pages, e.g. under the "Docs:" namespace. Those pages wouldn't be editable, and whenever there is a change in the source code the corresponding updates would be made in the wiki pages.

I have no idea / no opinion about the implementation, other than yes, let's reuse as much as possible software available out there and maintained by others.

... but I'm not sure whether this is in line with what Aaron / others have in mind. CCing Chad and hashar for ideas & sanity check.

At first:

This project should NOT be handled as a GSOC 2013 project. It is prone to failure. We have such a debt on this subject that there is no way a newcomer will solve it or even come close to an idea of a solution in less than 8 weeks. GSOC should be for what it is: small projects for newbies :-]

It is not going to help us nor the student.


My points regarding documentation are very simple:

  • we have almost no documentation (nor any tutorials)
  • I do not care about the documentation on mediawiki.org. It is almost never accurate.

Most of the time I do not read our inline comments and documentation either. They are very incomplete, sometimes misleading and most of the time not present at all. Solutions: read the code, ask the original developer.

To answer the proposal in comment #0

Keep documentation close to the code and thus far more up to date
Even enforce documentation updates to it with new commits sometimes

We need to first start writing documentation. Until we do, there is no point in setting up whatever crazy system. The root issue is we suck at writing doc.

The documentation should be kept in sync with the code. The best way to handle that is to have the doc+code changes to be atomic, hence to land in the source tree as a single commit. Then we can use whatever tool to generate the documentation automatically (we use Doxygen for now).
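As a rough illustration of what "doc and code land as a single commit" could look like as an automated check, here is a minimal sketch. The suffix sets, the `docs` directory rule, and the function names are assumptions made for this example, not an existing CI job or Gerrit feature. It also cannot see documentation changes made inside code comments, so at best it is a reminder, not a gate.

```python
from pathlib import PurePosixPath

# Assumed file suffixes; a real project would tune these lists.
CODE_SUFFIXES = {".php", ".js"}
DOC_SUFFIXES = {".md", ".txt", ".wiki"}


def is_doc(path):
    """Treat README files, anything under a docs directory, and text formats as docs."""
    p = PurePosixPath(path)
    return (
        p.suffix in DOC_SUFFIXES
        or "docs" in p.parts
        or p.name.startswith("README")
    )


def is_code(path):
    """Treat files with a known code suffix as code, unless they count as docs."""
    return PurePosixPath(path).suffix in CODE_SUFFIXES and not is_doc(path)


def missing_doc_update(changed_paths):
    """True if a commit touches code files but no documentation file."""
    touches_code = any(is_code(p) for p in changed_paths)
    touches_docs = any(is_doc(p) for p in changed_paths)
    return touches_code and not touches_docs
```

For example, a commit changing only `includes/filebackend/FileBackend.php` would be flagged, while one that also touches `includes/filebackend/README` would not.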

Reduce the tedium of making documentation by using minimal markup to specify tables, lists, hierarchy, and so on, and let a tool deal with generating the html (or wikitext). This could allow for a more consistent appearance to documentation.

Markup is totally irrelevant. Doxygen lets you put HTML in it, and it even sometimes supports something close to wikitext (like stars to build a list). All the nice markup is not going to be useful when there is no text to actually format (see point 1: we suck at writing doc).

When things are removed from the code (along with the docs in the repo), if mw.org pages are used, they can be tagged with warning box and be placed in maintenance category."

Trivially solved by keeping the code and documentation in the same place and under review. If the proposal is to get some kind of robot that analyzes the code, then updates the wiki and waits for some random non-dev to edit the wiki: it is prone to failure.

My recommendation would be to make the documentation a focus for next fiscal year. Start having old developers write tutorials and documentation while they are mentoring new developers. But we all know we are too busy to handle that. So I guess it is going to be a dead horse. Sorry :(

A few questions come to mind reading this proposal:

  1. Does anyone have ideas on how we can keep documentation translatable if it is somehow part of the code? MediaWiki.org currently has this possibility.
  2. It is widely known and documented (pun intended) that developers are not great writers, nor ideal people to train wiki administrators and wiki operators. How will non-developers be able to contribute, especially as we move away from the wiki engine as a documentation mechanism?

Comment 3 addresses some more issues that will not be resolved by throwing away our current documentation, however incomplete and sometimes out of date, and starting something new. And before we can start something new, we will probably have to bikeshed on it for a few years.

This issue could be WONTFIXed, if you ask me.

Take an example from Nike's slogan, and reserve time in your development cycles to actually allow others to learn what you already know, be it as a written tutorial, properly prepared web cast, documentation page, or whatever: Just do it!

Perhaps too hard to start with MediaWiki core, but would be nice to get that with extensions, so MediaWiki pages could almost be automatically generated, and a step towards bug 26992.

Ok, I have removed this proposal as a GSOC 2013 candidate. I still believe it is worth discussing the proposal, though.

"Develop clear documentation and APIs to enable developers to create applications that work easily with MediaWiki" is a top WMF priority according to
http://strategy.wikimedia.org/wiki/Wikimedia_Movement_Strategic_Plan_Summary/Stabilize_Infrastructure

API documentation is usually written in the source code. "Keep documentation close to the code" is a general principle of good open source software development. This request is about "system documentation", tutorials et al are different beats, out of scope here.

About L10n, I'm sure we are not the first project with this problem. It is a convention to document source code in English. Localization of source code exists but afaik it focuses on UI strings, not docs (but I could be wrong). If the docs were exported to MediaWiki pages, then you could have your translation point there, using the same workflow being used e.g. at mediawiki.org. Whether the English / source page is updated manually or by a sync with source code shouldn't be relevant to translators. They would get the same notifications when new updates are available, and the UI of those wiki pages would have the same language bar we have for manually updated pages.

Of course a main point to argue is whether we can just keep using https://en.wikipedia.org/wiki/Doxygen and integrate https://doc.wikimedia.org/ better with mediawiki.org.

And I agree with hashar that having good docs (in whatever form) is the first priority, otherwise the doc tools are just pointless. This is why I dug up that URL reminding us what our top priorities are until 2015.

Has there been any progress on this proposal since March?

(In reply to comment #3)

This project should NOT be handled as a GSOC 2013 project. It is prone to
failure. We have such a debt on this subject that there is no way a new comer
will solve it or even come close to an idea of a solution in less than 8
weeks.

Is there a portion of this task that could be proposed as a GSoC / OPW project?

(In reply to comment #7)

Is there a portion of this task that could be proposed as a GSoC / OPW
project?

Not really. It would require a cultural change among developers, for example rejecting patches which do not come with inline documentation / examples. I can't see us enforcing documentation writing anytime soon since most of our code is undocumented anyway. And of course, we lack the resources to document our existing code.
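To make "patches which do not come with inline documentation" concrete, a very rough heuristic could look like the sketch below. It is an assumption-laden illustration, not an existing lint rule: it only checks whether the last non-blank line before a named PHP function declaration closes a comment, so it accepts any `/* ... */` comment and knows nothing about the quality of the text. The function name `undocumented_functions` is made up for this example.

```python
import re

# Matches named PHP function declarations with optional modifiers.
# Anonymous functions ("function (...)") are deliberately not matched.
FUNCTION_RE = re.compile(
    r"^\s*(?:(?:public|protected|private|static|final|abstract)\s+)*"
    r"function\s+&?(\w+)"
)


def undocumented_functions(php_source):
    """Return names of functions not directly preceded by a closing comment."""
    missing = []
    previous = ""  # last non-blank line seen so far, stripped
    for line in php_source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = FUNCTION_RE.match(line)
        if match and not previous.endswith("*/"):
            missing.append(match.group(1))
        previous = stripped
    return missing
```

A reviewer or CI job could run this over the files touched by a patch and list the names it returns, leaving the decision to reject to humans.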

(In reply to comment #8)

(In reply to comment #7)

Is there a portion of this task that could be proposed as a GSoC / OPW
project?

Not really. It would require a cultural change among developers, for example
rejecting patches which do not come with inline documentation / examples. I
can't see us enforcing writing documentation anytime soon since most of our
code is undocumented anyway. And of course, we lack resource to document our
existing code.

Is this simply a WONTFIX, then?

vladjohn2013 wrote:

Hi, this project is still listed at https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#System_documentation_integrated_in_source_code

Should this project be still listed in that page? If not, please remove it. If it still makes sense, then it could be moved to the "Featured projects" section if it has community support and mentors.

Wikimedia will apply to Google Summer of Code and Outreachy on Tuesday, February 17. If you want this task to become a featured project idea, please follow these instructions.

@Spage, your opinion about this project in the context of GSoC / Outreachy is welcomed.

Yes, we want to do this. Despite hashar's gloom, we do have documentation in source code: core/docs has some text files, extensions/Wikibase/docs has .wiki files, many extensions have README.md files, etc.

I think a GSOC student could work on a proof of concept to publish some of these to pages on mediawiki.org and/or incorporate them in the existing doxygen.

My bias is for the former. There are a ton of issues to work out for this:

  • is it push (part of make doc CI step) or pull (page mentions some file in git, and an extension or bot transcludes it)? (Answer: yes :) )
  • if we allow fancy markup, how do devs preview? (Answer: Parsoid grunt watch buzzword integration!)
  • how to allow on-wiki editing and annotation. (Answer: marked section of wiki page.)

But starting small with a basic goal of "Publish extensions/Wikibase/docs/*.wiki files to mw.org" seems doable in a GSOC timeframe, though with big risks of failure.
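The "starting small" goal could begin with something as simple as the sketch below, which only plans what would be published and does not talk to any wiki. The function name `plan_wiki_pages` and the title scheme (prefix plus file name without extension) are assumptions for illustration; the actual upload step, whether done by a bot or an extension, is left out on purpose.

```python
from pathlib import Path


def plan_wiki_pages(docs_dir, title_prefix):
    """Map each *.wiki file in docs_dir to a (hypothetical) wiki page title.

    Returns a dict of page title -> wikitext, e.g.
    "Extension:Wikibase/docs/topics" -> contents of topics.wiki.
    """
    pages = {}
    for path in sorted(Path(docs_dir).glob("*.wiki")):
        title = f"{title_prefix}/{path.stem}"
        pages[title] = path.read_text(encoding="utf-8")
    return pages
```

Called as `plan_wiki_pages("extensions/Wikibase/docs", "Extension:Wikibase/docs")`, it would return one entry per .wiki file, which a separate push or pull mechanism could then publish.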

I have lots of ideas for this but my priority is #dev.wikimedia.org

@Spage, could we trick you into becoming a primary/co-mentor for this GSoC/Outreachy project?

Qgil lowered the priority of this task from Low to Lowest. Feb 19 2015, 4:38 PM

@Spage and I discussed this project yesterday (being both involved in #engineering-community). Even if Spage likes the idea a lot, we think it is better for us to focus all our energy on #dev.wikimedia.org. Currently there are many moving targets around technical documentation. This project might make more sense in the next round, when everything should be more consolidated.

Let's keep this task in Possible-Tech-Projects, but back to backlog until the next round.

@Qgil I would just wont fix this idea as a volunteer/gsoc whatever event. That needs to be led by Engineering management and made a Wikimedia Engineering quarterly goal or something like that.

I think a GSOC student could work on a proof of concept to publish some of these [bits of documentation in source code] to pages on mediawiki.org

My proposal to do this piece is T91626: Technology to transclude git content into wiki pages. It seems to me that Aaron's proposal

"inline comments, README files, and special documentation files could exist in the source code but be exported into a formatted, navigable system (maybe wiki pages or maybe something else)... better than doxygen"

is big and complex (what is this "system"? how will developers learn it?) if it's not export to wiki pages.

Qgil claimed this task.

I created this project idea a couple of years ago as a result of a relatively casual comment from Aaron in some relatively casual context. I am declining this task. At this point it is clear that T91626 is a better place to start.

@Qgil, I am not sure I understand the reasoning to close this. Blocking it on T91626 seems more appropriate to me, particularly since you mention that it is "a better place to start" (emphasis mine).

@waldyrious, the reasoning to close this task is explained by the many comments above from key contributors proposing to decline it and bet on other approaches.

@waldyrious, the reasoning to close this task is explained by the many comments above from key contributors proposing to decline it and bet on other approaches.

Sorry, I re-read all the comments and I still can only see @hashar's comments as fitting your description. And even he clarified "I would just wont fix this idea as a volunteer/gsoc whatever event" to make it clear he wasn't against the general idea itself.

I can only imagine I'm interpreting this differently than you. Could you clarify your reasoning, maybe quote the key passages you're alluding to, or otherwise summarize your perspective of the discussion?

Here's my perspective, for reference: as I see it, this idea (having documentation in the source code, with a good way to consume it online) has a lot of merit, and like @Spage, I am very enthusiastic about it. Hashar's assessment that it's too complex in scope for a beginner contributor to tackle seems right to me, and your own assessment that it might be better to focus the energy on other tech docs projects for now seems reasonable. At best, then, I'd mark this as the equivalent of bugzilla's LATER (e.g. lowest priority or something like that).

Spage added a subscriber: Anomie.

I talked about this with @hashar and @Anomie (notes).
Hashar strongly advocated doing this using doxygen rather than T91626: Technology to transclude git content into wiki pages, and Krinkle broadly agrees. Aaron Schulz's own includes/filebackend/README generates doc in doxygen on doc.wikimedia.org.

doxygen supports Markdown formatting, so probably developers can use .md files for lightly-formatted documentation. There's more to figure out, but I Been Bold and added a Text documents section to Coding Conventions describing this approach.

Reopening based on discussions. I'm not sure where to start adding documentation files to doxygen; [T91626] has an overview of them all.

I think it would be useful to make a distinction between reference-type documentation (which should be embedded in code comments and extracted using doxygen-like tools) and manual/tutorial type documentation (which is more prose-like and less tied to specific code files). Here's a proposed nomenclature from a draft document I've been (sporadically) working on for a while now. I believe our documentation browser (doc.wikimedia.org) should seamlessly integrate both generated documentation (from structured comments in code files) and prose-like documents. I'd argue that the latter should live on mediawiki.org simply because that lowers the barrier for contributors to improve them.

Alternatively, such documents could be contained in the code repositories, as suggested above, but in a way that makes their editing as convenient as on-wiki docs. A very good example of such an integration is the documentation for the Julia programming language (http://docs.julialang.org/). Notice how all pages have an "edit on github" link in the top right, which allows live in-browser editing requiring only a github account -- no git installation, no repository cloning/updating (or god forbid, merging/rebasing), no git-review hook, and indeed no command-line workflow at all. They use Sphinx for the documentation generation from the rst source files, and are planning to do the documentation building step automatically for every pull request to the documentation files. If we could cook up a similarly contributor-friendly workflow (I'd prefer .md rather than .rst, though), then by all means such documentation could live in the code repository; otherwise, I strongly suggest that it's kept on-wiki. Note also how the top section in their documentation helpfully provides pointers to both manual-type and reference-type ("standard library") documentation. For the time being they are still maintaining the latter manually, but they are working on a system to extract documentation from structured code comments.

So the question I'd pose is: would it be easier/better for us to (1) set up a system like the one described above, i.e. keep everything in the code repositories but with an easy, web-based workflow for contributing to the manual-type docs, or (2) figure out a way to integrate the on-wiki documentation with reference-like auto-generated documentation from structured code comments? The alternative of simply consolidating long-form documentation into text/md files in the repositories, requiring the usual developer workflow to contribute to, would, as I see it, just exacerbate our already existing problems with maintaining documentation, since that'd raise the contribution barrier.

I believe our documentation browser (doc.wikimedia.org) should seamlessly integrate both generated documentation (from structured comments in code files) and prose-like documents. I'd argue that the latter should live on mediawiki.org simply because that lowers the barrier for contributors to improve them.

But then the source code patch that changes code behavior can't include a fix to the docs. That's the problem we have right now.

Alternatively, such documents could be contained in the code repositories, as suggested above, but in a way that makes their editing as convenient as on-wiki docs. A very good example of such an integration is the documentation for the Julia programming language (http://docs.julialang.org/).

Nice. We already have the doc building step on each merge to core, e.g. https://gerrit.wikimedia.org/r/#/c/199920/ triggered mediawiki-core-doxygen-publish.

If we could cook up a similarly contributor-friendly workflow (I'd prefer .md rather than .rst, though), then by all means such documentation could live in the code repository;

Yes, I'd like an easier workflow for changes to comments in source code and to documentation in git. Neither gerrit nor Phabricator's differential lets you edit in-place to create or modify a change. WMF projects mirror to github; I'm trying to determine the status of T37497.

otherwise, I strongly suggest that it's kept on-wiki.

I'm pragmatic and weakly principled. I am not going to stand in the way of a developer putting documentation in git, either in source code comments or e.g. in an overview.md file. I recently added the approach that already works for the latter in Coding conventions, without mandating it or even providing guidelines when to do it.

So the question I'd pose is: would it be easier/better for us to (1) set up a system like the one described above, i.e. keep everything in the code repositories but with an easy, web-based workflow for contributing to the manual-type docs, or (2) figure out a way to integrate the on-wiki documentation with reference-like auto-generated documentation from structured code comments?

The simple approach to (2) is links to doc.wikimedia.org. I proposed T91626 as an additional tool for (2) but it hasn't gained support.
(1) can't replace a documentation wiki unless we build additional generated documentation support. I think developers envision this happening at doc.wikimedia.org, but I don't understand it!
It's not a binary choice. As you say, manual/tutorial type documentation is different from reference.

The alternative of simply consolidating long-form documentation into text/md files in the repositories, requiring usual developer workflow to contribute to, would, as I see it, just exacerbate our already existing problems with maintaining documentation, since that'd raise the contribution barrier.

You get different problems :) The huge win is code changes and doc updates can be atomic -- you eliminate a separate doc maintenance step because reviewers should reject any code change that leaves the doc incorrect. But there are challenges not only in easing contribution, but also in: explaining behavior changes between releases, localizing, searching, etc.

I appreciate your thoughts. I think what we should do is consider each systematic change to the status quo on its merits. See the subtasks of T93026: remove wiki documentation that duplicates generated documentation (tracking).

Hi @Spage. I agree with pretty much everything you say. Provided that T37497 is resolved satisfactorily, and that both reference and overview-type documentation can be rendered nicely into a browsable interface, I would have no objections to migrate as much documentation from the wiki as possible to live with the code it corresponds to.

I'm pragmatic and weakly principled. I am not going to stand in the way of a developer putting documentation in git, either in source code comments or e.g. in an overview.md file. I recently added the approach that already works for the latter in Coding conventions, without mandating it or even providing guidelines when to do it.

I didn't mean that in a prescriptive sense. If anyone (developers or not) wants to improve documentation, it would be silly to stand in the way. What I meant is that I wouldn't want us (as a community) to promote a way of documentation that would make casual contributions (the wiki way!) significantly harder.

Qgil removed Qgil as the assignee of this task. Mar 27 2015, 1:14 PM
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:14 AM