
Give text files in Git correct extensions
Closed, Resolved (Public)

Description

We have files in Git with no extension that are actually wikitext and others that are plain text. Other files end in .txt (which Doxygen garbles; see T106116: The Doxygen version in CI parses README files as garbled C); some of those are actually wikitext, and others are structured text.

Per Manual:Coding conventions#Text files (which Spage wrote), actual wikitext files should end with .wiki (.mediawiki also works on GitHub, but we prefer the shorter form), and most plain text files could easily be massaged into Markdown files that should end with .md. (If a text file is neither Markdown nor wikitext, then it's better to give it a .txt extension, but it's not worth renaming 132 extensions' COPYING files to COPYING.txt.)

I think it's pretty safe to git mv files in core to give them the right extension, hence tagging this easy. The documentation is only regenerated on +2, so you should test your changes locally: install the same version of Doxygen that jenkins-bot's mediawiki-core-doxygen-publish task uses (sample run: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen-publish/12245/console), then rebuild core's documentation with maintenance/mwdocgen.php.
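As a rough sketch of the rename step described above (not an official workflow), the helper below only prints the `git mv` commands for a list of old/new path pairs, so they can be reviewed before anything is changed. The function name `plan_renames` and the input format are assumptions made for illustration; the two example renames are ones mentioned in this task.

```shell
#!/bin/sh
# plan_renames: hypothetical helper, not part of MediaWiki.
# Reads "old new" path pairs from stdin (paths without whitespace) and
# prints one git mv command per pair. It does not rename anything itself.
plan_renames() {
    while read -r old new; do
        # Skip blank or incomplete lines.
        if [ -z "$old" ] || [ -z "$new" ]; then
            continue
        fi
        printf 'git mv %s %s\n' "$old" "$new"
    done
}

# Example input: two renames discussed in this task.
plan_renames <<'EOF'
README README.md
skins/README skins/README.md
EOF
```

After reviewing the printed commands, they could be run from the root of a core checkout, followed by `php maintenance/mwdocgen.php` to rebuild the documentation locally.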

Event Timeline

Spage raised the priority of this task to Low.
Spage updated the task description.
Spage added subscribers: Spage, hashar, Krinkle.
Restricted Application added a project: Documentation. Oct 26 2015, 11:19 PM
Restricted Application added a subscriber: Aklapper.

It's traditional in Unix land that certain files have names without extensions, like README and COPYING.

Seems like the real bug here is that doxygen isn't assuming that extensionless files are text, as would be sane.

@Bawolff, are we only taking care of the doc/ folder here, or are we looking at files in the whole MediaWiki code base? I understand what is required, and what you said above is also correct: there are some files that are named without an extension, such as Makefile, INSTALL, HISTORY and more.

This is what I get:

$ find mediawiki/core -type f -and -not -name '*.*' |grep -v /vendor/|grep -v '/.git'
./COPYING
./CREDITS
./docs/code-coverage/README
./docs/html/README
./docs/kss/Makefile
./docs/php-memcached/ChangeLog
./docs/php-memcached/Documentation
./docs/php-memcached/README
./docs/README
./extensions/README
./FAQ
./Gemfile
./HISTORY
./images/README
./includes/filebackend/README
./includes/filerepo/README
./includes/jobqueue/README
./includes/libs/README
./includes/utils/README
./INSTALL
./maintenance/benchmarks/README
./maintenance/dev/README
./maintenance/Doxyfile
./maintenance/hiphop/run-server
./maintenance/language/zhtable/Makefile
./maintenance/language/zhtable/README
./maintenance/Makefile
./maintenance/mwjsduck-gen
./maintenance/README
./maintenance/storage/make-blobs
./README
./resources/assets/file-type-icons/COPYING
./resources/lib/jquery.chosen/LICENSE
./resources/lib/jquery.i18n/CREDITS
./resources/lib/jquery.i18n/GPL-LICENSE
./resources/lib/jquery.i18n/MIT-LICENSE
./resources/lib/jquery.ui/themes/smoothness/PATCHES
./resources/lib/moment/LICENSE
./resources/lib/mustache/LICENSE
./resources/src/mediawiki.toolbar/images/ksh/LICENSE
./resources/src/mediawiki.toolbar/images/ru/LICENSE
./run_ctags
./serialized/Makefile
./skins/README
./tests/parser/README
./tests/phpunit/data/media/README
./tests/phpunit/data/xmp/README
./tests/phpunit/includes/GlobalFunctions/README
./tests/phpunit/includes/phpunit/LICENSE
./tests/phpunit/includes/phpunit/README
./tests/phpunit/Makefile
./tests/phpunit/README
./tests/phpunit/TODO
./UPGRADE
$ 
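A variant of the command above, offered only as a sketch: it lets `find` skip the `.git` and `vendor` trees itself instead of filtering the output with `grep`, and sorts the result in a fixed order. The function name `list_extensionless` is made up for this example.

```shell
#!/bin/sh
# list_extensionless: hypothetical helper name.
# Prints regular files under the given directory (default: .) whose names
# contain no dot, skipping anything named .git or vendor and everything
# below it. Output is sorted in C-locale (byte) order.
list_extensionless() {
    find "${1:-.}" \( -name .git -o -name vendor \) -prune \
        -o -type f ! -name '*.*' -print | LC_ALL=C sort
}
```

For example, `list_extensionless mediawiki/core` would list the candidates in a core checkout at that path. Scripts and Makefiles still show up in the output, so the list needs manual review before any rename.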

Some are scripts, but most of the others are actual plain-text files that sometimes happen to use wikitext. I am not a fan of renaming them all with a .mediawiki or .md extension. It seems this is meant to help with T106116 (The Doxygen version in CI parses README files as garbled C), for which you already found a solution: make files with no extension default to Markdown:

EXTENSION_MAPPING = ".no_extension=md"

So I would just do that in the Doxyfile configuration.

Note that at least /CREDITS must be wikitext, since it is used for https://www.mediawiki.org/wiki/Special:Version/Credits . Others might need to be wikitext as well, but I have not investigated.

Change 249307 had a related patch set uploaded (by Spage):
Rename to CREDITS.mediawiki, add symlink

https://gerrit.wikimedia.org/r/249307

> [list of files with no extension] Some are scripts, but most of the others are actual plain-text files that sometimes happen to use wikitext. I am not a fan of renaming them all with a .mediawiki or .md extension.

But without an extension, Diffusion and GitHub will continue to display these files as preformatted lines of code (here is skins/README in Diffusion and in GitHub). If the file looks good as Markdown or is wikitext, please let developers say so! For example, here is skins/README renamed to README.md; @hashar, would you -1 that rename?

> It seems this is meant to help with T106116 (The Doxygen version in CI parses README files as garbled C), for which you already found a solution: make files with no extension default to Markdown:
>
> EXTENSION_MAPPING = ".no_extension=md"
>
> So I would just do that in the Doxyfile configuration.

Sure, let's try that. But we still need to give text files the right extension for the reason above.

> Note that at least /CREDITS must be wikitext, since it is used for https://www.mediawiki.org/wiki/Special:Version/Credits . Others might need to be wikitext as well, but I have not investigated.

@Krenair added "Symlink README.mediawiki to README so Github renders it as wikitext.", but that fix no longer seems to work (see https://github.com/wikimedia/mediawiki/blob/master/README.mediawiki ) :-( We could instead make /CREDITS a symlink to CREDITS.mediawiki (I did so in https://gerrit.wikimedia.org/r/249307 ).
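For illustration only, the rename-plus-symlink idea could look like the sketch below. The helper name `rename_with_symlink` is hypothetical, and plain `mv` is used so the sketch runs outside a repository; in a Git checkout one would presumably use `git mv` for the rename and `git add` the new symlink.

```shell
#!/bin/sh
# rename_with_symlink: hypothetical helper name.
# Renames OLD to NEW (both in the same directory) and leaves a relative
# symlink at OLD pointing to NEW, so anything that opens OLD still finds
# the file.
rename_with_symlink() {
    old=$1
    new=$2
    mv -- "$old" "$new" || return 1
    # The link target is just the new file name; it is resolved relative
    # to the directory that contains the symlink.
    ln -s -- "$(basename -- "$new")" "$old"
}
```

Called as `rename_with_symlink CREDITS CREDITS.mediawiki`, this leaves CREDITS as a symlink to CREDITS.mediawiki. Whether each consumer of /CREDITS follows symlinks is not checked by this sketch.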

> It's traditional in Unix land that certain files have names without extensions, like README and COPYING.

True, but README has clearly been superseded by the README.md convention in GitHub and Diffusion, and all the guidelines I can find mention both, e.g.: "It is also recommended that you include a file called COPYING (or COPYING.txt) containing the CC0 legalcode as plain text."
The extension-less tradition has always been hostile to Windows users, and we have newer conventions where we identify the markup in text files so that computers can display them nicely.

> Seems like the real bug here is that doxygen isn't assuming that extensionless files are text, as would be sane.

But what kind of text? Something to dump out in a <pre> tag (good if it has code snippets or tabular info), or in a <poem> parser tag/HTML div with white-space: pre to preserve whitespace, or (as hashar proposes) something to run through a Markdown processor? I can't see a way to tell Doxygen to do either of the first two, so I think treating these files as .md, as hashar proposes, is the best/only solution.

GitHub handles both .wiki and .mediawiki as MediaWiki wikitext, but diffusion treats .wiki as raw code (sample); maybe we should configure phabricator.wikimedia.org to treat .wiki as MediaWiki wikitext as well.

> Note that at least /CREDITS must be wikitext, since it is used for https://www.mediawiki.org/wiki/Special:Version/Credits . Others might need to be wikitext as well, but I have not investigated.

> @Krenair added "Symlink README.mediawiki to README so Github renders it as wikitext.", but that fix no longer seems to work (see https://github.com/wikimedia/mediawiki/blob/master/README.mediawiki ) :-( We could instead make /CREDITS a symlink to CREDITS.mediawiki (I did so in https://gerrit.wikimedia.org/r/249307 ).

No, I think it works; the point is that you browse to https://github.com/wikimedia/mediawiki and see the README.mediawiki file rendered there instead.

Spage updated the task description. Nov 5 2015, 11:18 PM
Spage set Security to None.

Change 252373 had a related patch set uploaded (by Spage):
Rename README to README.md

https://gerrit.wikimedia.org/r/252373

Change 252373 merged by jenkins-bot:
Rename README to README.md

https://gerrit.wikimedia.org/r/252373

Change 249307 abandoned by Hashar:
Rename to CREDITS.mediawiki, add symlink

Reason:
Not much need for that given my previous comment.

https://gerrit.wikimedia.org/r/249307

Framawiki moved this task from Backlog to Doing on the good first task board. Dec 2 2017, 1:33 PM
Restricted Application added a subscriber: TerraCodes. Dec 2 2017, 1:33 PM
Annysah01 added a subscriber: Annysah01.

@Spage I would love to work on this

Given the discussion in T116690#1760152 etc., it's unclear to me what is wanted here and what the best way forward is.
Hence removing good first task.

Change Doxygen config to EXTENSION_MAPPING = ".no_extension=md" ?
Rename some files? Doesn't sound like a good idea according to https://gerrit.wikimedia.org/r/c/mediawiki/core/+/249307
Do anything / nothing at all and decline this task?

Krinkle closed this task as Resolved. Oct 22 2020, 10:51 PM

All relevant files have afaik already been renamed to their proper extension. The issues preventing that have since been resolved.