Explain the wiki syntax in detailed EBNF
Closed, Declined (Public)

Description

Author: xmlizer

Description:
It is important to start a project that gives the exact EBNF syntax, covering all
the subtleties of the wiki syntax.


Version: unspecified
Severity: enhancement

bzimport added a project: MediaWiki-Documentation. Via Conduit, Nov 21 2014, 6:42 PM
bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz7.
bzimport created this task. Via Legacy, Aug 10 2004, 4:36 PM
bzimport added a comment. Via Conduit, Aug 23 2004, 9:23 PM

leercontainer-bugzilla wrote:

(In reply to comment #0)

It is important to start a project that gives the exact EBNF syntax, covering all
the subtleties of the wiki syntax.

Why don't you start a meta page with the basic framework?

bzimport added a comment. Via Conduit, Aug 31 2004, 1:34 PM

alpeterson wrote:

[[meta:EBNF]]

http://www.garshol.priv.no/download/text/bnf.html

http://www.cl.cam.ac.uk/~mgk25/iso-ebnf.html

(I didn't know what ebnf stood for...)

bzimport added a comment. Via Conduit, Sep 1 2004, 12:24 AM

timwi wrote:

I boggled my mind over this recently. What exactly would the [E]BNF for Wiki
Syntax describe?

In theoretical computer science, formal grammars are used to generate a language
(a set of strings). Some grammars can be turned into a characteristic algorithm,
i.e. one that determines if a given string is in the language. The algorithm is
said to "accept" or "reject" input strings. However, MediaWiki is supposed to
accept *ALL* strings: all strings are valid inputs and are turned into some
valid XHTML.
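
To make the contrast concrete, here is a minimal Python sketch. Both functions and the toy language (properly nested `[[` and `]]`) are assumptions made up for illustration; neither is MediaWiki code. The first is a recognizer in the formal-language sense and can reject input, the second is a total function that maps every string to some output.

```python
import html


def accepts_balanced(text: str) -> bool:
    """Recognizer for a toy language: strings whose '[[' and ']]' nest properly."""
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "[[":
            depth += 1
            i += 2
        elif pair == "]]":
            if depth == 0:
                return False  # closer without an opener: reject
            depth -= 1
            i += 2
        else:
            i += 1
    return depth == 0


def render_total(text: str) -> str:
    """Total function: every input yields some output, nothing is rejected."""
    return "<p>" + html.escape(text) + "</p>"
```

A string such as `[[a` is rejected by the recognizer but still gets rendered by the total function, which is the situation a wikitext grammar would have to describe.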

In practice, grammars are used to write parsers such as the one I'm currently
working on. Here, the grammar tells the parser what to do - or more precisely,
the production rules do, and as such, they sort of set out the semantics of the
mark-up. But how do you clarify semantics without the production rules?

Makes you wonder about stuff :)

bzimport added a comment. Via Conduit, Sep 1 2004, 12:25 AM

timwi wrote:

Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
only. The MediaWiki syntax for lists is not context-free however. I am
circumventing this in my parser by using a post-processing step, but if you're
only writing BNF, you can't do that...
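
A rough idea of what such a post-processing step could look like, as a Python sketch. This is not the parser mentioned above; the function name `list_events` and the event tuples are hypothetical. The point it illustrates is that the nesting depends on comparing each line's `*`/`#` prefix with the prefix of the previous line, which is done here after the lines have been split rather than in the grammar itself.

```python
def list_events(lines):
    """Turn lines such as '* a', '** b', '# c' into open/item/close events."""
    tags = {"*": "ul", "#": "ol"}
    events = []
    open_prefix = ""  # prefix of the lists that are currently open
    for line in lines:
        body = line.lstrip("*#")
        prefix = line[:len(line) - len(body)]
        # Find how much of the currently open prefix this line shares.
        common = 0
        while (common < len(open_prefix) and common < len(prefix)
               and open_prefix[common] == prefix[common]):
            common += 1
        # Close lists that this line no longer belongs to, innermost first.
        for ch in reversed(open_prefix[common:]):
            events.append(("close", tags[ch]))
        # Open the lists this line newly introduces.
        for ch in prefix[common:]:
            events.append(("open", tags[ch]))
        open_prefix = prefix
        events.append(("item" if prefix else "text", body.strip()))
    # Close whatever is still open at the end of the input.
    for ch in reversed(open_prefix):
        events.append(("close", tags[ch]))
    return events
```

For the input `["* a", "** b", "# c"]` this opens two nested `ul` lists, closes both when the `#` line arrives, and then opens and closes an `ol`.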

bzimport added a comment. Via Conduit, Oct 4 2004, 4:22 AM

wmahan_04 wrote:

(In reply to comment #4)

Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
only. The MediaWiki syntax for lists is not context-free however. I am
circumventing this in my parser by using a post-processing step, but if you're
only writing BNF, you can't do that...

In light of that, is this bug WONTFIX? Or is it possible to describe wiki syntax
in some sort of pseudo-BNF, short of duplicating your flex/bison parser?

bzimport added a comment. Via Conduit, Dec 15 2005, 11:10 PM

robchur wrote:

This bug is a "go write it on Meta" fix. ;-)

brion added a comment. Via Conduit, Dec 15 2005, 11:13 PM

Not sure I understand why this was closed.
A formal grammar is something we really need (and it may require
fixes to the grammar as well ;)

HappyDog added a comment. Via Conduit, Jul 15 2006, 1:15 PM

Some work has been going on at mediawiki.org
(http://www.mediawiki.org/wiki/Markup_spec and
http://www.mediawiki.org/wiki/Markup_spec/BNF/). It's early days and any input
would be appreciated.

bzimport added a comment. Via Conduit, Jan 22 2007, 2:35 PM

antoine.musso wrote:

Another effort, on Meta:
http://meta.wikimedia.org/wiki/Wikitext_Metasyntax

tstarling added a comment. Via Conduit, Feb 17 2008, 7:24 AM

A hopefully complete representation of the MW 1.12 preprocessor in ABNF is at:

http://www.mediawiki.org/wiki/Preprocessor_ABNF

tstarling added a comment. Via Conduit, Feb 17 2008, 11:55 AM

Please note that the set of production rules alone does not allow you to derive the correct parse tree from a given input text. Wikitext is ambiguous in lots of complex and interesting ways. The disambiguation rules need to be specified along with the grammar.

I found the preprocessor ABNF project an enlightening exercise. You can say a lot about the syntax in a short space. And while I attempted to explain the disambiguation process, I know of no way to do this rigorously, without resorting to writing algorithms.
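
As a toy illustration of why production rules alone do not fix the parse (this example is not taken from the Preprocessor ABNF page), consider a hypothetical two-rule grammar in which `''` is an italic marker and `'''` is a bold marker. The Python sketch below enumerates every way a run of apostrophes can be derived from those two rules; the function name is made up for the example.

```python
def tokenizations(run_length):
    """All ways to split a run of apostrophes into italic (2) and bold (3) markers."""
    if run_length == 0:
        return [[]]
    results = []
    for name, size in (("italic", 2), ("bold", 3)):
        if run_length >= size:
            for rest in tokenizations(run_length - size):
                results.append([name] + rest)
    return results
```

A run of five apostrophes has two derivations, italic then bold, or bold then italic. The grammar permits both, so something outside the grammar has to say which one wins.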

AzaToth added a comment. Via Conduit, Mar 19 2008, 6:44 PM

It seems that with http://www.mediawiki.org/wiki/Preprocessor_ABNF this bug is fixed.

tstarling added a comment. Via Conduit, Mar 20 2008, 12:28 AM

No, it is not fixed. That page only describes a tiny portion of parser behaviour.

GWicke added a comment. Via Conduit, Aug 9 2012, 5:40 PM

We have a fairly complete PEG tokenizer grammar in Parsoid (http://www.mediawiki.org/wiki/Parsoid), which describes the context-free portions of wikitext. Context-sensitive portions are handled in token stream transformers. The PEG parse tree is flattened to a token stream so that we can support unbalanced template expansions, and finally converted into a DOM using a tree builder library according to the error recovery algorithms described in the HTML5 spec.
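
The overall shape of such a pipeline can be sketched in a few lines of Python. This is only an illustration of the three stages under simplifying assumptions, not Parsoid code: the function names are hypothetical, a regular expression stands in for the PEG grammar, the only markup is `[[` and `]]`, and the "error recovery" is a much cruder stand-in for the HTML5 algorithms.

```python
import re


def tokenize(text):
    """Stage 1: produce a flat token stream; no nesting is decided here."""
    tokens = []
    for part in re.split(r"(\[\[|\]\])", text):
        if part == "[[":
            tokens.append(("open", part))
        elif part == "]]":
            tokens.append(("close", part))
        elif part:
            tokens.append(("text", part))
    return tokens


def demote_stray_closers(tokens):
    """Stage 2: a stream transformer; unmatched closers become plain text."""
    depth = 0
    for kind, value in tokens:
        if kind == "open":
            depth += 1
        elif kind == "close":
            if depth == 0:
                kind = "text"
            else:
                depth -= 1
        yield (kind, value)


def build_tree(tokens):
    """Stage 3: build nested lists; anything still open at the end is closed implicitly."""
    root = []
    stack = [root]
    for kind, value in tokens:
        if kind == "open":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif kind == "close":
            stack.pop()
        else:
            stack[-1].append(value)
    return root


def parse(text):
    return build_tree(demote_stray_closers(tokenize(text)))
```

Because the middle stage works on a flat stream, unbalanced input such as `]]x[[y` still produces a tree: the stray closer is kept as text and the unclosed opener is closed at the end of input.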

The grammar is interspersed with actions and uses syntactic scope flags to compress the grammar productions a bit, so it is not the most readable grammar ever. Unrolling productions for all scope permutations might not help that much either, as this would increase the size of the grammar a lot.

GWicke added a comment. Via Conduit, May 10 2013, 11:49 PM

Describing all of WikiText in EBNF is simply impossible, as parts of it are context-sensitive. Closing as wontfix for that reason.

Aklapper added a project: Documentation. Via Web, Jan 7 2015, 3:37 PM
