We need to decide what the syntax for defining a chart looks like. We also need to decide how charts are embedded in articles.
Provisional recommendation:
via internal notes https://docs.google.com/document/d/1Wu7dlDmLhReglmh9pNXrt6YvYGHIp46mgyyVQrt7YQc/edit
A {{#chart:}} parser function will accept two parameters that name pages in the Data: namespace: one for the format definition (a Data:....chart page) and an optional one for a tabular data page (Data:....tab), so that the same definition can be reused with many data sets:
{{#chart:format=Weather monthly history.chart
|data=ncei.noaa.gov/weather/Detroit.tab}}
If only a single data set is used and it is defined in the format description, the data parameter can be omitted from the invocation.
{{#chart:}} parser function parameters:
- format specifies the Data:.chart page with the format definition. If it is left out, a default line graph will be emitted with labels from the tabular data.
- data specifies the Data:.tab page with the source data. In the future this could point to other subtypes of Data: pages such as an encapsulated SPARQL query to Wikidata, hence using “data” rather than “table”. If it is left out, a default data file specified in the chart format will be rendered.
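The fallback rules for the two parameters can be sketched in Python as follows. This is only an illustration of the behaviour described above, not the extension's implementation: `resolve_chart`, `load_format`, and `DEFAULT_FORMAT` are hypothetical names, and the shape of the default line graph is an assumption.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the built-in default: a plain line graph.
DEFAULT_FORMAT = {"version": 1, "type": "line"}


def resolve_chart(
    params: dict[str, str],
    load_format: Callable[[str], dict],
) -> tuple[dict, Optional[str]]:
    """Return (format definition, data page title) for one invocation."""
    format_title = params.get("format")
    # No format= given: fall back to the default line graph.
    definition = load_format(format_title) if format_title else dict(DEFAULT_FORMAT)
    # An explicit data= parameter overrides the "source" named in the format.
    data_title = params.get("data") or definition.get("source")
    return definition, data_title
```

With a format page whose `source` is set, `resolve_chart({"format": "..."}, loader)` yields that source as the data page, while passing `data=` as well selects the overriding page instead.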
Tabular data structure is documented at https://www.mediawiki.org/wiki/Help:Tabular_Data
Recommend a similar JSON layout for the Data:.chart pages and their localizable text strings.
Here’s a sample election chart with its inline data definitions taken out and template parameters reworked a bit; note that the xAxis* and yAxis* params have been moved into sub-objects. Text has been extended to be localizable in the same format used for Data:.tab pages, which will fill out some column titles by default.
Invocation:
{{#chart:format=1993 Canadian federal elections.chart}}
Format description in Data:.chart page (see full list of params to come):
{
  "version": 1,
  "type": "line",
  "width": 350,
  "height": 200,
  "xAxis": {
    "title": "",
    "angle": -40,
    "type": "date"
  },
  "yAxis": {
    "title": {
      "en": "%support",
      "fr": "%soutien"
    }
  },
  "legend": {
    "en": "Party",
    "fr": "Parti"
  },
  "interpolate": "basis",
  "showSymbols": true,
  "colors": [ "#9999FF", "#EA6D6A", "#F4A460", "#87CEFA", "#3CB371", "#FF00FF" ],
  // Types and column titles are specified in the .tab
  // you can override the data source via invocation params
  // to reuse the format on different data sets, but this one
  // will be used in previews of the format page or if you
  // don’t specify in invocation
  "source": "1993 Canadian federal election.tab"
}
The matching Data:.tab page would be:
{
  "license": "CC0-1.0",
  "description": {
    "en": "1993 Canadian federal election",
    "fr": "Élections fédérales canadiennes de 1993"
  },
  "schema": {
    "fields": [
      { "name": "date", "type": "string", "title": { "en": "Date", "fr": "Date" } },
      { "name": "pc", "type": "number", "title": { "en": "PC", "fr": "PC" } },
      { "name": "liberal", "type": "number", "title": { "en": "Liberal", "fr": "Libéral" } },
      { "name": "ndp", "type": "number", "title": { "en": "NDP", "fr": "NPD" } },
      { "name": "bq", "type": "number", "title": { "en": "BQ", "fr": "BQ" } },
      { "name": "reform", "type": "number", "title": { "en": "Reform", "fr": "Réform" } }
    ]
  },
  "data": [
    ["1993/09/9",35,37,8,8,10],
    ["1993/09/14",36,33,8,10,11],
    ["1993/09/20",35,35,6,11,11],
    ["1993/09/25",30,37,8,10,13],
    ["1993/09/26",31,36,7,11,13],
    ["1993/09/26",28,34,7,12,15],
    ["1993/09/30",25,39,6,12,17],
    ["1993/10/02",26,38,8,12,14],
    ["1993/10/08",22,37,8,12,18],
    ["1993/10/16",22,40,7,13,16],
    ["1993/10/19",21,39,6,14,17],
    ["1993/10/22",18,43,7,14,18],
    ["1993/10/22",16,44,7,12,19],
    ["1993/10/25",16.04,41.24,6.88,13.52,18.69]
  ]
}
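The "string/loc" values used in both pages (a plain string, or an object keyed by language code) could be resolved roughly like this. It is a minimal sketch: the function names are hypothetical, and both the fallback to English and the fallback from a missing title to the field name are assumptions rather than documented behaviour.

```python
from typing import Union

# A localizable value: either a plain string or a {language code: text} map.
LocalizedText = Union[str, dict[str, str]]


def localize(text: LocalizedText, lang: str, fallback: str = "en") -> str:
    """Pick the text for `lang`, falling back to `fallback`, then to ""."""
    if isinstance(text, str):
        return text
    if lang in text:
        return text[lang]
    return text.get(fallback, "")


def column_titles(tab: dict, lang: str) -> list[str]:
    """Default column titles for a chart, taken from the .tab schema."""
    fields = tab["schema"]["fields"]
    # Assumption: a field without a title is labelled with its name.
    return [localize(field.get("title", field["name"]), lang) for field in fields]
```

For the sample page above, `column_titles(tab, "fr")` would produce the French titles from the schema, so the chart format only needs to carry labels that the data page does not already provide.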
Format JSON summary
Based on the parameters for Module:Graph
- version - number: 1 for now; can be incremented for backward-incompatible changes in the future
- width - number: canvas size in CSS pixels
- height - number: canvas size in CSS pixels
- type - string: "line", "area", "bar"/"rect", "pie", "stackedline", "stackedarea", "stackedrect"
- interpolate - string: "monotone", "basis" etc (?)
- colors - array<string>: list of hex codes ("#123456"), one per data column
- xAxis - object: X axis config
  - title - string/loc (else empty?)
  - min - number (else auto)
  - max - number (else auto)
  - format - string: format strings (is this safe to expose or do it differently?)
  - angle - number: degrees to rotate the X axis off normal (else 0)
  - type - string: data type ("integer", "number", "date", "string")
  - grid - boolean: whether to show grid lines on this axis
- yAxis - object: Y axis config
  - same as xAxis but no angle
- legend - string/loc: title for the legend box
- linewidth - number: css px
- showValues - object
  - format - string: format string (is this safe to expose or do something different?)
  - fontcolor - string
  - fontsize - number
  - offset - number
  - angle - number (pie charts)
- showSymbols - boolean: whether to show symbol markers on the data points
- innerRadius - number (pie charts)
- source - string: pointer to _Data:.tab_ (or other in future) source data page
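A check of a format definition against this summary might look like the sketch below. It is illustrative only: `validate_format` is a hypothetical name, it uses the xAxis and yAxis key names from the sample definition, it covers only a few of the fields, and treating version as required is an assumption.

```python
import re

# Chart types listed in the summary above.
CHART_TYPES = {"line", "area", "bar", "rect", "pie",
               "stackedline", "stackedarea", "stackedrect"}
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_format(definition: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    errors = []
    # Assumption: version is required and must currently be 1.
    if definition.get("version") != 1:
        errors.append("version: expected 1")
    if "type" in definition and definition["type"] not in CHART_TYPES:
        errors.append(f"type: unknown chart type {definition['type']!r}")
    for key in ("width", "height", "linewidth", "innerRadius"):
        if key in definition and not is_number(definition[key]):
            errors.append(f"{key}: expected a number")
    colors = definition.get("colors", [])
    if not isinstance(colors, list):
        errors.append("colors: expected an array")
    else:
        for color in colors:
            if not (isinstance(color, str) and HEX_COLOR.fullmatch(color)):
                errors.append(f"colors: {color!r} is not a hex code")
    for axis in ("xAxis", "yAxis"):
        config = definition.get(axis, {})
        if not isinstance(config, dict):
            errors.append(f"{axis}: expected an object")
        elif axis == "yAxis" and "angle" in config:
            errors.append("yAxis: angle is only supported on xAxis")
    if "showSymbols" in definition and not isinstance(definition["showSymbols"], bool):
        errors.append("showSymbols: expected a boolean")
    return errors
```

Returning a list of messages rather than raising on the first problem would let a Data:.chart edit preview report every issue at once.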
Older task notes below:
What the legacy Graph extension does
The legacy Graph extension uses a parser tag that contains JSON, like this:
<graph title="example graph"> { "version": 2, "width": 950, "height": 400, ..... } </graph>
The chart definition is always inlined in the article. Reuse of graph definitions is achieved by having a template or Lua module generate the <graph> tag and the JSON inside it.
The chart data can be inlined in the graph definition, or the graph definition can refer to a page in the Data: namespace on Commons (or to multiple Data: pages, or to certain other data sources like the pageviews API or Wikidata SPARQL queries).
Option 1: Inline chart definitions
In this option, chart definitions are inlined in the article. Data is not inlined, but always lives on a Data: page.
This would most likely use a parser function with wikitext-style parameter passing rather than JSON, so that it is comfortable for editors who work with templates and for Lua module writers who want to build meta-libraries around the parser function.
The building blocks we have to work with are:
- A parser function ({{#chart: ... }})
- The main argument ({{#chart:foo}})
- Unnamed parameters ({{#chart:foo|value1|value2|...}})
- Named parameters ({{#chart:foo|param1=value1|param2=value2|...}})
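To make the three kinds of argument concrete, here is a small Python sketch that splits a flat invocation into its main argument, unnamed parameters, and named parameters. It is a toy illustration, not how MediaWiki parses wikitext: `parse_invocation` is a hypothetical helper, and it deliberately ignores nested templates, links containing pipes, and other real-parser concerns.

```python
def parse_invocation(wikitext: str) -> tuple[str, list[str], dict[str, str]]:
    """Split '{{#chart:main|a|k=v}}' into (main, unnamed, named)."""
    prefix, suffix = "{{#chart:", "}}"
    text = wikitext.strip()
    if not (text.startswith(prefix) and text.endswith(suffix)):
        raise ValueError("not a #chart invocation")
    parts = text[len(prefix):-len(suffix)].split("|")
    # Everything before the first pipe is the main argument.
    main = parts[0].strip()
    unnamed: list[str] = []
    named: dict[str, str] = {}
    for part in parts[1:]:
        name, separator, value = part.partition("=")
        if separator:
            named[name.strip()] = value.strip()
        else:
            unnamed.append(part.strip())
    return main, unnamed, named
```

For example, `parse_invocation("{{#chart:foo|value1|param1=value1}}")` separates `foo`, the unnamed `value1`, and the named `param1`.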
Named parameters similar to what Module:Graph takes are likely to work well with many existing Graph uses and would not be complex to implement. However, this means people will likely build things using templates rather than Data: pages to share chart data and definitions.
To reuse charts across articles, you'd either have to generate them with templates/Lua, or make the definitions so simple that they don't need to be templated (for example, if all it takes to render a temperature graph is {{#chart:data=Monthly temperature in San Francisco.tab|type=temperature|period=1991-2020}} then maybe that doesn't need a template).
Option 2: Chart definition on its own page
In this option, each chart definition lives on its own page in a new Chart: namespace. These Chart: pages would use a JSON content model, and could have custom preview and editing functionality to make the JSON easier to work with. These chart definitions would not include or refer to data, to allow them to be reused with different data sources on different articles. This would allow us to do things like create one Chart:Temperature that is used for all temperature/climate graphs on articles about cities: the chart would look the same on all of these articles, but the data would be different for each one.
Embedding a chart in an article would be done by referring to the Chart: page and the Data: page, perhaps like this: {{#chart:Temperature|data=Monthly temperature in San Francisco.tab}}. This would pull the chart definition from [[Chart:Temperature]] and fetch the data from [[Data:Monthly temperature in San Francisco.tab]].
This approach would rely on templates much less: reuse of charts would be accomplished by putting chart definitions on their own pages separate from the data, and chart definitions could not themselves contain templates (but maybe they could take parameters).
It is likely that existing data sets will need to be cut down for individual graphs, so allowing column selection and a range limit would be good options with shared data sets.
This would limit invocation parameters to selecting the definition, the data set, and the subranges to render.
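Column selection and a range limit over a shared data set could work along these lines. This is a sketch under assumptions: `subset_tab` is a hypothetical helper, columns are selected by their schema field names, and the range is a half-open pair of row indices. None of this is settled syntax.

```python
from typing import Optional


def subset_tab(
    tab: dict,
    columns: Optional[list[str]] = None,
    rows: Optional[tuple[int, int]] = None,
) -> dict:
    """Return a copy of a .tab structure limited to some columns and rows."""
    fields = tab["schema"]["fields"]
    names = [field["name"] for field in fields]
    # Keep every column unless a selection is given; an unknown column
    # name raises ValueError from list.index().
    if columns is None:
        keep = list(range(len(names)))
    else:
        keep = [names.index(column) for column in columns]
    start, stop = rows if rows is not None else (0, len(tab["data"]))
    return {
        **tab,
        "schema": {"fields": [fields[i] for i in keep]},
        "data": [[row[i] for i in keep] for row in tab["data"][start:stop]],
    }
```

Applied to the election data, `subset_tab(tab, columns=["date", "liberal"], rows=(0, 5))` would keep only the date and Liberal columns for the first five polls, leaving the shared page untouched.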
Note that Data:....tab pages can have localized field names, which we can use in the chart rendering. We would want the same for any overridden labels in the chart definition.
Note that chart definitions could also live in the Data: namespace instead of a separate Chart: namespace, in which case a definition would be named Data:Foobar.chart or similar.
Recommend writing up a few sample charts for each option and seeing what difficulties come up.
Acceptance criteria
- Decide on requirement for using separate Data: and/or Chart: definitions and settle on syntax split
- Write a specification explaining what the syntax is
- Write a basic ADR for this decision