
Consider having top-level configurable Web2Cit objects
Open, Needs Triage, Public, Feature


Feature summary (what you would like to be able to do and where):

Some aspects of Web2Cit should be configurable, and they should apply globally for a given "use case" of Web2Cit. By "use case" I mean: translating a target webpage with the Web2Cit-Server, editing translation configurations for a domain with the Web2Cit-Editor, running the Web2Cit-Monitor, or any other use of Web2Cit-Core.

To achieve this, consider:

Briefly, when the server, editor, etc. is started, the configuration file is read. Then, a new Web2Cit object is created and given the configurations read from the file. Finally, each time a new Domain or Webpage object is needed, it is created via the Web2Cit object initialized at the beginning (this may change with T302589), which in turn passes the necessary configurations to other modules down the road.
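The flow described above could be sketched roughly as follows. This is only a minimal TypeScript illustration, not the actual Web2Cit-Core API: the `Web2CitConfig` fields, the shape of the `Web2Cit` class, the `domain()` and `webpage()` factory methods, and the `loadConfig` helper are all hypothetical names. The configuration is parsed from a JSON string here to keep the example self-contained; a server or editor would presumably read it from a file at startup.

```typescript
// Hypothetical global configuration; field names are assumptions.
interface Web2CitConfig {
  userAgent: string;
  citoidApi: string;
  catchallPattern: boolean;
}

// Hypothetical Domain and Webpage classes that receive the shared configuration.
class Domain {
  readonly name: string;
  readonly config: Web2CitConfig;
  constructor(name: string, config: Web2CitConfig) {
    this.name = name;
    this.config = config;
  }
}

class Webpage {
  readonly url: string;
  readonly config: Web2CitConfig;
  constructor(url: string, config: Web2CitConfig) {
    this.url = url;
    this.config = config;
  }
}

// Top-level object: holds the configuration and acts as a factory, so every
// Domain or Webpage created through it sees the same settings.
class Web2Cit {
  private readonly config: Web2CitConfig;
  constructor(config: Web2CitConfig) {
    this.config = config;
  }
  domain(name: string): Domain {
    return new Domain(name, this.config);
  }
  webpage(url: string): Webpage {
    return new Webpage(url, this.config);
  }
}

// Parse and check the configuration once, at startup.
function loadConfig(json: string): Web2CitConfig {
  const raw = JSON.parse(json) as Partial<Web2CitConfig>;
  if (
    typeof raw.userAgent !== "string" ||
    typeof raw.citoidApi !== "string" ||
    typeof raw.catchallPattern !== "boolean"
  ) {
    throw new Error("Invalid Web2Cit configuration");
  }
  return {
    userAgent: raw.userAgent,
    citoidApi: raw.citoidApi,
    catchallPattern: raw.catchallPattern,
  };
}

// Example values below are placeholders, not real endpoints.
const web2cit = new Web2Cit(
  loadConfig(
    '{"userAgent":"example-agent/1.0","citoidApi":"https://example.org/citoid","catchallPattern":true}'
  )
);
const exampleDomain = web2cit.domain("example.org");
const examplePage = web2cit.webpage("https://example.org/article");
```

With this arrangement, callers never pass configuration to `Domain` or `Webpage` directly, so a server, editor, or monitor can each start from its own configuration file without touching the rest of the code.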

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

These are some global configurations that may be defined this way:

  • storage MediaWiki: instance, wiki path (e.g., /wiki), storage path (e.g., /Web2Cit/data), and API path (e.g., /w/api.php)
  • citoid api
  • user agent, to fetch target html, target citoid response, and domain configurations (see T302591)
  • field configuration: supported field names, expected output (validation pattern and whether they are array or not), to what final citation field they should map, if they are force-required (see T302019)
  • fallback template
  • catchall pattern (on or off)
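
As a rough illustration of how the settings listed above might be grouped, here is a hypothetical TypeScript shape for the configuration. All type names, property names, and example values (including the `https://wiki.example.org` instance and the `title` field) are assumptions made for this sketch; the task does not specify them, and the shape of the fallback template is left open.

```typescript
// Hypothetical location of the storage MediaWiki instance.
interface StorageConfig {
  instance: string;    // base URL of the storage wiki
  wikiPath: string;    // e.g. "/wiki"
  storagePath: string; // e.g. "/Web2Cit/data"
  apiPath: string;     // e.g. "/w/api.php"
}

// Hypothetical per-field settings.
interface FieldConfig {
  pattern: RegExp;       // validation pattern applied to each output value
  isArray: boolean;      // whether more than one value is allowed
  citationField: string; // final citation field this field maps to
  forceRequired: boolean;
}

interface GlobalConfig {
  storage: StorageConfig;
  citoidApi: string;
  userAgent: string;
  fields: Record<string, FieldConfig>; // keys are the supported field names
  fallbackTemplate: unknown;           // shape not specified in this task
  catchallPattern: boolean;            // on or off
}

// Placeholder values for illustration only.
const exampleConfig: GlobalConfig = {
  storage: {
    instance: "https://wiki.example.org",
    wikiPath: "/wiki",
    storagePath: "/Web2Cit/data",
    apiPath: "/w/api.php",
  },
  citoidApi: "https://example.org/citoid",
  userAgent: "example-agent/1.0",
  fields: {
    title: {
      pattern: /^.+$/,
      isArray: false,
      citationField: "title",
      forceRequired: true,
    },
  },
  fallbackTemplate: null,
  catchallPattern: true,
};

// Check an output against the field configuration. Handling of
// force-required fields is deliberately left out of this sketch.
function isValidOutput(
  config: GlobalConfig,
  field: string,
  values: string[]
): boolean {
  const spec = config.fields[field];
  if (spec === undefined) return false; // unsupported field name
  if (!spec.isArray && values.length > 1) return false;
  return values.every((value) => spec.pattern.test(value));
}
```

Keeping field definitions in the configuration rather than in code would let the validation logic stay generic while the set of supported fields changes per deployment.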

Benefits (why should this be implemented?):

A central place (the configuration file) in which to define some aspects of Web2Cit behavior would ensure that they are applied consistently and globally.