
Provide a way to add hyperlink in Quarry results/output
Open, Needs Triage, Public

Description

Possible use cases include http://quarry.wmflabs.org/query/879 where I would like to show the corresponding link for each file.

It could be done by assuming some notations such as

  • wikitext-style [[link]] (with enwiki as default?)
  • HTML-style link

Details

Reference
bz72874

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:57 AM
bzimport added a project: Quarry.
bzimport set Reference to bz72874.
bzimport added a subscriber: Unknown Object (MLST).

DragonflySixtyseven mentions that results such as https://quarry.wmflabs.org/query/17462 would be more useful if they were hyperlinks. I agree.

One option: a "Linkify" button that can handle the simple cases. For example, pressing the button would turn the plain text "Psi.PNG" into a clickable link labeled "Psi.PNG". We could create a mapping from SQL column name to default link behavior (e.g., {'img_name': 'https://wiki.project.org/wiki/File:[img_name]'}).
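A minimal sketch of such a column-to-link mapping, written in Python purely for illustration. The names DEFAULT_LINKS and linkify_cell, the {server}/{value} placeholder syntax, and the default server are assumptions made for this example; none of them exist in Quarry.

```python
from html import escape
from urllib.parse import quote

# Hypothetical mapping from SQL column name to a URL template.
# The {server} and {value} placeholders are an assumption of this sketch.
DEFAULT_LINKS = {
    "img_name": "https://{server}/wiki/File:{value}",
}


def linkify_cell(column, value, server="en.wikipedia.org"):
    """Return an HTML link for columns with a known template, else escaped text."""
    text = escape(str(value))
    template = DEFAULT_LINKS.get(column)
    if template is None:
        # Unknown column: keep the value as plain (escaped) text.
        return text
    # Percent-encode the value so it is safe inside the URL path.
    url = template.format(server=server, value=quote(str(value), safe=""))
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), text)


print(linkify_cell("img_name", "Psi.PNG"))
# <a href="https://en.wikipedia.org/wiki/File:Psi.PNG">Psi.PNG</a>
```

Columns without an entry in the mapping fall through unchanged, so a button like this would only touch the cases it knows about.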

Some users are already using CONCAT() to surround results with wiki markup; for example: https://quarry.wmflabs.org/query/17459. Another option would be to pass this type of output to (for example) https://en.wikipedia.org/w/api.php and render the parsed wiki markup as HTML. I don't love this option: it requires using CONCAT(), and it gets creepily close to T137179 (Setup an easy way to have Quarry dump information / results on a wiki page) without resolving that task.

I'd appreciate any thoughts on the best way to solve this task.

The use cases I've seen, or wanted myself, would benefit from being able to specify CONCAT()-like behaviour, i.e. not just linking to a default location based on column names. E.g. if I'm generating a list of usernames, I might want those usernames to be linked directly to:

If we dropped the toollabs example (because my test didn't work with that as an interwiki link) and kept it to simple wikilinks formatted as prefix+suffix, would that simplify things and point to an answer? I suspect that would solve the majority of use cases.
(Caveat: I'm not widely familiar with other people's Quarry usage, nor am I a dev, so I might be underestimating or misunderstanding.)

MZMcBride renamed this task from Provide a way to hyperlink texts to Provide a way to hyperlink Quarry results/output. Jul 6 2017, 12:58 AM
MZMcBride updated the task description.

Ideally this would be defined in the query code, e.g.

SELECT
   page_namespace, page_title, -- @quarry:format page_title
   el_to, -- @quarry:format url
   gt_lat, gt_lon, -- @quarry:format coordinate
   rev_user_text -- @quarry:format link https://en.wikipedia.org/wiki/User:%s

Doesn't seem horribly hard for single fields (use something like sqlparse, detect the comment immediately following the column name); less clear how it could work for combined fields like namespace + title. Also, for page links you'd need to know namespace names, which is nontrivial.

As a very simple first step, fields that match a URL regexp could be linkified.

Framawiki renamed this task from Provide a way to hyperlink Quarry results/output to Provide a way to add hyperlink in Quarry results/output. Apr 16 2018, 9:55 PM

I like both of Tgr’s proposals, even if the implementation of the first one is nontrivial.

Perhaps the page_title format could use the MediaWiki API to look up namespace names, with the results cached for e.g. 7 days (this request).
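To illustrate why page_title needs its namespace, here is a rough Python sketch that combines page_namespace and page_title into a URL. The NAMESPACES dict is a hand-written excerpt standing in for whatever the cached API response would contain, and page_url is a hypothetical helper, not an existing Quarry function.

```python
from urllib.parse import quote

# Assumed excerpt of a namespace-id -> name map. In practice this would be
# filled from the (cached) MediaWiki API response, not hard-coded.
NAMESPACES = {0: "", 1: "Talk", 2: "User", 6: "File", 14: "Category"}


def page_url(server, namespace, title, namespaces=NAMESPACES):
    """Build a page URL from a page_namespace / page_title pair."""
    if namespace not in namespaces:
        raise KeyError("unknown namespace id: {}".format(namespace))
    prefix = namespaces[namespace]
    # The main namespace (id 0) has no prefix.
    full_title = "{}:{}".format(prefix, title) if prefix else title
    return "https://{}/wiki/{}".format(server, quote(full_title, safe=":/"))


print(page_url("en.wikipedia.org", 6, "Psi.PNG"))
# https://en.wikipedia.org/wiki/File:Psi.PNG
```

Namespace names are localized per wiki, which is why a per-wiki lookup with a cache is suggested rather than a fixed table like the one used here.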

A mapping from dbname to server name is also needed. It could be this (ordered) dictionary:

  1. r'^([a-z]{2,3}(_[a-z_]+)?|test2|test|simple|nostalgia)wiki_p$' → r'\1.wikipedia.org'
  2. r'^([a-z]+)(wik(?:inews|isource|imedia|tionary|ivoyage|ibooks|iquote|iversity))_p$' → r'\1.\2.org'
  3. r'^(wikidata|mediawiki)wiki_p$' → r'www.\1.org'
  4. r'^testwikidatawiki_p$' → r'test.wikidata.org'
  5. r'^sourceswiki_p$' → r'wikisource.org'
  6. r'^([a-z0-9]+)wiki_p$' → r'\1.wikimedia.org'
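The ordered rules above can be tried out with a short Python sketch. The regexes are copied verbatim from the list; the names DBNAME_RULES and server_for are made up for this example, and a list of pairs is used to make the "first matching rule wins" order explicit.

```python
import re

# Rules copied from the list above, in the same order; the first match wins.
DBNAME_RULES = [
    (r'^([a-z]{2,3}(_[a-z_]+)?|test2|test|simple|nostalgia)wiki_p$', r'\1.wikipedia.org'),
    (r'^([a-z]+)(wik(?:inews|isource|imedia|tionary|ivoyage|ibooks|iquote|iversity))_p$', r'\1.\2.org'),
    (r'^(wikidata|mediawiki)wiki_p$', r'www.\1.org'),
    (r'^testwikidatawiki_p$', r'test.wikidata.org'),
    (r'^sourceswiki_p$', r'wikisource.org'),
    (r'^([a-z0-9]+)wiki_p$', r'\1.wikimedia.org'),
]


def server_for(dbname):
    """Return the server name for a database name, or None if no rule matches."""
    for pattern, replacement in DBNAME_RULES:
        match = re.match(pattern, dbname)
        if match:
            # Expand backreferences such as \1 against the matched groups.
            return match.expand(replacement)
    return None


for name in ("enwiki_p", "frwiktionary_p", "wikidatawiki_p", "commonswiki_p"):
    print(name, "->", server_for(name))
```

One caveat: database names with underscores in the language code (e.g. zh_min_nanwiki_p) would come out with underscores in the host name, while the real host names use hyphens, so a replacement step would probably be needed on top of these rules.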

A more advanced suggestion (which could be another task) would be to have a dictionary of default formats for common field names. E.g. the default formatting for rev_user_text would be @quarry:format link https://{$wgServer}/wiki/User:%s. But some fields have no meaning alone, like page_title without page_namespace.

I would have a first fix in JavaScript for the URLs, which is a simpler case, but I’m not sure it is really useful given there are not that many URLs in MediaWiki databases. We should probably implement the more complete proposal instead, because the format that is really helpful is page_title.

PS: I’m stunned by the quality of the dataTables documentation.

diff --git a/quarry/web/static/js/query/view.js b/quarry/web/static/js/query/view.js
index 24ef11b..3b67df5 100644
--- a/quarry/web/static/js/query/view.js
+++ b/quarry/web/static/js/query/view.js
@@ -124,10 +124,12 @@ $( function () {
                                title: htmlEscape( header ),
                                render: function ( data /* , type, row */ ) {
                                        if ( typeof data === 'string' ) {
-                                               return htmlEscape( data );
-                                       } else {
-                                               return data;
+                                               data = htmlEscape( data );
+                                               if ( /^https?:\/\//.test(data) ) {
+                                                       data = '<a href="' + data + '">' + data + '</a>';
+                                               }
                                        }
+                                       return data;
                                }
                        } );
                } );

I would have a first fix in JavaScript for the URLs, which is a simpler case, but I’m not sure it is really useful given there are not that many URLs in MediaWiki databases.

It's easy to construct the URLs in SQL, though.

In T74874#4664854, @Tgr wrote:

It's easy to construct the URLs in SQL, though.

The JS patch adds a clickable link; URLs constructed in SQL would not be clickable.

I tested sqlparse, with the idea of solving T188538 with the same library, but it seems like overkill just to retrieve the comments after the field names, and it has hardly any parse-error detection (you have to search for 'Error' tokens).

I am preparing some regexes to extract the comments after field names in SELECT commands, as described in T74874#3595371.
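As a rough idea of what such a regex could look like, here is a Python sketch run against the example query from T74874#3595371. The pattern and the name parse_format_directives are assumptions of this example and not the regexes actually being prepared. It attaches each directive to the column name immediately before the comment, and it would be fooled by "--" inside string literals.

```python
import re

# Column name, optional comma, then a "-- @quarry:format <name> [argument]" comment.
DIRECTIVE = re.compile(
    r'(\w+)\s*,?\s*--\s*@quarry:format[ \t]+(\S+)[ \t]*(.*)$',
    re.MULTILINE,
)


def parse_format_directives(sql):
    """Map column name -> (format name, argument) for annotated columns."""
    directives = {}
    for match in DIRECTIVE.finditer(sql):
        column, fmt, argument = match.groups()
        directives[column] = (fmt, argument.strip())
    return directives


EXAMPLE_SQL = """SELECT
   page_namespace, page_title, -- @quarry:format page_title
   el_to, -- @quarry:format url
   gt_lat, gt_lon, -- @quarry:format coordinate
   rev_user_text -- @quarry:format link https://en.wikipedia.org/wiki/User:%s
"""

for column, directive in parse_format_directives(EXAMPLE_SQL).items():
    print(column, directive)
```

Note that for combined fields the directive lands only on the last column of the line (page_title, gt_lon), so pairing it with page_namespace or gt_lat would still need extra logic.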