I can't authenticate in Wikimedia Commons Query Service
Closed, ResolvedPublic

Description

I'm trying to query the SPARQL endpoint at https://commons-query.wikimedia.org/ from PowerShell, but I get an HTML login page instead of query results, even when I pass the "wcqsSession" token in my request.

This Stack Overflow answer provides the following snippet:

import sys
import requests

ENDPOINT = "https://wcqs-beta.wmflabs.org/sparql"
QUERY = """
SELECT ?file WHERE {
  ?file wdt:P180 wd:Q42 .
}
"""

r = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json", "wcqsSession": "<token retrieved after logging in"}
)


print(r.text)

I've replaced <token retrieved after logging in with the wcqsSession token I got from the browser (73 characters), but I'm still getting HTML instead of JSON.

This is the PowerShell equivalent:

#$endpoint = 'https://commons-query.wikimedia.org/sparql'
$endpoint = 'https://wcqs-beta.wmflabs.org/sparql'
#$endpoint = 'https://commons-query.wikimedia.org/'

Invoke-RestMethod -Uri $endpoint -Method Get -Headers @{
  Accept = "application/sparql-results+json"
  wcqsSession = $wcqsSession
} -Body @{query=$sparql; format='json'}

It returns HTML as well, regardless of which endpoint URL is used.

Am I doing something wrong, or is there another way to query WCQS with SPARQL from PowerShell?
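One way to make the symptom explicit before parsing is to look at the response's Content-Type. The helper below is only a sketch: the function name `is_sparql_json` is made up for illustration, and the set of accepted media types is an assumption about what the service returns on success, not something confirmed in this task.

```python
def is_sparql_json(content_type):
    """Return True if a Content-Type header value denotes SPARQL JSON results."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Assumption: a successful query answers with one of these media types.
    return media_type in {"application/sparql-results+json", "application/json"}


if __name__ == "__main__":
    print(is_sparql_json("application/sparql-results+json;charset=utf-8"))  # True
    print(is_sparql_json("text/html; charset=utf-8"))  # False (e.g. a login page)
```

With the requests snippet above, this could be called as `is_sparql_json(r.headers.get("Content-Type", ""))` before touching `r.text`.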

Event Timeline

I'm not sure that wcqsSession is a header that WCQS will inspect. Also, please use https://commons-query.wikimedia.org/sparql directly. It should work if the token is set as a cookie:

import requests

wcqsSession = "my wcqsSession cookie"

resp = requests.get(
    "https://commons-query.wikimedia.org/sparql",
    params={"query": "SELECT * { ?s ?p ?o } LIMIT 1", "format": "json"},
    cookies={"wcqsSession": wcqsSession},
)
resp.raise_for_status()
print(resp.json())
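To illustrate the difference between the two approaches: a cookie is not sent as a header named after the cookie, but as a `name=value` pair inside the standard `Cookie` header. The sketch below uses only the Python standard library and builds the request without sending it, so the resulting headers can be inspected. The function name `build_wcqs_request` and the placeholder token are hypothetical.

```python
import urllib.parse
import urllib.request

WCQS_ENDPOINT = "https://commons-query.wikimedia.org/sparql"


def build_wcqs_request(query, session_token, endpoint=WCQS_ENDPOINT):
    """Build (but do not send) a GET request carrying the token as a cookie."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url)
    request.add_header("Accept", "application/sparql-results+json")
    # The token goes into the "Cookie" header as wcqsSession=<value>,
    # not into a custom header called "wcqsSession".
    request.add_header("Cookie", "wcqsSession=" + session_token)
    return request


if __name__ == "__main__":
    req = build_wcqs_request("SELECT * { ?s ?p ?o } LIMIT 1", "PLACEHOLDER_TOKEN")
    print(req.get_full_url())
    print(req.header_items())
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the example runs without network access or a valid session.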

@Arrbee : could you give us guidance on the priority of this?

"It should work if set as a cookie"

Thank you, it works. Here is how it looks in PowerShell:

$wcqsSession = Get-Clipboard # Copy the wcqsSession cookie value from the browser first

$wcqsSparql = [System.Uri]'https://commons-query.wikimedia.org/sparql'
$sparql = 'SELECT * { ?s ?p ?o } LIMIT 1'

$wcqsCookie = New-Object System.Net.Cookie
$wcqsCookie.Name = "wcqsSession"
$wcqsCookie.Value = $wcqsSession
$wcqsCookie.Domain = $wcqsSparql.DnsSafeHost

$WebSession = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$WebSession.Cookies.Add($wcqsCookie)

$wcqsHeaders = @{ Accept = "application/sparql-results+json" }
Invoke-RestMethod -Uri $wcqsSparql.AbsoluteUri -Method Get -Headers $wcqsHeaders -Body @{query=$sparql} -WebSession $WebSession -OutVariable resp
dcausse claimed this task.
dcausse moved this task from Incoming to Done on the Discovery-Search (2025.02.10 - 2025.02.28) board.

@Podbrushkin thanks for confirming that it works and for adding a correct answer to SO.
I'm closing this ticket but feel free to re-open if you think I missed something.