Now that we're losing access to Yahoo's index, we need to migrate the existing copyvio tools to a different search API.
Description
| Status | Assigned | Task |
|---|---|---|
| Resolved | None | T116957 Plagiarism detection tools for text (tracking) |
| Resolved | kaldari | T131169 Help CorenBot migrate to a new API |
| Resolved | kaldari | T125459 Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? |
| Resolved | bd808 | T132943 Create an API request proxy so that Tool Labs tools can access the Yandex API from a single IP address |
| Resolved | bd808 | T132950 Create project Yandex-proxy |
| Resolved | yuvipanda | T132982 Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs |
Event Timeline
The new API authenticates requests with an HTTP Basic Authorization header sent as part of each API request.
In PHP, this could be implemented something like:
// Placeholder account key; substitute the real one.
$accountKey = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA';
$webSearchURL = 'https://api.datamarket.azure.com/Bing/Search/Web?$format=json&Query=';
// Send the key as both username and password in a Basic Authorization header.
$context = stream_context_create(array(
    'http' => array(
        'request_fulluri' => true,
        'header' => "Authorization: Basic " . base64_encode($accountKey . ":" . $accountKey)
    )
));
// Wrap the search text in single quotes and URL-encode it.
$request = $webSearchURL . urlencode( '\'' . $_POST["searchText"] . '\'');
$response = file_get_contents($request, false, $context);
$jsonobj = json_decode($response);

@coren: Let me know what I can do to help. Do you have the code in a version control system?
The new Google Search API is now available for use from Tool Labs.
API documentation:
https://developers.google.com/custom-search/json-api/v1/using_rest#making_a_request
Proxy documentation:
https://wikitech.wikimedia.org/wiki/Nova_Resource:Google-api-proxy
Ping me to get the key and cx value.
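For anyone porting a tool, here is a rough sketch of how a request URL for the new API might be put together in PHP. It assumes the standard Custom Search query parameters (`key`, `cx`, `q`); the function name `buildSearchUrl` and the base URL passed in are made up for illustration, so check the proxy documentation above for the real endpoint to use from Tool Labs.

```php
<?php
// Hypothetical helper (not from any existing tool): builds a Custom Search
// request URL from a base endpoint, the API key, the cx value and the query.
function buildSearchUrl(string $baseUrl, string $key, string $cx, string $query): string
{
    // http_build_query() URL-encodes each value; RFC 3986 mode encodes
    // spaces as %20 instead of "+".
    $params = http_build_query(
        array('key' => $key, 'cx' => $cx, 'q' => $query),
        '',
        '&',
        PHP_QUERY_RFC3986
    );
    return $baseUrl . '?' . $params;
}
```

The resulting URL could then be fetched with `file_get_contents()` and decoded with `json_decode()`, the same way as in the earlier snippet. Passing the endpoint in as a parameter keeps the key, cx value and proxy address out of the helper itself.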