
How to get specific information from a wiki page.
Closed, Declined · Public

Description

Author: kuntalemail

Description:
Required information to show on my website

I want to show my wiki page text on my website. The page title is “Metallica”.

I need the following specific content from the page:

1) Summary data (as shown in tabular form on the right-hand side of the page)
2) Just the basic information of the page (not including the contents list, history, and other details)

I have attached an image of the required information.

I have used the API through the wikislurp-0.1 library. It works for getting wiki text from MediaWiki, but when it parses the wiki text it gets the following error:

Request: POST http://en.wikipedia.org/w/api.php, from 174.143.11.196 via sq60.wikimedia.org (squid/2.7.STABLE9) to ()

Error: ERR_INVALID_REQ, errno [No Error] at Wed, 09 Nov 2011 09:43:37 GMT

Another point:
I tried to parse the wiki text with the GET method (through the URL), but I got the whole page's data. As requested above, I need only specific data; is there any method in the API, or some other way, to get that specific information?

Please guide me on this.


Version: unspecified
Severity: normal

Attached:

metallica.JPG (590×841 px, 148 KB)

Details

Reference
bz32307

Event Timeline

bzimport raised the priority of this task to Needs Triage. (Nov 22 2014, 12:08 AM)
bzimport set Reference to bz32307.
bzimport added a subscriber: Unknown Object (MLST).
  • Bug 32308 has been marked as a duplicate of this bug.

Most of this doesn't belong on Bugzilla, as it's more of a help request. Questions like these belong in the appropriate place on mediawiki.org.

To get the "summary data", you'll have to grab the wikitext and decide for yourself what the summary is; it isn't defined or extracted by MediaWiki.
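A minimal sketch of that first step, fetching the raw wikitext through api.php (Python standard library only; the page title "Metallica" comes from the report, the rest is illustrative and is not what wikislurp does internally):

```python
# Sketch: fetch the raw wikitext of a page via the MediaWiki API.
# Only documented api.php parameters are used; error handling is
# deliberately minimal.
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

def fetch_wikitext(title):
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    })
    with urllib.request.urlopen(API_URL + "?" + params) as resp:
        data = json.load(resp)
    # The result is keyed by an internal page id, so take the first page.
    page = next(iter(data["query"]["pages"].values()))
    return page["revisions"][0]["*"]

print(fetch_wikitext("Metallica")[:500])
```

Adding rvsection=0 to the same request should limit the response to the lead section, which is close to the "basic information" the reporter wants, without the contents list or the rest of the article.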

Similarly for the "basic information": you'll have to grab the infobox from the wikitext and parse out the data you want.
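As a rough illustration of that parsing step, the sketch below pulls top-level |name = value pairs out of an {{Infobox ...}} block. Real infoboxes nest templates, links and comments, so treat this as a starting point rather than a robust parser:

```python
import re

def infobox_fields(wikitext):
    """Very rough extraction of top-level '|name = value' lines that
    follow an '{{Infobox' opening. Nested templates spanning several
    lines, multi-line values and HTML comments are NOT handled."""
    match = re.search(r"\{\{Infobox", wikitext)
    if not match:
        return {}
    fields = {}
    depth = 0
    # Walk line by line from the infobox start until its braces close.
    for line in wikitext[match.start():].splitlines():
        depth += line.count("{{") - line.count("}}")
        if depth == 1 and line.lstrip().startswith("|"):
            name, _, value = line.lstrip("| ").partition("=")
            fields[name.strip()] = value.strip()
        if depth <= 0:
            break
    return fields
```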

As for Wikislurp, it seems all but abandoned [1], though there were some changes in 2010 to "make it work again". Most likely it is at fault for one reason or another.

Also, putting 4 different "points" in one request just makes life hard.

What URL are you using for the parsing? What are you sending, what are you getting back?

[1] https://github.com/NeilCrosby/wikislurp

(In reply to comment #0)

Request: POST http://en.wikipedia.org/w/api.php, from 174.143.11.196 via sq60.wikimedia.org (squid/2.7.STABLE9) to ()

Error: ERR_INVALID_REQ, errno [No Error] at Wed, 09 Nov 2011 09:43:37 GMT

That's most likely a bug in your client library. Squid doesn't barf with "invalid request" for no reason. There's a remote possibility there's a bug on our side, but in order to confirm that we'd have to see the HTTP headers sent for the failing POST request.
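One way to confirm which side is at fault is to replay the POST outside the library and look at exactly what goes over the wire. A minimal sketch, assuming a typical parse request as the payload (substitute whatever wikislurp actually sends; the User-Agent contact address is a placeholder):

```python
# Sketch: reproduce the POST without wikislurp to see whether Squid
# still rejects it, and print the response status and headers.
import urllib.parse
import urllib.request

body = urllib.parse.urlencode({
    "action": "parse",
    "page": "Metallica",
    "prop": "wikitext",
    "format": "json",
}).encode("ascii")

req = urllib.request.Request(
    "https://en.wikipedia.org/w/api.php",
    data=body,  # supplying data makes urllib issue a POST
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "User-Agent": "api-debug-sketch/0.1 (you@example.com)",
    },
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)
    print(dict(resp.getheaders()))
```

If this succeeds where the library fails, the malformed request is being built on the client side.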

The rest of the report consists of support questions. You're welcome to ask for support on the API mailing list (mediawiki-api@lists.wikimedia.org), but Bugzilla is not a support forum.

kuntalemail wrote:

Extremely sorry to post this in the wrong place; I was very much stuck parsing the data from the wiki. I thought there was some API which could make my life easier, so I took a chance with wikislurp, but as you mentioned it doesn't work anymore, so I guess I will have to look for another script. Many thanks for your kind reply, and if you don't mind, could you please tell me where I can find the relevant information?

Kind Regards