Error mysqlquery
Closed, ResolvedPublic

Description

Running listpages.py -mysqlquery (or calling the corresponding function in pagegenerators.py) fails with an error that the modules "oursql" and "MySQLdb" cannot be found. These modules are included neither in pwb nor in the installed Python packages.

E.g. python scripts\listpages.py -mysqlquery:"SELECT * FROM page LIMIT 10"
or list = pywikibot.pagegenerators.MySQLPageGenerator('SELECT * FROM page LIMIT 10')

Traceback (most recent call last):
  File "c:\python35\lib\site-packages\pywikibot-2.0rc3-py3.5.egg\pywikibot\pagegenerators.py", line 2210, in MySQLPageGenerator
    import oursql as mysqldb
ImportError: No module named 'oursql'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
  File "c:\python35\lib\site-packages\pywikibot-2.0rc3-py3.5.egg\pywikibot\pagegenerators.py", line 2212, in MySQLPageGenerator
    import MySQLdb as mysqldb
ImportError: No module named 'MySQLdb'
<class 'ImportError'>

Event Timeline

Vladis13 created this task. Aug 3 2016, 7:01 PM
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper. Aug 3 2016, 7:01 PM
Vladis13 updated the task description. Aug 3 2016, 7:11 PM

pip install oursql python-mysql should install the required packages.

Vladis13 added a comment. Edited Aug 3 2016, 9:42 PM
  • "pip install oursql" fails too. Oursql is critically outdated. The install breaks on the old-style "print " statement and on some problem with Visual Studio. It also requires "cython" to be installed.
  • "pip install python-mysql" fails with:
Could not find a version that satisfies the requirement python-mysql (from versions: )
No matching distribution found for python-mysql

I also tried connecting directly to the DB replica with an edited version of the script User:Legoktm/wmflib, but Tool Labs requires an SSH connection. The script takes only one host/user/password, but I need to connect to the SSH server first and then to the DB. I don't understand how to do that.

Mpaa added a subscriber: Mpaa. Edited Aug 3 2016, 10:48 PM

pip install MySQL-python ?
This is only for Python 2, I think ...

It turns out that oursql has a branch for Python 3. But

pip install oursql3

fails too: "'utf-8' codec can't decode byte 0xcd in position 20: invalid continuation byte".

I wanted to list the pages that transclude a template, for use in an autonomous script.
Now the command "python scripts\listpages.py -transcludes" seems to work. Perhaps something got reinstalled during the attempts above.
I will try to get the list by calling "import os ; os.system(command)", although I wanted to use a function of the pwb framework.
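
A minimal sketch of the "call the script from Python" idea, using subprocess.run instead of os.system so that the output is captured and written to a file without relying on the shell. The helper name run_and_save and the file name list.txt are only illustrative, and the script path and template name must be adapted; decoding of non-ASCII titles depends on the console encoding and is not handled here.

```python
import subprocess
import sys


def run_and_save(args, out_path):
    """Run a Python script with the current interpreter and save its stdout.

    args is the script path followed by its arguments, passed as a list so
    that no shell quoting or redirection is involved.
    """
    result = subprocess.run(
        [sys.executable] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    with open(out_path, "w", encoding="utf-8") as out:
        out.write(result.stdout)
    return result.returncode


# Example call (placeholder template name):
# run_and_save(["scripts/listpages.py", "-transcludes:Template_name"], "list.txt")
```

The return code is passed through so that the calling script can tell whether listpages.py itself failed.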

(This looks like a support request to me, and issues with Oursql like T142021#2521093 should be reported to Oursql instead.)

Vladis13 added a comment. Edited Aug 5 2016, 3:58 AM

I reported it to Oursql, but I doubt that it will be fixed or that I will ever get an answer.
The problem for Pwb is that the error exists now, there is no way to work with MySQL, and the documentation contains obsolete information.

Also, "python scripts\listpages.py" does not write the list of pages to a file.
And redirecting the output to a file does not work on Windows (e.g. "python scripts\listpages.py -transcludes:"template_name"> list.txt" or "... | more > list.txt"). Python raises an error, as if ">" were an argument of the call.

Also, I tried running "python scripts/listpages.py -mysqlquery:"SELECT * FROM page LIMIT 10"" on Tool Labs, where a shared Pwb 2.7 with Oursql is installed.
In user-config.py I set the correct user/password:

db_hostname = 'enwiki.labsdb'
db_username = '***'
db_password = '***'
db_name_format = '{0}_p'
db_connect_file = user_home_path('replica.my.cnf')

but I get an error again:

  File "/data/project/shared/pywikipedia/core/pywikibot/pagegenerators.py", line 2504, in MySQLPageGenerator
    for row in row_gen:
  File "/data/project/shared/pywikipedia/core/pywikibot/data/mysql.py", line 61, in mysql_query
    port=config.db_port)
  File "connection.pyx", line 150, in oursql.Connection.__cinit__ (oursqlx/oursql.c:5482)
  File "connection.pyx", line 183, in oursql.Connection._raise_error (oursqlx/oursql.c:5885)
oursql.InterfaceError: (2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", None)

Mpaa added a comment. Aug 5 2016, 8:01 PM

You got your query wrong.

mpaa@tools-bastion-03:~/core$ python scripts/listpages.py -mysqlquery:"SELECT page_namespace, page_title FROM page WHERE page_namespace = 0 LIMIT 10"
   1 "?"
   2 "A Bruised Reed Shall He Not Break"
   3 "A Family Sketch"
   4 "A Little While"
   5 "Abstinence sows sand all over"
   6 "Amarillis I Did Woo"
   7 "And the sins of the fathers shall be"
   8 "And with what body do they come?" --
   9 "And with what body do they come?" —
  10 "Arcturus" is his other name --
10 page(s) found

Mpaa added a comment. Aug 5 2016, 8:06 PM

Also, "python scripts\listpages.py" does not write the list of pages to a file.
And redirecting the output to a file does not work on Windows (e.g. "python scripts\listpages.py -transcludes:"template_name"> list.txt" or "... | more > list.txt"). Python raises an error, as if ">" were an argument of the call.

For the redirection there is already an open bug (I cannot recall which one).
Anyhow,

mpaa@tools-bastion-03:~/core$ python scripts/listpages.py -mysqlquery:"SELECT page_namespace, page_title FROM page WHERE page_namespace = 0 LIMIT 15" > file.txt
15 page(s) found
mpaa@tools-bastion-03:~/core$ cat file.txt 
   1 "?"
   2 "A Bruised Reed Shall He Not Break"
  ...
  15 "Birds of Prey" March

You got your query wrong.

mpaa@tools-bastion-03:~/core$ python scripts/listpages.py -mysqlquery:"SELECT page_namespace, page_title FROM page WHERE page_namespace = 0 LIMIT 10"

You ran it from your user home directory. I installed the pwb configs in my user home and it works there too. But I need to run it from scripts in the tool home directory. In the documentation Help:Tool_Labs/Developing#Setup Pwb, items 4-5 say that user-config.py should be created in tools.tool@tools-login; the same is written in User:Russell Blau/Using pywikibot on Labs.

This comment was removed by Vladis13.

Is there any progress on this? Could we add those libraries to pip requirements? What is the current state of reported bugs to those packages?

Xqt triaged this task as High priority. May 28 2017, 12:23 PM
Dvorapa added a comment. Edited Jun 11 2017, 9:58 PM

pip 9.0.1, python 3.6.1
Successful pip installations:

oursql

  • oursql3

mysql connectors

  • alembic
  • sqlalchemy
  • pymysql

One of them could be used, but the code would need to be changed. On py3.5+, MySQLdb is currently supported only by MySQL's official mysql-connector-python, which is not present in the PyPI archives.
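
To check which of the driver modules mentioned in this task are importable in a given environment, something like the following can be used. It is only a sketch: the helper name available_drivers is made up for illustration, and the order of the candidates is arbitrary, not the order Pywikibot uses.

```python
import importlib.util

# Driver module names mentioned in this task (arbitrary order).
CANDIDATES = ("pymysql", "MySQLdb", "oursql")


def available_drivers(candidates=CANDIDATES):
    """Return the candidate top-level modules that can be found for import."""
    return [
        name for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


print(available_drivers())
```

An empty list means that none of the candidates is installed for the interpreter running the check, which matches the double ImportError in the task description.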

Could be solved together: T89976

This comment was removed by Herzi.Pinki.

@Milimetric, could you please be so kind and add some boost on this topic?

matmarex removed a subscriber: matmarex. Feb 12 2018, 1:48 PM

(I am not very familiar with Pywikibot and I don't think I can help here.)

Yeah, I don't know much about pywikibot, who maintains it? I thought it was a crucial package for a lot of wiki bots, how can it be broken for so long?

Dvorapa added a comment. Edited Feb 12 2018, 3:29 PM

It is indeed a crucial package for a lot of wiki bots, but this issue is really old and I think nobody really knows today how the code for this feature was supposed to work in the past. If I look into the code, I can see many outdated third-party libraries glued together in a complicated way to produce the result. As I wrote above, there are some alternative, up-to-date libraries, but I have no clue how to make it work again.

Pywikibot has many alternative approaches (see https://doc.wikimedia.org/pywikibot/api_ref/pywikibot.html?highlight=pagegenerators#generator-options) to get and process a certain list of pages, or you can easily use e.g. Quarry to get a list of pages from the database to work around this issue.

You can see project members e.g. here, but many of them have been inactive for a long period or just mentor new developers joining the project. I would also like to fix this issue, but it needs major changes to the code, for which I haven't had much time yet, and I probably also lack the knowledge to fix this.

@Dvorapa, I have a rather complicated SQL script that calculates the center of a set of coordinates and sorts all the coordinates by decreasing distance from that center, in order to find suspicious coordinates. You can find the SQL here: https://quarry.wmflabs.org/query/12034 (1)
I want to use it in Python to run it across all combinations of ISO codes as found here: https://quarry.wmflabs.org/query/24402?action=history (2)
I wanted to use the sqlPageGenerator to get the results of script (1).
First I tried to iterate over all combinations of ISO codes in the Quarry script, but failed to do so (not enough SQL knowledge, and lacking permissions to define stored procedures as well as cursors). I don't want to fiddle around with downloading the results of script (1) multiple times.

If you can give me any advice on how this can be done in a different way, I will gladly try to go the other way.

Regarding the state of the sqlpagegenerator: if it is not supported any more, it should be removed from the docs.

@Herzi.Pinki Sorry, but I don't understand what you are trying to do using Quarry or pwb. What I can recommend is: use both services for what they are meant to do. Pywikibot processes a list of wiki pages/WD claims and outputs/changes their properties; Quarry lists wiki pages/WD claims and their properties. To make them work together, you have to create a list of pages in Quarry, give it to pywikibot in a text file using the -file: argument (or in a Python script using the file pagegenerator) and process the pages in the way you want.

The lack of permissions on Quarry for some tasks is limiting, but secure. What I do for complicated tasks, and what I recommend to you:

  1. get a list of whatever Quarry can handle (usually just unsorted raw data, some MB in size) and save it as CSV/Excel/JSON or whatever you are familiar with
  2. write and run a simple (Python) script that imports the data, processes it (removes duplicate or invalid data, fixes some data, sorts the data, prepares the list of pages to be processed by pywikibot) and writes a final text file output.txt containing the pages to change and maybe some additional notes about them for pywikibot to work with
  3. write and run a simple pywikibot script to process and fix the pages on the wiki listed in output.txt

This easy three-step approach has worked for me many times before and I am really used to it.
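
Step 2 could look roughly like the sketch below. It assumes that the Quarry result was saved as CSV with a column called page_title (a hypothetical column name, adjust it to the actual query) and writes one title per line, which as far as I know is a form that -file: accepts; the function name build_page_list is made up for illustration.

```python
import csv


def build_page_list(csv_path, out_path, title_column="page_title"):
    """Read titles from a CSV export, drop blanks and duplicates, sort them
    and write one title per line to out_path."""
    titles = set()
    with open(csv_path, newline="", encoding="utf-8") as src:
        for row in csv.DictReader(src):
            # Database titles use underscores; wiki titles use spaces.
            title = (row.get(title_column) or "").strip().replace("_", " ")
            if title:
                titles.add(title)

    ordered = sorted(titles)
    with open(out_path, "w", encoding="utf-8") as dst:
        for title in ordered:
            dst.write(title + "\n")
    return ordered


# Example call (placeholder file names):
# build_page_list("quarry_result.csv", "output.txt")
```

Any further cleanup (filtering invalid rows, adding notes) would go into the loop before the title is added to the set.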

Dvorapa added a comment. Edited Feb 12 2018, 4:51 PM

If I understand you correctly, you just wanted to use pywikibot to process your SQL multiple times, changing only the country? As far as I can see, I would recommend that you download a database dump and run the SQL query on your machine. You can create subtables on your own PC as you want, so this could be the easiest way to deal with your issue.

If I understand you correctly, you just wanted to use pywikibot to process your SQL multiple times, changing only the country?

Yeah, that was my intent. And to create a maintenance wiki page based on the results. I will think about your hints. Thanks a lot.

We have built a little tool called "reportupdater". We have only used it in our production cluster for now, but there's no reason it can't be used outside. Basically, its mission is to make it easy to execute a templated SQL query on one or more wikis, with as many parameters as you need, and generate regular report output from the results. If that sounds useful, the docs are here: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater

And I'm happy to help set this up on labs to see how useful it is outside of our world.
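
To illustrate the general idea of running one parameterized query once per country code (this is not how reportupdater is implemented), here is a small self-contained sketch. An in-memory SQLite database stands in for a local copy of the data; the table coords, its columns and the function name are hypothetical, and the plain squared difference of latitude/longitude is only a rough stand-in for a real distance calculation.

```python
import sqlite3

# Rows of one country, farthest from the given center first.
QUERY = """
SELECT page, lat, lon
FROM coords
WHERE iso = :iso
ORDER BY (lat - :clat) * (lat - :clat) + (lon - :clon) * (lon - :clon) DESC
"""


def pages_by_distance(conn, iso_codes):
    """Run the same two queries for every ISO code and collect page names."""
    results = {}
    for iso in iso_codes:
        clat, clon = conn.execute(
            "SELECT AVG(lat), AVG(lon) FROM coords WHERE iso = :iso",
            {"iso": iso},
        ).fetchone()
        if clat is None:  # no rows for this code
            results[iso] = []
            continue
        rows = conn.execute(
            QUERY, {"iso": iso, "clat": clat, "clon": clon}
        ).fetchall()
        results[iso] = [row[0] for row in rows]
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coords (iso TEXT, page TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO coords VALUES (?, ?, ?, ?)",
    [
        ("AT", "Page A", 48.0, 16.0),
        ("AT", "Page B", 48.1, 16.1),
        ("AT", "Page C", 10.0, 10.0),  # deliberately far away from the others
        ("CZ", "Page D", 50.0, 14.0),
    ],
)
report = pages_by_distance(conn, ["AT", "CZ", "XX"])
```

Passing the country code as a bound parameter keeps the SQL text identical for every run, so only the list of codes has to change.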

Change 416370 had a related patch set uploaded (by Mpaa; owner: Mpaa):
[pywikibot/core@master] mysql.py: add PyMySql as pure-Python MySQL client library

https://gerrit.wikimedia.org/r/416370

Mpaa added a comment. Mar 5 2018, 8:48 PM

I am trying to address PyMySql support.

Supporting (and testing) three different libraries on two platforms is beyond my current setup.
As PyMySql is pure Python and supports CPython >= 2.6 or >= 3.3, my proposal is to support this library only
and to discontinue MySQLdb and oursql completely.

Comments welcome.
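
For discussion, a rough sketch of what a PyMySql-only query helper could look like. This is not the code of the patch: the function names and default values are illustrative, and the parameters only mirror the db_* settings from user-config.py quoted earlier in this task. pymysql is imported lazily, so the sketch can be loaded without the package installed.

```python
import os


def connect_kwargs(dbname, hostname, name_format="{0}_p",
                   connect_file="~/replica.my.cnf", port=3306):
    """Build keyword arguments for pymysql.connect from config-like values."""
    return {
        "host": hostname,
        "database": name_format.format(dbname),
        # Credentials are read from a MySQL option file, e.g. replica.my.cnf.
        "read_default_file": os.path.expanduser(connect_file),
        "port": port,
        "charset": "utf8",
    }


def mysql_query(query, **kwargs):
    """Yield the result rows of a query; the connection is always closed."""
    import pymysql  # third-party package: pip install pymysql

    conn = pymysql.connect(**kwargs)
    try:
        with conn.cursor() as cursor:
            cursor.execute(query)
            for row in cursor.fetchall():
                yield row
    finally:
        conn.close()


# Example call (needs network access to the replica):
# for row in mysql_query("SELECT page_title FROM page LIMIT 10",
#                        **connect_kwargs("enwiki", "enwiki.labsdb")):
#     print(row)
```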

PyMySql had issues supporting Python 3.5+ in the past, but I think they are solved, and I agree with your proposal.

+1 @Mpaa, PyMySql is the right choice here. Feel free to ping me on code reviews and I can help test if you need different platforms.

@Milimetric Perhaps @Mpaa hasn't mentioned it yet, but he has created a patch that is waiting for code review: https://gerrit.wikimedia.org/r/#/c/416370/

Change 416370 merged by jenkins-bot:
[pywikibot/core@master] mysql.py: add PyMySql as pure-Python MySQL client library

https://gerrit.wikimedia.org/r/416370

Dvorapa closed this task as Resolved. Edited Jun 16 2018, 7:53 PM
Dvorapa claimed this task.
Dvorapa removed a project: TestMe.

The patch was merged. I've tested it on Toollabs (Debian/Ubuntu) and also on my local machine (Arch Linux) using an SQL dump, and it works for me as expected on both Python 2 and 3. I'm closing this for now, but please test it yourself too (especially on Windows) and reopen this task or open a new one if any problem occurs.

Change 495739 had a related patch set uploaded (by Hashar; owner: Hashar):
[pywikibot/core@master] mysql: remove traces of 'oursql' dependency

https://gerrit.wikimedia.org/r/495739

Change 495739 merged by jenkins-bot:
[pywikibot/core@master] mysql: remove traces of 'oursql' dependency

https://gerrit.wikimedia.org/r/495739