
Automate process to get article summaries
Closed, Declined (Public)

Description

To populate the Wiki-highlights microsite, we need article section summaries for about 30 articles, with at least 5 sections per article. The goal of this task is to find a way to automate that process, possibly with the Interrogator tool (https://gitlab.wikimedia.org/repos/machine-learning/chatgpt-interrogator).

If an LLM is used to obtain the summaries, then based on the manual process tried so far, the available input is a prompt together with the article text, and the desired output is the summarized text itself.
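As a rough illustration of the prompt-plus-article-text input described above, here is a minimal sketch that pairs one instruction with each article section and writes the result as CSV. The instruction text, the column names (`article`, `section`, `prompt`), and the helper names are assumptions made for illustration; the format the Interrogator tool actually expects for its prompt file is not stated in this task and should be checked against the repo.

```python
import csv
import io

# Hypothetical instruction; the prompt used in the manual process is not given in this task.
INSTRUCTION = "Summarize the following Wikipedia article section in two sentences."


def build_prompt(instruction: str, section_text: str) -> str:
    """Combine the instruction and one section's text into a single prompt string."""
    return f"{instruction}\n\n{section_text.strip()}"


def build_rows(articles: dict, instruction: str = INSTRUCTION) -> list:
    """Return one row per (article, section) pair.

    `articles` maps an article title to a dict of {section heading: section text}.
    """
    rows = []
    for title, sections in articles.items():
        for heading, text in sections.items():
            rows.append(
                {
                    "article": title,
                    "section": heading,
                    "prompt": build_prompt(instruction, text),
                }
            )
    return rows


def rows_to_csv(rows: list) -> str:
    """Serialize the rows as CSV text (column names are an assumption)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["article", "section", "prompt"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

With about 30 articles and at least 5 sections each, this would yield 150 or more rows, one per summary to request.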

Related Objects

Status  Subtype  Assigned  Task
Open  None
Declined  None

Event Timeline

eamedina triaged this task as Medium priority. Aug 24 2023, 5:58 PM

In an initial attempt at testing the Interrogator tool for this purpose, I ran into a version error: the ChromeDriver in use only supports Chrome 114, while the installed browser is version 116. It looks like the selenium version needs to be upgraded: https://stackoverflow.com/questions/76913935/selenium-common-exceptions-sessionnotcreatedexception-this-version-of-chromedri


eamedina@wmf3285 chatgpt-interrogator % poetry run prompter --prompt-file /Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/autochatgpt/prompts.csv
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/autochatgpt/main.py", line 22, in send_prompts
    bot = ChatGPTBot(headless=False, account_type=account_type)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/autochatgpt/chatgptbot.py", line 20, in __init__
    self.driver = self.set_driver(headless, self.implicitly_wait_time)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/autochatgpt/chatgptbot.py", line 34, in set_driver
    driver = uc.Chrome(options=options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 466, in __init__
    super(Chrome, self).__init__(
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/chromium/webdriver.py", line 56, in __init__
    super().__init__(
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 206, in __init__
    self.start_session(capabilities)
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 724, in start_session
    super(selenium.webdriver.chrome.webdriver.WebDriver, self).start_session(
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 290, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 345, in execute
    self.error_handler.check_response(response)
  File "/Users/eamedina/internet/wikimedia/machine-learning/chatgpt-interrogator/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot connect to chrome at 127.0.0.1:63578
from session not created: This version of ChromeDriver only supports Chrome version 114
Current browser version is 116.0.5845.110
Stacktrace:
0   undetected_chromedriver             0x00000001008646b8 undetected_chromedriver + 4937400
1   undetected_chromedriver             0x000000010085bb73 undetected_chromedriver + 4901747
2   undetected_chromedriver             0x0000000100419616 undetected_chromedriver + 435734
3   undetected_chromedriver             0x000000010044bd10 undetected_chromedriver + 642320
4   undetected_chromedriver             0x0000000100442f98 undetected_chromedriver + 606104
5   undetected_chromedriver             0x000000010048aa08 undetected_chromedriver + 899592
6   undetected_chromedriver             0x0000000100489ebf undetected_chromedriver + 896703
7   undetected_chromedriver             0x0000000100480de3 undetected_chromedriver + 859619
8   undetected_chromedriver             0x000000010044ed7f undetected_chromedriver + 654719
9   undetected_chromedriver             0x00000001004500de undetected_chromedriver + 659678
10  undetected_chromedriver             0x00000001008202ad undetected_chromedriver + 4657837
11  undetected_chromedriver             0x0000000100825130 undetected_chromedriver + 4677936
12  undetected_chromedriver             0x000000010082bdef undetected_chromedriver + 4705775
13  undetected_chromedriver             0x000000010082605a undetected_chromedriver + 4681818
14  undetected_chromedriver             0x00000001007f892c undetected_chromedriver + 4495660
15  undetected_chromedriver             0x0000000100843838 undetected_chromedriver + 4802616
16  undetected_chromedriver             0x00000001008439b7 undetected_chromedriver + 4802999
17  undetected_chromedriver             0x000000010085499f undetected_chromedriver + 4872607
18  libsystem_pthread.dylib             0x00007ff8071bb1d3 _pthread_start + 125
19  libsystem_pthread.dylib             0x00007ff8071b6bd3 thread_start + 15
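
The key information in the error above is the pair of version numbers: the driver supports Chrome 114 while the browser is 116. A small sketch like the following could pull both major versions out of the exception message so the mismatch is reported clearly. The function name is made up for illustration, and the patterns only match the message wording shown in this traceback.

```python
import re


def parse_version_mismatch(message: str):
    """Return (supported_major, browser_major) from a ChromeDriver error message.

    Returns None if the message does not contain both version lines.
    The patterns assume the wording seen in the traceback above.
    """
    supported = re.search(r"only supports Chrome version (\d+)", message)
    current = re.search(r"Current browser version is (\d+)", message)
    if not (supported and current):
        return None
    return int(supported.group(1)), int(current.group(1))
```

One possible workaround, untested against this repo: as far as I know, `uc.Chrome` in undetected_chromedriver accepts a `version_main` argument, so passing the browser's major version there (116 in this case) in `set_driver` might make it fetch a matching driver.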

Is this task still relevant in the context of the current microsite and experiment?