
[WikiGPT] Improve search results of WikiGPT
Closed, Resolved (Public)

Description

At the moment, the corpus fed into ChatGPT includes one article. We want to expand it to include at least 3-4 articles.

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Improve search by adding infobox | toolforge-repos/wiki-gpt!12 | isaranto | T329016-improve-search | main
add hyperlinks in results | toolforge-repos/wiki-gpt!5 | isaranto | add-hyperlinks | main
remove irrelevant links from answer | toolforge-repos/wiki-gpt!4 | isaranto | remove-irrelevant-links | main

Event Timeline

isarantopoulos renamed this task from Add more articles to he corpus (instead of 1) to Add more articles to the corpus (instead of 1).
isarantopoulos renamed this task from Add more articles to the corpus (instead of 1) to [WikiGPT] Add more articles to the corpus (instead of 1).
isarantopoulos moved this task from Unsorted to In Progress on the Machine-Learning-Team board.

I fed the first 3 articles returned from a Wikipedia search into our prompt. However, this seems to cause some confusion, as some of the articles may be irrelevant.

Screenshot 2023-02-03 at 6.02.32 PM.png (174×1 px, 123 KB)
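
The retrieval step described above could look roughly like the sketch below. It is not taken from the WikiGPT code: it assumes the search goes through the MediaWiki Action API (`list=search`) on English Wikipedia, and the function names, the `limit` default, and the User-Agent string are all made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: English Wikipedia; the task does not say which wiki is searched.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(query: str, limit: int = 3) -> str:
    """Build an Action API URL asking for the top `limit` search hits."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def titles_from_response(payload: dict, limit: int = 3) -> list[str]:
    """Extract up to `limit` article titles from a list=search response."""
    hits = payload.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits[:limit]]


def search_titles(query: str, limit: int = 3) -> list[str]:
    """Run the search over HTTP (needs network access)."""
    request = Request(
        build_search_url(query, limit),
        # Placeholder User-Agent; replace with a real contact string.
        headers={"User-Agent": "wikigpt-example/0.1 (placeholder)"},
    )
    with urlopen(request) as response:
        return titles_from_response(json.load(response), limit)
```

Raising `limit` (for example to 10) only changes how many hits are requested; whether the extra articles are relevant still depends on the search ranking.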

To deal with the irrelevant articles, we changed the prompt by adding this instruction:

Treat each paragraph as a separate piece of information and use it for your response only if it is relevant to the question.
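
As a rough sketch of how this instruction could be combined with several articles, the snippet below puts each article into its own paragraph and appends the instruction before the question. The layout of the prompt, the `Question:`/`Answer:` labels, and the function name are assumptions, not the actual WikiGPT prompt.

```python
# Instruction text quoted from this task.
RELEVANCE_INSTRUCTION = (
    "Treat each paragraph as a separate piece of information and use it "
    "for your response only if it is relevant to the question."
)


def build_prompt(question: str, articles: list[str]) -> str:
    """Place each article in its own paragraph, then add instruction and question."""
    # Collapse internal whitespace so one article is exactly one paragraph,
    # and skip articles that are empty.
    paragraphs = [" ".join(text.split()) for text in articles if text.strip()]
    context = "\n\n".join(paragraphs)
    return f"{context}\n\n{RELEVANCE_INSTRUCTION}\n\nQuestion: {question}\nAnswer:"
```

Keeping one article per paragraph matters here because the instruction asks the model to judge relevance paragraph by paragraph.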

Adding this seemed to solve the problem, as we also got some responses related to Call of Duty (a video game set in WWII).

Screenshot 2023-02-06 at 7.37.12 PM.png (614×2 px, 465 KB)

There was also a miss, as ChatGPT makes assumptions such as the following:

Screenshot 2023-02-06 at 7.45.57 PM.png (616×2 px, 538 KB)

Because Elon Musk stated that he will step down as CEO (some time in the future), the model assumes that Jack Dorsey is the CEO of Twitter, as the article about him was returned in the search.
To tackle this, we also added the following instruction to our prompt:

If the question is not explicitly answered in the text, don't assume an answer and say that you don't know.

However, this was only solved when we combined it with more information (more articles, e.g. 10).

Screenshot 2023-02-06 at 7.40.04 PM.png (670×2 px, 564 KB)

Below we paste some other examples that yield interesting/nice results:

Screenshot 2023-02-06 at 7.53.58 PM.png (826×2 px, 910 KB)

Screenshot 2023-02-06 at 7.54.40 PM.png (868×2 px, 796 KB)

Screenshot 2023-02-06 at 7.57.27 PM.png (800×2 px, 760 KB)

Screenshot 2023-02-06 at 7.57.58 PM.png (742×2 px, 621 KB)

Screenshot 2023-02-06 at 8.21.48 PM.png (868×2 px, 855 KB)

isarantopoulos renamed this task from [WikiGPT] Add more articles to the corpus (instead of 1) to [WikiGPT] Improve search results of WikiGPT. (Feb 8 2023, 10:32 AM)

I figured out how to not have irrelevant links and text in the answer.
Initially I got this when asking "Where can I buy ski goggles in Jordan?"

Screenshot 2023-02-08 at 12.37.25 PM.png (590×2 px, 526 KB)

After tweaking the prompt with this:

Answer the following question and add these links in brackets **only if they are relevant to the question** {links}:

and this:

If the question is not explicitly answered in the text, don't assume an answer and respond only with the following message
"I am sorry, but I am unable to answer this question. I can only answer questions that are answered inside the content of Wikipedia"
without providing additional information and links.

we got back these results:

Screenshot 2023-02-08 at 12.37.31 PM.png (360×1 px, 57 KB)
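
The two prompt tweaks above could be combined into one template along the following lines. This is only a sketch: the order of the pieces, the `{context}` and `{question}` placeholders, and the helper names are assumptions, and only the quoted instruction texts and the `{links}` placeholder come from this task.

```python
# Fixed refusal message quoted from this task.
FALLBACK_MESSAGE = (
    "I am sorry, but I am unable to answer this question. I can only answer "
    "questions that are answered inside the content of Wikipedia"
)

# Assumed layout: article text first, then the two instructions, then the question.
PROMPT_TEMPLATE = (
    "{context}\n\n"
    "If the question is not explicitly answered in the text, don't assume an "
    "answer and respond only with the following message\n"
    '"{fallback}"\n'
    "without providing additional information and links.\n\n"
    "Answer the following question and add these links in brackets "
    "**only if they are relevant to the question** {links}:\n"
    "{question}"
)


def render_prompt(question: str, context: str, links: list[str]) -> str:
    """Fill the template with the article text, candidate links and question."""
    return PROMPT_TEMPLATE.format(
        context=context,
        fallback=FALLBACK_MESSAGE,
        links=", ".join(links),
        question=question,
    )


def is_fallback(answer: str) -> bool:
    """Detect the fixed refusal so callers can hide links for such answers."""
    return FALLBACK_MESSAGE in answer
```

Because the refusal is a fixed string, a caller can check for it with `is_fallback` instead of trying to guess from the answer whether the model found the information in the articles.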