[Session] Generative AI: how will it change the landscape for Wikimedia projects
Closed, Resolved · Public

Assigned To

Authored By

	Tohaomg
	Apr 16 2024, 9:36 AM

Description

Title of session: Generative AI: how will it change the landscape for Wikimedia projects
Session description: The current rise of generative AI will significantly change the technological landscape and the place of Wikimedia projects in it. Wikimedia should be prepared for the changes generative AI will bring. At the very least, we need a vision of what to do. There are a lot of questions we have to discuss: how will we detect misinformation, to what extent could AI be used to write and illustrate articles, could chatbots (in particular those developed by the search engine companies) completely replace encyclopedias, could AI break how SEO currently works and push Wikipedia out of search engine results, etc.
Username for contact: @Tohaomg
Session duration: 50 min
Session type: discussion
Language of session: English
Prerequisites: none

Notes from session

Generative AI: how will it change the landscape for Wikimedia projects

Date and time: May 4, 2024, 14:00

Relevant links

Phabricator task: https://phabricator.wikimedia.org/T362624
Session slides: no slides

Presenter

@Tohaomg

Notes

Generative AI is not something the wiki movement is prepared to manage. There are a lot of challenges, and it requires a lot of knowledge to know how to address them.
Today you can easily fake a voice, photo or video.
Examples: Donald Trump; in Ukraine, Russians spread AI-generated videos of President Zelensky "giving up".
It requires effort to determine if something is true. Generative AI is being used by evil forces and has an influence on wiki projects.
Q: When I think of Wikimedia and AI, I always think about the challenges people have reading Wikipedia and whether they can get answers. They are more comfortable with interacting. We use it to edit and correct, but I hadn't thought much about it generating false sources and making it possible to add references to statements. I wasn't concerned, but I am now :)
The competition is a concern. Google and Apple are developing their own AIs that could give answers to some questions, and instead of going to Wikipedia people will go to those search engines and never reach Wikipedia, and we will become obsolete like a paper encyclopedia. It is much easier to take out your smartphone and read Wikipedia, and I fear we may become obsolete in the very same way. Also, from my personal POV, in Ukrainian Wikipedia we have a big problem with automatic translation that is not proofread. Many editors just take the English text, paste it into Google Translate, copy the result and paste it into Ukrainian Wikipedia without editing it. This translation is not perfect.
Q: How do you know people are doing that?
First of all, I saw someone doing it at a conference, sitting behind a laptop; second, they don't hide it. Don't see
Also happening with content translations - people don't read the info they're adding int
Q: Are they doing this with ContentTranslation? What is the motivation for doing this? __ we're chasing quality instead of numbers, always wanting new articles to be more and more prolific. AI can scale dramatically.

Q: There are a lot of people coming in and going out. Do we want to move to a bigger room?
<intermission as we move to ballroom 1>

So we are gathered here to talk about generative AI and how we should prepare for the new age, as when the Internet/electricity/steam engine became a thing...

The wiki community has a challenge to face: people can change something and no one will notice.
In some cases, such as on Chinese Wikipedia, there was a whole cluster of made-up articles, pure fantasy, that were discovered only years later. In Polish
<those hundreds of articles in Chinese were human generated>

[L] I was curious how Commons is handling AI-generated images right now. It's really interesting looking at deletion requests and seeing the associated discussion - is it realistic, is it historically accurate, is there imperialist bias. These are very valid things to be worried about, and we should also look at and explore how we can stop that. I'm a big user of DALL·E, but I like to generate diverse images with different skin colors, and there are very strange results when I do that. Last week, I got a very racist image of a woman of color when I typed in the prompt "women with brown skin relaxing reading a book on couch".
[M] The AI community loves our data. They use it for a lot of information, but they don't know how to contribute and expand human knowledge. The market has changed; how do we react to it? We can partner with organizations that are ethical and will give funds back to our work via donations. Our job is to get information out to the world. We have the most wide-ranging language and cultural information from across the world, and it's not being fed into these models (image, audio). If we contribute to these models they will become more diverse, and the open source AI communities are really apt for this (talked to 50 people at the last meeting). The problem is that they don't know what we have and aren't able to access it. We work with them to improve that. They didn't know about Wikidata, for example, only Wikipedia.
[host] In my country, Ukraine, there was one case where people wanted AI and one where they didn't. In the Kyiv metro, a voice actor announced station names and said "be careful, the train door is closing". He did this for 50 years and then died, and the city government had to replace him. They tried different voices, but city residents liked none of them, so they proposed to put his voice into an AI.
[host] Opposite case: the Ministry of Foreign Affairs of Ukraine said it had developed an AI spokeswoman and released videos of this woman commenting on recent events and foreign policy, but she was not a real person. People don't like her because psychologically they want a real person to reassure them. They don't believe the message as it's a heartless machine.
[Susana] Jose of Wikipedia knowledge - there was some discussion I overheard that some labelling/community practices for tagging conflict don't travel to these areas, so it can be treated differently. This understanding of its special nature would then be understood by the reusers of the knowledge. There is likely a lot of legislation to come. Right now it is a wild west/gold rush. We are in a good position to define those practices as we know about the whole thing.
[QD] there are special practices around indigenous knowledge.
[Susana] we have very few

We have research in Brazil about it

(Will send a link to telegram group or put it in this etherpad)

The discrimination in systems will be propagated. When I go to Commons, sometimes I am scared, as I am not sure what I am going to see: a lot of things are unfiltered, a bit too unfiltered for my taste - for example, pictures of violence. I do have situations where I wish content was arranged differently. There are ways we could ask volunteers to adjust/moderate differently. Think of using AI

For the people who are able to do these things, that time comes with privilege. It is mostly white men moderating. You could use generative AI to represent under-represented voices. It will take them less time to do it. Build more equity by equipping these people with tools to contribute.
[Host] You used to need a whole orchestra to replicate your idea, but now you can download a program on your notebook, press some keys and listen to how it sounds. Another of my concerns is that current AIs are trained on data which was generated by people: the sum of all knowledge available to people is used to train an AI. It's being reused again and again and again, and it is not capable of creating original ideas. It may be reusing the old ideas over and over again, and we will be stuck at one point in time forever. When I write to AI I write "please" so they will spare me in the war with the robots.

The only way this would happen is if humans were not creating content. It is important to provide the right training data.

[HOST] What if people become too lazy to generate their own data and rely only on AI?

At the core, the thing is that not everyone can do it. You need to have the knowledge.
Just curious whether anyone has typed into ChatGPT to see what it thinks about making Wikipedia obsolete. I did and got 2 answers: (1) Generative AI has the ability to contribute but is unlikely to make Wikipedia obsolete. It's a dynamic platform. It is unlikely to replace the human touch and editorial processes. (2) Lots of points about legal/ethical/human touch/interpretations. The question posed and the answers given follow:

Q posed to ChatGPT 3.5: Will generative AI make Wikipedia obsolete?
A1: Generative AI certainly has the potential to contribute to information generation and synthesis. However, it's unlikely to render Wikipedia obsolete. Wikipedia is more than just a collection of information; it's a dynamic platform curated by volunteers worldwide, ensuring accuracy, neutrality, and comprehensiveness. Generative AI might assist in creating content, but it's unlikely to replace the human touch required for maintaining Wikipedia's standards and navigating its complex editorial processes.
A2: Generative AI has the potential to create content, but it's unlikely to make Wikipedia obsolete for several reasons:
Quality and Reliability: Wikipedia is known for its rigorous editorial standards and community-driven oversight. Generative AI, while capable of generating text, may not consistently produce accurate, unbiased, or well-researched content that meets Wikipedia's standards.
Contextual Understanding: Generative AI lacks true comprehension of context, nuance, and real-world knowledge. Wikipedia's articles are written by humans who understand the topics deeply and can connect related information in meaningful ways.
Dynamic and Living Community: Wikipedia is a dynamic platform with a large community of editors continuously updating and improving content. Generative AI, on the other hand, creates static outputs based on existing patterns and data.
Legal and Ethical Considerations: The use of generative AI for content creation raises concerns about plagiarism, copyright infringement, and ethical use of information. Wikipedia's content is mostly generated by individuals who adhere to legal and ethical guidelines.
Human Touch and Interpretation: Wikipedia articles often reflect the diversity of human perspectives and interpretations. Generative AI lacks this human touch and may struggle to convey complex ideas accurately.
In summary, while generative AI may automate certain aspects of content creation, it's unlikely to replace the multifaceted contributions and quality control mechanisms that characterize Wikipedia. Instead, AI can complement Wikipedia by assisting human editors in tasks like fact-checking, language translation, or identifying content gaps.

Thanks for bringing that up! ;) It's a good example of collaborating. We have also been hearing several pessimistic comments, so I am driven to bring this in a more positive direction. If we stop creating, we are lost anyway.
It's very human to create.
I like the idea of using generative AI to make moderation better, because I think many of the problems we have mentioned (lack of neutrality, sources, racism) are problems we already have. The problem with AI is that it increases the volume and makes these issues worse, as it's faster to make those same mistakes. One way to attack this is to use AI for moderation as well as for creation and quality of articles.
It's not about the fact we're going to stop doing this thing. There is nothing we do as a people. We had a discussion with developers yesterday that when you are using AI you are doing it to support your creation. e.g. when you want to get started or be given a push when stuck on an idea rather than blatantly . Most of us not doing that. I'm evolving it. When history creates things, we use it to build on top of it. e.g. what comes next, we just don't have enough practice. There's a middle ground before here and there.
the best thing ever is the +1 - perfectly happy with that. humans should be doing that.
I think to achieve the goal of making Wikipedia - we need to add more interactivity. Maybe AI is needed to control that. We are too slow.
A very good point - working on small wikis - are we covering the right topics? Are there others we should be doing? We need feedback on consumption of articles. People search for things and either get a response or not - it would be good to get info on whether they found what they were looking for.
Are we ready to ?? What better could we do so we can do more than ChatGPT

[HOST] I gave a speech last year in the UK, with a lot of people in the audience with different interests. I started explaining about an AI bot and concluded with - we're all technical folks and understand this. But then when talking with Wikipedians - they understood the first ten minutes and then were lost. They couldn't understand. Make an effort to explain AI to your local communities so that they can understand it and use it in a wise way.

Tedious formatting of Wikipedia articles: it can edit a lot of text, but not sure about lists or tables. In our group, we use ChatGPT to generate information on artists in events - to determine which of those people have an article and which do not (and need to have one written).

[AA] What did you mean by interactive?
[BB] Ask questions to understand information.
[Host] Come to Helsinki to hear more about this subject :)
[Susana] Event is fully booked. Let me pitch what it's about. Intersection of GLAM and use of AI. What do institutions want from this environment and how does AI influence this. Lightning talks, workshops, hack day.

Questions

Are there already AI cleanup templates?
https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AI_Cleanup

Photos

Social

Event Timeline

Tohaomg created this task. Apr 16 2024, 9:36 AM

Restricted Application added a subscriber: Aklapper. Apr 16 2024, 9:36 AM

Hello! 👋 The 2024 Hackathon Program is now open for scheduling! If you are still interested in organizing a session, you can claim a slot on a first-come, first-serve basis by adding your session to the daily program, following these instructions. We look forward to hearing your presentation!

roti_WMDE subscribed. Apr 17 2024, 7:41 AM

MGerlach subscribed. Apr 17 2024, 8:16 AM

Maryann-Onyinye moved this task from Proposed sessions to Scheduled Sessions on the Wikimedia-Hackathon-2024 board. Apr 17 2024, 10:27 AM

debt triaged this task as Medium priority. Apr 17 2024, 7:25 PM

Lofhi subscribed. Apr 25 2024, 1:00 AM

AnnWF subscribed. Apr 25 2024, 1:31 AM

Hi @tahaomg looking forward to seeing you at hackathon! I noticed in the schedule this is scheduled for 30 minutes but in the task description here it says 50 mins. Which is correct?

In T362624#9755367, @Jdlrobson wrote:

Hi @tahaomg looking forward to seeing you at hackathon! I noticed in the schedule this is scheduled for 30 minutes but in the task description here it says 50 mins. Which is correct?

I have edited the timetable so that this session takes two neighbouring slots with total duration of 1 hour.

SocialKnowledge subscribed. May 1 2024, 7:13 PM

Ferdi2005 subscribed. May 3 2024, 7:01 AM

debt updated the task description. May 4 2024, 2:31 PM

Restricted Application added subscribers: Stang, Base. May 4 2024, 2:31 PM

debt closed this task as Resolved. May 13 2024, 7:39 PM

Stang unsubscribed. Oct 24 2024, 3:35 AM
