Inquiry into possible massive copyright infringement on the Esperanto Wiktionary
Open, Needs Triage, Public

Description

@Robin_van_der_Vliet reported in a discussion in the Kompetuko channel on Telegram that the Esperanto version of Wiktionary is full of material illegally imported from the Plena Ilustrita Vortaro de Esperanto (PIV) and Reta Vortaro (ReVo).

Here is the original statement (translated from Esperanto):

But personally I do not really recommend it: the quality of that project is truly bad, and it now also contains many articles illegally copied from PIV/ReVo. I myself have already given up on it completely; I think it is better to focus on improving ReVo.

If that proves correct, some action needs to be taken, such as:

  • removing the corresponding material
  • evaluating whether the situation can be regularized by asking the rights holders to release the original works under a compatible license, then updating the affected pages to give proper credit

In any case, first steps can be:

  • list affected material of suspected infringement
  • involve the Esperanto Wiktionary community to
    • see what can be done together
    • assess the actual state of the situation
    • see what we should do if any action does need to be taken

Indeed, so far this issue rests only on the suspected copyright infringement reported by Robin. But it should also be taken into account that some material that ReVo used in its initial release (1997) is explicitly taken from Plena Vortaro (PV), which was already in the public domain by then. ReVo itself is under the GPL.

PIV also derives from PV, with a first release in 1970. It remains under full copyright for now.

So prior to any other step, we should probably give a list of at least some examples of copyright infringements, as opposed to something that was legally taken from PV and also appears in PIV and ReVo because they have this common ancestor.

That said, even if there were no copyright infringement and all copied material came from the public-domain PV, the work should be properly credited on each page using it. This is not only to comply with the law, as PV was published in France, where the *droit d'auteur* includes a right of attribution with no time limit, even for public domain works. The wish to give credit to authors and to let our audience trace information sources should be reason enough to make that happen, as it all comes down to respect for human dignity, which is hopefully still a core value of our community.

Event Timeline

Novaĵoj kaj malnovaĵoj pri PIV, Bertilo kaj kopirajto (16 jan. 2021 by @Taylor) is also related to this topic.

In a nutshell, apart from a reminder that we must always respect copyright, it states that:

  • Sennacieca Asocio Tutmonda is the current copyright owner of PIV
  • Bertilo Wennergren joined PIV's editorial team and is trying to convince his teammates to publish PIV under a license compatible with Wiktionary.
  • PIV is nevertheless still under strict copyright for now.

Also check the log of this page:
https://eo.wiktionary.org/w/index.php?title=Speciala%C4%B5o:Protokolo&page=PIRATA%C4%B4O

When an admin discovered a copyright problem, they moved it to that page and subsequently deleted the page.

Removing All-and-every-Wiktionary as this is not about all Wiktionaries; please see its name and description

Thanks @Robin_van_der_Vliet for the feedback. It seems that @Taylor took the expected actions on these pages, and that seems good to me (not as good as if we had free material to share on the topic instead, of course).

I think PIV has about 45k entries (word forms, not definitions). Checking all of them manually might be a huge amount of work. Should we try to create a bot for that? That is, for each Esperanto word in Vikivortaro, measure how closely its definitional material matches the material in PIV. That would make it possible to generate reports periodically and show administrators where to focus their attention.
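To make the bot idea more concrete, here is a minimal sketch of the comparison step. It assumes both dictionaries are already available as plain mappings from headword to definition text; how to obtain them is left open. All names (`normalize`, `similarity`, `build_report`), the sample entries, and the 0.8 threshold are hypothetical, and `difflib.SequenceMatcher` is just one readily available similarity measure, not a settled choice.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits do not hide a copy."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the normalized texts are identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def build_report(wiki_entries: dict[str, str],
                 reference_entries: dict[str, str],
                 threshold: float = 0.8) -> list[tuple[str, float]]:
    """List headwords whose definition closely matches the reference text.

    The threshold is an arbitrary starting point (assumption); it would
    need tuning against pages already known to be copied.
    """
    flagged = []
    for word, wiki_text in wiki_entries.items():
        ref_text = reference_entries.get(word)
        if ref_text is None:
            continue  # headword absent from the reference dictionary
        score = similarity(wiki_text, ref_text)
        if score >= threshold:
            flagged.append((word, round(score, 3)))
    # Highest similarity first, so the likeliest copies are reviewed first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up placeholder entries, not real dictionary content.
wiki = {
    "hundo": "Domestic mammal kept as a pet or for guarding.",
    "kato": "A small animal.",
}
reference = {
    "hundo": "Domestic mammal  kept as a pet or for guarding.",
    "kato": "Feline commonly living alongside humans in houses.",
}
report = build_report(wiki, reference)
for word, score in report:
    print(f"{word}\t{score}")
```

A high score only marks a page for human review. Material that comes from the public-domain PV would score high as well, so each flagged page would still need to be checked against PV before any deletion.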

Removing All-and-every-Wiktionary as this is not about all Wiktionaries; please see its name and description

Yes, but since we have never a All-and-every-Wiktionary or #Wiktionary-eo tag, it was the closest I found pertaining to this topic. 🤔

I am the malicious sysop deleting articles all the time. There is a large number of such articles, and there are several ways to search for them. It will take years to get rid of them all, as I also want to add useful content and improve the quality of the project rather than only delete.

I expect the following action from WMF: make clearer and more explicit rules about "importing", particularly from online dictionaries. People do not know what is legal and what isn't.

There are other (smaller) wiktionaries suffering from the very same problem.

I expect the following action from WMF

Could you elaborate how "WMF" is involved here and why?

Removing All-and-every-Wiktionary as this is not about all Wiktionaries

Not about all, but about several. -eo- is not the only one affected.

Could you elaborate how "WMF" is involved here and why?

Isn't it obvious? Whose TOU is allegedly being violated? The small text under the edit box links to "https://foundation.wikimedia.org/wiki/Terms_of_Use/en", not to this "phabricator".

If this problem is real, then it is a legal bug, not a technical bug, and it's questionable whether it belongs here at all.

User "Robin_van_der_Vliet" (member of the editorial team of PIV) could do more to get this issue sorted out, namely by re-adding (with help from Bertilo) a suitable copyright notice to the relevant page " https://vortaro.net/klarigoj.html ". Harbouring a strong private persuasion that "copyright is ON by default and no notice is needed" will not change the minds of people who do not know this fact or do not give a f**k about it.

The current TOU is NOT sufficient. We can read there:

Lawful Behavior – You do not violate copyright or other laws.

...

  1. Refraining from Certain Activities

Infringing copyrights, trademarks, patents, or other proprietary rights under applicable law.

...

Ban a user from editing or contributing or block a user's account or access for
actions violating these Terms of Use, including repeat copyright infringement;

Unfortunately, we have several contributors (many of them sysops or bureaucrats) involved in copying large amounts of content from copyrighted online dictionaries to wiktionaries. Most likely they are not aware that this is copyright infringement. Some of them justify it with "free for educational use".

You won't make many friends in small "communities" by deleting or proposing for deletion hundreds of articles that have been there for years, were introduced by several "valuable" contributors (many of them sysops or bureaucrats), and have been edited many times by various users since the creation or introduction of the illegal content.

A clear statement from WMF about copying from copyrighted online dictionaries to wiktionaries is needed, in addition to mass deletions on this wiktionary and several other wiktionaries; otherwise people will NOT understand what the problem with their "contributions" is.

Now that someone (not me) has opened this issue here (maybe " https://meta.wikimedia.org/wiki/Requests_for_comment " would be a better place), it must finally be resolved, for all wiktionaries affected. There should be no "probably illegal" or "possibly illegal" content on public WMF wikis.

PV can be downloaded here: " https://app.box.com/v/PlenaVortaro "

Isn't it obvious?

No, that's why I asked. See https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Legal_department in case you would like to make them aware.