AI-generated content is reaching every corner. Amazon saw an influx of books written with ChatGPT; the media have seen it in scientific articles, videos, music, images, and photographs. Generative AI has reached everywhere. Including Wikipedia.
And there is a problem. One the platform is already addressing through a group of volunteers dedicated to finding and removing this content. Its name: WikiProject AI Cleanup.
How Wikipedia works. It is important to remember that Wikipedia is open: anyone can write and edit on it. This has an upside: if I know something or have experience in a subject, I can enrich the encyclopedia with my contributions, whether by adding new material, clarifying existing content, or correcting errors. The downside is that... anyone can edit Wikipedia and publish whatever they want. Add a tool like ChatGPT to the mix and the problem multiplies.
AI is flooding everything. As Ilyas Lebleu, founder of the WikiProject AI Cleanup initiative, explains, it all started when they noticed "the prevalence of unnatural writing that showed clear signs of being AI-generated." They were able to replicate similar styles using ChatGPT, which made the origin obvious.
404 Media highlights a very good example: the Ottoman fortress of Amberlisihar, supposedly built in 1466. Its Wikipedia page ran to some 2,000 words: history, construction, materials... everything you would expect. The thing is, the fort does not exist. It is fake, the product of an AI hallucination. The article was published in January 2023 and was not removed until December.
The same goes for photos. In the article on Darul Uloom Deoband, an image was published that at first glance could pass for a period photograph. However, you only have to look at the hands (and pay a little attention to detail) to see that it was generated with AI. It was removed because "the image adds little to the article, could be confused with contemporary artwork, and is anatomically incorrect." It should be noted that not all AI-generated images are removed, only the inappropriate ones.
Volunteers vs. AI. WikiProject AI Cleanup is "a collaborative effort to combat the growing problem of unsourced and poorly written AI-generated content on Wikipedia." Anyone can sign up and participate. The goal is not to restrict or ban the use of AI, but "to verify that its output is acceptable and constructive, and to fix or remove it if not."
This is not an easy task. Because if LLMs are good at anything, it is passing off their creations as legitimate text. But there are tells. Phrases like "as an AI language model", ultra-generic descriptions ("a town known for its fertile fields"), or an overly promotional or positive tone are signs of the AI behind it.
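As a toy illustration (this is not the project's actual tooling, and the phrase list is a hypothetical sample), a first-pass scan for such telltale wording might look like this:

```python
# Toy heuristic, NOT WikiProject AI Cleanup's real method: flag text that
# contains boilerplate phrases LLMs are known to emit. The list below is
# illustrative, not exhaustive.
TELLTALE_PHRASES = [
    "as an ai language model",
    "as of my last knowledge update",
    "known for its fertile fields",  # example of an ultra-generic description
]

def looks_ai_generated(text: str) -> bool:
    """Return True if the text contains any known telltale phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in TELLTALE_PHRASES)
```

A heuristic like this only catches the clumsiest cases; as the article goes on to note, fabricated sources require actual human verification.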
On the other hand, you might think that detecting this type of content is as simple as checking whether it cites sources, but AI can hallucinate those too. As the group explains on its Wikipedia page, AI can invent sources or cite real, existing ones that are completely off topic.
One article on Leninist historiography, clearly written by AI, cited Russian and Hungarian sources that seemed genuine but were not. It was deleted. In another article, on the beetle Estola albosignata, real French and German sources were cited, but at no point did they mention the beetle. That article has since been corrected.
The challenge of AI. The use of AI is not bad in itself, but it creates a challenge of trust. If Wikipedia allowed AI-generated content to run wild, its content would cease to be reliable. AIs hallucinate; they invent information. Even if everything reads sensibly, in correct language, the facts, dates, names, or events described may not be accurate.
And this is not only Wikipedia's problem: there is a risk of false, inaccurate, or fake information flooding the Internet. If false information makes it into Wikipedia, one of the great sources of training data for LLMs, then LLMs may be trained on it and generate yet more false content, and so on ad infinitum. That is why the work of these volunteers is so important.
Cover image | Xataka
In Xataka | Download Wikipedia: how to read articles or ALL of Wikipedia offline