Chapter 21: AI and the future of wikis
People have been using artificial intelligence for decades: the first AI-based therapist, ELIZA, was released in 1966, and the first online machine translator, at babelfish.altavista.com, was launched in 1997. And people have been talking about artificial intelligence for much longer than that: a 16th-century Prague rabbi, Judah Loew ben Bezalel, is said to have created a sentient creature, known as a golem, out of mud; the golem could be awakened or deactivated by placing the right slips of Hebrew wording in its mouth, a sort of proto-programming. But the current AI boom – the one that has everyone thinking about their job prospects, and nervous parents nudging little Johnny to perhaps consider becoming a plumber – has to date from November 30, 2022. That was the day OpenAI released ChatGPT. People had seen chatbots before – remember ELIZA back in 1966 – but this was different: it could converse on seemingly any subject, and do creative tasks like writing poetry. It felt... human. After two months, it had gained 100 million users; it was the most quickly adopted technology in history.
More than three years later, the capabilities of AI wildly surpass what ChatGPT could do at launch. AI can write full essays and short stories, create (and improve) entire software applications, and generate video that can fool the average person. Instead of serving simply as a chatbot, AI can run as autonomous agents that carry out tasks (and can themselves direct other agents). It’s presumably just a matter of a few years’ time before we see the first fully AI-generated movie -- conceived of, written and generated by a computer. After that -- who knows? Maybe the first company in which all executive decisions are made by AI. (Or perhaps this has been done already.)
An obvious question is, what does this mean for humanity? That question, while important, is outside the scope of this book, but here we can ask a more specific, and relevant, one: what does it mean for MediaWiki?
AI and software
There is a revolution happening in the software industry: a shorthand for the trend is that, much as in 2011 it was declared that "software is eating the world", today it’s said that "AI is eating software". At the end of 2025, one estimate held that 42% of newly-written code was AI-generated; that amount is sure to keep increasing.
In 2025, the term "vibecoding" was coined to describe programming via AI, but that term may already be on the wane, as AI assistance becomes simply the norm for software development -- soon, presumably, it will just be "coding", with some other term, perhaps "hand-coding", used as a retronym for the kind of laborious programming that people did in the old days.
A MediaWiki extension released in late 2025, Layers, feels like a good encapsulation of the new age we live in. It is a quite useful extension that provides a long-desired feature: the ability to annotate images by applying lines, shapes and text over them (sadly it’s not mentioned elsewhere in this book, but it is worth trying out). The graphical editing interface is robust, featuring a palette of the sort you’d expect to see if you’re familiar with tools like Microsoft PowerPoint, or Inkscape. But the truly amazing thing about the Layers extension is that every part of the code was AI-generated. This includes the graphical editor, which does not use an existing open-source JavaScript library but rather is entirely original to the extension. Its developer, who is an engineer but not a software developer, created it using a combination of GitHub Copilot, Claude Opus, and Gemini.
What does all of this mean for MediaWiki development? For core MediaWiki, not necessarily anything. Core MediaWiki’s developers have so far been quite resistant to AI-generated code; and even if they change their minds, the software is rather mature at this point. (Barring, say, a port of the code to Node.js, done as a weekend project.) For extensions and skins, at the risk of sounding utopian, it may mean that we are entering an era of much faster development, of debugging sprints that resolve in hours bugs that might otherwise have languished for years, and ultimately of functionality that remains usable instead of being abandoned.
AI and Wikipedia
A mostly separate question is what AI means for the content that goes inside of wikis. Let’s look at the example of Wikipedia -- which is not only the world’s most important wiki (technically, it’s a collection of wikis, one for each language), but is also, directly and indirectly, responsible for much of the usage and development of MediaWiki. Wikipedia is both a microcosm of the promises and challenges that AI holds for all wikis, and very much a major question on its own.
It’s hard to predict the future, but let’s look at some potential futures for Wikipedia.
Pure AI for encyclopedias
Do we really need Wikipedia at all? That would have been considered a bizarre question even five years ago, but today the answer is less clear. There is already one AI-based online encyclopedia, Grokipedia (grokipedia.com), which launched in 2025; tomorrow there could be ten, with potentially each AI company, each search engine, and maybe even political organizations and entire governments offering their own alternative. Grokipedia itself feels like a proof-of-concept at this point, with few images, and availability only in English. These are solvable technical issues: for example, translating the contents of any particular AI-generated encyclopedia into dozens of languages seems simple enough, for Grokipedia or any other "vendor". (Covering over 300 languages, as Wikipedia does, may be out of reach at the moment.) For AI companies, it would be an advertisement for the strength of their LLM. For search engines, it would be an extension of what they already do: if you phrase your Google search in the form of a question, for example, there is a good chance that an "AI Overview" will appear that is not entirely different from a short encyclopedia article, complete with both external and internal links. And for governments and activist organizations, it could be a way to promote their worldview on controversial topics. (Some countries have already funded or at least promoted their own non-AI Wikipedia alternatives, although for now this has been the domain only of authoritarian countries like China, Russia and Cuba.)
Even assuming that Wikipedia always offers comprehensive and accurate information, there are still advantages that AI encyclopedias can offer. Grokipedia was explicitly created in order to offer a more right-wing slant than Wikipedia’s (its creator, Elon Musk, has referred to Wikipedia as "Wokepedia"). Other AI encyclopedias could be configured to output themselves in a variety of different ways: taking any other position on the political spectrum, for example, or making articles shorter or longer, or citing more or fewer questionable sources like YouTube videos.
In fact, it’s entirely possible that an AI encyclopedia in the future could offer this sort of customization directly to the user, with the user specifying a series of preferences, from language to political standpoint to preferred article length, and the contents of each article then being tailored to those preferences. To take a relatively anodyne example, on the English-language Wikipedia there is a sort of perpetual slow-motion war between those who want to keep and expand "In popular culture" sections, listing all the times the subject of an article has been referenced or depicted in movies, songs, etc., and those who want to remove or greatly reduce these listings on the grounds that they are frivolous. Both sides act on the assumption that there is an ideal set of popular culture references for each article – but what if there isn’t? What if some readers would like to read about all the times that, say, Charlemagne has been mentioned on TV shows, but others do not, and that’s how it will always remain? For that matter, what if the same person sometimes wants to see ancillary information like this, and sometimes not? An AI-based encyclopedia could easily offer that customization with a series of checkboxes and buttons: no more need to wage battles.
One analysis holds that more than 50% of the articles on Grokipedia are copies, in whole or in part, of the equivalent article on the English-language Wikipedia. Perhaps in the future, though, AI-generated encyclopedias will stop relying on Wikipedia altogether and instead go directly to their own set of (hopefully) reliable sources: books, newspapers, scholarly articles, etc. There is nothing sacrilegious about this idea: Wikipedia openly states that it should not be considered a reliable source (Wikipedia articles cannot cite other Wikipedia articles, for example). So presumably a machine-based encyclopedia, if it wanted to get as close as possible to the truth, would skip the middleman and go directly to the actual reporting.
Where does all this leave Wikipedia? Perhaps only as a relic of, as they say, a simpler time, when wanting to impart a piece of knowledge to the world meant going to a keyboard and typing it all out.
Keeping the status quo
On the other hand, perhaps things will stay mostly as they are. For all the talk of constant progress in AI, there is also the perception among some that generative AI is hitting a plateau in its abilities: that five years from now, it will undoubtedly be better at everything but its output will still be a little "off", with images that look a little too painterly, videos that look unnatural in a dozen different ways, human vocals that sound lifeless, and text that is unnecessarily wordy. Wikipedia, though its readership may continue to shrink, will remain the definitive source for encyclopedic information. This seems to be the view of the average Wikipedia enthusiast, perhaps stated in an increasingly defiant tone: "Knowledge is human", proclaimed the closing credits of the 2026 "Wikipedia 25 virtual birthday party" online video – a statement that seems to hinge on a very specific meaning of the word "knowledge". (And perhaps accidentally insults animals.)
If Wikipedia goes away or declines, one big danger is that AI will increasingly use the output of other AI as its text training data. There is a term for this phenomenon: "model collapse", and the fear is that hallucinations and other errors will magnify over time, leading to all sorts of chaotic output (though still stated with a high degree of certainty, of course). LLM makers are keenly aware of this risk, and in some cases have sought out assurances from the Wikimedia Foundation that AI will not be used to generate the text on Wikipedia. If even AI companies want Wikipedia to stay as it is, that’s in a sense as big an endorsement of human-generated content as they come.
So what would Wikipedia look like, in a world with an ever-greater usage of AI, but with Wikipedia remaining as the standard encyclopedia of choice? Of course, the main answer is that it would look much like it does now. Still, there will undoubtedly be more "digital offerings" around Wikipedia. There are already dozens of podcasts in which the hosts read out and/or discuss Wikipedia articles (some of the former are marketed as sleep aids, for what it’s worth). The next step is probably short-form videos around Wikipedia content, of the kind that have taken over social media. These would presumably have to be AI-generated in order to have enough of them to make the whole enterprise worthwhile. Ironic, perhaps – but it’s certainly possible that AI will emerge as a tool that’s useful for some tasks more than others.
Wikimedia as a repository
Wikipedia does not exist in a vacuum: its images and other files are hosted on the site Wikimedia Commons, and for certain languages (though not English), the majority of infobox data comes from the site Wikidata. In the future, some amount of the text (again, probably much less so for English) may come from the planned site Abstract Wikipedia, written in a highly technical-looking syntax (currently nicknamed "Abstractese") that is planned to be automatically translatable to each language. What if these three resources – Commons, Wikidata and Abstract Wikipedia – ultimately become a more important source for LLMs than Wikipedia itself? The use of Wikidata, especially, hearkens back to the dream of the "semantic web", which was conceived in 2001 and peaked in hype around 2010. But perhaps LLMs will eventually bring this dream to fruition. Structured data is much less ambiguous than text, and less likely to be the source of hallucinations. It also lends itself to more sophisticated querying – a question like "What are the 10 biggest lakes in Angola?" can be answered much more accurately by running a SPARQL query (of Wikidata, presumably) than by having an LLM scour its memory for facts involving lakes in Angola.
None of this is news to anyone who has ever shown interest in the semantic web. In fact, for the semantic web, SPARQL querying is part of a bigger vision that involves querying a massive variety of structured data sources, along with some agentic capabilities, with the ultimate goal being a system that can execute commands such as "Book me a vacation next month near any of Angola’s 10 biggest lakes". Of course, it’s conceivable that a purely text-trained AI system could accomplish this as well, but having RDF/SPARQL-based querying, rather than simply word matching, would presumably be more likely to provide a successful outcome.
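To make the Angolan lakes example a little more concrete, here is a minimal sketch of what such a SPARQL query might look like, wrapped in Python. The Wikidata identifiers in the comments are quoted from memory and should be checked on wikidata.org before being relied on; the function names and the User-Agent string are made up for this illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed Wikidata identifiers (worth verifying on wikidata.org):
#   Q23397 = lake, Q916 = Angola,
#   P31 = instance of, P17 = country, P2046 = area
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def largest_lakes_query(country_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query for the largest lakes in a given country."""
    return f"""
SELECT ?lake ?lakeLabel ?area WHERE {{
  ?lake wdt:P31 wd:Q23397 ;
        wdt:P17 wd:{country_qid} ;
        wdt:P2046 ?area .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY DESC(?area)
LIMIT {limit}
""".strip()


def run_query(query: str) -> list[dict]:
    """Send the query to the Wikidata Query Service (needs network access)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": "lake-example/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling run_query(largest_lakes_query("Q916")) would return one row per lake. Two caveats: this simple form only matches items tagged directly as lakes (not, say, reservoirs), and it sorts on the raw area number without normalizing units, so a real query would probably need some refinement.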
But back to encyclopedias. A system that used AI to generate articles with text, images and infoboxes generated from some combination of Commons, Wikidata and Abstract Wikipedia could represent the best of both worlds: the reliability of human-created (or at least human-supervised) content along with the customizability of a computer-generated end product.
Wikis and chatbots
For regular wikis, there is also a large, and still growing, set of chatbot applications that can be used to answer questions about and/or edit the wiki’s content. The one I feel the closest kinship to is Wanda, a MediaWiki extension that I have been involved in developing. It provides a chat interface directly within the wiki that can connect to a variety of different LLMs, and in each case can run wiki searches on the user’s query, passing the results to the LLM so that it can craft a coherent answer. It can do querying of structured data, such as data from the Cargo extension and Wikidata. It also allows for automated edits to the wiki, again using that LLM’s output. It is not unique, but is perhaps more convenient for users than other current alternatives.
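The general "search the wiki, then hand the results to an LLM" pattern described above can be sketched in a few lines of Python. This is not Wanda's actual code, just an illustration of the idea: the search call uses the standard MediaWiki Action API (action=query, list=search), while build_prompt, answer and the llm argument are hypothetical names standing in for whatever LLM client is in use.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable


def search_wiki(api_url: str, query: str, limit: int = 5) -> list[dict]:
    """Run a full-text search via the MediaWiki Action API (needs network access)."""
    params = urllib.parse.urlencode({
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    })
    with urllib.request.urlopen(f"{api_url}?{params}") as response:
        return json.load(response)["query"]["search"]


def build_prompt(question: str, results: list[dict]) -> str:
    """Combine the user's question with the search hits into a single prompt."""
    context = "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
    return (
        "Answer the question using only the wiki excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}"
    )


def answer(question: str, results: list[dict], llm: Callable[[str], str]) -> str:
    """Pass the assembled prompt to whichever LLM function is supplied."""
    return llm(build_prompt(question, results))
```

In practice the search snippets come back with HTML highlighting that would need to be stripped, and a real tool would likely fetch fuller page text rather than relying on snippets alone.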
Do we still need MediaWiki?
And now the big question: what does all of this mean for MediaWiki? If knowledge systems of the future are going to be read by, and written to, computers (or at least, a combination of computers and humans), do we need wikis at all? Can’t we just throw a bunch of text documents and spreadsheets online, or on an intranet, and let the bots figure out the rest?
I don’t believe so -- and that holds true even if Wikipedia itself gradually fades into obsolescence. There is a big difference between Wikipedia and smaller wikis, especially internal ones. Wikipedia is intended merely as a summary of what reliable sources say: as an encyclopedia, it is a tertiary source, and is not in itself considered reliable, or citable. Smaller wikis that are specific to some organization, on the other hand, usually serve as what could be considered a primary or secondary source -- a reliable "single source of truth", outweighing whatever other internal documents may cover the same topics. This is often simply done out of necessity. Wikipedia’s sources are published, vetted, and often easily accessible. For a corporate wiki, on the other hand, the sources for the information may be a jumble of documents that are contradictory, out-of-date, and hard to locate -- or may be knowledge that is not written anywhere at all. So we have the ironic situation where a small enterprise wiki with 50 readers may, at least in some sense, be considered more indispensable than Wikipedia with its billions of readers.
So we’ve established that a single source of truth, easily accessible, will remain important. But does it have to be a wiki? In my view, there are at least two important aspects to wiki software, and MediaWiki in particular, that make it irreplaceable. First is version history. People often say that the heart of a wiki is the "edit" tab, but arguably the real heart is the version history. After all, people can edit the wiki in other ways besides the edit tab: forms, the API, etc. (For bots, the API is actually the only way to go.) But the version history is irreplaceable: a complete audit trail of all changes that have been made, with full information on each change, who made it and when -- with the ability to restore things back to any previous revision.
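As a rough illustration of why a version history works as an audit trail, here is a toy model in Python. It is not how MediaWiki stores revisions internally; the class and field names are invented for this sketch. The key idea is that revisions are only ever appended, and that restoring an old version is itself recorded as a new edit.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    rev_id: int
    user: str
    timestamp: datetime
    comment: str
    text: str


@dataclass
class Page:
    title: str
    revisions: list[Revision] = field(default_factory=list)

    def edit(self, user: str, text: str, comment: str = "") -> Revision:
        """Append a new revision; earlier revisions are never modified."""
        rev = Revision(len(self.revisions) + 1, user,
                       datetime.now(timezone.utc), comment, text)
        self.revisions.append(rev)
        return rev

    def current_text(self) -> str:
        """Return the text of the latest revision, or "" for an empty page."""
        return self.revisions[-1].text if self.revisions else ""

    def restore(self, rev_id: int, user: str) -> Revision:
        """Restoring is itself a new edit, so the trail stays complete."""
        old = self.revisions[rev_id - 1]
        return self.edit(user, old.text, f"Restored revision {rev_id}")
```

After two edits and a restore of the first, the page shows the original text again, but the history holds three revisions, each with its own author and timestamp.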
A version history is so useful that many non-wiki applications end up approximating one, in one way or another. Some word processing applications like Google Docs have it. Any software application that has to pass government regulations tends to have something approximating a version history, though these tend to be tacked on afterwards, and sometimes show only an approximation of each set of changes, with no ability to restore an older version. Even Grokipedia includes a version history, showing all the user-suggested changes to each article that have been accepted (and, interestingly, rejected).
Extensions like Cargo, Semantic MediaWiki and Wikibase provide the second set of functionality that cannot be replicated by AI: storage of data in a structured way, queryable via SPARQL, SQL or the like. LLMs cannot do this sort of data storage directly, which leaves them three main choices: look through their own "memory" and attempt a best guess; do a web search or other sort of text search; or do some sort of database query. As we have seen, the third is the most powerful of these options, and arguably the most reliable. A structured data approach allows answering questions much more accurately, especially aggregative questions like "How many employees joined the company before 2015?". It also allows for more complex visualizations -- like maps and calendars -- that LLMs by themselves cannot (yet?) directly produce.
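The employee question shows why a database query beats a best guess: once the data is in a table, the answer is a single exact count. Here is a small sketch using Python's built-in sqlite3 module as a stand-in for whatever database sits behind the wiki; the table, column names and rows are all invented for illustration.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, start_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Ana", "2009-03-01"),
        ("Ben", "2014-11-15"),
        ("Chen", "2015-01-02"),
        ("Dee", "2021-06-30"),
    ],
)


def joined_before(year: int) -> int:
    """Count employees whose start date falls before January 1 of `year`."""
    # ISO-formatted dates sort correctly as plain strings.
    row = conn.execute(
        "SELECT COUNT(*) FROM employees WHERE start_date < ?",
        (f"{year:04d}-01-01",),
    ).fetchone()
    return row[0]


print(joined_before(2015))  # prints 2 (Ana and Ben)
```

An LLM with access to a query like this does not need to estimate anything; it only needs to translate the question into the query and report the result.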
This combination of complete versioning and structured querying has been an extremely powerful one when humans directly handle all the input and output, and all indications are that it will continue to remain powerful even if/when computers provide an AI wrapper around both input and output.
