Working with MediaWiki
2nd ed., HTML version

Chapter 15 MediaWiki administration

Administering a MediaWiki wiki is generally not that hard, once you've done the initial setup. It involves both actions done via the web interface, and actions done on the back end, like editing LocalSettings.php and installing extensions. Usually there is just one person, or a handful of people, with access to the back end, and the same or a slightly larger group of people with administrative access on the wiki itself.
This entire book is geared in large part toward MediaWiki administrators, so in a sense most of this book could fit under the topic of “MediaWiki administration”. But this chapter is meant to hold some of the tools and actions that are relevant only to administrators, and that didn't fit in elsewhere.

Configuration settings

There are many settings for core MediaWiki that can be modified in LocalSettings.php – essentially, all the variables that start with “$wg”. Some are covered in this book, though it's a very small percentage of the total set. You can see the full listing here, grouped by functionality type:
Here is a useful one that is not mentioned elsewhere in the book:

Debugging

MediaWiki is software, and software unfortunately can go wrong. If you run into a problem, the issue may be file directory permissions, database user permissions, missing files, missing database tables, bad settings in LocalSettings.php, incompatible versions, or even (perish the thought) bugs in the code. (Which, by the way, are much more likely to happen in extensions than in core MediaWiki.)
To have PHP display its errors on the screen, rather than a blank page, you can add the following line:
ini_set( 'display_errors', 1 );
If it's being added to LocalSettings.php, it should be near the top of the file, right under the “<?php” line.
The easiest tool to use for any kind of debugging is the MediaWiki debug toolbar. It puts all the necessary information (SQL calls, warnings, debug displays) in one easily-accessible place at the bottom of the browser. For those of us used to having done MediaWiki debugging the old-fashioned way, it's a remarkably useful tool. You can enable it by adding the following to LocalSettings.php:
$wgDebugToolbar = true;
However, if you enable the debug toolbar, everyone will see it, which you may not want. Thankfully, there are other options. If you see an error message that starts with “Fatal exception of type "MWException"”, and you want to see the actual error message, you can do so by adding the following to LocalSettings.php:
$wgShowExceptionDetails = true;
You can also have MediaWiki write its debugging output to a log file, by adding the following:
$wgDebugLogFile = "/full/path/to/your/debug/log/file";
This file needs to be writable by your web server.
If seeing the error message isn't enough to let you figure out the solution, often the easiest approach, as with a lot of software, is just to do a web search on the text of the error message – it could well be that others have come across, and maybe diagnosed, this problem. If you believe that the problem is coming from a specific extension, it's a good idea to check that extension's main page, or its talk page, to see if there's any mention of it.

Improving MediaWiki performance

This is not a web performance book, but if you feel your wiki is too slow, or you're worried about the results of increased traffic in the future, here are some helpful tips:
  • Make sure your web server and PHP have enough memory assigned to them.
  • There are a variety of caching tools that can be used in conjunction with MediaWiki (and with each other), like Varnish and memcached. You can see all the options for caching here:

The MediaWiki cache

MediaWiki does extensive caching of pages: when you go to a wiki page, chances are that it wasn't generated on the spot, but rather is a cached version that was created sometime in the previous day or so. (This doesn't apply to pages in the “Special” namespace, which are generated anew every time.)
The MagicNoCache extension lets you mark some pages as never to be cached, via the “__NOCACHE__” behavior switch. See here:
Caching becomes an issue when Cargo or Semantic MediaWiki is installed, because pages that are cached don't automatically show the latest set of query results; this can confuse users if they add some data and it then doesn't appear in query results elsewhere. The best workaround for this problem is to install the MagicNoCache extension, using it on every page that contains a query.
Another option is to use the Approved Revs extension – although it's not intentional, pages that have an approved revision don't get cached. This may change in the future, but at the moment it's a side effect that one should be aware of.
Cargo and SMW both provide their own tab/dropdown, which only administrators see, called either “Purge cache” (Cargo) or “Refresh” (SMW); both point to the “action=purge” URL, saving admins from having to type it in manually.
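For anyone who does want to build the “action=purge” URL by hand, or from a script, here is a minimal sketch in Python. The wiki address and the function name purge_url are made up for illustration; the only parts taken from MediaWiki itself are the “title” and “action=purge” query parameters. Depending on the MediaWiki version, and on whether you're logged in, the wiki may ask for confirmation before actually purging the page.

```python
from urllib.parse import urlencode


def purge_url(script_url: str, title: str) -> str:
    """Build the URL that asks MediaWiki to purge the cache of one page.

    script_url is the address of the wiki's index.php (hypothetical here).
    """
    # MediaWiki page titles use underscores in place of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "action": "purge"})
    return f"{script_url}?{query}"


# Hypothetical wiki address, for illustration only.
print(purge_url("https://mywiki.com/w/index.php", "Main Page"))
```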

The job queue

There are certain tasks that MediaWiki has to run over an extended period of time, in the background. The most common case comes when a template is modified. Let's say that someone adds a category tag to a template – that means that every one of the pages that include that template needs to be added to that category. This process can't be done all at once, because it would slow down the server considerably, or even temporarily crash it. Instead, the process is broken down into “jobs”, which are placed in a “job queue” – and then those jobs are run in an orderly way.
Behind the scenes, the job queue is really just a database table called “job”, which holds one row for each job. These jobs are run in sequential order, and once a job is run its row is deleted.
Jobs are run every time the wiki gets a page hit. By default, one job is run on every hit, but this number can be modified to make the running of jobs slower or faster, by changing the value of $wgJobRunRate. To make the running of jobs ten times faster, for instance, you would add the following to LocalSettings.php:
$wgJobRunRate = 10;
Conversely, to make it ten times slower, you would set the value to 0.1. (You can't actually run a fraction of a job – instead, having a fractional value sets the probability that a job will be run at any given time.)
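To make the probability idea concrete, here is a small simulation in Python. It is only a sketch of the behavior described above, not MediaWiki's actual code: the function name jobs_to_run is made up, and it assumes that a value of 1 or more is simply rounded down to a whole number of jobs per page hit.

```python
import random


def jobs_to_run(job_run_rate: float, rng: random.Random) -> int:
    """Return how many jobs a single page hit would trigger (illustrative model)."""
    if job_run_rate <= 0:
        return 0  # a rate of 0 means page hits never run jobs
    if job_run_rate < 1:
        # Fractional rate: run one job, with probability equal to the rate.
        return 1 if rng.random() < job_run_rate else 0
    # Assumption: whole-number part of the rate = jobs per hit.
    return int(job_run_rate)


rng = random.Random(0)  # fixed seed, so that the simulation is repeatable
hits = 10000
total = sum(jobs_to_run(0.1, rng) for _ in range(hits))
print(f"{total} jobs run over {hits} page hits at a rate of 0.1")
```

With a rate of 0.1, roughly one page hit in ten ends up running a job, so the total should come out somewhere near a tenth of the number of hits.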
You can also cause jobs to be run in a more automated way, instead of just waiting for them to be run (or hitting “reload” in the browser repeatedly to speed up the running). This is done by calling the script runJobs.php, in the MediaWiki maintenance/ directory. You can even create a cron job to run runJobs.php on a regular basis – say, once a day.
There are various parameters that runJobs.php can take, such as setting the maximum number of jobs to be run, or, maybe more importantly, the type of job to be run. To enable the latter, each job type has its own identifier name, which can be found in the database. You can read about all the parameters for runJobs.php here:
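If you call runJobs.php from a script of your own (one that a cron job could then run), it can help to put the command line together in one place. Below is a minimal sketch in Python. The function name, the wiki directory and the “php” executable name are all assumptions for illustration; “--maxjobs” and “--type” are the runJobs.php parameters for the maximum number of jobs and the job type, and “refreshLinks” is used as an example job type. Check the script's documentation for your MediaWiki version before relying on any of it.

```python
from typing import List, Optional


def run_jobs_command(
    wiki_dir: str,
    max_jobs: Optional[int] = None,
    job_type: Optional[str] = None,
    php: str = "php",
) -> List[str]:
    """Build the argument list for calling maintenance/runJobs.php."""
    cmd = [php, f"{wiki_dir}/maintenance/runJobs.php"]
    if max_jobs is not None:
        cmd.append(f"--maxjobs={max_jobs}")  # stop after this many jobs
    if job_type is not None:
        cmd.append(f"--type={job_type}")  # only run jobs of this type
    return cmd


# Hypothetical install path; the resulting list could be handed to subprocess.run().
print(" ".join(run_jobs_command("/var/www/wiki", max_jobs=100, job_type="refreshLinks")))
```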
In addition to core MediaWiki, extensions can create their own jobs as well. Some extensions that do are Data Transfer, DeleteBatch, Nuke and Replace Text.

Admin Links

Figure 15.1: Admin Links page, when various other extensions are installed (Approved Revs, Replace Text, Page Forms, Cargo, Data Transfer)
The other nice feature of Admin Links is that it provides a link to the “Admin links” page within the user links at the top of every page, so that the dashboard is always just a click away. Here is how the top of the page looks in the Vector skin, with Admin Links installed:

Replace Text

To run a replacement, go to Special:ReplaceText. This action is governed by the 'replacetext' permission, which by default is given to administrators.
You can see the top of the Special:ReplaceText page in Figure 15. What follows below that is a list of namespaces that the user can select from; then below that are some additional options for the replacement, which are shown in Figure 15.
Figure 15.3: Bottom of Special:ReplaceText
Hitting the “Continue” button brings the user to a second page, listing the exact matches for the search string, so that the user can manually select which pages will have their contents and/or titles modified.
Every change made by Replace Text shows up in page histories, with the user who initiated the replacement appearing as the author of that edit.
For more complex transformations, you'll probably have to rely on bots and the MediaWiki API, which we'll get to next.

Getting user IP information

Conversely, if you don't want this information stored, for privacy reasons, you can disable storage by adding the following to LocalSettings.php:
$wgPutIPinRC = false;

Bots and the MediaWiki API

There are various tools for making automated changes to the wiki's contents, like the Replace Text extension. But in many cases the set of edits required is too specific to be handled by an automated tool. For all those cases, there are bots, and the MediaWiki API.
A bot, in MediaWiki terminology, is a script that does one or more specific kinds of edits, or retrieves one or more pieces of data. A bot can be written in any programming language: it just has to connect with the MediaWiki API, which does the actual work of writing and reading data. Most of the major programming languages have one or more MediaWiki API libraries written for them, which take care of the details of logging in to the wiki and connecting to the API. But even without a library, it's not that hard to create a MediaWiki bot – the script just needs to hit some MediaWiki URLs.
If a bot makes any edits on a wiki, it should ideally be logged in as a user – and ideally that user should be a separate account, which gets added to the "bots" group. You can see these kinds of accounts all over Wikipedia – they're the ones fixing broken <ref> tags, renaming categories, adding signatures to unsigned talk-page messages, etc. On other wikis, they're quite a bit less common, but some smaller wikis do make significant use of them.
This page holds some information, and helpful links, on creating and running bots:

The MediaWiki API

The MediaWiki API is essentially a set of URLs that one can access in order to read from and write to the wiki. They all involve different parameters passed in to the file api.php. That file is located in the same directory as index.php; so, for instance, if your wiki has URLs of the form mywiki.com/w/index.php?title=..., the main API URL can be found at mywiki.com/w/api.php. (For more recent versions of MediaWiki, the API is linked from the Special:Version page.)
If you go to that main URL, you'll see a fairly exhaustive (automatically generated) explanation of all the API actions available. API actions are defined by both core MediaWiki and a number of extensions. You'll also see a listing of the different formats that the results can be displayed in, including JSON and XML. For example, adding "format=jsonfm" to the URL will display results in a pseudo-JSON format that users can read on the screen, while "format=json" will result in actual raw JSON.
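As a rough illustration of what “a set of URLs” means in practice, here is a short Python sketch that puts together an API URL. The wiki address and the function name api_url are made up; “action=query” with “meta=siteinfo” is one of the standard read-only API requests, and the “format” parameter works as described above.

```python
from urllib.parse import urlencode


def api_url(api_endpoint: str, **params: str) -> str:
    """Build a MediaWiki API URL, defaulting to raw JSON output."""
    params.setdefault("format", "json")  # use format="jsonfm" for the readable version
    return f"{api_endpoint}?{urlencode(params)}"


# Hypothetical wiki address, for illustration only.
endpoint = "https://mywiki.com/w/api.php"
print(api_url(endpoint, action="query", meta="siteinfo"))
print(api_url(endpoint, action="query", meta="siteinfo", format="jsonfm"))
```

A bot would then fetch the first URL and parse the JSON it gets back, while the second one is handy for looking at the same results in a browser.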
We won't get into the details of all the API functionality available here, but you can see it at api.php – and you can also read more about it at:

Search engine optimization

Search engine optimization, or SEO, is the practice of attempting to get the pages of one's web site to show up as high as possible in search-engine results, most notably on Google. It's a controversial field: to its proponents, it's an indispensable way to get web traffic, while to its detractors, it's at best tacky, and at worst the domain of hucksters, spammers and scammers. Nevertheless, for people who run public wikis, showing up high in search results can be important.
In MediaWiki, the subject of every page is also: the page's name, a part of its URL, the text in the top-level header, and the text that shows up in internal links to that page. That sort of consistency is extremely important for search engines in associating that word or phrase with that specific URL. Tied in with that, there's usually only one top-level header per page: the name of the page is contained within the only <h1> tag on the page, which is another thing that helps to establish the page's subject for search engines.
{{#seo: title=... | titlemode=... | keywords=... | description=... }}
The “title=” parameter either replaces, is appended to, or is prepended to the contents of the HTML <title> tag, depending on the value of the “titlemode=” parameter, which can be either replace, append or prepend. The “keywords=” and “description=” parameters each get placed in an HTML <meta> tag, with the parameter's name as the tag's “name” attribute and its value as the “content” attribute. If you don't know how best to set all of these tags, it's a good idea to look up their meaning, and how they should be best used for SEO.
You can find more information about WikiSEO here:
If you're using infobox-style templates on most pages, a good strategy is to place the tag within the templates, so that you don't have to add it manually to each page; and then populate it with specific parameters from the infobox.

Running a wiki farm

It's not uncommon for organizations and corporations to want to run more than one wiki; sometimes many more. A company that runs public wikis on different topics, for advertising revenue or any other reason, may end up running a large number of them. Internally, companies may want to host more than one wiki as well. Access control to data is one reason, as noted elsewhere: the most secure way to keep a set of wiki data restricted to a defined group of users is to keep it in a separate wiki. And different departments within an organization could each want their own wiki, either to keep their data restricted or just because they have little need for sharing data with other groups. In a very large company or other organization, the number of such independent subdivisions that would want their own wiki could even be in the hundreds.
Of course, each group that wanted their own wiki could simply set one up themselves; if they all use MediaWiki, installation is free and generally not too difficult. (That, in fact, is how wikis have historically been introduced into organizations: small groups setting them up themselves, in what are known as "skunkworks" projects.) But that kind of setup can quickly become unwieldy: if a different person needs to become a wiki expert for each wiki to be created and maintained, that's too much work being expended. Even if all the wikis are managed centrally by a single IT person or department, that can become a tedious amount of work when it's time to upgrade the software.
In such a situation, what you should be using is what's known as a “wiki farm”, or sometimes “wiki family”: a group of wikis that are managed from a single place, and to which it's easy to add additional wikis. In MediaWiki, there are a variety of ways to create a wiki farm. The best reference for reading about the different approaches, and how to set up each one of them, is here:
There are many approaches listed on this page: single vs. multiple code bases, single vs. multiple databases, single vs. multiple instances of LocalSettings.php, etc. However, there's only one approach we really recommend, which is to use a single code base, multiple databases and multiple settings files. This essentially corresponds to the “Drupal-style sites” approach described in that page.
We won't get into the full technical details here, but the basic idea is this: you have a separate database for each wiki, as well as a separate settings file. Each per-wiki settings file gets included from within LocalSettings.php. The individual settings files set the database name for each wiki, and let you customize the wiki's settings, including standard features like the wiki name, logo, skin and permissions, as well as letting you include extensions for only some of the wikis.
The “Wiki family” manual includes a simple combination of a PHP script and a shell script for this approach, which together let you create and update the database for each wiki.
You also need to decide on a URL structure for the different wikis: the two standard approaches are to use subdomains, like “wiki1.mycompany.com”, or subdirectories, like “mycompany.com/wiki1”. This structure has to be handled by a combination of LocalSettings.php (which has to figure out which settings file to use, based on the URL), and the server configuration, which, if Apache is being used, is usually the file httpd.conf. The specific settings for both are covered within the “Wiki family” manual.
If you want the wikis to share their user accounts, you can point each wiki to a common database, by adding the following to its settings:
$wgSharedDB = "main-database-name";
Though “shared DB” sounds like a big deal, by default only tables that have to do with user information are shared.

Multi-language wikis

Of all the things that wiki administrators typically want to do, possibly the most conceptually tricky is to have their wiki support multiple languages. That's because there's a trade-off in place: you want the text each person reads in their language to be as precise as possible, but at the same time you want to avoid redundancy, because redundancy means more work to try to ensure that the contents in different languages all match each other.
First, some good news: the text of the interface itself – like the text in the “Edit” and “View history” tabs, or the text in special pages – is usually not an issue, because if a user sets their own language under “User preferences”, chances are good that all of that text has been translated into their language, thanks to MediaWiki's top-notch translation setup.
That just leaves the contents of the wiki. For that, the right approach depends mostly on whether the content is meant to be created only by users who speak one language, but read in multiple languages; or whether content is meant to be generated by users speaking multiple languages.
There are essentially three approaches. In order from most difficult to least difficult, they are:
Figure 15.4: A bar with links to different translations of a page, provided by the Translate extension