You can download the database from download.wikipedia.org and use MySQL to query.
Asking developers to run queries on the live database
Some developers have the ability to run queries on the live database. You may request they do this at tswiki:Query service.
Users with both technical access to the database and the willingness to perform queries may be found by viewing the list of query-service users.
Using an external tool
It was once possible to run an sql query tool under http://www.wikisign.org, which supported Wikipedias of major languages etc. Currently inactive.
- When copying a query from this page, some browsers copy a '*' in front of the SELECT statement. This has to be removed!
- Most of these queries return article titles. These do not contain the namespace! The namespaces have numeric codes in the separate cur_namespace field. 0 is the regular article namespace, others can be found on Wikipedia:Namespace
To avoid time-consuming or malformed queries, here are some useful suggestions (please add more - some requests might be found on this page's talk page):
Note: to convert the returned list in mysql to wiki-format use: s/(\ |\|)//g;s/^/*\[\[/;s/$/\]\]/
Quick instructions for running your own MySQL queries
This entire process can be completed in ~1 hour on a broadband connection. <screenshots coming soon>
Example: (sample output)
mysql> SELECT c.cur_title FROM wikipedia.cur c
-> WHERE c.cur_text LIKE "%directory.google.com%";
- Also consider installing the MySQL Query Browser, which is a very pretty graphical user interface for MySQL queries. To use it, you start the MySQL server on the command line, and then run this program on top of it. Of the options it gives you, you need only to type in 'localhost' for the server address and push enter. (You will likely need to restart if installing both the MySQL server and query browser at the same time.)<screenshots coming soon>
To limit results:
- "SELECT ... LIMIT 20".
You can also add an offset;
- "SELECT ... LIMIT 100, 20" will give you 20 records starting at the 101st.
 Queries to find new users
Here's a couple of queries to find the 20 most newly created users - useful to find people to give welcome messages. It only counts users that have made edits (most users who create a login never make a single contribution and are just passing by, so it's not really worth welcoming them).
- SELECT user_name, COUNT(*) FROM user, cur WHERE user_id=cur_user GROUP BY user_id ORDER BY user_id DESC LIMIT 20
The above query gives the user names of the 20 most recently created users and the number of edits they have made. The query is not perfect - only edits that have not been 'overwritten' by other edits are counted here. To do the same thing looking only at 'overwritten' edits use:
- SELECT user_name, COUNT(*) FROM user, old WHERE user_id=old_user GROUP BY user_id ORDER BY user_id DESC LIMIT 20
 Queries to find stub articles
For short pages containing "see" (takes about 2 seconds):
- SELECT cur_title FROM cur WHERE cur_is_redirect=0 AND LENGTH(cur_text)<50 AND cur_text LIKE '%see%'
finds all pages with less than 50 chars.
 Queries to find articles with one edit (new)
SELECT cur_title FROM cur WHERE cur_is_new AND cur_namespace=0;
Finds all pages that have only been edited once.
 Queries to find inappropriate links
 Double Redirects
- SELECT ca.cur_namespace, ca.cur_title FROM cur ca, cur cb, links l WHERE ca.cur_is_redirect=1 AND cb.cur_is_redirect=1 AND l.l_from=ca.cur_title AND l.l_to=cb.cur_id AND ca.cur_namespace=0 LIMIT 250
- SELECT ca.cur_namespace, ca.cur_title, FROM cur ca, cur cb, links l WHERE ca.cur_is_redirect=1 AND cb.cur_is_redirect=1 AND l.l_from=CONCAT("Talk:",ca.cur_title) AND l.l_to=cb.cur_id AND ca.cur_namespace=1 LIMIT 250
These queries are slow. You may have to reduce the limit for them to work.
Full article ▸