What Can an Indexing Tool Do with a TEXT?

In presenting the notion of an interactive concordance, which lies at the basis of this site, we have to admit that it is a radically new tool. It will also require a good deal of theoretical discussion about text analysis, which may seem gratuitous and even fatuous to those interested in specific lines of debate, especially given how involved the debates among contemporary thinkers are. To illustrate this last point, let us digress on the matter of computer techniques and the study of texts.

In the course of this presentation the notion of the "electronic" tool will take on the function of dominant theme. Let us review: a word processor is an "electronic" tool. As every teenager knows today, a word processor keeps a text in the memory of the computer while it is being written or edited. The computer can save (i.e. store) the text on a disk drive for subsequent recall and more writing and editing. It beats retyping pages, totally - not that today's teenagers are familiar with the concept of retyping pages in order to get it just right.

But that is not all there is to electronic tools; once the words are in the memory of the computer (i.e. in electronic digital encoding) a whole range of procedures can be applied to them. For example, it is quite easy to check the text for spelling errors and to look up synonyms in a thesaurus located in the computer. It is also possible to check for grammar problems or sexist language. Spell-checkers, thesauri, grammar-checkers and language purifiers are all electronic tools.

By the same token, an indexed dictionary is a tool. The procedure of indexing is really quite simple and requires only very basic (word processing level) computer skills. The computer creates a large electronic reference/look-up table that records each word and the precise location of each of its occurrences in the text. Once the table is created, the computer - even a slow, inexpensive computer - can locate (i.e. retrieve) each occurrence of a particular word or of a list of words, along with the corresponding page numbers in the printed text, in less than a heartbeat.
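
The look-up table described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular indexing engine: the function name build_index, the rule for what counts as a word, and the choice of word positions as "locations" are all assumptions made for the example.

```python
import re
from collections import defaultdict

def build_index(text):
    """Map each lowercased word to the list of word positions where it occurs.

    Assumption for this sketch: a "word" is a run of letters or apostrophes,
    and a "location" is the running word number, counted from zero.
    """
    index = defaultdict(list)
    for position, match in enumerate(re.finditer(r"[A-Za-z']+", text)):
        index[match.group().lower()].append(position)
    return dict(index)

sample = "The index lists each word. Each word has a location in the text."
index = build_index(sample)

# Retrieval is now a single table look-up rather than a scan of the text.
print(index["word"])
print(index.get("concordance", []))  # a word that does not occur
```

Once the table exists, finding every occurrence of a word costs one look-up, which is why even a slow machine can answer almost immediately.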


Indexing your Own Work

Let us sketch out the procedure for using an electronic indexing tool to create an index for a scholarly monograph. It is assumed that - thanks to ubiquitous word processing - the text is already in electronic form.

If the book is in galleys, the page number markup has to be put into the word processing version of the text. That file can then be fed to one of a number of indexing engines. Output from the engine - if it is a good one - will take two forms: 1. a printed index of all the words and their page numbers and 2. an interactive browser that allows quick inspection of individual citations.
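
As a rough sketch of the first kind of output, the following Python fragment reads a text carrying page number markup and produces a plain word-and-page-number index. The marker format `<page N>` is purely hypothetical, chosen for the example; real indexing engines each define their own markup, and the function names here are likewise invented.

```python
import re
from collections import defaultdict

# Hypothetical page marker, e.g. "<page 12>"; an assumption for this sketch.
PAGE_MARK = re.compile(r"<page (\d+)>")

def page_index(marked_text):
    """Return {word: sorted list of page numbers} for a page-marked text."""
    pages = defaultdict(set)
    # Splitting on a pattern with one capturing group yields:
    # [text before first marker, number, text, number, text, ...]
    parts = PAGE_MARK.split(marked_text)
    for i in range(1, len(parts), 2):
        page = int(parts[i])
        for word in re.findall(r"[A-Za-z']+", parts[i + 1]):
            pages[word.lower()].add(page)
    return {word: sorted(nums) for word, nums in pages.items()}

def format_index(index):
    """Lay the index out book style: one word per line, then its pages."""
    return "\n".join(
        f"{word}, {', '.join(str(n) for n in nums)}"
        for word, nums in sorted(index.items())
    )

galleys = ("<page 1>Concordance and index. "
           "<page 2>The index is a tool. "
           "<page 3>A concordance shows context.")
index = page_index(galleys)
print(format_index(index))
```

Text that appears before the first page marker is ignored in this sketch, since it has no page to point to.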

"Indexing" is a quite complex concept; it should be obvious from this pedestrian example that here we have a very literal use of an "indexing" tool. Since we are only making a "book style index," essentially a "vocabulary list" or a "dictionary" to be put after the text with references to page numbers, we are still securely in the realm of word processing/manuscript preparation; in fact, many word processing programs have a built-in indexing function.

However, since we are also dealing with texts that have deep structures of meaning (or not - as the case may be) and are not just indexing a memo or a proposal, we are also on the threshold of some sort of thematic discovery.

"Book style indexes" have been done without computers for several generations. The question here is: what does the computer bring to this task?

A computer-generated word-list of all the words in a text is of course much closer to the actual text than a list of words selected by the author, or a grad student, as a final task in the preparation of a manuscript, even if that selection is a conscious inventory of the text and a guide to potential readers. It is much easier to delete entries from an electronic list of extant words than it is to remember everything to be indexed.

One could argue that looking at an alphabetical as well as a frequency list is a good idea in any case, if only to check for inappropriate or unusual vocabulary. If it is so easy, why not do it?
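
To show how little effort is involved, here is a minimal Python sketch that produces both an alphabetical list and a frequency list, and then deletes unwanted entries from the full inventory. The function names and the small set of words to discard are assumptions made for the illustration.

```python
import re
from collections import Counter

def word_counts(text):
    """Count every word in the text (lowercased letters and apostrophes)."""
    return Counter(w.lower() for w in re.findall(r"[A-Za-z']+", text))

def alphabetical_list(counts):
    """All distinct words in alphabetical order."""
    return sorted(counts)

def frequency_list(counts):
    """(word, count) pairs, most frequent first; ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def prune(counts, unwanted):
    """Delete entries from the full inventory rather than picking words by hand."""
    return Counter({w: n for w, n in counts.items() if w not in unwanted})

sample = "The index is a list. The list is long, and the index is exact."
counts = word_counts(sample)
print(frequency_list(counts))

# Hypothetical discard list for the example.
kept = prune(counts, {"the", "is", "a", "and"})
print(alphabetical_list(kept))
```

The indexer starts from everything that is actually in the text and strikes out what is not wanted, which is the reverse of the traditional procedure.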

An index can be good or bad, primarily based on the amount of thought and effort that goes into its production. The problem is that in many scholarly efforts, the production of the index is defined as a clerical task.

The computer brings much to this task. Rather than reflecting a "most common words and concepts" list or a subjective picking through the text by the editor, the list created by the index reflects an actual inventory of "all" the words in the text. We probably do not have to argue too strenuously that the electronic index approach saves time and gives the task of selecting actual page number pointers much greater precision.