M1: Unit 1: Chapter 6 – Advantages of Corpus Processing Tools


Chapter 6: Advantages of Corpus Processing Tools


The first advantage of electronic corpora is that texts can be stored, distributed and manipulated in ways that are not possible with hard-copy corpora. Data can be retrieved very quickly, and studies can be repeated, or supplemented by studies that appear more appropriate, far more easily than with non-electronic corpora. If a corpus is available to the research community, other researchers can corroborate or invalidate the findings of an initial study based on that corpus.

Corpora can also help refine the hypotheses on which earlier studies were based, as ‘the findings of corpus-based studies are in some ways always suggestions for future research’ (Partington in Kenny 2001: 211). Processing techniques such as concordancing allow the same data to be viewed from different angles, which stimulates multiple analyses and invites researchers to rethink their position continually. Because comparative data can be taken into account with great ease, researchers are also encouraged to look at their own data with fresh eyes. Maria Tymoczko also foresees the ‘construction of many different corpora for specialized, multifarious purposes, making room for the interests, inquiries, and perspectives of a diverse world’ (1998: 5).
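To make the idea of concordancing concrete, the following is a minimal sketch of a keyword-in-context (KWIC) display in Python. It is an illustration only and not the software used in any of the studies cited here; the function names `concordance` and `show` and the sample text are assumptions made for this example. It shows how the same hits can be viewed in text order or re-sorted by the words that follow the keyword.

```python
import re

def concordance(text, keyword, width=30):
    """Return (left context, hit, right context) for each whole-word match.

    `concordance` is a hypothetical helper written for this illustration.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

def show(hits, width=30):
    """Print the hits with the keyword aligned in a single column."""
    for left, word, right in hits:
        print(f"{left:>{width}} [{word}] {right}")

# Invented sample text, used only to demonstrate the display.
sample = ("The corpus was large. A parallel corpus aligns source and "
          "target texts. Each corpus serves a purpose.")

hits = concordance(sample, "corpus", width=20)

print("In text order:")
show(hits, width=20)

print("Sorted by right-hand context:")
show(sorted(hits, key=lambda h: h[2].lower()), width=20)
```

Sorting on the left-hand context instead (`key=lambda h: h[0]`) would give yet another view of the same data, which is the sense in which concordancing invites repeated analysis.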

Kenny’s study (2001) shows that a parallel corpus in electronic form can enable analyses that individual researchers would not otherwise pursue, because at the scale of a two-million-word corpus they would be too impractical. Moreover, even if a researcher had the time to find all the instances of a word and their translations by hand, the work would be wearisome, and it would not be possible to sustain the level of concentration required to find every instance. Corpus-processing software takes over the retrieval, leaving the human analyst free to concentrate on his or her judgement of the data. Different analyses can also be carried out using the same corpus.
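The retrieval task described above, finding every instance of a word together with its translation, can be sketched in a few lines of Python once a parallel corpus has been aligned sentence by sentence. This is a simplified illustration under stated assumptions: the sentence pairs are invented for the example and do not come from any real corpus, and `find_aligned` is a hypothetical helper, not a function from an existing tool.

```python
import re

# Invented, sentence-aligned pairs (source, translation).
# They are illustrative only and not taken from any real corpus.
aligned = [
    ("Das Haus ist alt.", "The house is old."),
    ("Sie kauft ein Buch.", "She buys a book."),
    ("Das Haus steht am Fluss.", "The house stands by the river."),
]

def find_aligned(pairs, word):
    """Return every (source, translation) pair whose source contains `word`.

    Matching is whole-word and case-insensitive; `find_aligned` is a
    hypothetical helper written for this sketch.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in pairs if pattern.search(src)]

for src, tgt in find_aligned(aligned, "Haus"):
    print(f"{src}  ->  {tgt}")
```

The software only retrieves the aligned sentences; deciding which target-language word actually renders the search term, and whether the rendering is noteworthy, remains the analyst's judgement.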