CLaRK - An XML-Based System for Corpora Development

Executive Summary

In this paper, the authors describe the architecture and intended applications of the CLaRK system. Development of the system started under the Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLaRK). The main aim behind its design is to minimize the human work involved in creating corpora. Corpus creation is still an important task for the majority of languages, such as Bulgarian, where the effort invested in such development is very modest compared with more intensively studied languages such as English, German, and French. The authors treat corpus creation as a matter of editing, manipulating, searching, and transforming documents.

  • Format: PDF
  • Size: 152.4 KB