memasysco: XML Schema Based Metadata Management System for Speech Corpora

Executive Summary

The metadata management system for speech corpora "Memasysco" was developed at the Institut für Deutsche Sprache (IDS) and is applied for the first time to document the speech corpus "German Today". Memasysco is based on a data model for documenting speech corpora and contains two generic XML schemas that drive data capture, native XML database storage, dynamic publishing, and information retrieval. Its information architecture is based mainly on the ISLE MetaData Initiative (IMDI) guidelines for publishing metadata of linguistic resources. However, because the authors must also support the corpus management process in research projects at the IDS, they need finer granularity for some documentation components, as well as more restrictive categories to ensure data integrity.
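To make the idea of "finer granularity" and "more restrictive categories" concrete, the following is a minimal Python sketch, not memasysco's actual schema or code. The element names (`speaker`, `sex`, `birth/year`), the closed vocabulary, and the function `validate_speaker` are all hypothetical and chosen only for illustration. In the real system such constraints would presumably be expressed in the XML schemas themselves rather than in application code.

```python
import xml.etree.ElementTree as ET

# Hypothetical closed vocabulary; not taken from memasysco or IMDI.
ALLOWED_SEX = {"female", "male", "unknown"}


def validate_speaker(xml_text):
    """Return a list of integrity problems found in one <speaker> record."""
    root = ET.fromstring(xml_text)
    problems = []

    # Restrictive category: only values from the closed vocabulary pass.
    sex = root.findtext("sex")
    if sex not in ALLOWED_SEX:
        problems.append(f"sex: {sex!r} is not in the closed vocabulary")

    # Finer granularity: the birth year is its own element, not free text.
    year = root.findtext("birth/year")
    if year is None or not (year.isdigit() and len(year) == 4):
        problems.append(f"birth/year: {year!r} is not a four-digit year")

    return problems


# Hypothetical sample records, invented for this sketch.
good = (
    "<speaker><sex>female</sex>"
    "<birth><year>1985</year><place>Mannheim</place></birth></speaker>"
)
bad = "<speaker><sex>F</sex><birth><year>around 1985</year></birth></speaker>"

print(validate_speaker(good))
print(validate_speaker(bad))
```

Splitting a field such as date of birth into atomic elements and restricting categories to enumerated values is what allows records to be checked mechanically at capture time instead of being cleaned up afterwards.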

  • Format: PDF
  • Size: 1633.8 KB