How to Title Electronic Documents Using Text Mining Techniques
Automatic titling is the task of determining a well-formed word group that represents a text in a relevant way. The main difficulty is to produce a title whose morpho-syntactic characteristics are close to those of titles written by people. The authors' approach must be relevant for all types of text (e.g., news, emails, fora). Their automatic titling method is developed in four stages: corpus acquisition, determination of candidate sentences for titling, extraction of noun phrases from the candidate sentences, and finally, selection of one noun phrase to serve as the title of the text (the ChTITRES approach).
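The last three stages can be illustrated with a minimal sketch. It is not the ChTITRES method itself: it assumes that the first sentences of a text are the titling candidates, it uses stopword-delimited word groups as a crude stand-in for noun phrases (a real system would rely on a part-of-speech tagger), and it scores phrases by summed word frequency. The stopword list, the function names, and the parameters `k` and `max_words` are all hypothetical, and corpus acquisition is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "to", "is", "are", "was", "and",
    "or", "for", "with", "by", "that", "this", "it", "as", "at", "be",
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_sentences(text, k=2):
    """Assumption: the first k sentences are the candidates for titling."""
    return split_sentences(text)[:k]


def candidate_phrases(sentence, max_words=4):
    """Stopword-delimited word groups, used here in place of real noun phrases."""
    phrases, current = [], []
    for word in re.findall(r"[A-Za-z]+", sentence.lower()):
        if word in STOPWORDS:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(word)
    if current:
        phrases.append(current)
    # Drop overly long groups, which are unlikely to read as a title.
    return [" ".join(p) for p in phrases if len(p) <= max_words]


def choose_title(text, k=2):
    """Pick the candidate phrase whose words are most frequent in the whole text."""
    freq = Counter(
        w for w in re.findall(r"[A-Za-z]+", text.lower()) if w not in STOPWORDS
    )
    best, best_score = "", 0
    for sentence in candidate_sentences(text, k):
        for phrase in candidate_phrases(sentence):
            score = sum(freq[w] for w in phrase.split())
            if score > best_score:  # ties keep the earliest phrase
                best, best_score = phrase, score
    return best


sample = (
    "Solar panels are popular in Spain. "
    "The solar panels reduce energy bills. "
    "Many people like them."
)
print(choose_title(sample))
```

Each stage is kept as a separate function so that any one of them, for instance the phrase extractor, could be replaced by a linguistically informed component without touching the others.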