International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
Document digitization and Document Analysis and Recognition (DAR) are techniques for processing document images. Several techniques have been proposed for document digitization, and article reconstruction is one of its main applications. Article reconstruction involves four major steps: grouping the article bodies, detecting the reading order, associating each title with its body, and linking the parts of an article that are scattered across different pages. This paper surveys the techniques used for document digitization and for article reconstruction.