Capture Doc: POR FESR CALABRIA 2007-2013.
YEAR 2014
The main objective of the “Capture Doc” project was the automatic management of passive (incoming) invoices in electronic document format for accounting firms, using document understanding, page segmentation, table recognition and data capture techniques. In particular, the implemented prototype supports the semantic extraction of information from scanned documents. The extraction module relies on methods that exploit both the spatial and the semantic information in a document: by combining natural language processing with spatial reasoning, they recognize the relationships between text labels and their values. In addition, a dedicated system module corrects OCR errors and enforces semantic constraints, which makes the automatic management of invoices more reliable.
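
The idea of pairing text labels with values through spatial reasoning, and then validating the result with a semantic constraint, can be illustrated with a minimal Python sketch. It is not the project's implementation: the `Token` structure, the `LABELS` vocabulary, the character substitutions used to repair OCR errors, and the rule "net + VAT = total" are all assumptions chosen for illustration, and amounts are assumed to use a comma as the decimal separator with no thousands separator.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional
import re


@dataclass
class Token:
    """A word produced by OCR with its bounding box (hypothetical structure)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


# Hypothetical label vocabulary mapping printed labels to field names.
LABELS = {"net": "net", "vat": "vat", "total": "total"}

# Assumed OCR confusions: letters commonly misread in place of digits.
OCR_FIXES = str.maketrans("OolIS", "00115")


def parse_amount(text: str) -> Optional[Decimal]:
    """Return the amount in `text`, repairing assumed OCR confusions, or None."""
    fixed = text.translate(OCR_FIXES).replace(",", ".")
    if re.fullmatch(r"\d+(\.\d{1,2})?", fixed):
        return Decimal(fixed)
    return None


def same_row(a: Token, b: Token) -> bool:
    """True when the vertical centres of the two boxes are close."""
    return abs((a.y + a.h / 2) - (b.y + b.h / 2)) <= max(a.h, b.h) / 2


def same_column(a: Token, b: Token) -> bool:
    """True when the two boxes overlap horizontally."""
    return a.x < b.x + b.w and b.x < a.x + a.w


def find_value(label: Token, tokens: list) -> Optional[Token]:
    """Pick the nearest amount to the right of, or directly below, the label."""
    best, best_dist = None, float("inf")
    for t in tokens:
        if t is label or parse_amount(t.text) is None:
            continue
        if same_row(label, t) and t.x >= label.x + label.w:
            dist = t.x - (label.x + label.w)
        elif same_column(label, t) and t.y >= label.y + label.h:
            dist = t.y - (label.y + label.h)
        else:
            continue
        if dist < best_dist:
            best, best_dist = t, dist
    return best


def extract(tokens: list) -> dict:
    """Map each recognized label to the amount found next to it."""
    fields = {}
    for t in tokens:
        key = LABELS.get(t.text.strip(":").lower())
        if key is None:
            continue
        value = find_value(t, tokens)
        if value is not None:
            fields[key] = parse_amount(value.text)
    return fields


def is_consistent(fields: dict) -> bool:
    """Assumed semantic constraint: net amount plus VAT equals the total."""
    if not all(k in fields for k in ("net", "vat", "total")):
        return False
    return fields["net"] + fields["vat"] == fields["total"]


# Example page fragment; "1O0,00" contains a letter O misread for a zero.
tokens = [
    Token("Net", 10, 100, 30, 10), Token("1O0,00", 60, 100, 40, 10),
    Token("VAT", 10, 120, 30, 10), Token("22,00", 60, 120, 40, 10),
    Token("Total", 10, 140, 40, 10), Token("122,00", 60, 140, 40, 10),
]
fields = extract(tokens)
print(fields, is_consistent(fields))
```

In this sketch the spatial step only looks to the right of and below a label, which matches common invoice layouts but would miss others. The constraint check is useful mainly as a signal: when it fails, the extracted fields can be flagged for manual review rather than accepted automatically.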