UiPath Document Understanding Framework


Imagine a machine or a bot looking at a scanned document whose values are scattered across the page. The human brain is naturally trained to pick out the data from different scanned documents. A machine, by contrast, needs “Eyes”, called the “Intelligent OCR” or the “OCR Engine”, and a “Brain”, in the form of “Customizable Machine Learning” algorithms.

The UiPath Document Understanding Framework is designed to help users combine different approaches to extract information from multiple documents, not necessarily with the same structure.

To get started, install the packages below.

Packages to be Installed:

1) Intelligent OCR activities

2) OmniPage OCR

3) Machine Learning Extractor

Implementation Steps:


The first step is to define the taxonomy. In this pre-processing step, you add the document types you want to process and the fields you are interested in extracting. For example, you can work with invoices, from which you want to extract the vendor and the total amount, and with medical forms, from which you want to extract the insured ID number and the patient name.


Using Taxonomy Manager, you can create your own taxonomy.

Taxonomy Manager
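To make the idea of a taxonomy concrete, here is a minimal Python sketch of the two document types from the example above. It is only an illustration of the data shape: the `DocumentType` class, the `taxonomy` dictionary and the `fields_for` helper are hypothetical names, not part of UiPath, where the taxonomy is built visually in Taxonomy Manager.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocumentType:
    """One document type and the fields to extract from it (hypothetical model)."""
    name: str
    fields: List[str] = field(default_factory=list)


# Hypothetical taxonomy mirroring the invoice and medical form examples in the text.
taxonomy = {
    "Invoice": DocumentType("Invoice", ["Vendor", "Total Amount"]),
    "Medical Form": DocumentType("Medical Form", ["Insured ID Number", "Patient Name"]),
}


def fields_for(document_type: str) -> List[str]:
    """Return the fields of interest for a document type, or an empty list if it is unknown."""
    doc_type = taxonomy.get(document_type)
    return list(doc_type.fields) if doc_type else []
```

Every later step (classification, extraction, validation) refers back to these document types and field names, which is why the taxonomy is defined first.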


As the documents are processed one by one, they go through the digitization process. For non-digital (scanned) documents, you additionally need to apply the OCR engine of your choice. This step outputs the Document Object Model and a string variable containing all the document text, both of which are passed down to the next steps.

Digitize document


After digitization, the document is classified. If you are working with multiple document types in the same project, you need to know which type of document you are dealing with in order to extract data properly. Importantly, you can use multiple classifiers in the same scope, configure them and, later in the framework, train them. The classification results help in applying the right extraction strategy.

Classify document scope
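The idea of running several classifiers in one scope and keeping the best answer can be sketched in a few lines of Python. This is a conceptual illustration under simple assumptions, not how UiPath's classifiers work internally: `keyword_classifier` and `classify_document` are hypothetical helpers, and the confidence is just the share of keywords found in the text.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (document type or None, confidence between 0 and 1)
Classification = Tuple[Optional[str], float]


def keyword_classifier(keywords: Dict[str, List[str]]) -> Callable[[str], Classification]:
    """Build a classifier that scores each type by the share of its keywords found in the text."""
    def classify(text: str) -> Classification:
        lowered = text.lower()
        best_type, best_score = None, 0.0
        for doc_type, words in keywords.items():
            if not words:
                continue  # a type without keywords can never match
            hits = sum(1 for word in words if word in lowered)
            score = hits / len(words)
            if score > best_score:
                best_type, best_score = doc_type, score
        return best_type, best_score
    return classify


def classify_document(text: str, classifiers: List[Callable[[str], Classification]]) -> Classification:
    """Run every classifier on the text and keep the most confident result."""
    results = [classifier(text) for classifier in classifiers]
    return max(results, key=lambda result: result[1], default=(None, 0.0))


# Two hypothetical classifiers, one per document type from the taxonomy example.
classifiers = [
    keyword_classifier({"Invoice": ["invoice", "total", "vendor"]}),
    keyword_classifier({"Medical Form": ["patient", "insured"]}),
]
```

A result of `(None, 0.0)` means that no classifier recognized the document, which is the situation that training is meant to address.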


Extraction means getting just the data you are interested in. For example, extracting specific data from a 5-page document is quite troublesome if you try to do it with string manipulation. In this framework, you can use different extractors for the different document structures within the same scope. The extraction results are passed on for validation.

Data extraction scope
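As a rough illustration of choosing an extractor per document type, the sketch below uses one set of regular expressions for each type. The `EXTRACTORS` table, the patterns and the `extract_fields` function are assumptions made for this example; the real extractors in the framework are configured as activities, not written like this.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns, one set per document type from the taxonomy example.
EXTRACTORS = {
    "Invoice": {
        "Vendor": re.compile(r"Vendor:\s*(.+)"),
        "Total Amount": re.compile(r"Total:\s*([\d.,]+)"),
    },
    "Medical Form": {
        "Insured ID Number": re.compile(r"Insured ID:\s*(\w+)"),
        "Patient Name": re.compile(r"Patient:\s*(.+)"),
    },
}


def extract_fields(document_type: str, text: str) -> Dict[str, Optional[str]]:
    """Apply the patterns registered for the document type; fields that are not found map to None."""
    patterns = EXTRACTORS.get(document_type, {})
    results = {}
    for field_name, pattern in patterns.items():
        match = pattern.search(text)
        results[field_name] = match.group(1).strip() if match else None
    return results
```

Fields that come back as `None` are natural candidates for the human validation step that follows.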


The extracted data can be validated by a human user through the Validation Station. A best practice is to build logic around the decision of whether to add a human validation step, with rules that depend on the specific use case being implemented. Validation results can then be exported and used in further automation activities.

Present validation station
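The decision logic mentioned above could look something like the following sketch. It assumes that each extracted field comes with a confidence score between 0 and 1; the `needs_human_validation` function, the 0.8 threshold and the list of required fields are hypothetical and should be adapted to the use case.

```python
from typing import Dict, Iterable, Optional, Tuple

# field name -> (extracted value or None, confidence between 0 and 1)
ExtractionResult = Dict[str, Tuple[Optional[str], float]]


def needs_human_validation(
    extracted: ExtractionResult,
    threshold: float = 0.8,
    required: Iterable[str] = (),
) -> bool:
    """Return True when a required field is missing or any field falls below the threshold."""
    for name in required:
        if name not in extracted or extracted[name][0] is None:
            return True  # a required field was not extracted at all
    return any(confidence < threshold for _, confidence in extracted.values())
```

With a rule like this, only the documents that fail it are presented in the Validation Station, while the rest continue straight to export.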


Once you have your validated information, you can use it as it is, or save it in a DataTable format that can be converted very easily into an Excel file.

Export extraction results
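Outside of UiPath, the same idea of flattening validated results into a table can be sketched with Python's standard `csv` module, producing a file that Excel can open. The `results_to_csv` function and the column names are assumptions for illustration only; in the framework itself this is done by the export activity and a DataTable.

```python
import csv
import io
from typing import Dict, List, Optional


def results_to_csv(rows: List[Dict[str, Optional[str]]], columns: List[str]) -> str:
    """Write one row per document with a fixed column order; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Keep only the requested columns so every row has the same shape.
        writer.writerow({column: row.get(column) or "" for column in columns})
    return buffer.getvalue()
```

The returned string can be written to a `.csv` file as is.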

Training Classifiers and Extractors

Classification and extraction are only as good as the classifiers and extractors used. If a document was not classified properly, it means it was unknown to the active classifiers. The same goes for incorrect data extraction. The framework provides the opportunity to train the classifiers and the extractors, to improve recognition of documents and fields.
