To extract data from official documents (passport, birth certificate, visa, etc.), you can use the Smart Engines module. It recognizes passports and other identification documents issued in Russia and other countries. Files in .pdf and .jpg formats are supported.
By default, the module is available as a trial version for evaluation only: some characters of the recognized data are hidden. To activate the full version, you need to purchase a license. For more information, contact us: firstname.lastname@example.org.
To recognize documents of the Invoice and Universal Transfer Document types, use the Intellect lab module. It recognizes files in .pdf format. The module is deployed on a separate server and licensed separately from the platform. For more information, contact your sales manager.
You can set context variables that will store the recognized text and use them in the process. Read more about context variables in the Process context article.
To open the settings window, click on the activity on the process diagram.
The Parameters tab displays the basic activity parameters.
- Name. The activity name shown on the process diagram. A default name is set when you add the activity. You can change it in this field.
- Document. The document you want to extract data from. This field supports only variables of the File type. You can add a new variable by clicking the icon. Read more about creating context variables in the Process context article.
- Recognition method. Select the recognition method: Smart Engines or Intellect lab.
Document recognition using Smart Engines
After selecting the Smart Engines method, fill in the following fields.
- Country. Select the country of the document.
- Recognized document type. Select a document type (visa, passport, birth certificate, TIN, etc.).
To set the data extracted from a document and the variables that will store it, click the Assign Variables button. The attributes of the selected document type will be displayed. Each document has its own set of attributes.
Document recognition using Intellect lab
After selecting the Intellect lab method, fill in the following fields.
- Server. The Intellect lab server address.
- Waiting for server response (sec.). The time to wait for the server response, in seconds. After this time, the process continues.
- Recognized document type. Select the type for document recognition: Invoice or Universal Transfer Document. If you’re not sure, select both types.
- Type of the returned document. A variable that will store the type of the recognized document. You can select it from the list or create a new one by clicking the icon. Read more about creating context variables in the Process context article.
To set the data you need to extract and to specify the variables that will store it, click Assign variables. After that you will see the attributes of the selected document type. Each document has its own set of attributes.
Select a process context variable to store the data of each attribute. You can also create a new variable by clicking the button. To delete a variable, click the icon.
When you use the Smart Engines method, the window also displays the recognition accuracy setting for the selected attribute. It sets the minimum acceptable accuracy (confidence) of the recognized value. The accuracy actually achieved depends on many factors, and one of the most important is the quality of the document image. For example, suppose you set 90 percent (0.9). If a value is recognized with an accuracy of 90 percent or more, it is accepted. If the accuracy is below 90 percent, the value is not accepted and the context variable is not filled.
The lower the required accuracy, the more likely you are to get incorrect data, so choose this parameter carefully. Consider the document quality first. If you are sure that it is high, you can set an accuracy of 97 percent or more. If the quality is slightly lower, it is better to set 94 percent. If it is low, set about 90 percent or less.
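The acceptance rule described above can be illustrated with a short sketch. This is not the module's actual code or API: the function name `fill_variables`, the attribute names, and the shape of the recognition result (a value paired with a confidence between 0 and 1) are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

# Hypothetical recognition output: attribute name -> (value, confidence in 0..1).
Recognized = Dict[str, Tuple[str, float]]


def fill_variables(
    recognized: Recognized,
    thresholds: Dict[str, float],
) -> Dict[str, Optional[str]]:
    """Map each attribute to its accepted value, or None if it is not accepted."""
    variables: Dict[str, Optional[str]] = {}
    for attribute, minimum in thresholds.items():
        if attribute not in recognized:
            # Nothing was recognized for this attribute: leave the variable empty.
            variables[attribute] = None
            continue
        value, confidence = recognized[attribute]
        # Accept the value only when confidence is at or above the minimum accuracy.
        variables[attribute] = value if confidence >= minimum else None
    return variables


# Illustrative data only: attribute names and confidences are made up.
recognized = {
    "surname": ("IVANOV", 0.97),
    "birth_date": ("1990-01-15", 0.88),
}
thresholds = {"surname": 0.90, "birth_date": 0.90}
result = fill_variables(recognized, thresholds)
print(result)
```

With a threshold of 0.9 for both attributes, the surname is accepted and the birth date is left unfilled. Lowering the birth date threshold to 0.85 would accept it, at the cost of a higher chance of storing an incorrect value.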
If data are not recognized, the process does not move further and an error occurs. To avoid this, create an escalation for this activity. Read more about escalation in the Execution flow article.
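As a rough analogy, an escalation behaves like an error handler that routes the process to an alternative branch instead of stopping it. The sketch below is only a model of that idea, not platform code: `RecognitionError`, `recognition_step`, `run_step`, and the branch names are hypothetical, and it assumes that "not recognized" means no attribute received a value.

```python
from typing import Dict, Optional


class RecognitionError(Exception):
    """Hypothetical error raised when no data could be extracted."""


def recognition_step(extracted: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    # Assumption: the step fails when every attribute is empty.
    if not any(value is not None for value in extracted.values()):
        raise RecognitionError("no data recognized")
    return extracted


def run_step(extracted: Dict[str, Optional[str]]) -> str:
    """Return the name of the branch the process follows after the step."""
    try:
        recognition_step(extracted)
    except RecognitionError:
        # Escalation path: continue along an alternative branch instead of stopping.
        return "manual_entry"
    return "next_activity"


print(run_step({"surname": None}))
print(run_step({"surname": "IVANOV"}))
```

In this model, a failed recognition sends the process to a manual entry branch, while a successful one continues to the next activity.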
You can search for document attributes by their names. To do this, start entering a name in the search bar. The search results will immediately be displayed in the table.
The Extracted data block displays all the selected attributes and context variables.
Read more about the Conditions tab in the Basic activity settings principles article.