How to use Azure AI Document Intelligence for AI-based text recognition

With Azure AI Document Intelligence, Microsoft offers a sophisticated solution for the automated scanning and analysis of documents in many common file formats. We provide a practical, hands-on introduction to this cost-saving AI technology.

Key facts:

  • What it’s all about: Azure AI Document Intelligence is an AI-based text recognition solution in the Microsoft Azure Cloud. The solution can extract information from PDFs, photos, graphics or handwriting and prepare it in a structured way.
  • Advantages: The solution simplifies processes, automates tedious tasks and saves costs. Example applications include customer service (applications, complaints), accounting (scanning receipts), healthcare (prescriptions), laboratory technology (laboratory reports), archiving (document archives) and many more.
  • Usage: Integration via API
  • Costs: approx. $1.50 per 1,000 pages

What Azure AI Document Intelligence does

Azure AI Document Intelligence (see Microsoft product page) is part of the Azure AI platform and uses machine learning methods to scan, recognize and classify documents. It can process a variety of document types, including invoices, receipts, forms and even handwritten notes, as well as custom document types.

Unlike standalone solutions such as OmniPage or Adobe Acrobat Pro DC, the Azure-based solution is integrated into your own applications via API. This direct integration allows you to create an application that is precisely tailored to your requirements and processes. Some development work is required for the API integration, but the solution is easy to configure and quick to use. Microsoft also provides code examples for the supported languages, which saves developers valuable time.

Advantages of Azure AI Document Intelligence

  • Recognize text and tables in documents of all formats such as PDF, graphics, handwriting (OCR)
  • Introduce fast document processing in the company (efficiency)
  • Reduce costs (automation instead of manual work)
  • Minimize errors (e.g. avoid typos from manual data entry)

Costs

As usual with Azure, costs are usage-based. The advantage of the cloud solution is that you do not have to purchase a product with expensive license fees; instead, you pay flexibly according to the number of pages or documents processed (“pay as you go” model).

  • 0-500 pages/month: free of charge
  • 1,000 pages/month: $1.50 (cheaper for > 1 million pages/month)
  • see cost calculator on the Microsoft Azure website
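The price list above can be turned into a rough monthly estimate. The following Python sketch is a simplified model that uses only the figures quoted in this article ($1.50 per 1,000 pages, 500 free pages per month). The function name and parameters are made up for illustration, the volume discount above 1 million pages is ignored, and actual prices depend on the model and region, so treat the result as a ballpark figure and check the Azure cost calculator for real numbers.

```python
def estimate_monthly_cost(pages, price_per_1000=1.50, free_pages=500):
    """Rough monthly cost in USD for a given number of pages.

    Assumptions (taken from the article, not from a live price list):
    - up to `free_pages` pages per month are covered by the free plan
    - beyond that, every page is billed at `price_per_1000` / 1000
    - no volume discount is applied
    """
    if pages < 0:
        raise ValueError("pages must not be negative")
    if pages <= free_pages:
        return 0.0
    return round(pages / 1000 * price_per_1000, 2)


# Print estimates for a few example volumes.
for pages in (300, 1000, 20000):
    print(f"{pages:>6} pages/month: ${estimate_monthly_cost(pages):.2f}")
```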

Text recognition: Typical use cases and business benefits

Azure AI Document Intelligence makes cost and efficiency benefits possible for numerous industries. In short: the solution can help wherever a large number of documents are regularly processed. Here are a few examples.

Invoice processing: Azure AI Document Intelligence can help to automatically scan incoming invoices, extract the relevant data such as amount, date and invoice number and transfer it to the accounting system. This reduces manual, time-consuming activities and speeds up the entire accounting process.
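To illustrate the hand-over to an accounting system, here is a minimal Python sketch. It assumes that the analysis result has already been fetched and that each extracted field carries a "content" text and a "confidence" score. The field names InvoiceId, InvoiceDate and InvoiceTotal follow the prebuilt invoice model as we understand it and should be checked against the current documentation. The sample values, the 0.8 threshold and the function name are invented for illustration.

```python
# Invented sample of extracted invoice fields (not real service output).
SAMPLE_FIELDS = {
    "InvoiceId": {"content": "INV-100", "confidence": 0.97},
    "InvoiceDate": {"content": "2024-03-01", "confidence": 0.95},
    "InvoiceTotal": {"content": "$110.00", "confidence": 0.62},
}

# Fields the (hypothetical) accounting system expects.
WANTED_FIELDS = ("InvoiceId", "InvoiceDate", "InvoiceTotal")


def to_accounting_record(fields, min_confidence=0.8):
    """Return (record, needs_review).

    `record` maps each wanted field to its extracted text (None if missing).
    `needs_review` lists fields that are missing or below `min_confidence`,
    so that a person can check them before booking.
    """
    record = {}
    needs_review = []
    for name in WANTED_FIELDS:
        field = fields.get(name)
        record[name] = field["content"] if field else None
        if field is None or field.get("confidence", 0.0) < min_confidence:
            needs_review.append(name)
    return record, needs_review


record, needs_review = to_accounting_record(SAMPLE_FIELDS)
print(record)
print("Please check manually:", needs_review)
```

Routing low-confidence fields to a person keeps most of the automation benefit while catching uncertain values before they reach the books.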

In customer service, inquiries can be processed more efficiently by automatically analyzing and classifying incoming documents such as application forms or letters of complaint. Inquiries reach the responsible employees faster, which improves customer service.

In the healthcare sector, Azure AI Document Intelligence enables more efficient processing of patient files by automatically capturing relevant information such as diagnoses or treatment plans. This contributes to improved patient care and more efficient administrative processes.

In logistics, the automatic processing of delivery bills and bills of lading can lead to faster supply chain processes by extracting and processing relevant information such as delivery addresses or product lists immediately.

In the field of digital humanities, Azure AI Document Intelligence supports the creation of digital archives by digitizing and analyzing historical documents and manuscripts. Prominent examples of such document libraries include Project Gutenberg (thousands of digitized public domain books available for free) and the Internet Archive (digital content ranging from websites to books and music). These applications enable broad access to cultural and historical material, promote research and education and facilitate the creation of interactive learning materials.

Tutorial: Azure AI Document Intelligence in 5 steps

In no time at all, you can set up a new instance of the solution in Azure, try it out interactively in the Studio and then integrate it into your own processes via API.

About this short tutorial:

  • Goal: Use Document Intelligence in Azure and learn how to integrate it via API
  • Suitable for: Azure beginners and professionals, developers, data analysts
  • Time required: 15 minutes
  • Cost: free to very low

The steps at a glance:

  • Step 1: Set up an Azure account
  • Step 2: Create a Document Intelligence resource
  • Step 3: Call up Document Intelligence Studio
  • Step 4: Use Document Intelligence Studio
  • Step 5: Integration via API

Step 1: Set up an Azure account

If you do not yet have an Azure account, you can test Azure for 30 days free of charge and receive $200 starting credit, which is more than enough for a lot of data and tests.

Step 2: Create a Document Intelligence resource

Now let’s set up a free cloud instance of Document Intelligence (previously: “Form Recognizer”). To do this, switch to the “Azure AI Services” service in Azure and create a new Document Intelligence resource using the “Create” button. Afterwards, you can always find the resource on the “Azure AI Services” overview page, in the menu on the left under “Azure AI Document Intelligence” (or under this direct link).

Settings:

  • Subscription: Select your Azure subscription
  • Resource group: Create a new resource group (this bundles several Azure services together and you can easily find and delete them later)
  • Name: DocumentIntelligence-RS1 (suggestion: product name, your initials and the number of your test, here 1)
  • Server region: Germany West Central (or other location in Europe)
  • Cost plan: Free F0 (up to 500 pages free of charge)
  • Click on “Create” and wait 1-2 minutes until your instance has been created

Step 3: Call up Document Intelligence Studio

In the next step, we navigate to the Document Intelligence Studio in Azure.

Step 4: Use Document Intelligence Studio

In the Azure Cloud, there is an interactive “Studio” application for many Azure tools, with which you can easily test the tool.

We now want to read out a table from an annual report as a test – once as a PDF and once as a scanned graphic. Templates are already available for this in the Studio.

Settings:

  • Application: Click on “Layout” (this model extracts text together with structural elements such as tables)
  • Select document type: The choices are Invoice, Receipt, Identity, Health insurance card, Business card, Contract and Tax forms. Choose the type that matches your document so that the extracted data lands in the correct structure. A particular strength of the solution is that you can also create and train your own document types.
  • Select document: Upload your own documents or select the annual report from the templates on the left.
  • Click on “Analyze options”: Here you can make a few more settings, such as the page to be scanned if you want to scan a multi-page PDF.
  • Click on “Run analysis”: This starts the text analysis. The tool will now highlight all recognized texts in color.

Result:

When you click on an extracted area, Document Intelligence displays the extracted data to the right of it. In this example, it is a complete table from the scanned annual report, in which all cells and headers were recognized correctly. The table is now available in a structured format and can therefore easily be processed by machine.
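What “structured format” means in practice can be shown with a short Python sketch. It assumes the table arrives in the shape used by the service's JSON result as we understand it: a "rowCount", a "columnCount" and a list of "cells", each with "rowIndex", "columnIndex" and "content". The sample table is invented and the helper names are hypothetical. Please compare the keys with your own analysis result before relying on them.

```python
import csv
import io

# Invented sample in the assumed shape of one recognized table.
SAMPLE_TABLE = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "content": "Year", "kind": "columnHeader"},
        {"rowIndex": 0, "columnIndex": 1, "content": "Revenue", "kind": "columnHeader"},
        {"rowIndex": 1, "columnIndex": 0, "content": "2023"},
        {"rowIndex": 1, "columnIndex": 1, "content": "1,250"},
    ],
}


def table_to_rows(table):
    """Place each cell's text at its row/column position in a 2D list."""
    rows = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        rows[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, e.g. for import into a spreadsheet."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = table_to_rows(SAMPLE_TABLE)
print(rows_to_csv(rows))
```

Cells that the service did not return stay empty, so the grid keeps its shape even when a table contains blank cells. Merged cells that span several rows or columns are not handled in this sketch.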

Step 5: Integration via API

Azure AI Document Intelligence can be easily integrated into existing applications. SDKs are available for C#, Java, Python and JavaScript; alternatively, you can call the REST API directly.

Instructions: Integrating Azure AI Document Intelligence via API
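As a rough idea of what the REST route can look like, here is a Python sketch that uses only the standard library. It assumes the request pattern of the 2023-07-31 API version as we understand it: a POST to the analyze endpoint with the key in the Ocp-Apim-Subscription-Key header, followed by polling the URL returned in the Operation-Location header. The API version, the URL path and the helper names are assumptions on our part. Newer API versions use a different path, so please check the official instructions before using this.

```python
import json
import time
import urllib.request

API_VERSION = "2023-07-31"  # assumption: verify against the current docs


def build_analyze_request(endpoint, key, document_url, model_id="prebuilt-layout"):
    """Build (but do not send) the request that starts an analysis."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def analyze(endpoint, key, document_url, poll_seconds=2.0, max_polls=30):
    """Start the analysis, then poll until it has succeeded or failed."""
    request = build_analyze_request(endpoint, key, document_url)
    with urllib.request.urlopen(request) as response:
        operation_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("analysis did not finish in time")
```

To try it, call analyze() with the endpoint and key shown for your resource in the Azure portal and the URL of a publicly reachable document. Nothing is sent until analyze() is called, and the key should come from an environment variable or a secret store rather than from source code.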

Alternatives: Other text recognition solutions

There are several alternative software solutions on the market that offer similar functions to Azure AI Document Intelligence, particularly in the area of document analysis and processing using artificial intelligence and machine learning. Some of these solutions are:

Standalone (“on-premises”) solutions:

  • Adobe Acrobat Pro DC: Provides advanced PDF editing capabilities, including text recognition and conversion, document comparison and easy integration with other services.
  • OmniPage by Kofax: A powerful OCR tool used for document conversion and digitization that provides high accuracy in text recognition.
  • ABBYY FineReader: An OCR and PDF software solution that enables scanned documents and PDFs to be converted into editable and searchable formats.
  • Readiris: OCR software that enables text recognition in scanned documents, PDFs and images and saves the converted files in various formats.
  • ScanSoft PaperPort: Provides document management and digitization capabilities and allows digital documents to be organized and shared.

Cloud solutions:

  • Google Cloud Vision API: This solution from Google provides advanced image analysis capabilities and can recognize and extract text in documents, similar to Azure AI Document Intelligence.
  • Amazon Textract: A service from Amazon Web Services that makes it possible to automatically extract, process and analyze text and data from documents.
  • IBM Watson Discovery: This tool from IBM uses AI to understand and analyze complex data and gain valuable insights from it. It can also be used to process documents.
  • ABBYY FlexiCapture: An advanced data capture and document processing solution that uses machine learning to analyze documents and extract information. (Cloud and standalone available)
  • Kofax Capture: Provides automated capture, processing and integration of documents and data into business processes and systems. (Cloud and standalone possible)
  • Ephesoft Transact: A platform for intelligent document processing that uses machine learning and AI to extract and classify data from documents. (Cloud and standalone possible)

Conclusion: Microsoft Azure AI Document Intelligence

Microsoft’s text recognition service “Azure AI Document Intelligence” offers a flexible and powerful AI-based solution for the automated processing of large volumes of documents in companies.

Because the cloud location can be chosen freely, for example Germany or another European region, GDPR-compliant solutions can be created. In addition, the costs of the Azure cloud solution are low. Anyone already using the Azure Cloud can quickly put this service into productive use.

By integrating the solution into your own processes, you can save time and effort and create new, helpful applications in companies.