Automate Your Invoice Processing: Extract Data From Invoices in 6 Easy Steps

invoice data extraction with width.ai

The state of invoice processing remains alarming and surprising — 36% of firms still use paper invoicing, 47% rely on manual approvals, and 44% are still looking to add invoice automation. But why are these numbers so high and why haven't four decades of computerization and automation brought them down? Do you have to deal with invoicing problems in your business too?

In this article, you'll learn about the significant problems of existing systems, find out how deep learning algorithms for automated data extraction eliminate most of those drawbacks, and learn the six steps to add fully automated invoice data extraction to your business.

What Is Automated Invoice Data Extraction?

example of ocr on text

Your business or organization purchases goods and services from multiple vendors. The invoices they send you are processed by your accounting department. But the problem is that most invoices and receipts are meant to be read by people and not by software systems. So accounting teams have to enter the data into your accounting system, check it for problems, confirm the details with other departments, and release payments as per each vendor's payment terms. In some cases, specific data points are pulled to be stored for business or marketing purposes.

If you're a medium or large business, you may receive hundreds of such invoices or receipts every day. Processing them takes many hours of manual work to ensure the data is stored correctly in software such as QuickBooks or your internal CRM.

But this workflow can be made faster and less error-prone with automated invoice data extraction systems. They use machine learning to extract the different data points and understand what they are (price, name, quantity, product) in any invoice, even a paper or handwritten one, regardless of its layout, formatting, language, currency, and other details. They output all that information as structured data that can be consumed by downstream systems such as ERPs, CRMs, and internal databases.
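To make "structured data" concrete, here is a minimal Python sketch of what an extracted invoice record could look like once it is ready for a downstream system. The class names, field names, and sample values are illustrative assumptions, not the actual output schema of any particular product.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class LineItem:
    product: str
    quantity: int
    unit_price: float

    @property
    def total(self) -> float:
        # Line total, rounded to cents.
        return round(self.quantity * self.unit_price, 2)


@dataclass
class Invoice:
    # Hypothetical field names chosen for illustration only.
    vendor_name: str
    invoice_number: str
    currency: str
    line_items: List[LineItem] = field(default_factory=list)

    @property
    def total(self) -> float:
        return round(sum(item.total for item in self.line_items), 2)


def to_json(invoice: Invoice) -> str:
    """Serialize the invoice so an ERP, CRM, or database loader can consume it."""
    payload = asdict(invoice)
    payload["total"] = invoice.total
    return json.dumps(payload, indent=2)


invoice = Invoice(
    vendor_name="Acme Supplies",
    invoice_number="INV-1001",
    currency="USD",
    line_items=[LineItem("Printer paper", 10, 4.50), LineItem("Toner", 2, 39.99)],
)
print(to_json(invoice))
```

Once the data has this shape, pushing it into an accounting system is an ordinary integration task rather than a data-entry task.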

To appreciate their benefits, you need to first understand all the typical problems businesses face when processing invoices using other methods.

Problems With Manual Invoice Processing

Processing invoices manually still happens in about 36% of U.S. businesses.

The points of contact assigned to vendors receive their invoices first. They do basic checks like looking for missing details, wrong line items, and other obvious problems. However, they can't practically check for deeper problems like duplicated entries across invoices or fraudulent entries. Invoices that pass the basic checks are passed on to the accounting department.

The accounts payable team then enters the key data in their internal database or accounting system. They may check for deeper problems like duplicated or fraudulent entries and mismatches with purchase orders. But even such checking requires manual querying and searching in their data. Typically, a senior accountant then double checks everything to reduce errors.
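The deeper checks described above, which are tedious to run by hand, become simple queries once invoice data is structured. The following Python sketch illustrates the idea under assumed record fields (`id`, `vendor`, `number`, `amount`, `po`); it is an illustration, not a description of any specific accounting system.

```python
from collections import defaultdict


def find_duplicates(invoices):
    """Group invoice ids that share vendor, invoice number, and amount."""
    seen = defaultdict(list)
    for inv in invoices:
        key = (inv["vendor"].strip().lower(), inv["number"].strip(), round(inv["amount"], 2))
        seen[key].append(inv["id"])
    return [ids for ids in seen.values() if len(ids) > 1]


def po_mismatches(invoices, purchase_orders, tolerance=0.01):
    """Return ids of invoices whose amount differs from the purchase order,
    or whose purchase order is unknown."""
    flagged = []
    for inv in invoices:
        expected = purchase_orders.get(inv["po"])
        if expected is None or abs(expected - inv["amount"]) > tolerance:
            flagged.append(inv["id"])
    return flagged


# Sample data invented for this example.
invoices = [
    {"id": 1, "vendor": "Acme", "number": "INV-7", "amount": 120.0, "po": "PO-1"},
    {"id": 2, "vendor": "acme ", "number": "INV-7", "amount": 120.0, "po": "PO-1"},
    {"id": 3, "vendor": "Globex", "number": "G-22", "amount": 310.5, "po": "PO-2"},
]
purchase_orders = {"PO-1": 120.0, "PO-2": 300.0}

print(find_duplicates(invoices))
print(po_mismatches(invoices, purchase_orders))
```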

So manual processing suffers some major challenges:

Most businesses know about these problems and have tried to use semi-automated approaches for years. But those have problems too, as we'll see next.

Problems With Template-Based Semi-Automated Approaches

Most businesses today, whatever their size, use semi-automated invoice processing involving optical character recognition (OCR) and template matching. OCR first extracts all the text found in an invoice along with its position on the page. Custom parsing rules then process that text based on its position and wording.

While this works for standardized documents, it doesn't work well for invoices. Invoices vary widely in layouts and positions. So existing invoice management systems have tried to ease the problem using the concept of invoice templates.

An invoice template is a collection of positional information and parser rules that allow the management system to transform unstructured text into a semblance of structured data if that invoice matches that template. Invoice management systems try to make template creation easy by providing visual editors and predefined rules. But what if you receive dozens or hundreds of invoices from different vendors every day? Because these templates are not very flexible, the only option is to create a template for every unique invoice layout.
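A small Python sketch shows why positional templates are brittle. Here a template is assumed to be a set of bounding boxes, one per field, and OCR output is assumed to be a list of `(text, x, y)` tokens. The coordinates and names are invented for illustration.

```python
def extract_with_template(tokens, template):
    """Return the text found inside each field's bounding box, or None."""
    result = {}
    for field_name, (x0, y0, x1, y1) in template.items():
        hits = [text for text, x, y in tokens if x0 <= x <= x1 and y0 <= y <= y1]
        result[field_name] = " ".join(hits) or None
    return result


# Template drawn for one vendor's layout: (x0, y0, x1, y1) per field.
VENDOR_A_TEMPLATE = {
    "invoice_number": (400, 40, 560, 60),
    "total": (400, 700, 560, 720),
}

vendor_a_tokens = [("INV-1001", 420, 50), ("$124.98", 430, 710)]
# Another vendor prints the same information in different places.
vendor_b_tokens = [("INV-88", 60, 50), ("$310.50", 430, 640)]

result_a = extract_with_template(vendor_a_tokens, VENDOR_A_TEMPLATE)
result_b = extract_with_template(vendor_b_tokens, VENDOR_A_TEMPLATE)
print(result_a)  # both fields found
print(result_b)  # both fields are None: the template does not fit this layout
```

The second invoice contains the same kinds of information, but because nothing falls inside the expected boxes, the template extracts nothing and a new template has to be written.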

So even these semi-automated approaches bring several challenges:

4 Compelling Benefits of Automating Invoice Processing With Deep Learning

invoice text recognition

Semi-automated approaches lack deep learning models that can learn the relationship between invoice text, text positioning, and keywords, so they cannot automatically extract the invoice data that matters to you.

Leveraging natural language processing and computer vision architectures to handle all stages of invoice processing is a game changer. These architectures allow us to create pipelines that can process a huge amount of variation in invoices or receipts with breathtaking accuracy in a matter of seconds. Let's understand the significant business benefits of such a system over the older manual and template-based approaches.

1. Handle A Huge Range Of Invoice Variation

Instead of relying on rigid rules and simple pattern matching, these systems understand the semantics of invoice data the way people do. So they can handle invoices with any layout, formatting, lighting conditions, and other aspects. They quickly learn the relationship between the text on the invoice and the fields you’re looking for. These deep learning architectures can extract from row-based invoices, grids, and receipts, while handling fields whose content varies in length. That means whether your product name is 2 words or 20, the field can be extracted and understood for what it is. Some fields, such as price or quantity, need even less information and can rely on the architecture's understanding of column names and field locations.

receipt ocr

2. Highly Accurate Results

Unlike traditional OCR, deep learning combines visual features and natural language models for true entity understanding. These models are largely immune to typos and misrecognized characters. The high accuracy also brings significant downstream benefits like reducing the risks of financial losses or late payment penalties.

The accuracy of general-purpose OCR products on invoices is simply not good enough. Their APIs lack the field-level understanding needed to map extracted text to invoice fields. On top of that, they do not allow custom fine-tuning on your specific data to boost accuracy in your context.

accuracy of leading ocr products on invoices

3. Increase Processing Efficiency, Reduce Time, and Lower Costs

Intelligent processing that matches human-level understanding streamlines every step of the invoice processing pipeline. High accuracy significantly reduces the time, effort, and money spent on manual data entry and double-checking. This pipeline runs in a matter of seconds and removes the need for any human intervention.

4. Integrate Seamlessly With All Your Accounting Practices

The ability to intelligently understand invoice data means these systems can automate other accounting practices too:

Our Automated Invoice Data Extraction Software Streamlines Your Invoice Processing

width.ai ocr processing

Width.ai automated invoice extraction offers state-of-the-art accuracy in a fully automated, hands-off pipeline that processes your invoices in seconds. The API can be customized to automatically accept invoices from your internal workflows and automatically push the extracted invoice data to your CRM, ERP, DB, or email. The pipeline is fully customizable and can be fine-tuned for your specific domain or invoices to push the accuracy even higher. Whether you have zero invoices or millions, you can deploy this architecture and start processing documents in just 6 easy steps.

Step 1: Plan the Integration Into Your Invoice Processing Workflows

The first step is planning how to include our system in your existing business practices. This involves answering questions like:

The answers to such questions help us assess your integration and deployment requirements. For example, if you want thousands of historical invoices processed in a short time, we deploy additional cloud infrastructure to process your workload. If you're a large enterprise with hundreds of general ledger codes, then we train a secondary machine learning model to automatically generate a code for each invoice line item.

Step 2: Configure The Invoice Capture Software for Reading Your Invoices

input of width.ai invoice processing

Our system supports a wide variety of invoice file formats and sources. It does not require you to provide invoice templates or define a fixed set of them in advance. In this step, you should set up your invoice input pipeline and tell our system about the sources of invoices.

Setting Up Digitization for Paper Invoices

If some of your vendors still send paper invoices, we should set up a data capture pipeline that digitizes them as images or PDFs. There are two approaches. A scanner is perfect if you have one. Alternatively, invoices can be photographed, since our system is capable of processing images and PDFs of varying quality.

The second approach is probably the easiest way to integrate our automated system into your accounting workflows. Equip your current manual processors with photo-taking tools and set them up to transfer the invoice photos to internal network storage or to cloud storage like AWS S3. Our system can automatically fetch new files from there and process them.
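As a rough illustration of "fetch new files and process them", the Python sketch below polls a folder for invoice files it has not seen before. A local directory stands in for network or cloud storage, and the list of accepted extensions is an assumption made for this example rather than a statement of supported formats.

```python
import tempfile
from pathlib import Path

# Assumed set of accepted extensions, for illustration only.
SUPPORTED = {".pdf", ".png", ".jpg", ".jpeg", ".tiff"}


def fetch_new_invoices(inbox: Path, processed: set) -> list:
    """Return invoice files in the inbox that have not been handled yet."""
    new_files = []
    for path in sorted(inbox.iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED and path.name not in processed:
            new_files.append(path)
            processed.add(path.name)  # remember it so it is not picked up twice
    return new_files


# Demo with a temporary folder standing in for shared storage.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    (inbox / "scan_001.jpg").write_bytes(b"fake image bytes")
    (inbox / "notes.txt").write_text("not an invoice")
    processed = set()
    first_pass = [p.name for p in fetch_new_invoices(inbox, processed)]
    second_pass = [p.name for p in fetch_new_invoices(inbox, processed)]

print(first_pass, second_pass)
```

In a real deployment the same loop would typically run on a schedule or be replaced by storage event notifications, so staff only have to drop photos into the shared location.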

Setting Up Digital Format Invoices

Many businesses already send PDF invoices or other digital formats. Our system can read these files directly and process them without any invoice templates. We’ll configure our system to read them from your internal databases, emails, or other sources.

Configure Integration With E-Invoicing and Accounting Systems

If your business has an invoice management system (like FreshBooks, QuickBooks, Zoho Books, Xero, or Pilot) or an accounting system (like SAP FI), our system knows how to query and fetch invoices directly from them using their APIs. Our system does not need fine-tuning to start processing your invoices from the most common invoice management systems.

Step 3: Refine the Architecture on Your Invoices

invoice example

Fine-tuning the deep learning models that extract text and learn the relationship between text and fields takes us from baseline models, built to support as much data variance as possible, to a domain-specific pipeline optimized for your use case. Fine-tuning also allows us to add unique fields or special remarks to the list of fields extracted when your invoices are processed. This is a huge part of the equation, as it lets us take the state-of-the-art architecture we’ve built and fully customize it to your business workflow.

Optimizing For Your Invoices

Right out of the box, our system handles invoice layouts produced by popular invoicing software like FreshBooks with high accuracy and good coverage of data variance. To boost accuracy for your specific business workflow, we fine-tune the deep learning models on real invoice and receipt examples from your business.

Extract Critical Additional Information

Layout variations may not be your only concern. While our model supports over 50 of the most common fields right out of the box, you may want to extract additional important information according to your unique accounting practices. For example:

Our system efficiently handles special information that is important to your business. For example, its natural language processing capabilities enable it to understand a sentence with payment terms and automatically file it under the payment terms field.

Add Custom Fields

custom fields with invoice data extraction

Our customizable pipeline allows you to add new fields that you want to extract on top of the 20+ default fields provided. We handle all the fine-tuning required to add these new fields to your specific pipeline.

Customizations based on one vendor's invoice can be automatically applied to all invoices across any vendor, or a subset of invoices. Our system then automatically looks for semantically similar information in every invoice (including older invoices if you want) to populate the relevant fields.

Some of the custom information we have extracted or added using such techniques includes:

Step 4: Configure Extracted Data Storage

output configuration of invoice processing

The ability to integrate with popular accounting systems and practices straight out of the box is an important requirement for any automated invoice data extraction system that aims to save time and effort. Our system comes with the following built-in export and storage integration features:

You can route the extracted data to multiple export workflows for any subset of invoices based on custom criteria, like vendor names or invoice amounts.
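Routing by custom criteria can be pictured as a list of rules, each pairing a condition with an export destination. The Python sketch below is only a conceptual illustration: the rule format, destination names, and the 10,000 threshold are hypothetical and do not reflect the product's configuration syntax.

```python
def route(invoice, rules, default="accounting_system"):
    """Return every export destination whose condition matches the invoice."""
    destinations = [dest for condition, dest in rules if condition(invoice)]
    return destinations or [default]


# Hypothetical rules: (condition, destination name).
RULES = [
    (lambda inv: inv["total"] >= 10_000, "manual_approval_queue"),
    (lambda inv: inv["vendor"] == "Acme Supplies", "acme_report_csv"),
]

large_acme = {"vendor": "Acme Supplies", "total": 12_000}
small_other = {"vendor": "Globex", "total": 50}

print(route(large_acme, RULES))   # matches both rules
print(route(small_other, RULES))  # falls back to the default destination
```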

Step 5: Start Processing Invoices

Once you have configured and fine-tuned our system to your invoices, it's ready to process your invoices in bulk and start extracting structured data with minimal manual intervention. Let's understand how it works under the hood.

How Processing Works

First, let's understand the high-level end-to-end processing of an invoice.

The heart of the system is a state of the art deep learning architecture. We’ve trained it on thousands of invoices — generated by ten of the most popular invoice management systems like FreshBooks and QuickBooks — to help it detect text, recognize the text characters, and associate invoice elements with appropriate fields. To do this, the system uses various characteristics of these elements, like their:

By focusing on the most popular examples from businesses, our system produces high accuracy right out of the box but also has enough flexibility to adapt to your specific business use cases.

In the third step where you refine it by providing your invoices, we clone our latest model and fine-tune it on your invoices to create a model that's adapted to your invoices. Your preferences, like custom fields, help further refine this model to identify the data you want.

This customized model is now ready to process your new invoices. It scans each invoice to extract visual and linguistic characteristics. The combinations of characteristics trigger different areas of the neural network to identify an element as an address, purchase order, invoice number, etc.

Some elements may require additional processing. For example, a secondary deep learning model classifies the line items identified in your invoices to output their general ledger codes.

The primary output of this phase is a set of fields and their values. This data is then routed to one or more export pipelines to produce reports in different formats or export data to an accounting system.

How OCR Is Used

ocr on receipts

Text detection and text recognition are important steps in this process. Text detection classifies regions in the input invoice as text or non-text elements (like lines of tables). Our system can identify handwritten text, irregularly oriented text, and signatures as text.

Text recognition involves recognizing the characters in the text. This is done based on visual features and a language model that gives the most probable words given the neighboring characters and the typical words in invoices. Such text understanding using multiple features provides far higher accuracy than plain OCR.
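The idea of choosing the most probable word can be illustrated with a deliberately simple stand-in. Instead of a trained language model, the Python sketch below uses the standard library's `difflib` to snap a misread token to the closest word in a small invoice vocabulary. The vocabulary and the similarity cutoff are assumptions for this example.

```python
import difflib

# Tiny stand-in for "typical words in invoices".
INVOICE_VOCAB = ["invoice", "total", "quantity", "subtotal", "amount", "due", "date", "payment"]


def correct_token(token, vocab=INVOICE_VOCAB, cutoff=0.75):
    """Replace a token with its closest vocabulary word (lowercased) when the
    two are similar enough; otherwise return the token unchanged."""
    matches = difflib.get_close_matches(token.lower(), vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# "1" misread for "i" or "l" is a common OCR confusion.
print(correct_token("1nvoice"))
print(correct_token("Tota1"))
print(correct_token("Acme"))  # not close to any vocabulary word, left as-is
```

A real language model scores candidates using the surrounding context rather than string similarity alone, which is what lets it recover from errors a dictionary lookup would miss.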

Data Extraction Using Deep Learning

We use a state-of-the-art deep learning pipeline to extract visual features, text, and any other fine-tuned fields. It captures all the text from the invoice as well as any visual information that helps us understand which fields in your system map to which data.

The recognized text is further refined by a second subnetwork that specializes in natural language processing, similar to language models like GPT-3. This model is trained on large volumes of text, like books and websites. As a result, it can identify typos and odd word combinations that are unlikely in the real world.

With all text fragments accurately identified and located, the final subnetwork is a deep learning model that identifies those fragments as fields or field values. It does so based on their meanings, positions on the page, and positions relative to each other, much as a person would. For example, a string of numbers and dashes in the upper region of an invoice is probably the date of issue, while a date in the lower region is probably the payment due date. Using such knowledge, it identifies a text fragment as a field value and assigns it a probable field name.

The final outputs of this model are a set of detected field values and their field names.
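To make the date example concrete, here is a hand-written Python rule that mimics the kind of positional reasoning described. It is not the learned model: a trained network infers such relationships from data, whereas this sketch hard-codes one of them. The date pattern, the half-page split, and the field names are assumptions.

```python
import re

# Matches strings made of digits and dashes, such as 2023-01-15.
DATE_PATTERN = re.compile(r"^\d{1,4}-\d{1,2}-\d{1,4}$")


def classify_fragment(text, y, page_height):
    """Guess a field name from a fragment's text and vertical position.

    y is measured from the top of the page.
    """
    if DATE_PATTERN.match(text):
        return "issue_date" if y < page_height / 2 else "due_date"
    return "unknown"


print(classify_fragment("2023-01-15", y=80, page_height=1000))
print(classify_fragment("2023-02-14", y=900, page_height=1000))
print(classify_fragment("Acme Supplies", y=80, page_height=1000))
```

Rules like this break as soon as a layout deviates from the assumption, which is exactly why a model that learns from many layouts is preferable to writing them by hand.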

Step 6: Measure and Monitor the Process

While we aim for minimal manual intervention in bulk invoice processing, monitoring metrics is necessary to ensure accuracy, and real-time alerts are needed to inform personnel about the progress or any issues that come up.

Model Metrics

The initial runs may show some errors in text recognition and field identification. If that happens, it's a hint to refine your model by providing more samples of problematic invoices. Our system reports metrics like accuracy, precision, recall, and F1-scores on the training and test invoices after each refinement step to help you understand the accuracy when extracting data from any invoice format.
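For readers unfamiliar with these metrics, the Python sketch below computes precision, recall, and F1 for a single invoice by comparing extracted fields against a hand-labeled reference. Treating a wrong value as both a false positive and a missed field is a common convention, assumed here. The sample fields are invented.

```python
def field_metrics(predicted, expected):
    """Compare predicted and expected {field: value} dicts for one invoice."""
    true_pos = sum(
        1 for name, value in predicted.items()
        if name in expected and expected[name] == value
    )
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(expected) if expected else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


expected = {
    "invoice_number": "INV-1001",
    "total": "124.98",
    "due_date": "2023-02-14",
    "vendor": "Acme Supplies",
}
# The extraction got the due date wrong and missed the vendor entirely.
predicted = {"invoice_number": "INV-1001", "total": "124.98", "due_date": "2023-02-15"}

metrics = field_metrics(predicted, expected)
print(metrics)
```

Here two of three extracted fields are correct (precision 2/3) and two of four expected fields were found (recall 1/2), giving an F1 of 4/7.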

Our system also tells you confidence scores for each processed invoice. Invoices that are proving problematic for the model show low confidence scores. This is a great way for you to monitor the results of the invoice processing at a high level without needing to evaluate each invoice.
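One way to act on confidence scores is to review only the invoices that fall below a threshold. The Python sketch below shows that triage step. The 0.85 threshold and the sample scores are assumptions chosen for illustration.

```python
def triage(confidence_by_invoice, threshold=0.85):
    """Split invoices into auto-approved and needs-review lists.

    The review list is sorted so the least confident invoices come first.
    """
    auto_approved, needs_review = [], []
    for invoice_id, score in confidence_by_invoice.items():
        (auto_approved if score >= threshold else needs_review).append(invoice_id)
    needs_review.sort(key=confidence_by_invoice.get)
    return auto_approved, needs_review


scores = {"INV-1001": 0.97, "INV-1002": 0.62, "INV-1003": 0.81, "INV-1004": 0.91}
auto_approved, needs_review = triage(scores)
print("auto-approved:", auto_approved)
print("needs review:", needs_review)
```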

Notifications

Whenever a set of invoices is processed or metrics fall short of thresholds, our system can notify personnel or workflows that are waiting for the extracted data. Our system supports alerts through:

Streamline Your Automated Invoice Processing

You’ve seen how our architecture uses artificial intelligence and deep learning to extract data from invoices, reducing manual effort and costly resource consumption. The benefits are immediate as you start to extract structured data and process it into your accounting system with state-of-the-art accuracy and zero human input required.

At Width.ai, we have years of expertise in developing highly accurate information extraction systems using the latest deep learning innovations. We can customize our invoice processing solution to your exact business requirements and use cases. Contact us to see a demo of our invoice processing!