AI & OCR

What is AI-powered invoice capture?

When it comes to extracting data from invoices and other business documents, the go-to technology for many freight forwarders has traditionally been OCR, or optical character recognition. 

OCR recognises text from image-based documents – such as PDFs or scanned invoices – then extracts and converts it into digital data or editable text. 

This text can then be manually checked for accuracy, edited if necessary and uploaded into your ERP or TMS. As a basic invoice reader, OCR is reasonably reliable as long as documents have low variability. Once you’ve received an email with a PDF invoice attached, OCR extracts the data so you can copy and paste it into another system.

Limitations of OCR as an AP Invoice Reader 

OCR technology does have some serious limitations and challenges, however – especially in the logistics industry where the terms “low variability” and “invoices” are rarely uttered in the same breath.

Freight forwarders and shippers receive thousands of different invoices from many different companies – and each invoice is structured in a slightly different way. 

For example, there are no naming standards for invoice data fields – “To pay”, “Total”, and “Amount Due” can all mean the same thing.

As such, in order to put OCR to work as an invoice reader, you first have to set up templates that match each vendor invoice so the technology can actually capture the necessary data in the various data fields. Each field then requires the implementation of an individual rule to interpret the data within. 
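To make the template-and-rule approach concrete, here is a minimal sketch in Python. It assumes each vendor template is simply a set of regular expressions, one rule per field; the vendor names, the `TEMPLATES` table and the `extract_with_template` function are hypothetical and only for illustration, not taken from any particular OCR product.

```python
import re

# Hypothetical per-vendor templates: each maps a field name to one regex rule.
TEMPLATES = {
    "vendor_a": {"total": re.compile(r"Total:\s*([\d.,]+)")},
    "vendor_b": {"total": re.compile(r"Amount Due:\s*([\d.,]+)")},
}


def extract_with_template(vendor, text):
    """Apply every rule in the vendor's template; a rule that finds nothing yields None."""
    result = {}
    for field, pattern in TEMPLATES[vendor].items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


# The rule works while the vendor keeps its layout...
print(extract_with_template("vendor_a", "Invoice 17\nTotal: 1,250.00"))
# ...and silently captures nothing once the same vendor relabels the field.
print(extract_with_template("vendor_a", "Invoice 18\nTo pay: 1,250.00"))
```

In this sketch, every new vendor needs a new entry in the table and every relabelled field needs a rule change, which is where the set-up and maintenance effort comes from.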

This equates to a long and expensive set-up process in the first instance. Every company that uses OCR has to start from scratch, setting up hundreds of templates and implementing potentially thousands of rules.

Even once everything is set up, logistics documents vary so much that the initial rules can easily break, so maintaining them continues to eat up huge amounts of time and resources for as long as the OCR solution is in use.

Machine Learning to the Rescue 

Artificial intelligence (AI) is the solution – specifically machine learning (ML), a branch of AI.

With a specialist machine learning solution, there is no need to create templates or define rules. 

Instead, the technology provides a contextual understanding layer that adds meaning to extracted text. The ML models have already been trained on millions of invoices and documents, so they understand a broad range of invoice layouts and can account for new variations as they emerge.

This means that when extracting text from a document, the ML technology already understands what that text signifies – no need to build new templates and new rules to understand new documents. 

The technology then accurately maps the information to a clean data schema which can be automatically fed into your ERP system – negating the need for manual data entry as well. 
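As a rough illustration of what "mapping to a clean data schema" can mean, the sketch below normalises differently labelled fields into one fixed record and serialises it for a downstream system. The `InvoiceRecord` fields, the `to_record` and `to_json` helpers and the input labels are all hypothetical. The hand-written `TOTAL_LABELS` table is only a stand-in for the contextual understanding that a trained model supplies; a real ML system learns these relationships rather than looking them up.

```python
import json
from dataclasses import asdict, dataclass
from decimal import Decimal

# Stand-in for learned context: labels that all mean "invoice total".
TOTAL_LABELS = {"to pay", "total", "amount due"}


@dataclass
class InvoiceRecord:
    """Hypothetical clean schema that every invoice is mapped onto."""
    invoice_number: str
    total: Decimal
    currency: str


def to_record(raw):
    """Map raw label/value pairs onto the fixed schema, whatever the total is called."""
    fields = {label.strip().lower(): value for label, value in raw.items()}
    total = next(value for label, value in fields.items() if label in TOTAL_LABELS)
    return InvoiceRecord(
        invoice_number=fields["invoice no"],
        total=Decimal(total.replace(",", "")),
        currency=fields["currency"],
    )


def to_json(record):
    # Decimal is not JSON serialisable by default, so it is written as a string.
    return json.dumps(asdict(record), default=str)


record = to_record({"Invoice No": "INV-17", "Amount Due": "1,250.00", "Currency": "USD"})
print(to_json(record))
```

Whatever label the vendor used, the downstream system only ever sees the same three fields, which is what removes the manual re-keying step.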

AI Invoice Extraction from Shipamax 

On their own, simple and traditional OCR solutions always run the risk of inaccurate data extraction, resulting in data input errors and valuable workforce hours then spent correcting them. They are tedious and hugely time-consuming to set up and maintain. 

AI invoice readers bring contextual understanding to the process right out of the box with powerful machine learning technology. 

Shipamax is a plug-and-play toolkit for back office automation in logistics. 

Our solution uses a combination of OCR and ML-powered data extraction to turn unstructured documents into structured data. 

Our specialised machine learning models first classify logistics and supply chain document types – such as bills of lading, commercial invoices, delivery notices and accounts payable invoices – and the system then uses a built-in OCR component to extract data.

Once data is extracted, the ML models are used to understand the context of the document and turn the information into business-ready, machine-readable data that can be mapped to a document schema – for example a Master Bill of Lading schema.

Data in this schema can then be automatically pushed into popular ERP systems like Cargowise.

The solution plugs directly into any email inbox or unstructured data source to automatically extract data from emails and attachments in real-time – eliminating the need to rely on rules- and template-based OCR and repetitive manual tasks. 

To help you learn more about how Shipamax compares to OCR and RPA technology, we’ve written a blog post that compares all three options in more detail.

Shipamax is specialised for the supply chain industry. Request a demo of the plug-and-play data extraction platform today, or get in touch for more information.

Josh Bradley, VP Demand Generation
April 2020
5 min read
  • OCR & AI
  • Data Extraction