Building vs Buying Software

Building vs buying a document automation solution for freight forwarders

When it comes to selecting a document automation solution, you’re faced with two options: build the tools you need in-house, or purchase a specialist, industry-focused solution.

When ‘building’ an in-house solution, in most cases you’ll still be buying an external OCR tool and wrapping a layer on top. This in-house layer will enable the OCR to handle the wide variety of logistics formats and types, from import bills of lading and commercial invoices to supplier invoices and detailed packing lists. If you want to know more about the differences between traditional OCR and AI-OCR (Shipamax), we've drafted a short blog on the subject.

Defining project success

To ensure success of your project, you’ll need to define your scope. At a basic level, you’ll want to determine:

  • How accurate does it need to be?
  • How many document types does it need to cover?
  • What systems will we need to integrate with?
  • How will users interact with the data and manage document exceptions?
  • How will we roll this out and onboard users?

Having managed many automation rollouts, we know the devil is in the detail, so it’s worth taking the time to scope out what the full solution looks like. To help, we’ve developed a sample document automation RFP as a checklist for freight forwarders.

Common software build challenges

The tradeoffs for build vs. buy depend on your resources, time, and internal expertise. Sometimes building in-house is a feasible option; other times it's more resource-efficient to purchase an existing solution. When it comes to automating data entry, companies that try to build internal solutions tend to face the following challenges:

Minimum viable functionality

Data extraction solutions are typically built to provide minimum viable product (MVP) functionality as quickly as possible. While it’s true that with a strong engineering team, achieving high accuracy relatively quickly on major clients is feasible, the last 20% of document variety tends to take up 80% of the work.

Unknown and evolving scope of document automation in logistics 

Developing an internal product requires planning, resource allocation, and preparing for the unknown. Because data extraction projects in logistics are relatively new, it can be difficult to define the scope. Add the fact that this is a research-led project rather than pure engineering, and the result can be significant timeline risk. In fact, a leading AI specialist at IBM recently revealed that almost 90% of all data science projects never make it into production.

User workflows and integrations 

While the core technology may solve an immediate pain point, if the full solution lacks usability or does not easily fit into your existing workflow and processes, it may be far more cumbersome, time-consuming and disruptive to operations than you anticipate. 

Speed of extraction 

Along with scalability and uptime, speed of extraction is a make-or-break factor. To work in a live environment, documents need to be processed in seconds. High pipeline speed and parallel processing, especially during peak times, requires significant engineering investment. 
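To make the parallel processing point concrete, here is a minimal Python sketch of running extractions concurrently. It is an illustration only: `extract_document` is a hypothetical stand-in for whatever OCR or extraction call you use, and the 50 ms latency is a made-up figure, not a measurement.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def extract_document(doc_id: str) -> dict:
    """Hypothetical stand-in for an I/O-bound OCR/extraction call."""
    time.sleep(0.05)  # simulated latency (assumption, not a measured figure)
    return {"doc_id": doc_id, "status": "extracted"}


def process_batch(doc_ids: list[str], max_workers: int = 8) -> list[dict]:
    """Extract a batch concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_document, doc_ids))


if __name__ == "__main__":
    docs = [f"invoice-{n}" for n in range(16)]
    start = time.perf_counter()
    results = process_batch(docs)
    elapsed = time.perf_counter() - start
    print(f"{len(results)} documents in {elapsed:.2f}s")
```

A thread pool like this only covers the simplest case. Queuing, retries, back-pressure during peak times and monitoring are where most of the engineering investment tends to go.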

Total cost of ownership 

The total cost of ownership (TCO) of internally developed tools can often be up to five times greater than initially expected, a figure that emerged from conversations with a leading freight forwarder. Data extraction in logistics involves constantly changing document layouts, new issuers and new document types, all of which bring ongoing costs for upgrades and maintenance. Over time, technical debt accrues through product neglect, evolving product demands and engineer turnover.
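As a rough illustration of how maintenance can multiply the original estimate, the Python sketch below adds a one-off build cost to recurring annual costs. All figures are hypothetical and chosen only to show the arithmetic; substitute your own estimates.

```python
def total_cost_of_ownership(build_cost: float, annual_maintenance: float, years: int) -> float:
    """One-off build cost plus recurring maintenance over the period."""
    return build_cost + annual_maintenance * years


# Hypothetical figures for illustration only.
initial_estimate = 200_000    # what the build was budgeted at
annual_maintenance = 160_000  # engineers, hosting, retraining, upgrades
years = 5

tco = total_cost_of_ownership(initial_estimate, annual_maintenance, years)
multiple = tco / initial_estimate
print(f"TCO over {years} years: {tco:,.0f} ({multiple:.1f}x the initial estimate)")
```

With these assumed inputs, the five-year total comes to five times the initial build estimate, even though the build itself stayed on budget.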

Roll-out plan

For users to trust and accept automation, your roll-out needs to be seamless. Unfortunately, with novel projects, key issues tend to come to light only once user acceptance testing starts, or even during the roll-out itself, which is often a little too late.

Value analysis

When making the build versus buy decision, it's important to ask the following questions:

  • How much engineering time can I divert to this project?
  • Do I already have the specialist machine learning research talent, or will we need to add recruitment time and additional costs to our build plan?
  • How much project management time can I divert to this project?
  • What other projects could I get ahead on if this solution existed without engineering resources? What would be the benefit of going live with those projects 6-12 months earlier?
  • How much resource can I allocate to maintaining and updating the system in the long run?
  • If the solution requires templates, who will maintain these? Who will be on call if a template changes?
  • What components (e.g. OCR, hosting) will I still need to purchase?
  • If we’ll buy the OCR component and build a wrapper, what value does owning the additional wrapper IP have to our business?
  • What redundancies will I need to build in?
  • How long will it take to build, test, and integrate the system into our delivery process?
  • Is this project so fundamental to our offering that we can commit to more than just ‘maintenance’ mode resources?

Buying a logistics document automation solution from a specialist 

Ultimately, an enterprise-grade data extraction system in logistics requires the following:

  • Multi-document classification and extraction
  • Machine learning based contextual understanding of data (non-template/rule based)
  • Machine learning trained on logistics documents
  • Automatic feedback for continuous learning
  • Extraction speed in seconds
  • Multi-document-ready API
  • Mappings to common integration formats such as CargoWise 
  • Scale to handle millions of documents
  • Multi-language compatibility
  • A purpose-built user interface
  • 99.9% uptime and redundancy

Building all of this from scratch and maintaining it for the long term comes with huge resource commitments and significant costs.
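To put the 99.9% uptime requirement from the list above into perspective, this short Python sketch converts an uptime percentage into the downtime it allows. The function name is our own, for illustration only.

```python
def allowed_downtime_hours(uptime_pct: float, period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted in a period at a given uptime percentage."""
    return period_hours * (1 - uptime_pct / 100)


yearly = allowed_downtime_hours(99.9)
monthly_minutes = allowed_downtime_hours(99.9, period_hours=30 * 24) * 60
print(f"Per year: {yearly:.2f} hours")
print(f"Per 30-day month: {monthly_minutes:.1f} minutes")
```

At 99.9%, that is under nine hours of downtime a year, or roughly 43 minutes in a 30-day month, which is why redundancy has to be designed in rather than added later.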

When you buy an existing specialist solution, costs are significantly reduced, resource commitments all but vanish and there is no development risk. The product can be implemented and integrated with your existing systems, and your teams fully onboarded, in a matter of hours. We've written a short article diving deeper into the topic of specialist vs generalist logistics solutions.

Document automation with Shipamax

Shipamax is a specialised data extraction platform, purpose-built for logistics organisations. In fact, plug-and-play logistics automation is all we do, and we are dedicated to building the best-in-class automation solution on the market.

Out of the box, we offer the most powerful toolkit for automating data entry within logistics organisations. Our solution combines market-leading data extraction technology with purpose-built user interfaces, all from a single API. Our machine learning models continuously learn across data from all our clients, enabling us to deliver the highest levels of accuracy, no matter the document and regardless of format.

Our solution provides all of the features listed above, and plugs directly into your existing infrastructure and workflows, helping to minimise disruption and maximise impact. We sync with any email server to process documents as soon as they hit your inbox and integrate seamlessly with CargoWise and other custom logistics systems. With machine learning based processing, there are no rules or templates to maintain. 

At Shipamax, we’ve already built the specialist data extraction solution, tested it and rolled it out across key leaders in freight forwarding. Our dedicated technology team works around the clock to manage, maintain and enhance the product, to ensure that you have the best-in-class logistics automation solution.

Shipamax is specialised for the logistics industry. Request a personalised demo of the Shipamax plug-and-play solution, or drop in on one of our live 15-minute product demos.

Jenna Brown, Co-Founder and CEO
June 2020
  • Freight Forwarders
  • Document Automation
  • Data Extraction