
How to integrate robotic process automation in big data projects

Robotic process automation requires repetitive data. Find out which tools can help structure and read your data.

Written by Mary Shacklett

Dec 9, 2019



Information Services Group (ISG) reported in 2018 that 92% of companies were aiming to adopt robotic process automation (RPA) by 2020 because they wanted to increase operational efficiencies. This large number reflects how eager companies are to automate routine business processes.

One of the easiest places to employ RPA is in very simple, highly repetitive business processes that rely on transactional data that comes in fixed record lengths, with data fields always in the same locations. This data is highly predictable, and automation tools like RPA that depend on recognizing repetitive data patterns are in strong positions to excel.
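To illustrate why fixed-length records are so predictable, here is a minimal Python sketch that slices one record into named fields. The field names, offsets, and sample record are hypothetical and chosen only for illustration; a real layout would come from the system that produces the records.

```python
from decimal import Decimal

# Hypothetical layout: (field name, start offset, end offset).
LAYOUT = [
    ("invoice_id", 0, 8),
    ("supplier", 8, 32),
    ("amount", 32, 42),
]

def parse_record(line: str) -> dict:
    """Slice one fixed-length record into named fields."""
    fields = {name: line[start:end].strip() for name, start, end in LAYOUT}
    # Convert the amount from zero-padded text to a decimal number.
    fields["amount"] = Decimal(fields["amount"])
    return fields

# Build a sample 42-character record that matches the layout above.
record = "INV00042" + "Pearson Manufacturing".ljust(24) + "0001250.00"
parsed = parse_record(record)
```

Because every field always sits at the same offsets, the parser needs no pattern matching at all, which is what makes this kind of data a natural fit for automation.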


However, even the most routine business process involves unstructured and semi-structured big data as well as the more traditional fixed-record data. Invoice processing, a common use of RPA, is one example.

Suppliers commonly send invoices as PDF files. An invoice might contain a lot of white space, a company logo, or a string of text and numbers that details an order or a charge. This is the unstructured or semi-structured big data that RPA is asked to interpret and process automatically.
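As a rough sketch of what interpreting semi-structured invoice content can look like, the Python example below pulls a few fields out of invoice text with regular expressions. It assumes the text has already been extracted from the PDF by some other tool (not shown); the sample text, labels, and patterns are hypothetical, and real supplier invoices will vary far more than this.

```python
import re

# Hypothetical text as it might look after extraction from a PDF invoice.
SAMPLE_TEXT = """
   ACME SUPPLY CO.

Invoice No: 10042
Terms: net 30 days
Total due:   $1,250.00
"""

# One pattern per field we want; labels differ between suppliers in practice.
PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s+No[.:]?\s*(\S+)", re.IGNORECASE),
    "net_days": re.compile(r"net\s+(\d+)", re.IGNORECASE),
    "total": re.compile(r"Total\s+due:?\s*\$?([\d,]+\.\d{2})", re.IGNORECASE),
}

def extract_fields(text: str) -> dict:
    """Return the first match for each pattern, or None when a field is missing."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        found[name] = match.group(1) if match else None
    return found

fields = extract_fields(SAMPLE_TEXT)
```

Fields that come back as None are a useful signal in their own right: they mark documents the automation could not read and that may need a person to look at them.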

Companies can’t just take RPA software off the shelf and make it work with unstructured big data formats like PDF documents. This is where IT comes in with its technical leadership.

Follow these steps to implement RPA

To implement RPA successfully, there is a three-step tooling architecture that IT should think about first: ETL, RPA, and AI.

ETL: At the front end of an RPA process that uses big data, use an extract, transform, and load (ETL) tool that can integrate with and take in the incoming streams of raw and unstructured data that you receive from all of your suppliers. This tool extracts the data relevant to your business process, transforms it into a format that your systems can use, and then loads the data into your systems and into an RPA process.
RPA: At this point, the RPA process can take over because clean, quality data is now coming in to the RPA software, which makes the software's job of automating a business process such as invoice handling straightforward.
AI: As the RPA software processes invoices, it invokes the business rules that experienced employees have coded into its artificial intelligence (AI) engine. For instance, if the RPA software encounters an invoice from Pearson Manufacturing with a “net 10 days” note on it, and the normal net terms for Pearson are net 30, the RPA process might identify this invoice as an exception that requires a person to review and approve it.
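The three stages above can be sketched in a few lines of Python. This is a toy illustration, not how any particular ETL or RPA product works: the function names, the raw input shape, and the NORMAL_NET_DAYS lookup are all assumptions, and the only rule shown is the net-terms check from the Pearson example.

```python
from dataclasses import dataclass

# Hypothetical reference data: the normal net terms agreed with each supplier.
NORMAL_NET_DAYS = {"Pearson Manufacturing": 30}

@dataclass
class Invoice:
    supplier: str
    net_days: int
    amount: float

def etl(raw: dict) -> Invoice:
    """ETL stage: extract the relevant fields and normalize their types."""
    terms = raw["terms"].lower().replace("net", "").replace("days", "").strip()
    return Invoice(
        supplier=raw["supplier"].strip(),
        net_days=int(terms),
        amount=float(raw["amount"].replace(",", "")),
    )

def apply_rules(invoice: Invoice) -> str:
    """Rules stage: flag invoices whose terms differ from the supplier's norm."""
    expected = NORMAL_NET_DAYS.get(invoice.supplier)
    if expected is None or invoice.net_days != expected:
        return "needs_review"
    return "auto_approve"

def process(raw: dict) -> str:
    """RPA stage: run each incoming invoice through ETL, then the rules."""
    return apply_rules(etl(raw))

exception = process(
    {"supplier": " Pearson Manufacturing ", "terms": "Net 10 days", "amount": "1,250.00"}
)
routine = process(
    {"supplier": "Pearson Manufacturing", "terms": "net 30 days", "amount": "980.00"}
)
```

Here the "net 10 days" invoice is routed to a person, while the invoice with the usual net 30 terms passes straight through. Unknown suppliers are also sent for review, which is a conservative design choice made for this sketch.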

Key tips to remember when implementing any RPA process

RPA software can’t do RPA alone. RPA automates business processes, but ETL automates data cleaning and transfers; you need both to fully automate a business process that depends on quality data. The third piece of the puzzle is an AI engine that is included with the RPA and that contains the business rule-set you want the RPA software to apply to the items and operations it processes.

Tool integration is paramount. In the big data environment, RPA works best when used with an ETL tool that can deliver it clean data. Within the RPA software, there should be a table of business rules that drive the RPA software’s business process decision-making.
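One way to picture such a table of business rules is as plain data that the automation reads row by row, so that business users can change a row without anyone touching the processing logic. The Python sketch below assumes a hypothetical rule format of (field, operator, value, action); the fields, thresholds, and action names are invented for illustration and do not come from any specific RPA product.

```python
import operator

# Map the operator symbols used in the table to comparison functions.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Hypothetical rule rows: (field, operator, value, action). First match wins.
RULES = [
    ("amount", ">", 10000, "manager_approval"),
    ("currency", "!=", "USD", "treasury_review"),
]

DEFAULT_ACTION = "auto_approve"

def decide(item: dict) -> str:
    """Return the action of the first rule that matches, else the default."""
    for field, op, value, action in RULES:
        if OPERATORS[op](item[field], value):
            return action
    return DEFAULT_ACTION

large = decide({"amount": 12000, "currency": "USD"})
foreign = decide({"amount": 500, "currency": "EUR"})
plain = decide({"amount": 500, "currency": "USD"})
```

Keeping the rules in a table rather than in code means the order of the rows matters: in this sketch, a large invoice in a foreign currency would be sent for manager approval because that rule is listed first.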

It is imperative for end users and IT to understand that implementing an RPA process is not a standalone operation. It requires an assortment of other big data processing software that must be integrated with the RPA software. These tools must be compatible with each other, and they must work seamlessly together.


Mary Shacklett

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President of Product Research and Software Development for Summit Information Systems, a computer software company; and Vice President of Strategic Planning and Technology at FSI International, a multinational manufacturing company in the semiconductor industry. Mary is a keynote speaker and has more than 1,000 articles, research studies, and technology publications in print.