
August 18, 2021

Artificial Intelligence, Design

Teaching a Robot to Read

COLLEEN MCCRETTON


Over the last several years, one of the FCAT AI teams, code-named “RoboReader,” has been working on document processing: extracting needed information from unstructured text and transforming it into structured data that the business can use. In the course of this work, we have noticed parallels between the way we “teach” the system and the way we read as humans.

When reading for work, most of us skim or scan the contents, looking for words, phrases or formatting that provide clues that something might be important to us. Information Foraging Theory,1 a concept that emerged in 1993 and compares the behavior of humans looking for information to that of animals looking for food, gives reasons for this: (a) we want to maximize our reward (in the form of information or food) relative to our effort, and (b) as a result, we have developed learned behaviors that help us quickly find what we are looking for when reading for informational purposes.2 When we skim, our goal is to get the gist of the information we seek, often focusing on indexes or tables of contents, titles, subtitles and headings, bulleted lists, bold or underlined words, tables, charts and pictures. We also scan to find specific information, e.g., looking for specific words or phrases, ordering, or formatting on a page.3

In our project work, we found evidence of our business users applying these methods. In one use case, users always flipped to the last few pages of a document for the information they needed. In another use case, the important information was always in a bulleted list, and in yet another it was always in a table.

We used these observed behaviors when training our AI models. We used image processing techniques to “visually” scan for lines indicative of a table when teaching the system to process tabular data. We interpreted formatting metadata indicative of bulleted lists when teaching the system to look for requests, which usually arrived in this format. We taught the system to recognize key:value pairs based on location and formatting cues. We taught it to find monetary amounts, dates, addresses and ID numbers based on location and formatting as well. Within paragraphs, we used leading and trailing language markers and letter case to teach it to identify names of people and companies and other specific relevant terms.
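To make the idea of formatting cues concrete, here is a minimal sketch of how a program might scan plain text for bulleted items, key:value pairs, monetary amounts and dates. It is an illustration only, not the RoboReader implementation: the patterns, the `scan` function and the sample text are all hypothetical, and simple hand-written rules stand in for the trained models described above.

```python
import re

# Hypothetical patterns for illustration; not the RoboReader team's actual rules.
# A line that starts with a bullet character or a number like "1." or "2)".
BULLET = re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+(?P<item>.+)$")
# A short label, a colon, then a value on the same line.
KEY_VALUE = re.compile(r"^\s*(?P<key>[A-Za-z][A-Za-z #/]*?)\s*:\s*(?P<value>\S.*)$")
# US-style dollar amounts such as $1,250.00 or $40.
MONEY = re.compile(r"\$\s?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{2})?")
# Numeric dates written as MM/DD/YYYY (assumed format).
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def scan(text):
    """Collect bullets, key:value pairs, amounts and dates from plain text."""
    found = {"bullets": [], "pairs": {}, "amounts": [], "dates": []}
    for line in text.splitlines():
        bullet = BULLET.match(line)
        if bullet:
            found["bullets"].append(bullet.group("item").strip())
        else:
            pair = KEY_VALUE.match(line)
            if pair:
                key = pair.group("key").strip()
                found["pairs"][key] = pair.group("value").strip()
        # Amounts and dates can appear anywhere, including inside bullets.
        found["amounts"].extend(MONEY.findall(line))
        found["dates"].extend(DATE.findall(line))
    return found


SAMPLE = """Account ID: 12345-A
Requested on: 08/18/2021
Requests:
- Transfer $1,250.00 to savings
- Close the old account
"""

result = scan(SAMPLE)
print(result)
```

Rules like these are brittle on real documents, which is one reason to combine formatting cues with location information and learned models rather than rely on patterns alone.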

Yes, it is possible to teach a robot to read. It starts with understanding how we humans learn to read and transferring those same skills and techniques to our robot assistants. All of this illustrates that, for both humans and robots to succeed in their work, reading is fundamental.

Colleen McCretton is Director, User Experience Design, in FCAT

 
References & Disclaimers

1 https://psycnet.apa.org/doiLanding?doi=10.1037%2F0033-295X.106.4.643
2 https://www.nngroup.com/articles/information-foraging/
3 https://www.utc.edu/enrollment-management-and-student-affairs/center-for-academic-support-and-advisement/tips-for-academic-success/skimming

