Machine Learning for Medical Imaging

ML for Document Understanding

Published 12 months ago • 2 min read

What is document understanding, and why is it crucial for many businesses?

Document understanding allows different types of documents, including images and PDF files, to be processed in a streamlined manner. One example is extracting key information from invoices (total amount, address, email, ...).

Many businesses need to process huge amounts of documents every day. Without proper automation tools that perform document understanding, this processing can be costly in terms of both time and money.

In today's edition of AIFEE, we're going to look at some of the advanced techniques in deep learning that can be very useful for document understanding tasks.

How to Extract Key Information from Documents using Deep Learning?

Document understanding is a crucial part of many businesses. Why?

Because it makes documents such as PDFs and images containing text easy for computers to understand, which in turn can save a lot of time and, consequently, money.

For example, a document understanding system can extract important information such as "customer name", "customer address" and "total amount" from an invoice.

Document understanding systems have traditionally relied heavily on OCR (Optical Character Recognition).

This means that to understand a document, you first need to pass it through an OCR system such as Tesseract to extract the text and its position in the document.

This text is then used as input to the system that understands the document.
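
To make the OCR-based pipeline above more concrete, here is a minimal sketch of the step that comes after OCR. It assumes the OCR engine has already returned each word with its position. The `ocr_words` list, the coordinates, and the field patterns are all made up for illustration, and a real system would use something far more robust than two regular expressions.

```python
import re
from collections import defaultdict

# Hypothetical OCR output: (text, left, top) for each recognized word.
# In a real pipeline these would come from an OCR engine such as Tesseract.
ocr_words = [
    ("Invoice", 40, 20), ("#1042", 130, 21),
    ("Email:", 40, 60), ("billing@example.com", 110, 61),
    ("Total", 40, 100), ("amount:", 95, 99), ("$1,250.00", 170, 101),
]


def group_into_lines(words, y_tolerance=10):
    """Rebuild text lines by bucketing words whose top coordinates are close."""
    buckets = defaultdict(list)
    for text, left, top in words:
        buckets[round(top / y_tolerance)].append((left, text))
    # Sort buckets top to bottom, and words inside each bucket left to right.
    return [
        " ".join(text for _, text in sorted(bucket))
        for _, bucket in sorted(buckets.items())
    ]


def extract_fields(lines):
    """Pull an email address and a total amount out of the reconstructed text."""
    text = "\n".join(lines)
    email = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
    total = re.search(r"Total amount:\s*\$?([\d,]+\.\d{2})", text, re.IGNORECASE)
    return {
        "email": email.group(0) if email else None,
        "total_amount": float(total.group(1).replace(",", "")) if total else None,
    }


lines = group_into_lines(ocr_words)
fields = extract_fields(lines)
print(fields)
```

Note how the extraction logic depends entirely on the quality of the words and positions it receives, which is exactly the dependency that OCR-free approaches try to remove.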
​
A new approach to document understanding has been developed that is completely OCR-free!

The approach is called Donut!

Donut tries to address 3 drawbacks of OCR-based document understanding systems:

1 - High computational cost of running OCR.
2 - Inflexibility of OCR models across languages and document types.
3 - Propagation of OCR errors to the subsequent steps.
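
The third drawback is easy to underestimate, so here is a tiny illustration. The strings and the `parse_total` helper are hypothetical: one string is what is printed on the page, the other is what an OCR engine might return after confusing the digit 0 with the letter O.

```python
import re


def parse_total(ocr_text):
    """Downstream step: pull the total amount out of OCR text, or None if absent."""
    match = re.search(r"Total:\s*(\d+\.\d{2})", ocr_text)
    return float(match.group(1)) if match else None


clean_text = "Total: 105.00"  # what is printed on the page
noisy_text = "Total: 1O5.00"  # hypothetical OCR misread: letter O instead of zero

clean_total = parse_total(clean_text)
noisy_total = parse_total(noisy_text)

print(clean_total)  # the amount is recovered
print(noisy_total)  # the single-character OCR error makes the field disappear
```

A single wrong character is enough for the downstream step to lose the field, even though that step itself contains no bug.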
​
Donut has achieved state-of-the-art results on several document understanding datasets, outperforming previous methods in both speed and accuracy.
​
More on Donut in the original paper and on the original GitHub repo.
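
If you want to try Donut yourself, the following is a minimal sketch, assuming the Hugging Face transformers library (with PyTorch and Pillow installed). The checkpoint name and task prompt are assumptions on my side: they correspond to a publicly available Donut model fine-tuned on receipts, and you would swap them for a checkpoint that matches your own documents. The function name `extract_with_donut` is just illustrative.

```python
import re


def extract_with_donut(
    image_path,
    checkpoint="naver-clova-ix/donut-base-finetuned-cord-v2",  # assumed checkpoint
    task_prompt="<s_cord-v2>",  # task token expected by that checkpoint
):
    """Run OCR-free key information extraction on one document image."""
    # Heavy dependencies are imported lazily so the function can be defined
    # even when they are not installed.
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # The image goes straight into the model: no OCR step in between.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
    )

    # Clean up the generated token sequence and convert it to a nested dict.
    sequence = processor.batch_decode(outputs)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()  # drop the task prompt
    return processor.token2json(sequence)


# Usage (hypothetical file name):
# print(extract_with_donut("receipt.png"))
```

The first call downloads the model weights, so expect it to take a while.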

Recognize Handwritten Text in Documents

Did you know that you can recognize handwriting with high accuracy using deep learning?

This task is called OCR, or ICR (Intelligent Character Recognition) when it targets handwriting.

Although I've seen several models attempt to solve this problem, and have personally built some of them, one approach is so advanced that its accuracy is mind-boggling.

I personally tested this approach on my own (truly terrible) handwriting, and it gave very accurate results!

The approach is called TrOCR.

It's a transformer-based encoder-decoder model.

With a few lines of code, you can instantiate the model and make predictions on your own images. Below is some sample code.
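
Here is a minimal sketch of what this can look like, assuming the Hugging Face transformers library (with PyTorch and Pillow installed). The checkpoint name is an assumption: it is one of the publicly released handwritten TrOCR models. The function name `recognize_handwriting` is illustrative, and this is not necessarily the exact snippet from the original edition.

```python
def recognize_handwriting(image_path, checkpoint="microsoft/trocr-base-handwritten"):
    """Return the text recognized in an image of a single handwritten line."""
    # Heavy dependencies are imported lazily so the function can be defined
    # even when they are not installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Instantiate the processor (image preprocessing + tokenizer) and the model.
    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # TrOCR expects an RGB image containing one line of text.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The decoder generates token ids, which are decoded back into a string.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Usage (hypothetical file name):
# print(recognize_handwriting("my_handwriting.png"))
```

Keep in mind that the model works on single text lines, so a full page needs to be cropped into lines first.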
​
You can test the code on HuggingFace.

Tweet of the week

Is document understanding with AI endangering white collar professions?


by Nour Islam Mokhtari from pycad.co
