
Machine Learning for Medical Imaging

Vision Transformer Made Simpler

Published 11 months ago • 2 min read

Hello Reader,

Welcome to another edition of the PYCAD newsletter, where we cover interesting topics in Machine Learning and Computer Vision applied to Medical Imaging. The goal of this newsletter is to help you stay up-to-date and learn important concepts in this amazing field! I've got some cool insights for you below ↓

Vision Transformer Made Simpler

I thought CNNs were a must when it comes to vision tasks. I thought transformers would never be able to fully replace them for vision.

I was wrong. Here's why.

There is a new vision transformer model from Meta AI. It's called Hiera.

This vision transformer outperforms previous models in both accuracy and speed.

The performance was measured on image classification and video classification tasks.

But what makes Hiera impressive is its simplicity in design.

This model does NOT have:

  • Convolutional layers.
  • Shifted windows.
  • Attention bias.

Historically, adding these techniques has been necessary to make transformers work well enough for vision tasks.

Why were these techniques necessary to make vision transformers work well?

Because they addressed some important drawbacks of vanilla vision transformers. For example, vanilla ViTs lack the spatial inductive biases (such as locality) that convolutions provide.

So how did Hiera address these drawbacks without the use of these techniques?

By using a self-supervised learning technique for learning visual representations. The technique is called "masked pretraining".

The paper includes a figure showing the full architecture of the model when trained with this technique.
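
If you have not seen masked pretraining before, here is a toy, MAE-style sketch of the objective (an illustration only, not the actual Hiera training code): hide most of the image patches and train the model to reconstruct the pixels of the hidden ones. The patch size, embedding dimension, and masking ratio below are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

class ToyMaskedAutoencoder(nn.Module):
    """Minimal MAE-style model: embed patches, encode, reconstruct pixels."""
    def __init__(self, patch_dim=16 * 16 * 3, embed_dim=256, depth=4, heads=8):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.decoder = nn.Linear(embed_dim, patch_dim)  # predict raw pixels
        self.mask_token = nn.Parameter(torch.zeros(1, 1, embed_dim))

    def forward(self, patches, mask):
        # patches: (B, N, patch_dim), mask: (B, N) with True = hidden patch
        tokens = self.embed(patches)
        # replace masked tokens so the model never sees those pixels
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        return self.decoder(self.encoder(tokens))


def masked_pretraining_step(model, patches, mask_ratio=0.6):
    B, N, _ = patches.shape
    mask = torch.rand(B, N) < mask_ratio        # randomly hide most patches
    recon = model(patches, mask)
    # reconstruction loss is computed only on the hidden patches
    return ((recon - patches) ** 2)[mask].mean()


model = ToyMaskedAutoencoder()
patches = torch.randn(2, 196, 16 * 16 * 3)  # e.g. a 224x224 image -> 14x14 patches
loss = masked_pretraining_step(model, patches)
loss.backward()
```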

Hiera achieves some impressive results for image and video tasks. For example:

On the ImageNet-1K dataset, it achieves a 0.3% accuracy gain while being almost twice as fast at inference compared to ConvNeXt V2-B.

On the Kinetics-400 dataset, it achieves a 2.5% accuracy gain while being almost 3 times faster than ViT-B.

You can check out the paper here and the code here.


Improving Object Detection Results without Training

Detecting small objects in images has always been a difficult task for deep learning models. There is an approach that can noticeably improve your results.

This approach is called Slicing Aided Hyper Inference, or SAHI for short.

It is extremely simple, yet very effective.

You basically take your input image and divide it into overlapping patches.

You then resize these patches and pass everything to your model: the original image and the resized patches.

Then you map the predictions back to the original image coordinates, aggregate them, and filter out duplicates based on an IoU (intersection over union) threshold.

The technique can be used directly with your trained object detection model, without any finetuning.

It can also be used as a data augmentation technique during training.

The results reported in the paper are very impressive.

Used without finetuning, it gives an AP increase of 6.8%, 5.1% and 5.3% for the FCOS, VFNet and TOOD detectors, respectively.

With finetuning, the AP increase grows to 12.7%, 13.4% and 14.5%, in the same order.
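
To make the slicing-and-merging step concrete, below is a rough sketch of the idea (not the official SAHI implementation). The `detector` callable is a placeholder for any trained model that returns boxes and scores; the slice size, overlap, and IoU threshold are arbitrary illustrative values, and the patch-resizing step from the paper is omitted for brevity.

```python
import torch
from torchvision.ops import nms


def sliced_inference(image, detector, slice_size=512, overlap=0.2, iou_thresh=0.5):
    """Slicing-aided inference sketch: run `detector` on the full image and on
    overlapping crops, map boxes back to full-image coordinates, then merge
    duplicates with IoU-based NMS. `detector(img)` is assumed to return
    (boxes [N, 4] in xyxy format, scores [N])."""
    H, W = image.shape[-2:]
    step = max(int(slice_size * (1 - overlap)), 1)
    all_boxes, all_scores = [], []

    # Full-image pass keeps large objects that slicing would cut apart.
    boxes, scores = detector(image)
    all_boxes.append(boxes)
    all_scores.append(scores)

    # Per-slice passes make small objects larger relative to the input.
    # (SAHI additionally resizes each crop before inference; omitted here.)
    for y in range(0, max(H - slice_size, 0) + 1, step):
        for x in range(0, max(W - slice_size, 0) + 1, step):
            crop = image[..., y:y + slice_size, x:x + slice_size]
            boxes, scores = detector(crop)
            # Shift slice-local boxes back into full-image coordinates.
            boxes = boxes + torch.tensor([x, y, x, y], dtype=boxes.dtype)
            all_boxes.append(boxes)
            all_scores.append(scores)

    boxes, scores = torch.cat(all_boxes), torch.cat(all_scores)
    keep = nms(boxes, scores, iou_thresh)  # filter overlapping duplicates
    return boxes[keep], scores[keep]
```

In practice you would also run per-class NMS and handle the crop resizing, but the core idea is just: slice, predict, shift back, merge.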

You can read more about it in the original paper. You can also check the code here.
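
If you prefer a ready-made implementation, the snippet below follows the usage pattern documented for the sahi Python package. Treat it as a sketch: the model type, weight path, image path, and slice parameters are placeholders for your own setup, and exact argument names can vary between sahi versions.

```python
# pip install sahi  (plus the framework your detector uses, e.g. ultralytics)
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap an already-trained detector -- no finetuning required.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",              # placeholder: any supported detector type
    model_path="weights/my_detector.pt",  # placeholder path to your weights
    confidence_threshold=0.3,
    device="cuda:0",                  # or "cpu"
)

# Sliced inference: the image is cut into overlapping crops, each crop is
# predicted separately, and the merged result is returned.
result = get_sliced_prediction(
    "sample_image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

for pred in result.object_prediction_list:
    print(pred.category.name, pred.score.value)
```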


Cool AI Tools (Affiliates)

BlogAssistant helps minimize AI content flags, so you can produce articles in over 30 languages that draw readers in without giving off creepy robot vibes. It also lets you retrieve images for your content that are always 100% royalty-free!



What'd you think of today's edition?


Machine Learning for Medical Imaging

by Nour Islam Mokhtari from pycad.co

👉 Learn how to build AI systems for the medical imaging domain by leveraging tools and techniques that I share with you! | 💡 The newsletter is read by people from: Nvidia, Baker Hughes, Harvard, NYU, Columbia University, University of Toronto and more!
