Hello Reader, Welcome to another edition of PYCAD newsletter where we cover interesting topics in Machine Learning and Computer Vision applied to Medical Imaging. The goal of this newsletter is to help you stay up-to-date and learn important concepts in this amazing field! I've got some cool insights for you below.
I thought that CNNs were a must when it comes to vision tasks. I thought that transformers would not be able to fully replace them for vision.
I was wrong. Here's why.
There is a new vision transformer model from Meta AI. It's called Hiera.
This vision transformer outperforms previous models in both accuracy and speed.
The performance was measured on image classification and video classification tasks.
But what makes Hiera impressive is its simplicity in design.
This model does NOT have:
Adding these techniques has historically been necessary to make transformers work well enough for vision tasks.
Why were these techniques necessary to make vision transformers work well?
Because they addressed some important drawbacks that vanilla vision transformers had. For example: lack of inductive bias.
So how did Hiera address these drawbacks without the use of these techniques?
By using a self-supervised learning technique for learning visual representations. The technique is called "masked pretraining".
Below is an image that shows the full architecture of the model when trained with this technique.
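To make the idea of masked pretraining a bit more concrete, here is a minimal sketch of the masking step only: the image is treated as a grid of patches, a random subset is hidden, and a model would then be trained to reconstruct the hidden patches from the visible ones. The helper name `random_patch_mask`, the 14x14 patch grid and the 60% mask ratio are assumptions made for illustration; this is not code or configuration taken from Hiera.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.6, seed=None):
    """Split patch indices into (visible, masked) lists.

    Hypothetical helper for illustration only, not Hiera's actual code.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])    # patches hidden from the model
    visible = sorted(indices[num_masked:])   # patches the model gets to see
    return visible, masked


# Assumed setup: a 14x14 grid of patches with 60% of them hidden.
visible, masked = random_patch_mask(14 * 14, mask_ratio=0.6, seed=0)
print(len(visible), len(masked))
```

In masked-autoencoder-style setups, the training loss is computed on the reconstruction of the masked patches, so the model has to learn spatial structure from the data instead of having it built into the architecture.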
Hiera achieves some impressive results for image and video tasks. For example:
On the ImageNet-1K dataset, it had a 0.3% accuracy gain while being almost twice as fast during inference compared to ConvNeXt V2-B.
On the Kinetics-400 dataset, it had a 2.5% accuracy gain while being almost 3 times faster compared to ViT-B.
You can check out the paper here and the code here.
Detecting small objects in images has always been a difficult task for deep learning models. There is an approach that can improve your results considerably.
This approach is called Slicing Aided Hyper Inference, or SAHI for short.
It is extremely simple, yet very effective.
You basically take your input image and divide it into patches.
You then resize these patches and pass everything to your model: the original image and the resized patches.
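The slicing step can be sketched as follows. This is a minimal illustration, assuming fixed-size square windows with a 20% overlap between neighbours; the function name `compute_slices` and the default values are hypothetical and are not taken from the SAHI code. Resizing each patch before inference is left out.

```python
def compute_slices(image_w, image_h, slice_w, slice_h, overlap=0.2):
    """Return (x1, y1, x2, y2) windows that cover the whole image.

    Hypothetical helper: neighbouring windows overlap by roughly
    `overlap` of the slice size, and the last window in each row and
    column is shifted back so it ends exactly at the image border.
    """
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    windows = []
    y = 0
    while True:
        y2 = min(y + slice_h, image_h)
        y1 = max(0, y2 - slice_h)
        x = 0
        while True:
            x2 = min(x + slice_w, image_w)
            x1 = max(0, x2 - slice_w)
            windows.append((x1, y1, x2, y2))
            if x2 >= image_w:
                break
            x += step_x
        if y2 >= image_h:
            break
        y += step_y
    return windows


# Assumed example: a 1024x1024 image cut into 512x512 slices.
for window in compute_slices(1024, 1024, 512, 512):
    print(window)
```

The overlap is there so that an object cut in half by one slice border is likely to appear whole in a neighbouring slice.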
Then you aggregate the results and filter overlapping detections based on an IoU (intersection over union) threshold.
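Here is a rough sketch of that aggregation step. It assumes each detection is a `((x1, y1, x2, y2), score)` pair, that slice detections have already been scaled back to slice pixel coordinates, and that the filtering is done with greedy, class-agnostic non-maximum suppression. These are illustrative assumptions, and all function names are hypothetical; the actual SAHI code may merge predictions differently.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shift_to_full_image(detections, offset_x, offset_y):
    """Move slice-local boxes into full-image coordinates."""
    return [
        ((x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y), score)
        for (x1, y1, x2, y2), score in detections
    ]


def merge_detections(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box of each overlapping group."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Made-up detections: two from the full image, one from a slice at (64, 64).
full_image_dets = [((100, 100, 140, 140), 0.6), ((300, 300, 320, 320), 0.7)]
slice_dets = shift_to_full_image([((38, 37, 78, 77), 0.9)], 64, 64)
merged = merge_detections(full_image_dets + slice_dets, iou_threshold=0.5)
print(merged)
```

In this made-up example, the slice detection and the first full-image detection cover the same object, so only the higher-scoring one survives.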
The technique can be used directly with your trained object detection model without any finetuning.
It can also be used as a data augmentation technique during training.
According to the paper, the results are very impressive.
Without finetuning, you get an AP increase of 6.8%, 5.1% and 5.3% for the FCOS, VFNet and TOOD detectors, respectively.
With finetuning, you get an AP increase of 12.7%, 13.4% and 14.5% in the same order.
You can read more about it in the original paper. You can also check the code here.