Hi Reader, Welcome to the PYCAD newsletter, where every week you receive doses of machine learning and computer vision techniques and tools to help you learn how to build AI solutions that empower the most vulnerable members of our society: patients.
Foundation models such as SAM (Segment Anything Model) produce really impressive detection and segmentation results. But they are almost useless for building real-world products. Why?

Because they are too big, too slow, and too general.

Nonetheless, they can be used to build deep learning systems incredibly quickly. How?

By letting them auto-annotate your dataset!

A technique called Autodistill enables you to do exactly this.

Here are the steps:

1 - You take a foundation model such as SAM, run your images through it, and collect its outputs.

2 - You use prompting to choose which outputs to keep and which to discard. The kept outputs become your annotated dataset.

3 - You use these annotations to train a leaner, more specialized model such as YOLOv8.

Although you could build such a pipeline yourself, I recommend first taking a look at the autodistill package.

It lets you perform all of these steps in an easy way, as in the sketch below.
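Here is a minimal sketch of that workflow using autodistill's Grounded SAM and YOLOv8 plugins. The "tumor" prompt, the folder paths, and the epoch count are made-up placeholders, and the module and argument names reflect the package at the time of writing, so double-check them against the current docs:

```python
# pip install autodistill autodistill-grounded-sam autodistill-yolov8
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
from autodistill_yolov8 import YOLOv8

# Steps 1 and 2: prompt a SAM-based foundation model to auto-label raw images.
# The ontology maps a text prompt to the class name saved in the annotations.
base_model = GroundedSAM(ontology=CaptionOntology({"tumor": "tumor"}))
base_model.label(input_folder="./images", output_folder="./dataset")

# Step 3: distill the auto-labeled dataset into a small, fast YOLOv8 model.
target_model = YOLOv8("yolov8n.pt")
target_model.train("./dataset/data.yaml", epochs=50)
```

The result is a compact model trained on your own data that you can deploy like any other YOLOv8 checkpoint.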
You can also test the package directly inside a Google Colab notebook here.

This new tracking technique is a game changer!

It's from a paper titled "Tracking Everything Everywhere All at Once"!

Classical tracking algorithms such as pairwise optical flow lose track of objects when they are occluded, and they can produce inconsistencies when correspondences are chained over multiple frames.

The researchers behind this paper argue that the pairwise, frame-to-frame way these classical algorithms represent motion is simply too limited.

Therefore, they propose a global motion representation that provides accurate and consistent tracking, even through occlusion.

This global motion is represented as a data structure that encodes the trajectories of all points in a scene.

The proposed representation is called OmniMotion.

OmniMotion makes it possible to follow any point from one frame of a video to another while consistently keeping track of the 3D context of the scene.
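To make the idea concrete, here is a toy sketch (my own illustration, not the authors' code) of the core mechanism: every frame gets an invertible mapping into one shared canonical volume, and tracking a point from frame i to frame j amounts to composing frame i's mapping with the inverse of frame j's. The paper learns invertible neural networks over a quasi-3D canonical volume for each video; the simple affine maps below are placeholders that keep the sketch short while remaining exactly invertible:

```python
import torch
import torch.nn as nn

class FrameToCanonical(nn.Module):
    """Invertible mapping from one frame's local 3D volume into the shared
    canonical volume (a learned affine map standing in for the paper's
    invertible neural network)."""

    def __init__(self):
        super().__init__()
        self.A = nn.Parameter(torch.eye(3) + 0.01 * torch.randn(3, 3))
        self.b = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        # Frame-local point -> canonical point: u = A x + b
        return x @ self.A.T + self.b

    def inverse(self, u):
        # Canonical point -> frame-local point: x = A^{-1} (u - b)
        return (u - self.b) @ torch.inverse(self.A).T


def track(point, src_map, dst_map):
    """Carry a 3D point from a source frame to a destination frame by
    passing through the canonical volume: x_j = T_j^{-1}(T_i(x_i))."""
    return dst_map.inverse(src_map(point))


frame_maps = [FrameToCanonical() for _ in range(10)]  # one mapping per frame
x0 = torch.tensor([0.2, -0.1, 1.5])                   # a point seen in frame 0
x7 = track(x0, frame_maps[0], frame_maps[7])          # its location in frame 7
print(x7)
```

Because every frame routes through the same canonical volume, correspondences stay consistent across the whole video instead of drifting the way chained pairwise flow does.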
Below you can see the result of this technique.

You can find out more in the original paper, and you can also check out a demo here.