Hello Reader,

Welcome to another edition of the PYCAD newsletter, where we cover interesting topics in machine learning and computer vision applied to medical imaging. The goal of this newsletter is to help you stay up to date and learn important concepts in this amazing field! I've got some cool insights for you below ↓

Gorilla: A Large Language Model that can call APIs

A new open-source large language model can find the right API call starting from a general prompt. It's called Gorilla!

Is this confusing? Let me explain.

Suppose I would like to detect cats in an image. That is, I would like to get the coordinates of the bounding boxes surrounding the cats in the image.

ChatGPT cannot do that directly, although it might be able to through a plugin.

But there are a variety of machine learning models out there that can do this.

A lot of them can be called through some API or library.

Gorilla tells you which libraries you need and gives you the steps to call the right methods or classes from those libraries.

Here's an example to illustrate this.

You can give Gorilla the following prompt: "I want to build a robot that can detect objects in an image."

Gorilla will understand that you would like to perform object detection on an image.

Then it will find a suitable model on Hugging Face. In this case it's a YOLOS model that can be accessed through the class "YolosForObjectDetection" from the Transformers library.

It then gives you the steps needed to load your image or camera feed, pass it to the model, and make predictions on it.
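To make this concrete, here is a rough sketch of what such steps can look like with the Transformers library. This is not Gorilla's actual output, just my own minimal illustration. The checkpoint name "hustvl/yolos-tiny", the 0.9 score threshold, and the helper names detect_objects and keep_label are assumptions I picked for the example. It needs torch, transformers, and Pillow installed, and it downloads the model weights on first use.

```python
from typing import Dict, List


def detect_objects(image_path: str, threshold: float = 0.9) -> List[Dict]:
    """Run a YOLOS detector on one image and return labelled bounding boxes."""
    # Heavy dependencies are imported here so keep_label() works without them.
    import torch
    from PIL import Image
    from transformers import YolosForObjectDetection, YolosImageProcessor

    # Assumption: this checkpoint is chosen for illustration only.
    checkpoint = "hustvl/yolos-tiny"
    processor = YolosImageProcessor.from_pretrained(checkpoint)
    model = YolosForObjectDetection.from_pretrained(checkpoint)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    return [
        {
            "label": model.config.id2label[label.item()],
            "score": score.item(),
            # Box format: [xmin, ymin, xmax, ymax] in pixels.
            "box": [round(v, 1) for v in box.tolist()],
        }
        for score, label, box in zip(
            result["scores"], result["labels"], result["boxes"]
        )
    ]


def keep_label(detections: List[Dict], label: str) -> List[Dict]:
    """Keep only the detections whose class name matches `label`."""
    return [d for d in detections if d["label"] == label]
```

For the cat example from earlier, something like keep_label(detect_objects("cat.jpg"), "cat") would return only the cat boxes, where "cat.jpg" is a placeholder for your own image file.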

Awesome, isn't it?!

For more information about this model, you can read the paper, see the code on GitHub, or test the code directly on Google Colab.

Why the ML Community is Awesome!

Here is a really insightful exchange that I recently saw on Twitter between three ML researchers.

Researcher 1: "New paper!! We show that pre-training language-image models *solely* on synthetic images from Stable Diffusion can outperform training on real images!!"

Researcher 2: "Nice work. But stable diffusion already uses a pretrained clip, wonder does that render your exploration as some kind of distillation from pretrained models?"

Researcher 3: "I think that's a nice way to look at it. SD is distillation of knowledge from CLIP and LAION into generative model.
Kind of like dataset distillation but optimized with a generative objective rather than a task objective."

It's exchanges like these that make the ML community awesome!

Here's the paper that they were talking about.

Dilip Krishnan

@dilipkay

New paper!! We show that pre-training language-image models *solely* on synthetic images from Stable Diffusion can outperform training on real images!!
Work done with @YonglongT (Google), Huiwen Chang (Google), @phillip_isola (MIT) and Lijie Fan (MIT)!!

12:29 PM • Jun 2, 2023


Cool AI Tools (Affiliates)

Chatbase: An AI chatbot builder that lets you build, train, and embed smart chatbots powered by ChatGPT right on your website.

EzyCourse: Create and sell your courses, services, products, or online communities from one platform.

Katteb: The first fact-checked, real-time, and localized AI writer.
