Hello Reader,

Welcome to another edition of the PYCAD newsletter, where we cover interesting topics in Machine Learning and Computer Vision applied to Medical Imaging. The goal of this newsletter is to help you stay up-to-date and learn important concepts in this amazing field! I've got some cool insights for you below ↓

Lamini: Train your own Large Language Models

I recently discovered Lamini, a Python package that lets you finetune large language models on your own dataset with a few lines of code.

Here’s the good and the bad about this package.

The good:

Very easy to use. With a few lines of code, you’ll be finetuning an LLM.
Several open source LLMs are supported.
Complete privacy and ownership of the model.

The bad:

Doesn’t support the top open source LLMs (yet?).
It seems to be limited to question-answering chatbots.
The training data must be in the form of {question : answer}, meaning your data needs to be restructured in a Q&A fashion. This is not ideal for the many companies and organizations that only have large corpora of raw text.
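
To give a feel for the Q&A restructuring mentioned above, here is a minimal sketch in plain Python. It does not call Lamini at all: the helper name `to_qa_records`, the `question` and `answer` keys, and the sample pairs are assumptions made up for illustration, so check the package's documentation for the exact format it expects.

```python
import json

def to_qa_records(pairs):
    """Turn (question, answer) tuples into a list of dicts, skipping empty entries.

    The "question"/"answer" key names are an assumption for this sketch,
    not a confirmed Lamini schema.
    """
    records = []
    for question, answer in pairs:
        question, answer = question.strip(), answer.strip()
        if question and answer:
            records.append({"question": question, "answer": answer})
    return records

# Illustrative, hand-written pairs (not real training data).
pairs = [
    ("What does a CT scan measure?", "X-ray attenuation of tissue."),
    ("   ", "An answer without a question is dropped."),
]

records = to_qa_records(pairs)
print(json.dumps(records, indent=2))
```

The hard part in practice is producing the pairs themselves from a raw corpus, which this sketch deliberately leaves out.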

Why do you need to know this?

Because many industries want to build in-house language models to solve problems they currently face. Many of them avoid ChatGPT and GPT-4 because they don't want their data to reach other companies' servers. But now, with all of the open source language models available, you can train a model and run it on premises, without internet access and without sending your data to third-party APIs!

CM3Leon: The new model for image generation from text but it can do more

CM3Leon is the new generative model from Meta. It can generate images from text and can also describe image content in clear and concise text!

Here are a few things you should know about this model.

Architecture:

The CM3Leon models follow a decoder-only transformer architecture, similar to OPT (Open Pre-trained Transformer Language Models).

For weight initialization, they used a truncated normal distribution with a mean of 0 and a standard deviation of 0.006, truncated to 3 standard deviations.

Output layers are initialized to 0, and the learned absolute positional embedding is initialized near zero with a standard deviation of 0.0002. The models were trained with Metaseq.
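
The initialization scheme described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not Meta's code: the tiny layer shapes, the helper names, and the use of rejection sampling for the truncation are all my own choices, and the text does not say whether the positional embedding is truncated, so a plain normal is assumed there.

```python
import random

def truncated_normal(mean, std, num_std, rng):
    """Sample a normal value, redrawing until it lies within num_std standard deviations."""
    bound = num_std * std
    while True:
        x = rng.gauss(mean, std)
        if abs(x - mean) <= bound:
            return x

def init_weights(rows, cols, std, rng, num_std=3.0):
    """Build a rows x cols matrix of truncated-normal values with mean 0."""
    return [[truncated_normal(0.0, std, num_std, rng) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Shapes are toy values chosen for illustration only.
hidden_weights = init_weights(4, 8, std=0.006, rng=rng)   # mean 0, std 0.006, cut at 3 std
output_weights = [[0.0] * 8 for _ in range(4)]            # output layers start at zero
pos_embedding = [[rng.gauss(0.0, 0.0002) for _ in range(8)]
                 for _ in range(16)]                      # near zero, std 0.0002 (untruncated, assumed)
```

In a real framework you would reach for its built-in initializers rather than Python lists, but the numbers are the ones quoted above.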

Model Capabilities:

Text to image generation:

The model can generate images from text prompts like many other generative models.

It can also do text-guided image inpainting, meaning that you can start from an input image and then ask the model to change some attributes of objects found in the image.

The model also seems to be able to write coherent text on images, something that many image generation models struggle with.

Image to text generation:

The model can generate short and long captions about an image.

It can also answer questions about an image. So you can ask it a question such as “what color is the dog?” and it will answer you.

You can read more about CM3Leon here.

That's it for this week's edition, I hope you enjoyed it!
