Echovlm: Medical Imaging Reports for $5

(github.com)

1 point | by vukadinovic 4 hours ago

1 comment

  • vukadinovic 4 hours ago

    This repo is a full-stack implementation of a VLM (vision-language model) for generating medical reports from imaging scans. It is meant as a practical example for researchers: it demonstrates end-to-end training of VLMs on medical imaging data and can be adapted to other imaging modalities. echovlm is inspired by Karpathy's nanochat and, like it, is a fully open-source, reproducible codebase, which makes it one of the few public codebases for medical machine learning. As a running example, we use an echocardiography dataset with synthetic reports and study embeddings, which lets us simulate VLM training data without access to raw medical imaging scans. echovlm is very lightweight and can run on a single GPU via the speedrun.sh script, which runs the entire pipeline from start to finish: dataset preparation, tokenization, training, evaluation, inference, and a real example on a scan of my heart.