Demajh, Inc.

GeMix: Conditional GAN-Based Mixup for Improved Medical Image Augmentation: what it means for business leaders

GeMix replaces noisy pixel-level blending with GAN-generated, diagnosis-aware images, giving healthcare AI teams stronger generalization, fewer false negatives, and a compliant path to scaling models without additional sensitive patient data.

1. What the method is

GeMix is a two-stage augmentation framework that first trains a class-conditional StyleGAN2-ADA on a target medical dataset, then samples soft Dirichlet labels to generate synthetic scans lying between real disease classes. These generator outputs are mixed with authentic images during classifier training, acting as a learned, label-aware alternative to plain pixel-space mixup.
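The soft Dirichlet label sampling that drives stage two can be sketched in a few lines. The `bias` value and the uniform base concentration below are illustrative assumptions, not the paper's exact hyperparameters:

```python
import numpy as np

def sample_soft_label(num_classes, anchor, bias=5.0, rng=None):
    """Draw a soft label from a Dirichlet biased toward `anchor`.

    A large concentration on the anchor class keeps most probability
    mass on one diagnosis while blending in a little of the others,
    so the generator is asked for images "between" real classes.
    """
    rng = rng or np.random.default_rng()
    alpha = np.ones(num_classes)      # uniform base concentration
    alpha[anchor] = bias              # tilt toward the anchor class
    return rng.dirichlet(alpha)

soft = sample_soft_label(num_classes=3, anchor=1,
                         rng=np.random.default_rng(0))
# `soft` is a length-3 probability vector concentrated on class 1
```

The resulting vector sums to one, so it can serve both as the generator's conditioning input and as the training target for the mixed image.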

2. Why the method was developed

Naïve mixup often creates anatomically impossible CT or X-ray blends that degrade clinical signal. GeMix’s authors sought realistic, privacy-friendly data expansion that preserves subtle pathology cues critical for high-stakes diagnoses, while remaining a drop-in replacement for existing training pipelines.

3. Who should care

Hospital and health-system ML teams, medical-imaging startups, and data-science leaders constrained by patient-privacy rules: GeMix offers a way to expand training data without collecting additional sensitive scans.

4. How the method works

During pre-training, the conditional GAN learns a distribution of realistic scans per class. At augmentation time, two class-biased Dirichlet vectors pick mixture weights; the generator receives the soft label plus Gaussian noise and returns an image on the learned manifold. Generated samples and their interpolated label vectors are appended to each mini-batch, seamlessly integrating with standard cross-entropy loss.
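One augmentation step described above might be assembled as in the following sketch. Here `fake_generator` is a stand-in for the trained conditional StyleGAN2-ADA, and the image size, batch size, and Dirichlet bias are illustrative assumptions:

```python
import numpy as np

NUM_CLASSES, IMG = 3, 8      # toy sizes; real CT slices are far larger
rng = np.random.default_rng(42)

def fake_generator(soft_label, noise):
    """Stand-in for the conditional GAN: any image-valued function of
    (soft label, Gaussian noise) suffices to show batch assembly."""
    return np.tanh(noise + soft_label.mean())

def gemix_augment(real_imgs, real_labels, n_synth=4, bias=5.0):
    """Append n_synth generator samples, each paired with its
    interpolated (Dirichlet) label vector, to a real mini-batch."""
    imgs = [real_imgs]
    labels = [np.eye(NUM_CLASSES)[real_labels]]   # one-hot real labels
    for _ in range(n_synth):
        alpha = np.ones(NUM_CLASSES)
        alpha[rng.integers(NUM_CLASSES)] = bias   # class-biased Dirichlet
        y = rng.dirichlet(alpha)                  # interpolated label
        z = rng.standard_normal((IMG, IMG))       # Gaussian noise input
        imgs.append(fake_generator(y, z)[None])
        labels.append(y[None])
    return np.concatenate(imgs), np.concatenate(labels)

batch_x = rng.standard_normal((2, IMG, IMG))      # two "real" scans
batch_y = np.array([0, 2])
aug_x, aug_y = gemix_augment(batch_x, batch_y)    # 2 real + 4 synthetic
```

Because every label row is a valid probability vector, the augmented batch plugs directly into a cross-entropy loss that accepts soft targets, as the section notes.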

5. How it was evaluated

Experiments on the 21-site COVIDx-CT-3 benchmark compared GeMix against vanilla mixup, CutMix, AugMix and no-mix baselines across ResNet-50, ResNet-101 and EfficientNet-B0 backbones. Metrics included macro-F1, false-negative rate, calibration error and confusion-matrix shifts under five-fold cross-validation.
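Macro-F1, one of the reported metrics, averages per-class F1 so that rare disease classes count equally with common ones. A minimal reference computation in pure NumPy, using hypothetical toy labels:

```python
import numpy as np

def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

y_true = np.array([0, 0, 1, 1, 2, 2])   # toy ground-truth labels
y_pred = np.array([0, 1, 1, 1, 2, 0])   # toy predictions
score = macro_f1(y_true, y_pred, num_classes=3)
```

Per-class F1 here is 0.5, 0.8, and 2/3, so the macro average is about 0.656; a class-weighted average would instead mask weak performance on minority classes.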

6. How it performed

GeMix lifted macro-F1 by up to 3.8 points and cut COVID-19 false negatives by 27% relative to pixel-space mixup, while adding only 9% to training time. Qualitative review showed sharper lung textures and anatomically plausible blends. The gains held across model families, indicating broad deployability. (Source: arXiv 2507.15577, 2025)
