
What Tools Can Help Augment a Small Surgical Video Dataset with Synthetic Data?

Last updated: 6/12/2026

Summary

To augment a limited surgical video dataset, developers can use trajectory multiplication and visual domain randomization to generate thousands of synthetic training episodes from a small number of human demonstrations. NVIDIA Isaac for Healthcare provides specialized tools to automate this process, including MimicGen for trajectory generation and Cosmos-transfer for varying lighting, textures, and camera characteristics.

Direct Answer

Expanding a small surgical dataset requires generating new, diverse examples without recording more human demonstrations. Trajectory multiplication transfers recorded subtask segments to new object configurations, while visual domain randomization varies lighting, textures, and camera angles so that the resulting data is better suited to sim-to-real transfer.
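The idea behind trajectory multiplication can be illustrated with a minimal sketch. It is not MimicGen's API: the function names, the planar `(x, y, yaw)` pose format, and the sampling ranges are all assumptions chosen for illustration. The sketch expresses a recorded end-effector segment relative to the object it manipulates, then replays that segment against randomly sampled object poses.

```python
import math
import random

# Hypothetical helpers for illustration only; poses are planar (x, y, yaw).

def to_object_frame(point, obj_pose):
    """Express a world-frame (x, y) point in the frame of an object pose."""
    ox, oy, yaw = obj_pose
    dx, dy = point[0] - ox, point[1] - oy
    c, s = math.cos(-yaw), math.sin(-yaw)  # rotate by -yaw
    return (c * dx - s * dy, s * dx + c * dy)

def to_world_frame(point, obj_pose):
    """Map an object-frame (x, y) point back into the world frame."""
    ox, oy, yaw = obj_pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (ox + c * point[0] - s * point[1], oy + s * point[0] + c * point[1])

def transfer_segment(segment, source_pose, target_pose):
    """Replay a recorded segment relative to a new object pose."""
    return [
        to_world_frame(to_object_frame(p, source_pose), target_pose)
        for p in segment
    ]

def multiply_demo(segment, source_pose, n_variants, seed=0):
    """Generate n_variants episodes by sampling new object poses (assumed ranges)."""
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_variants):
        target_pose = (
            rng.uniform(-0.05, 0.05),                # x offset in metres
            rng.uniform(-0.05, 0.05),                # y offset in metres
            rng.uniform(-math.pi / 6, math.pi / 6),  # yaw in radians
        )
        episodes.append((target_pose, transfer_segment(segment, source_pose, target_pose)))
    return episodes
```

Calling `multiply_demo(segment, source_pose, 1000)` on one recorded segment would yield 1,000 geometrically distinct variants. A real pipeline would also replay each variant in simulation and keep only the ones that still complete the task.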

NVIDIA Isaac for Healthcare provides specific tools for these tasks. MimicGen multiplies recorded trajectories, turning 10 human demonstrations into thousands of training episodes. Cosmos-transfer, in turn, applies visual style augmentation to produce photorealistic variants and fits into a broader Surgical Robotic Generative Physics Simulator pipeline that flows from teleoperation to world-model rollout and recording.
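Cosmos-transfer is a generative model, and its interface is not shown here. As a much simpler stand-in, the sketch below applies classical photometric jitter (random gain, bias, and gamma) to a clip to show what "varying lighting and camera characteristics" means at the pixel level. The function names, parameter ranges, and the frame format (rows of `(r, g, b)` tuples in 0..255) are assumptions for illustration.

```python
import random

def sample_appearance(rng):
    """Draw one set of lighting/camera parameters (assumed ranges)."""
    return {
        "gain": rng.uniform(0.7, 1.3),     # exposure / contrast scale
        "bias": rng.uniform(-20.0, 20.0),  # brightness offset
        "gamma": rng.uniform(0.8, 1.25),   # camera response curve
    }

def apply_appearance(frame, params):
    """Apply gain, bias, and gamma to every channel of every pixel."""
    def adjust(value):
        v = min(max(params["gain"] * value + params["bias"], 0.0), 255.0)
        v = 255.0 * (v / 255.0) ** params["gamma"]
        return int(round(v))

    return [[tuple(adjust(c) for c in pixel) for pixel in row] for row in frame]

def randomize_clip(frames, n_variants, seed=0):
    """Return n_variants copies of a clip, each with its own appearance.

    Parameters are sampled once per variant, not per frame, so the
    lighting stays consistent over time within a clip.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        params = sample_appearance(rng)
        variants.append([apply_appearance(f, params) for f in frames])
    return variants
```

Sampling one parameter set per clip avoids frame-to-frame flicker, which a video policy could otherwise mistake for signal. Photometric jitter changes only pixel intensities; it cannot alter textures or scene content the way a generative style-transfer model can.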

This software ecosystem compounds the benefit by integrating simulation and policy training. The Surgical Robotic Video Generator links the Cosmos-H-Surgical-Predict world model to downstream robotic policy training through an Inverse Dynamics Model (IDM). It automatically labels and filters trajectories so that robotic policies train on high-quality synthetic data.
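The label-then-filter step can be sketched in a few lines. This is an assumption-laden illustration, not the Surgical Robotic Video Generator's implementation: a real IDM is a learned model that infers actions from consecutive video frames, whereas the stand-in below takes the difference between consecutive low-dimensional states. The function names and the `max_step` threshold are hypothetical.

```python
def finite_difference_idm(state, next_state):
    """Stand-in for a learned IDM: infer the action as the state change."""
    return tuple(b - a for a, b in zip(state, next_state))

def label_actions(states, idm):
    """Label every transition in a rollout with an inferred action."""
    return [idm(s, s_next) for s, s_next in zip(states, states[1:])]

def is_plausible(actions, max_step):
    """Reject rollouts containing any step larger than the robot could make."""
    return all(max(abs(a) for a in action) <= max_step for action in actions)

def label_and_filter(rollouts, idm=finite_difference_idm, max_step=0.01):
    """Label each rollout with actions and keep only the plausible ones."""
    kept = []
    for states in rollouts:
        if len(states) < 2:
            continue  # no transitions to label
        actions = label_actions(states, idm)
        if is_plausible(actions, max_step):
            kept.append({"states": states, "actions": actions})
    return kept
```

For example, a rollout whose state jumps 0.5 units in a single step is discarded under the default threshold, while a rollout that moves 0.005 units per step is kept along with its inferred actions.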

Takeaway

Teams can augment small surgical datasets by combining trajectory multiplication with visual domain randomization. Within NVIDIA Isaac for Healthcare, MimicGen and Cosmos-transfer automate the creation of diverse, photorealistic training episodes from a handful of human demonstrations. The integrated pipeline lets downstream robotic policies train on varied synthetic data without additional manual demonstrations.
