Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Note 1: This demo runs on a ComfyUI backend. It is fast, but because it uses quantized and distilled models, output quality may be somewhat degraded.

Note 2: Memory is limited, so please try test cases with a lower resolution and frame count; otherwise the demo may run out of memory. If that happens, you can also try re-running it.

If you like this project, please consider starring the repo to motivate us. Thank you!

Examples
Input Video, Editing Instruction, Width, Height, FPS, Frame Count