Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

📄 Paper | 🌐 Project Page | 💻 Github Code | 📦 Model Weights | 📊 Dataset

Note1: This demo runs on a ComfyUI backend. It is fast, but because it uses quantized and distilled models, some quality degradation is possible.

Note2: Memory is limited, so please try test cases with a lower resolution and fewer frames; otherwise the demo may fail with an out-of-memory error. If that happens, you can also try re-running it.

If you like this project, please consider starring the repo to motivate us. Thank you!