diff --git a/README.md b/README.md
index 35352bb..0b526cd 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@

-
+
@@ -40,7 +40,7 @@ With Open-Sora, our goal is to foster innovation, creativity, and inclusivity wi

 ## 📰 News

-- **[2025.03.12]** 🔥 We released **Open-Sora 2.0** (11B). 🎬 11B model achieves [on-par performance](#evaluation) with 11B HunyuanVideo & 30B Step-Video on 📐VBench & 📊Human Preference. 🛠️ Fully open-source: checkpoints and training codes for training with only **$200K**. [[report]](https://github.com/hpcaitech/Open-Sora-Demo/blob/main/paper/Open_Sora_2_tech_report.pdf)
+- **[2025.03.12]** 🔥 We released **Open-Sora 2.0** (11B). 🎬 The 11B model achieves [on-par performance](#evaluation) with 11B HunyuanVideo & 30B Step-Video on 📐VBench & 📊Human Preference. 🛠️ Fully open-source: checkpoints and training code, for a total training cost of only **$200K**. [[report]](https://arxiv.org/abs/2503.09642v1)
 - **[2025.02.20]** 🔥 We released **Open-Sora 1.3** (1B). With the upgraded VAE and Transformer architecture, the quality of our generated videos has been greatly improved 🚀. [[checkpoints]](#open-sora-13-model-weights) [[report]](/docs/report_04.md) [[demo]](https://huggingface.co/spaces/hpcai-tech/open-sora)
 - **[2024.12.23]** The development cost of video generation models has been cut by 50%! Open-source solutions are now available with H200 GPU vouchers. [[blog]](https://company.hpc-ai.com/blog/the-development-cost-of-video-generation-models-has-saved-by-50-open-source-solutions-are-now-available-with-h200-gpu-vouchers) [[code]](https://github.com/hpcaitech/Open-Sora/blob/main/scripts/train.py) [[vouchers]](https://colossalai.org/zh-Hans/docs/get_started/bonus/)
 - **[2024.06.17]** We released **Open-Sora 1.2**, which includes **3D-VAE**, **rectified flow**, and **score condition**. The video quality is greatly improved. [[checkpoints]](#open-sora-12-model-weights) [[report]](/docs/report_03.md) [[arxiv]](https://arxiv.org/abs/2412.20404)
@@ -125,7 +125,7 @@ see [here](/assets/texts/t2v_samples.txt) for full prompts.
 ## 🔆 Reports

-- **[Tech Report of Open-Sora 2.0](https://github.com/hpcaitech/Open-Sora-Demo/blob/main/paper/Open_Sora_2_tech_report.pdf)**
+- **[Tech Report of Open-Sora 2.0](https://arxiv.org/abs/2503.09642v1)**
 - **[Step by step to train or finetune your own model](docs/train.md)**
 - **[Step by step to train and evaluate a video autoencoder](docs/ae.md)**
 - **[Visit the high compression video autoencoder](docs/hcae.md)**

diff --git a/docs/hcae.md b/docs/hcae.md
index 013cffc..1ea6cb5 100644
--- a/docs/hcae.md
+++ b/docs/hcae.md
@@ -11,7 +11,16 @@ we explore training video generation models with high-compression autoencoders (

 Nevertheless, despite the advantage of drastically lower computation costs, other challenges remain. For instance, larger channel counts slow down convergence. Our generation model, adapted to a 128-channel Video DC-AE for 25K iterations, reaches a loss of 0.5, compared to 0.1 for the initialization model. While the fast video generation model underperforms the original, it still captures spatio-temporal relationships. We release this model to the research community for further exploration.

-Checkout more details in our [report]().
+Check out more details in our [report](https://arxiv.org/abs/2503.09642v1).
+
+## Model Download
+
+Download from 🤗 [Hugging Face](https://huggingface.co/hpcai-tech/Open-Sora-v2-Video-DC-AE):
+
+```bash
+pip install "huggingface_hub[cli]"
+huggingface-cli download hpcai-tech/Open-Sora-v2-Video-DC-AE --local-dir ./ckpts
+```

 ## Inference
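
The `huggingface-cli download` command added in `docs/hcae.md` can also be done from Python, which is convenient inside setup scripts. A minimal sketch using `huggingface_hub.snapshot_download` (the function name `download_dcae_checkpoint` is our own; `local_dir` requires a reasonably recent `huggingface_hub`):

```python
from huggingface_hub import snapshot_download


def download_dcae_checkpoint(local_dir: str = "./ckpts") -> str:
    """Fetch the Video DC-AE weights, mirroring the CLI command in the patch.

    Returns the local directory containing the downloaded checkpoint files.
    """
    return snapshot_download(
        repo_id="hpcai-tech/Open-Sora-v2-Video-DC-AE",
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Note: this downloads the full checkpoint, which is several GB.
    print(download_dcae_checkpoint())
```

Either route leaves the weights under `./ckpts`, the layout the inference docs expect.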