[docs] update report

2026-04-11 21:42:26 +02:00 · 2024-06-13 16:54:14 +00:00 · 2024-06-13 16:54:14 +00:00 · 2ebad94b89
commit 2ebad94b89
parent 0bb4c8d587
1 changed file with 6 additions and 3 deletions
--- a/docs/report_03.md
+++ b/docs/report_03.md
@@ -1,11 +1,12 @@
 # Open-Sora 1.2 Report

-- [3D VAE](#3d-vae)
+- [Video compression network](#video-compression-network)
 - [Rectified flow and model adaptation](#rectified-flow-and-model-adaptation)
-- [Training data and stages](#training-data-and-stages)
+- [More data and better multi-stage training](#more-data-and-better-multi-stage-training)
+- [Easy and effective model conditioning](#easy-and-effective-model-conditioning)
 - [Evaluation](#evaluation)

-In Open-Sora 1.2 release, we train a 1.1B models on >20M data, supporting 0s~15s, 144p to 720p, various aspect ratios video generation. Our configurations is listed below, where ✅ means that the data is seen during training, while 🆗 means although not trained, the model can inference at that config (inference requires more than one 80G memory GPU). Following our 1.1 version, Open-Sora 1.2 can also do image-to-video generation and video extension.
+In Open-Sora 1.2 release, we train a 1.1B models on >20M data, supporting 0s~15s, 144p to 720p, various aspect ratios video generation. Our configurations is listed below. Following our 1.1 version, Open-Sora 1.2 can also do image-to-video generation and video extension.

 |      | image | 2s  | 4s  | 8s  | 16s |
 | ---- | ----- | --- | --- | --- | --- |
@@ -14,6 +15,8 @@ In Open-Sora 1.2 release, we train a 1.1B models on >20M data, supporting 0s~15s
 | 480p | ✅     | ✅   | ✅   | ✅   | 🆗   |
 | 720p | ✅     | ✅   | ✅   | 🆗   | 🆗   |

+Here ✅ means that the data is seen during training, and 🆗 means although not trained, the model can inference at that config. Inference for 🆗 requires more than one 80G memory GPU and sequence parallelism.
+
 Besides features introduced in Open-Sora 1.1, Open-Sora 1.2 highlights:

 - Video compression network