mirror of
https://github.com/hpcaitech/Open-Sora.git
synced 2026-04-11 21:42:26 +02:00
[docs]add docs/commands_zh.md,fix some doc's typo (#100)
Signed-off-by: zeekzen <yangzitao1995@qq.com>
This commit is contained in:
parent
08d574d29f
commit
13f8bcfdf0
@@ -142,7 +142,7 @@ torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/i
torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/64x512x512.py --ckpt-path ./path/to/your/ckpt.pth
```

We conducted speed tests on H800 GPUs. For inference with other models, see [here](docs/commands.md) for more instructions.

We conducted speed tests on H800 GPUs. For inference with other models, see [here](/docs/commands_zh.md) for more instructions.
## Data Processing
@@ -169,11 +169,11 @@ torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/6
colossalai run --nproc_per_node 8 --hostfile hostfile scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
```

For training other models and advanced usage, see [here](docs/commands.md) for more instructions.

For training other models and advanced usage, see [here](/docs/commands_zh.md) for more instructions.
## Contribution
If you wish to contribute to this project, you can refer to the [Contribution Guide](./CONTRIBUTING.md).

If you wish to contribute to this project, you can refer to the [Contribution Guide](/CONTRIBUTING.md).
## Disclaimer
@@ -37,7 +37,7 @@ torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inf
### Inference with checkpoints saved during training
During training, an experiment logging folder is created in `outputs` directory. Under each checpoint folder, e.g. `epoch12-global_step2000`, there is a `ema.pt` and the shared `model` folder. Run the following command to perform inference.
During training, an experiment logging folder is created in `outputs` directory. Under each checkpoint folder, e.g. `epoch12-global_step2000`, there is a `ema.pt` and the shared `model` folder. Run the following command to perform inference.
```bash
# inference with ema model
@@ -62,13 +62,14 @@ type="dmp-solver"
num_sampling_steps=20
```
1. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s finetuned VAE decoder on videos for inference (consumes more memory). However, we do not see significant improvement in the video result. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.
2. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s finetuned VAE decoder on videos for inference (consumes more memory). However, we do not see significant improvement in the video result. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.
```python
vae = dict(
    type="VideoAutoencoderKLTemporalDecoder",
    from_pretrained="pretrained_models/vae_temporal_decoder",
)
```

## Training
92 docs/commands_zh.md Normal file
@@ -0,0 +1,92 @@
# Commands

## Inference

You can modify the corresponding config file to change the inference settings. See [here](/docs/structure.md#inference-config-demos) for more details.
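
Config files in this repo are plain Python modules. As a minimal sketch only: the `vae` dict and `num_sampling_steps`/`type="dmp-solver"` values mirror snippets shown elsewhere on this page, while the remaining field names (`num_frames`, `image_size`, `scheduler`) are illustrative assumptions, not the actual file contents.

```python
# Hypothetical inference-config sketch in the repo's plain-Python config style.
# Only the scheduler values and the vae dict mirror snippets on this page;
# the other names are assumptions for illustration.
num_frames = 16
image_size = (256, 256)

scheduler = dict(
    type="dmp-solver",
    num_sampling_steps=20,
)

vae = dict(
    type="VideoAutoencoderKLTemporalDecoder",
    from_pretrained="pretrained_models/vae_temporal_decoder",
)
```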

### Inference with DiT pretrained on ImageNet

The following command automatically downloads the pretrained weights on ImageNet and runs inference.
```bash
python scripts/inference.py configs/dit/inference/1x256x256-class.py --ckpt-path DiT-XL-2-256x256.pt
```

### Inference with Latte pretrained on UCF101

The following command automatically downloads the pretrained weights on UCF101 and runs inference.
```bash
python scripts/inference.py configs/latte/inference/16x256x256-class.py --ckpt-path Latte-XL-2-256x256-ucf101.pt
```

### Inference with PixArt-α pretrained weights

Download T5 into `./pretrained_models` and run the following command.
```bash
# 256x256
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x256x256.py --ckpt-path PixArt-XL-2-256x256.pth
# 512x512
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x512x512.py --ckpt-path PixArt-XL-2-512x512.pth
# 1024 multi-scale
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x1024MS.py --ckpt-path PixArt-XL-2-1024MS.pth
```

### Inference with checkpoints saved during training

During training, an experiment logging folder is created in the `outputs` directory. Under each checkpoint folder (e.g. `epoch12-global_step2000`), there is an `ema.pt` file and the shared `model` folder. Run the following command to perform inference.
```bash
# inference with the ema model
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000/ema.pt

# inference with the model
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000

# inference with sequence parallelism
# sequence parallelism is enabled automatically when nproc_per_node is larger than 1
torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000
```

The second command will automatically generate a `model_ckpt.pt` file in the checkpoint folder.

### Inference Hyperparameters

1. The DPM solver is good at fast inference for images, but its results for video inference are not satisfactory. You may use it for quick demos.
```python
type="dmp-solver"
num_sampling_steps=20
```

2. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s VAE decoder finetuned on videos for inference (this consumes more memory), but we do not see a significant improvement in the video results. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.
```python
vae = dict(
    type="VideoAutoencoderKLTemporalDecoder",
    from_pretrained="pretrained_models/vae_temporal_decoder",
)
```

## Training

To resume training, run the following command. The difference between `--load` and `--ckpt-path` is that `--load` also restores the optimizer and dataloader states.
```bash
torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --load YOUR_PRETRAINED_CKPT
```
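
Conceptually, the difference between the two flags can be sketched as follows. The checkpoint keys and values here are hypothetical examples, not the repo's actual checkpoint format.

```python
# Illustrative sketch: weights-only init (--ckpt-path) vs full resume (--load).
# All key names below are hypothetical, not Open-Sora's checkpoint layout.
checkpoint = {
    "model": {"blocks.0.weight": [0.1, 0.2]},             # model parameters
    "optimizer": {"step": 2000, "exp_avg": [0.01, 0.0]},  # optimizer state
    "dataloader": {"epoch": 12, "sample_index": 64000},   # resume position
}

def load_with_ckpt_path(ckpt):
    # --ckpt-path: take the weights only; optimizer and dataloader start fresh
    return {"model": ckpt["model"], "optimizer": None, "dataloader": None}

def load_with_load(ckpt):
    # --load: restore weights plus optimizer and dataloader states,
    # so training continues exactly where it stopped
    return {k: ckpt[k] for k in ("model", "optimizer", "dataloader")}
```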

To enable wandb logging, add the `--wandb` argument to the command.
```bash
WANDB_API_KEY=YOUR_WANDB_API_KEY torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --wandb True
```

You can modify the corresponding config file to change the training settings. See [here](/docs/structure.md#training-config-demos) for more details.

### Training Hyperparameters
1. `dtype` is the data type used for training. Only `fp16` and `bf16` are supported. ColossalAI automatically enables mixed-precision training for `fp16` and `bf16`. We find `bf16` more stable during training.
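
   The stability difference comes mainly from dynamic range: `bf16` keeps `fp32`'s 8 exponent bits, so large loss or gradient values do not overflow, while `fp16` trades range for extra mantissa precision. A small stdlib-only sketch, simulating `bf16` by truncating the low 16 bits of an `fp32` encoding:

```python
import struct

def to_bf16(x: float) -> float:
    """Simulate bfloat16: keep only the top 16 bits of the fp32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def to_fp16(x: float) -> float:
    """Round-trip through IEEE 754 half precision (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Range: 1e5 exceeds fp16's max finite value (65504) but fits easily in bf16.
print(to_bf16(1e5))  # close to 1e5
try:
    to_fp16(1e5)
except OverflowError:
    print("fp16 overflow")

# Precision: fp16 keeps more mantissa bits than bf16 for values near 1.
print(to_fp16(1.001), to_bf16(1.001))
```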