From 13f8bcfdf01fd629626bb58630bda434f462e00e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=9E=81=E5=AE=A2=E5=89=91=E5=BF=83?=
 <44943767+zeekzen@users.noreply.github.com>
Date: Mon, 18 Mar 2024 14:30:19 +0800
Subject: [PATCH] [docs]add docs/commands_zh.md,fix some doc's typo (#100)

Signed-off-by: zeekzen <yangzitao1995@qq.com>
---
 docs/README_zh.md   |  6 +--
 docs/commands.md    |  5 ++-
 docs/commands_zh.md | 92 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 98 insertions(+), 5 deletions(-)
 create mode 100644 docs/commands_zh.md

diff --git a/docs/README_zh.md b/docs/README_zh.md
index d440550..839d513 100644
--- a/docs/README_zh.md
+++ b/docs/README_zh.md
@@ -142,7 +142,7 @@ torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/i
 torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/64x512x512.py --ckpt-path ./path/to/your/ckpt.pth
 ```
 
-我们在 H800 GPU 上进行了速度测试。如需使用其他模型进行推理，请参阅[此处](docs/commands.md)获取更多说明。
+我们在 H800 GPU 上进行了速度测试。如需使用其他模型进行推理，请参阅[此处](/docs/commands_zh.md)获取更多说明。
 
 ## 数据处理
 
@@ -169,11 +169,11 @@ torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/6
 colossalai run --nproc_per_node 8 --hostfile hostfile scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
 ```
 
-有关其他型号的培训和高级使用方法，请参阅[此处](docs/commands.md)获取更多说明。
+有关其他模型的训练和高级使用方法，请参阅[此处](/docs/commands_zh.md)获取更多说明。
 
 ## 贡献
 
-如果您希望为该项目做出贡献，可以参考 [贡献指南](./CONTRIBUTING.md).
+如果您希望为该项目做出贡献，可以参考 [贡献指南](/CONTRIBUTING.md)。
 
 ## 声明
 
diff --git a/docs/commands.md b/docs/commands.md
index 28ee285..83fb42c 100644
--- a/docs/commands.md
+++ b/docs/commands.md
@@ -37,7 +37,7 @@ torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inf
 
 ### Inference with checkpoints saved during training
 
-During training, an experiment logging folder is created in `outputs` directory. Under each checpoint folder, e.g. `epoch12-global_step2000`, there is a `ema.pt` and the shared `model` folder. Run the following command to perform inference.
+During training, an experiment logging folder is created in the `outputs` directory. Under each checkpoint folder, e.g. `epoch12-global_step2000`, there is an `ema.pt` file and the shared `model` folder. Run the following command to perform inference.
 
 ```bash
 # inference with ema model
@@ -62,13 +62,14 @@ type="dmp-solver"
 num_sampling_steps=20
 ```
 
-1. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s finetuned VAE decoder on videos for inference (consumes more memory). However, we do not see significant improvement in the video result. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.
+2. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s finetuned VAE decoder on videos for inference (consumes more memory). However, we do not see significant improvement in the video result. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.
 
 ```python
 vae = dict(
     type="VideoAutoencoderKLTemporalDecoder",
     from_pretrained="pretrained_models/vae_temporal_decoder",
 )
+```
 
 ## Training
 
diff --git a/docs/commands_zh.md b/docs/commands_zh.md
new file mode 100644
index 0000000..6564293
--- /dev/null
+++ b/docs/commands_zh.md
@@ -0,0 +1,92 @@
+# 命令
+
+## 推理
+
+您可以修改相应的配置文件来更改推理设置。在 [此处](/docs/structure.md#inference-config-demos) 查看更多详细信息。
+
+### 在 ImageNet 上使用 DiT 预训练进行推理
+
+以下命令会自动下载 ImageNet 上的预训练权重并运行推理。
+
+```bash
+python scripts/inference.py configs/dit/inference/1x256x256-class.py --ckpt-path DiT-XL-2-256x256.pt
+```
+
+### 在 UCF101 上使用 Latte 预训练进行推理
+
+以下命令会自动下载 UCF101 上的预训练权重并运行推理。
+
+```bash
+python scripts/inference.py configs/latte/inference/16x256x256-class.py --ckpt-path Latte-XL-2-256x256-ucf101.pt
+```
+
+### 使用 PixArt-α 预训练权重进行推理
+
+将 T5 下载到 `./pretrained_models` 并运行以下命令。
+
+```bash
+# 256x256
+torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x256x256.py --ckpt-path PixArt-XL-2-256x256.pth
+
+# 512x512
+torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x512x512.py --ckpt-path PixArt-XL-2-512x512.pth
+
+# 1024 multi-scale
+torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x1024MS.py --ckpt-path PixArt-XL-2-1024MS.pth
+```
+
+### 使用训练期间保存的 checkpoints 进行推理
+
+在训练期间，会在 `outputs` 目录中创建一个实验日志记录文件夹。在每个 checkpoint 文件夹下（例如 `epoch12-global_step2000`），有一个 `ema.pt` 文件和共享的 `model` 文件夹。执行以下命令进行推理。
+
+```bash
+# 使用 ema 模型进行推理
+torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000/ema.pt
+
+# 使用模型进行推理
+torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000
+
+# 使用序列并行进行推理
+# 当 nproc_per_node 大于 1 时，将自动启用序列并行
+torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000
+```
+
+第二个命令将在 checkpoint 文件夹中自动生成一个 `model_ckpt.pt` 文件。
+
+### 推理超参数
+
+1. DPM 求解器擅长对图像进行快速推理，但其视频推理效果并不令人满意。如果只是为了快速演示，您可以使用这个求解器。
+
+```python
+type="dpm-solver"
+num_sampling_steps=20
+```
+
+2. 您可以在视频推理上使用 [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt) 微调的 VAE 解码器（消耗更多内存）。但是，我们没有看到视频推理效果有明显改善。要使用它，请将 [预训练权重](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) 下载到 `./pretrained_models/vae_temporal_decoder` 中，并修改配置文件，如下所示。
+
+```python
+vae = dict(
+    type="VideoAutoencoderKLTemporalDecoder",
+    from_pretrained="pretrained_models/vae_temporal_decoder",
+)
+```
+
+## 训练
+
+如果您要继续训练，请运行以下命令。参数 `--load` 与 `--ckpt-path` 的不同之处在于，`--load` 还会加载优化器和数据加载器的状态。
+
+```bash
+torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --load YOUR_PRETRAINED_CKPT
+```
+
+如果要启用 wandb 日志，请在命令中添加 `--wandb` 参数。
+
+```bash
+WANDB_API_KEY=YOUR_WANDB_API_KEY torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --wandb True
+```
+
+您可以修改相应的配置文件来更改训练设置。在 [此处](/docs/structure.md#training-config-demos) 查看更多详细信息。
+
+### 训练超参数
+
+1. `dtype` 是用于训练的数据类型。仅支持 `fp16` 和 `bf16`。ColossalAI 自动启用 `fp16` 和 `bf16` 的混合精度训练。在训练过程中，我们发现 `bf16` 更稳定。