Open-Sora/docs/commands.md

# Commands

- [Inference](#inference)
  - [Inference with Open-Sora 1.1](#inference-with-open-sora-11)
  - [Inference with DiT pretrained on ImageNet](#inference-with-dit-pretrained-on-imagenet)
  - [Inference with Latte pretrained on UCF101](#inference-with-latte-pretrained-on-ucf101)
  - [Inference with PixArt-α pretrained weights](#inference-with-pixart-α-pretrained-weights)
  - [Inference with checkpoints saved during training](#inference-with-checkpoints-saved-during-training)
  - [Inference Hyperparameters](#inference-hyperparameters)
- [Training](#training)
  - [Training Hyperparameters](#training-hyperparameters)
- [Search batch size for buckets](#search-batch-size-for-buckets)

## Inference

You can modify the corresponding config files to change the inference settings. See more details [here](/docs/structure.md#inference-config-demos).

### Inference with Open-Sora 1.1

Since Open-Sora 1.1 supports inference with dynamic input size, you can pass the input size as an argument.

```bash
# image sampling with prompt path
python scripts/inference.py configs/opensora-v1-1/inference/sample.py \
    --ckpt-path CKPT_PATH --prompt-path assets/texts/t2i_samples.txt --num-frames 1 --image-size 1024 1024

# image sampling with prompt
python scripts/inference.py configs/opensora-v1-1/inference/sample.py \
    --ckpt-path CKPT_PATH --prompt "A beautiful sunset over the city" --num-frames 1 --image-size 1024 1024

# video sampling
python scripts/inference.py configs/opensora-v1-1/inference/sample.py \
    --ckpt-path CKPT_PATH --prompt "A beautiful sunset over the city" --num-frames 16 --image-size 480 854
```

You can adjust `--num-frames` and `--image-size` to generate different results. We recommend using the same image size as the training resolution, which is defined in [aspect.py](/opensora/datasets/aspect.py). Some examples are shown below.

- 240p
  - 16:9 240x426
  - 3:4 276x368
  - 1:1 320x320
- 480p
  - 16:9 480x854
  - 3:4 554x738
  - 1:1 640x640
- 720p
  - 16:9 720x1280
  - 3:4 832x1110
  - 1:1 960x960
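
As a convenience, the sizes listed above can be kept in a small lookup table that emits the matching `--image-size` arguments. This is only a sketch: the `TRAINING_SIZES` table and the `image_size_args` helper are hypothetical names, not part of Open-Sora, and the values are copied from the list above rather than read from `aspect.py`.

```python
# Hypothetical lookup table; values are copied from the list above,
# in the order they are passed to --image-size.
TRAINING_SIZES = {
    "240p": {"16:9": (240, 426), "3:4": (276, 368), "1:1": (320, 320)},
    "480p": {"16:9": (480, 854), "3:4": (554, 738), "1:1": (640, 640)},
    "720p": {"16:9": (720, 1280), "3:4": (832, 1110), "1:1": (960, 960)},
}


def image_size_args(resolution, aspect):
    """Return the --image-size arguments for a listed resolution and aspect ratio."""
    first, second = TRAINING_SIZES[resolution][aspect]
    return ["--image-size", str(first), str(second)]


print(" ".join(image_size_args("480p", "16:9")))  # --image-size 480 854
```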

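`inference-long.py` is compatible with `inference.py` and additionally supports advanced features such as image conditioning, video extending, long video generation, video connecting, and video editing.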

```bash
# image condition
python scripts/inference-long.py configs/opensora-v1-1/inference/sample.py --ckpt-path CKPT_PATH \
  --num-frames 32 --image-size 240 426 --sample-name image-cond \
  --prompt 'A breathtaking sunrise scene.{"reference_path": "assets/images/condition/wave.png","mask_strategy": "0"}'

# video extending
python scripts/inference-long.py configs/opensora-v1-1/inference/sample.py --ckpt-path CKPT_PATH \
  --num-frames 32 --image-size 240 426 --sample-name image-cond \
  --prompt 'A car driving on the ocean.{"reference_path": "https://cdn.openai.com/tmp/s/interp/d0.mp4","mask_strategy": "0,0,0,-8,8"}'

# long video generation
python scripts/inference-long.py configs/opensora-v1-1/inference/sample.py --ckpt-path CKPT_PATH \
  --num-frames 32 --image-size 240 426 --loop 16 --condition-frame-length 8 --sample-name long \
  --prompt '|0|a white jeep equipped with a roof rack driving on a dirt road in a coniferous forest.|2|a white jeep equipped with a roof rack driving on a dirt road in the desert.|4|a white jeep equipped with a roof rack driving on a dirt road in a mountain.|6|A white jeep equipped with a roof rack driving on a dirt road in a city.|8|a white jeep equipped with a roof rack driving on a dirt road on the surface of a river.|10|a white jeep equipped with a roof rack driving on a dirt road under the lake.|12|a white jeep equipped with a roof rack flying into the sky.|14|a white jeep equipped with a roof rack driving in the universe. Earth is the background.{"reference_path": "https://cdn.openai.com/tmp/s/interp/d0.mp4", "mask_strategy": "0,0,0,0,16"}'

# video connecting
python scripts/inference-long.py configs/opensora-v1-1/inference/sample.py --ckpt-path CKPT_PATH \
  --num-frames 32 --image-size 240 426 --sample-name connect \
  --prompt 'A breathtaking sunrise scene.{"reference_path": "assets/images/condition/sunset1.png;assets/images/condition/sunset2.png","mask_strategy": "0;0,1,0,-1,1"}'

# video editing
python scripts/inference-long.py configs/opensora-v1-1/inference/sample.py --ckpt-path CKPT_PATH \
  --num-frames 32 --image-size 480 853 --sample-name edit \
  --prompt 'A cyberpunk-style city at night.{"reference_path": "https://cdn.pixabay.com/video/2021/10/12/91744-636709154_large.mp4","mask_strategy": "0,0,0,0,32,0.4"}'
```
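
The prompts above follow a visible pattern: the text is followed by a `{...}` block holding `reference_path` and `mask_strategy`, multiple references are separated by `;`, and long-video prompts prefix each segment with `|index|`. The sketch below assembles such strings programmatically. It assumes the trailing block is parsed as JSON, which the examples suggest but this page does not state; `build_conditioned_prompt` and `build_loop_prompt` are hypothetical helpers, not Open-Sora functions.

```python
import json


def build_conditioned_prompt(text, reference_paths, mask_strategies):
    """Append the reference/mask block to a prompt, as in the examples above."""
    options = {
        # Multiple references and strategies are joined with ';' as in the
        # "video connecting" example.
        "reference_path": ";".join(reference_paths),
        "mask_strategy": ";".join(mask_strategies),
    }
    # Assumption: the trailing {...} block is read as JSON by the script.
    return text + json.dumps(options)


def build_loop_prompt(segments):
    """Build a '|index|text' prompt from a {index: text} mapping."""
    return "".join(f"|{index}|{text}" for index, text in sorted(segments.items()))


prompt = build_conditioned_prompt(
    "A breathtaking sunrise scene.",
    ["assets/images/condition/sunset1.png", "assets/images/condition/sunset2.png"],
    ["0", "0,1,0,-1,1"],
)
print(prompt)
```

When passing the result on the command line, wrap it in single quotes (or use `shlex.quote`) so the shell leaves the double quotes intact.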

### Inference with DiT pretrained on ImageNet

The following command automatically downloads the pretrained weights on ImageNet and runs inference.

```bash
python scripts/inference.py configs/dit/inference/1x256x256-class.py --ckpt-path DiT-XL-2-256x256.pt
```

### Inference with Latte pretrained on UCF101

The following command automatically downloads the pretrained weights on UCF101 and runs inference.

```bash
python scripts/inference.py configs/latte/inference/16x256x256-class.py --ckpt-path Latte-XL-2-256x256-ucf101.pt
```

### Inference with PixArt-α pretrained weights

Download T5 into `./pretrained_models` and run the following command.

```bash
# 256x256
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x256x256.py --ckpt-path PixArt-XL-2-256x256.pth

# 512x512
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x512x512.py --ckpt-path PixArt-XL-2-512x512.pth

# 1024 multi-scale
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/pixart/inference/1x1024MS.py --ckpt-path PixArt-XL-2-1024MS.pth
```

### Inference with checkpoints saved during training

During training, an experiment logging folder is created in the `outputs` directory. Each checkpoint folder, e.g. `epoch12-global_step2000`, contains an `ema.pt` file and the shared `model` folder. Run the following commands to perform inference.

```bash
# inference with ema model
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000/ema.pt

# inference with model
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000

# inference with sequence parallelism
# sequence parallelism is enabled automatically when nproc_per_node is larger than 1
torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path outputs/001-STDiT-XL-2/epoch12-global_step2000
```

The second command will automatically generate a `model_ckpt.pt` file in the checkpoint folder.
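
To avoid typing the checkpoint folder by hand, a short script can pick the most recent one. This is a minimal sketch that assumes checkpoint folders are named `epoch<N>-global_step<M>` as in the example above; `latest_checkpoint` is a hypothetical helper, not part of Open-Sora.

```python
import re
from pathlib import Path

# Assumption: checkpoint folders follow the "epoch12-global_step2000" pattern.
CKPT_PATTERN = re.compile(r"epoch(\d+)-global_step(\d+)$")


def latest_checkpoint(experiment_dir):
    """Return the checkpoint folder with the highest global step, or None."""
    root = Path(experiment_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = CKPT_PATTERN.match(child.name)
        if child.is_dir() and match:
            step = int(match.group(2))
            if step > best_step:
                best_step, best_path = step, child
    return best_path


ckpt = latest_checkpoint("outputs/001-STDiT-XL-2")
if ckpt is not None:
    # Pass either the folder itself or its ema.pt to --ckpt-path.
    print(ckpt / "ema.pt")
```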

### Inference Hyperparameters

1. DPM-solver is good at fast inference for images. However, its video results are not satisfactory. You can use it for quick demos.

```python
type="dpm-solver"
num_sampling_steps=20
```

2. You can use [SVD](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)'s finetuned VAE decoder on videos for inference (consumes more memory). However, we do not see significant improvement in the video result. To use it, download [the pretrained weights](https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models/vae_temporal_decoder) into `./pretrained_models/vae_temporal_decoder` and modify the config file as follows.

```python
vae = dict(
    type="VideoAutoencoderKLTemporalDecoder",
    from_pretrained="pretrained_models/vae_temporal_decoder",
)
```

## Training

To resume training, run the following command. `--load` differs from `--ckpt-path` in that it also loads the optimizer and dataloader states.

```bash
torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --load YOUR_PRETRAINED_CKPT
```

To enable wandb logging, add `--wandb True` to the command.

```bash
WANDB_API_KEY=YOUR_WANDB_API_KEY torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --wandb True
```

You can modify the corresponding config files to change the training settings. See more details [here](/docs/structure.md#training-config-demos).

### Training Hyperparameters

1. `dtype` is the data type for training. Only `fp16` and `bf16` are supported. ColossalAI automatically enables mixed-precision training for both. In our training runs, `bf16` was more stable.

## Search batch size for buckets

To search the batch size for buckets, run the following command.

```bash
torchrun --standalone --nproc_per_node 1 scripts/misc/search_bs.py configs/opensora-v1-2/misc/bs.py --data-path /mnt/nfs-207/sora_data/meta/searchbs.csv
```

The dataset passed here should be a small one, since it is only used for the search.

To control the batch size search range, you should specify `bucket_config` in the config file, where the value tuple is `(guess_value, range)` and the search will be performed in `guess_value±range`.

Here is an example of the bucket config:

```python
bucket_config = {
    "240p": {
        1: (100, 100),
        51: (24, 10),
        102: (12, 10),
        204: (4, 8),
        408: (2, 8),
    },
    "480p": {
        1: (50, 50),
        51: (6, 6),
        102: (3, 3),
        204: (1, 2),
    },
}
```
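
To see which batch sizes a given `bucket_config` would cover, the `guess_value±range` intervals can be expanded as below. This is a sketch under two assumptions that this page does not confirm: the lower bound is clamped to 1 (a batch size of 0 is meaningless), and the inner dictionary keys are treated as opaque bucket identifiers. `search_bounds` is a hypothetical helper.

```python
def search_bounds(bucket_config):
    """Map each (resolution, key) bucket to its (low, high) batch size range."""
    bounds = {}
    for resolution, buckets in bucket_config.items():
        for key, (guess, search_range) in buckets.items():
            # Assumption: clamp the lower bound so it never drops below 1.
            low = max(1, guess - search_range)
            high = guess + search_range
            bounds[(resolution, key)] = (low, high)
    return bounds


example_config = {
    "240p": {1: (100, 100), 51: (24, 10)},
    "480p": {204: (1, 2)},
}
for bucket, (low, high) in search_bounds(example_config).items():
    print(bucket, low, high)
```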

You can also restrict the search to a single resolution, so that searches for different resolutions can be run in parallel.

```bash
torchrun --standalone --nproc_per_node 1 scripts/misc/search_bs.py configs/opensora-v1-2/misc/bs.py --data-path /mnt/nfs-207/sora_data/meta/searchbs.csv --resolution 240p
```

The search target should be specified in the config file as well. There are two ways:

1. Specify a `base_step_time` in the config file. The search then looks for the batch size that achieves `base_step_time` for each bucket.
2. If `base_step_time` is not specified, it will be determined by `base` which is a tuple of `(batch_size, step_time)`. The step time is the maximum batch size allowed for the bucket.

The script prints the best batch size (and the corresponding step time) for each bucket and saves the output config file. Note that we assume a larger batch size is better, so the script uses binary search to find the best batch size.
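
The idea behind the search can be sketched as follows. This is not the code of `search_bs.py`; it is an illustration that assumes step time does not decrease as batch size grows, interprets "achieving" `base_step_time` as staying at or below it, and uses a hypothetical `measure_step_time` callback in place of a real benchmark run.

```python
def search_batch_size(measure_step_time, guess, search_range, base_step_time):
    """Binary-search the largest batch size whose step time fits the target.

    Returns (batch_size, step_time), or None if no size in the range fits.
    """
    low = max(1, guess - search_range)  # assumption: never try batch size 0
    high = guess + search_range
    best = None
    while low <= high:
        mid = (low + high) // 2
        step_time = measure_step_time(mid)
        if step_time <= base_step_time:
            best = (mid, step_time)  # fits the target: try a larger batch
            low = mid + 1
        else:
            high = mid - 1  # too slow: try a smaller batch
    return best


# Toy cost model standing in for a real benchmark: 100 ms per sample.
def fake_step_time_ms(batch_size):
    return 100 * batch_size


print(search_batch_size(fake_step_time_ms, guess=24, search_range=10, base_step_time=2050))
```

With the toy cost model, the search settles on the largest batch size in `14..34` whose step time stays within 2050 ms.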