Installation
Requirements are listed in the `requirements` folder.
Note that besides these packages, some packages need to be installed manually; they are detailed in the following sections.
Different CUDA versions
You need to manually install torch, torchvision and xformers for different CUDA versions.
For CUDA 12.1,
# need to update first, else may run into weird issues with apex
pip install -U pip
pip install -U setuptools
pip install -U wheel
# install pytorch, torchvision, and xformers
pip install -r requirements/requirements-cu121.txt
# install this project
git clone https://github.com/hpcaitech/Open-Sora
cd Open-Sora
# the default installation is for inference only
pip install -v . # NOTE: for development mode, run `pip install -v -e .`
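Before picking a requirements file, it can help to confirm which CUDA toolkit is on the machine. The following is a minimal sketch, assuming `nvcc` is on `PATH` and prints its usual `release X.Y` line. The helper names `parse_nvcc_release` and `detect_cuda_toolkit` are hypothetical and not part of Open-Sora.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_toolkit():
    """Run `nvcc --version` if it is available; return (major, minor) or None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    version = detect_cuda_toolkit()
    if version is None:
        print("nvcc not found or version not recognised")
    elif version == (12, 1):
        # Only the CUDA 12.1 requirements file is named in this guide.
        print("CUDA 12.1 detected: use requirements/requirements-cu121.txt")
    else:
        print(f"CUDA {version[0]}.{version[1]} detected: "
              "install matching torch, torchvision and xformers manually")
```

The script only reports what it finds; it does not install anything.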
(Optional, recommended for fast speed, especially for training) To enable `layernorm_kernel` and `flash_attn`, you need to install `apex` and `flash-attn` with the following commands.
# install flash attention
# set enable_flash_attn=False in config to disable flash attention
pip install packaging ninja
pip install flash-attn --no-build-isolation
# install apex, the compilation will take a long time
# set enable_layernorm_kernel=False in config to disable apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git
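To see whether the optional kernels are available after the steps above, one option is to check for the packages without importing them. This is a rough sketch: it assumes the import names are `flash_attn` and `apex`, and the helper `suggested_kernel_flags` is hypothetical. It only prints suggested values for the two config flags mentioned above; how the config consumes them is not shown here.

```python
import importlib.util


def is_installed(module_name):
    """True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def suggested_kernel_flags():
    """Map the optional packages to the config flags named in this guide."""
    return {
        "enable_flash_attn": is_installed("flash_attn"),    # assumed import name
        "enable_layernorm_kernel": is_installed("apex"),    # assumed import name
    }


if __name__ == "__main__":
    for flag, value in suggested_kernel_flags().items():
        print(f"{flag}={value}")
```

A package being locatable does not guarantee that its CUDA extensions compiled correctly, so treat a `True` here as a first check only.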
Optional Dependencies
The default installation is for inference only. Other optional dependencies are detailed in the sections below, namely "Data Dependencies" and "Evaluation Dependencies".
Data Dependencies
First, run the following command to install requirements:
pip install -v .[data] # For development: `pip install -v -e .[data]`
Next, you need to manually install the packages listed in the following sections specific to your data processing needs.
Datasets
To get image and video information, we use opencv-python, which is included in our requirements.
However, if your videos use the AV1 codec instead of H.264, you need to install ffmpeg (already included in our requirements), then run the following to install an OpenCV build from conda-forge that supports AV1:
conda install -c conda-forge opencv
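To find out which codec a video uses before deciding whether the conda build is needed, one can read the FOURCC tag that OpenCV reports. The sketch below is illustrative only: the helper names are hypothetical, and the exact tag (for example `av01` for AV1 or `avc1` for H.264) depends on the container, so treat those values as assumptions.

```python
def fourcc_to_str(code):
    """Decode OpenCV's integer FOURCC into its four-character tag."""
    code = int(code)  # VideoCapture.get returns a float
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))


def video_codec(path):
    """Return the FOURCC tag OpenCV reports for a video file."""
    import cv2  # imported lazily so fourcc_to_str works without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        if not capture.isOpened():
            raise IOError(f"cannot open {path}")
        return fourcc_to_str(capture.get(cv2.CAP_PROP_FOURCC))
    finally:
        capture.release()


if __name__ == "__main__":
    import sys

    # Usage: python check_codec.py video1.mp4 video2.mp4
    for video_path in sys.argv[1:]:
        print(video_path, video_codec(video_path))
```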
LLaVA Captioning
You need to manually install LLaVA with the following command:
pip install --no-deps llava@git+https://github.com/haotian-liu/LLaVA.git@v1.2.2.post1
PLLaVA Captioning
You need to manually install PLLaVa with the following commands:
cd tools/caption/pllava_dir # Assume you are in the Open-Sora root directory
git clone https://github.com/magic-research/PLLaVA.git
cd PLLaVA
git checkout fd9194a # since there is no version tag, we use this commit
python python_scripts/hf.py # download the PLLaVA weights
# IMPORTANT: create a new environment for reliable PLLaVA performance:
conda create -n pllava python=3.10
# You need to manually install `torch`, `torchvision` and `xformers` for different CUDA versions, the following works for CUDA 12.1:
conda activate pllava
pip install -r ../../../../requirements/requirements-cu121.txt
pip install packaging ninja
pip install flash-attn --no-build-isolation
# You may manually remove any lines in requirements.txt that contain `cu11`, then run `pip install -r requirements.txt`
# Alternatively, use our prepared pllava environment:
pip install -r ../../../../requirements/requirements-pllava.txt
Frame Interpolation
conda install -c conda-forge opencv
Scene Detection
We use PySceneDetect for this job. You need to manually run the following:
pip install scenedetect[opencv] --upgrade
OCR
You need to open path_to_your_env/lib/python3.10/site-packages/mmdet/__init__.py
and change the assertion `mmcv_version < digit_version(mmcv_maximum_version)` to `mmcv_version <= digit_version(mmcv_maximum_version)`.
If you are unsure of the path to the mmdet init file, simply run our OCR command and wait for the mmdet assertion error about the mmcv version. The error message will contain the exact path to the mmdet init file.
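The manual edit described above could also be scripted. The following is a minimal sketch, not an official tool: the helper names are hypothetical, and it assumes the comparison appears in `mmdet/__init__.py` exactly as quoted above. It locates the file without importing mmdet, so the assertion does not fire, and it writes a backup before changing anything.

```python
import importlib.util
from pathlib import Path

OLD = "mmcv_version < digit_version(mmcv_maximum_version)"
NEW = "mmcv_version <= digit_version(mmcv_maximum_version)"


def relax_mmcv_upper_bound(source):
    """Return the source text with the strict upper bound made inclusive."""
    return source.replace(OLD, NEW)


def mmdet_init_path():
    """Locate mmdet/__init__.py without importing the package."""
    spec = importlib.util.find_spec("mmdet")
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError("mmdet is not installed in this environment")
    return Path(spec.origin)


if __name__ == "__main__":
    path = mmdet_init_path()
    original = path.read_text()
    patched = relax_mmcv_upper_bound(original)
    if patched == original:
        print(f"nothing to change in {path}")
    else:
        # Keep a copy of the untouched file next to the original.
        path.with_name(path.name + ".bak").write_text(original)
        path.write_text(patched)
        print(f"patched {path}")
```

Running the script a second time leaves the file unchanged, because the replaced text no longer matches.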
Evaluation Dependencies
First, run the following command to install requirements:
pip install -v .[eval] # For development: `pip install -v -e .[eval]`
Next, you need to manually install the packages listed in the following sections specific to different evaluation methods.
Human Eval
You need to manually install apex from source by:
# use latest pip, setuptools, and wheel; else may run into weird issues with apex
pip install -U pip
pip install -U setuptools
pip install -U wheel
# set enable_layernorm_kernel=False in config to disable apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git
VBench
You need to manually install VBench:
pip install --no-deps vbench==0.1.1
# If the installation shows a warning that the installed vbench is not in PATH, you need to add it by:
export PATH="/path/to/vbench:$PATH"
VAE
You need to manually install cupy.
- For CUDA v11.2~11.8 (x86_64 / aarch64),
pip install cupy-cuda11x
- For CUDA v12.x (x86_64 / aarch64),
pip install cupy-cuda12x
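The choice between the two wheels can be expressed as a small lookup. This is a sketch under stated assumptions: `cupy_wheel_for` is a hypothetical helper, it covers only the two CUDA ranges listed above, and the script part assumes `torch` is already installed so that `torch.version.cuda` can supply the version string.

```python
def cupy_wheel_for(cuda_version):
    """Pick the CuPy wheel name for a CUDA version string such as '12.1'."""
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    if major == 11 and 2 <= minor <= 8:
        return "cupy-cuda11x"
    if major == 12:
        return "cupy-cuda12x"
    # Anything else is outside the ranges listed in this guide.
    raise ValueError(f"no wheel listed for CUDA {cuda_version}")


if __name__ == "__main__":
    import torch  # assumption: torch was installed in the earlier steps

    if torch.version.cuda is None:
        raise SystemExit("this torch build has no CUDA support")
    print(f"pip install {cupy_wheel_for(torch.version.cuda)}")
```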
Note that for VAE evaluation, you may run into the error `ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'`. In this case, open the file reporting this error (.../pytorchvideo/transforms/augmentations.py) and change it as follows:
# find the original line:
import torchvision.transforms.functional_tensor as F_t
# change to:
import torchvision.transforms._functional_tensor as F_t
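If you prefer not to edit the file by hand, the same one-line change can be applied with a short script. This is only a sketch: the helper names are hypothetical, and it assumes the file sits at `transforms/augmentations.py` inside the installed `pytorchvideo` package, as the error path above suggests. The package is located without being imported, since importing it is what triggers the error.

```python
import importlib.util
from pathlib import Path

OLD_IMPORT = "import torchvision.transforms.functional_tensor as F_t"
NEW_IMPORT = "import torchvision.transforms._functional_tensor as F_t"


def fix_functional_tensor_import(source):
    """Return the source text with the old import replaced by the new one."""
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def augmentations_path():
    """Find pytorchvideo/transforms/augmentations.py without importing it."""
    spec = importlib.util.find_spec("pytorchvideo")
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError("pytorchvideo is not installed")
    package_dir = Path(list(spec.submodule_search_locations)[0])
    return package_dir / "transforms" / "augmentations.py"


if __name__ == "__main__":
    path = augmentations_path()
    original = path.read_text()
    patched = fix_functional_tensor_import(original)
    if patched != original:
        path.write_text(patched)
        print(f"patched {path}")
    else:
        print(f"nothing to change in {path}")
```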