hxwang
|
5730060f41
|
[ckpt] mitigate gpu mem peak when loading ckpt
|
2025-03-26 18:04:16 +08:00 |
|
hxwang
|
bc4aa4f217
|
[ckpt] fix shape error when gathering weights under sp + dp parallelism
|
2025-03-26 15:43:00 +08:00 |
|
Alex Gherghina
|
8202ca13df
|
Import Tuple from typing instead of torch
|
2025-03-25 10:38:55 +02:00 |
|
Zheng Zangwei (Alex Zheng)
|
febf3ad4b2
|
Update Open-Sora 2.0 (#807)
* upload v2.0
* update docs
* [hotfix] fit latest fa3 (#802)
* update readme
* update readme
* update readme
* update train readme
* update readme
* update readme: motion score
* cleaning video dc ae WIP
* update config
* add dependency functions
* undo cleaning
* use latest dcae
* complete high compression training
* update hcae config
* cleaned up vae
* update ae.md
* further cleanup
* update vae & ae paths
* align naming of ae
* [hotfix] fix ring attn bwd for fa3 (#803)
* train ae default without wandb
* update config
* update evaluation results
* added hcae report
* update readme
* update readme demo
* update readme demo
* update readme gif
* display demo directly in readme
* update paper
* delete files
---------
Co-authored-by: Hongxin Liu <lhx0217@gmail.com>
Co-authored-by: Shen-Chenhui <shen_chenhui@u.nus.edu>
Co-authored-by: wuxiwen <wuxiwen.simon@gmail.com>
|
2025-03-12 13:14:22 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
f1c6b8b88e
|
open-sora v1.3 code upload (#786)
Co-authored-by: gxyes <gxynoz@gmail.com>
|
2025-02-20 16:50:24 +08:00 |
|
Gao, Ruiyuan
|
df5668cdf1
|
fix bug at mha, MaskGenerator; improve ckpt_utils.py (#609)
* fix bug at mha in blocks.py
* fix bug in MaskGenerator
* align logging style in ckpt_utils.py
|
2025-02-20 16:40:47 +08:00 |
|
Hongxin Liu
|
70ca63f30b
|
[feature] support async ckpt & pin memory cache (#760)
* [feature] support async ckpt
* [feature] support pin memory cache
* [doc] update readme
|
2024-12-20 10:30:49 +08:00 |
|
ZXMMD
|
a29424c237
|
fix ckpt_utils.py (#580)
|
2024-07-12 14:52:21 +08:00 |
|
Jirka Borovec
|
a7b6aacc99
|
lint: unify setting in pyproject.toml (#583)
* lint: unify setting in `pyproject.toml`
* apply pre-commit
|
2024-07-12 14:50:13 +08:00 |
|
Tom Young
|
194e2204c1
|
Merge pull request #169 from hpcaitech/hotfix/cut
update default shorter_size
|
2024-07-04 11:25:07 +08:00 |
|
pxy
|
2ac4900c81
|
update default shorter_size
|
2024-07-04 03:14:08 +00:00 |
|
zhengzangw
|
7e325e4e7b
|
Merge branch 'main' of https://github.com/hpcaitech/Open-Sora into main
|
2024-06-27 14:02:04 +00:00 |
|
zhengzangw
|
eb0ba30484
|
Merge branch 'main' of github.com:hpcaitech/Open-Sora-dev into main
|
2024-06-27 07:11:11 +00:00 |
|
Hongxin Liu
|
332d9fc9c9
|
[feature] make timer optional and make reduce bucket size configurable (#549)
* [feature] make reduce bucket size configurable
* [feature] make timer optional
|
2024-06-27 13:37:54 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
45df92849c
|
Merge pull request #156 from hpcaitech/feature/causal_atten
Added causal mask in Attention forward pass
|
2024-06-26 23:36:37 +08:00 |
|
zhengzangw
|
4b2b47b34d
|
[fix] pixart sampling
|
2024-06-26 07:00:24 +00:00 |
|
FrankLeeeee
|
3552145f84
|
[sp] updated precision test
|
2024-06-25 06:17:36 +00:00 |
|
FrankLeeeee
|
6bb2c599b6
|
Merge remote-tracking branch 'upstream/main' into hotfix/fix-sp
|
2024-06-24 09:08:21 +00:00 |
|
Jiacheng Yang
|
00fef1d1af
|
fix SeqParallelMultiHeadCrossAttention for consistent results in distributed mode (#510)
|
2024-06-24 17:07:49 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
455f9e7674
|
Merge pull request #161 from hpcaitech/hotfix/dataset
[data] added error handling to dataset
|
2024-06-24 16:54:40 +08:00 |
|
FrankLeeeee
|
6a72b8910b
|
[data] added error handling to dataset
|
2024-06-24 08:53:10 +00:00 |
|
zhengzangw
|
f40ea2270c
|
Merge branch 'main' of github.com:hpcaitech/Open-Sora-dev into main
|
2024-06-24 07:04:17 +00:00 |
|
zhengzangw
|
491403218d
|
update for pixart
|
2024-06-24 07:04:08 +00:00 |
|
Frank Lee
|
ee1c79a898
|
[sp] added padding (#160)
|
2024-06-24 13:59:29 +08:00 |
|
zhengzangw
|
9a9a6c2f3e
|
[fix] better support local ckpt
|
2024-06-22 15:54:27 +00:00 |
|
zhengzangw
|
7115864314
|
[fix] HF loading
|
2024-06-22 15:41:32 +00:00 |
|
zhengzangw
|
cd12584034
|
handle av error
|
2024-06-22 13:26:55 +00:00 |
|
zhengzangw
|
a6bdabe286
|
minor fix
|
2024-06-21 19:22:02 +00:00 |
|
zhengzangw
|
7aa940f20d
|
Merge branch 'main' of https://github.com/hpcaitech/Open-Sora into dev/v1.2
|
2024-06-21 19:17:30 +00:00 |
|
zhengzangw
|
20d1584a1c
|
[fix] support stdit1 training
|
2024-06-21 19:03:30 +00:00 |
|
zhengzangw
|
dec17bd990
|
[feat] reduce memory leakage in dataloader and pyav
|
2024-06-21 18:23:30 +00:00 |
|
Zheng Zangwei (Alex Zheng)
|
9b668e1c4e
|
Merge pull request #523 from BurkeHulk/hotfix/fp16_nan_output
Force fp16 input to fp32 to avoid nan output in timestep_transform
|
2024-06-21 18:01:17 +08:00 |
|
HangXu
|
04d2ee0182
|
Force fp16 input to fp32 to avoid nan output in timestep_transform
|
2024-06-21 11:15:39 +03:00 |
|
zhengzangw
|
f32c1173b7
|
config for local load
|
2024-06-20 10:23:38 +00:00 |
|
rangoliu
|
81524e675e
|
fix ar keys (#500)
|
2024-06-20 17:51:24 +08:00 |
|
Shen Chenhui
|
416837a86b
|
Hotfix/vae (#502)
* fix assert
* fix vae config; update path
---------
Co-authored-by: Shen-Chenhui <shen_chenhui@u.nus.edu>
|
2024-06-20 17:49:16 +08:00 |
|
HangXu
|
8f239c87bf
|
Added causal mask in Attention forward pass
|
2024-06-20 11:48:42 +03:00 |
|
Zheng Zangwei (Alex Zheng)
|
4cbf3c33b8
|
Hotfix/t5 load (#487)
* hotfix
* hotfix for stdit
* hotfix for vae
|
2024-06-19 23:15:29 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
396307c050
|
Hotfix/t5 load (#486)
* hotfix
* hotfix for stdit
|
2024-06-19 23:03:15 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
85f20274a0
|
hotfix (#484)
|
2024-06-19 22:47:40 +08:00 |
|
Zheng Zangwei (Alex Zheng)
|
ccb85fc3c3
|
hotfix (#482)
|
2024-06-19 22:05:47 +08:00 |
|
Shen Chenhui
|
49536bd923
|
Merge pull request #154 from hpcaitech/hotfix/t5_assert
fix assertion
|
2024-06-19 21:22:51 +08:00 |
|
Shen-Chenhui
|
1602768d5b
|
fix assertion
|
2024-06-19 13:22:04 +00:00 |
|
Zheng Zangwei (Alex Zheng)
|
403772eee1
|
Docs/fix zangwei (#474)
* [docs] fix training data num
* [docs] update sp
* add support for issue #470
|
2024-06-19 16:53:53 +08:00 |
|
Jianshu Guo
|
17cce908b2
|
fix: multi-fps bug. In multi-fps training, extracting frames at a given frame_interval changes the effective fps of the extracted frames. (#444)
Co-authored-by: Jianshu Guo <fguojianshu@gmail.com>
|
2024-06-18 16:20:11 +08:00 |
|
jackmappotion
|
2229f01b35
|
♻️ Directory creation to use os.makedirs with exist_ok (#435)
Simplified code and improved readability / Ensured functionality remains the same by allowing directory to exist without error
|
2024-06-18 07:41:02 +08:00 |
|
FrankLeeeee
|
7ba29d3439
|
[doc] resolved conflict in readme
|
2024-06-17 23:18:40 +00:00 |
|
zhengzangw
|
525e29abbe
|
reformat and update docs
|
2024-06-17 15:37:23 +00:00 |
|
zhengzangw
|
30e276166c
|
update
|
2024-06-17 13:42:27 +00:00 |
|
Shen Chenhui
|
f0c98dd186
|
Merge pull request #148 from hpcaitech/feature/docs_v1.2
Feature/docs v1.2
|
2024-06-17 17:18:27 +08:00 |
|