Mirror of https://github.com/hpcaitech/Open-Sora.git, synced 2026-04-11 05:13:31 +02:00
Dev/pxy (#69)
Squashed commit messages (duplicates and terminal-escape garbling collapsed):

* update scoring/matching
* update scene_cut
* update readme
* extract frames using opencv everywhere
* filter panda10m
* add ocr
* add main.sh
* remove main.sh
* update scoring
* update filter_panda10m
This commit is contained in:
parent
c6cbc7a6bf
commit
f7329dda38
```diff
@@ -245,32 +245,6 @@ def filter_panda10m_timestamp(meta_path):
     print(f"New meta (shape={meta.shape}) saved to '{out_path}'.")
 
 
-def append_timestamp(meta_path):
-
-    def process_single_row(row):
-        path = row['path']
-        wo_ext, ext = os.path.splitext(path)
-        json_path = f'{wo_ext}.json'
-        try:
-            with open(json_path, 'r') as f:
-                data = json.load(f)
-            timestamp = data['clips'][2:-2]
-            a, b = timestamp.split(', ')
-            timestamp = f"('{a}', '{b}')"
-        except Exception as e:
-            timestamp = ''
-        return timestamp
-
-    meta = pd.read_csv(meta_path)
-    ret = apply(meta, process_single_row, axis=1)
-    meta['timestamp'] = ret
-
-    wo_ext, ext = os.path.splitext(meta_path)
-    out_path = f"{wo_ext}_timestamp{ext}"
-    meta.to_csv(out_path, index=False)
-    print(f"New meta (shape={meta.shape}) saved to '{out_path}'.")
-
-
 def parse_args():
     parser = argparse.ArgumentParser()
     parser.add_argument('--meta_path', type=str, nargs='+')
```
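The removed `append_timestamp` fills a new column by applying a per-row function to the meta DataFrame. The `apply(meta, fn, axis=1)` call in the diff appears to be a project helper wrapping pandas' own `DataFrame.apply`; a minimal plain-pandas sketch of that pattern, using toy data and a stand-in row function rather than the real JSON lookup:

```python
import pandas as pd

# Toy meta table: the real meta CSV has (at least) a 'path' column,
# which is the only column the removed code reads.
meta = pd.DataFrame({'path': ['clips/a.mp4', 'clips/b.mp4']})

def process_single_row(row):
    # Illustrative stand-in for the real per-row JSON lookup:
    # derive one value from the row's 'path'.
    return row['path'].upper()

# axis=1 passes each row (as a Series) to the function; the result
# becomes a new column, as in the deleted append_timestamp.
meta['timestamp'] = meta.apply(process_single_row, axis=1)
print(meta['timestamp'].tolist())
```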
```diff
@@ -282,7 +256,6 @@ def parse_args():
 
 if __name__ == '__main__':
     args = parse_args()
-    # append_timestamp(args.meta_path)
 
     text_set = get_10m_set()
     for x in args.meta_path:
```
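The main block can loop over `args.meta_path` because the argument is declared with `nargs='+'`, which collects one or more values into a list. A small self-contained sketch of that argparse behaviour:

```python
import argparse

# --meta_path accepts one or more CSV paths, exactly as declared in
# the script's parse_args; argparse collects them into a list.
parser = argparse.ArgumentParser()
parser.add_argument('--meta_path', type=str, nargs='+')

# Parse a sample command line (file names here are illustrative).
args = parser.parse_args(['--meta_path', 'a.csv', 'b.csv'])
print(args.meta_path)
```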
````diff
@@ -1,9 +1,15 @@
 # Scene Detection and Video Splitting
 
+- [Scene Detection and Video Splitting](#scene-detection-and-video-splitting)
+  - [Prepare Meta Files](#prepare-meta-files)
+  - [Scene Detection](#scene-detection)
+  - [Video Splitting](#video-splitting)
+
 In many cases, raw videos contain several scenes and are too long for training. Thus, it is essential to split them into shorter
 clips based on scenes. Here, we provide code for scene detection and video splitting.
 
-## Prepare a meta file
-At this step, you should have a raw video dataset prepared. We need a meta file for the dataset. To create a meta file from a folder, run:
+## Prepare Meta Files
+At this step, you should have a raw video dataset prepared. A meta file of the dataset information is needed for data processing. To create a meta file from a folder, run:
+
 ```bash
 python -m tools.datasets.convert video /path/to/video/folder --output /path/to/save/meta.csv
````
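The README's `tools.datasets.convert` command scans a video folder and writes a meta CSV with one row per video. The exact columns it emits are not shown in this diff; a hedged sketch that builds only the `path` column (the one column the filtering code above actually reads), with `build_meta` being a hypothetical helper name:

```python
import os
import pandas as pd

def build_meta(folder, out_path):
    """Sketch of a meta-file builder: one row per video file.

    Only a 'path' column is assumed here; the real convert tool
    may emit additional columns.
    """
    paths = [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if name.endswith(('.mp4', '.mkv', '.avi'))
    ]
    meta = pd.DataFrame({'path': paths})
    meta.to_csv(out_path, index=False)
    return meta
```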