Spaces: StarPigeon/ViDove
JiaenLiu committed on Mar 21, 2023

Commit 7511df3
1 Parent(s): 144c78b

in progress

Former-commit-id: be461c9f32bacab23a5b56b51f601a7afe3ad51d

Files changed (3)

README.md CHANGED

@@ -14,6 +14,10 @@ quick start:
 example online: python3 pipeline.py --link https://www.youtube.com/watch?v=61c4dn6851g --download ./downloads --result ./results --video_name SO_I_CHOSE_RANDOM
+python3 pipeline.py --link https://www.youtube.com/watch?v=VrigMmXt9A0 --video_name Ukraine_and_its_Global_Impact
+python3 pipeline.py --video_file '/home/jiaenliu/project-t/downloads/audio/Ukraine_and_its_Global_Impact.mp4' -v --video_name Ukraine_and_its_Global_Impact
 example offline: python3 pipeline.py --local_path test_translation.m4a --result ./results --video_name test_translation
 example text input: python pipeline.py --text_file "/home/jiaenliu/project-t/results/huanghe_translation_en.txt" --result "/home/jiaenliu/project-t/results" --video_name "huanghe_test"
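
The README examples above exercise four input flags (`--link`, `--local_path`, `--text_file`, `--video_file`) plus `--download`, `--result`, `--video_name` and `-v`. The parser itself is not part of this diff, so the following is only a minimal sketch of how such a command line could be declared with `argparse`. The mutually exclusive input group, the default directories, and the treatment of `-v` as a plain boolean are all assumptions, and `build_parser` and `parse_example` are hypothetical helper names, not functions from pipeline.py.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in the README examples."""
    parser = argparse.ArgumentParser(prog="pipeline.py")
    # Assumption: exactly one input source is supplied per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--link", help="URL of an online video")
    source.add_argument("--local_path", help="path to a local audio file")
    source.add_argument("--text_file", help="path to an existing transcript")
    source.add_argument("--video_file", help="path to a local video file")
    # Assumption: defaults mirror the directories used in the examples.
    parser.add_argument("--download", default="./downloads")
    parser.add_argument("--result", default="./results")
    parser.add_argument("--video_name", required=True)
    # Assumption: -v is a boolean switch; the diff does not show what it controls.
    parser.add_argument("-v", action="store_true")
    return parser


def parse_example(command: str) -> argparse.Namespace:
    """Parse a README-style command, skipping the interpreter and script name."""
    tokens = shlex.split(command)
    return build_parser().parse_args(tokens[2:])
```

With this sketch, `parse_example("python3 pipeline.py --link <url> --video_name demo")` yields a namespace whose `result` falls back to `./results`, which matches the way the newly added examples omit `--result`.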

__pycache__/srt2ass.cpython-38.pyc ADDED

Binary file (13.9 kB).

pipeline.py CHANGED

@@ -119,8 +119,8 @@ if not args.only_srt:
     print('ASS subtitle saved as: ' + assSub_en)
 # Split the video script by sentences and create chunks within the token limit
-n_threshold = 1500  # Token limit for the GPT-3 model
-script_split = script_input.split('.')
+n_threshold = 1000  # Token limit for the GPT-3 model
+script_split = script_input.split('\n')
 script_arr = []
 script = ""
@@ -142,12 +142,15 @@ for s in script_arr:
             model=model_name,
             messages = [
                 {"role": "system", "content": "You are a helpful assistant that translates English to Chinese and have decent background in starcraft2."},
+                {"role": "system", "content": "Your translation has to keep the orginal format and be as accurate as possible."},
+                {"role": "system", "content": "There is no need for you to add any comments or notes."},
                 {"role": "user", "content": 'Translate the following English text to Chinese: "{}"'.format(s)}
             ],
             temperature=0.15
         )
         with open(f"{RESULT_PATH}/{VIDEO_NAME}/{VIDEO_NAME}_zh.srt", 'a+') as f:
             f.write(response['choices'][0]['message']['content'].strip())
+            f.write("\n")
     if model_name == "text-davinci-003":
         prompt = f"Please help me translate this into Chinese:\n\n{s}\n\n"
@@ -164,6 +167,7 @@ for s in script_arr:
         with open(f"{RESULT_PATH}/{VIDEO_NAME}/{VIDEO_NAME}_zh.srt", 'a+') as f:
             f.write(response['choices'][0]['text'].strip())
+            f.write("\n")
 if not args.only_srt:
     assSub_zh = srt2ass(f"{RESULT_PATH}/{VIDEO_NAME}/{VIDEO_NAME}_zh.srt", "default", "No", "Modest")
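
The first pipeline.py hunk switches the split from `'.'` to `'\n'` and lowers `n_threshold` to 1000, but the loop that fills `script_arr` lies outside the hunks shown. As a rough illustration of line-based chunking under a token budget, here is a minimal sketch. It is not the commit's actual loop: `chunk_script` and `rough_token_count` are hypothetical names, and counting whitespace-separated words is only a stand-in for a real tokenizer.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Assumption: whitespace-separated words approximate model tokens."""
    return len(text.split())


def chunk_script(
    script_input: str,
    n_threshold: int = 1000,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Group whole lines into chunks whose estimated size stays within n_threshold."""
    script_arr: List[str] = []
    current: List[str] = []
    for line in script_input.split("\n"):
        candidate = "\n".join(current + [line])
        # Flush the running chunk before it would exceed the budget.
        # A single line longer than the budget still becomes its own chunk.
        if current and count_tokens(candidate) > n_threshold:
            script_arr.append("\n".join(current))
            current = [line]
        else:
            current.append(line)
    if current:
        script_arr.append("\n".join(current))
    return script_arr
```

Because chunks are cut only at line boundaries, joining them again with `"\n"` reproduces the input. That is the same reason the later hunks write a `"\n"` after each translated chunk: without it, the last line of one chunk and the first line of the next would run together in the `_zh.srt` file.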