Some sentences not translated to English in Korean-language video

#3
by MeroyTruman - opened

The previous version did not have this problem.
Video: https://www.youtube.com/watch?v=ivET_OOjqps
Parameter settings:
Model: medium
Language: Korean
Task: translate
VAD: silero-vad
VAD - Merge Window (s): 5
VAD - Max Merge Size (s): 35
VAD - Padding (s): 1
VAD - Prompt Window (s): 3

  • Really? You can't kick it like this. help me Lets see how we'll proceed with a proper segment. First lyrics neded. ��知 successes please listen carefully to the rap part and dance dance dance You were looking for that? Thank you for watching! The last word is love. I can't let go of you, holding your hands. Love leaves me with a farewell and love turns around. I can't hold you and I can't let go of you. I can't let go of you and I can't let go of you. It's 2013. We've seen the lyrics of the song, The Change of Parting, by Cho Patta Family. The songwriter is... Kang Junho. And also, Yook Junghwan. The songwriter is all of us. Please tell us the point. The point is, the change of parting. You can focus on this part. Everyone has different reasons for parting, but when you part, it's hard to feel bad. The break-up and rest are also hard. But when you think about it, when you part... Why did you say that?<|transcribe|> 생각할수록 기분 나쁘네 싶은 이별의 말들이 있어요 그 당시에는 그렇게 절절할 수가 없었는데 말이죠 그래서 오늘 제목 지어주실 포인트는요 시간 지나고 보니 은근 열받는 이별할 때 말 말 말입니다 한 줄의 공간 가는 제목으로 다시 보내주세요 에디킴이 예시를 좀 알려주시죠 이런거 있습니다. 유학간다고 내 미래를 위해 헤어져 준다고 하더니 6개월만에 귀국한 전 남친 아 도피썩입니다 너한테 내가 너무 부족하다더니 누가 봐도 나보다 잘난 사람 만나 결혼하는 전 애인
  • retty, pretty, you I'm out of energy. I'll announce the winner. My lips are drying up, and I'm sweating again. I know the answer, but what you want to hear, what you want to hear, I'll test myself. You're really pretty, pretty, pretty. Why don't you believe me? Why? No matter how much I say, no matter how much I say, your leap is waits, waits, waits for me.<|transcribe|> 말수가 없어진 너 더 불안해지는 나 매일 같은 퀴즈 반복되는 게임 난 항상 술래 머리를 새로 했나 오 손톱이 바뀌었을까 오 감이 오질 않아 네가 듣고 싶은 말 그 듣고 싶은 말 정말 힌트도 없는지 너 정말 이쁘다 이쁘다 이쁠다니까 왜 내 맘 믿지 않는건데 왜 말하고 말하고 아무리 말해도 화난 듯한 너의 그 표정 왜 왜 It's a secret. When I go out, I only wear Naningu. No way! I only wear Naningu when I'm a radio DJ. Really? If you can't believe it, go to Naningu.com right now. If you're the number one fashionista in Korea, Naningu.com Oh, you're wearing Naningu, too.

Do you mean the version from 1217d8b? I tried that version on the video above, and it does seem to translate more of the audio into English:

But the reason for this is probably that it only passes chunks of 150 seconds or more to Whisper (apart from the final, shorter chunk):

Running whisper from  00:00.000  to  02:37.986 , duration:  157.986 expanded:  0.0
Running whisper from  02:37.986  to  05:08.898 , duration:  150.91200000000003 expanded:  0.0
Running whisper from  05:08.898  to  08:18.214 , duration:  189.31599999999997 expanded:  0.0
Running whisper from  08:18.214  to  11:04.770 , duration:  166.55599999999998 expanded:  0.0
Running whisper from  11:04.770  to  14:20.322 , duration:  195.55200000000002 expanded:  0.0
Running whisper from  14:20.322  to  17:18.210 , duration:  177.88800000000003 expanded:  0.0
Running whisper from  17:18.210  to  20:32.706 , duration:  194.4960000000001 expanded:  0.0
Running whisper from  20:32.706  to  23:05.922 , duration:  153.2159999999999 expanded:  0.0
Running whisper from  23:05.922  to  25:43.746 , duration:  157.82400000000007 expanded:  0.0
Running whisper from  25:43.746  to  28:51.334 , duration:  187.58799999999997 expanded:  0.0
Running whisper from  28:51.334  to  31:29.062 , duration:  157.72800000000007 expanded:  0.0
Running whisper from  31:29.062  to  34:16.674 , duration:  167.61199999999985 expanded:  0.0
Running whisper from  34:16.674  to  39:37.602 , duration:  320.9279999999999 expanded:  0.0
Running whisper from  39:37.602  to  42:40.962 , duration:  183.36000000000013 expanded:  0.0
Running whisper from  42:40.962  to  45:12.358 , duration:  151.39600000000019 expanded:  0.0
Running whisper from  45:12.358  to  49:00.162 , duration:  227.8040000000001 expanded:  0.0
Running whisper from  49:00.162  to  51:42.790 , duration:  162.6279999999997 expanded:  0.0
Running whisper from  51:42.790  to  53:01.514 , duration:  78.72400000000016 expanded:  0
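To check chunk sizes like these without reading the log by eye, here is a minimal sketch that parses the "Running whisper from ... to ..." lines and recomputes each duration from the timestamps. The helper name `chunk_durations` is made up for illustration, and the regex assumes the `mm:ss.mmm` format shown in the log above.

```python
import re

# Matches the start and end timestamps (mm:ss.mmm) of one log line.
LOG_LINE = re.compile(
    r"Running whisper from\s+(\d+):(\d+\.\d+)\s+to\s+(\d+):(\d+\.\d+)"
)


def chunk_durations(log_text):
    """Return the duration in seconds of each chunk found in the log text."""
    durations = []
    for match in LOG_LINE.finditer(log_text):
        start_min, start_sec, end_min, end_sec = match.groups()
        start = int(start_min) * 60 + float(start_sec)
        end = int(end_min) * 60 + float(end_sec)
        durations.append(end - start)
    return durations


# Two lines copied from the log above (first and last chunk).
sample = """\
Running whisper from  00:00.000  to  02:37.986 , duration:  157.986 expanded:  0.0
Running whisper from  51:42.790  to  53:01.514 , duration:  78.72400000000016 expanded:  0
"""

durations = chunk_durations(sample)
for seconds in durations:
    print(f"{seconds:.3f} s")
print("shortest chunk:", f"{min(durations):.3f} s")
```

Running this over the full log shows which chunks fall below a given length; in the log above only the last one does.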

The new version, by contrast, is mostly limited to chunks of 30 seconds or less. Still, you should be able to mostly recreate the old behavior in the new version by using the following settings:

VAD: silero-vad-expand-into-gaps
VAD - Merge Window (s): 1
VAD - Max Merge Size (s): 150
VAD - Padding (s): 1
VAD - Prompt Window (s): 0 or 3
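To give a rough idea of why Max Merge Size changes the chunk length, here is a minimal sketch of a greedy merge over detected speech segments. It is not the app's actual code: the function `merge_segments`, the merge rule, and the sample segments are all assumptions for illustration, and padding and the expand-into-gaps behavior are not modeled.

```python
def merge_segments(segments, merge_window, max_merge_size):
    """Greedily merge (start, end) speech segments, given in seconds.

    Assumed rule: a segment is merged into the previous chunk when the
    silence between them is at most merge_window and the merged chunk
    would not be longer than max_merge_size.
    """
    merged = []
    for start, end in sorted(segments):
        if merged:
            prev_start, prev_end = merged[-1]
            gap = start - prev_end
            if gap <= merge_window and end - prev_start <= max_merge_size:
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


# Hypothetical speech segments, as a VAD might report them.
speech = [(0.0, 10.0), (12.0, 25.0), (26.0, 40.0), (50.0, 60.0)]

short_chunks = merge_segments(speech, merge_window=5, max_merge_size=35)
long_chunks = merge_segments(speech, merge_window=5, max_merge_size=150)
print(short_chunks)  # [(0.0, 25.0), (26.0, 40.0), (50.0, 60.0)]
print(long_chunks)   # [(0.0, 40.0), (50.0, 60.0)]
```

With the larger limit, more neighboring segments end up in the same chunk, so Whisper sees more context per call.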

The downside of the old method is that its timings are less accurate. I also found the old method to be worse at handling different forms of audio. For instance, it would often ignore or skip transcribing the lyrics in the opening sequence.

But yeah, the new version does seem to be worse when translating something into English. Or at least for this particular video.

Thanks. I changed the model to large with my parameters, and both the lyrics and the conversation got better. Just a few 10s sentences were ignored.

MeroyTruman changed discussion status to closed
