Some sentences are not translated to English in a Korean-language video
The previous version did not have this problem.
The video: https://www.youtube.com/watch?v=ivET_OOjqps
Parameter settings:
Model:
medium
Language
Korean
Task
translate
VAD
silero-vad
VAD - Merge Window (s)
5
VAD - Max Merge Size (s)
35
VAD - Padding (s)
1
VAD - Prompt Window (s)
3
- Really? You can't kick it like this. help me Lets see how we'll proceed with a proper segment. First lyrics neded. ��知 successes please listen carefully to the rap part and dance dance dance You were looking for that? Thank you for watching! The last word is love. I can't let go of you, holding your hands. Love leaves me with a farewell and love turns around. I can't hold you and I can't let go of you. I can't let go of you and I can't let go of you. It's 2013. We've seen the lyrics of the song, The Change of Parting, by Cho Patta Family. The songwriter is... Kang Junho. And also, Yook Junghwan. The songwriter is all of us. Please tell us the point. The point is, the change of parting. You can focus on this part. Everyone has different reasons for parting, but when you part, it's hard to feel bad. The break-up and rest are also hard. But when you think about it, when you part... Why did you say that?<|transcribe|> 생각할수록 기분 나쁘네 싶은 이별의 말들이 있어요 그 당시에는 그렇게 절절할 수가 없었는데 말이죠 그래서 오늘 제목 지어주실 포인트는요 시간 지나고 보니 은근 열받는 이별할 때 말 말 말입니다 한 줄의 공간 가는 제목으로 다시 보내주세요 에디킴이 예시를 좀 알려주시죠 이런거 있습니다. 유학간다고 내 미래를 위해 헤어져 준다고 하더니 6개월만에 귀국한 전 남친 아 도피썩입니다 너한테 내가 너무 부족하다더니 누가 봐도 나보다 잘난 사람 만나 결혼하는 전 애인
- retty, pretty, you I'm out of energy. I'll announce the winner. My lips are drying up, and I'm sweating again. I know the answer, but what you want to hear, what you want to hear, I'll test myself. You're really pretty, pretty, pretty. Why don't you believe me? Why? No matter how much I say, no matter how much I say, your leap is waits, waits, waits for me.<|transcribe|> 말수가 없어진 너 더 불안해지는 나 매일 같은 퀴즈 반복되는 게임 난 항상 술래 머리를 새로 했나 오 손톱이 바뀌었을까 오 감이 오질 않아 네가 듣고 싶은 말 그 듣고 싶은 말 정말 힌트도 없는지 너 정말 이쁘다 이쁘다 이쁠다니까 왜 내 맘 믿지 않는건데 왜 말하고 말하고 아무리 말해도 화난 듯한 너의 그 표정 왜 왜 It's a secret. When I go out, I only wear Naningu. No way! I only wear Naningu when I'm a radio DJ. Really? If you can't believe it, go to Naningu.com right now. If you're the number one fashionista in Korea, Naningu.com Oh, you're wearing Naningu, too.
Do you mean the version from 1217d8b? I tried that version on the video above, and it does seem to translate more of the audio into English:
But the reason for this is probably that it's only passing chunks of size 150 seconds and above to Whisper:
Running whisper from 00:00.000 to 02:37.986 , duration: 157.986 expanded: 0.0
Running whisper from 02:37.986 to 05:08.898 , duration: 150.91200000000003 expanded: 0.0
Running whisper from 05:08.898 to 08:18.214 , duration: 189.31599999999997 expanded: 0.0
Running whisper from 08:18.214 to 11:04.770 , duration: 166.55599999999998 expanded: 0.0
Running whisper from 11:04.770 to 14:20.322 , duration: 195.55200000000002 expanded: 0.0
Running whisper from 14:20.322 to 17:18.210 , duration: 177.88800000000003 expanded: 0.0
Running whisper from 17:18.210 to 20:32.706 , duration: 194.4960000000001 expanded: 0.0
Running whisper from 20:32.706 to 23:05.922 , duration: 153.2159999999999 expanded: 0.0
Running whisper from 23:05.922 to 25:43.746 , duration: 157.82400000000007 expanded: 0.0
Running whisper from 25:43.746 to 28:51.334 , duration: 187.58799999999997 expanded: 0.0
Running whisper from 28:51.334 to 31:29.062 , duration: 157.72800000000007 expanded: 0.0
Running whisper from 31:29.062 to 34:16.674 , duration: 167.61199999999985 expanded: 0.0
Running whisper from 34:16.674 to 39:37.602 , duration: 320.9279999999999 expanded: 0.0
Running whisper from 39:37.602 to 42:40.962 , duration: 183.36000000000013 expanded: 0.0
Running whisper from 42:40.962 to 45:12.358 , duration: 151.39600000000019 expanded: 0.0
Running whisper from 45:12.358 to 49:00.162 , duration: 227.8040000000001 expanded: 0.0
Running whisper from 49:00.162 to 51:42.790 , duration: 162.6279999999997 expanded: 0.0
Running whisper from 51:42.790 to 53:01.514 , duration: 78.72400000000016 expanded: 0
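To double-check the chunk sizes in a log like this, a few lines of Python are enough. This is only a sketch: `chunk_durations` is a hypothetical helper (not part of the app), and it assumes the timestamps are always printed as `MM:SS.mmm`, as they are above.

```python
import re

# Matches lines such as:
# "Running whisper from 00:00.000 to 02:37.986 , duration: 157.986 expanded: 0.0"
# Assumption: timestamps are always formatted as MM:SS.mmm.
LINE_RE = re.compile(r"Running whisper from (\d+):(\d+\.\d+) to (\d+):(\d+\.\d+)")


def chunk_durations(log_text):
    """Return the duration in seconds of every chunk found in the log text."""
    durations = []
    for m in LINE_RE.finditer(log_text):
        start = int(m.group(1)) * 60 + float(m.group(2))
        end = int(m.group(3)) * 60 + float(m.group(4))
        durations.append(end - start)
    return durations


# Three lines copied from the log above (first, second, and last chunk).
sample = """\
Running whisper from 00:00.000 to 02:37.986 , duration: 157.986 expanded: 0.0
Running whisper from 02:37.986 to 05:08.898 , duration: 150.91200000000003 expanded: 0.0
Running whisper from 51:42.790 to 53:01.514 , duration: 78.72400000000016 expanded: 0
"""

durations = chunk_durations(sample)
print([round(d, 3) for d in durations])
print("chunks under 150 s:", sum(d < 150 for d in durations))
```

In the full log, only the final chunk (the tail of the video) is shorter than 150 seconds.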
The new version, by contrast, is mostly limited to chunks of 30 seconds and below. Still, you should mostly be able to recreate the old behavior in the new version by using the following settings:
VAD
silero-vad-expand-into-gaps
VAD - Merge Window (s)
1
VAD - Max Merge Size (s)
150
VAD - Padding (s)
1
VAD - Prompt Window (s)
0 or 3
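For intuition on why these values produce much longer chunks, here is a rough sketch of how a merge window and a max merge size could interact. It is not the app's actual code: `merge_segments` is a hypothetical function, the speech segments are made up, and the sketch assumes that "Merge Window" is the largest silence that may be bridged and "Max Merge Size" is the longest allowed merged chunk. Padding, the prompt window, and expanding into gaps are not modeled.

```python
def merge_segments(segments, merge_window, max_merge_size):
    """Greedily merge (start, end) speech segments, in seconds, sorted by start.

    A segment is joined to the current chunk when the silence between them is
    at most merge_window and the merged chunk would not exceed max_merge_size.
    (Assumed semantics, inferred from the setting names.)
    """
    merged = []
    for start, end in segments:
        if merged:
            cur_start, cur_end = merged[-1]
            gap = start - cur_end
            if gap <= merge_window and end - cur_start <= max_merge_size:
                # Extend the current chunk instead of starting a new one.
                merged[-1] = (cur_start, end)
                continue
        merged.append((start, end))
    return merged


# Hypothetical speech segments (seconds), not taken from the video above.
speech = [(0.0, 20.0), (20.5, 50.0), (50.8, 100.0), (101.0, 160.0), (170.0, 180.0)]

# Settings from the original report: nothing can merge past 35 seconds.
print(merge_segments(speech, merge_window=5, max_merge_size=35))

# Settings suggested above: nearby segments merge into chunks of up to 150 seconds.
print(merge_segments(speech, merge_window=1, max_merge_size=150))
```

With the first settings every segment stays separate, while with the second the first three segments collapse into a single 100-second chunk.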
The downside with the old method is that it's less accurate in terms of getting the correct timings. I also found the old method to be worse in terms of handling different forms of audio - for instance, it would often ignore or skip transcribing the lyrics in the opening sequence.
But yeah, the new version does seem to be worse when translating something into English. Or at least for this particular video.
Thanks. I changed the model to Large with my parameters, and both the lyrics and the conversation got better. Just a few 10s sentences are ignored.