error: ushort format requires 0 <= number <= 65535
I tried to run the second cell from the tutorial on an Apple Silicon Mac and got this error:
```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:10000 for open-end generation.
error                                     Traceback (most recent call last)
Cell In[5], line 8
      4 synthesiser = pipeline("text-to-speech", "suno/bark")
      6 speech = synthesiser("Hello, my dog is cooler than you!", forward_params={"do_sample": True})
----> 8 scipy.io.wavfile.write("bark_out.wav", rate=speech["sampling_rate"], data=speech["audio"])

File /opt/homebrew/lib/python3.11/site-packages/scipy/io/wavfile.py:796, in write(filename, rate, data)
    793 bytes_per_second = fs*(bit_depth // 8)*channels
    794 block_align = channels * (bit_depth // 8)
--> 796 fmt_chunk_data = struct.pack('<HHIIHH', format_tag, channels, fs,
    797                              bytes_per_second, block_align, bit_depth)
    798 if not (dkind == 'i' or dkind == 'u'):
    799     # add cbSize field for non-PCM files
    800     fmt_chunk_data += b'\x00\x00'

error: ushort format requires 0 <= number <= 65535
```
I was getting the same error. Doing this fixed it for me:
```
import scipy.io.wavfile
from transformers import AutoProcessor, BarkModel

# Load the Bark processor and model directly instead of using pipeline()
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")

inputs = processor(""" Hello how are you """, voice_preset="v2/en_speaker_5")
audio_array = model.generate(**inputs)

# squeeze() drops the leading batch axis so the array is 1-D (mono)
audio_array = audio_array.cpu().numpy().squeeze()

sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)
```
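A possible explanation for why the `.squeeze()` matters, as a sketch rather than a confirmed diagnosis: the traceback shows `channels` being packed with the `H` (unsigned 16-bit) format, and `scipy.io.wavfile.write` treats 1-D data as mono and otherwise takes the second axis as the channel count. Assuming the pipeline returns audio with shape `(1, N)` (an assumption, not stated in this thread), `N` would be read as the number of channels and overflow the 16-bit field. The helpers `channels_for_shape` and `pack_channels` below are hypothetical stand-ins that only mimic that logic with the standard library; they are not scipy's code.

```python
import struct


def channels_for_shape(shape):
    """Hypothetical helper mimicking the channel rule described above:
    1-D data is mono, otherwise the second axis is the channel count."""
    return 1 if len(shape) == 1 else shape[1]


def pack_channels(shape):
    """Pack the channel count as an unsigned 16-bit value ('H'),
    the format used for that field in the traceback's struct.pack call."""
    return struct.pack("<H", channels_for_shape(shape))


n_samples = 100_000  # illustrative length, larger than 65535

# Assumed pipeline output shape (1, N): N is treated as the channel count.
try:
    pack_channels((1, n_samples))
    failed = False
except struct.error:
    failed = True  # the count does not fit in 16 bits

# After squeeze() the shape is (N,), so the data counts as one channel.
mono_field = pack_channels((n_samples,))
```

If that assumption holds, passing `speech["audio"].squeeze()` as `data` in the original pipeline snippet should avoid the error in the same way.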