reach-vb (HF staff) committed on
Commit 0a0de70
1 Parent(s): 26ae39a

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -19,8 +19,8 @@ We further release a set of stereophonic capable models. Those were fine tuned f
  from the mono models. The training data is otherwise identical and capabilities and limitations are shared with the base modes. The stereo models work by getting 2 streams of tokens from the EnCodec model, and interleaving those using
  the delay pattern.
 
- Stereophonic sound, also known as stereo, is a technique used to reproduce sound with depth and direction.
- It uses two separate audio channels played through speakers or headphones arranged so that it sounds like you're listening from different angles.
+ Stereophonic sound, also known as stereo, is a technique used to reproduce sound with depth and direction.
+ It uses two separate audio channels played through speakers (or headphones), which creates the impression of sound coming from multiple directions.
 
  MusicGen is a text-to-music model capable of genreating high-quality music samples conditioned on text descriptions or audio prompts.
  It is a single stage auto-regressive Transformer model trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz.
@@ -82,7 +82,7 @@ import torch
  import soundfile as sf
  from transformers import pipeline
 
- synthesiser = pipeline("text-to-audio", "facebook/musicgen-stereo-small", device="cuda:0", torch_dtype=torch.float16)
+ synthesiser = pipeline("text-to-audio", "facebook/musicgen-stereo-medium", device="cuda:0", torch_dtype=torch.float16)
 
  music = synthesiser("lo-fi music with a soothing melody", forward_params={"max_new_tokens": 256})
 
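
Note on the README text touched by this commit: it says the stereo models take 2 streams of tokens from the EnCodec model and interleave them using the delay pattern. The snippet below is a rough NumPy sketch of that idea, not the model's actual code. Several things in it are assumptions made for illustration and are not stated in the README: the function names `interleave_stereo` and `apply_delay_pattern`, the `PAD` placeholder value, the left/right ordering of the interleaved codebooks, and the per-codebook delays `[0, 0, 1, 1, 2, 2, 3, 3]`.

```python
import numpy as np

PAD = -1  # hypothetical placeholder for "no token yet"; not the model's real special token


def interleave_stereo(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Interleave two [K, T] codebook grids into one [2K, T] grid.

    Assumed row order: L0, R0, L1, R1, ... (left and right alternate).
    """
    if left.shape != right.shape:
        raise ValueError("left and right must have the same [K, T] shape")
    num_codebooks, num_frames = left.shape
    out = np.empty((2 * num_codebooks, num_frames), dtype=left.dtype)
    out[0::2] = left   # even rows hold the left-channel codebooks
    out[1::2] = right  # odd rows hold the right-channel codebooks
    return out


def apply_delay_pattern(codes: np.ndarray, delays: list[int], pad: int = PAD) -> np.ndarray:
    """Shift each codebook row to the right by its delay, filling gaps with `pad`."""
    num_rows, num_frames = codes.shape
    if len(delays) != num_rows:
        raise ValueError("need exactly one delay per codebook row")
    out = np.full((num_rows, num_frames + max(delays)), pad, dtype=codes.dtype)
    for row, delay in enumerate(delays):
        out[row, delay:delay + num_frames] = codes[row]
    return out


# Toy input: 4 codebooks per channel (as in the README), only 3 frames for readability.
left = np.arange(0, 12).reshape(4, 3)
right = np.arange(100, 112).reshape(4, 3)

stereo = interleave_stereo(left, right)                                   # shape (8, 3)
delayed = apply_delay_pattern(stereo, delays=[0, 0, 1, 1, 2, 2, 3, 3])  # shape (8, 6)
print(delayed)
```

In this sketch each left/right pair shares a delay, so the two channels of a given codebook level would be predicted at the same decoding step. Whether the released checkpoints use exactly this layout should be checked against the model configuration.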