Gpagejr12 committed on
Commit 4504a34 · verified · 1 Parent(s): 984c220

Delete demos_musicgen_demo.ipynb

Files changed (1)
  1. demos_musicgen_demo.ipynb +0 -232
demos_musicgen_demo.ipynb DELETED
@@ -1,232 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# MusicGen\n",
- "Welcome to MusicGen's demo jupyter notebook. Here you will find a series of self-contained examples of how to use MusicGen in different settings.\n",
- "\n",
- "First, we start by initializing MusicGen, you can choose a model from the following selection:\n",
- "1. `facebook/musicgen-small` - 300M transformer decoder.\n",
- "2. `facebook/musicgen-medium` - 1.5B transformer decoder.\n",
- "3. `facebook/musicgen-melody` - 1.5B transformer decoder also supporting melody conditioning.\n",
- "4. `facebook/musicgen-large` - 3.3B transformer decoder.\n",
- "\n",
- "We will use the `facebook/musicgen-small` variant for the purpose of this demonstration."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [],
- "source": [
- "from audiocraft.models import MusicGen\n",
- "from audiocraft.models import MultiBandDiffusion\n",
- "\n",
- "USE_DIFFUSION_DECODER = False\n",
- "# Using small model, better results would be obtained with `medium` or `large`.\n",
- "model = MusicGen.get_pretrained('facebook/musicgen-small')\n",
- "if USE_DIFFUSION_DECODER:\n",
- " mbd = MultiBandDiffusion.get_mbd_musicgen()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next, let us configure the generation parameters. Specifically, you can control the following:\n",
- "* `use_sampling` (bool, optional): use sampling if True, else do argmax decoding. Defaults to True.\n",
- "* `top_k` (int, optional): top_k used for sampling. Defaults to 250.\n",
- "* `top_p` (float, optional): top_p used for sampling, when set to 0 top_k is used. Defaults to 0.0.\n",
- "* `temperature` (float, optional): softmax temperature parameter. Defaults to 1.0.\n",
- "* `duration` (float, optional): duration of the generated waveform. Defaults to 30.0.\n",
- "* `cfg_coef` (float, optional): coefficient used for classifier free guidance. Defaults to 3.0.\n",
- "\n",
- "When left unchanged, MusicGen will revert to its default parameters."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "model.set_generation_params(\n",
- " use_sampling=True,\n",
- " top_k=250,\n",
- " duration=30\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next, we can go ahead and start generating music using one of the following modes:\n",
- "* Unconditional samples using `model.generate_unconditional`\n",
- "* Music continuation using `model.generate_continuation`\n",
- "* Text-conditional samples using `model.generate`\n",
- "* Melody-conditional samples using `model.generate_with_chroma`"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Music Continuation"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import math\n",
- "import torchaudio\n",
- "import torch\n",
- "from audiocraft.utils.notebook import display_audio\n",
- "\n",
- "def get_bip_bip(bip_duration=0.125, frequency=440,\n",
- " duration=0.5, sample_rate=32000, device=\"cuda\"):\n",
- " \"\"\"Generates a series of bip bip at the given frequency.\"\"\"\n",
- " t = torch.arange(\n",
- " int(duration * sample_rate), device=device, dtype=torch.float) / sample_rate\n",
- " wav = torch.cos(2 * math.pi * frequency * t)[None]\n",
- " tp = (t % (2 * bip_duration)) / (2 * bip_duration)\n",
- " envelope = (tp >= 0.5).float()\n",
- " return wav * envelope"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Here we use a synthetic signal to prompt both the tonality and the BPM\n",
- "# of the generated audio.\n",
- "res = model.generate_continuation(\n",
- " get_bip_bip(0.125).expand(2, -1, -1), \n",
- " 32000, ['Jazz jazz and only jazz', \n",
- " 'Heartful EDM with beautiful synths and chords'], \n",
- " progress=True)\n",
- "display_audio(res, 32000)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# You can also use any audio from a file. Make sure to trim the file if it is too long!\n",
- "prompt_waveform, prompt_sr = torchaudio.load(\"../assets/bach.mp3\")\n",
- "prompt_duration = 2\n",
- "prompt_waveform = prompt_waveform[..., :int(prompt_duration * prompt_sr)]\n",
- "output = model.generate_continuation(prompt_waveform, prompt_sample_rate=prompt_sr, progress=True, return_tokens=True)\n",
- "display_audio(output[0], sample_rate=32000)\n",
- "if USE_DIFFUSION_DECODER:\n",
- " out_diffusion = mbd.tokens_to_wav(output[1])\n",
- " display_audio(out_diffusion, sample_rate=32000)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Text-conditional Generation"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from audiocraft.utils.notebook import display_audio\n",
- "\n",
- "output = model.generate(\n",
- " descriptions=[\n",
- " #'80s pop track with bassy drums and synth',\n",
- " #'90s rock song with loud guitars and heavy drums',\n",
- " #'Progressive rock drum and bass solo',\n",
- " #'Punk Rock song with loud drum and power guitar',\n",
- " #'Bluesy guitar instrumental with soulful licks and a driving rhythm section',\n",
- " #'Jazz Funk song with slap bass and powerful saxophone',\n",
- " 'drum and bass beat with intense percussions'\n",
- " ],\n",
- " progress=True, return_tokens=True\n",
- ")\n",
- "display_audio(output[0], sample_rate=32000)\n",
- "if USE_DIFFUSION_DECODER:\n",
- " out_diffusion = mbd.tokens_to_wav(output[1])\n",
- " display_audio(out_diffusion, sample_rate=32000)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Melody-conditional Generation"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import torchaudio\n",
- "from audiocraft.utils.notebook import display_audio\n",
- "\n",
- "model = MusicGen.get_pretrained('facebook/musicgen-melody')\n",
- "model.set_generation_params(duration=8)\n",
- "\n",
- "melody_waveform, sr = torchaudio.load(\"../assets/bach.mp3\")\n",
- "melody_waveform = melody_waveform.unsqueeze(0).repeat(2, 1, 1)\n",
- "output = model.generate_with_chroma(\n",
- " descriptions=[\n",
- " '80s pop track with bassy drums and synth',\n",
- " '90s rock song with loud guitars and heavy drums',\n",
- " ],\n",
- " melody_wavs=melody_waveform,\n",
- " melody_sample_rate=sr,\n",
- " progress=True, return_tokens=True\n",
- ")\n",
- "display_audio(output[0], sample_rate=32000)\n",
- "if USE_DIFFUSION_DECODER:\n",
- " out_diffusion = mbd.tokens_to_wav(output[1])\n",
- " display_audio(out_diffusion, sample_rate=32000)"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.16"
- },
- "vscode": {
- "interpreter": {
- "hash": "b02c911f9b3627d505ea4a19966a915ef21f28afb50dbf6b2115072d27c69103"
- }
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
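
The deleted notebook's `get_bip_bip` helper built its continuation prompt as a cosine tone gated on and off at a fixed interval. For readers who want to inspect that signal without a GPU, PyTorch, or audiocraft installed, here is a minimal standard-library sketch of the same gating logic. The function name `bip_bip_samples` and the plain-list return type are assumptions made for illustration; they are not part of the notebook or of audiocraft.

```python
import math


def bip_bip_samples(bip_duration=0.125, frequency=440.0,
                    duration=0.5, sample_rate=32000):
    """Return a list of samples: a cosine tone gated on/off every `bip_duration` seconds.

    Hypothetical stand-in for the notebook's torch-based helper; same defaults.
    """
    n_samples = int(duration * sample_rate)
    period = 2 * bip_duration  # one silent half followed by one audible half
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        phase = (t % period) / period  # position inside the current period, in [0, 1)
        gate = 1.0 if phase >= 0.5 else 0.0  # audible only in the second half
        samples.append(math.cos(2 * math.pi * frequency * t) * gate)
    return samples


wav = bip_bip_samples()
print(len(wav))  # number of samples in 0.5 s at 32 kHz
```

With the defaults, each 0.25 s period starts with 0.125 s of silence followed by 0.125 s of a 440 Hz tone, which is what lets the prompt suggest both a pitch and a tempo to the model.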