sander-wood committed
Commit b78dd91 · verified · 1 Parent(s): 3c428bc

Update README.md

Files changed (1)
  1. README.md +23 -10
README.md CHANGED
@@ -111,7 +111,7 @@ CLaMP 2 is a music information retrieval model compatible with 101 languages, de
 
  ### Links
  - [CLaMP 2 Code](https://github.com/sanderwood/clamp2)
- - [CLaMP 2 Paper](https://arxiv.org/)
+ - [CLaMP 2 Paper](https://arxiv.org/pdf/2410.13267)
  - [CLaMP 2 Model Weights](https://huggingface.co/sander-wood/clamp2/blob/main/weights_clamp2_h_size_768_lr_5e-05_batch_128_scale_1_t_length_128_t_model_FacebookAI_xlm-roberta-base_t_dropout_True_m3_True.pth)
  - [M3 Model Weights](https://huggingface.co/sander-wood/clamp2/blob/main/weights_m3_p_size_64_p_length_512_t_layers_3_p_layers_12_h_size_768_lr_0.0001_batch_16_mask_0.45.pth)
 
@@ -172,6 +172,7 @@ conda activate clamp2
  ]
  }
  ```
+ The filepaths field contains relative paths starting from the shortest common root directory (e.g., abc/ or mtf/). This ensures that only the minimal shared part of the path is included, and each file is represented with a concise relative path from this root.
 
  **Output Example**: The output will be a JSON file containing the structured summary in both English and a selected non-English language. Here’s an example of the expected output:
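
For readers wondering how the `filepaths` values described in the hunk above end up relative to the shortest common root, here is a minimal sketch; the file list and directory layout are hypothetical and only illustrate keeping the shared root (e.g., `abc/`) as the first component of each relative path.

```python
import os
from pathlib import Path

# Hypothetical absolute paths; only the idea of a shared "abc/" root
# comes from the README, the rest is illustrative.
files = [
    "/data/corpus/abc/composer_a/piece1.abc",
    "/data/corpus/abc/composer_b/piece2.abc",
]

# os.path.commonpath() returns the deepest shared directory (".../abc");
# taking its parent keeps "abc/" itself at the start of every relative path.
root = Path(os.path.commonpath(files)).parent
relative = [Path(f).relative_to(root).as_posix() for f in files]
print(relative)  # ['abc/composer_a/piece1.abc', 'abc/composer_b/piece2.abc']
```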
 
@@ -197,6 +198,19 @@ conda activate clamp2
  }
  ```
 
+ After generating the individual JSON files:
+
+ 1. Merge all JSON files into a single JSONL file.
+
+ 2. Place the merged JSONL file and the shortest common root directories (e.g., abc/ and/or mtf/) in the same folder, structured like this:
+
+ ```
+ /your-target-folder/
+ ├── abc/
+ ├── mtf/
+ ├── merged_output.jsonl
+ ```
+
  ### Training and Feature Extraction
  2. **Training Models**: If you want to train CLaMP 2 or M3 models, check the scripts in the `code/` folder.
  - Modify the `config.py` files to set your training hyperparameters and paths.
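
As a companion to the merge step added in the hunk above, here is a minimal sketch of combining the per-file JSON summaries into a single `merged_output.jsonl`; the `json/*.json` input pattern is an assumption for illustration, while the output filename and target-folder layout come from the README.

```python
import glob
import json

# Assumed location of the individual JSON files produced in the previous step.
input_pattern = "json/*.json"

# Write one JSON object per line into the merged JSONL file.
with open("merged_output.jsonl", "w", encoding="utf-8") as out:
    for path in sorted(glob.glob(input_pattern)):
        with open(path, encoding="utf-8") as f:
            record = json.load(f)
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The resulting `merged_output.jsonl` can then be placed next to the `abc/` and/or `mtf/` directories as shown in the folder structure above.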
@@ -217,12 +231,11 @@ Benchmark datasets related to the experiments conducted with CLaMP 2 and M3, inc
  If you use CLaMP 2 or M3 in your research, please cite the following paper:
 
  ```bibtex
- @inproceedings{clamp2,
- title={CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models},
- author={Author Name and Coauthor Name},
- booktitle={Proceedings of the Conference on Music Information Retrieval},
- year={2024},
- publisher={Publisher Name},
- address={Conference Location},
- url={https://placeholder.url}
- }
+ @misc{wu2024clamp2multimodalmusic,
+ title={CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models},
+ author={Shangda Wu and Yashan Wang and Ruibin Yuan and Zhancheng Guo and Xu Tan and Ge Zhang and Monan Zhou and Jing Chen and Xuefeng Mu and Yuejie Gao and Yuanliang Dong and Jiafeng Liu and Xiaobing Li and Feng Yu and Maosong Sun},
+ year={2024},
+ eprint={2410.13267},
+ archivePrefix={arXiv},
+ primaryClass={cs.SD},
+ url={https://arxiv.org/abs/2410.13267},
 