TristanBehrens
/

bach-garland-phariaplusplus

	---
	language:
	- en
	tags:
	- NLP
	license: mit
	datasets:
	- TristanBehrens/bach_garland_2024-100K
	base_model: None
	---

	# Bach Garland Pharia - A Pharia model trained on Johann Sebastian Bach-style music

	Say hello on [LinkedIn](https://www.linkedin.com/dr-tristan-behrens-734967a2/) and [X](https://x.com/DrTBehrens).

	![Cover](bachgarlandphariaplusplus.jpg)

	This is a Pharia model trained on music by Johann Sebastian Bach. The training data includes all of Bach's pieces that can be played on a church organ. The samples are encoded in the prototypical Garland notation.

	The dataset contains 100K samples with a total of 144M tokens.
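
As a rough sanity check, the two figures above imply an average sample length. The sketch below treats 100K and 144M as exact values, which they probably are not, and compares the result with the `context_length` of 2048 listed in the configuration. The variable names are illustrative only.

```python
# Rounded figures from the dataset description; treated as exact here.
num_samples = 100_000
total_tokens = 144_000_000
context_length = 2048  # taken from the configuration section of this card

# Average number of tokens per sample, and how much of the context it fills.
avg_tokens_per_sample = total_tokens / num_samples
context_fraction = avg_tokens_per_sample / context_length

print(f"average tokens per sample: {avg_tokens_per_sample:.0f}")
print(f"average share of the context window: {context_fraction:.0%}")
```

Under these assumptions a sample averages about 1440 tokens, roughly 70% of the context window. Individual samples may of course be shorter or longer.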

	## How to use

	1. Clone the Helibrunna repository and follow its installation instructions: https://github.com/AI-Guru/helibrunna/
	2. Open and run the notebook `examples/music.ipynb`, setting the model id to `TristanBehrens/bach-garland-phariaplusplus`.
	3. Enjoy!

	## Training

	![Trained with Helibrunna](banner.jpg)

	Trained with [Helibrunna](https://github.com/AI-Guru/helibrunna) by [Dr. Tristan Behrens](https://de.linkedin.com/dr-tristan-behrens-734967a2).

	## Configuration

	```
	training:
	  model_name: bach_garland_phariaplusplus
	  batch_size: 12
	  lr: 0.001
	  lr_warmup_steps: 2083
	  lr_decay_until_steps: 20833
	  lr_decay_factor: 0.001
	  weight_decay: 0.1
	  amp_precision: bfloat16
	  weight_precision: float32
	  enable_mixed_precision: true
	  num_epochs: 5
	  output_dir: output/bach_garland_phariaplusplus
	  save_every_step: 500
	  log_every_step: 10
	  wandb_project: bach_garland
	  torch_compile: false
	model:
	  type: pharia
	  attention_bias: true
	  attention_dropout: 0.0
	  eos_token_id: 0
	  bos_token_id: 127179
	  pad_token_id: 1
	  hidden_act: gelu
	  hidden_size: 512
	  initializer_range: 0.02
	  intermediate_size: 1024
	  max_position_embeddings: 2048
	  mlp_bias: true
	  num_attention_heads: 8
	  num_hidden_layers: 6
	  num_key_value_heads: 8
	  rope_scaling: null
	  rope_theta: 1000000
	  tie_word_embeddings: false
	  use_cache: true
	  context_length: 2048
	  vocab_size: 178
	dataset:
	  hugging_face_id: TristanBehrens/bach_garland_2024-100K
	tokenizer:
	  type: whitespace
	  fill_token: '[EOS]'
	```
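
The step counts in the training configuration can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that this card does not state. First, it takes one optimizer step per batch of 12 samples, with no gradient accumulation. Second, it takes the learning rate to follow a linear warmup and then a cosine decay down to `lr * lr_decay_factor`. The helper `lr_at` is hypothetical and only illustrates that assumed shape.

```python
import math

# Values copied from this card.
NUM_SAMPLES = 100_000          # "100K samples", treated as exact
BATCH_SIZE = 12
NUM_EPOCHS = 5
LR = 0.001
LR_WARMUP_STEPS = 2083
LR_DECAY_UNTIL_STEPS = 20833
LR_DECAY_FACTOR = 0.001

# Assumption: one optimizer step per batch, no gradient accumulation.
steps_per_epoch = NUM_SAMPLES / BATCH_SIZE
total_steps = steps_per_epoch * NUM_EPOCHS


def lr_at(step: int) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to a floor."""
    floor = LR * LR_DECAY_FACTOR
    if step < LR_WARMUP_STEPS:
        # Ramp linearly from 0 up to the peak learning rate.
        return LR * step / LR_WARMUP_STEPS
    if step >= LR_DECAY_UNTIL_STEPS:
        # After the decay horizon the rate stays at the floor.
        return floor
    progress = (step - LR_WARMUP_STEPS) / (LR_DECAY_UNTIL_STEPS - LR_WARMUP_STEPS)
    return floor + 0.5 * (LR - floor) * (1.0 + math.cos(math.pi * progress))


print(f"steps per epoch: {steps_per_epoch:.0f}")
print(f"total steps:     {total_steps:.0f}")
print(f"decay horizon / total steps:  {LR_DECAY_UNTIL_STEPS / total_steps:.2f}")
print(f"warmup steps / decay horizon: {LR_WARMUP_STEPS / LR_DECAY_UNTIL_STEPS:.2f}")
for step in (0, LR_WARMUP_STEPS, LR_DECAY_UNTIL_STEPS):
    print(f"lr at step {step}: {lr_at(step):.6g}")
```

Under these assumptions, the decay horizon of 20833 steps covers about half of the roughly 41667 total steps, and the 2083 warmup steps are about a tenth of the decay horizon.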