|
- 1289.4068 seconds (21.49 minutes) used for training.
- Peak reserved memory = 9.545 GB.
- Peak reserved memory for training = 4.018 GB.
- Peak reserved memory % of max memory = 43.058 %.
- Peak reserved memory for training % of max memory = 18.125 %.
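These figures follow the pattern the standard Unsloth notebooks use to report memory usage, built on PyTorch's CUDA statistics. A minimal sketch of how they are typically computed, assuming `start_gpu_memory` (peak reserved memory captured before training) and `trainer_stats` (the value returned by `trainer.train()`) exist from earlier cells:

```python
import torch

# Reproduce the summary above from PyTorch's CUDA memory counters.
# `start_gpu_memory` and `trainer_stats` are assumed from earlier cells.
gpu_stats = torch.cuda.get_device_properties(0)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_for_training = round(used_memory - start_gpu_memory, 3)

print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_for_training} GB.")
print(f"Peak reserved memory % of max memory = {round(used_memory / max_memory * 100, 3)} %.")
print(f"Peak reserved memory for training % of max memory = {round(used_for_training / max_memory * 100, 3)} %.")
```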
|
|
|
```python
import torch
from transformers import TrainingArguments

args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    warmup_steps = 10,        # increased the number of warmup steps
    max_steps = 200,          # increased the total number of steps
    learning_rate = 1e-4,     # reduced the learning rate
    fp16 = not torch.cuda.is_bf16_supported(),
    bf16 = torch.cuda.is_bf16_supported(),
    logging_steps = 1,
    optim = "adamw_8bit",
    weight_decay = 0.01,
    lr_scheduler_type = "linear",
    seed = 42,
    output_dir = "outputs",
)
```
|
|
|
|
|
```
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 399 | Num Epochs = 4
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 200
 "-____-"     Number of trainable parameters = 20,971,520

[200/200 21:17, Epoch 4/4]
```
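The banner's numbers are internally consistent; a short check using only the values reported above:

```python
# Cross-check the banner: 2 sequences/device x 4 accumulation steps = 8
# sequences per optimizer step; 200 steps x 8 = 1600 sequences, which is
# roughly 4 passes over the 399 training examples (the reported 4 epochs).
per_device_batch = 2
grad_accum = 4
total_steps = 200
num_examples = 399

effective_batch = per_device_batch * grad_accum   # 8
examples_seen = effective_batch * total_steps     # 1600
print(examples_seen / num_examples)               # ~4.01 epochs
```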
|
```
Step  Training Loss
   1       2.027900
   2       2.008700
   3       1.946100
   4       1.924700
   5       1.995000
   6       1.999000
   7       1.870100
   8       1.891400
   9       1.807600
  10       1.723200
  11       1.665100
  12       1.541000
  13       1.509100
  14       1.416600
  15       1.398600
  16       1.233200
  17       1.172100
  18       1.272100
  19       1.146000
  20       1.179000
  21       1.206400
  22       1.095400
  23       0.937300
  24       1.214300
  25       1.040200
  26       1.183400
  27       1.033900
  28       0.953100
  29       0.935700
  30       0.962200
  31       0.908900
  32       0.924900
  33       0.931000
  34       1.011300
  35       0.951900
  36       0.936000
  37       0.903000
  38       0.906900
  39       0.945700
  40       0.827000
  41       0.931800
  42       0.919600
  43       0.926900
  44       0.932900
  45       0.872700
  46       0.795200
  47       0.888700
  48       0.956800
  49       1.004200
  50       0.859500
  51       0.802500
  52       0.855400
  53       0.885500
  54       1.026600
  55       0.844100
  56       0.879800
  57       0.797400
  58       0.885300
  59       0.842800
  60       0.861600
  61       0.789100
  62       0.861600
  63       0.856700
  64       0.929200
  65       0.782500
  66       0.713600
  67       0.781000
  68       0.765100
  69       0.784700
  70       0.869500
  71       0.742900
  72       0.787900
  73       0.750800
  74       0.931700
  75       0.713000
  76       0.832100
  77       0.928300
  78       0.777600
  79       0.694000
  80       0.835400
  81       0.822000
  82       0.754600
  83       0.813400
  84       0.868800
  85       0.732400
  86       0.803700
  87       0.694400
  88       0.771300
  89       0.864400
  90       0.646700
  91       0.690800
  92       0.695000
  93       0.732300
  94       0.766900
  95       0.864100
  96       0.867200
  97       0.774300
  98       0.797700
  99       0.772100
 100       0.906700
 101       0.693400
 102       0.685500
 103       0.712200
 104       0.678400
 105       0.761900
 106       0.705300
 107       0.775700
 108       0.627600
 109       0.599300
 110       0.615100
 111       0.618200
 112       0.668700
 113       0.699900
 114       0.577000
 115       0.711600
 116       0.692900
 117       0.585400
 118       0.646400
 119       0.569200
 120       0.752300
 121       0.745000
 122       0.690100
 123       0.744700
 124       0.665800
 125       0.866100
 126       0.707400
 127       0.679300
 128       0.591400
 129       0.655100
 130       0.734000
 131       0.637900
 132       0.733900
 133       0.652500
 134       0.685400
 135       0.641300
 136       0.608200
 137       0.754100
 138       0.753700
 139       0.671000
 140       0.767200
 141       0.668700
 142       0.630300
 143       0.734700
 144       0.767700
 145       0.722200
 146       0.694400
 147       0.710100
 148       0.696300
 149       0.612600
 150       0.670400
 151       0.512900
 152       0.675100
 153       0.579900
 154       0.622900
 155       0.652500
 156       0.649200
 157       0.546700
 158       0.521600
 159       0.522200
 160       0.589400
 161       0.552600
 162       0.630700
 163       0.595600
 164       0.614300
 165       0.489400
 166       0.634500
 167       0.620800
 168       0.618600
 169       0.637900
 170       0.553900
 171       0.656000
 172       0.644000
 173       0.694300
 174       0.608900
 175       0.673000
 176       0.612500
 177       0.654200
 178       0.639200
 179       0.599100
 180       0.642100
 181       0.529700
 182       0.614000
 183       0.582900
 184       0.765100
 185       0.502700
 186       0.564300
 187       0.740200
 188       0.636100
 189       0.638800
 190       0.560100
 191       0.620000
 192       0.712800
 193       0.531000
 194       0.591600
 195       0.608600
 196       0.671800
 197       0.572900
 198       0.600900
 199       0.586800
 200       0.545900
```
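The training loss falls steadily from about 2.03 at step 1 to roughly 0.55 by step 200, with no sign of divergence. One way to visualize the curve is to plot the trainer's log history; a minimal sketch, assuming the `trainer` object is still in scope and matplotlib is available:

```python
import matplotlib.pyplot as plt

# Plot the per-step training loss recorded by the Trainer.
history = [e for e in trainer.state.log_history if "loss" in e]
steps = [e["step"] for e in history]
losses = [e["loss"] for e in history]

plt.plot(steps, losses)
plt.xlabel("Step")
plt.ylabel("Training loss")
plt.title("Fine-tuning loss over 200 steps")
plt.show()
```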
|
|
|
--- |
|
base_model: unsloth/llama-3-8b-bnb-4bit |
|
language: |
|
- en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- gguf |
|
--- |
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** Mathoufle13 |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
|
|
|
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
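To run the model, it can be loaded back with Unsloth's `FastLanguageModel`. A minimal sketch; the repository id below is a placeholder, since the actual Hub repo id is not stated here:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Mathoufle13/model-name",  # placeholder repo id
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode
```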
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|