06/17/2024 19:50:24 - INFO - transformers.models.auto.tokenization_auto - Could not locate the tokenizer configuration file, will try to use the model config instead.
06/17/2024 19:50:25 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/config.json
06/17/2024 19:50:25 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"_name_or_path": "Qwen/Qwen2-1.5B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
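For reference, the configuration dumped above can be re-created locally with the transformers AutoConfig API; a minimal sketch (the cache path and snapshot hash will differ per machine):

# Sketch: reload the same Qwen2Config shown above from the Hub.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
print(config.model_type)         # "qwen2"
print(config.hidden_size)        # 1536
print(config.num_hidden_layers)  # 28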
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file vocab.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/vocab.json
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file merges.txt from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/merges.txt
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file tokenizer.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/tokenizer.json
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file added_tokens.json from cache at None
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file special_tokens_map.json from cache at None
06/17/2024 19:50:30 - INFO - transformers.tokenization_utils_base - loading file tokenizer_config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/tokenizer_config.json
06/17/2024 19:50:31 - WARNING - transformers.tokenization_utils_base - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/17/2024 19:50:31 - INFO - llamafactory.data.template - Replace eos token: <|im_end|>
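The eos-token swap above reflects Qwen2's ChatML-style template, in which every turn is closed by <|im_end|>. A minimal illustration with the stock tokenizer (the exact template string lives in the tokenizer config, not in this log):

# Sketch: show why <|im_end|> becomes the effective end-of-turn token.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
text = tok.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(text)  # ...<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n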
06/17/2024 19:50:31 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_zh...
06/17/2024 19:50:37 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_en...
06/17/2024 19:50:43 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_zh...
06/17/2024 19:50:47 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_en...
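The two tool-calling datasets named above are public Hub datasets and can be inspected directly with the datasets library; a sketch, assuming the default train split (LLaMA-Factory applies its own column mapping on top):

# Sketch: pull the raw glaive_toolcall datasets referenced in the log.
from datasets import load_dataset

zh = load_dataset("llamafactory/glaive_toolcall_zh", split="train")
en = load_dataset("llamafactory/glaive_toolcall_en", split="train")
print(len(zh), len(en))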
06/17/2024 19:50:52 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/config.json
06/17/2024 19:50:52 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"_name_or_path": "Qwen/Qwen2-1.5B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
06/17/2024 19:50:52 - INFO - llamafactory.model.model_utils.quantization - Quantizing model to 4 bit.
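4-bit quantization in this stack is typically done through bitsandbytes. A hedged sketch of such a load; the specific knobs (NF4, double quantization, float16 compute) are assumptions and are not recorded in this log:

# Sketch: load the base model in 4-bit via bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumption
    bnb_4bit_use_double_quant=True,        # assumption
    bnb_4bit_compute_dtype=torch.float16,  # matches the float16 dtype logged below
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct",
    quantization_config=bnb_config,
)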
06/17/2024 19:50:59 - INFO - transformers.modeling_utils - loading weights file model.safetensors from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/model.safetensors
06/17/2024 19:50:59 - INFO - transformers.modeling_utils - Instantiating Qwen2ForCausalLM model under default dtype torch.float16.
06/17/2024 19:50:59 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
06/17/2024 19:51:06 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
06/17/2024 19:51:06 - INFO - transformers.modeling_utils - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen2-1.5B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
06/17/2024 19:51:06 - INFO - transformers.generation.configuration_utils - loading configuration file generation_config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/generation_config.json
06/17/2024 19:51:06 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.1,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
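Those generation defaults can be expressed as a GenerationConfig object; a sketch restating the logged values:

# Sketch: the generation_config.json above as an in-code object.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repetition_penalty=1.1,
    bos_token_id=151643,
    eos_token_id=[151645, 151643],
    pad_token_id=151643,
)
# outputs = model.generate(**inputs, generation_config=gen_config)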
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
06/17/2024 19:51:07 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/17/2024 19:51:07 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.misc - Found linear modules: o_proj,v_proj,down_proj,gate_proj,k_proj,up_proj,q_proj
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
06/17/2024 19:51:07 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/17/2024 19:51:07 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/17/2024 19:51:07 - INFO - llamafactory.model.model_utils.misc - Found linear modules: down_proj,k_proj,gate_proj,up_proj,o_proj,v_proj,q_proj
06/17/2024 19:51:07 - INFO - llamafactory.model.loader - trainable params: 9232384 || all params: 1552946688 || trainable%: 0.5945
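The trainable-parameter count is consistent with LoRA rank 8 over all seven projections: per layer the adapted (in + out) dimensions sum to 41,216 (q/o: 1536+1536 each, k/v: 1536+256 each, gate/up: 1536+8960 each, down: 8960+1536), and 8 × 41,216 × 28 layers = 9,232,384, i.e. 9,232,384 / 1,552,946,688 ≈ 0.5945%. A rough peft equivalent; rank, alpha and dropout are assumed defaults, not values read from this log:

# Sketch: LoRA adapter over the linear modules listed above.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,               # assumption, consistent with the parameter count
    lora_alpha=16,     # assumption
    lora_dropout=0.0,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # `model` from the 4-bit sketch above
model.print_trainable_parameters()          # ~9,232,384 trainable params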
06/17/2024 19:51:07 - WARNING - accelerate.utils.other - Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
06/17/2024 19:51:07 - INFO - transformers.trainer - Using auto half precision backend
06/17/2024 19:51:07 - INFO - transformers.trainer - ***** Running training *****
06/17/2024 19:51:07 - INFO - transformers.trainer - Num examples = 2,000
06/17/2024 19:51:07 - INFO - transformers.trainer - Num Epochs = 3
06/17/2024 19:51:07 - INFO - transformers.trainer - Instantaneous batch size per device = 2
06/17/2024 19:51:07 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 32
06/17/2024 19:51:07 - INFO - transformers.trainer - Gradient Accumulation steps = 8
06/17/2024 19:51:07 - INFO - transformers.trainer - Total optimization steps = 186
06/17/2024 19:51:07 - INFO - transformers.trainer - Number of trainable parameters = 9,232,384
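The run shape follows from the numbers above: 2 per device × 8 accumulation steps × 2 devices = 32 effective batch, and ⌊2,000 / 32⌋ = 62 optimizer steps per epoch × 3 epochs = 186 total steps. A hedged TrainingArguments approximation (LLaMA-Factory builds its own arguments; the logging/save cadence and fp16 flag are inferred from the log, and the 5e-5 peak learning rate from the schedule below):

# Sketch: training shape implied by the log, not the actual launch command.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    learning_rate=5e-5,            # inferred peak of the cosine schedule
    lr_scheduler_type="cosine",
    logging_steps=5,               # callback entries below arrive every 5 steps
    save_steps=100,                # checkpoint-100 is saved below
    fp16=True,
    gradient_checkpointing=True,
)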
06/17/2024 19:51:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.8099, 'learning_rate': 4.9911e-05, 'epoch': 0.08, 'throughput': 2566.78}
06/17/2024 19:52:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.9580, 'learning_rate': 4.9644e-05, 'epoch': 0.16, 'throughput': 2542.66}
06/17/2024 19:53:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.7150, 'learning_rate': 4.9202e-05, 'epoch': 0.24, 'throughput': 2505.70}
06/17/2024 19:54:02 - INFO - llamafactory.extras.callbacks - {'loss': 0.7585, 'learning_rate': 4.8587e-05, 'epoch': 0.32, 'throughput': 2511.00}
06/17/2024 19:54:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.7342, 'learning_rate': 4.7804e-05, 'epoch': 0.40, 'throughput': 2533.17}
06/17/2024 19:55:16 - INFO - llamafactory.extras.callbacks - {'loss': 0.6904, 'learning_rate': 4.6859e-05, 'epoch': 0.48, 'throughput': 2557.91}
06/17/2024 19:55:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.8254, 'learning_rate': 4.5757e-05, 'epoch': 0.56, 'throughput': 2587.35}
06/17/2024 19:56:34 - INFO - llamafactory.extras.callbacks - {'loss': 0.7551, 'learning_rate': 4.4508e-05, 'epoch': 0.64, 'throughput': 2578.43}
06/17/2024 19:57:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.7747, 'learning_rate': 4.3120e-05, 'epoch': 0.72, 'throughput': 2580.98}
06/17/2024 19:57:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.7027, 'learning_rate': 4.1602e-05, 'epoch': 0.80, 'throughput': 2578.88}
06/17/2024 19:58:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.7581, 'learning_rate': 3.9967e-05, 'epoch': 0.88, 'throughput': 2580.17}
06/17/2024 19:59:13 - INFO - llamafactory.extras.callbacks - {'loss': 0.7221, 'learning_rate': 3.8224e-05, 'epoch': 0.96, 'throughput': 2583.14}
06/17/2024 19:59:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.8214, 'learning_rate': 3.6387e-05, 'epoch': 1.04, 'throughput': 2587.56}
06/17/2024 20:00:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.6304, 'learning_rate': 3.4469e-05, 'epoch': 1.12, 'throughput': 2580.91}
06/17/2024 20:01:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.6434, 'learning_rate': 3.2484e-05, 'epoch': 1.20, 'throughput': 2577.50}
06/17/2024 20:01:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.6796, 'learning_rate': 3.0445e-05, 'epoch': 1.28, 'throughput': 2575.34}
06/17/2024 20:02:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.6651, 'learning_rate': 2.8368e-05, 'epoch': 1.36, 'throughput': 2580.63}
06/17/2024 20:03:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.7844, 'learning_rate': 2.6266e-05, 'epoch': 1.44, 'throughput': 2586.23}
06/17/2024 20:04:01 - INFO - llamafactory.extras.callbacks - {'loss': 0.8139, 'learning_rate': 2.4156e-05, 'epoch': 1.52, 'throughput': 2581.43}
06/17/2024 20:04:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.6717, 'learning_rate': 2.2051e-05, 'epoch': 1.60, 'throughput': 2578.63}
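The learning-rate column traces a cosine decay from a 5e-5 peak with no warmup; checking the first logged value (step 5 of 186) against the standard cosine schedule:

# Sketch: verify the logged learning rate against a zero-warmup cosine decay.
import math

peak_lr, total_steps, step = 5e-5, 186, 5
lr = peak_lr * 0.5 * (1 + math.cos(math.pi * step / total_steps))
print(f"{lr:.4e}")  # ~4.9911e-05, matching the first callback entry above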
06/17/2024 20:04:43 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100
06/17/2024 20:04:44 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/config.json
06/17/2024 20:04:44 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
06/17/2024 20:04:44 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100/tokenizer_config.json
06/17/2024 20:04:44 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100/special_tokens_map.json
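The intermediate adapter written above can be attached back onto the base model with peft; a minimal sketch (paths copied from the log, full-precision base assumed so the weights can be merged):

# Sketch: reload the step-100 LoRA adapter saved above.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
model = PeftModel.from_pretrained(
    base,
    "saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100",
)
model = model.merge_and_unload()  # optional: fold the adapter into the base weights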
06/17/2024 20:05:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.7524, 'learning_rate': 1.9968e-05, 'epoch': 1.68, 'throughput': 2577.77}
06/17/2024 20:06:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.6310, 'learning_rate': 1.7920e-05, 'epoch': 1.76, 'throughput': 2578.49}
06/17/2024 20:06:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.7462, 'learning_rate': 1.5923e-05, 'epoch': 1.84, 'throughput': 2578.25}
06/17/2024 20:07:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.6148, 'learning_rate': 1.3990e-05, 'epoch': 1.92, 'throughput': 2578.95}
06/17/2024 20:08:10 - INFO - llamafactory.extras.callbacks - {'loss': 0.7145, 'learning_rate': 1.2136e-05, 'epoch': 2.00, 'throughput': 2582.83}
06/17/2024 20:08:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.6798, 'learning_rate': 1.0374e-05, 'epoch': 2.08, 'throughput': 2583.35}
06/17/2024 20:09:31 - INFO - llamafactory.extras.callbacks - {'loss': 0.6754, 'learning_rate': 8.7157e-06, 'epoch': 2.16, 'throughput': 2584.30}
06/17/2024 20:10:10 - INFO - llamafactory.extras.callbacks - {'loss': 0.6708, 'learning_rate': 7.1737e-06, 'epoch': 2.24, 'throughput': 2584.52}
06/17/2024 20:10:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.6386, 'learning_rate': 5.7587e-06, 'epoch': 2.32, 'throughput': 2587.95}
06/17/2024 20:11:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.6995, 'learning_rate': 4.4809e-06, 'epoch': 2.40, 'throughput': 2590.06}
06/17/2024 20:12:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.6691, 'learning_rate': 3.3494e-06, 'epoch': 2.48, 'throughput': 2593.15}
06/17/2024 20:12:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.6024, 'learning_rate': 2.3721e-06, 'epoch': 2.56, 'throughput': 2588.49}
06/17/2024 20:13:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.6484, 'learning_rate': 1.5562e-06, 'epoch': 2.64, 'throughput': 2591.26}
06/17/2024 20:14:10 - INFO - llamafactory.extras.callbacks - {'loss': 0.7137, 'learning_rate': 9.0736e-07, 'epoch': 2.72, 'throughput': 2585.69}
06/17/2024 20:14:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.7770, 'learning_rate': 4.3025e-07, 'epoch': 2.80, 'throughput': 2589.29}
06/17/2024 20:15:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.6529, 'learning_rate': 1.2827e-07, 'epoch': 2.88, 'throughput': 2590.50}
06/17/2024 20:16:16 - INFO - llamafactory.extras.callbacks - {'loss': 0.6996, 'learning_rate': 3.5659e-09, 'epoch': 2.96, 'throughput': 2590.41}
06/17/2024 20:16:26 - INFO - transformers.trainer -
Training completed. Do not forget to share your model on huggingface.co/models =)
06/17/2024 20:16:26 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05
06/17/2024 20:16:27 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8/config.json
06/17/2024 20:16:27 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
06/17/2024 20:16:27 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/tokenizer_config.json
06/17/2024 20:16:27 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05/special_tokens_map.json
06/17/2024 20:16:27 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
06/17/2024 20:16:27 - INFO - transformers.modelcard - Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
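Putting the pieces together, chat-style inference with the final adapter saved above would look roughly like this; the prompt is an arbitrary example and the sampling values mirror the generation_config.json logged earlier:

# Sketch: inference with the trained LoRA adapter (not taken from this log).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

adapter = "saves/Qwen2-1.5B-Chat/lora/train_2024-06-17-19-49-05"
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, adapter)

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What's the weather like in Beijing?"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                     temperature=0.7, top_p=0.8, top_k=20)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))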