An LLM for Chinese Information Extraction.

Based on Baichuan-7B, with full-parameter SFT run on 8 A800 GPUs. The goal is to reproduce zju cama on top of a strong base model.

The SFT data was expanded: [image]

No evaluation has been run yet; contributions are welcome!

The training codebase comes from shibing624.

The Bash command used is as follows:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node 8 ../supervised_finetuning.py \
    --model_type baichuan \
    --model_name_or_path /data/llm/models/Pretrained/Baichuan-7B/ \
    --train_file_dir ../data/finetune/1124_IELLM/ \
    --per_device_train_batch_size 8 \
    --do_train \
    --use_peft False \
    --num_train_epochs 3 \
    --learning_rate 2e-5 \
    --warmup_ratio 0.03 \
    --weight_decay 0. \
    --fp16 \
    --logging_strategy steps \
    --logging_steps 10 \
    --save_strategy epoch \
    --save_total_limit 5 \
    --gradient_accumulation_steps 1 \
    --preprocessing_num_workers 8 \
    --output_dir ../results/20231124_IELLM \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --torch_dtype float16 \
    --device_map auto \
    --report_to tensorboard \
    --ddp_find_unused_parameters False \
    --gradient_checkpointing True \
    --cache_dir ./cache \
    --model_max_length 2048 \
    --deepspeed ../deepspeed_zero_stage2_config.json \
    --template_name baichuan \
    --flash_attn

***** train metrics *****
  epoch                    =                3.0
  train_loss               =             0.1012
  train_runtime            = 1 day, 14:16:59.20
  train_samples            =             376031
  train_samples_per_second =              8.185
  train_steps_per_second   =              0.128
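
As a sanity check, the throughput figures in the metrics above can be approximately reproduced from the settings in the training command. The sketch below assumes that the global batch size is `per_device_train_batch_size` × number of GPUs × `gradient_accumulation_steps`, and that the last partial batch of each epoch is kept; the variable names are illustrative and not taken from the training codebase.

```python
import math
from datetime import timedelta

# Values copied from the training command and the reported metrics.
num_gpus = 8                      # --nproc_per_node 8
per_device_train_batch_size = 8   # --per_device_train_batch_size 8
gradient_accumulation_steps = 1   # --gradient_accumulation_steps 1
num_train_epochs = 3              # --num_train_epochs 3
train_samples = 376031            # train_samples
train_runtime = timedelta(days=1, hours=14, minutes=16, seconds=59.20)

# Assumption: one optimizer step consumes this many samples across all GPUs.
global_batch_size = (
    num_gpus * per_device_train_batch_size * gradient_accumulation_steps
)

# Assumption: the final partial batch of each epoch still counts as a step.
steps_per_epoch = math.ceil(train_samples / global_batch_size)
total_steps = steps_per_epoch * num_train_epochs

runtime_seconds = train_runtime.total_seconds()
samples_per_second = train_samples * num_train_epochs / runtime_seconds
steps_per_second = total_steps / runtime_seconds

print(f"global batch size  : {global_batch_size}")
print(f"total steps        : {total_steps}")
print(f"samples per second : {samples_per_second:.3f}")
print(f"steps per second   : {steps_per_second:.3f}")
```

Under these assumptions the computed rates round to the reported `train_samples_per_second` and `train_steps_per_second` values, which suggests the logged metrics are consistent with an effective batch size of 64.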

Test results:

[image]

[image]

