---
license: apache-2.0
---
### Eval
Development-set evaluation on CS-HellaSwag:
| Model              | Accuracy   |
|--------------------|------------|
| mistral7b          | 0.4992     |
| csmpt (130k steps) | __0.5004__ |
| csmpt (100k steps) | 0.4959     |
| csmpt (75k steps)  | 0.4895     |
| csmpt (50k steps)  | 0.4755     |
| csmpt (26.5k steps)| 0.4524     |

However, validation on CS-HellaSwag showed that improvements past the 100k-step checkpoint were noisy at best, and the improvement over mistral7b is not statistically significant.
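
For a sense of what "not significant" means here, one common check for an accuracy delta of this size is a paired bootstrap over per-example correctness. The sketch below is illustrative only and assumes hypothetical 0/1 correctness arrays `csmpt_correct` and `mistral_correct`, which are not part of this card:

```python
# Illustrative paired-bootstrap check (our sketch, not the authors' protocol).
import numpy as np

rng = np.random.default_rng(0)

def paired_bootstrap_pvalue(a, b, n_resamples=10_000):
    """One-sided p-value: fraction of resamples where `a` does not beat `b`."""
    a, b = np.asarray(a), np.asarray(b)
    # Resample example indices with replacement, identically for both models.
    idx = rng.integers(0, len(a), size=(n_resamples, len(a)))
    deltas = a[idx].mean(axis=1) - b[idx].mean(axis=1)
    return float((deltas <= 0).mean())

# csmpt_correct / mistral_correct would be 0/1 arrays, one entry per dev example:
# p = paired_bootstrap_pvalue(csmpt_correct, mistral_correct)
# print(f'one-sided bootstrap p-value: {p:.3f}')
```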

### How to set up the environment
```bash
pip install transformers==4.37.2 torch==2.1.2 einops==0.7.0

# Be sure to install the right flash-attn wheel; we use torch compiled with
# CUDA 12.1, no C++11 ABI, Python 3.9, Linux x86_64.
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.3/flash_attn-2.5.3+cu122torch2.1cxx11abiFALSE-cp39-cp39-linux_x86_64.whl
```
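
As a quick sanity check (our suggestion, not part of the original instructions), you can confirm the installed versions match the wheel before loading the model:

```python
# Sanity check (our addition): confirm torch, CUDA, and flash-attn versions line up.
import torch
import flash_attn

print(torch.__version__)       # expect 2.1.2
print(torch.version.cuda)      # expect a CUDA 12.x build, matching the cu122 wheel
print(flash_attn.__version__)  # expect 2.5.3
assert torch.cuda.is_available(), 'flash-attn requires a CUDA-capable GPU'
```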

### How to use in transformers
```python
import torch
import transformers
from transformers import pipeline

name = 'BUT-FIT/csmpt7b'

config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_config['attn_impl'] = 'flash'
config.init_device = 'cuda:0'  # For fast initialization directly on GPU!
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,  # Load model weights in bfloat16
    trust_remote_code=True
)

tokenizer = transformers.AutoTokenizer.from_pretrained(name, trust_remote_code=True)

pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cuda:0')

with torch.autocast('cuda', dtype=torch.bfloat16):
    print(
        pipe('Nejznámějším českým spisovatelem ',  # "The best-known Czech writer "
             max_new_tokens=100,
             top_p=0.95,
             repetition_penalty=1.0,
             do_sample=True,
             use_cache=True))
```
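
If you prefer to bypass the pipeline wrapper, an equivalent call through `model.generate` might look like the following (a minimal sketch reusing `model` and `tokenizer` from above, with the same sampling parameters):

```python
# Minimal sketch (our addition): direct generation without the pipeline wrapper.
inputs = tokenizer('Nejznámějším českým spisovatelem ', return_tensors='pt').to('cuda:0')

with torch.autocast('cuda', dtype=torch.bfloat16):
    output_ids = model.generate(
        **inputs,
        max_new_tokens=100,
        top_p=0.95,
        repetition_penalty=1.0,
        do_sample=True,
        use_cache=True,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```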

### Our Release Plan
| Stage | Description | Date |
|-------|-------------|------|
| 1 | 'Best' model + training data | 11.03.2024 |
| 2 | All checkpoints + training code | |
| 3 | __Benczechmark__, a collection of Czech datasets for few-shot LLM evaluation | |

**Get in touch if you'd like to learn more about Benczechmark and contribute!**

## Getting in Touch
For further questions, email `martin.fajcik@vut.cz`.

## Disclaimer
This is a probabilistic model, and the authors are not responsible for its outputs. Use at your own risk.

## Acknowledgement
This work was supported by the NAKI III program of the Ministry of Culture of the Czech Republic, project semANT
("Sémantický průzkumník textového kulturního dědictví" / "Semantic Explorer of Textual Cultural Heritage", grant no. `DH23P03OVV060`), and
by the Ministry of Education, Youth and Sports of the Czech Republic through e-INFRA CZ (ID: `90254`).