Update README.md
README.md
content: I want to close an online account
---

# Mistral-7B-Banking

## Model Description

This model, "Mistral-7B-Banking", is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), specifically tailored for the banking domain. It is optimized to answer questions and assist users with various banking transactions. It has been trained on hybrid synthetic data generated with our NLP/NLG technology and our automated Data Labeling (DAL) tools.

The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. For example, if you are "ACME Bank", you can create your own customized model by taking this fine-tuned model and doing additional fine-tuning with a small amount of your own data (see the sketch below). An overview of this approach can be found at: [From General-Purpose LLMs to Verticalized Enterprise Models](https://www.bitext.com/blog/general-purpose-models-verticalized-enterprise-genai/).
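As a rough illustration of that customization step, the sketch below fine-tunes this model on a small in-house dataset with LoRA adapters via the `peft`, `datasets`, and `transformers` libraries. The file `acme_bank.jsonl`, its `question`/`answer` fields, and all LoRA and training hyperparameters are placeholders for this example, not values published by Bitext.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "bitext/Mistral-7B-Banking-v2"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Attach LoRA adapters so only a small number of extra weights are trained.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

# "acme_bank.jsonl" is a placeholder for your own question/answer pairs.
dataset = load_dataset("json", data_files="acme_bank.jsonl", split="train")

def to_features(example):
    # Render each pair with the model's chat template and tokenize it.
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    input_ids = tokenizer.apply_chat_template(messages, truncation=True, max_length=2048)
    return {"input_ids": input_ids}

tokenized = dataset.map(to_features, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mistral-7b-acme-bank",   # placeholder
        num_train_epochs=1,                  # placeholder values throughout
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=1e-4,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("mistral-7b-acme-bank")
```

For plain inference with the published model as-is, the example below loads it directly from the Hub.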
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Run on GPU when available, otherwise fall back to CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the fine-tuned banking model and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("bitext/Mistral-7B-Banking-v2")
tokenizer = AutoTokenizer.from_pretrained("bitext/Mistral-7B-Banking-v2")

messages = [
    {"role": "system", "content": "You are an expert in customer support for Banking."},
    {"role": "user", "content": "I want to open a bank account"},
]

# Format the conversation with the model's chat template.
encoded = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encoded.to(device)
model.to(device)

# Generate a response and decode it back to text.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
## Model Architecture

- **Optimizer**: AdamW
- **Learning Rate**: 0.0002 with a cosine learning rate scheduler
- **Epochs**: 3
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 4
- **Maximum Sequence Length**: 8192 tokens
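As a point of reference, these values map onto the standard `transformers` `TrainingArguments` roughly as follows; `output_dir`, `warmup_ratio`, and `logging_steps` are illustrative assumptions rather than published settings, and the 8192-token maximum sequence length is enforced at tokenization/packing time rather than here.

```python
from transformers import TrainingArguments

# Rough mapping of the hyperparameters listed above; values marked
# "assumption" are illustrative and not taken from this card.
training_args = TrainingArguments(
    output_dir="mistral-7b-banking-finetune",  # assumption
    optim="adamw_torch",                       # AdamW
    learning_rate=2e-4,                        # 0.0002
    lr_scheduler_type="cosine",                # cosine schedule
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_ratio=0.03,                         # assumption
    logging_steps=10,                          # assumption
)
```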
### Environment

- **Transformers Version**: 4.43.4
- **Framework**: PyTorch 2.3.1+cu121
- **Tokenizers**: Tokenizers 0.19.1
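To confirm that a local environment matches the stack used for fine-tuning, a quick version check might look like this:

```python
import tokenizers
import torch
import transformers

# Print the locally installed versions to compare against the card.
print("transformers:", transformers.__version__)  # card: 4.43.4
print("torch:", torch.__version__)                # card: 2.3.1+cu121
print("tokenizers:", tokenizers.__version__)      # card: 0.19.1
```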
## Limitations and Bias

## License

This model, "Mistral-7B-Banking", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows for free use, modification, and distribution of the model but requires that proper credit be given to Bitext.

### Key Points of the Apache 2.0 License