malmarjeh committed
Commit 8582976 · verified · 1 Parent(s): 21476d2

Update README.md

Files changed (1)
  1. README.md +27 -14
README.md CHANGED
@@ -15,11 +15,11 @@ widget:
  content: I want to close an online account
  ---

- # Mistral-7B-Banking-v2
+ # Mistral-7B-Banking

  ## Model Description

- This model, "Mistral-7B-Banking-v2", is a fine-tuned version of the [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), specifically tailored for the Banking domain. It is optimized to answer questions and assist users with various banking transactions. It has been trained using hybrid synthetic data generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools.
+ This model, "Mistral-7B-Banking", is a fine-tuned version of the [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), specifically tailored for the Banking domain. It is optimized to answer questions and assist users with various banking transactions. It has been trained using hybrid synthetic data generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools.

  The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. For example, if you are "ACME Bank", you can create your own customized model by using this fine-tuned model and doing an additional fine-tuning using a small amount of your own data. An overview of this approach can be found at: [From General-Purpose LLMs to Verticalized Enterprise Models](https://www.bitext.com/blog/general-purpose-models-verticalized-enterprise-genai/)

@@ -32,13 +32,26 @@ The goal of this model is to show that a generic verticalized model makes custom

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch

- model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Banking-v2")
- tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Banking-v2")
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'

- inputs = tokenizer("<s>[INST] How can I transfer money to another account?[/INST]", return_tensors="pt")
- outputs = model.generate(inputs['input_ids'], max_length=50)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ model = AutoModelForCausalLM.from_pretrained("bitext/Mistral-7B-Banking-v2")
+ tokenizer = AutoTokenizer.from_pretrained("bitext/Mistral-7B-Banking-v2")
+
+ messages = [
+     {"role": "system", "content": "You are an expert in customer support for Banking."},
+     {"role": "user", "content": "I want to open a bank account"},
+ ]
+
+ encoded = tokenizer.apply_chat_template(messages, return_tensors="pt")
+
+ model_inputs = encoded.to(device)
+ model.to(device)
+
+ generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
+ decoded = tokenizer.batch_decode(generated_ids)
+ print(decoded[0])
  ```

  ## Model Architecture
@@ -55,16 +68,16 @@ The model was fine-tuned on a dataset comprising various banking-related intents

  - **Optimizer**: AdamW
  - **Learning Rate**: 0.0002 with a cosine learning rate scheduler
- - **Epochs**: 4
- - **Batch Size**: 10
- - **Gradient Accumulation Steps**: 8
+ - **Epochs**: 3
+ - **Batch Size**: 4
+ - **Gradient Accumulation Steps**: 4
  - **Maximum Sequence Length**: 8192 tokens

  ### Environment

- - **Transformers Version**: 4.40.0.dev0
- - **Framework**: PyTorch 2.2.1+cu121
- - **Tokenizers**: Tokenizers 0.15.0
+ - **Transformers Version**: 4.43.4
+ - **Framework**: PyTorch 2.3.1+cu121
+ - **Tokenizers**: Tokenizers 0.19.1

  ## Limitations and Bias

@@ -81,7 +94,7 @@ This model was developed and trained by Bitext using proprietary data and techno

  ## License

- This model, "Mistral-7B-Banking-v2", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows for free use, modification, and distribution of the model but requires that proper credit be given to Bitext.
+ This model, "Mistral-7B-Banking", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows for free use, modification, and distribution of the model but requires that proper credit be given to Bitext.

  ### Key Points of the Apache 2.0 License

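
The hyperparameter hunk (`@@ -55,16 +68,16 @@`) lowers both the batch size and the gradient accumulation steps, which together determine how many sequences feed each optimizer step. The sketch below works through that arithmetic. It assumes, since the README does not say, that "Batch Size" is the per-device micro-batch and that training ran on a single device; `effective_batch_size` and `num_devices` are hypothetical names used only for illustration.

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of sequences that contribute to one optimizer step.

    Assumes batch_size is the per-device micro-batch (not stated in the README).
    """
    return batch_size * grad_accum_steps * num_devices


# Values removed by this commit: batch size 10, gradient accumulation 8.
old_effective = effective_batch_size(10, 8)
# Values added by this commit: batch size 4, gradient accumulation 4.
new_effective = effective_batch_size(4, 4)

print(f"old: {old_effective}, new: {new_effective}")  # old: 80, new: 16
```

Under those assumptions the effective batch drops from 80 to 16 sequences per step while the learning rate stays at 0.0002. If training used several devices, pass the device count as `num_devices` to scale both figures.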