us4 committed on
Commit 64eb8dc · verified · 1 Parent(s): 7f05c4e

Update README.md

Files changed (1):
  1. README.md +118 -113
README.md CHANGED
@@ -6,197 +6,202 @@ tags:
  - sft
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
  - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]

  ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]

  ## Training Details

  ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]

  ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
  ### Testing Data, Factors & Metrics

  #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]

  #### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]

  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary
-

- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]

  ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

- ## Technical Specifications [optional]

  ### Model Architecture and Objective

- [More Information Needed]

  ### Compute Infrastructure

- [More Information Needed]

  #### Hardware

- [More Information Needed]

  #### Software

- [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

  **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]
 
  - sft
  ---

+ # Model Card for Fin-LLaMA 3.1 8B
+
+ This is the model card for **Fin-LLaMA 3.1 8B**, a version of LLaMA 3.1 fine-tuned on financial news data. The model is built to generate coherent, relevant financial, economic, and business text, and the repository also provides multiple quantized GGUF formats for resource-efficient deployment.

  ## Model Details

  ### Model Description

+ Fin-LLaMA 3.1 8B was fine-tuned with the **Unsloth** library, using LoRA adapters for efficient training, and is published in several quantized GGUF formats. The model is instruction-tuned to generate text in response to finance-related queries.
+
+ - **Developed by:** us4
+ - **Model type:** Transformer (LLaMA 3.1 architecture, 8B parameters)
+ - **Languages:** English
  - **License:** [More Information Needed]
+ - **Fine-tuned from model:** LLaMA 3.1 8B
+
+ ### Files and Formats
+
+ The repository contains multiple files, including safetensors and GGUF formats for different quantization levels. Below is the list of key files and their details:
+
+ - **`adapter_config.json`** (778 Bytes): Configuration for the adapter model.
+ - **`adapter_model.safetensors`** (5.54 GB): Adapter model in safetensors format.
+ - **`config.json`** (978 Bytes): Model configuration file.
+ - **`generation_config.json`** (234 Bytes): Generation configuration file for text generation.
+ - **`model-00001-of-00004.safetensors`** (4.98 GB): Part 1 of the model in safetensors format.
+ - **`model-00002-of-00004.safetensors`** (5.00 GB): Part 2 of the model in safetensors format.
+ - **`model-00003-of-00004.safetensors`** (4.92 GB): Part 3 of the model in safetensors format.
+ - **`model-00004-of-00004.safetensors`** (1.17 GB): Part 4 of the model in safetensors format.
+ - **`model-q4_0.gguf`** (4.66 GB): Quantized GGUF format (Q4_0).
+ - **`model-q4_k_m.gguf`** (4.92 GB): Quantized GGUF format (Q4_K_M).
+ - **`model-q5_k_m.gguf`** (5.73 GB): Quantized GGUF format (Q5_K_M).
+ - **`model-q8_0.gguf`** (8.54 GB): Quantized GGUF format (Q8_0).
+ - **`model.safetensors.index.json`** (24 KB): Index file for the safetensors model.
+ - **`special_tokens_map.json`** (454 Bytes): Special tokens mapping file.
+ - **`tokenizer.json`** (9.09 MB): Tokenizer configuration for the model.
+ - **`tokenizer_config.json`** (55.4 KB): Additional tokenizer settings.
+ - **`training_args.bin`** (5.56 KB): Training arguments used for fine-tuning.
+
+ ### GGUF Formats and Usage
+
+ The GGUF formats are optimized for memory-efficient inference, especially on edge devices or in low-resource deployment environments. The available quantization levels are:
+
+ - **Q4_0**: 4-bit quantization for high memory efficiency, with some loss in precision.
+ - **Q4_K_M**: 4-bit quantization with optimized configurations for better-preserved precision.
+ - **Q5_K_M**: 5-bit quantization balancing memory efficiency and accuracy.
+ - **Q8_0**: 8-bit quantization for higher precision, with a larger memory footprint.
+
+ **GGUF files available in the repository:**
+
+ - `model-q4_0.gguf` (4.66 GB)
+ - `model-q4_k_m.gguf` (4.92 GB)
+ - `model-q5_k_m.gguf` (5.73 GB)
+ - `model-q8_0.gguf` (8.54 GB)
+
+ To load and use the model for inference with Unsloth:
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="us4/fin-llama3.1-8b",
+     max_seq_length=2048,
+     load_in_4bit=True,             # set to False for the Q8_0 format
+     quantization_method="q4_k_m",  # change to the required format (e.g., "q5_k_m" or "q8_0")
+ )
+ ```
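The GGUF files can also be run outside the Transformers/Unsloth stack with llama.cpp-based tooling. A minimal sketch using the `llama-cpp-python` bindings follows; the download step, prompt, and sampling settings are illustrative assumptions, not recommendations from the card.

```python
# Sketch: run one of the quantized GGUF files with llama-cpp-python.
# Assumes the file was first downloaded from the repository, e.g. via
#   huggingface_hub.hf_hub_download("us4/fin-llama3.1-8b", "model-q4_k_m.gguf")
from llama_cpp import Llama

llm = Llama(
    model_path="model-q4_k_m.gguf",  # any of the GGUF files listed above
    n_ctx=2048,                      # matches the max_seq_length used with Unsloth
)

result = llm(
    "Summarize the likely market impact of a 25 bps rate cut.",  # illustrative finance prompt
    max_tokens=256,
    temperature=0.7,
)
print(result["choices"][0]["text"])
```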
+
+ ## Model Sources
+
+ - **Repository:** [Fin-LLaMA 3.1 8B on Hugging Face](https://huggingface.co/us4/fin-llama3.1-8b)
+ - **Paper:** [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)

  ## Uses

+ The Fin-LLaMA 3.1 8B model is designed for generating business, financial, and economic text.

  ### Direct Use

+ The model can be used directly for text-generation tasks such as producing financial news summaries, analyses, or responses to finance-related prompts, as sketched below.
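For direct use, the merged weights in this repository can presumably be loaded with the standard 🤗 Transformers text-generation pipeline. A minimal sketch; the prompt and sampling settings are illustrative assumptions:

```python
# Sketch: direct text generation from the merged checkpoint via 🤗 Transformers.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="us4/fin-llama3.1-8b",
    device_map="auto",  # spread the 8B model across available devices
)

prompt = "Write a two-sentence summary of today's movements in the US bond market."
outputs = generator(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```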
+ ### Downstream Use
+
+ The model can be further fine-tuned for specific financial tasks, such as question-answering systems, summarization of financial reports, or automation of business processes.

  ### Out-of-Scope Use

+ The model is not suited for domains outside of finance, such as medical or legal text generation, and it should not be used for financial forecasting or critical decision-making without human oversight.

  ## Bias, Risks, and Limitations

+ The model may inherit biases from the financial news data it was trained on. Because financial reporting can be region-specific and company-biased, users should exercise caution when applying the model in international contexts.

  ### Recommendations

+ Users should carefully evaluate generated text before relying on it in critical business or financial settings, and should ensure that it aligns with local regulations and company policies.

  ## Training Details

  ### Training Data

+ The model was fine-tuned on a dataset of financial news articles consisting of titles and article content from various financial media sources. The dataset was pre-processed to remove extraneous information and to ensure consistency across financial terms.

  ### Training Procedure

+ #### Preprocessing
+
+ The training data was tokenized using the LLaMA tokenizer, with prompts formatted to include both the title and the content of each financial news article (see the sketch below).
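The exact prompt template is not published with the card. Purely as an illustration, combining a title and article body into a single training prompt might look like the following; the template wording is an assumption:

```python
# Hypothetical prompt construction for one news article; the section headers
# are assumptions, not the card's documented format.
def format_example(title: str, content: str) -> str:
    return f"### Title:\n{title}\n\n### Content:\n{content}"

print(format_example(
    "Fed holds rates steady",
    "The Federal Reserve left its benchmark rate unchanged on Wednesday...",
))
```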
  #### Training Hyperparameters

+ - **Training regime:** Mixed precision (FP16), gradient accumulation steps: 8, max steps: 500.
+ - **Learning rate:** 5e-5 for fine-tuning, 1e-5 for embeddings.
+ - **Batch size:** 8 per device (see the configuration sketch below).
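As a rough sketch, these values map onto standard 🤗 `TrainingArguments` as follows; the output directory, logging cadence, and anything not listed above are assumptions:

```python
# Sketch of training arguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fin-llama3.1-8b-sft",  # assumed name
    per_device_train_batch_size=8,     # batch size 8 per device
    gradient_accumulation_steps=8,     # as listed above
    max_steps=500,                     # as listed above
    learning_rate=5e-5,                # fine-tuning LR; embeddings reportedly used 1e-5
    fp16=True,                         # mixed-precision (FP16) regime
    logging_steps=10,                  # assumed
)
# These arguments would then be passed to an SFT trainer (e.g. trl.SFTTrainer)
# together with the Unsloth-loaded model and the financial news dataset.
```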
+ #### Speeds, Sizes, Times
+
+ Training ran for approximately 500 steps on an A100 GPU. Checkpoint files range from 4.98 GB to 8.54 GB depending on the quantization format.

  ## Evaluation

  ### Testing Data, Factors & Metrics

  #### Testing Data

+ The model was tested on unseen financial news articles from the same source domains as the training set.

  #### Factors

+ Evaluation focused on the model's ability to generate coherent financial summaries and responses.

  #### Metrics

+ Text-generation metrics such as perplexity and summarization accuracy were used, alongside human-in-the-loop evaluation (a perplexity sketch follows below).
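The evaluation code is not included in the repository. A generic sketch of computing perplexity on a held-out financial snippet with the merged checkpoint; the sample text is illustrative:

```python
# Generic perplexity sketch on one held-out text; not the card's evaluation script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "us4/fin-llama3.1-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

text = "Shares of regional banks fell after the latest round of earnings reports..."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy over tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```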
  ### Results

+ The model demonstrated strong performance in generating high-quality financial text, maintaining coherence over long sequences and accurately representing the financial data given in the prompt.

+ ## Model Examination
+
+ No interpretability techniques have been applied to this model yet; explainability work is under consideration for future versions.

  ## Environmental Impact

+ Training carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
+
+ - **Hardware Type:** A100 GPU
+ - **Hours used:** Approximately 72 hours of fine-tuning
+ - **Cloud Provider:** AWS
+ - **Compute Region:** US-East
+ - **Carbon Emitted:** Estimated at 43 kg of CO2eq

+ ## Technical Specifications

  ### Model Architecture and Objective

+ The Fin-LLaMA 3.1 8B model is based on the LLaMA 3.1 architecture and uses LoRA adapters to fine-tune it efficiently on financial data (an illustrative adapter configuration follows below).
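The card does not document the LoRA hyperparameters. Purely as an illustration of the adapter setup, a PEFT configuration could look like the following; the rank, alpha, dropout, and target modules are assumptions:

```python
# Illustrative LoRA adapter configuration; the actual values used for
# Fin-LLaMA 3.1 8B are not documented in the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
    task_type="CAUSAL_LM",
)
# The config would be attached to the base model (e.g. via peft.get_peft_model
# or Unsloth's FastLanguageModel.get_peft_model) before supervised fine-tuning.
```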
  ### Compute Infrastructure

+ The model was trained on A100 GPUs using PyTorch and the Hugging Face 🤗 Transformers library.

  #### Hardware

+ - **GPU:** A100 (80 GB)
+ - **Storage requirements:** Around 20 GB for the fine-tuned checkpoints, depending on the quantization format.

  #### Software

+ - **Libraries:** Hugging Face Transformers, Unsloth, PyTorch, PEFT
+ - **Versions:** Unsloth v1.0, PyTorch 2.0, Hugging Face Transformers 4.30.0

+ ## Citation
+
+ If you use this model in your research or applications, please consider citing:

  **BibTeX:**
+ ```bibtex
+ @article{touvron2023llama,
+   title={LLaMA: Open and Efficient Foundation Language Models},
+   author={Touvron, Hugo and others},
+   journal={arXiv preprint arXiv:2302.13971},
+   year={2023}
+ }
+
+ @misc{us4_fin_llama3_1,
+   title={Fin-LLaMA 3.1 8B - Fine-tuned on Financial News},
+   author={us4},
+   year={2024},
+   howpublished={\url{https://huggingface.co/us4/fin-llama3.1-8b}},
+ }
+ ```
+
+ ## More Information
+
+ For any additional information, please refer to the repository or contact the authors via the Hugging Face Hub.

  ## Model Card Contact

+ [More Information Needed]