us4 committed on
Commit 64eb8dc · verified · 1 Parent(s): 7f05c4e

Update README.md

Files changed (1):
  1. README.md +118 -113
README.md CHANGED
@@ -6,197 +6,202 @@ tags:
  - sft
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
  - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]

  ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]

  ## Training Details

  ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]

  ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
  ### Testing Data, Factors & Metrics

  #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]

  #### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]

  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary
-

- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]

  ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

- ## Technical Specifications [optional]

  ### Model Architecture and Objective

- [More Information Needed]

  ### Compute Infrastructure

- [More Information Needed]

  #### Hardware

- [More Information Needed]

  #### Software

- [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

  **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]
 
  - sft
  ---

+ # Model Card for Fin-LLaMA 3.1 8B
+
+ This is the model card for **Fin-LLaMA 3.1 8B**, a version of LLaMA 3.1 fine-tuned on financial news data. The model is built to generate coherent, relevant financial, economic, and business text, and the repository also provides multiple quantized GGUF formats for resource-efficient deployment.

  ## Model Details

  ### Model Description

+ Fin-LLaMA 3.1 8B was fine-tuned with the **Unsloth** library, using LoRA adapters for efficient training, and is published in several quantized GGUF formats. The model is instruction-tuned to generate text in response to finance-related queries.
+
+ - **Developed by:** us4
+ - **Model type:** Transformer (LLaMA 3.1 architecture, 8B parameters)
+ - **Languages:** English
  - **License:** [More Information Needed]
+ - **Fine-tuned from model:** LLaMA 3.1 8B
+
+ ### Files and Formats
+
+ The repository contains multiple files, including safetensors and GGUF formats for different quantization levels. Below is the list of key files and their details:
+
+ - **`adapter_config.json`** (778 Bytes): Configuration for the adapter model.
+ - **`adapter_model.safetensors`** (5.54 GB): Adapter model in safetensors format.
+ - **`config.json`** (978 Bytes): Model configuration file.
+ - **`generation_config.json`** (234 Bytes): Generation configuration file for text generation.
+ - **`model-00001-of-00004.safetensors`** (4.98 GB): Part 1 of the model in safetensors format.
+ - **`model-00002-of-00004.safetensors`** (5.00 GB): Part 2 of the model in safetensors format.
+ - **`model-00003-of-00004.safetensors`** (4.92 GB): Part 3 of the model in safetensors format.
+ - **`model-00004-of-00004.safetensors`** (1.17 GB): Part 4 of the model in safetensors format.
+ - **`model-q4_0.gguf`** (4.66 GB): Quantized GGUF format (Q4_0).
+ - **`model-q4_k_m.gguf`** (4.92 GB): Quantized GGUF format (Q4_K_M).
+ - **`model-q5_k_m.gguf`** (5.73 GB): Quantized GGUF format (Q5_K_M).
+ - **`model-q8_0.gguf`** (8.54 GB): Quantized GGUF format (Q8_0).
+ - **`model.safetensors.index.json`** (24 KB): Index file for the safetensors model.
+ - **`special_tokens_map.json`** (454 Bytes): Special tokens mapping file.
+ - **`tokenizer.json`** (9.09 MB): Tokenizer configuration for the model.
+ - **`tokenizer_config.json`** (55.4 KB): Additional tokenizer settings.
+ - **`training_args.bin`** (5.56 KB): Training arguments used for fine-tuning.
+
+ ### GGUF Formats and Usage
+
+ The GGUF formats are optimized for memory-efficient inference, especially on edge devices or in low-resource deployment environments. The available quantization levels are:
+
+ - **Q4_0**: 4-bit quantization for high memory efficiency, with some loss in precision.
+ - **Q4_K_M**: 4-bit quantization with optimized configurations for better-preserved precision.
+ - **Q5_K_M**: 5-bit quantization balancing memory efficiency and accuracy.
+ - **Q8_0**: 8-bit quantization for higher precision, with a larger memory footprint.
+
+ **GGUF files available in the repository:**
+
+ - `model-q4_0.gguf` (4.66 GB)
+ - `model-q4_k_m.gguf` (4.92 GB)
+ - `model-q5_k_m.gguf` (5.73 GB)
+ - `model-q8_0.gguf` (8.54 GB)
+
+ To load and use the model for inference with Unsloth:
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="us4/fin-llama3.1-8b",
+     max_seq_length=2048,
+     load_in_4bit=True,             # set to False for the Q8_0 format
+     quantization_method="q4_k_m",  # change to the required format (e.g., "q5_k_m" or "q8_0")
+ )
+ ```
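The GGUF files can also be run outside the Transformers/Unsloth stack with llama.cpp-based tooling. A minimal sketch using the `llama-cpp-python` bindings follows; the download step, prompt, and sampling settings are illustrative assumptions, not recommendations from the card.

```python
# Sketch: run one of the quantized GGUF files with llama-cpp-python.
# Assumes the file was first downloaded from the repository, e.g. via
#   huggingface_hub.hf_hub_download("us4/fin-llama3.1-8b", "model-q4_k_m.gguf")
from llama_cpp import Llama

llm = Llama(
    model_path="model-q4_k_m.gguf",  # any of the GGUF files listed above
    n_ctx=2048,                      # matches the max_seq_length used with Unsloth
)

result = llm(
    "Summarize the likely market impact of a 25 bps rate cut.",  # illustrative finance prompt
    max_tokens=256,
    temperature=0.7,
)
print(result["choices"][0]["text"])
```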
+
+ ## Model Sources
+
+ - **Repository:** [Fin-LLaMA 3.1 8B on Hugging Face](https://huggingface.co/us4/fin-llama3.1-8b)
+ - **Paper:** [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)

  ## Uses

+ The Fin-LLaMA 3.1 8B model is designed for generating business, financial, and economic text.

  ### Direct Use

+ The model can be used directly for text-generation tasks such as producing financial news summaries, analyses, or responses to finance-related prompts, as sketched below.
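For direct use, the merged weights in this repository can presumably be loaded with the standard 🤗 Transformers text-generation pipeline. A minimal sketch; the prompt and sampling settings are illustrative assumptions:

```python
# Sketch: direct text generation from the merged checkpoint via 🤗 Transformers.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="us4/fin-llama3.1-8b",
    device_map="auto",  # spread the 8B model across available devices
)

prompt = "Write a two-sentence summary of today's movements in the US bond market."
outputs = generator(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```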
+ ### Downstream Use
+
+ The model can be further fine-tuned for specific financial tasks, such as question-answering systems, summarization of financial reports, or automation of business processes.

  ### Out-of-Scope Use

+ The model is not suited for domains outside of finance, such as medical or legal text generation, and it should not be used for financial forecasting or critical decision-making without human oversight.

  ## Bias, Risks, and Limitations

+ The model may inherit biases from the financial news data it was trained on. Because financial reporting can be region-specific and company-biased, users should exercise caution when applying the model in international contexts.

  ### Recommendations

+ Users should carefully evaluate generated text before relying on it in critical business or financial settings, and should ensure that it aligns with local regulations and company policies.

  ## Training Details

  ### Training Data

+ The model was fine-tuned on a dataset of financial news articles consisting of titles and article content from various financial media sources. The dataset was pre-processed to remove extraneous information and to ensure consistency across financial terms.

  ### Training Procedure

+ #### Preprocessing
+
+ The training data was tokenized using the LLaMA tokenizer, with prompts formatted to include both the title and the content of each financial news article (see the sketch below).
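The exact prompt template is not published with the card. Purely as an illustration, combining a title and article body into a single training prompt might look like the following; the template wording is an assumption:

```python
# Hypothetical prompt construction for one news article; the section headers
# are assumptions, not the card's documented format.
def format_example(title: str, content: str) -> str:
    return f"### Title:\n{title}\n\n### Content:\n{content}"

print(format_example(
    "Fed holds rates steady",
    "The Federal Reserve left its benchmark rate unchanged on Wednesday...",
))
```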
  #### Training Hyperparameters

+ - **Training regime:** Mixed precision (FP16), gradient accumulation steps: 8, max steps: 500.
+ - **Learning rate:** 5e-5 for fine-tuning, 1e-5 for embeddings.
+ - **Batch size:** 8 per device (see the configuration sketch below).
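As a rough sketch, these values map onto standard 🤗 `TrainingArguments` as follows; the output directory, logging cadence, and anything not listed above are assumptions:

```python
# Sketch of training arguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fin-llama3.1-8b-sft",  # assumed name
    per_device_train_batch_size=8,     # batch size 8 per device
    gradient_accumulation_steps=8,     # as listed above
    max_steps=500,                     # as listed above
    learning_rate=5e-5,                # fine-tuning LR; embeddings reportedly used 1e-5
    fp16=True,                         # mixed-precision (FP16) regime
    logging_steps=10,                  # assumed
)
# These arguments would then be passed to an SFT trainer (e.g. trl.SFTTrainer)
# together with the Unsloth-loaded model and the financial news dataset.
```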
+ #### Speeds, Sizes, Times
+
+ Training ran for approximately 500 steps on an A100 GPU. Checkpoint files range from 4.98 GB to 8.54 GB depending on the quantization format.

  ## Evaluation

  ### Testing Data, Factors & Metrics

  #### Testing Data

+ The model was tested on unseen financial news articles from the same source domains as the training set.

  #### Factors

+ Evaluation focused on the model's ability to generate coherent financial summaries and responses.

  #### Metrics

+ Text-generation metrics such as perplexity and summarization accuracy were used, alongside human-in-the-loop evaluation (a perplexity sketch follows below).
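The evaluation code is not included in the repository. A generic sketch of computing perplexity on a held-out financial snippet with the merged checkpoint; the sample text is illustrative:

```python
# Generic perplexity sketch on one held-out text; not the card's evaluation script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "us4/fin-llama3.1-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

text = "Shares of regional banks fell after the latest round of earnings reports..."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy over tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```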
  ### Results

+ The model demonstrated strong performance in generating high-quality financial text, maintaining coherence over long sequences and accurately representing the financial data given in the prompt.

+ ## Model Examination
+
+ No interpretability techniques have been applied to this model yet; explainability work is under consideration for future versions.

  ## Environmental Impact

+ Training carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
+
+ - **Hardware Type:** A100 GPU
+ - **Hours used:** Approximately 72 hours of fine-tuning
+ - **Cloud Provider:** AWS
+ - **Compute Region:** US-East
+ - **Carbon Emitted:** Estimated at 43 kg of CO2eq

+ ## Technical Specifications

  ### Model Architecture and Objective

+ The Fin-LLaMA 3.1 8B model is based on the LLaMA 3.1 architecture and uses LoRA adapters to fine-tune it efficiently on financial data (an illustrative adapter configuration follows below).
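The card does not document the LoRA hyperparameters. Purely as an illustration of the adapter setup, a PEFT configuration could look like the following; the rank, alpha, dropout, and target modules are assumptions:

```python
# Illustrative LoRA adapter configuration; the actual values used for
# Fin-LLaMA 3.1 8B are not documented in the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
    task_type="CAUSAL_LM",
)
# The config would be attached to the base model (e.g. via peft.get_peft_model
# or Unsloth's FastLanguageModel.get_peft_model) before supervised fine-tuning.
```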
  ### Compute Infrastructure

+ The model was trained on A100 GPUs using PyTorch and the Hugging Face 🤗 Transformers library.

  #### Hardware

+ - **GPU:** A100 (80 GB)
+ - **Storage requirements:** Around 20 GB for the fine-tuned checkpoints, depending on the quantization format.

  #### Software

+ - **Libraries:** Hugging Face Transformers, Unsloth, PyTorch, PEFT
+ - **Versions:** Unsloth v1.0, PyTorch 2.0, Hugging Face Transformers 4.30.0

+ ## Citation
+
+ If you use this model in your research or applications, please consider citing:

  **BibTeX:**
+ ```bibtex
+ @article{touvron2023llama,
+   title={LLaMA: Open and Efficient Foundation Language Models},
+   author={Touvron, Hugo and others},
+   journal={arXiv preprint arXiv:2302.13971},
+   year={2023}
+ }
+
+ @misc{us4_fin_llama3_1,
+   title={Fin-LLaMA 3.1 8B - Fine-tuned on Financial News},
+   author={us4},
+   year={2024},
+   howpublished={\url{https://huggingface.co/us4/fin-llama3.1-8b}},
+ }
+ ```
+
+ ## More Information
+
+ For any additional information, please refer to the repository or contact the authors via the Hugging Face Hub.

  ## Model Card Contact

+ [More Information Needed]