arhamk committed
Commit 3ce10b9
1 Parent(s): 90d1dd2

Update README.md

Files changed (1)
  1. README.md +46 -2
README.md CHANGED
---
tags:
- autotrain
- text-generation
- finance
widget:
- text: 'I love AutoTrain because '
license: apache-2.0
datasets:
- AdiOO7/llama-2-finance
---

# Model Trained Using AutoTrain

This repository contains the code for training an advanced language model using the autotrain library from Hugging Face. The goal of this project is to fine-tune a pre-trained language model on financial data to improve its performance on downstream tasks such as sentiment analysis and named entity recognition.

## Installation

To use this repository, you will need to have Python installed along with the following packages:

```bash
pip install autotrain-advanced
pip install huggingface_hub
```

## Training Data

The training data for this project consists of financial news articles scraped from various sources. These articles were selected based on their relevance to the stock market and other financial topics. The data was then cleaned and processed into a format suitable for training a language model.
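As an illustration of the cleaning and formatting step described above, here is a minimal sketch. The field names and the prompt layout are assumptions for illustration only; the actual scraping and processing pipeline is not included in this repository.

```python
# Sketch of turning a scraped article into a single training text record.
# The dict keys ("headline", "body") and the "### ..." layout are
# hypothetical; substitute whatever your scraper actually produces.
def format_article(article: dict) -> dict:
    """Collapse stray whitespace and wrap an article as one training text."""
    headline = " ".join(article["headline"].split())
    body = " ".join(article["body"].split())
    return {"text": f"### Headline: {headline}\n### Article: {body}"}

articles = [
    {"headline": "  Markets rally ", "body": "Stocks rose  sharply today."},
]
records = [format_article(a) for a in articles]
```

Each record ends up with a single `text` field, which matches the flat text-column format that SFT-style trainers typically expect.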

## Model Architecture

The base model used for this project is TinyPixel/Llama-2-7B-bf16-sharded, a sharded bf16 checkpoint of Llama 2 7B. Unlike BERT-style bidirectional encoders, Llama 2 is a decoder-only transformer trained for causal (left-to-right) language modeling; it uses a SentencePiece byte-pair-encoding tokenizer and rotary positional embeddings to represent input sequences. The finance specialization comes from fine-tuning this base model on the dataset described above.

## Hyperparameters

The hyperparameters adjusted for this project are listed below:

- Learning rate: 0.0002
- Train batch size: 4
- Number of epochs: 1
- Trainer: SFT (Supervised Fine-Tuning)
- FP16 precision: True
- Maximum sequence length: 512

Other settings were left at their defaults. These hyperparameters were chosen based on experience with similar projects and the characteristics of the training data. However, feel free to experiment with different values to see if they produce better results.
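For reference, the settings above can be written out as a configuration dict. The keyword names below follow the Hugging Face `TrainingArguments` / SFT-trainer conventions and are an illustrative sketch; autotrain's internal parameter names may differ.

```python
# Hyperparameters from the list above, using TrainingArguments-style
# keyword names (an illustrative sketch, not autotrain's actual config).
training_config = {
    "learning_rate": 2e-4,             # Learning rate: 0.0002
    "per_device_train_batch_size": 4,  # Train batch size: 4
    "num_train_epochs": 1,             # Number of epochs: 1
    "fp16": True,                      # FP16 mixed-precision training
    "max_seq_length": 512,             # Maximum sequence length (SFT-trainer setting)
}
```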

## Conclusion

This project demonstrates how to fine-tune a pre-trained language model on financial data using the autotrain library from Hugging Face. By adjusting the hyperparameters, you can tailor the training process to suit your specific needs.