Create README.md
Browse files
README.md
ADDED
@@ -0,0 +1,43 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
library_name: transformers
|
3 |
+
license: other
|
4 |
+
base_model: NousResearch/Hermes-3-Llama-3.1-8B
|
5 |
+
tags:
|
6 |
+
- llama-factory
|
7 |
+
- full
|
8 |
+
- unsloth
|
9 |
+
- generated_from_trainer
|
10 |
+
model-index:
|
11 |
+
- name: kimhyeongjun/Hermes-3-Llama-3.1-8B-Ko-Finance-Advisors
|
12 |
+
results: []
|
13 |
+
---
|
14 |
+
|
15 |
+
|
16 |
+
# kimhyeongjun/Hermes-3-Llama-3.1-8B-Ko-Finance-Advisors
|
17 |
+
|
18 |
+
This model is a fine-tuned version of [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) on the Korean_synthetic_financial_dataset_21K.
|
19 |
+
|
20 |
+
์ด ๋ชจ๋ธ์ ํ๊ตญ_ํฉ์ฑ_๊ธ์ต_๋ฐ์ดํฐ์
_21K์ [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B)๋ฅผ ๋ฏธ์ธ ์กฐ์ ํ ๋ฒ์ ์
๋๋ค.
|
21 |
+
|
22 |
+
## Model description
|
23 |
+
Based on finance PDF data collected directly from the web, we refined the raw data using the 'meta-llama/Meta-Llama-3.1-70B-Instruct' model.
|
24 |
+
After generating synthetic data based on the cleaned data, we further evaluated the quality of the generated data using the 'meta-llama/Llama-Guard-3-8B' and 'RLHFlow/ArmoRM-Llama3-8B-v0.1' models.
|
25 |
+
We then used 'Alibaba-NLP/gte-large-en-v1.5' to extract embeddings and applied Faiss to perform Jaccard distance-based nearest neighbor analysis to construct the final dataset of 21k, which is multidimensional and sophisticated.
|
26 |
+
|
27 |
+
์น์์ ์ง์ ์์งํ ๊ธ์ต ๊ด๋ จ PDF ๋ฐ์ดํฐ๋ฅผ ๊ธฐ๋ฐ์ผ๋ก, 'meta-llama/Meta-Llama-3.1-70B-Instruct' ๋ชจ๋ธ์ ํ์ฉํ์ฌ ์์ ๋ฐ์ดํฐ๋ฅผ ์ ์ ํ์์ต๋๋ค.
|
28 |
+
์ ์ ๋ ๋ฐ์ดํฐ๋ฅผ ๋ฐํ์ผ๋ก ํฉ์ฑ ๋ฐ์ดํฐ๋ฅผ ์์ฑํ ํ, 'meta-llama/Llama-Guard-3-8B' ๋ฐ 'RLHFlow/ArmoRM-Llama3-8B-v0.1' ๋ชจ๋ธ์ ํตํด ์์ฑ๋ ๋ฐ์ดํฐ์ ํ์ง์ ์ฌ์ธต์ ์ผ๋ก ํ๊ฐํ์์ต๋๋ค.
|
29 |
+
์ด์ด์ 'Alibaba-NLP/gte-large-en-v1.5'๋ฅผ ์ฌ์ฉํ์ฌ ์๋ฒ ๋ฉ์ ์ถ์ถํ๊ณ , Faiss๋ฅผ ์ ์ฉํ์ฌ ์์นด๋ ๊ฑฐ๋ฆฌ ๊ธฐ๋ฐ์ ๊ทผ์ ์ด์ ๋ถ์์ ์ํํจ์ผ๋ก์จ ๋ค์ฐจ์์ ์ด๊ณ ์ ๊ตํ ์ต์ข
๋ฐ์ดํฐ์
21k์ ๊ตฌ์ฑํ์์ต๋๋ค.
|
30 |
+
|
31 |
+
|
32 |
+
## sample
|
33 |
+
|
34 |
+
![image/png](https://cdn-uploads.huggingface.co/production/uploads/619d8e31c21bf5feb310bd82/gJ6hnvAV2Qx9774AFFwQe.png)
|
35 |
+
|
36 |
+
### Framework versions
|
37 |
+
|
38 |
+
- Transformers 4.44.2
|
39 |
+
- Pytorch 2.4.0+cu121
|
40 |
+
- Datasets 2.21.0
|
41 |
+
- Tokenizers 0.19.1
|
42 |
+
|
43 |
+
|