kimhyeongjun committed · Commit 5255cd3 · Parent(s): f4fec9c · Update README.md
# kimhyeongjun/Hermes-3-Llama-3.1-8B-Ko-Finance-Advisors
This is a toy project to relieve the boredom of the Chuseok (Korean Thanksgiving) holiday.
This model is a fine-tuned version of [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) on the Korean_synthetic_financial_dataset_21K.
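Like its base model, this model expects ChatML-style prompt formatting. A minimal sketch of assembling such a prompt by hand (the helper name, system message, and example question are illustrative, not from the model card):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt as used by Hermes-3 models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful Korean finance advisor.",
    "Explain the difference between an ETF and a mutual fund.",
)
print(prompt)
```

In practice, calling `apply_chat_template` on the model's tokenizer produces this format automatically; the sketch only shows what the template expands to.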
## Model description
Based on finance PDF data collected directly from the web, we refined the raw data using the 'meta-llama/Meta-Llama-3.1-70B-Instruct' model.
After generating synthetic data from the cleaned data, we evaluated its quality using the 'meta-llama/Llama-Guard-3-8B' and 'RLHFlow/ArmoRM-Llama3-8B-v0.1' models.
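ArmoRM-style reward models assign each sample a scalar quality score, so the filtering step reduces to a threshold pass. A minimal sketch (the threshold value and the example samples and scores are illustrative assumptions, not from the model card):

```python
def filter_by_reward(samples, scores, threshold=0.5):
    """Keep only samples whose reward-model score meets the threshold."""
    return [s for s, score in zip(samples, scores) if score >= threshold]

samples = ["good answer", "weak answer", "great answer"]
scores = [0.82, 0.31, 0.95]  # hypothetical scalar rewards from a reward model
kept = filter_by_reward(samples, scores)
print(kept)
```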
We then used 'Alibaba-NLP/gte-large-en-v1.5' to extract embeddings and applied Faiss to perform Jaccard distance-based nearest-neighbor analysis, constructing a sophisticated, multidimensional final dataset of 21k examples.
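The deduplication idea behind the nearest-neighbor step can be sketched without the embedding machinery: compute Jaccard distance over token sets and greedily keep a sample only if it is not too close to anything already kept. This is a pure-Python simplification under assumed token-set inputs; the actual pipeline used gte-large-en-v1.5 embeddings with Faiss for neighbor search:

```python
def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over whitespace token sets."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)

def dedupe(samples, min_distance=0.5):
    """Greedily keep samples farther than min_distance from every kept one."""
    kept = []
    for s in samples:
        if all(jaccard_distance(s, k) >= min_distance for k in kept):
            kept.append(s)
    return kept

docs = [
    "what is a stock dividend",
    "what is a stock dividend payment",   # near-duplicate, gets dropped
    "how do bonds differ from equities",
]
print(dedupe(docs))
```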
35 |
|
36 |
+
## Task duration

3 days (2024-09-14 to 2024-09-16)
## Sample