kimhyeongjun committed on
Commit 9b7a5d6 • 1 Parent(s): 18eea68

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -25,14 +25,19 @@ This model is a fine-tuned version of [NousResearch/Hermes-3-Llama-3.1-8B](https
  ## Model description

+ Everything happened automatically without any user intervention.
+
  Based on finance PDF data collected directly from the web, we refined the raw data using the 'meta-llama/Meta-Llama-3.1-70B-Instruct' model.
  After generating synthetic data based on the cleaned data, we further evaluated the quality of the generated data using the 'meta-llama/Llama-Guard-3-8B' and 'RLHFlow/ArmoRM-Llama3-8B-v0.1' models.
  We then used 'Alibaba-NLP/gte-large-en-v1.5' to extract embeddings and applied Faiss to perform Jaccard distance-based nearest neighbor analysis to construct the final dataset of 21k, which is multidimensional and sophisticated.

+ The entire process ran automatically, without any user intervention.
+
  Based on finance-related PDF data collected directly from the web, we refined the raw data using the 'meta-llama/Meta-Llama-3.1-70B-Instruct' model.
  After generating synthetic data from the refined data, we evaluated the quality of the generated data in depth using the 'meta-llama/Llama-Guard-3-8B' and 'RLHFlow/ArmoRM-Llama3-8B-v0.1' models.
  We then extracted embeddings with 'Alibaba-NLP/gte-large-en-v1.5' and applied Faiss to perform Jaccard distance-based nearest neighbor analysis, thereby directly constructing the multidimensional and sophisticated final 21k dataset.

+
  ## Task duration
  3 days (20240914~20240916)
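For readers unfamiliar with the "Jaccard distance-based nearest neighbor analysis" mentioned under Model description, here is a minimal sketch of the idea. It is not the author's pipeline: the README does not say how the neighbors were used, so the near-duplicate filter, the `threshold` value, the function names, and the toy vectors are all assumptions made for illustration. Plain Python brute-force search stands in for Faiss, and small hand-written non-negative vectors stand in for the 'Alibaba-NLP/gte-large-en-v1.5' embeddings.

```python
from typing import List, Sequence, Tuple


def jaccard_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Generalized Jaccard distance, 1 - sum(min) / sum(max).

    Assumes non-negative components (an assumption of this sketch).
    """
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return 0.0 if den == 0 else 1.0 - num / den


def nearest_neighbor(idx: int, vectors: List[Sequence[float]]) -> Tuple[int, float]:
    """Brute-force search for the closest *other* vector.

    A vector index library would accelerate this step on real data.
    """
    best = min(
        (j for j in range(len(vectors)) if j != idx),
        key=lambda j: jaccard_distance(vectors[idx], vectors[j]),
    )
    return best, jaccard_distance(vectors[idx], vectors[best])


def drop_near_duplicates(vectors: List[Sequence[float]], threshold: float = 0.1) -> List[int]:
    """Greedy filter (hypothetical use of the neighbor analysis):
    keep a sample only if it is farther than `threshold` from every kept sample.
    """
    kept: List[int] = []
    for i, v in enumerate(vectors):
        if all(jaccard_distance(v, vectors[k]) > threshold for k in kept):
            kept.append(i)
    return kept


# Toy stand-ins for embedding vectors (illustrative values only).
toy = [
    [1.0, 0.0, 2.0],
    [1.0, 0.0, 1.9],  # almost identical to the first vector
    [0.0, 3.0, 0.0],
]

if __name__ == "__main__":
    print(nearest_neighbor(0, toy))
    print(drop_near_duplicates(toy))
```

On the toy data, the second vector is the nearest neighbor of the first and falls under the threshold, so the filter keeps only the first and third samples. With real embeddings, which can contain negative components, the distance definition and the threshold would need to be chosen with care.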