---
library_name: transformers
license: other
base_model: NousResearch/Hermes-3-Llama-3.1-8B
tags:
- llama-factory
- full
- unsloth
- generated_from_trainer
model-index:
- name: kimhyeongjun/Hermes-3-Llama-3.1-8B-Kor-Finance-Advisor
  results: []
---


# kimhyeongjun/Hermes-3-Llama-3.1-8B-Kor-Finance-Advisor

This is a personal toy project I built over Chuseok (Korean Thanksgiving Day).

This model is a fine-tuned version of [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) on the hand-built Korean_synthetic_financial_dataset_21K.

## Model description

Starting from finance-related PDF data collected directly from the web, we cleaned the raw data using the 'meta-llama/Meta-Llama-3.1-70B-Instruct' model. 
After generating synthetic data from the cleaned data, we evaluated the quality of the generated data using the 'meta-llama/Llama-Guard-3-8B' and 'RLHFlow/ArmoRM-Llama3-8B-v0.1' models. 
We then extracted embeddings with 'Alibaba-NLP/gte-large-en-v1.5' and used Faiss to run a Jaccard distance-based nearest-neighbor analysis, from which we assembled the final dataset of 21k examples.
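
The nearest-neighbor step described above can be illustrated with a small sketch. This is not the pipeline used to build the dataset: it uses plain Python instead of Faiss, toy vectors instead of real embeddings, and it assumes the neighbor analysis was used to drop near-duplicate samples, which the description does not state. The function names and the `threshold` value are hypothetical, and the weighted Jaccard formula used here assumes non-negative vector components.

```python
from typing import List, Sequence


def jaccard_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Weighted Jaccard distance: 1 - sum(min) / sum(max).

    Assumes non-negative components (an assumption of this sketch).
    """
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return 0.0 if den == 0 else 1.0 - num / den


def filter_near_duplicates(
    vectors: Sequence[Sequence[float]], threshold: float = 0.2
) -> List[int]:
    """Greedily keep a vector only if its nearest kept neighbor is
    at least `threshold` away (hypothetical selection rule)."""
    kept: List[int] = []
    for i, vec in enumerate(vectors):
        nearest = min(
            (jaccard_distance(vec, vectors[j]) for j in kept),
            default=1.0,  # nothing kept yet, so treat as maximally distant
        )
        if nearest >= threshold:
            kept.append(i)
    return kept


# Toy stand-ins for sample embeddings.
toy_vectors = [
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.9, 0.0],  # almost identical to the first vector
    [0.0, 1.0, 0.0, 1.0],  # shares nothing with the first vector
]
kept_indices = filter_near_duplicates(toy_vectors, threshold=0.2)
print(kept_indices)
```

With these toy vectors the second entry is dropped as a near-duplicate of the first, while the third is kept. At the scale of a real dataset, a Faiss index would replace the inner linear scan.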

## Task duration
3 days (2024-09-14 to 2024-09-16)

## Evaluation
None. No evaluation was run because I took the Chuseok holiday off.

## Sample

![image/png](https://cdn-uploads.huggingface.co/production/uploads/619d8e31c21bf5feb310bd82/gJ6hnvAV2Qx9774AFFwQe.png)

### Framework versions

- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1