princeton-nlp/Llama-3-8B-ProLong-512k-Instruct

princeton-nlp committed on Aug 28

Commit eae0626 • 1 Parent(s): 641094a

Update README.md

Files changed (1)

README.md +6 -1

@@ -33,10 +33,15 @@ Contact: `{tianyug, awettig}@princeton.edu`
 ## Benchmarking results
+64K result:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/607f846419a5af0183d7bfb9/78ppz9Y3-_fROfYIV_Re4.png)
+512K result:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/607f846419a5af0183d7bfb9/pwdlpXKSG68V0MxFKNOn1.png)
 You can find results for more tasks and models in this [spreadsheet](https://docs.google.com/spreadsheets/d/1qGzimBE8F896p1m7_yWHnjyGX7kpEAeyaT1h2iTbNzE/edit?usp=sharing). In this detailed results, we show that our model can retain the original Llama-3's general LM performance (on tasks selected by the [HF Open LLM Leaderboard v1](https://huggingface.co/spaces/open-llm-leaderboard-old/open_llm_leaderboard)). This is non-trivial in long-context fine-tuning and requires a careful selection of the fine-tuning data mixture and the training configurations.