EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Abstract
This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research. The EXAONE 3.5 language models are offered in three configurations: 32B, 7.8B, and 2.4B. These models feature several standout capabilities: 1) exceptional instruction-following capabilities in real-world scenarios, achieving the highest scores across seven benchmarks, 2) outstanding long-context comprehension, attaining the top performance in four benchmarks, and 3) competitive results compared to state-of-the-art open models of similar sizes across nine general benchmarks. The EXAONE 3.5 language models are open to anyone for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE. For commercial use, please reach out to the official contact point of LG AI Research: contact_us@lgresearch.ai.
Community
Can this run on an Intel CPU? Can I run without Transformers?
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- The Systems Engineering Approach in Times of Large Language Models (2024)
- DroidCall: A Dataset for LLM-powered Android Intent Invocation (2024)
- Evaluating and Aligning CodeLLMs on Human Preference (2024)
- PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback (2024)
- The Zamba2 Suite: Technical Report (2024)
- FullStack Bench: Evaluating LLMs as Full Stack Coders (2024)
- OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model (2024)
Models citing this paper: 11
Datasets citing this paper: 0