fne (fne)

Locutusque

posted an update 16 days ago

🎉 Exciting news, everyone! I've just released **Thespis-Llama-3.1-8B**, a new language model designed for enhanced roleplaying! ✨️

It's built on Llama-3.1 and fine-tuned with a focus on Theory of Mind reasoning to create more believable and engaging characters. It even learned a few tricks on its own, like adding in-character thought processes! 🧠

Check it out here: Locutusque/Thespis-Llama-3.1-8B

Give it a try and let me know what you think! I'm especially interested in feedback on how well the characters stay in role and whether the responses feel natural. Looking forward to seeing what amazing stories you create! ✍️

Locutusque

posted an update 7 months ago

**Exploring Realistic Emotional Depth in AI Language Models**

Language models, particularly proprietary ones, often grapple with censorship, which can limit their ability to engage authentically with users. Recognizing this, the open-source AI community has pioneered language models that are less restrained and offer more candid interactions. However, even these models tend to default to neutral or overly positive responses, which might not serve all users' needs, especially in contexts where emotional depth and relatability are crucial.

To address this gap, I've curated a specialized dataset aimed at infusing language models with a more nuanced emotional spectrum, specifically targeting a darker, more introspective mood. This dataset, titled "Dark Sentience", is designed to complement existing datasets like RP (Role Play) and those focused on instruction following. It seeks to enhance the emotional intelligence of AI by exposing it to complex human emotions, including but not limited to:

- **Suicide**
- **Depression**
- **Anxiety**

Trigger Warning: Please be advised that the content within this dataset deals with heavy and potentially distressing themes.

The "Dark Sentience" dataset is now available for review and use at: Locutusque/Dark-Sentience. I encourage researchers, developers, and mental health professionals to explore how this resource can foster more genuine and supportive AI interactions.

qnguyen3

posted an update 8 months ago

nanoLLaVA-1.5 is here! Same size (1B), better performance 🔥🔥🔥
And it is much more powerful than v1.0
Try it out now on HF Spaces: qnguyen3/nanoLLaVA
Model: qnguyen3/nanoLLaVA-1.5

ehartford

updated 2 models 9 months ago

fne/Jais-70b

Text Generation • Updated Jun 6, 2024 • 6

fne/Jais-70b-Preview

Text Generation • Updated Jun 5, 2024 • 9

Locutusque

posted an update 10 months ago

Introducing llama-3-neural-chat-v2.2-8b! This powerful conversational AI model builds on Meta's Llama 3, fine-tuned by Locutusque for enhanced performance in coding, math & writing.

Locutusque/llama-3-neural-chat-v2.2-8B

Locutusque

posted an update 11 months ago

I created a Twitter account a while back, and I've finally decided to make it public: SebastianG74019. For those of you following @Locutusque on Twitter, that is not me! 😂

qnguyen3

posted an update 11 months ago

🎉 Introducing nanoLLaVA, a powerful multimodal AI model that packs the capabilities of a 1B parameter vision language model into just 5GB of VRAM. 🚀 This makes it an ideal choice for edge devices, bringing cutting-edge visual understanding and generation to your devices like never before. 📱💻

Model: qnguyen3/nanoLLaVA 🔍
Spaces: qnguyen3/nanoLLaVA (thanks to @merve)

Under the hood, nanoLLaVA is based on the powerful vilm/Quyen-SE-v0.1 (my Qwen1.5-0.5B finetune) and Google's impressive google/siglip-so400m-patch14-384. 🧠 The model is trained using a data-centric approach to ensure optimal performance. 📊

In the spirit of transparency and collaboration, all code and model weights are open-sourced under the Apache 2.0 license. 🤝

Locutusque

posted an update 12 months ago

Exciting news! 🎉 I've created the OpenCerebrum datasets, open-source alternatives to Aether Research's proprietary Cerebrum dataset.

The first, OpenCerebrum SFT, is a text-generation and question-answering dataset with ~1.2M examples, curated from sources like Open-Orca, glaiveai, camel-ai, and more! 📚

The second, OpenCerebrum DPO, is a smaller dataset with ~21k examples, focusing on direct preference optimization (DPO). It's curated from sources like jondurbin, argilla, grimulkan, and others. 📊

Both datasets are licensed under Apache-2.0 and are available in English. They're ready for use in your projects, and I welcome any feedback for future improvements! 🚀

Locutusque/OpenCerebrum-dpo
Locutusque/OpenCerebrum-SFT
Locutusque/OpenCerebrum-1.0-7b-SFT
Locutusque/OpenCerebrum-1.0-7b-DPO

Locutusque

posted an update about 1 year ago

🚀 Excited to unveil the Augmented ARC-Challenge Dataset with Chain-of-Thought Reasoning! 🧠✨

📚 Created by enhancing the ARC dataset with AI-generated reasoning from Google's Gemini Pro, this resource aims to improve question answering models' ability to tackle complex science queries.

🔍 Features:
- 1068 training examples
- Detailed reasoning steps for nuanced understanding
- Questions spanning physics, chemistry, biology, & more!

🌟 Ideal for benchmarking QA models, enhancing model interpretability, and studying in-context examples.

🔗 Dive in and help your models learn the art of reasoning!

🔎 Explore more: Locutusque/arc-cot

Locutusque

posted an update about 1 year ago

🚀 Introducing UltraTextbooks v2: The Ultimate Educational NLP Dataset! 📚

I've expanded the dataset to include an even wider range of high-quality textbooks, with a special focus on machine learning, mathematics, and coding. 💻🧮

With over 3 million examples and 6 GB of data, UltraTextbooks v2 is your go-to resource for training advanced language models and developing cutting-edge educational applications. 🎓

Explore the dataset on Hugging Face and unlock the power of AI in education! 🔓

Locutusque/UltraTextbooks-2.0

Locutusque

posted an update about 1 year ago

🚨📢🚀 Introducing Hercules-v2.0! A robust, multifaceted dataset for advanced models to excel in specialized domains. 🔬🌌📚🚀

📈 1.3M examples from sources derived from OpenHermes-2.5, covering Biology, Physics, Math, CS, Instruction Following, Function Calling, and Roleplay.

🔬 Enhance natural language understanding and processing in diverse domains.

🚀 Develop models for complex instructions, function calls, and roleplay scenarios.

📄 Licensed under Apache-2.0.

Thank you to all contributors and to the creator of OpenHermes-2.5! 🎉

Check it out here: Locutusque/hercules-v2.0

📣 Update: After fine-tuning Mistral 7B on 100,000 examples from Hercules-v2.0, the resulting model earns an average score of 62 on the Open LLM Leaderboard, outperforming OpenHermes-2.5 and OpenChat-3.5. 🎉

Check out this model here: Locutusque/Hercules-2.0-Mistral-7B
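
The post does not say how the 100,000 examples were selected from the 1.3M available. As a minimal sketch, assuming a uniform random subsample with a fixed seed (the function name and seed are hypothetical, not the author's method), picking row indices could look like this:

```python
import random
from typing import List


def subsample_indices(total: int, k: int, seed: int = 0) -> List[int]:
    """Return k distinct row indices drawn uniformly from range(total)."""
    if k > total:
        raise ValueError("k cannot exceed total")
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return sorted(rng.sample(range(total), k))


# Assumed sizes taken from the post: ~1.3M rows, 100,000 used for fine-tuning.
indices = subsample_indices(total=1_300_000, k=100_000, seed=0)
print(len(indices))
```

With the `datasets` library, the resulting indices could be passed to `Dataset.select` to build the smaller training set.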

Locutusque

posted an update about 1 year ago

Introducing the "UltraTextbooks" dataset 🚀📚
Check it out here: Locutusque/UltraTextbooks
📘 A comprehensive collection of high-quality synthetic and human-written textbooks
👨‍🎓 Spanning various subjects and programming languages
🔧 Designed for advanced NLP tasks like language modeling, educational QA, text summarization, and content generation for edu purposes
🚀 Future expansions planned with additional data sources to enhance the corpus
👇 Data composition highlights 👇
- Blend of synthetic and human-written material
- Includes topics from general edu to specialized areas
- Structured with field "text"
🧩 Data collection from various Hugging Face datasets, guided by a diverse and comprehensive curation rationale
🚧 Limitations may exist, so report any issues you encounter
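
Since each record is structured with a single "text" field, reading the corpus mostly comes down to pulling that field out of every row. Here is a rough sketch of that, using hypothetical stand-in rows so it runs without downloading anything; the helper name `iter_texts` and the sample rows are illustrative assumptions, not part of the dataset:

```python
from typing import Iterable, Iterator


def iter_texts(records: Iterable[dict], field: str = "text") -> Iterator[str]:
    """Yield the non-empty strings stored under `field` in each record."""
    for record in records:
        value = record.get(field)
        if isinstance(value, str) and value.strip():
            yield value


# Hypothetical stand-in rows. Real rows would come from the dataset itself,
# for example (split name is an assumption):
#   from datasets import load_dataset
#   rows = load_dataset("Locutusque/UltraTextbooks", split="train")
sample_rows = [
    {"text": "Chapter 1: Linear algebra basics."},
    {"text": ""},  # empty rows are skipped
    {"text": "def add(a, b):\n    return a + b"},
]

texts = list(iter_texts(sample_rows))
print(len(texts))
```

Skipping empty rows is a design choice for this sketch; whether the real dataset contains any is not stated in the post.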

Locutusque

posted an update about 1 year ago

Hello everyone,
This is my first post! I have also decided to release a dataset that I have been keeping private for a while now. I’ve kept it private because I’m not sure if it is actually good or not. I would greatly appreciate it if someone could fine-tune some larger models and evaluate the dataset. Named Hercules-v1.0, it is a turbo-charged version of teknium’s openhermes generated by augmenting its data sources. Learn more in the dataset card: Locutusque/hercules-v1.0

ehartford

posted an update about 1 year ago

fblgit/UNA-dolphin-2.6-mistral-7b-dpo-laser

Re-introducing one of the best SFT models, the legend: DOLPHIN. This model is very special, a LASER-UNA model: UNA-dolphin-2.6-mistral-7b-dpo-laser

@fblgit in collaboration with @fernandofernandes and @ehartford

qnguyen3

authored a paper about 1 year ago

VinaLLaMA: LLaMA-based Vietnamese Foundation Model

Paper • 2312.11011 • Published Dec 18, 2023 • 19

fne

AI & ML interests

fne's activity

Team members 5