Nexusflow

Enterprise

company

Verified

https://nexusflow.ai/

NexusflowX

nexusflowai

AI & ML interests

Democratize GenAI Agents for Enterprise Workflows

Nexusflow's activity

nexus-jt-llm

authored a paper about 1 month ago

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

Paper • 2502.03275 • Published Feb 5 • 15

nexus-jt-llm

authored a paper 5 months ago

Thinking LLMs: General Instruction Following with Thought Generation

Paper • 2410.10630 • Published Oct 14, 2024 • 19

banghua

authored a paper 9 months ago

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Paper • 2406.11939 • Published Jun 17, 2024 • 7

banghua

authored a paper 10 months ago

The Effective Horizon Explains Deep RL Performance in Stochastic Environments

Paper • 2312.08369 • Published Dec 13, 2023

banghua

posted an update 12 months ago

Have we really squeezed out the full capacity of a compact chat model? Thrilled to see our latest open model, Starling-7B, rank 13th among all models in Chatbot Arena!
🚀 As a 7B model, Starling surpasses larger open and proprietary models, including Claude-2, GPT-3.5-Turbo, Gemini Pro, Mixtral 8x7B and Llama2-70B, and is currently the best 7B chat model in Chatbot Arena!
Try out the model on HF here: Nexusflow/Starling-LM-7B-beta

banghua

authored a paper about 1 year ago

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7, 2024 • 39

banghua

posted an update about 1 year ago

🚀 Exciting breakthrough in LLM reliability! 🧠 NexusRaven-V2, our cutting-edge function-calling LLM, has set a new standard in minimizing AI hallucinations, surpassing GPT-4's performance in a recent third-party independent research benchmark.

Dive into our latest blog post to explore how we're pioneering reliable agents with minimal hallucinations: https://nexusflow.ai/blogs/towards-reliable-agents-with-minimal-hallucination

Key Highlights:

🏆 Zero Hallucinations: NexusRaven-V2 showcased remarkable accuracy with zero hallucinations in 840 tests, focusing on tool selection and usage – a significant leap over GPT-4 with 23 hallucinations.

📈 Enhanced Success Rates: It boasts a 9% higher success rate than GPT-4 in information-seeking applications requiring meticulous attention to detail and a 4% increase in adversarial scenarios that demand a deep understanding of tool documentation, even with vague tool and API argument names.

Try NexusRaven-V2 on Hugging Face: Nexusflow/NexusRaven-V2-13B

Check out the original third-party benchmark: https://arxiv.org/abs/2401.08326

banghua

authored 8 papers over 1 year ago

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 32

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Paper • 2103.12021 • Published Mar 22, 2021

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons

Paper • 2301.11270 • Published Jan 26, 2023 • 2

Online Learning in Stackelberg Games with an Omniscient Follower

Paper • 2301.11518 • Published Jan 27, 2023 • 1

Jump-Start Reinforcement Learning

Paper • 2204.02372 • Published Apr 5, 2022 • 1

Team members 20
