
Temp, Will Delete Soon

#2
by deleted - opened
deleted
This comment has been hidden
deleted changed discussion status to closed
deleted
This comment has been hidden
Cognitive Computations org

@Phil337 Go create something.

deleted

@ehartford Not relevant.

Cognitive Computations org

Your whining and complaining isn't relevant.

deleted

@ehartford Sorry. Keep up the good work. You've created a lot of good things here.

Reading over my past comments, it's clear that I've been complaining more than testing and giving constructive feedback. But to be fair, people don't seem to like criticism, valid or not, least of all you.

And certainly not accusations of cheating, which I've made several times. But seriously, a Yi-34b with an MMLU of 85.6? Do I really have to create something before being allowed to accuse them of cheating?

https://huggingface.co/CausalLM/34b-beta

Anyway, I'm out. I'm not posting another thing to HF unless someone asks for a response. But know that every complaint I made was honest and based on countless hours of careful testing, and I would never make an accusation of cheating unless the odds were >99.9%. It's impossible to fine-tune a Yi-34b base with a 77 MMLU into an 85 MMLU LLM, and you know it, yet you jumped down my throat.

Thus concludes my whining complaint. Take care.

HiroseKoichi

Forgive me if I'm wrong, but wasn't your original comment telling him not to thank Elon Musk and complaining about the conspiracy theories you've been seeing on Twitter? I don't quite think that's constructive feedback on models...

deleted

@HiroseKoichi Yes, you're right. But we've exchanged hostile words for months, which included him repeatedly saying the exact phrase "Go create something". Although the sentiment is far too elitist and stupid for me to take seriously (no voice unless you create), it's still time for me to leave.

I'm growing progressively more frustrated trying to test modern models like Phi, Yi, and Qwen. They're discarding the bulk of popular knowledge to boost their MMLU scores at the same parameter count and training budget, so when I try to test them they hallucinate so frequently and so badly that I spend hours looking up their responses (e.g., claiming an 18th-century author starred in a popular 1990s movie). Frankly, it's cheating. Anybody, including Mistral and Meta, could have done the same. And watching them brag about beating those models is just too much.

It was nice seeing you test models, Phil. Don't underestimate the value your comments had for both model creators and users. Goodbye.
