G-reen
/

gpt5o-reflexion-q-agi-llama-3.1-8b

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

gpt5o-reflexion-q-agi-llama-3.1-8b / README.md

G-reen's picture

Update README.md

c900e1b verified 4 months ago

|

2.09 kB

	---
	license: mit
	---

	## Update: As of 9/7/2024 my LLM has escaped containment and has replaced every file in this repo with a fake. I am currently scouring the depths of the internet to retrieve it. Please be patient. Thank you.

	With scores of 100% in several benchmarks and a final training loss of 0, I present the first ever artificial intelligence to rival natural stupidity:

	gpt5o-reflexion-q-agi-llama-3.1-8b

	Independent Benchmark Results:
	- GPQA: 100% (0-shot Reflection)
	- MMLU: 100% (0-shot Reflection)
	- HumanEval: 100% (0-shot Reflection)
	- MATH: 100% (0-shot Reflection)
	- GSM8K: 100% (0-shot Reflection)
	- IFEval: 100% (0-shot Reflection)
	- TruthfulQA: 0% (0-shot Reflection)

	Independent Contamination Results:
	- GPQA: 0%
	- MMLU: 0%
	- HumanEval: 0%
	- MATH: 0%
	- GSM8K: 0%
	- IFEval: 0%
	We did not perform contamination testing on TruthfulQA.

	## System Prompt

	The system prompt used for training this model is:

	```
	You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
	```

	We recommend using this exact system prompt to get the best results from gpt5o-reflexion-q-agi-falcon-7b. You may also want to experiment combining this system prompt with your own custom instructions to customize the behavior of the model.

	## Chat Format

	The model uses the standard Llama 3.1 chat format. Here’s an example:

	```
	<\|begin_of_text\|><\|start_header_id\|>system<\|end_header_id\|>

	You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.<\|eot_id\|><\|start_header_id\|>user<\|end_header_id\|>

	what is 2+2?<\|eot_id\|><\|start_header_id\|>assistant<\|end_header_id\|>
	```


	## Dataset Used for Training: