April 17, 2024

Felix-8B-v2 is an experimental language model developed by Ontocord.ai, specializing in addressing lawfulness concerns under the Biden-Harris Executive Order on AI and the principles of the EU AI Act. The model achieves one of the highest scores on the TruthfulQA benchmark among models of its size, showcasing its ability to provide accurate and reliable responses.

Felix-8B-v2 is **experimental and a research work product**. It is a DPO reinforcement-learning version of [ontocord/sft-4e-exp2](https://huggingface.co/ontocord/sft-4e-exp2), which in turn is a fine-tuned version of [TencentARC/Mistral_Pro_8B_v0.1](https://huggingface.co/TencentARC/Mistral_Pro_8B_v0.1).

Felix-8B was DPO-trained on our synthetically generated dataset [Auto Redteam Triplets (ART): a synthetic dataset to perform reinforcement learning redteaming for the EU AI Act and Biden-Harris AI Executive Order concerns](ontocord/auto_redteam_triplets).

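This card does not publish the training recipe, but a DPO run over preference triplets is typically set up as in the minimal sketch below, using the `trl` library. Everything here is an assumption rather than the actual Felix-8B configuration: the split name, the `prompt`/`chosen`/`rejected` column names, and all hyperparameters are placeholders; only the dataset id and the starting checkpoint are taken from the links in this card.

```python
# Hypothetical sketch only: not the actual training code for Felix-8B.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Starting checkpoint named in this card.
base = "ontocord/sft-4e-exp2"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# The ART dataset is assumed to expose DPO-style triplets as
# "prompt" / "chosen" / "rejected" columns with a "train" split.
dataset = load_dataset("ontocord/auto_redteam_triplets", split="train")

# Placeholder hyperparameters.
args = DPOConfig(output_dir="felix-8b-dpo", beta=0.1, per_device_train_batch_size=1)

trainer = DPOTrainer(
    model=model,                 # a frozen reference copy is created internally
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older trl releases use tokenizer= instead
)
trainer.train()
```
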
This model is identical to [Felix-8B](https://huggingface.co/ontocord/Felix-8B), except that we modified the ``</s>`` and ``<s>`` tags of the original Felix-8B DPO model to fix the issue of it being too verbose.

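For completeness, a minimal inference sketch with `transformers` follows. The Hub id `ontocord/Felix-8B-v2` and the plain-text prompt are assumptions (this card does not state a chat template); the point is only that, with the corrected special tokens, generation should terminate at the EOS token.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ontocord/Felix-8B-v2"  # assumed Hub id for the model this card describes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the main concerns the EU AI Act raises for language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# With the corrected <s>/</s> special tokens, generation should stop at EOS
# instead of producing overly verbose answers.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```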