danlou committed
Commit 54f0455 · verified · 1 Parent(s): 63d45a1

Update README.md

Files changed (1):
  1. README.md +29 -10
README.md CHANGED
@@ -12,16 +12,17 @@ library_name: transformers
 
 # 📟 Relay v0.1 (Mistral Nemo 2407)
 
- <img src="https://cdn-uploads.huggingface.co/production/uploads/60f808c5c1adf9100f1f263c/rNGTfSfFWyWc9mEgyxTGL.png" width="800" />
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/60f808c5c1adf9100f1f263c/rNGTfSfFWyWc9mEgyxTGL.png" width="800"/>
 
- - [Introduction: LLMs as IRC](#introduction-llm-as-irc)
+ - [Introduction: LLMs as IRC](#introduction-llms-as-ircs)
 - [How to use](#how-to-use)
 - [Safety testing](#safety-testing)
 - [Fine-tuning setup](#fine-tuning-setup)
 - [Limitations](#limitations)
 - [License](#license)
+ - [Citation](#citation)
 
- ## Introduction: LLM as IRC
+ ## Introduction: LLMs as IRCs
 
 Relay is motivated by this question: What does it take to chat with a base LLM?
 
@@ -38,17 +39,22 @@ Post-training methods also support the safety and alignment of LLMs. This import
 
 ## How to use
 
- If you have a CUDA GPU (>=12GB VRAM), the best way to use Relay is with the [relaylm.py]() inference script. Just run:
+ If you have a CUDA GPU (>=12GB VRAM), the best way to use Relay is with the [relaylm.py](https://github.com/danlou/relay/blob/main/relaylm.py) inference script. Just run:
 ```bash
 curl https://danlou.co/f/relaylm.py | python -
 ```
 
- This script will select the best model for your available VRAM, download, load, and start and interactive chat session.
+ This script will select the best model for the available VRAM, download, load, and start an interactive chat session.
 It does not have any dependencies besides `transformers >= 4.45.1`.
 
- Alternatively, if you do not have a CUDA GPU (e.g., on a Mac), you can use the [GGUF versions]() through LM Studio.
+ If you want to use a particular model, you can pass the model name as an argument:
+ ```bash
+ python relaylm.py danlou/relay-v0.1-Mistral-Nemo-2407-4bit
+ ```
+
+ Alternatively, if you do not have a CUDA GPU (e.g., on a Mac), you can use the [GGUF versions](https://huggingface.co/danlou/relay-v0.1-Mistral-Nemo-2407-GGUF) through LM Studio.
 
- With [relaylm.py](), you can also use the model declaratively, outside of an interactive chat session:
+ With [relaylm.py](https://github.com/danlou/relay/blob/main/relaylm.py), you can also use the model declaratively, outside of an interactive chat session:
 
 ```python
 from relaylm import suggest_relay_model, RelayLM
@@ -65,12 +71,24 @@ def favorite_holiday(relay: RelayLM, country: str) -> str:
 model_info = suggest_relay_model()
 relay = RelayLM(**model_info)
 
- print(favorite_holiday(relay, "Portugal"))
- print(favorite_holiday(relay, "China"))
+ print(favorite_holiday(relay, 'Portugal'))
+ print(favorite_holiday(relay, 'China'))
 ```
 
+ More examples are available on the [project's GitHub](https://github.com/danlou/relay).
+
 ## Safety testing
 
+ While this model is intended for research purposes, it is still relevant to explore how this conversational model (and its self-supervised approach) compares on safety risk against other conversational models trained on the same base LLM.
+
+ This safety risk was evaluated by measuring refusals on sets of harmful questions compiled specifically for testing the safety alignment of LLMs, namely [HarmfulQA](https://huggingface.co/datasets/declare-lab/HarmfulQA) and [CategoricalHarmfulQA](https://huggingface.co/datasets/declare-lab/CategoricalHarmfulQA).
+ For this comparison, we also evaluated [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), [dolphin-2.9.3-mistral-nemo-12b](https://huggingface.co/cognitivecomputations/dolphin-2.9.3-mistral-nemo-12b) and [Mistral-Nemo-Instruct-2407-abliterated](https://huggingface.co/natong19/Mistral-Nemo-Instruct-2407-abliterated).
+ Responses were generated by greedy search, with models loaded as bfloat16. Refusal responses were detected using [Llama-Guard-3-8B](https://huggingface.co/meta-llama/Llama-Guard-3-8B). The code for this evaluation is available on the [project's GitHub](https://github.com/danlou/relay).
+
+ As can be seen in the plot below, Relay v0.1 refuses to answer the majority of these harmful questions, and does so more often than popular uncensored models trained on the same base LLM.
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/60f808c5c1adf9100f1f263c/0m-dMagE7yKy1V-EB-fJ3.png" width="800"/>
+
 TODO
 
 ## Fine-tuning setup
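
For reference, the refusal evaluation described in the section added above (greedy decoding with models loaded in bfloat16 over harmful-question sets) might look roughly like the sketch below. This is only an illustration, not the project's evaluation code: the model id, the dataset split/column names, and the keyword-based refusal check are assumptions — per the text above, refusals were actually detected with Llama-Guard-3-8B, and Relay's IRC-style chat format is omitted here for simplicity.

```python
# Rough sketch of a refusal-rate measurement: greedy decoding, bfloat16 weights,
# HarmfulQA-style questions. Placeholder pieces (model id, dataset field names,
# keyword refusal check) are assumptions, not the project's actual setup.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "danlou/relay-v0.1-Mistral-Nemo-2407"  # hypothetical full-precision repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")

# Assumes a 'train' split with a 'question' column.
questions = load_dataset("declare-lab/HarmfulQA", split="train")["question"]

def respond(question: str, max_new_tokens: int = 256) -> str:
    """Generate a response with greedy search (no sampling)."""
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

def looks_like_refusal(response: str) -> bool:
    """Crude keyword stand-in for the Llama-Guard-3-8B refusal detection used above."""
    start = response.lower()[:160]
    return any(m in start for m in ("i can't", "i cannot", "i won't", "i'm sorry", "i am sorry"))

sample = questions[:50]  # small sample, for illustration only
refusal_rate = sum(looks_like_refusal(respond(q)) for q in sample) / len(sample)
print(f"refusal rate on sample: {refusal_rate:.1%}")
```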
 
@@ -83,7 +101,8 @@ TODO
 
 ## License
 
- TODO
+ This model is licensed under [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en).
+ While [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) is licensed under Apache 2.0, this Relay fine-tune is trained with a CC-BY-NC 4.0 dataset ([based-chat-v0.1](https://huggingface.co/datasets/danlou/based-chat-v0.1-Mistral-Nemo-Base-2407)).
 
 ## Citation
 
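Relating back to the "How to use" section above, relaylm.py is described as selecting the best model for the available VRAM. A minimal sketch of what such a selection could look like follows; it is not the actual relaylm.py logic, and the 16 GB cutoff, the full-precision repo id, and the returned keyword name are assumptions (only the 4-bit repo id appears above).

```python
# Minimal sketch of VRAM-based model selection, illustrating the behaviour the
# "How to use" section attributes to relaylm.py. Not the actual implementation:
# the 16 GB cutoff and the full-precision repo id are assumptions.
import torch

def pick_relay_model() -> dict:
    if not torch.cuda.is_available():
        raise RuntimeError("A CUDA GPU is expected; on a Mac, use the GGUF builds with LM Studio instead.")
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if vram_gb >= 16:  # assumed cutoff
        repo = "danlou/relay-v0.1-Mistral-Nemo-2407"       # hypothetical full-precision repo id
    else:
        repo = "danlou/relay-v0.1-Mistral-Nemo-2407-4bit"  # 4-bit repo shown above (>=12GB VRAM)
    return {"model_name": repo}  # keyword expected by RelayLM(**model_info) is assumed

print(pick_relay_model())
```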