--- license: apache-2.0 datasets: - VMware/open-instruct base_model: BEE-spoke-data/smol_llama-220M-GQA inference: parameters: do_sample: true renormalize_logits: true temperature: 0.25 top_p: 0.95 top_k: 50 min_new_tokens: 2 max_new_tokens: 96 repetition_penalty: 1.04 no_repeat_ngram_size: 6 epsilon_cutoff: 0.0006 widget: - text: "Below is an instruction that describes a task, paired with an input that\ \ provides further context. Write a response that appropriately completes the\ \ request. \n \n### Instruction: \n \nWrite an ode to Chipotle burritos.\ \ \n \n### Response: \n" example_title: burritos model-index: - name: smol_llama-220M-open_instruct results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 25.0 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 29.71 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 26.11 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 44.06 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 50.28 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 0.0 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=BEE-spoke-data/smol_llama-220M-open_instruct name: Open LLM Leaderboard --- # BEE-spoke-data/smol_llama-220M-open_instruct > Please note that this is an experiment, and the model has limitations because it is smol. prompt format is alpaca. ``` Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Instruction: How can I increase my meme production/output? Currently, I only create them in ancient babylonian which is time consuming. ### Response: ``` This was **not** trained using a separate 'inputs' field (as `VMware/open-instruct` doesn't use one). ## Example Output on the text above ^. The inference API is set to sample with low temp so you should see (_at least slightly_) different generations each time. ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60bccec062080d33f875cd0c/MdOB7TD5UosPGZvdZWG0I.png) Note that the inference API parameters used here are an initial educated guess, and may be updated over time: ```yml inference: parameters: do_sample: true renormalize_logits: true temperature: 0.25 top_p: 0.95 top_k: 50 min_new_tokens: 2 max_new_tokens: 96 repetition_penalty: 1.04 no_repeat_ngram_size: 6 epsilon_cutoff: 0.0006 ``` Feel free to experiment with the parameters using the model in Python and let us know if you have improved results with other params! ## Data This was trained on `VMware/open-instruct` so do whatever you want, provided it falls under the base apache-2.0 license :) --- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_BEE-spoke-data__smol_llama-220M-open_instruct) | Metric |Value| |---------------------------------|----:| |Avg. |29.19| |AI2 Reasoning Challenge (25-Shot)|25.00| |HellaSwag (10-Shot) |29.71| |MMLU (5-Shot) |26.11| |TruthfulQA (0-shot) |44.06| |Winogrande (5-shot) |50.28| |GSM8k (5-shot) | 0.00|