davidshtian committed
Commit d880cf5
1 Parent(s): 61e54ef

Update README.md

Files changed (1): README.md (+50, -0)
README.md CHANGED
---
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- mistral
- inferentia2
- neuron
- neuronx
license: apache-2.0
---
# Neuronx for [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) - Updated Mistral 7B Model on [AWS Inferentia2](https://aws.amazon.com/ec2/instance-types/inf2/) Using AWS Neuron SDK version 2.18

This model has been exported to the `neuron` format using specific `input_shapes` and `compiler` parameters detailed in the paragraphs below.

Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co/docs/optimum-neuron/main/en/guides/models#configuring-the-export-of-a-generative-model) for an explanation of these parameters.

Note: to compile mistralai/Mistral-7B-Instruct-v0.2 on Inf2, you need to update `sliding_window` in the model config (either in the `config.json` file or on the loaded config object) from `null` to its default value of 4096.
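
As a minimal sketch of that update (assuming you patch the config through 🤗 `transformers` rather than editing `config.json` by hand; the local output path is a placeholder):

```python
from transformers import AutoConfig

# In Mistral-7B-Instruct-v0.2 the config ships with sliding_window set to null,
# which the Neuron compiler does not accept.
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Restore the default window size before exporting.
config.sliding_window = 4096

# Save the patched config locally so the export step can pick it up.
config.save_pretrained("./mistral-7b-instruct-v0.2-patched")
```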

## Usage with 🤗 `optimum-neuron`

```python
>>> from optimum.neuron import pipeline

>>> p = pipeline('text-generation', 'davidshtian/Mistral-7B-Instruct-v0.2-neuron-1x2048-2-cores-2.18')
>>> p("My favorite place on earth is", max_new_tokens=64, do_sample=True, top_k=50)
[{'generated_text': "My favorite place on earth is probably Paris, France, and if I were to go there
now I would take my partner on a romantic getaway where we could lay on the grass in the park,
eat delicious French cheeses and wine, and watch the sunset on the Seine river.'"}]
```

This repository contains tags specific to versions of `neuronx`. When using it with 🤗 `optimum-neuron`, load the repo revision that matches the version of `neuronx` you are running so that the right serialized checkpoints are used.
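
For example (a sketch that assumes the `revision` argument is forwarded to the model loader as in the 🤗 `transformers` pipeline API, and that a tag named after your `neuronx` release exists in this repository):

```python
from optimum.neuron import pipeline

# "2.18" is a hypothetical tag name used for illustration; check the
# repository's tags on the Hub for the one matching your neuronx version.
p = pipeline(
    "text-generation",
    "davidshtian/Mistral-7B-Instruct-v0.2-neuron-1x2048-2-cores-2.18",
    revision="2.18",
)
```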

## Arguments passed during export

**input_shapes**

```json
{
  "batch_size": 1,
  "sequence_length": 2048
}
```

**compiler_args**

```json
{
  "auto_cast_type": "bf16",
  "num_cores": 2
}
```
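
For reference, an export along these lines (a sketch based on the 🤗 `optimum-neuron` Python export API; the local paths are placeholders, and the checkpoints in this repository may have been produced differently) could look like:

```python
from optimum.neuron import NeuronModelForCausalLM

# Compile with the same input shapes and compiler arguments listed above.
model = NeuronModelForCausalLM.from_pretrained(
    "./mistral-7b-instruct-v0.2-patched",  # model dir with the patched sliding_window config
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,
    auto_cast_type="bf16",
)

# Save the compiled artifacts for reuse on Inferentia2.
model.save_pretrained("./mistral-7b-instruct-v0.2-neuron")
```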