---
language:
- en
license: mit
datasets:
- PatrickHaller/wiki-and-book-corpus-500M
---
# An xLSTM Model
Trained with [Helibrunna](https://github.com/PatrickHaller/helibrunna) (fork)
To use this model, the [xLSTM](https://github.com/NX-AI/xlstm) package is required. We recommend installing
it locally with conda:
```bash
git clone https://github.com/NX-AI/xlstm
cd xlstm
conda env create -n xlstm -f environment_pt220cu121.yaml
conda activate xlstm
```
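The steps above create and activate the environment with the required dependencies. Depending on your setup, you may also need to install the `xlstm` package itself into that environment; a common approach (an assumption, not part of the original instructions) is an editable install from the cloned source:

```shell
# Run from inside the cloned xlstm directory, with the conda env active
pip install -e .
```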
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "PatrickHaller/xlstm_wikipedia_110M_500M"

# Load the model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# Tokenize a prompt and sample a continuation
input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")
output = model.generate(input_ids, max_length=100, temperature=0.7, do_sample=True)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
## Evaluation
We evaluated all of our xLSTM Wikipedia models on common zero-shot LM benchmarks:

![Evaluation](eval.png)