---
language:
- en
license: mit
datasets:
- PatrickHaller/wiki-and-book-corpus-500M
---

# An xLSTM Model

Trained with [Helibrunna](https://github.com/PatrickHaller/helibrunna) (fork).

To use this model, the [xLSTM](https://github.com/NX-AI/xlstm) package is required. We recommend installing it locally with conda:

```bash
git clone https://github.com/NX-AI/xlstm
cd xlstm
conda env create -n xlstm -f environment_pt220cu121.yaml
conda activate xlstm
```
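
If the environment was set up correctly, both PyTorch and the xlstm package should import cleanly inside the activated environment. A quick sanity check:

```python
# Verify that the xlstm package and its PyTorch dependency import cleanly.
import torch
import xlstm

print("torch", torch.__version__)
print("xlstm imported successfully")
```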

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "PatrickHaller/xlstm_wikipedia_110M_500M"

model = AutoModelForCausalLM.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# Sample a continuation of the prompt.
input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")
output = model.generate(input_ids, max_length=100, temperature=0.7, do_sample=True)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)
```
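
For faster generation, the same pipeline can run on a GPU. A minimal sketch continuing from the snippet above (standard PyTorch device handling, not specific to this model):

```python
import torch

# Move the model to a GPU when one is available; fall back to the CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Inputs must live on the same device as the model parameters.
input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt").to(device)
output = model.generate(input_ids, max_length=100, temperature=0.7, do_sample=True)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```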

## Evaluation

We evaluated all xLSTM Wikipedia models on common zero-shot LM benchmarks:

![Evaluation](eval.png)
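
This card does not state which harness produced these scores. As an illustration only, zero-shot numbers of this kind are commonly obtained with EleutherAI's lm-evaluation-harness; the task list below is hypothetical, not the benchmark set used here:

```python
# Requires: pip install lm-eval
# Illustrative sketch using EleutherAI's lm-evaluation-harness (v0.4+ API);
# the task selection is an assumption, not the set reported in this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=PatrickHaller/xlstm_wikipedia_110M_500M",
    tasks=["lambada_openai", "hellaswag"],
)

print(results["results"])
```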