---
language:
- en
tags:
- pytorch
- causal-lm
license: bigscience-openrail-m
---
[GeoV](https://huggingface.co/docs/transformers/model_doc/geov)-9B is a 9 billion parameter autoregressive language model.
The GeoV model was designed by Georges Harik and uses
[Rotary Positional Embeddings with Relative distances (RoPER)](http://research.labml.ai/RoPER.html)
by [Georges Harik](https://twitter.com/ghark) and [Varuna Jayasiri](https://twitter.com/vpj).
[RoPER](http://research.labml.ai/RoPER.html),
in addition to using relative positions in the attention score calculation via RoPE embeddings,
adds relative positional information explicitly to the value embeddings.
Specifically, it incorporates the relative positions of the tokens that are attended to.
RoPER has given better performance on some algorithmic tasks, and seems comparable to RoPE in language modeling.
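
The mechanism can be illustrated roughly as follows. This is a minimal sketch in plain PyTorch, assuming a standard RoPE-style pairwise rotation; `rope_rotate` and `roper_context` are illustrative names, not part of the GeoV codebase. Values are rotated forward by their positions, the attention-weighted sum is taken, and the result is rotated back by the attending token's position, so the output carries only relative positional information.

```python
import torch

def rope_rotate(x, positions, base=10000, sign=1.0):
    # Rotate consecutive dimension pairs of x by angles proportional to each
    # token's position (standard RoPE-style rotation); sign=-1 undoes it.
    # x: (seq_len, n_heads, d_head), positions: (seq_len,)
    d = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2, dtype=torch.float32) / d))
    angles = sign * positions[:, None].to(torch.float32) * inv_freq  # (seq_len, d/2)
    cos = angles.cos()[:, None, :]  # broadcast over heads
    sin = angles.sin()[:, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def roper_context(attn_weights, values, key_pos, query_pos):
    # RoPER idea (sketch): rotate values by their (key) positions, take the
    # attention-weighted sum, then rotate back by each query's position, so
    # the result depends only on relative distances key_pos - query_pos.
    v_rot = rope_rotate(values, key_pos)                      # (k_len, n_heads, d_head)
    ctx = torch.einsum("hqk,khd->qhd", attn_weights, v_rot)   # weighted sum over keys
    return rope_rotate(ctx, query_pos, sign=-1.0)             # undo rotation at query

# Tiny usage example with random attention weights and values.
q_len = k_len = 4
n_heads, d_head = 2, 8
attn = torch.softmax(torch.randn(n_heads, q_len, k_len), dim=-1)
values = torch.randn(k_len, n_heads, d_head)
pos = torch.arange(k_len)
out = roper_context(attn, values, pos, pos)  # (q_len, n_heads, d_head)
```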
## Model details
- Developed by: [Georges Harik](http://twitter.com/gharik)
- Model type: Transformer-based Language Model
- Language: English
<figure style="width:30em">

| Hyperparameter         | Value |
| ---------------------- | ----- |
| n<sub>parameters</sub> | 9B    |
| n<sub>layers</sub>     | 32    |
| d<sub>model</sub>      | 5120  |
| n<sub>heads</sub>      | 40    |
| d<sub>head</sub>       | 128   |
| n<sub>vocab</sub>      | 65500 |
| Sequence Length        | 2049  |

</figure>
## Generation
The `generate()` method can be used to generate text with the GeoV model.
```python
>>> from transformers import GeoVForCausalLM, GeoVTokenizer

>>> # Load the pretrained model and its tokenizer
>>> model = GeoVForCausalLM.from_pretrained("GeoV/GeoV-9b")
>>> tokenizer = GeoVTokenizer.from_pretrained("GeoV/GeoV-9b")

>>> # Tokenize the prompt
>>> prompt = "In mathematics, topology is the study of"
>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> # Sample up to 100 tokens and decode the result
>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]
```
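
For a 9-billion-parameter model you will likely want to load the weights in half precision and run on a GPU (roughly 18 GB for the fp16 weights alone, as a rule of thumb). A minimal sketch using the standard `from_pretrained` arguments, assuming a CUDA device with enough memory is available:

```python
>>> import torch

>>> # Load weights in fp16 and move the model to the GPU
>>> model = GeoVForCausalLM.from_pretrained(
...     "GeoV/GeoV-9b", torch_dtype=torch.float16
... ).to("cuda")
>>> input_ids = input_ids.to("cuda")
```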