metadata

language:
  - en
tags:
  - pytorch
  - causal-lm
license: bigscience-openrail-m

GeoV-9B is a 9-billion-parameter autoregressive language model.

The GeoV model was designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER), developed by Georges Harik and Varuna Jayasiri.

RoPER uses relative positions in the attention score calculation through RoPE embeddings and, in addition, adds relative positional information explicitly to the value embeddings. Specifically, it incorporates the relative positions of the tokens being attended to. RoPER has performed better than RoPE on some algorithmic tasks and appears comparable to RoPE in language modeling.
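To make the idea of relative positions in the values concrete, here is a minimal pure-Python sketch. It assumes one possible realization: rotate each value by its absolute position, take the attention-weighted sum, then rotate the result back by the query position. The helper names (`rotate`, `roper_values`, `roper_values_direct`), the adjacent-pair feature layout, and the base of 10000 are illustrative assumptions, not taken from the GeoV code.

```python
import math

def rotate(vec, pos, base=10000.0):
    """RoPE-style rotation of an even-length vector for position `pos`.

    Feature pairs (2k, 2k+1) are rotated by the angle pos * base**(-2k/d).
    The pairing and the base are assumptions made for this sketch.
    """
    d = len(vec)
    out = []
    for k in range(d // 2):
        theta = pos * base ** (-2.0 * k / d)
        x, y = vec[2 * k], vec[2 * k + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out

def roper_values(attn_row, values, query_pos):
    """Rotate values by their position j, sum with weights, rotate back by -query_pos."""
    summed = [0.0] * len(values[0])
    for j, (a, v) in enumerate(zip(attn_row, values)):
        summed = [s + a * r for s, r in zip(summed, rotate(v, j))]
    return rotate(summed, -query_pos)

def roper_values_direct(attn_row, values, query_pos):
    """Same quantity computed directly from relative offsets j - query_pos."""
    summed = [0.0] * len(values[0])
    for j, (a, v) in enumerate(zip(attn_row, values)):
        summed = [s + a * r for s, r in zip(summed, rotate(v, j - query_pos))]
    return summed

# Toy data: attention weights of the query at position 2 over three tokens.
attn_row = [0.2, 0.5, 0.3]
values = [[1.0, 0.0, 0.5, -1.0], [0.0, 2.0, 1.0, 1.0], [-1.0, 1.0, 0.0, 3.0]]
two_step = roper_values(attn_row, values, query_pos=2)
direct = roper_values_direct(attn_row, values, query_pos=2)
print(max(abs(x - y) for x, y in zip(two_step, direct)))  # difference is at floating-point noise level
```

Because rotations by the same frequency compose additively, the two-step version yields a weighted sum of values that depends only on the offsets between the attended tokens and the query, which is the property the paragraph above describes.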

Model details

Developed by: Georges Harik
Model type: Transformer-based Language Model
Language: English

Hyperparameter	Value
n_parameters	9B
n_layers	32
d_model	5120
n_heads	40
d_head	128
n_vocab	65500
Sequence Length	2049

Generation

The generate() method can be used to generate text with the GeoV model.

>>> from transformers import GeoVForCausalLM, GeoVTokenizer

>>> model = GeoVForCausalLM.from_pretrained("GeoV/GeoV-9b")
>>> tokenizer = GeoVTokenizer.from_pretrained("GeoV/GeoV-9b")

>>> prompt = "In mathematics, topology is the study of"

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]
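For intuition about the `do_sample=True` and `temperature=0.9` arguments above, the following sketch shows temperature-scaled sampling on a toy distribution. It is a simplified stand-alone illustration, not the library's internal code; the function names and the three-token `toy_logits` are hypothetical.

```python
import math
import random

def softmax_with_temperature(logits, temperature=0.9):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=0.9, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-token vocabulary
print(softmax_with_temperature(toy_logits, temperature=0.9))
print(sample_token(toy_logits, temperature=0.9, rng=random.Random(0)))
```

Temperatures below 1 sharpen the distribution toward the highest-scoring token, while temperatures above 1 flatten it, so 0.9 makes sampling slightly more conservative than sampling from the unscaled distribution.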