---
license: mit
datasets:
- VishnuPJ/Malayalam_CultureX_IndicCorp_SMC
library_name: transformers
language:
- ml
tags:
- mamba
- ssm
- s6
- jamba
- llm
- state space models
- malayalam
- indic
---

# Ma-layala-mba

Welcome to Ma-layala-mba, a base Indic language model built to advance NLP for Indian languages. It is based on the Mamba family of state space models.

![Thumbnail](thumbnail.jpg)

## Model Description

Ma-layala-mba is an S6 state space model (SSM) built for Malayalam, the regional and official language of the South Indian state of Kerala. It combines attention layers with MLPs and SSM layers to handle complex linguistic features in language understanding and generation.

- **Model Type**: A 128M-parameter Jamba model fine-tuned on ~1 million Malayalam prompt-response pairs drawn from a subset of the IndicCorp dataset
- **Language(s)**: Malayalam
- **License**: GNU General Public License v3.0
- **Training Precision**: bfloat16
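
As a rough guide to what the parameter count and precision above imply for memory, the following sketch estimates the size of the weights alone. It assumes "128M" means roughly 128,000,000 parameters (the exact count may differ) and ignores activations, caches, and framework overhead; `weight_memory_mib` is a hypothetical helper, not part of any library.

```python
# Bytes needed to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_mib(num_params: int, dtype: str = "bfloat16") -> float:
    """Approximate size of the weights in MiB (weights only, no activations)."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


# Assumption: "128M" taken at face value; the real parameter count may differ.
nominal_params = 128_000_000

print(f"bfloat16: {weight_memory_mib(nominal_params):.0f} MiB")
print(f"float32:  {weight_memory_mib(nominal_params, 'float32'):.0f} MiB")
```

Under these assumptions the bfloat16 weights come to about 244 MiB, half of what float32 would need.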

## Example Usage

Here's a quick example to get you started with the Ma-layala-mba model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aoxo/Ma-layala-mba_Tiny_128M"

# AutoModelForCausalLM selects the model class from the checkpoint's config.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # load_in_8bit=True,  # Set this depending on the GPU you have
    torch_dtype=torch.bfloat16,
    device_map={"": 0},  # Set this depending on the number of GPUs you have
    local_files_only=False,  # Optional
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prompt (in English): "Create a test paper on Malayalam synonyms"
prompt = "മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)

print(tokenizer.batch_decode(outputs))
```

### Example Output:

```
മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക

a. വലിയ - __________
b. രസം - __________
c. സുഖം - __________
d. പ്രകാശം - __________
e. വേഗം - __________
```
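
As the output above shows, the decoded text starts with the prompt itself. If only the continuation is wanted, a small post-processing step can drop the echoed prompt. The sketch below is one way to do that on plain strings; `strip_prompt` is a hypothetical helper, and it assumes the text was decoded without special tokens (for example with `skip_special_tokens=True`), so that it really begins with the prompt.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return the continuation if `decoded` starts with `prompt`, else the text unchanged."""
    text = decoded.strip()
    if text.startswith(prompt):
        # Drop the echoed prompt and any whitespace that follows it.
        return text[len(prompt):].lstrip()
    return text


prompt = "മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക"
decoded = prompt + "\n\na. വലിയ - __________\nb. രസം - __________"

print(strip_prompt(decoded, prompt))
```

An alternative is to slice the generated token IDs past the prompt length before decoding, which avoids relying on the decoded text matching the prompt exactly.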

## Usage Note

Please be aware that this model has not undergone comprehensive detoxification or censorship. While it exhibits strong linguistic capabilities, there is a possibility of generating content that may be deemed harmful or offensive. We advise users to apply discretion and closely monitor the model's outputs, especially in public or sensitive settings.
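
One lightweight way to monitor outputs is to screen generated text against a list of terms before showing it to users. The sketch below is only an illustration: `flag_output` and the placeholder terms are hypothetical, and plain substring matching will both miss harmful content and flag harmless text, so it does not replace human review or a dedicated moderation system.

```python
def flag_output(text: str, blocklist: set[str]) -> list[str]:
    """Return the blocklisted terms that occur in `text`, sorted alphabetically."""
    lowered = text.casefold()
    # Substring matching works for any script, including Malayalam,
    # but it is deliberately simple and may over- or under-match.
    return sorted(term for term in blocklist if term.casefold() in lowered)


# Hypothetical placeholder terms; replace with a list suited to your deployment.
blocklist = {"badword1", "badword2"}

generated = "An example response that contains BADWORD1."
hits = flag_output(generated, blocklist)

if hits:
    print("Flagged for review:", hits)
else:
    print("No blocklisted terms found.")
```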

## Meet the Developers

- **[Alosh Denny](https://x.com/AloshDenny)**