Triangle104 committed on
Commit 26d585d
1 Parent(s): 52c4d51

Update README.md

Files changed (1)
1. README.md +95 -0
README.md CHANGED
@@ -15,6 +15,101 @@ base_model: ibm-granite/granite-3.1-2b-instruct
 This model was converted to GGUF format from [`ibm-granite/granite-3.1-2b-instruct`](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct) for more details on the model.
 
+ ---
+ Model details:
+ 
+ Granite-3.1-2B-Instruct is a 2B parameter long-context instruct model
+ finetuned from Granite-3.1-2B-Base using a combination of open source
+ instruction datasets with permissive license and internally collected
+ synthetic datasets tailored for solving long context problems. This
+ model is developed using a diverse set of techniques with a structured
+ chat format, including supervised finetuning, model alignment using
+ reinforcement learning, and model merging.
+ 
+ Developers: Granite Team, IBM
+ GitHub Repository: ibm-granite/granite-3.1-language-models
+ Website: Granite Docs
+ Paper: Granite 3.1 Language Models (coming soon)
+ Release Date: December 18th, 2024
+ License: Apache 2.0
+ 
+ Supported Languages:
+ English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech,
+ Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.1
+ models for languages beyond these 12 languages.
+ 
+ Intended Use:
+ The model is designed to respond to general instructions and can be used
+ to build AI assistants for multiple domains, including business
+ applications.
+ 
+ Capabilities:
+ 
+ Summarization
+ Text classification
+ Text extraction
+ Question-answering
+ Retrieval Augmented Generation (RAG)
+ Code-related tasks
+ Function-calling tasks (see the sketch after this list)
+ Multilingual dialog use cases
+ Long-context tasks including long document/meeting summarization, long document QA, etc.
+ 
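+ For the function-calling tasks above, here is a minimal sketch of passing a tool through the chat template. It relies on the generic transformers tools= API; the weather tool itself is a made-up example and not part of this card:
+ 
+ from transformers import AutoTokenizer
+ 
+ def get_current_weather(location: str) -> str:
+     """Get the current weather for a city.
+ 
+     Args:
+         location: The city to look up.
+     """
+     return "sunny"
+ 
+ tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3.1-2b-instruct")
+ messages = [{"role": "user", "content": "What is the weather in Boston?"}]
+ # the chat template renders the tool's JSON schema into the prompt
+ prompt = tokenizer.apply_chat_template(messages, tools=[get_current_weather], tokenize=False, add_generation_prompt=True)
+ print(prompt)
+ 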
+ 
+ Generation:
+ 
+ This is a simple example of how to use the Granite-3.1-2B-Instruct model.
+ 
+ Install the following libraries:
+ 
+ pip install torch torchvision torchaudio
+ pip install accelerate
+ pip install transformers
+ 
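+ A quick optional check that the environment is set up (assumes a recent transformers release):
+ 
+ python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"
+ 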
+ Then, copy the snippet below:
+ 
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ 
+ device = "auto"
+ model_path = "ibm-granite/granite-3.1-2b-instruct"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+ # change input text as desired
+ chat = [
+     { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
+ ]
+ chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ # tokenize the text; move tensors to the model's device ("auto" is not a valid tensor device)
+ input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
+ # generate output tokens
+ output = model.generate(**input_tokens, max_new_tokens=100)
+ # decode output tokens into text
+ output = tokenizer.batch_decode(output)
+ # print output
+ print(output)
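+ 
+ Note that batch_decode returns the full sequence, prompt plus completion, including any special tokens; pass skip_special_tokens=True to tokenizer.batch_decode if you only want plain text.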
+ 
+ Model Architecture:
+ 
+ Granite-3.1-2B-Instruct is based on a decoder-only dense transformer
+ architecture. Core components of this architecture are: GQA and RoPE,
+ MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
+ 
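+ As a quick sanity check, these choices are visible in the released checkpoint's config; a minimal sketch (the attribute names are assumptions based on the standard transformers Granite config, not taken from this card):
+ 
+ from transformers import AutoConfig
+ 
+ config = AutoConfig.from_pretrained("ibm-granite/granite-3.1-2b-instruct")
+ # GQA: fewer key/value heads than query heads
+ print(config.num_attention_heads, config.num_key_value_heads)
+ # SwiGLU shows up as a SiLU hidden activation; shared input/output embeddings as tied word embeddings
+ print(config.hidden_act, config.tie_word_embeddings)
+ 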
+ ---
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)
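 
 A minimal sketch of fetching and running the GGUF from the command line (the repo and quantized file names below are placeholders, since the actual quant file is not shown above):
 
 brew install llama.cpp
 llama-cli --hf-repo Triangle104/granite-3.1-2b-instruct-GGUF --hf-file granite-3.1-2b-instruct-q4_k_m.gguf -p "Please list one IBM Research laboratory located in the United States."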