akhauriyash and nielsr (HF staff) committed
Commit 7f4aa1e · verified · 1 parent: 20ac076

Add pipeline tag: text-generation (#1)

- Add pipeline tag: text-generation (3328523b3bd7cd186e4f70f230df5e896aa7b779)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1): README.md (+29 −12)
README.md CHANGED
---
base_model:
- meta-llama/Llama-2-7b-hf
library_name: transformers
license: mit
pipeline_tag: text-generation
---

# TokenButler
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

<div align="center">
<img src="https://github.com/abdelfattah-lab/TokenButler/blob/main/figs/tokenbutlerlogo.png?raw=true" width="50%" alt="TokenButler" />
</div>
<hr>
<div align="center" style="line-height: 1;">
<!-- Paper Badge -->
<a href="https://arxiv.org/abs/2503.07518" target="_blank" style="margin: 2px;">
<img alt="Paper"
src="https://img.shields.io/badge/Paper-View-orange?logo=readthedocs&logoColor=white"
style="display: inline-block; vertical-align: middle;"/>
</a>
<!-- GitHub Badge -->
<a href="https://github.com/abdelfattah-lab/TokenButler" target="_blank" style="margin: 2px;">
<img alt="GitHub"
src="https://img.shields.io/badge/GitHub-%23121011.svg?logo=github&logoColor=white"
style="display: inline-block; vertical-align: middle;"/>
</a>
<!-- Huggingface Badge -->
<a href="https://huggingface.co/collections/akhauriyash/tokenbutler-67cf181b5762d0d60e5f312b" target="_blank" style="margin: 2px;">
<img alt="Huggingface"
src="https://img.shields.io/badge/Hugging%20Face-FFD21E?logo=huggingface&logoColor=000"
style="display: inline-block; vertical-align: middle;"/>
</a>
</div>

<br>
The collection of TokenButler models can be found [here](https://huggingface.co/collections/akhauriyash/tokenbutler-67cf181b5762d0d60e5f312b). To run the `meta-llama/Llama-2-7b-hf` model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

question = "If millionaires have butlers, why don't million dollar language models have a butler too? I think its because "

# The checkpoint id below is illustrative; use the matching TokenButler model
# from the collection linked above. trust_remote_code=True is required so the
# custom attention implementation shipped with the checkpoint is loaded.
model_name = "akhauriyash/Llama-2-7b-hf-Butler"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
response = generator(question, max_new_tokens=200, do_sample=True)

print(response[0]['generated_text'][len(question):])
```
Note that the default configured sparsity is 50%, with a sliding window of 128 tokens and 8 anchor tokens. To change the sparsity, use the following function after loading the model. Currently `fixed` is the only supported strategy; it fixes the sparsity of every layer except the first at the stated percentage (`pc`). The same helper appears in `test_hf.py`, and the sliding window and anchor tokens can be changed in a similar manner.

```python
def set_sparsity(model, sparsity):
    for module in model.modules():
        if module.__class__.__name__.__contains__("AttentionExperimental"):
            # The two lines below follow test_hf.py in the repository; the
            # exact attribute names may differ across versions.
            module.token_sparse_method = sparsity
            module.set_token_sparsity()
    return model

model = set_sparsity(model, "fixed_60pc")
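To see how the three knobs above interact, here is a minimal, self-contained sketch (not the repository's implementation; the function name `sparse_attention_keep` is invented for illustration) of how a fixed budget might keep anchor tokens and a recent sliding window unconditionally, then spend the remainder on the tokens with the highest predicted importance:

```python
import heapq

def sparse_attention_keep(scores, sparsity=0.5, window=128, num_anchor=8):
    """Decide which past tokens one query position attends to.

    scores   : per-token importance estimates (higher = more important)
    sparsity : fraction of tokens to drop (0.5 keeps half)
    Anchor tokens and the trailing sliding window are always kept; any
    remaining budget goes to the highest-scoring other tokens.
    """
    seq_len = len(scores)
    keep = [False] * seq_len
    for i in range(min(num_anchor, seq_len)):           # first tokens: anchors
        keep[i] = True
    for i in range(max(0, seq_len - window), seq_len):  # recent tokens: window
        keep[i] = True
    budget = int((1.0 - sparsity) * seq_len)
    remaining = budget - sum(keep)
    if remaining > 0:
        # Pick the highest-scoring tokens that are not already kept.
        candidates = [(s, i) for i, s in enumerate(scores) if not keep[i]]
        for _, i in heapq.nlargest(remaining, candidates):
            keep[i] = True
    return keep
```

For example, with 50% sparsity over 32 tokens, 4 anchors plus an 8-token window account for 12 of the 16-token budget, so the 4 highest-scoring remaining tokens are also kept.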

# Predictor Architecture
<div align="center">
<img src="https://github.com/abdelfattah-lab/TokenButler/blob/main/figs/mainfig.png?raw=true" width="100%" alt="TokenButlerFigure" />
</div>

# Custom Synthetic Task
<div align="center">
<img src="https://github.com/abdelfattah-lab/TokenButler/blob/main/figs/datasetfig.png?raw=true" width="100%" alt="Synthetic Tasks" />
</div>

## Citation

```bibtex
@misc{akhauri2025tokenbutlertokenimportancepredictable,
      title={TokenButler: Token Importance is Predictable},
      author={Yash Akhauri and Ahmed F AbouElhamayed and Yifei Gao and Chi-Chih Chang and Nilesh Jain and Mohamed S. Abdelfattah},
      year={2025},
      eprint={2503.07518},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.07518},
}
```