danfu09 committed on
Commit
f7b3330
·
1 Parent(s): 03ef821

Update README.md

Files changed (1)
  1. README.md +91 -3
README.md CHANGED
@@ -3,13 +3,101 @@ license: apache-2.0
  language:
  - en
  pipeline_tag: text-classification
+ inference: false
  ---

  # Monarch Mixer-BERT

- The 80M checkpoint for M2-BERT-base from the paper [Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture](https://arxiv.org/abs/2310.12109).
- This model has been pretrained with sequence length 8192, and it has been fine-tuned for retrieval.
+ The 80M checkpoint of M2-BERT, pretrained with sequence length 8192 and fine-tuned for long-context retrieval.

- This model was trained by Dan Fu, Jon Saad-Falcon, and Simran Arora.
+ Check out the paper [Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture](https://arxiv.org/abs/2310.12109) and our [blog post]() on retrieval for more on how we trained this model for long sequences.
+
+ This model was trained by Jon Saad-Falcon, Dan Fu, and Simran Arora.

  Check out our [GitHub](https://github.com/HazyResearch/m2/tree/main) for instructions on how to download and fine-tune it!
+
+ ## How to use
+
+ You can load this model using Hugging Face `AutoModelForSequenceClassification`:
+ ```python
+ from transformers import AutoModelForSequenceClassification
+ model = AutoModelForSequenceClassification.from_pretrained(
+     "togethercomputer/m2-bert-80M-8k-retrieval",
+     trust_remote_code=True
+ )
+ ```
+
+ You should expect to see a large error message about unused parameters for FlashFFTConv.
+ If you'd like to load the model with FlashFFTConv, you can check out our [GitHub](https://github.com/HazyResearch/m2/tree/main).
+
+ This model generates embeddings for retrieval. The embeddings have a dimensionality of 768:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ max_seq_length = 8192
+ testing_string = "Every morning, I make a cup of coffee to start my day."
+ model = AutoModelForSequenceClassification.from_pretrained(
+     "togethercomputer/m2-bert-80M-8k-retrieval",
+     trust_remote_code=True
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     "bert-base-uncased",
+     model_max_length=max_seq_length
+ )
+ input_ids = tokenizer(
+     [testing_string],
+     return_tensors="pt",
+     padding="max_length",
+     return_token_type_ids=False,
+     truncation=True,
+     max_length=max_seq_length
+ )
+
+ outputs = model(**input_ids)
+ embeddings = outputs['sentence_embedding']
+ ```
+
+ You can also get embeddings from this model using the Together API as follows (you can find your API key [here](https://api.together.xyz/settings/api-keys)):
+ ```python
+ import os
+ import requests
+
+ def generate_together_embeddings(text: str, model_api_string: str, api_key: str):
+     url = "https://api.together.xyz/api/v1/embeddings"
+     headers = {
+         "accept": "application/json",
+         "content-type": "application/json",
+         "Authorization": f"Bearer {api_key}"
+     }
+     session = requests.Session()
+     response = session.post(
+         url,
+         headers=headers,
+         json={
+             "input": text,
+             "model": model_api_string
+         }
+     )
+     if response.status_code != 200:
+         raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
+     return response.json()['data'][0]['embedding']
+
+ print(generate_together_embeddings(
+     'Hello world',
+     'togethercomputer/m2-bert-80M-8k-retrieval',
+     os.environ['TOGETHER_API_KEY'])[:10]
+ )
+ ```
+
+ ## Citation
+
+ If you use this model, or otherwise find our work valuable, you can cite us as follows:
+ ```
+ @inproceedings{fu2023monarch,
+   title={Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture},
+   author={Fu, Daniel Y and Arora, Simran and Grogan, Jessica and Johnson, Isys and Eyuboglu, Sabri and Thomas, Armin W and Spector, Benjamin and Poli, Michael and Rudra, Atri and R{\'e}, Christopher},
+   booktitle={Advances in Neural Information Processing Systems},
+   year={2023}
+ }
+ ```
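
The README added in this diff says the model produces 768-dimensional embeddings for retrieval, but it does not show how those embeddings are compared. The sketch below is one plausible way to rank documents against a query once the vectors are in hand. It assumes cosine similarity as the scoring function, which the README does not confirm, and the helper names `cosine_similarity` and `rank_documents` are hypothetical. Toy 3-dimensional vectors stand in for real model output so the example runs without downloading the checkpoint.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_emb: Sequence[float],
    doc_embs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from best to worst match."""
    scores = [(i, cosine_similarity(query_emb, d)) for i, d in enumerate(doc_embs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real embeddings from the model would have 768 entries each.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]
ranking = rank_documents(query, docs)
print([index for index, _ in ranking])
```

With real model output, each embedding tensor would first be converted to a flat list (for example with `.squeeze().tolist()` on a PyTorch tensor) before being passed to `rank_documents`. For large collections, a vectorized or indexed search would likely replace this pure-Python loop.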