danfu09 committed on
Commit ca235ad · 1 Parent(s): 4288da0

Update README.md

Files changed (1)
  1. README.md +6 -65
README.md CHANGED
@@ -2,13 +2,14 @@
 license: apache-2.0
 language:
 - en
-pipeline_tag: text-classification
+pipeline_tag: fill-mask
 inference: false
 ---
 
 # Monarch Mixer-BERT
 
-An 80M checkpoint of M2-BERT, pretrained with sequence length 2048, and it has been fine-tuned for long-context retrieval.
+An 80M checkpoint of M2-BERT, pretrained with sequence length 2048.
+**This is a BERT-style model that has not been fine-tuned. We recommend fine-tuning it for specific use cases before using it.**
 
 Check out the paper [Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture](https://arxiv.org/abs/2310.12109) and our [blog post]() on retrieval for more on how we trained this model for long sequence.
 
@@ -20,8 +21,8 @@ Check out our [GitHub](https://github.com/HazyResearch/m2/tree/main) for instruc
 
 You can load this model using Hugging Face `AutoModel`:
 ```python
-from transformers import AutoModelForSequenceClassification
-model = AutoModelForSequenceClassification.from_pretrained(
+from transformers import AutoModelForMaskedLM
+model = AutoModelForMaskedLM.from_pretrained(
     "togethercomputer/m2-bert-80M-2k-retrieval",
     trust_remote_code=True
 )
@@ -30,66 +31,6 @@ model = AutoModelForSequenceClassification.from_pretrained(
 You should expect to see a large error message about unused parameters for FlashFFTConv.
 If you'd like to load the model with FlashFFTConv, you can check out our [GitHub](https://github.com/HazyResearch/m2/tree/main).
 
-This model generates embeddings for retrieval. The embeddings have a dimensionality of 768:
-```python
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
-max_seq_length = 2048
-testing_string = "Every morning, I make a cup of coffee to start my day."
-model = AutoModelForSequenceClassification.from_pretrained(
-    "togethercomputer/m2-bert-80M-2k-retrieval",
-    trust_remote_code=True
-)
-
-tokenizer = AutoTokenizer.from_pretrained(
-    "bert-base-uncased",
-    model_max_length=max_seq_length
-)
-input_ids = tokenizer(
-    [testing_string],
-    return_tensors="pt",
-    padding="max_length",
-    return_token_type_ids=False,
-    truncation=True,
-    max_length=max_seq_length
-)
-
-outputs = model(**input_ids)
-embeddings = outputs['sentence_embedding']
-```
-
-You can also get embeddings from this model using the Together API as follows (you can find your API key [here](https://api.together.xyz/settings/api-keys)):
-```python
-import os
-import requests
-
-def generate_together_embeddings(text: str, model_api_string: str, api_key: str):
-    url = "https://api.together.xyz/api/v1/embeddings"
-    headers = {
-        "accept": "application/json",
-        "content-type": "application/json",
-        "Authorization": f"Bearer {api_key}"
-    }
-    session = requests.Session()
-    response = session.post(
-        url,
-        headers=headers,
-        json={
-            "input": text,
-            "model": model_api_string
-        }
-    )
-    if response.status_code != 200:
-        raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
-    return response.json()['data'][0]['embedding']
-
-print(generate_together_embeddings(
-    'Hello world',
-    'togethercomputer/m2-bert-80M-2k-retrieval',
-    os.environ['TOGETHER_API_KEY'])[:10]
-)
-```
-
 ## Acknowledgments
 
 Alycia Lee helped with AutoModel support.
@@ -104,4 +45,4 @@ If you use this model, or otherwise found our work valuable, you can cite us as
   booktitle={Advances in Neural Information Processing Systems},
   year={2023}
 }
-```
+```
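
This commit switches the README's loading snippet to `AutoModelForMaskedLM` and the pipeline tag to `fill-mask`. As a rough illustration of what that change implies for callers, here is a minimal sketch of masked-token prediction. It reuses the repository id from the diff and the `bert-base-uncased` tokenizer settings from the removed embedding example. The helper names `top_k_indices` and `predict_masked_token` are hypothetical, and the sketch assumes the model output exposes `.logits` with shape (batch, sequence, vocabulary), which the diff does not confirm.

```python
from typing import List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int = 3) -> List[Tuple[int, float]]:
    """Return up to k (index, score) pairs, highest score first (hypothetical helper)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


def predict_masked_token(text: str, k: int = 3, max_seq_length: int = 2048):
    """Sketch: rank candidate tokens for the first [MASK] in `text`.

    Calling this downloads the tokenizer and the checkpoint.
    """
    # Heavy imports stay local so top_k_indices can be used without them.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    # Tokenizer settings copied from the embedding example removed in this commit.
    tokenizer = AutoTokenizer.from_pretrained(
        "bert-base-uncased",
        model_max_length=max_seq_length,
    )
    # Repository id and loading call copied from the updated README snippet.
    model = AutoModelForMaskedLM.from_pretrained(
        "togethercomputer/m2-bert-80M-2k-retrieval",
        trust_remote_code=True,
    )
    model.eval()

    inputs = tokenizer(
        [text],
        return_tensors="pt",
        padding="max_length",
        return_token_type_ids=False,
        truncation=True,
        max_length=max_seq_length,
    )
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    if mask_positions.numel() == 0:
        raise ValueError("Input text must contain the tokenizer's mask token.")

    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: outputs.logits is (batch, sequence, vocabulary).
    scores = outputs.logits[0, int(mask_positions[0])].tolist()
    best = top_k_indices(scores, k)
    tokens = tokenizer.convert_ids_to_tokens([index for index, _ in best])
    return list(zip(tokens, (score for _, score in best)))
```

A call such as `predict_masked_token("Every morning, I make a cup of [MASK].")` would exercise the sketch. Since the updated README describes the checkpoint as not fine-tuned, treat any predictions as a sanity check rather than a measure of quality.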