Update README.md
README.md CHANGED
@@ -22,7 +22,7 @@ This checkpoint is trained on the Stack data (https://huggingface.co/datasets/bi
 This checkpoint is first trained on code data via masked language modeling (MLM) and then on bimodal text-code pair data. Please refer to the paper for more details.

 ### How to use

-This checkpoint consists of an encoder (356M model), which can be used to extract code embeddings of
+This checkpoint consists of an encoder (356M model), which can be used to extract 1024-dimensional code embeddings. It can be easily loaded using the AutoModel functionality and employs the StarCoder tokenizer (https://arxiv.org/pdf/2305.06161.pdf).

 ```
 from transformers import AutoModel, AutoTokenizer
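The updated paragraph says the encoder yields a 1024-dimensional embedding per code snippet; the usage snippet in the diff is cut off at the hunk boundary, so the pooling step is not shown. As a hedged sketch only: the model card does not state which pooling the checkpoint uses, so the masked mean pooling below, the shapes, and the fake hidden states are all illustrative assumptions, not the repository's actual code.

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average per-token encoder states over non-padding positions.

    hidden_states: (seq_len, 1024) array standing in for the encoder output.
    attention_mask: (seq_len,) array of 1s (real tokens) and 0s (padding).
    """
    mask = attention_mask[:, None].astype(hidden_states.dtype)  # (seq_len, 1)
    return (hidden_states * mask).sum(axis=0) / mask.sum()

# Fake hidden states in place of the real encoder output (seq_len=8, dim=1024).
states = np.random.rand(8, 1024)
mask = np.array([1, 1, 1, 1, 1, 0, 0, 0])  # last three positions are padding

embedding = mean_pool(states, mask)
print(embedding.shape)  # (1024,) -- one fixed-size vector per code snippet
```

With the real checkpoint, `states` would come from the forward pass of the model loaded via `AutoModel`, tokenized with the StarCoder tokenizer; the pooling reduces variable-length token states to the single 1024-dimensional vector the card describes.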