ibm
/

mayank-mishra committed on
Commit
5095e68
1 Parent(s): 3a0e6e0

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -120,8 +120,10 @@ model-index:
 
 ## Model Summary
 PowerLM-3B is a 3B state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a wide range of open-source and synthetic datasets with permissive licenses. PowerLM-3B has shown promising results compared to other models in the size categories across various benchmarks, including natural language multi-choices, code generation, and math reasoning.
+Paper: https://arxiv.org/abs/2408.13359
 
 ## Usage
+Note: Requires installing HF transformers from source.
 
 ### Generation
 This is a simple example of how to use **PowerLM-3b** model.