ibm/PowerLM-3b

mayank-mishra committed on Aug 28

Commit 5095e68 • 1 Parent(s): 3a0e6e0

Update README.md

Files changed (1): README.md

@@ -120,8 +120,10 @@ model-index:
 ## Model Summary
 PowerLM-3B is a 3B state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a wide range of open-source and synthetic datasets with permissive licenses. PowerLM-3B has shown promising results compared to other models in the size categories across various benchmarks, including natural language multi-choices, code generation, and math reasoning.
+Paper: https://arxiv.org/abs/2408.13359
 ## Usage
+Note: Requires installing HF transformers from source.
 ### Generation
 This is a simple example of how to use **PowerLM-3b** model.
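
The note added under Usage says the model requires Transformers installed from source. As a rough sketch (not part of the README), the check below guesses whether the installed `transformers` package is a source build by looking for a `.dev` marker in its version string. The helper names `is_dev_version` and `transformers_from_source` are hypothetical, and the `.dev` heuristic is an assumption based on how the Transformers main branch is usually versioned.

```python
# Source installs are commonly done with:
#   pip install git+https://github.com/huggingface/transformers.git
from importlib import metadata


def is_dev_version(version: str) -> bool:
    """Heuristic (assumption): source builds carry a '.dev' version suffix."""
    return ".dev" in version


def transformers_from_source() -> bool:
    """Return True if an installed transformers looks like a source build."""
    try:
        return is_dev_version(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        # transformers is not installed at all
        return False


if __name__ == "__main__":
    if transformers_from_source():
        print("transformers looks like a source (dev) build")
    else:
        print("transformers is missing or looks like a release build")
```

A release that already includes the required model support would also work, so a `False` result here is a hint to check the installed version rather than a definite failure.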