gugarosa committed
Commit edb43c8
1 Parent(s): 0ef07d7

Update README.md

Files changed (1)
1. README.md +4 -4
README.md CHANGED
@@ -12,10 +12,10 @@ tags:
 
 ## Model Summary
 
-Phi-3 Mini-128K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model built upon datasets used for Phi-2 - synthetic data and filtered websites - with a focus on very high-quality, reasoning dense data. The model belongs to the Phi-3 model family, and the Mini version comes in two variants [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) which is the context length (in tokens) it can support.
+Phi-3-Mini-128K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained on the Phi-3 datasets, which include both synthetic data and filtered website data, with a focus on high-quality, reasoning-dense properties. The model belongs to the Phi-3 family, with the Mini version available in two variants, [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), which is the context length (in tokens) that it can support.
 
-The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
-When assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-128K-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
+The model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization to improve instruction following and safety.
+When assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-128K-Instruct showcased robust, state-of-the-art performance among models with fewer than 13 billion parameters.
 
 Resources and Technical Documentation:
 
@@ -32,7 +32,7 @@ The model is intended for commercial and research use in English. The model prov
 
 1) Memory/compute constrained environments
 2) Latency bound scenarios
-3) Strong reasoning (especially math and logic)
+3) Strong reasoning (especially code, math and logic)
 
 Our model is designed to accelerate research on language and multimodal models, for use as a building block for generative AI powered features.
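
For readers of the updated summary above, a minimal usage sketch of the 128K-context variant the README describes. This is not part of the commit: it assumes the standard `transformers` `AutoTokenizer`/`AutoModelForCausalLM` loading API, takes the model id from the README's own links, and treats `trust_remote_code=True` as an assumption about the custom Phi-3 modeling code shipped at the time.

```python
# Minimal sketch (not part of this commit): load the 128K-context variant
# described in the summary and run one instruction-following prompt.
# Assumptions: standard `transformers` loading API; model id taken from the
# README's links; `trust_remote_code=True` for Phi-3's custom modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 3.8B weights manageable
    device_map="auto",
    trust_remote_code=True,      # assumption: Phi-3 shipped custom modeling code
)

# The instruct variant is post-trained for chat, so format the prompt with the
# tokenizer's chat template rather than passing raw text.
messages = [
    {"role": "user", "content": "Summarize direct preference optimization in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```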