gugarosa committed
Commit edb43c8
1 Parent(s): 0ef07d7

Update README.md

Files changed (1)
1. README.md +4 -4
README.md CHANGED
@@ -12,10 +12,10 @@ tags:
 
 ## Model Summary
 
-Phi-3 Mini-128K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model built upon datasets used for Phi-2 - synthetic data and filtered websites - with a focus on very high-quality, reasoning dense data. The model belongs to the Phi-3 model family, and the Mini version comes in two variants [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) which is the context length (in tokens) it can support.
+Phi-3-Mini-128K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained on the Phi-3 datasets, which include both synthetic data and filtered website data, with a focus on high-quality, reasoning-dense properties. The model belongs to the Phi-3 family, with the Mini version available in two variants, [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), which is the context length (in tokens) that it can support.
 
-The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
-When assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-128K-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
+The model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization to improve instruction following and safety.
+When assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-128K-Instruct showcased robust, state-of-the-art performance among models with fewer than 13 billion parameters.
 
 Resources and Technical Documentation:
 
@@ -32,7 +32,7 @@ The model is intended for commercial and research use in English. The model prov
 
 1) Memory/compute constrained environments
 2) Latency bound scenarios
-3) Strong reasoning (especially math and logic)
+3) Strong reasoning (especially code, math and logic)
 
 Our model is designed to accelerate research on language and multimodal models, for use as a building block for generative AI powered features.
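
For readers of the updated summary above, a minimal usage sketch of the 128K-context variant the README describes. This is not part of the commit: it assumes the standard `transformers` `AutoTokenizer`/`AutoModelForCausalLM` loading API, takes the model id from the README's own links, and treats `trust_remote_code=True` as an assumption about the custom Phi-3 modeling code shipped at the time.

```python
# Minimal sketch (not part of this commit): load the 128K-context variant
# described in the summary and run one instruction-following prompt.
# Assumptions: standard `transformers` loading API; model id taken from the
# README's links; `trust_remote_code=True` for Phi-3's custom modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 3.8B weights manageable
    device_map="auto",
    trust_remote_code=True,      # assumption: Phi-3 shipped custom modeling code
)

# The instruct variant is post-trained for chat, so format the prompt with the
# tokenizer's chat template rather than passing raw text.
messages = [
    {"role": "user", "content": "Summarize direct preference optimization in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```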