Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -221,7 +221,7 @@ Developers should apply responsible AI best practices and are responsible for en
 
  * Architecture: Phi-3 Mini has 3.8B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.
  * Inputs: Text. It is best suited for prompts using chat format.
- * Context length: 128K tokens
+ * Context length: 4K tokens
  * GPUs: 512 H100-80G
  * Training time: 7 days
  * Training data: 3.3T tokens