perf-analysis-chat / token_limits.json
Mazin Karjikar
Quickstarting llama.cpp (#2)
6f00050 unverified
234 Bytes
{
  "gpt-4o": 128000,
  "gpt-4o-mini": 128000,
  "gpt-4-turbo": 128000,
  "gpt-4": 8192,
  "gpt-3.5-turbo": 16385,
  "gemini-1.5-flash": 1048576,
  "gemini-1.5-pro": 2097152,
  "Meta-Llama-3-8B-Instruct.Q4_K_S": 8000
}