Same as h2oai/h2ogpt-16k-codellama-34b-instruct, but with config.json modified to be 32k for embeddings. The model still functions fine as a 16k model, and the change allows stretching to 32k in vLLM, which otherwise cannot modify the maximum sequence length.
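As a rough illustration of this kind of config.json change, the sketch below rewrites the embedding length in a stand-in config file. It is not the exact edit applied to this repository: the key name `max_position_embeddings`, the values 16384 and 32768 (taken as 16k and 32k), and the helper `stretch_context` are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def stretch_context(config_path: Path, new_length: int = 32768) -> dict:
    """Hypothetical helper: set the position-embedding length in a config.json."""
    config = json.loads(config_path.read_text())
    # Assumption: `max_position_embeddings` is the key that holds the
    # "32k for embeddings" setting; check the actual config.json to confirm.
    config["max_position_embeddings"] = new_length
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Demonstrate on a stand-in config rather than the real model files.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps({"max_position_embeddings": 16384}))
    updated = stretch_context(path)
    reloaded = json.loads(path.read_text())
```

Only the one key is touched, and every other entry in the file is written back unchanged, which matches the description of this model as otherwise identical to the 16k original.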