---
license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
language:
- en
- es
- de
- fr
- pt
- it
- ru
- ja
- zh
- ko
base_model:
- mistralai/Mistral-Large-Instruct-2411
tags:
- conversational
- mlx
---
# Model Card for Mistral-Large-Instruct-2411-MLX

This is a 2-bit quantization of the Mistral-Large-Instruct-2411 model for MLX (Apple Silicon). It was created with the `mlx-lm` library using the following CLI command:
```bash
mlx_lm.convert \
    --hf-path /path/to/your/fp16/model \
    -q \
    --q-bits 2 \
    --q-group-size 32
```
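
The same conversion can also be run from Python. The snippet below is a minimal sketch assuming a recent `mlx-lm` release that exposes `convert` at the package root; the input and output paths are placeholders, not actual paths.

```python
# Sketch: programmatic equivalent of the CLI command above.
# Assumes `pip install mlx-lm`; paths below are placeholders.
from mlx_lm import convert

convert(
    hf_path="/path/to/your/fp16/model",             # local fp16 checkpoint to quantize
    mlx_path="Mistral-Large-Instruct-2411-Q2-MLX",  # output directory (assumed name)
    quantize=True,                                  # same as the -q flag
    q_bits=2,                                       # same as --q-bits 2
    q_group_size=32,                                # same as --q-group-size 32
)
```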

## Quantized Versions

- [2-bit Quantization (Q2)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q2-MLX)
- [4-bit Quantization (Q4)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q4-MLX)

Each version targets a different memory/quality trade-off: Q2 minimizes memory footprint at the cost of more quality degradation, while Q4 uses roughly twice the weight memory but stays closer to the original model's output quality.

## Original Model

The original Mistral-Large-Instruct-2411 model is available [here](https://huggingface.co/mistralai/Mistral-Large-Instruct-2411). Mistral model usage is governed by the [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md).

## License

This model family is governed by the [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md). Please review the license terms before use.

## Table of Contents

- [Quantized Versions](#quantized-versions)
- [Original Model](#original-model)
- [License](#license)
- [Model Details](#model-details)
  - [Model Description](#model-description)
- [Technical Specifications](#technical-specifications)
- [How to Get Started](#how-to-get-started)
- [Model Card Contact](#model-card-contact)

## Model Details

### Model Description

The Mistral-Large-Instruct-2411-MLX family includes quantized versions of the Mistral Large Instruct 2411 model, converted for inference with MLX on Apple Silicon. Quantization reduces memory usage and inference latency, enabling deployment on resource-constrained systems.

- **Developed by:** Mistral AI
- **Model type:** Large language model
- **Language(s):** English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Russian, Korean
- **Quantization levels:** 2-bit (Q2), 4-bit (Q4)

## Technical Specifications

- **Parent Model:** [Mistral-Large-Instruct-2411](https://huggingface.co/mistralai/Mistral-Large-Instruct-2411)
- **Quantization:** 2-bit (Q2), 4-bit (Q4)
- **Framework:** MLX (`mlx-lm` library)

## How to Get Started

Visit the individual quantized repositories for details and usage instructions:

- [2-bit Quantization (Q2)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q2-MLX)
- [4-bit Quantization (Q4)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q4-MLX)
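
As a quick start, the sketch below loads one of the quantized repos with the `mlx-lm` Python API on an Apple Silicon Mac. It assumes `mlx-lm` is installed and uses the Q4 repo listed above as an example; swap in the Q2 repo id for lower memory use.

```python
# Minimal sketch: load a quantized MLX checkpoint and generate a response.
# Assumes `pip install mlx-lm` on an Apple Silicon machine.
from mlx_lm import load, generate

model, tokenizer = load("zachlandes/Mistral-Large-Instruct-2411-Q4-MLX")

# Format the request with the model's chat template, since this is an instruct model.
messages = [{"role": "user", "content": "Summarize the benefits of 4-bit quantization."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```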

## Model Card Contact

For inquiries, contact [Zach Landes](https://www.linkedin.com/in/zachlandes/).