---
language:
- en
pipeline_tag: text-generation
tags:
- Pytorch
- Qwen
- English
- code
- conversational
---

# SandLogic Technologies - Quantized Nxcode-CQ-7B-orpo Models

## Model Description

We have quantized the Nxcode-CQ-7B-orpo model into two variants:

1. Q5_K_M
2. Q4_K_M

These quantized models offer improved efficiency while maintaining performance.

## Original Model Information

- **Name**: [Nxcode-CQ-7B-orpo](https://huggingface.co/NTQAI/Nxcode-CQ-7B-orpo)
- **Base Model**: Qwen/CodeQwen1.5-7B
- **Fine-tuning Approach**: Monolithic Preference Optimization without Reference Model (ORPO)
- **Fine-tuning Data**: 100k samples of high-quality ranking data
- **Model Type**: Transformer-based decoder-only language model
- **Parameters**: 7 billion
- **Context Length**: 64K tokens
- **Supported Languages**: 92 coding languages

## Model Capabilities

Nxcode-CQ-7B-orpo is designed for code-related tasks, with strong performance in:

- Code generation
- Long context understanding and generation
- Text-to-SQL conversion (see the sketch below)
- Bug fixing
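
As a quick illustration of the text-to-SQL capability, here is a minimal sketch using llama-cpp-python (installation is covered under Usage below; the model path, schema, and question are illustrative assumptions, not part of this repo):

```python
from llama_cpp import Llama

# Illustrative local path; see the Download section for fetching a GGUF file.
llm = Llama(model_path="./models/7B/Nxcode-CQ-7b.gguf", verbose=False)

output = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "Given the table users(id, name, signup_date), "
                   "write a SQL query that counts users who signed up in 2023.",
    }]
)
print(output["choices"][0]["message"]["content"])
```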

## Performance

EvalPlus benchmark results:

- HumanEval pass@1: 86.6
- HumanEval+ pass@1: 83.5
- MBPP (v0.2.0) pass@1: 82.3
- MBPP+ (v0.2.0) pass@1: 70.4

## Use Cases

1. **Code Generation**: Create Python code based on function descriptions or partial implementations
2. **Code Completion**: Suggest completions for partially written code (see the sketch after this list)
3. **Error Understanding**: Potential to help identify and explain coding errors
4. **Programming Education**: Provide explanations and examples of coding concepts and patterns
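
A minimal sketch of use case 2, using llama-cpp-python's completion-style API (the path and prompt are illustrative assumptions):

```python
from llama_cpp import Llama

# Illustrative local path; see the Download section for fetching a GGUF file.
llm = Llama(model_path="./models/7B/Nxcode-CQ-7b.gguf", verbose=False)

# Ask the model to finish a partially written function.
prompt = 'def is_prime(n):\n    """Return True if n is prime."""\n'
output = llm(prompt, max_tokens=128, stop=["\ndef "])

print(prompt + output["choices"][0]["text"])
```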

## Model Variants

We offer two quantized versions of the Nxcode-CQ-7B-orpo model:

1. **Q5_K_M**: 5-bit quantization using the K_M method
2. **Q4_K_M**: 4-bit quantization using the K_M method

These quantized models aim to reduce model size and improve inference speed while keeping performance as close to the original model as possible.
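
Either variant loads the same way; only the GGUF filename changes. Here is a sketch using the `from_pretrained` helper described under Download below (the Q4_K_M glob pattern is inferred from this repo's file naming and may need adjusting):

```python
from llama_cpp import Llama

# Pick the 4-bit file for a smaller footprint, or the Q5_K_M file for higher fidelity.
llm = Llama.from_pretrained(
    repo_id="SandLogicTechnologies/Nxcode-CQ-7B-orpo-GGUF",
    filename="*Q4_K_M.gguf",  # glob pattern matched against files in the repo
    verbose=False,
)
```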

## Input and Output

- **Input**: Text string (e.g., function descriptions, partial code implementations)
- **Output**: Generated code, completions, or explanations based on the input

## Usage

Install llama-cpp-python:

```bash
pip install llama-cpp-python
```

Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.
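
For example, a CUDA-enabled build is commonly installed by passing CMake flags through the environment (the exact flag varies by backend and version; verify against the documentation):

```bash
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```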

### Basic Text Completion

Here's an example demonstrating how to use the high-level API for basic text completion:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/7B/Nxcode-CQ-7b.gguf",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You're an AI coding assistant who helps in solving coding questions."},
        {"role": "user", "content": "Write a Python program to find prime numbers."},
    ]
)

print(output["choices"][0]["message"]["content"])
```

## Download

You can download `Llama` models in `gguf` format directly from Hugging Face using the `from_pretrained` method. This feature requires the `huggingface-hub` package.

To install it, run: `pip install huggingface-hub`

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="SandLogicTechnologies/Nxcode-CQ-7B-orpo-GGUF",
    filename="*Nxcode-CQ-7B-orpo-Q5_K_M.gguf",
    verbose=False
)
```

By default, `from_pretrained` will download the model to the Hugging Face cache directory. You can manage installed model files with the `huggingface-cli` tool.
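
For instance, you can inspect and clean the cache with `huggingface-cli`'s cache commands:

```bash
# Show cached repos, revisions, and sizes
huggingface-cli scan-cache

# Interactively select cached revisions to delete
huggingface-cli delete-cache
```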

## Acknowledgements

We thank the original developers of Nxcode-CQ-7B-orpo and Qwen/CodeQwen1.5-7B for their contributions to the field. Special thanks to Georgi Gerganov and the entire llama.cpp development team for their outstanding contributions.

## Contact

For any inquiries or support, please contact us at support@sandlogic.com or visit our [support page](https://www.sandlogic.com/LingoForge/support).