Add file about AWS SageMaker

#4
by JoohoSong - opened
Files changed (1):
  1. Llama-3-1-Varco-8B.ipynb  +0 -343
Llama-3-1-Varco-8B.ipynb (DELETED):

# Deploy Llama-VARCO-8B-Instruct Model from AWS Marketplace

Llama-VARCO-8B-Instruct is a generative model built with Llama, specifically designed to excel in Korean through additional training. The model uses continual pre-training with both Korean and English datasets to enhance its understanding and generation capabilities in Korean, while also maintaining its proficiency in English. It was further trained with supervised fine-tuning (SFT) and direct preference optimization (DPO) on Korean data to align with human preferences.

This sample notebook shows you how to deploy [Llama-VARCO-8B-Instruct](https://aws.amazon.com/marketplace/pp/prodview-pynin2e23lb3e) using Amazon SageMaker.

> **Note**: This is a reference notebook and it cannot run unless you make the changes suggested in the notebook.

## Pre-requisites:
1. **Note**: This notebook contains elements which render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has **AmazonSageMakerFullAccess**.
1. To deploy this ML model successfully, ensure that:
    1. Your IAM role has these three permissions and you have authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
3. [Clean-up](#3.-Clean-up)

## Usage instructions
You can run this notebook one cell at a time (use Shift+Enter to run a cell).

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package [listing page](https://aws.amazon.com/marketplace/pp/prodview-pynin2e23lb3e).
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the EULA, pricing, and support terms, and click **Accept Offer** if you and your organization agree to them.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product ARN** displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.

```python
model_package_arn = "arn:aws:sagemaker:us-west-2:594846645681:model-package/llama-varco-8b-ist-bedrock-37339dbb44f23f488e24f8671eaa0494"
```

```python
import json

import boto3
import sagemaker as sage
from sagemaker import ModelPackage, get_execution_role
```

```python
# IAM role and SageMaker session used to create and deploy the model.
role = get_execution_role()
sagemaker_session = sage.Session()

bucket = sagemaker_session.default_bucket()
runtime = boto3.client("runtime.sagemaker")
```

## 2. Create an endpoint and perform real-time inference

If you want to understand how real-time inference with Amazon SageMaker works, see the [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

```python
model_name = "Llama-VARCO-8B-Instruct"

content_type = "application/json"

real_time_inference_instance_type = "ml.g5.12xlarge"
batch_transform_inference_instance_type = "ml.g4dn.12xlarge"
```

### A. Create an endpoint

```python
# Create a deployable model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

# Deploy the model to a real-time endpoint.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type=real_time_inference_instance_type,
    endpoint_name=model_name,
)
```

Once the endpoint has been created, you will be able to perform real-time inference.
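
If you want to confirm that the endpoint is ready before sending requests, you can check its status via the SageMaker API. This is a minimal sketch using the boto3 `describe_endpoint` call; it assumes the endpoint name is the `model_name` used above.

```python
# Optional: verify the endpoint is "InService" before invoking it.
sm_client = boto3.client("sagemaker")
status = sm_client.describe_endpoint(EndpointName=model_name)["EndpointStatus"]
print(f"Endpoint status: {status}")
```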

### B. Create the input payload

```python
# Chat-style request payload; the user message asks "Hi, who are you?" in Korean.
payload = {
    "messages": [
        {
            "role": "user",
            "content": "안녕 넌 누구야?"
        }
    ]
}
```
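
The payload above uses a chat-style `messages` format. Depending on the serving container, additional generation parameters may also be accepted; the names below (`max_tokens`, `temperature`) are assumptions for illustration only and should be checked against the listing's usage information.

```python
# Hypothetical payload with generation parameters; the parameter names are
# assumptions and may differ for this container -- consult the listing docs.
payload_with_params = {
    "messages": [
        {"role": "user", "content": "안녕 넌 누구야?"}  # "Hi, who are you?"
    ],
    "max_tokens": 256,
    "temperature": 0.7,
}
```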

### C. Perform real-time inference

##### C-1. Streaming inference example

```python
class VarcoInferenceStream:
    def __init__(self, sagemaker_runtime, endpoint_name):
        self.sagemaker_runtime = sagemaker_runtime
        self.endpoint_name = endpoint_name

    def stream_inference(self, request_body):
        # Get a streaming inference response from the specified model endpoint.
        response = self.sagemaker_runtime.invoke_endpoint_with_response_stream(
            EndpointName=self.endpoint_name,
            Body=json.dumps(request_body),
            ContentType="application/json",
        )
        # Iterate over the EventStream object returned by the SDK and yield
        # each decoded payload chunk as it arrives.
        for body in response["Body"]:
            raw = body["PayloadPart"]["Bytes"]
            yield raw.decode()


sm_runtime = boto3.client("sagemaker-runtime")
varco_inference_stream = VarcoInferenceStream(sm_runtime, model_name)
stream = varco_inference_stream.stream_inference(payload)
for part in stream:
    print(part, end="")
```
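
##### C-2. Non-streaming inference example (optional)

If you do not need token-by-token output, you can also call the endpoint with the standard `invoke_endpoint` API and read the full response at once. This is a minimal sketch; the exact shape of the response body depends on the serving container, so it is simply parsed as JSON and printed.

```python
# Invoke the endpoint synchronously and read the complete response body.
response = sm_runtime.invoke_endpoint(
    EndpointName=model_name,
    Body=json.dumps(payload),
    ContentType="application/json",
)
result = json.loads(response["Body"].read().decode("utf-8"))
print(json.dumps(result, ensure_ascii=False, indent=2))
```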

## 3. Clean-up

Now that you have successfully performed real-time inference, you no longer need the endpoint. You can delete the endpoint to avoid being charged.

### A. Delete the endpoint

```python
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
```

### B. Delete the model

```python
model.delete_model()
```

### C. Unsubscribe from the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model (see the sketch after these steps).

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing that you want to cancel the subscription for, and then choose __Cancel Subscription__.
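
To double-check for deployable models created from this package before cancelling, you can list your SageMaker models and inspect their containers. This is a minimal sketch using the boto3 `list_models` and `describe_model` calls; it only prints each model's container image and (if present) the model package it was created from.

```python
# List SageMaker models and print the container each one uses, so you can spot
# models created from this model package (pagination omitted for brevity).
sm_client = boto3.client("sagemaker")
for summary in sm_client.list_models()["Models"]:
    detail = sm_client.describe_model(ModelName=summary["ModelName"])
    container = detail.get("PrimaryContainer") or (detail.get("Containers") or [{}])[0]
    print(summary["ModelName"], container.get("Image", ""), container.get("ModelPackageName", ""))
```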

Notebook kernel: conda_pytorch_p310 (Python 3.10.14); notebook instance type: ml.t3.medium.