Commit 9f32193 by Chris-Alexiuk
Parent(s): 91d311f
Update Reference Container for Nemo FW
README.md CHANGED

````diff
@@ -31,7 +31,7 @@ Under the NVIDIA Open Model License, NVIDIA confirms:
 
 ### Intended use
 
-Nemotron-4-340B-Base is a completion model intended for use in over 50+ natural and 40+ coding languages. It is compatible with [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html). For best performance on a given task, users are encouraged to customize the model using the NeMo Framework suite of customization tools including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA, and more), and Model Alignment (SFT, SteerLM, RLHF, and more) using [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner).
+Nemotron-4-340B-Base is a completion model intended for use in over 50+ natural and 40+ coding languages. It is compatible with [NVIDIA NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html). For best performance on a given task, users are encouraged to customize the model using the NeMo Framework suite of customization tools including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA, and more), and Model Alignment (SFT, SteerLM, RLHF, and more) using [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner). Refer to the [documentation](https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/nemotron/index.html) for examples.
 
 **Model Developer:** NVIDIA
 
@@ -105,7 +105,7 @@ print(response)
 ```
 
-2. Given this Python script, create a Bash script which spins up the inference server within the [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) (```docker pull nvcr.io/nvidia/nemo:24.
+2. Given this Python script, create a Bash script which spins up the inference server within the [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) (```docker pull nvcr.io/nvidia/nemo:24.05```) and calls the Python script ``call_server.py``. The Bash script ``nemo_inference.sh`` is as follows:
 
 ```bash
@@ -171,7 +171,7 @@ RESULTS=<PATH_TO_YOUR_SCRIPTS_FOLDER>
 OUTFILE="${RESULTS}/slurm-%j-%n.out"
 ERRFILE="${RESULTS}/error-%j-%n.out"
 MODEL=<PATH_TO>/Nemotron-4-340B-Base
-CONTAINER="nvcr.io/nvidia/nemo:24.
+CONTAINER="nvcr.io/nvidia/nemo:24.05"
 MOUNTS="--container-mounts=<PATH_TO_YOUR_SCRIPTS_FOLDER>:/scripts,MODEL:/model"
 
 read -r -d '' cmd <<EOF
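The SLURM variables touched by this commit can be assembled and sanity-checked locally before submitting a job. This is a minimal sketch, not part of the README: the paths are hypothetical placeholders, and only the `CONTAINER` image reference (`nvcr.io/nvidia/nemo:24.05`) comes from the diff above.

```shell
#!/usr/bin/env bash
# Sketch: assemble the job variables from the README fragment above.
# RESULTS and MODEL are placeholder paths, not real directories.
RESULTS="/tmp/results"
OUTFILE="${RESULTS}/slurm-%j-%n.out"
ERRFILE="${RESULTS}/error-%j-%n.out"
MODEL="/models/Nemotron-4-340B-Base"
CONTAINER="nvcr.io/nvidia/nemo:24.05"
MOUNTS="--container-mounts=/scripts:/scripts,${MODEL}:/model"

# Split the image reference into repository and tag: a quick check that
# the updated container tag is complete (a bare "24." would be invalid).
TAG="${CONTAINER##*:}"
REPO="${CONTAINER%:*}"
echo "repo=${REPO} tag=${TAG}"
```

Running the script prints the parsed repository and tag, confirming the reference is well-formed before it is handed to `docker pull` or the SLURM submission.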