add note for multi-gpu training with example dataset

- README.md (+2 -0)
- configs/metadata.json (+2 -1)
- docs/README.md (+2 -0)
README.md
@@ -99,6 +99,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides the samples into 16 training samples, 2 validation samples and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least as large as the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle with the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```
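The added note boils down to a simple sample-count constraint. As a quick illustration (a standalone sketch, not part of the bundle, using no MONAI or PyTorch APIs), the 16/2/2 split produced by the preprocessing script caps the usable GPU count at 2, because every dataloader needs at least one sample per distributed process:

```python
# Standalone sketch of the constraint described in the added note:
# the preprocessing script splits the 20-sample example dataset into
# 16 training / 2 validation / 2 test samples, and each GPU (process)
# needs at least one sample from every dataloader.
splits = {"train": 16, "val": 2, "test": 2}

for num_gpus in (1, 2, 4):
    feasible = all(count >= num_gpus for count in splits.values())
    status = "OK" if feasible else "too few samples in the smallest split"
    print(f"{num_gpus} GPU(s): {status}")
```

With the example dataset this prints "OK" for 1 and 2 GPUs only, which is why the note recommends a `--nproc_per_node` of at most 2.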
configs/metadata.json
@@ -1,7 +1,8 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
-    "version": "0.3.3",
+    "version": "0.3.4",
     "changelog": {
+        "0.3.4": "add note for multi-gpu training with example dataset",
         "0.3.3": "enhance data preprocess script and readme file",
         "0.3.2": "restructure readme to match updated template",
         "0.3.1": "add workflow, train loss and validation accuracy figures",
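For completeness, a minimal sketch (not part of the bundle; it assumes you run it from the bundle root so that `configs/metadata.json` resolves) showing how the bumped version and its changelog entry can be read back with the standard library:

```python
import json

# Read the bundle metadata updated above (run from the bundle root,
# where configs/metadata.json lives).
with open("configs/metadata.json") as f:
    metadata = json.load(f)

version = metadata["version"]
print("bundle version:", version)                    # 0.3.4
print("changelog:", metadata["changelog"][version])  # the note added in this change
```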
docs/README.md
@@ -92,6 +92,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides the samples into 16 training samples, 2 validation samples and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least as large as the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle with the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```