Model exported successfully to yolo_nas_pose_s_fp32.onnx
Model expects input image of shape [1, 3, 640, 640]
Input image dtype is torch.uint8

Exported model already contains the preprocessing (normalization) step, so you do not need to apply it manually.
The preprocessing steps applied to the input image inside the model are:
Sequential(
  (0): CastTensorTo(dtype=torch.float32)
  (1): ChannelSelect(channels_indexes=tensor([2, 1, 0]))
  (2): ApplyMeanStd(mean=[0.], scale=[255.])
)
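
For reference, the three steps above can be approximated in NumPy. This is only an illustrative sketch, since the exported model performs them internally. It assumes that ApplyMeanStd computes (x - mean) / scale and that ChannelSelect reorders the channel axis of an NCHW tensor; the function name emulate_preprocessing is hypothetical and not part of any library.

```python
import numpy as np

def emulate_preprocessing(image_nchw_uint8: np.ndarray) -> np.ndarray:
    """Approximate the preprocessing embedded in the exported model."""
    x = image_nchw_uint8.astype(np.float32)   # CastTensorTo(dtype=torch.float32)
    x = x[:, [2, 1, 0], :, :]                 # ChannelSelect(channels_indexes=[2, 1, 0])
    mean, scale = 0.0, 255.0                  # ApplyMeanStd(mean=[0.], scale=[255.])
    return (x - mean) / scale                 # assumed formula, not confirmed by the log

example = np.zeros((1, 3, 640, 640), dtype=np.uint8)
example[:, 0] = 255  # fill the first channel only
normalized = emulate_preprocessing(example)
print(normalized.dtype, normalized.shape)
```

In practice this means the model should receive raw uint8 pixel values in the 0..255 range; dividing by 255 beforehand would normalize the image twice.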


Exported model contains postprocessing (NMS) step with the following parameters:
    num_pre_nms_predictions=1000
    max_predictions_per_image=10
    nms_threshold=0.2
    confidence_threshold=0.15
    output_predictions_format=flat


Exported model is in ONNX format and can be used with ONNXRuntime
To run inference with ONNXRuntime, please use the following code snippet:

    import onnxruntime
    import numpy as np

    # Providers are tried in order: CUDA first, then CPU.
    session = onnxruntime.InferenceSession("yolo_nas_pose_s_fp32.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
    inputs = [o.name for o in session.get_inputs()]
    outputs = [o.name for o in session.get_outputs()]

    # Dummy uint8 batch matching the expected [1, 3, 640, 640] input shape.
    example_input_image = np.zeros((1, 3, 640, 640), dtype=np.uint8)
    predictions = session.run(outputs, {inputs[0]: example_input_image})
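
The snippet above feeds an all-zero tensor. A real image first has to be brought to shape [1, 3, 640, 640] with dtype uint8. The following sketch uses only NumPy: it pads an HxWx3 uint8 image that already fits into 640x640 and moves the channels first. The helper name to_model_input and the padding value 114 are hypothetical choices for illustration; resizing larger images (for example with OpenCV or Pillow) and matching the padding strategy used during training are not covered here.

```python
import numpy as np

def to_model_input(image_hwc: np.ndarray, size: int = 640, pad_value: int = 114) -> np.ndarray:
    """Pad an HxWx3 uint8 image to size x size and convert it to a 1x3xHxW batch."""
    if image_hwc.dtype != np.uint8 or image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError("expected an HxWx3 uint8 image")
    height, width = image_hwc.shape[:2]
    if height > size or width > size:
        raise ValueError("image must be resized to fit before padding")
    canvas = np.full((size, size, 3), pad_value, dtype=np.uint8)
    # Top-left placement keeps predicted coordinates aligned with the original image.
    canvas[:height, :width] = image_hwc
    # HWC -> CHW, then add the batch dimension.
    return np.ascontiguousarray(canvas.transpose(2, 0, 1)[np.newaxis])

dummy_image = np.zeros((480, 600, 3), dtype=np.uint8)
input_tensor = to_model_input(dummy_image)
print(input_tensor.shape, input_tensor.dtype)
```

The resulting input_tensor can be passed to session.run in place of example_input_image.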

Exported model can also be used with TensorRT
To run inference with TensorRT, see the TensorRT deployment documentation
You can benchmark the model using the following command:

    trtexec --onnx=yolo_nas_pose_s_fp32.onnx --fp16 --avgRuns=100 --duration=15


Exported model has predictions in flat format:

    # flat_predictions is a 2D array of [N,K] shape
    # Each row represents (image_index, x_min, y_min, x_max, y_max, confidence, joints...)
    # Each joint contributes three values (x, y, confidence), so K = 6 + 3 * num_joints
    # Please note all values are floats, so you have to convert them to integers if needed

    [flat_predictions] = predictions
    pred_bboxes = flat_predictions[:, 1:5]
    pred_scores = flat_predictions[:, 5]
    pred_joints = flat_predictions[:, 6:].reshape((len(pred_bboxes), -1, 3))
    for i in range(len(pred_bboxes)):
        confidence = pred_scores[i]
        x_min, y_min, x_max, y_max = pred_bboxes[i]
        print(f"Detected pose with confidence={confidence}, x_min={x_min}, y_min={y_min}, x_max={x_max}, y_max={y_max}")
        for joint_index, (x, y, joint_confidence) in enumerate(pred_joints[i]):
            print(f"Joint {joint_index} has coordinates x={x}, y={y}, confidence={joint_confidence}")
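
Because all values in the flat format are floats, downstream code often needs integer pixel coordinates. The sketch below shows one possible way to turn each row into a small record with rounded coordinates. The function name parse_flat_predictions and the dictionary keys are hypothetical, and the input is a synthetic row with two joints rather than real model output.

```python
import numpy as np

def parse_flat_predictions(flat_predictions: np.ndarray) -> list:
    """Convert rows of (image_index, x_min, y_min, x_max, y_max, confidence, joints...) to dicts."""
    detections = []
    for row in flat_predictions:
        joints = row[6:].reshape(-1, 3)  # one (x, y, confidence) triple per joint
        detections.append({
            "image_index": int(row[0]),
            "bbox_xyxy": [int(round(float(v))) for v in row[1:5]],
            "confidence": float(row[5]),
            "joints_xy": np.rint(joints[:, :2]).astype(int),
            "joints_confidence": joints[:, 2],
        })
    return detections

# Synthetic example: one detection with two joints (a real model outputs more joints per pose).
fake_predictions = np.array(
    [[0, 10.4, 20.6, 110.2, 220.8, 0.9, 15.0, 25.0, 0.8, 31.0, 42.0, 0.7]],
    dtype=np.float32,
)
parsed = parse_flat_predictions(fake_predictions)
print(parsed[0]["bbox_xyxy"], parsed[0]["joints_xy"].tolist())
```

An empty [0, K] array yields an empty list, which covers the case where no pose passes the confidence threshold.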