Model exported successfully to yolo_nas_pose_l_fp16.onnx
Model expects input image of shape [1, 3, 640, 640]
Input image dtype is torch.uint8
Exported model already contains preprocessing (normalization) step, so you don't need to do it manually.
Preprocessing steps to be applied to input image are:
Sequential(
  (0): CastTensorTo(dtype=torch.float16)
  (1): ChannelSelect(channels_indexes=tensor([2, 1, 0], device='cuda:0'))
  (2): ApplyMeanStd(mean=[0.], scale=[255.])
)

Exported model contains postprocessing (NMS) step with the following parameters:
    num_pre_nms_predictions=1000
    max_predictions_per_image=10
    nms_threshold=0.2
    confidence_threshold=0.15
    output_predictions_format=flat

Exported model is in ONNX format and can be used with ONNXRuntime
To run inference with ONNXRuntime, please use the following code snippet:

    import onnxruntime
    import numpy as np
    session = onnxruntime.InferenceSession("yolo_nas_pose_l_fp16.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
    inputs = [o.name for o in session.get_inputs()]
    outputs = [o.name for o in session.get_outputs()]
    example_input_image = np.zeros((1, 3, 640, 640)).astype(np.uint8)
    predictions = session.run(outputs, {inputs[0]: example_input_image})

Exported model can also be used with TensorRT
To run inference with TensorRT, please see TensorRT deployment documentation
You can benchmark the model using the following code snippet:

    trtexec --onnx=yolo_nas_pose_l_fp16.onnx --fp16 --avgRuns=100 --duration=15

Exported model has predictions in flat format:

    # flat_predictions is a 2D array of [N,K] shape
    # Each row represents (image_index, x_min, y_min, x_max, y_max, confidence, joints...)
    # Please note all values are floats, so you have to convert them to integers if needed
    [flat_predictions] = predictions
    pred_bboxes = flat_predictions[:, 1:5]
    pred_scores = flat_predictions[:, 5]
    pred_joints = flat_predictions[:, 6:].reshape((len(pred_bboxes), -1, 3))
    for i in range(len(pred_bboxes)):
        confidence = pred_scores[i]
        x_min, y_min, x_max, y_max = pred_bboxes[i]
        print(f"Detected pose with confidence={confidence}, x_min={x_min}, y_min={y_min}, x_max={x_max}, y_max={y_max}")
        for joint_index, (x, y, confidence) in enumerate(pred_joints[i]):
            print(f"Joint {joint_index} has coordinates x={x}, y={y}, confidence={confidence}")
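
The snippet above reads columns 1 onward and never uses the first column, image_index. When more than one image is run, it may help to group rows by that index first. The following is a minimal sketch, assuming the row layout stated in the export message (image_index, four box coordinates, confidence, then joints) and three values per joint, as the reshape in the snippet implies. The helper name group_poses_by_image and the example rows are hypothetical and are not part of the exporting library.

```python
from collections import defaultdict

# Assumed row layout, taken from the export message:
# (image_index, x_min, y_min, x_max, y_max, confidence, x0, y0, c0, x1, y1, c1, ...)
VALUES_PER_JOINT = 3  # (x, y, confidence), as implied by reshape((N, -1, 3))


def group_poses_by_image(flat_predictions):
    """Group flat pose rows by image index (hypothetical helper).

    Accepts any sequence of rows (nested lists or a 2D NumPy array) and
    returns a dict mapping image_index to a list of pose dicts.
    """
    grouped = defaultdict(list)
    for row in flat_predictions:
        values = [float(v) for v in row]
        joints_flat = values[6:]
        if len(joints_flat) % VALUES_PER_JOINT != 0:
            raise ValueError("joint values are not a multiple of 3")
        # Split the trailing values into (x, y, confidence) triples.
        joints = [
            tuple(joints_flat[j:j + VALUES_PER_JOINT])
            for j in range(0, len(joints_flat), VALUES_PER_JOINT)
        ]
        grouped[int(values[0])].append(
            {"bbox": tuple(values[1:5]), "score": values[5], "joints": joints}
        )
    return dict(grouped)


# Made-up rows for illustration: one pose per image, two joints each.
example_rows = [
    [0, 10.0, 20.0, 110.0, 220.0, 0.9, 15.0, 25.0, 0.8, 50.0, 60.0, 0.7],
    [1, 5.0, 6.0, 50.0, 60.0, 0.5, 7.0, 8.0, 0.4, 9.0, 10.0, 0.3],
]
poses = group_poses_by_image(example_rows)
for image_index, image_poses in sorted(poses.items()):
    print(image_index, len(image_poses), image_poses[0]["bbox"])
```

With real output, flat_predictions from the snippet above could be passed in directly, since the helper only iterates over rows. Images with no detections do not appear as keys in the result.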