API Reference
This page provides a quick reference for all model classes, result types, and configuration methods available in the RZ/V2H RDK AI model packages.
For guidance on creating your own model, see How to Add a New Model.
Core Data Types (rzv_model)
These types are defined in rzv_model/base_model.hpp and rzv_model/utils.hpp.
ModelInput
struct ModelInput
{
    cv::Mat original_image; // Input image (YUV422 or RGB format)
    cv::Rect roi;           // Region of interest within the image
};
ModelResult (base class)
All result types inherit from this class, which carries timing information from each inference stage.
struct ModelResult
{
    float score = 0.0f;
    float preprocess_ms = 0.0f;  // Time spent in preprocessing
    float inference_ms = 0.0f;   // Time spent in DRP-AI inference
    float postprocess_ms = 0.0f; // Time spent in postprocessing
};
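The three timing fields sum to the end-to-end latency of one inference call. A minimal sketch, with `ModelResult` mirrored locally so the snippet is self-contained:

```cpp
#include <cassert>

// Local mirror of ModelResult's timing fields, for illustration only;
// the real struct lives in rzv_model/base_model.hpp.
struct ModelResult {
    float score = 0.0f;
    float preprocess_ms = 0.0f;
    float inference_ms = 0.0f;
    float postprocess_ms = 0.0f;
};

// End-to-end latency is the sum of the three stage timings.
float total_ms(const ModelResult &r) {
    return r.preprocess_ms + r.inference_ms + r.postprocess_ms;
}
```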
KeyPoint / KeyPointResult
Used by pose estimation models (HRNetV2, RTMPose, MediaPipe).
struct KeyPoint
{
    float x;
    float y;
    float confidence;
    int class_id;
};

struct KeyPointResult : public ModelResult
{
    std::vector<KeyPoint> keypoints;
};
ModelShapeInfo
Provides tensor shape information extracted from the loaded model.
struct ModelShapeInfo
{
    std::vector<int64_t> input_shape;
    std::string input_dtype;
    std::vector<std::vector<int64_t>> output_shapes;
    std::vector<std::string> output_dtypes;

    int input_height() const;   // input_shape[2]
    int input_width() const;    // input_shape[3]
    int input_channels() const; // input_shape[1]
};
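The accessor comments imply an NCHW input layout (batch, channels, height, width). A self-contained sketch of that indexing, with the struct mirrored locally for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative mirror of ModelShapeInfo's accessors; the real struct is
// declared in rzv_model/utils.hpp. input_shape is assumed NCHW.
struct ShapeInfo {
    std::vector<int64_t> input_shape; // e.g. {1, 3, 480, 640}

    int input_channels() const { return static_cast<int>(input_shape[1]); }
    int input_height()   const { return static_cast<int>(input_shape[2]); }
    int input_width()    const { return static_cast<int>(input_shape[3]); }
};
```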
YUV422Format
enum class YUV422Format { YUYV, UYVY };
BaseModel Class
The base class for all AI models. Defined in rzv_model/base_model.hpp.
Public methods:
| Method | Description |
|---|---|
| `load()` | Load a DRP-AI model from the given directory path. |
| | Check whether a model has been loaded. |
| | Get input/output tensor shape information (`ModelShapeInfo`). |
| `run<T>()` | Run inference and return a typed result. |
Protected methods (override in subclasses):
| Method | Description |
|---|---|
| | Parse raw output tensors into a result object. |
| | Custom preprocessing before inference. |
| | CPU fallback when hardware preprocessing is unavailable. |
| | CPU preprocessing with optional ImageNet normalization. |
| | Extract custom shapes after model load. |
Protected helper methods:
| Method | Description |
|---|---|
| | Resize and pad image while maintaining aspect ratio. |
| | Check if DRP-AI hardware preprocessing is available. |
| | Map a point from preprocessed to original image coordinates. |
| | Map a size from preprocessed to original image coordinates. |
| | Set the padding color for letterbox. |
| | Logging macros (uses spdlog). |
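The resize-and-pad helper and the coordinate-mapping helpers above suggest a standard letterbox transform. The names and exact formulas below are a hypothetical sketch of that math, not the library's actual implementation: scale to fit, center with padding, then invert the transform to map model-space points back to the original image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical letterbox parameters: scale factor plus centering padding.
struct Letterbox {
    float scale; // original -> preprocessed scale factor
    float pad_x; // horizontal padding in the preprocessed image
    float pad_y; // vertical padding
};

// Fit (src_w, src_h) into (dst_w, dst_h) preserving aspect ratio.
Letterbox make_letterbox(int src_w, int src_h, int dst_w, int dst_h) {
    float scale = std::min(static_cast<float>(dst_w) / src_w,
                           static_cast<float>(dst_h) / src_h);
    float pad_x = (dst_w - src_w * scale) / 2.0f;
    float pad_y = (dst_h - src_h * scale) / 2.0f;
    return {scale, pad_x, pad_y};
}

// Map a point from preprocessed (model input) coordinates back to the
// original image by undoing the padding and the scaling.
void map_to_original(const Letterbox &lb, float px, float py,
                     float &ox, float &oy) {
    ox = (px - lb.pad_x) / lb.scale;
    oy = (py - lb.pad_y) / lb.scale;
}
```

For example, a 1920x1080 frame letterboxed into a 640x640 model input is scaled by 1/3 and padded 140 px top and bottom, so the preprocessed point (320, 320) maps back to (960, 540).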
Utils Class
Static utility functions defined in rzv_model/utils.hpp.
| Method | Description |
|---|---|
| | Convert BGR image to YUV422 (YUYV or UYVY). |
| | Convert RGBA image to YUV422 (YUYV or UYVY). |
| | Batched NMS (non-maximum suppression) for detections. |
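In YUV422 packing, two horizontally adjacent pixels share one U/V sample, which is what makes the format half the size of planar YUV444. The sketch below illustrates BT.601-style BGR-to-YUYV packing on plain byte buffers; it is an assumption for illustration, not the actual converter in `rzv_model/utils.hpp`, which operates on `cv::Mat`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Round and clamp a float sample into the 0..255 byte range.
static uint8_t clamp_u8(float v) {
    return static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, v + 0.5f)));
}

// src: interleaved BGR bytes, width must be even.
// dst: YUYV, i.e. Y0 U Y1 V for each horizontal pixel pair.
std::vector<uint8_t> bgr_to_yuyv(const std::vector<uint8_t> &src,
                                 int width, int height) {
    std::vector<uint8_t> dst(static_cast<size_t>(width) * height * 2);
    for (int i = 0; i < width * height; i += 2) {
        float b0 = src[i * 3],     g0 = src[i * 3 + 1], r0 = src[i * 3 + 2];
        float b1 = src[i * 3 + 3], g1 = src[i * 3 + 4], r1 = src[i * 3 + 5];
        // Per-pixel luma; chroma averaged over the pair (BT.601 full range).
        float y0 = 0.299f * r0 + 0.587f * g0 + 0.114f * b0;
        float y1 = 0.299f * r1 + 0.587f * g1 + 0.114f * b1;
        float b = (b0 + b1) / 2, g = (g0 + g1) / 2, r = (r0 + r1) / 2;
        float u = -0.169f * r - 0.331f * g + 0.500f * b + 128.0f;
        float v =  0.500f * r - 0.419f * g - 0.081f * b + 128.0f;
        dst[i * 2]     = clamp_u8(y0);
        dst[i * 2 + 1] = clamp_u8(u);
        dst[i * 2 + 2] = clamp_u8(y1);
        dst[i * 2 + 3] = clamp_u8(v);
    }
    return dst;
}
```

The UYVY variant differs only in byte order (U Y0 V Y1).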
Object Detection Models
The following models are provided for object detection tasks. Each model class inherits from BaseModel and implements the required methods for loading, preprocessing, inference, and postprocessing.
rzv_yolox – YoloxModel
Header: rzv_yolox/yolox_model.hpp | Inherits: BaseModel
Result type:
struct YOLOXDetection
{
    cv::Rect bbox;
    int class_id;
    float confidence;
    bool is_valid = false;
    std::string class_name;
};

struct YOLOXDetectionResult : public ModelResult
{
    std::vector<YOLOXDetection> detections;
};
Configuration methods:
| Method | Description |
|---|---|
| `set_class_names()` | Set class labels (must match model training). |
| `set_confidence_threshold()` | Set detection confidence threshold (0.0 - 1.0). |
| `set_iou_threshold()` | Set NMS IoU threshold (0.0 - 1.0). |
Quick example:
auto model = std::make_unique<rzv_model::YoloxModel>();
model->set_class_names({"hand"});
model->set_confidence_threshold(0.5f);
model->set_iou_threshold(0.4f);
model->load("path/to/yolox_model");
auto result = model->run<rzv_model::YOLOXDetectionResult>(input);
Model preparation: YOLOX | YOLOX - Convert for V2H
rzv_yolov8 – YOLOv8DetectModel
Header: rzv_yolov8/yolov8_detect_model.hpp | Inherits: YOLOv8Base -> BaseModel
Result type:
struct YOLOv8Detection
{
    cv::Rect bbox;
    int class_id;
    float confidence;
    bool is_valid = false;
    std::string class_name;
};

struct YOLOv8DetectionResult : public ModelResult
{
    std::vector<YOLOv8Detection> detections;
};
Configuration methods:
| Method | Description |
|---|---|
| `set_class_names()` | Set class labels (must match model training). |
| `set_confidence_threshold()` | Set detection confidence threshold (0.0 - 1.0). |
| `set_nms_threshold()` | Set NMS threshold (0.0 - 1.0). |
| `set_dfl_sigmoid_mode()` | Set DFL sigmoid optimization mode (see below). |
| `set_cpu_dfl_multi_thread()` | Enable/disable multi-threaded CPU DFL processing. |
DFL Sigmoid Modes (DFLSigmoidMode enum):
- `InDfl` – Apply sigmoid during DFL processing (original).
- `AfterArgmax` – Skip sigmoid in DFL, apply after argmax (faster).
- `AfterThreshold` – Skip sigmoid in DFL, apply after threshold filtering (fastest, default).
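The faster modes are safe because sigmoid is strictly increasing: the index of the largest logit is the index of the largest probability, so argmax can run on raw scores and sigmoid can be deferred. A small self-contained demonstration of that property:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Standard logistic sigmoid.
float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Index of the largest element.
size_t argmax(const std::vector<float> &v) {
    return static_cast<size_t>(
        std::distance(v.begin(), std::max_element(v.begin(), v.end())));
}
```

Because sigmoid preserves ordering, argmax(logits) == argmax(sigmoid(logits)); likewise a probability threshold t is equivalent to a raw-score threshold at the inverse sigmoid of t, which is what makes `AfterThreshold` the cheapest mode.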
Quick example:
auto model = std::make_unique<rzv_model::YOLOv8DetectModel>();
model->set_class_names({"paper", "rock", "scissor"});
model->set_confidence_threshold(0.5f);
model->set_nms_threshold(0.4f);
model->set_dfl_sigmoid_mode(rzv_model::DFLSigmoidMode::AfterThreshold);
model->set_cpu_dfl_multi_thread(false);
model->load("path/to/yolov8_model");
auto result = model->run<rzv_model::YOLOv8DetectionResult>(input);
Model preparation: Ultralytics YOLO | YOLOv8 - Convert for V2H
rzv_yolov8 – YOLOv8OBBModel
Header: rzv_yolov8/yolov8_obb_model.hpp | Inherits: YOLOv8Base -> BaseModel
For oriented bounding box detection (e.g., aerial/satellite imagery).
Result type:
struct YOLOv8OBBDetection
{
    cv::RotatedRect obbox; // Oriented bounding box
    int class_id;
    float confidence;
    bool is_valid = false;
    std::string class_name;
};

struct YOLOv8OBBDetectionResult : public ModelResult
{
    std::vector<YOLOv8OBBDetection> detections;
};
Configuration methods: Same as YOLOv8DetectModel (inherits from YOLOv8Base).
Quick example:
auto model = std::make_unique<rzv_model::YOLOv8OBBModel>();
model->set_class_names({"ship", "plane", "vehicle"});
model->set_confidence_threshold(0.6f);
model->set_nms_threshold(0.5f);
model->load("path/to/yolov8_obb_model");
auto result = model->run<rzv_model::YOLOv8OBBDetectionResult>(input);
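To draw or post-process an oriented box you typically need its four corners, which `cv::RotatedRect::points()` provides. The plain-struct sketch below shows the underlying math (center plus rotated half-extents) without an OpenCV dependency; the corner ordering here is illustrative and may differ from OpenCV's convention.

```cpp
#include <cassert>
#include <cmath>

// Minimal 2D point, standing in for cv::Point2f in this sketch.
struct Pt { float x, y; };

// Compute the four corners of an oriented box from center, size, and
// rotation angle in degrees (mirroring what cv::RotatedRect stores).
void obb_corners(float cx, float cy, float w, float h, float angle_deg,
                 Pt out[4]) {
    float a = angle_deg * 3.14159265f / 180.0f;
    float c = std::cos(a), s = std::sin(a);
    float hw = w / 2.0f, hh = h / 2.0f;
    // Half-extent offsets of the four corners before rotation.
    float dx[4] = {-hw, hw, hw, -hw};
    float dy[4] = {-hh, -hh, hh, hh};
    for (int i = 0; i < 4; ++i) {
        out[i].x = cx + dx[i] * c - dy[i] * s;
        out[i].y = cy + dx[i] * s + dy[i] * c;
    }
}
```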
rzv_gold_yolo – GoldYoloModel
Header: rzv_gold_yolo/gold_yolo_model.hpp | Inherits: BaseModel
Result type:
struct GOLDYOLODetection
{
    cv::Rect bbox;
    int class_id;
    float confidence;
    bool is_valid = false;
    std::string class_name;
};

struct GOLDYOLODetectionResult : public ModelResult
{
    std::vector<GOLDYOLODetection> detections;
};
Configuration methods: Same as YoloxModel (set_class_names, set_confidence_threshold, set_iou_threshold).
Quick example:
auto model = std::make_unique<rzv_model::GoldYoloModel>();
model->set_class_names({"hand"});
model->set_confidence_threshold(0.5f);
model->set_iou_threshold(0.4f);
model->load("path/to/gold_yolo_model");
auto result = model->run<rzv_model::GOLDYOLODetectionResult>(input);
Pose Estimation Models
The following models are provided for pose estimation tasks. Each model class inherits from BaseModel and implements the required methods for loading, preprocessing, inference, and postprocessing.
rzv_hrnetv2 – HRNetV2Model
Header: rzv_hrnetv2/hrnetv2_model.hpp | Inherits: BaseModel
Returns KeyPointResult. No additional configuration methods beyond BaseModel.
Quick example:
auto model = std::make_unique<rzv_model::HRNetV2Model>();
model->load("path/to/hrnetv2_model");
auto result = model->run<rzv_model::KeyPointResult>(input);
for (const auto & kp : result->keypoints) {
    std::cout << "x=" << kp.x << " y=" << kp.y
              << " conf=" << kp.confidence << std::endl;
}
Model preparation: MMPose | Convert for V2H
rzv_rtmpose – RTMPoseModel
Header: rzv_rtmpose/rtmpose_model.hpp | Inherits: BaseModel
Returns KeyPointResult. No additional configuration methods beyond BaseModel.
Quick example:
auto model = std::make_unique<rzv_model::RTMPoseModel>();
model->load("path/to/rtmpose_model");
auto result = model->run<rzv_model::KeyPointResult>(input);
Model preparation: Same as HRNetV2 (MMPose).
rzv_mediapipe – MediaPipeHandLandmarkModel
Header: rzv_mediapipe/mediapipe_hand_landmark_model.hpp | Inherits: BaseModel
Returns HandLandmarkResult which extends KeyPointResult with handedness classification.
Result type:
struct HandLandmarkResult : public KeyPointResult
{
    float handedness; // 0.0 = left hand, 1.0 = right hand
};
Quick example:
auto model = std::make_unique<rzv_model::MediaPipeHandLandmarkModel>();
model->load("path/to/mediapipe_hand_landmark_model");
auto result = model->run<rzv_model::HandLandmarkResult>(input);
std::cout << "Hand: " << (result->handedness > 0.5 ? "Right" : "Left") << std::endl;
Model preparation: MediaPipe
ROS 2 Utilities (rzv_model_utils_ros2)
Header: rzv_model_utils_ros2/model_utils.hpp
Provides helper functions for integrating AI models into ROS 2 nodes.
ModelConfig
struct ModelConfig
{
    std::string model_path;
    std::vector<std::string> class_names;
};
UtilsROS
| Method | Description |
|---|---|
| | Load model configuration from YAML with optional parameter overrides. |
| | Convert detection bounding boxes to ROS 2 messages. |
| | Encode inference timing into a diagnostic message. |
YAML configuration format (config/models/models_config.yaml):
models:
  my_model:
    path: "models/my_model_name"
    names:
      0: class_a
      1: class_b
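The keys of the `names` map are class indices, so `ModelConfig::class_names` presumably ends up ordered by index regardless of the order entries appear in the YAML. A sketch of that assumed conversion, using `std::map`'s sorted iteration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed behavior, for illustration: turn the YAML `names` mapping
// (class index -> label) into an index-ordered label vector. std::map
// iterates its keys in ascending order, which gives the ordering for free.
std::vector<std::string> names_to_vector(const std::map<int, std::string> &names) {
    std::vector<std::string> out;
    for (const auto &kv : names)
        out.push_back(kv.second);
    return out;
}
```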