NvInfer Unit Overview and Common Configuration
In Savant, DeepStream NvInfer is used to run inference. The framework supports four types of inference units:
Detection Unit - the detector unit, typically used for models that produce bounding boxes, classes, and confidence scores;
Attribute Model Unit - the attribute model unit, typically used for models that produce attributes such as gender or age;
Classification Unit - the classifier unit, typically used for models that produce classes and confidence scores (an alias for the attribute model unit);
Complex Model Unit - the complex model unit, typically used for models that produce bounding boxes, classes, and confidence scores together with attributes such as keypoints.
Each of these units is configured slightly differently. In this section, we describe common principles for all units.
About DeepStream NvInfer
NvInfer is a GStreamer plugin for running inference with TensorRT-optimized neural networks. In DeepStream, it is configured with a configuration file. Savant provides two ways to configure the NvInfer unit:
Savant-native YAML-based configuration;
a DeepStream configuration file (avoid it if the required configuration parameters are supported by the Savant YAML configuration).
In the end, both of them lead to the same result: a generated configuration file in the model cache directory.
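To make the idea of a generated configuration file concrete, here is a minimal sketch that renders a few YAML-style model parameters into a DeepStream-style `[property]` section. The key names (`onnx-file`, `batch-size`, `infer-dims`, `net-scale-factor`, `maintain-aspect-ratio`) follow the DeepStream nvinfer configuration format; the function `render_nvinfer_config` and the mapping itself are assumptions for illustration and are not Savant's actual generator.

```python
import configparser
import io


def render_nvinfer_config(model: dict) -> str:
    """Render a minimal [property] section from a YAML-like model dict.

    Hypothetical helper: Savant's real generator handles many more options.
    """
    model_input = model["input"]
    props = {
        "onnx-file": model["model_file"],
        "batch-size": str(model["batch_size"]),
        # DeepStream expects dimensions separated by semicolons.
        "infer-dims": ";".join(str(dim) for dim in model_input["shape"]),
        "net-scale-factor": repr(model_input["scale_factor"]),
        "maintain-aspect-ratio": "1" if model_input["maintain_aspect_ratio"] else "0",
    }
    parser = configparser.ConfigParser()
    parser["property"] = props
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Values borrowed from the detector example; batch_size is fixed here
# instead of using the ${parameters.batch_size} placeholder.
example_model = {
    "model_file": "yolov8n.onnx",
    "batch_size": 16,
    "input": {
        "shape": [3, 640, 640],
        "maintain_aspect_ratio": True,
        "scale_factor": 0.0039215697906911373,
    },
}
config_text = render_nvinfer_config(example_model)
print(config_text)
```

Whichever configuration style is used, the result is a plain-text file of this kind that NvInfer reads at startup.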
All NvInfer variants require a model to be specified. Savant supports both local and remote model files. Local files are convenient when the model is baked into the Docker image. Remote files are convenient when the model is downloaded from a remote location such as AWS S3.
Regardless of the model source, NvInfer generates a TensorRT engine file for the model, optimized for the current hardware. If an engine file already exists and is compatible with the current configuration (hardware, batch size, precision, etc.), it is reused instead of being generated again. If the existing engine file is not compatible, NvInfer regenerates it.
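The reuse rule can be pictured as a comparison between the parameters a cached engine was built with and the parameters currently requested. The sketch below is only an illustration of that rule: `EngineSpec`, its fields, and `can_reuse_engine` are hypothetical names, and the actual compatibility check is performed by NvInfer and TensorRT, not by code like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EngineSpec:
    """Parameters an engine was built for (hypothetical, simplified)."""

    gpu_name: str
    batch_size: int
    precision: str  # e.g. "fp16" or "fp32"


def can_reuse_engine(cached: Optional[EngineSpec], requested: EngineSpec) -> bool:
    """Reuse the cached engine only if every build parameter matches."""
    return cached is not None and cached == requested


requested = EngineSpec(gpu_name="gpu-a", batch_size=16, precision="fp16")

same = EngineSpec(gpu_name="gpu-a", batch_size=16, precision="fp16")
other_batch = EngineSpec(gpu_name="gpu-a", batch_size=1, precision="fp16")

print(can_reuse_engine(same, requested))         # engine is reused
print(can_reuse_engine(other_batch, requested))  # engine must be rebuilt
print(can_reuse_engine(None, requested))         # no engine yet, build one
```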
Read more about working with models in Working With Models.
Examples of NvInfer unit configuration
Detector Unit
- element: nvinfer@detector
  name: yolov8n
  model:
    remote:
      url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.zip
      checksum_url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    model_file: yolov8n.onnx
    batch_size: ${parameters.batch_size}
    input:
      shape: [3, 640, 640]
      maintain_aspect_ratio: true
      scale_factor: 0.0039215697906911373
    output:
      layer_names: [boxes, scores, classes]
      converter:
        module: savant.converter.yolo
        class_name: TensorToBBoxConverter
        kwargs:
          confidence_threshold: 0.2
          top_k: 1000
      objects:
        - class_id: ${parameters.detected_object.id}
          label: ${parameters.detected_object.label}
          selector:
            kwargs:
              confidence_threshold: 0.6
              nms_iou_threshold: 0.5
              min_width: 30
              min_height: 40
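The `scale_factor` value in the detector example is approximately 1/255. Assuming it plays the role of DeepStream's `net-scale-factor`, which scales each input pixel as `scale * (pixel - offset)`, this maps 8-bit pixel values from the range 0..255 to roughly 0..1. The short sketch below checks that arithmetic; `normalize` and its `offset` parameter are hypothetical names used only for illustration.

```python
SCALE_FACTOR = 0.0039215697906911373  # value from the detector example


def normalize(pixel: int, scale_factor: float = SCALE_FACTOR, offset: float = 0.0) -> float:
    """Scale a single pixel value the way a net scale factor would."""
    return scale_factor * (pixel - offset)


# The configured value differs from 1/255 only by rounding.
print(abs(SCALE_FACTOR - 1 / 255) < 1e-8)

# Black stays at 0.0; white lands very close to 1.0.
print(normalize(0))
print(normalize(255))
```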
Attribute Model Unit
- element: nvinfer@attribute_model
  name: age_gender
  model:
    remote:
      url: s3://savant-data/models/age_gender/age_gender.zip
      checksum_url: s3://savant-data/models/age_gender/age_gender.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    config_file: age_gender_mobilenet_v2_dynBatch_config.txt
    batch_size: 16
    input:
      object: ${parameters.detection_model_name}.face
      preprocess_object_image:
        module: savant.input_preproc.align_face
        class_name: AlignFacePreprocessingObjectImageGPU
    output:
      layer_names: [ 'age', 'gender' ]
      converter:
        module: samples.age_gender_recognition.age_gender_converter
        class_name: AgeGenderConverter
      attributes:
        - name: age
        - name: gender
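In the attribute model example, the converter turns the raw `age` and `gender` output layers into the named attributes listed under `attributes`. The sketch below shows the general shape of such a conversion. It is not the actual `AgeGenderConverter`: the output layouts (a single age value, a two-element gender score vector), the label order, and the `(name, value, confidence)` tuple format are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# (attribute name, value, confidence); hypothetical result format.
Attribute = Tuple[str, object, Optional[float]]

GENDER_LABELS = ("male", "female")  # hypothetical label order


def convert_age_gender(
    age_output: Sequence[float], gender_output: Sequence[float]
) -> List[Attribute]:
    """Map raw layer outputs to named attributes (illustrative only)."""
    # Assumption: the age layer holds one regression value.
    age_value = round(float(age_output[0]))
    # Assumption: the gender layer holds one score per label.
    best = max(range(len(gender_output)), key=lambda i: gender_output[i])
    return [
        ("age", age_value, None),
        ("gender", GENDER_LABELS[best], float(gender_output[best])),
    ]


attributes = convert_age_gender([31.4], [0.1, 0.9])
print(attributes)
```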
Classifier Unit
Note
The classifier unit is an alias for the attribute model unit and is configured in the same way.
Complex Model Unit
- element: nvinfer@complex_model
  name: yolov8nface
  model:
    remote:
      url: s3://savant-data/models/yolov8face/yolov8nface.zip
      checksum_url: s3://savant-data/models/yolov8face/yolov8nface.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    config_file: yolov8n-face.txt
    batch_size: 16
    input:
      shape:
        - 3
        - ${parameters.detector_h}
        - ${parameters.detector_w}
    output:
      layer_names: [ 'output0' ]
      converter:
        module: savant.converter.yolo_v8face
        class_name: YoloV8faceConverter
        kwargs:
          confidence_threshold: 0.6
          nms_iou_threshold: 0.5
      objects:
        - class_id: 0
          label: face
          selector:
            module: savant.selector.detector
            class_name: MinMaxSizeBBoxSelector
            kwargs:
              min_width: 40
              min_height: 40
      attributes:
        - name: landmarks
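The selector in the complex model example discards detected faces that are smaller than `min_width` by `min_height`. The following sketch illustrates that kind of size filter. It is not the implementation of `MinMaxSizeBBoxSelector`: the `(left, top, width, height)` box layout, the function name, and the inclusive comparison are assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (left, top, width, height) in pixels.
BBox = Tuple[float, float, float, float]


def select_by_min_size(
    bboxes: List[BBox], min_width: float = 40, min_height: float = 40
) -> List[BBox]:
    """Keep only boxes at least min_width wide and min_height tall."""
    return [
        box
        for box in bboxes
        if box[2] >= min_width and box[3] >= min_height
    ]


candidates = [
    (10.0, 10.0, 64.0, 80.0),  # large enough, kept
    (50.0, 20.0, 30.0, 90.0),  # too narrow, dropped
    (70.0, 15.0, 45.0, 20.0),  # too short, dropped
]
selected = select_by_min_size(candidates)
print(selected)
```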