NvInfer Unit Overview and Common Configuration

In Savant, DeepStream NvInfer is used to run inference. The framework supports four types of inference units:

Detection Unit - a detector unit, typically used for models that produce bounding boxes, classes, and confidence scores;
Attribute Model Unit - an attribute model unit, typically used for models that produce attributes such as gender or age;
Classification Unit - a classifier unit, typically used for models that produce classes and confidence scores; it is an alias for the attribute model unit;
Complex Model Unit - a complex model unit, typically used for models that produce bounding boxes, classes, and confidence scores together with attributes such as keypoints.

Each of these units is configured slightly differently. In this section, we describe common principles for all units.

About DeepStream NvInfer

NvInfer is a GStreamer plugin for running inference with TensorRT-optimized neural networks. In DeepStream, it is configured with a configuration file. In Savant, we provide two ways to configure the NvInfer unit:

Savant-native YAML-based configuration;
a DeepStream configuration file (avoid it if the required configuration parameters are supported by the Savant YAML configuration).

In the end, both of them lead to the same result: a generated configuration file in the model cache directory.

All NvInfer variants require a model to be specified. Savant supports both local and remote model files. Local files are convenient when they are baked into the Docker image; remote files are convenient when they are downloaded from a remote location such as AWS S3.
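
As a minimal sketch of the local case, the fragment below points a unit at model files that are already present on disk. It assumes that the model section accepts a `local_path` key in place of `remote`; the unit name, directory, and file name are hypothetical.

```yaml
# Hypothetical detector unit that loads its model from a local directory.
- element: nvinfer@detector
  name: my_detector                        # hypothetical unit name
  model:
    # Assumption: local_path is used instead of the remote section when
    # the files are baked into the image or mounted as a volume.
    local_path: /opt/models/my_detector    # hypothetical directory
    format: onnx
    model_file: my_detector.onnx           # hypothetical file name
    batch_size: 1
    # input and output sections are omitted for brevity
```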

Regardless of the model source, NvInfer generates a TensorRT engine file for the model, optimized for the current hardware. If an engine file already exists and is compatible with the current configuration (hardware, batch size, precision, etc.), it is reused instead of generating a new one. If the existing engine file is not compatible, NvInfer regenerates it.
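
To make the compatibility check more concrete, the fragment below marks the model settings that the paragraph above names as affecting the engine. It is a sketch only: it assumes that the model section accepts a `precision` key with a value such as `fp16`, and the file name is hypothetical.

```yaml
model:
  format: onnx
  model_file: my_model.onnx   # hypothetical file name
  # Changing either value below means that a previously built engine
  # no longer matches the configuration and has to be regenerated.
  batch_size: 16
  precision: fp16             # assumed key and value
```

The same regeneration is expected when the pipeline moves to different hardware, even if the configuration itself is unchanged.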

Read more about working with models in Working With Models.

Examples of NvInfer unit configuration

Detector Unit

- element: nvinfer@detector
  name: yolov8n
  model:
    remote:
      url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.zip
      checksum_url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    model_file: yolov8n.onnx
    batch_size: ${parameters.batch_size}
    input:
      shape: [3, 640, 640]
      maintain_aspect_ratio: true
      scale_factor: 0.0039215697906911373
    output:
      layer_names: [boxes, scores, classes]
      converter:
        module: savant.converter.yolo
        class_name: TensorToBBoxConverter
        kwargs:
          confidence_threshold: 0.2
          top_k: 1000
      objects:
        - class_id: ${parameters.detected_object.id}
          label: ${parameters.detected_object.label}
          selector:
            kwargs:
              confidence_threshold: 0.6
              nms_iou_threshold: 0.5
              min_width: 30
              min_height: 40

Attribute Model Unit

- element: nvinfer@attribute_model
  name: age_gender
  model:
    remote:
      url: s3://savant-data/models/age_gender/age_gender.zip
      checksum_url: s3://savant-data/models/age_gender/age_gender.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    config_file: age_gender_mobilenet_v2_dynBatch_config.txt
    batch_size: 16
    input:
      object: ${parameters.detection_model_name}.face
      preprocess_object_image:
        module: savant.input_preproc.align_face
        class_name: AlignFacePreprocessingObjectImageGPU
    output:
      layer_names: [ 'age', 'gender' ]
      converter:
        module: samples.age_gender_recognition.age_gender_converter
        class_name: AgeGenderConverter
      attributes:
        - name: age
        - name: gender

Classifier Unit

Note

The classifier unit is an alias for the attribute model unit.
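
For illustration, a classifier unit can be sketched with the same layout as the attribute model example above. The element name `nvinfer@classifier` is assumed from the naming pattern of the other units, and the unit name, model file, input object, layer name, and attribute name are all hypothetical.

```yaml
- element: nvinfer@classifier            # assumed element name
  name: vehicle_type                     # hypothetical unit name
  model:
    format: onnx
    model_file: vehicle_type.onnx        # hypothetical file name
    batch_size: 16
    input:
      # Classify objects produced by an upstream detector (hypothetical names).
      object: my_detector.car
    output:
      layer_names: [ 'predictions' ]     # hypothetical layer name
      attributes:
        - name: vehicle_type             # hypothetical attribute name
```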

Complex Model Unit

- element: nvinfer@complex_model
  name: yolov8nface
  model:
    remote:
      url: s3://savant-data/models/yolov8face/yolov8nface.zip
      checksum_url: s3://savant-data/models/yolov8face/yolov8nface.md5
      parameters:
        endpoint: https://eu-central-1.linodeobjects.com
    format: onnx
    config_file: yolov8n-face.txt
    batch_size: 16
    input:
      shape:
        - 3
        - ${parameters.detector_h}
        - ${parameters.detector_w}
    output:
      layer_names: [ 'output0' ]
      converter:
        module: savant.converter.yolo_v8face
        class_name: YoloV8faceConverter
        kwargs:
          confidence_threshold: 0.6
          nms_iou_threshold: 0.5
      objects:
        - class_id: 0
          label: face
          selector:
            module: savant.selector.detector
            class_name: MinMaxSizeBBoxSelector
            kwargs:
              min_width: 40
              min_height: 40
      attributes:
        - name: landmarks