NvInfer Unit Overview and Common Configuration
==============================================

In Savant, DeepStream NvInfer is used to run inference. The framework supports four types of inference units:

- :doc:`30_dm` - detector unit (typically used for models producing bounding boxes, classes and confidence scores);
- :doc:`43_am` - attribute model unit (typically used for models producing attributes like gender, age, etc.);
- :doc:`40_cm` - classifier unit (typically used for models producing classes and confidence scores, *this is an alias for the attribute model unit*);
- :doc:`53_complexm` - complex model unit (typically used for models producing bounding boxes, classes and confidence scores, and attributes like keypoints, etc.).

Each of these units is configured slightly differently. In this section, we describe common principles for all units.

About DeepStream NvInfer
------------------------

NvInfer is a GStreamer plugin for running inference with TensorRT-optimized neural networks. In DeepStream, it is configured with a configuration file. Savant provides two ways to configure the NvInfer unit:

- Savant-native YAML-based configuration;
- a DeepStream configuration file (avoid it if the required configuration parameters are supported by the Savant YAML configuration).

In the end, both of them lead to the same result: a generated configuration file in the model cache directory.

All NvInfer variants require the model to be specified. Savant supports local model files and remote model files. Local files are convenient when the model is built into the Docker image. Remote files are convenient when the model is downloaded from a remote location such as AWS S3.
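
As a rough illustration of the local variant, the fragment below sketches a detector that reads its model from a directory already present in the image. It is an incomplete fragment, not a full unit definition: the ``local_path`` key and the ``/opt/models/yolov8n`` directory are assumptions made for this example, so check them against :doc:`27_working_with_models` before use. The remote variant is shown in the examples later in this section.

.. code-block:: yaml

    - element: nvinfer@detector
      name: yolov8n
      model:
        # Assumption: ``local_path`` names a directory that already holds
        # the model files (for example, copied in at image build time).
        # The path itself is hypothetical.
        local_path: /opt/models/yolov8n
        format: onnx
        model_file: yolov8n.onnx
        # input/output sections omitted for brevity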

Regardless of the model source, NvInfer generates a TensorRT engine file for the model, optimized for the **current** hardware. If an engine file already exists and is compatible with the **current** configuration (hardware, batch size, precision, etc.), it is reused instead of being regenerated. Otherwise, NvInfer builds a new engine file.
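
The fragment below sketches which model settings would typically make an existing engine incompatible when changed. ``batch_size`` appears in the examples in this section. The ``precision`` key and its ``fp16`` value are assumptions made for illustration and should be verified against the unit reference.

.. code-block:: yaml

    model:
      format: onnx
      model_file: yolov8n.onnx
      # Changing either of the values below is expected to trigger
      # an engine rebuild on the next start.
      batch_size: 16
      # Assumption: the precision key and its allowed values
      # are not confirmed by this section.
      precision: fp16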

Read more about working with models in :doc:`27_working_with_models`.

Examples of NvInfer unit configuration
--------------------------------------

Detector Unit
~~~~~~~~~~~~~

.. code-block:: yaml

    - element: nvinfer@detector
      name: yolov8n
      model:
        remote:
          url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.zip
          checksum_url: s3://savant-data/models/yolov8n/yolov8n_000bcd6.md5
          parameters:
            endpoint: https://eu-central-1.linodeobjects.com
        format: onnx
        model_file: yolov8n.onnx
        batch_size: ${parameters.batch_size}
        input:
          shape: [3, 640, 640]
          maintain_aspect_ratio: true
          scale_factor: 0.0039215697906911373
        output:
          layer_names: [boxes, scores, classes]
          converter:
            module: savant.converter.yolo
            class_name: TensorToBBoxConverter
            kwargs:
              confidence_threshold: 0.2
              top_k: 1000
          objects:
            - class_id: ${parameters.detected_object.id}
              label: ${parameters.detected_object.label}
              selector:
                kwargs:
                  confidence_threshold: 0.6
                  nms_iou_threshold: 0.5
                  min_width: 30
                  min_height: 40


Attribute Model Unit
~~~~~~~~~~~~~~~~~~~~

.. code-block:: yaml

    - element: nvinfer@attribute_model
      name: age_gender
      model:
        remote:
          url: s3://savant-data/models/age_gender/age_gender.zip
          checksum_url: s3://savant-data/models/age_gender/age_gender.md5
          parameters:
            endpoint: https://eu-central-1.linodeobjects.com
        format: onnx
        config_file: age_gender_mobilenet_v2_dynBatch_config.txt
        batch_size: 16
        input:
          object: ${parameters.detection_model_name}.face
          preprocess_object_image:
            module: savant.input_preproc.align_face
            class_name: AlignFacePreprocessingObjectImageGPU
        output:
          layer_names: [ 'age', 'gender' ]
          converter:
            module: samples.age_gender_recognition.age_gender_converter
            class_name: AgeGenderConverter
          attributes:
            - name: age
            - name: gender


Classifier Unit
~~~~~~~~~~~~~~~

.. note::
    Classifier unit is an alias for Attribute Model Unit.
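
Since the classifier unit is an alias, a classifier declaration should look like the attribute model example above with only the element name changed. The sketch below assumes the alias is spelled ``nvinfer@classifier``. Everything else is copied from the attribute model example and is illustrative only.

.. code-block:: yaml

    # Assumption: ``nvinfer@classifier`` is the alias name for the attribute model unit.
    - element: nvinfer@classifier
      name: age_gender
      model:
        remote:
          url: s3://savant-data/models/age_gender/age_gender.zip
          checksum_url: s3://savant-data/models/age_gender/age_gender.md5
          parameters:
            endpoint: https://eu-central-1.linodeobjects.com
        format: onnx
        config_file: age_gender_mobilenet_v2_dynBatch_config.txt
        batch_size: 16
        input:
          # Run only on ``face`` objects produced by the detector
          object: ${parameters.detection_model_name}.face
        output:
          layer_names: [ 'age', 'gender' ]
          converter:
            module: samples.age_gender_recognition.age_gender_converter
            class_name: AgeGenderConverter
          attributes:
            - name: age
            - name: gender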


Complex Model Unit
~~~~~~~~~~~~~~~~~~

.. code-block:: yaml

    - element: nvinfer@complex_model
      name: yolov8nface
      model:
        remote:
          url: s3://savant-data/models/yolov8face/yolov8nface.zip
          checksum_url: s3://savant-data/models/yolov8face/yolov8nface.md5
          parameters:
            endpoint: https://eu-central-1.linodeobjects.com
        format: onnx
        config_file: yolov8n-face.txt
        batch_size: 16
        input:
          shape:
            - 3
            - ${parameters.detector_h}
            - ${parameters.detector_w}
        output:
          layer_names: [ 'output0' ]
          converter:
            module: savant.converter.yolo_v8face
            class_name: YoloV8faceConverter
            kwargs:
              confidence_threshold: 0.6
              nms_iou_threshold: 0.5
          objects:
            - class_id: 0
              label: face
              selector:
                module: savant.selector.detector
                class_name: MinMaxSizeBBoxSelector
                kwargs:
                  min_width: 40
                  min_height: 40
          attributes:
            - name: landmarks