Autonomy Software C++ 24.5.1
Welcome to the Autonomy Software repository of the Mars Rover Design Team (MRDT) at Missouri University of Science and Technology (Missouri S&T)! API reference contains the source code and other resources for the development of the autonomy software for our Mars rover. The Autonomy Software project aims to compete in the University Rover Challenge (URC) by demonstrating advanced autonomous capabilities and robust navigation algorithms.
yolomodel::pytorch::PyTorchInterpreter Class Reference

This class is designed to enable quick, easy, and robust inferencing of a .pt YOLO model.

#include <YOLOModel.hpp>


Public Types

enum class  HardwareDevices { eCPU , eCUDA }
 

Public Member Functions

 PyTorchInterpreter (std::string szModelPath, HardwareDevices eHardwareDevice=HardwareDevices::eCUDA)
 Construct a new PyTorchInterpreter object.
 
 ~PyTorchInterpreter ()
 Destroy the PyTorchInterpreter object.
 
std::vector< Detection > Inference (const cv::Mat &cvInputFrame, const float fMinObjectConfidence=0.85, const float fNMSThreshold=0.6)
 Given an input image, forward it through the YOLO PyTorch model, then parse and repackage the output tensor data into a vector of easy-to-use Detection structs.
 
bool IsReadyForInference () const
 Check if the model is ready for inference.
 

Private Member Functions

torch::Tensor PreprocessImage (const cv::Mat &cvInputFrame, const torch::Device &trDevice)
 Given an input image, preprocess the image to match the input tensor shape of the model, then return the preprocessed image as a tensor.
 
void ParseTensorOutputYOLOv5 (const torch::Tensor &trOutput, std::vector< int > &vClassIDs, std::vector< float > &vClassConfidences, std::vector< cv::Rect > &vBoundingBoxes, const cv::Size &cvInputFrameSize, const float fMinObjectConfidence)
 Given a tensor output from a YOLOv5 model, parse its output into something more usable.
 
void ParseTensorOutputYOLOv8 (const torch::Tensor &trOutput, std::vector< int > &vClassIDs, std::vector< float > &vClassConfidences, std::vector< cv::Rect > &vBoundingBoxes, const cv::Size &cvInputFrameSize, const float fMinObjectConfidence)
 Given a tensor output from a YOLOv8 model, parse its output into something more usable.
 

Private Attributes

torch::jit::script::Module m_trModel
 
torch::Device m_trDevice = torch::kCPU
 
std::string m_szModelPath
 
bool m_bReady
 
std::string m_szModelTask
 
cv::Size m_cvModelInputSize
 
std::vector< std::string > m_vClassLabels
 

Detailed Description

This class is designed to enable quick, easy, and robust inferencing of a .pt YOLO model.

Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-01-06

Member Enumeration Documentation

◆ HardwareDevices

enum class yolomodel::pytorch::PyTorchInterpreter::HardwareDevices
strong
{
    eCPU,    // The CPU device.
    eCUDA    // The CUDA device.
};

Constructor & Destructor Documentation

◆ PyTorchInterpreter()

yolomodel::pytorch::PyTorchInterpreter::PyTorchInterpreter ( std::string  szModelPath,
HardwareDevices  eHardwareDevice = HardwareDevices::eCUDA 
)
inline

Construct a new PyTorchInterpreter object.

Parameters
szModelPath- The path to the model to open and inference.
eHardwareDevice- The device to run the model on. Default is CUDA; the other option is CPU.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-01-06
{
    // Initialize member variables.
    m_szModelPath      = szModelPath;
    m_bReady           = false;
    m_cvModelInputSize = cv::Size(640, 640);
    m_szModelTask      = "Unknown";
    m_vClassLabels     = std::vector<std::string>();

    // Translate the hardware device enum to a torch device.
    switch (eHardwareDevice)
    {
        case HardwareDevices::eCPU: m_trDevice = torch::kCPU; break;
        case HardwareDevices::eCUDA: m_trDevice = torch::kCUDA; break;
        default: m_trDevice = torch::kCPU; break;
    }

    // Submit logger message.
    LOG_INFO(logging::g_qSharedLogger, "Attempting to load model {} onto device {}", szModelPath, m_trDevice.str());

    // Check if the model path is valid.
    if (!std::filesystem::exists(szModelPath))
    {
        // Submit logger message.
        LOG_ERROR(logging::g_qSharedLogger, "Model path {} does not exist!", szModelPath);
        return;
    }
    // Check if the device is available.
    if (!torch::cuda::is_available() && m_trDevice == torch::kCUDA)
    {
        // Fall back to the CPU and keep going, so the model still loads as the log message says.
        LOG_ERROR(logging::g_qSharedLogger, "CUDA device is not available, falling back to CPU.");
        m_trDevice = torch::kCPU;
    }
    else
    {
        // Submit logger message.
        LOG_INFO(logging::g_qSharedLogger, "Using device: {}", m_trDevice.str());
    }

    // Finally, attempt to load the model.
    try
    {
        // Load the model and set it to eval mode for inference.
        // The "config.txt" entry is filled in from the extra files embedded in the model archive.
        torch::jit::ExtraFilesMap trExtraConfigFiles{{"config.txt", ""}};
        m_trModel = torch::jit::load(szModelPath, m_trDevice, trExtraConfigFiles);
        m_trModel.eval();

        // Use nlohmann json to parse the config file.
        nlohmann::json jConfig = nlohmann::json::parse(trExtraConfigFiles.at("config.txt"));
        // Get the input image size for the model.
        m_cvModelInputSize = cv::Size(jConfig["imgsz"][0], jConfig["imgsz"][1]);
        m_szModelTask      = jConfig["task"];
        for (const auto& item : jConfig["names"].items())
        {
            m_vClassLabels.push_back(item.value());
        }
        // Submit the config json as a debug message.
        LOG_DEBUG(logging::g_qSharedLogger, "Model config: {}", jConfig.dump(4));

        // Check if the model is empty.
        if (m_trModel.get_methods().empty())
        {
            LOG_ERROR(logging::g_qSharedLogger, "Model is empty! Check if the correct model file was provided.");
            return;
        }
        // Check if the model did not move to the expected device.
        if (m_trModel.buffers().size() > 0)
        {
            // Get the device of the model from its first buffer.
            torch::Device model_device = m_trModel.buffers().begin().operator->().device();
            if (model_device != m_trDevice)
            {
                LOG_ERROR(logging::g_qSharedLogger, "Model did not move to the expected device! Model is on: {}", model_device.str());
                return;
            }
        }

        // Model is ready for inference.
        LOG_INFO(logging::g_qSharedLogger,
                 "Model successfully loaded and set to eval mode. The model is a {} model, and has {} classes.",
                 m_szModelTask,
                 m_vClassLabels.size());

        // Set flag saying we are ready for inference.
        m_bReady = true;
    }
    catch (const c10::Error& trError)
    {
        LOG_ERROR(logging::g_qSharedLogger, "Error loading model: {}", trError.what());
    }
}
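
The device translation and CUDA availability check in the constructor can be exercised without LibTorch. The following is a minimal sketch under stated assumptions, not the project's code: `Device`, `ResolveDevice`, `DeviceName`, and the `bCudaAvailable` flag are hypothetical stand-ins for `torch::Device` and `torch::cuda::is_available()`.

```cpp
#include <string>

// Hypothetical stand-ins for the class's HardwareDevices enum and torch::Device.
enum class HardwareDevices { eCPU, eCUDA };
enum class Device { kCPU, kCUDA };

// Map the requested device to one that is actually usable.
// bCudaAvailable stands in for the result of torch::cuda::is_available().
inline Device ResolveDevice(HardwareDevices eRequested, bool bCudaAvailable)
{
    if (eRequested == HardwareDevices::eCUDA && bCudaAvailable)
    {
        return Device::kCUDA;
    }
    // Either the CPU was requested, or CUDA was requested but is unavailable.
    return Device::kCPU;
}

// Human-readable device name, similar in spirit to torch::Device::str().
inline std::string DeviceName(Device eDevice)
{
    return eDevice == Device::kCUDA ? "cuda" : "cpu";
}
```

Keeping the decision in a pure function like this makes the fallback path easy to unit test on machines with no GPU.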

◆ ~PyTorchInterpreter()

yolomodel::pytorch::PyTorchInterpreter::~PyTorchInterpreter ( )
inline

Destroy the PyTorchInterpreter object.

Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-01-06
{
    // Nothing to destroy.
}

Member Function Documentation

◆ Inference()

std::vector< Detection > yolomodel::pytorch::PyTorchInterpreter::Inference ( const cv::Mat &  cvInputFrame,
const float  fMinObjectConfidence = 0.85,
const float  fNMSThreshold = 0.6 
)
inline

Given an input image, forward it through the YOLO PyTorch model, then parse and repackage the output tensor data into a vector of easy-to-use Detection structs.

Parameters
cvInputFrame- The RGB camera frame to run detection on.
fMinObjectConfidence- Minimum confidence required for an object to be considered a valid detection.
fNMSThreshold- Threshold for Non-Maximum Suppression, controlling overlap between bounding box predictions.
Returns
std::vector<Detection> - A vector of structs containing information about the valid object detections in the given image.
Note
The input image MUST be in RGB format, otherwise you will likely experience prediction accuracy problems.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-01-06
{
    // Force single-threaded execution (if acceptable for your workload).
    torch::set_num_threads(1);
    // Create the output vector.
    std::vector<Detection> vObjects;

    // Preprocess the given image and pack it into a tensor.
    torch::Tensor trTensorImage = PreprocessImage(cvInputFrame, m_trDevice);

    // Perform inference.
    std::vector<torch::jit::IValue> vInputs;
    vInputs.push_back(trTensorImage);
    torch::Tensor trOutputTensor;
    try
    {
        trOutputTensor = m_trModel.forward(vInputs).toTensor();
    }
    catch (const c10::Error& trError)
    {
        LOG_ERROR(logging::g_qSharedLogger, "Error running inference: {}", trError.what());
        return vObjects;
    }

    // Calculate the number of grid cells at each YOLO output scale (P3, P4, P5) from the model input size.
    // Each value is (image size / stride) squared, so a square model input is assumed.
    int nImgSize  = m_cvModelInputSize.height;
    int nP3Stride = std::pow((nImgSize / 8), 2);
    int nP4Stride = std::pow((nImgSize / 16), 2);
    int nP5Stride = std::pow((nImgSize / 32), 2);
    // Calculate the proper prediction length for different YOLO versions.
    int nYOLOv5AnchorsPerGridPoint = 3;
    int nYOLOv8AnchorsPerGridPoint = 1;
    int nYOLOv5TotalPredictionLength =
        (nP3Stride * nYOLOv5AnchorsPerGridPoint) + (nP4Stride * nYOLOv5AnchorsPerGridPoint) + (nP5Stride * nYOLOv5AnchorsPerGridPoint);
    int nYOLOv8TotalPredictionLength =
        (nP3Stride * nYOLOv8AnchorsPerGridPoint) + (nP4Stride * nYOLOv8AnchorsPerGridPoint) + (nP5Stride * nYOLOv8AnchorsPerGridPoint);

    // Parse the output tensor.
    std::vector<int> vClassIDs;
    std::vector<std::string> vClassLabels;
    std::vector<float> vClassConfidences;
    std::vector<cv::Rect> vBoundingBoxes;

    // Get the largest dimension of our output tensor.
    int nLargestDimension = *std::max_element(trOutputTensor.sizes().begin(), trOutputTensor.sizes().end());
    // Check if the output tensor is YOLOv5 format.
    if (nLargestDimension == nYOLOv5TotalPredictionLength)
    {
        // Parse inferenced output from tensor.
        this->ParseTensorOutputYOLOv5(trOutputTensor, vClassIDs, vClassConfidences, vBoundingBoxes, cvInputFrame.size(), fMinObjectConfidence);
    }
    // Check if the output tensor is YOLOv8 format.
    else if (nLargestDimension == nYOLOv8TotalPredictionLength)
    {
        // Parse inferenced output from tensor.
        this->ParseTensorOutputYOLOv8(trOutputTensor, vClassIDs, vClassConfidences, vBoundingBoxes, cvInputFrame.size(), fMinObjectConfidence);
    }

    // Perform NMS to filter out bad/duplicate detections.
    NonMaxSuppression(vObjects, vClassIDs, vClassConfidences, vBoundingBoxes, fMinObjectConfidence, fNMSThreshold);

    // Loop through the final detections and set the class names for each detection based on the class ID.
    for (size_t nIter = 0; nIter < vObjects.size(); ++nIter)
    {
        // Check if the class ID is valid.
        if (vClassIDs[nIter] >= 0 && vClassIDs[nIter] < static_cast<int>(m_vClassLabels.size()))
        {
            vObjects[nIter].szClassName = m_vClassLabels[vClassIDs[nIter]];
        }
        else
        {
            vObjects[nIter].szClassName = "UnknownClass";
        }
    }

    return vObjects;
}
References NonMaxSuppression() (YOLOModel.hpp:69), ParseTensorOutputYOLOv5() (YOLOModel.hpp:442), ParseTensorOutputYOLOv8() (YOLOModel.hpp:550), and PreprocessImage() (YOLOModel.hpp:413).
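
The format check in Inference() relies on simple grid arithmetic: for a 640 input, the P3, P4, and P5 grids hold 6400 + 1600 + 400 = 8400 cells, giving 8400 predictions with one anchor per cell (YOLOv8) and 25200 with three (YOLOv5). A minimal standalone sketch of that arithmetic follows; `PredictionLength`, `YOLOFormat`, and `DetectFormat` are hypothetical names introduced here for illustration, and a square input is assumed.

```cpp
#include <algorithm>
#include <vector>

// Number of predictions a YOLO head emits for a square input of nImgSize pixels.
// Grid sizes come from the P3/P4/P5 output strides of 8, 16, and 32.
constexpr int PredictionLength(int nImgSize, int nAnchorsPerGridPoint)
{
    const int nP3Cells = (nImgSize / 8) * (nImgSize / 8);
    const int nP4Cells = (nImgSize / 16) * (nImgSize / 16);
    const int nP5Cells = (nImgSize / 32) * (nImgSize / 32);
    return (nP3Cells + nP4Cells + nP5Cells) * nAnchorsPerGridPoint;
}

enum class YOLOFormat { eYOLOv5, eYOLOv8, eUnknown };

// Guess the output format from the largest dimension of the output shape.
inline YOLOFormat DetectFormat(const std::vector<long long>& vShape, int nImgSize)
{
    if (vShape.empty())
    {
        return YOLOFormat::eUnknown;
    }
    const long long nLargest = *std::max_element(vShape.begin(), vShape.end());
    if (nLargest == PredictionLength(nImgSize, 3))
    {
        return YOLOFormat::eYOLOv5;    // 3 anchors per grid point.
    }
    if (nLargest == PredictionLength(nImgSize, 1))
    {
        return YOLOFormat::eYOLOv8;    // 1 anchor per grid point.
    }
    return YOLOFormat::eUnknown;
}
```

Returning an explicit unknown value gives the caller a chance to log a shape mismatch rather than silently producing no detections.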
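
NonMaxSuppression() is defined elsewhere in YOLOModel.hpp and its body is not shown on this page. For readers unfamiliar with the technique, the following is a generic greedy, IoU-based suppression sketch. It is not the project's implementation: `Box`, `IoU`, and `GreedyNMS` are hypothetical names, and the real function may differ (for example, by suppressing per class).

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Axis-aligned box given by its top-left corner and size (hypothetical type).
struct Box
{
    float fLeft, fTop, fWidth, fHeight;
};

// Intersection-over-union of two boxes, in the range [0, 1].
inline float IoU(const Box& stA, const Box& stB)
{
    const float fX1    = std::max(stA.fLeft, stB.fLeft);
    const float fY1    = std::max(stA.fTop, stB.fTop);
    const float fX2    = std::min(stA.fLeft + stA.fWidth, stB.fLeft + stB.fWidth);
    const float fY2    = std::min(stA.fTop + stA.fHeight, stB.fTop + stB.fHeight);
    const float fInter = std::max(0.0f, fX2 - fX1) * std::max(0.0f, fY2 - fY1);
    const float fUnion = stA.fWidth * stA.fHeight + stB.fWidth * stB.fHeight - fInter;
    return fUnion > 0.0f ? fInter / fUnion : 0.0f;
}

// Returns the indices of the boxes that survive suppression, highest score first.
inline std::vector<int> GreedyNMS(const std::vector<Box>& vBoxes,
                                  const std::vector<float>& vScores,
                                  float fMinConfidence,
                                  float fNMSThreshold)
{
    // Visit candidates in order of descending score.
    std::vector<int> vOrder(vBoxes.size());
    std::iota(vOrder.begin(), vOrder.end(), 0);
    std::sort(vOrder.begin(), vOrder.end(), [&vScores](int nA, int nB) { return vScores[nA] > vScores[nB]; });

    std::vector<int> vKept;
    for (int nCandidate : vOrder)
    {
        // Drop low-confidence candidates outright.
        if (vScores[nCandidate] < fMinConfidence)
        {
            continue;
        }
        // Drop the candidate if it overlaps an already-kept box too much.
        bool bSuppressed = false;
        for (int nKept : vKept)
        {
            if (IoU(vBoxes[nCandidate], vBoxes[nKept]) > fNMSThreshold)
            {
                bSuppressed = true;
                break;
            }
        }
        if (!bSuppressed)
        {
            vKept.push_back(nCandidate);
        }
    }
    return vKept;
}
```

A lower fNMSThreshold suppresses more aggressively, since smaller overlaps are then enough to discard a candidate.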

◆ IsReadyForInference()

bool yolomodel::pytorch::PyTorchInterpreter::IsReadyForInference ( ) const
inline

Check if the model is ready for inference.

Returns
true - Model is ready for inference.
false - Model is not ready for inference.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-02-13
{ return m_bReady; }

◆ PreprocessImage()

torch::Tensor yolomodel::pytorch::PyTorchInterpreter::PreprocessImage ( const cv::Mat &  cvInputFrame,
const torch::Device &  trDevice 
)
inlineprivate

Given an input image, preprocess the image to match the input tensor shape of the model, then return the preprocessed image as a tensor.

Parameters
cvInputFrame- The input image to preprocess.
trDevice- The device to move the preprocessed tensor to, which should match the device the model runs on.
Returns
torch::Tensor - The preprocessed image as a tensor.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-03-08
{
    // Resize the input image to match the model input size and normalize it to 0-1.
    cv::Mat cvResizedImage;
    cv::resize(cvInputFrame, cvResizedImage, cv::Size(m_cvModelInputSize.width, m_cvModelInputSize.height), cv::INTER_LINEAR);
    cvResizedImage.convertTo(cvResizedImage, CV_32FC3, 1.0 / 255.0);

    // Convert OpenCV mat to a tensor. from_blob() does not take ownership of the pixel buffer,
    // so clone it before cvResizedImage goes out of scope (a CPU-to-CPU .to() would not copy).
    torch::Tensor trTensorImage =
        torch::from_blob(cvResizedImage.data, {1, cvResizedImage.rows, cvResizedImage.cols, 3}, torch::kFloat).clone();
    trTensorImage = trTensorImage.permute({0, 3, 1, 2});    // Convert from NHWC to NCHW layout.
    trTensorImage = trTensorImage.to(trDevice);             // Move tensor to the specified hardware device.

    return trTensorImage;
}

◆ ParseTensorOutputYOLOv5()

void yolomodel::pytorch::PyTorchInterpreter::ParseTensorOutputYOLOv5 ( const torch::Tensor &  trOutput,
std::vector< int > &  vClassIDs,
std::vector< float > &  vClassConfidences,
std::vector< cv::Rect > &  vBoundingBoxes,
const cv::Size &  cvInputFrameSize,
const float  fMinObjectConfidence 
)
inlineprivate

Given a tensor output from a YOLOv5 model, parse its output into something more usable.

Parameters
trOutput- A reference to the output tensor from the model. For a 640x640 input and 80 classes, the tensor should be of shape [1, 25200, 85] for YOLOv5.
vClassIDs- A reference to a vector that will be filled with class IDs for each prediction. The class ID of a prediction will be chosen by the highest class confidence for that prediction.
vClassConfidences- A reference to a vector that will be filled with the highest class confidence for that prediction.
vBoundingBoxes- A reference to a vector that will be filled with cv::Rect bounding box for each prediction.
cvInputFrameSize- The size of the original input frame. This is used to scale the bounding boxes back to the original image size.
fMinObjectConfidence- The minimum confidence for determining which predictions to keep. Predictions with a confidence below this value will be discarded.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-03-13
{
    /*
     * For YOLOv5, you divide your image size, e.g. 640, by the P3, P4, P5 output strides of 8, 16, 32 to arrive at grid sizes
     * of 80x80, 40x40, 20x20. Each grid point has 3 anchors by default (anchor box values: small, medium, large), and each
     * anchor contains a vector 5 + nc long, where nc is the number of classes the model has. So for a 640 image and 80 classes,
     * the output tensor will be [1, 25200, 85].
     */
    // Squeeze the batch dimension from the output tensor.
    torch::Tensor trSqueezedOutput = trOutput.squeeze(0);

    // Move the tensor to CPU if necessary. The accessor below reads host memory, so it cannot be used on a CUDA tensor.
    if (trSqueezedOutput.device().is_cuda())
    {
        trSqueezedOutput = trSqueezedOutput.to(torch::kCPU);
    }
    // Convert tensor to float if necessary.
    if (trSqueezedOutput.scalar_type() != torch::kFloat32)
    {
        trSqueezedOutput = trSqueezedOutput.to(torch::kFloat32);
    }
    // Ensure tensor is contiguous in memory.
    if (!trSqueezedOutput.is_contiguous())
    {
        trSqueezedOutput = trSqueezedOutput.contiguous();
    }

    // Create an accessor for fast element-wise access.
    at::TensorAccessor trAccessor = trSqueezedOutput.accessor<float, 2>();
    const int nNumDetections      = trSqueezedOutput.size(0);
    const int nTotalValues        = trSqueezedOutput.size(1);    // equals 5 + number_of_classes

    // Loop through each detection.
    for (int i = 0; i < nNumDetections; i++)
    {
        // Get the objectness confidence. This is the 5th value for each grid/anchor prediction (index 4).
        float fObjectnessConfidence = trAccessor[i][4];

        // Skip predictions whose objectness confidence is below the threshold.
        if (fObjectnessConfidence < fMinObjectConfidence)
        {
            continue;
        }

        // Retrieve bounding box data.
        float fCenterX = trAccessor[i][0];
        float fCenterY = trAccessor[i][1];
        float fWidth   = trAccessor[i][2];
        float fHeight  = trAccessor[i][3];

        // Scale bounding box to original image size.
        int nLeft           = static_cast<int>((fCenterX - (0.5 * fWidth)) * cvInputFrameSize.width);
        int nTop            = static_cast<int>((fCenterY - (0.5 * fHeight)) * cvInputFrameSize.height);
        int nBoundingWidth  = static_cast<int>(fWidth * cvInputFrameSize.width);
        int nBoundingHeight = static_cast<int>(fHeight * cvInputFrameSize.height);

        // Repackage bounding box data to be more readable.
        cv::Rect cvBoundingBox(nLeft, nTop, nBoundingWidth, nBoundingHeight);

        // Loop over class confidence values and find the class ID with the highest confidence.
        float fClassConfidence = -1.0f;
        int nClassID           = -1;
        for (int j = 5; j < nTotalValues; j++)
        {
            float fConfidence = trAccessor[i][j];
            if (fConfidence > fClassConfidence)
            {
                fClassConfidence = fConfidence;
                nClassID         = j - 5;
            }
        }

        // Only process detections that meet the minimum confidence.
        if (fClassConfidence < fMinObjectConfidence)
        {
            continue;
        }

        // Add data to vectors.
        vClassIDs.emplace_back(nClassID);
        vClassConfidences.emplace_back(fClassConfidence);
        vBoundingBoxes.emplace_back(cvBoundingBox);
    }
}
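
The per-row logic above (objectness gate, then argmax over class scores, then a second confidence gate) does not depend on LibTorch. A minimal sketch over a flat row-major buffer is shown below; `Prediction` and `ParseRowsYOLOv5` are hypothetical names for illustration, and box scaling is left out.

```cpp
#include <cstddef>
#include <vector>

// One surviving prediction (hypothetical type); box values are copied through unscaled.
struct Prediction
{
    int nClassID;
    float fClassConfidence;
    float fCenterX, fCenterY, fWidth, fHeight;
};

// vOutput is row-major with nValuesPerRow = 5 + nc values per prediction:
// [cx, cy, w, h, objectness, class_0, ..., class_{nc-1}].
inline std::vector<Prediction> ParseRowsYOLOv5(const std::vector<float>& vOutput, int nValuesPerRow, float fMinConfidence)
{
    std::vector<Prediction> vResults;
    if (nValuesPerRow <= 5)
    {
        return vResults;    // No class scores to inspect.
    }

    const std::size_t nRows = vOutput.size() / static_cast<std::size_t>(nValuesPerRow);
    for (std::size_t i = 0; i < nRows; ++i)
    {
        const float* pRow = &vOutput[i * nValuesPerRow];

        // Gate on objectness first; most rows are rejected here.
        if (pRow[4] < fMinConfidence)
        {
            continue;
        }

        // Argmax over the class scores.
        int nBestClass  = -1;
        float fBestConf = -1.0f;
        for (int j = 5; j < nValuesPerRow; ++j)
        {
            if (pRow[j] > fBestConf)
            {
                fBestConf  = pRow[j];
                nBestClass = j - 5;
            }
        }

        // Gate on the best class confidence as well.
        if (fBestConf < fMinConfidence)
        {
            continue;
        }
        vResults.push_back({nBestClass, fBestConf, pRow[0], pRow[1], pRow[2], pRow[3]});
    }
    return vResults;
}
```

Checking objectness before the class loop avoids scanning class scores for the large majority of rows that will be discarded anyway.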

◆ ParseTensorOutputYOLOv8()

void yolomodel::pytorch::PyTorchInterpreter::ParseTensorOutputYOLOv8 ( const torch::Tensor &  trOutput,
std::vector< int > &  vClassIDs,
std::vector< float > &  vClassConfidences,
std::vector< cv::Rect > &  vBoundingBoxes,
const cv::Size &  cvInputFrameSize,
const float  fMinObjectConfidence 
)
inlineprivate

Given a tensor output from a YOLOv8 model, parse its output into something more usable.

Parameters
trOutput- A reference to the output tensor from the model.
vClassIDs- A reference to a vector that will be filled with class IDs for each prediction. The class ID of a prediction will be chosen by the highest class confidence for that prediction.
vClassConfidences- A reference to a vector that will be filled with the highest class confidence for that prediction.
vBoundingBoxes- A reference to a vector that will be filled with cv::Rect bounding box for each prediction.
cvInputFrameSize- The size of the original input frame before resizing. This is used to scale the bounding box back to the original size.
fMinObjectConfidence- The minimum confidence required for an object to be considered a valid detection.
Note
For YOLOv8, you divide your image size, e.g. 640, by the P3, P4, P5 output strides of 8, 16, 32 to arrive at grid sizes of 80x80, 40x40, 20x20. Each grid point has 1 anchor, and each anchor contains a vector 4 + nc long, where nc is the number of classes the model has. So for a 640 image and 80 classes, the output tensor will be [1, 84, 8400]. Notice how the two larger dimensions are swapped when compared to YOLOv5.
Author
clayjay3 (clayt.nosp@m.onra.nosp@m.ycowe.nosp@m.n@gm.nosp@m.ail.c.nosp@m.om)
Date
2025-03-08
{
    /*
     * Permute the output tensor shape to match the expected format of the parser. If the model is YOLOv8, the output
     * shape for a 640x640 image will be [1, 4 + nc, 8400] (nc = number of classes). Notice how the two larger dimensions are
     * swapped when compared to YOLOv5. We will permute the tensor to [1, 8400, 4 + nc] to make it easier to parse, then squeeze
     * the tensor to remove the batch dimension so the final shape will be [8400, 4 + nc].
     */
    // Permute the tensor shape from [1, 4 + nc, 8400] to [1, 8400, 4 + nc]
    // and then squeeze to remove the batch dimension, resulting in [8400, 4 + nc].
    torch::Tensor trPermuteOutput = trOutput.permute({0, 2, 1}).squeeze(0);

    // Move tensor to CPU if necessary. The accessor below reads host memory, so it cannot be used on a CUDA tensor.
    if (trPermuteOutput.device().is_cuda())
    {
        trPermuteOutput = trPermuteOutput.to(torch::kCPU);
    }
    // Convert tensor to float if necessary.
    if (trPermuteOutput.scalar_type() != torch::kFloat32)
    {
        trPermuteOutput = trPermuteOutput.to(torch::kFloat32);
    }
    // Ensure tensor is contiguous in memory.
    if (!trPermuteOutput.is_contiguous())
    {
        trPermuteOutput = trPermuteOutput.contiguous();
    }

    // Create an accessor for fast element-wise access.
    at::TensorAccessor trAccessor = trPermuteOutput.accessor<float, 2>();
    const int nNumDetections      = trPermuteOutput.size(0);
    const int nTotalValues        = trPermuteOutput.size(1);    // equals 4 + number_of_classes

    // Ratio between the original frame size and the model input size (read from the model config).
    const float fScaleX = static_cast<float>(cvInputFrameSize.width) / m_cvModelInputSize.width;
    const float fScaleY = static_cast<float>(cvInputFrameSize.height) / m_cvModelInputSize.height;

    // Loop through each detection.
    for (int i = 0; i < nNumDetections; i++)
    {
        float fClassConfidence = -1.0f;
        int nClassID           = -1;

        // Loop over class confidence values and find the class ID with the highest confidence.
        for (int j = 4; j < nTotalValues; j++)
        {
            float fConfidence = trAccessor[i][j];
            if (fConfidence > fClassConfidence)
            {
                fClassConfidence = fConfidence;
                nClassID         = j - 4;
            }
        }

        // Only process detections that meet the minimum confidence.
        if (fClassConfidence < fMinObjectConfidence)
        {
            continue;
        }

        // Retrieve bounding box data (in model input pixels).
        float fCenterX = trAccessor[i][0];
        float fCenterY = trAccessor[i][1];
        float fWidth   = trAccessor[i][2];
        float fHeight  = trAccessor[i][3];

        // Scale bounding box to original image size.
        int nLeft      = static_cast<int>((fCenterX - (0.5f * fWidth)) * fScaleX);
        int nTop       = static_cast<int>((fCenterY - (0.5f * fHeight)) * fScaleY);
        int nBoxWidth  = static_cast<int>(fWidth * fScaleX);
        int nBoxHeight = static_cast<int>(fHeight * fScaleY);
        cv::Rect cvBoundingBox(nLeft, nTop, nBoxWidth, nBoxHeight);

        // Append results.
        vClassIDs.push_back(nClassID);
        vClassConfidences.push_back(fClassConfidence);
        vBoundingBoxes.push_back(cvBoundingBox);
    }
}
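
Two details of the YOLOv8 path are easy to get wrong: the raw output is channel-major ([4 + nc, N] after dropping the batch dimension), and boxes arrive as center/size values in model-input pixels. The sketch below shows both pieces in isolation. `Rect`, `At`, and `ScaleBox` are hypothetical helpers for illustration, and the model input size is passed in explicitly as an assumption about how a caller would supply it.

```cpp
#include <cstddef>
#include <vector>

// Top-left rectangle in frame pixels (hypothetical stand-in for cv::Rect).
struct Rect
{
    int nLeft, nTop, nWidth, nHeight;
};

// Read value (nChannel, nPrediction) from a channel-major [nChannels x nNumPredictions] buffer.
// This is the element that permute({0, 2, 1}) would expose as [nPrediction][nChannel].
inline float At(const std::vector<float>& vOutput, int nNumPredictions, int nChannel, int nPrediction)
{
    return vOutput[static_cast<std::size_t>(nChannel) * nNumPredictions + nPrediction];
}

// Convert a center-format box in model-input pixels to a top-left rect in frame pixels.
inline Rect ScaleBox(float fCenterX, float fCenterY, float fWidth, float fHeight,
                     int nModelWidth, int nModelHeight, int nFrameWidth, int nFrameHeight)
{
    const float fScaleX = static_cast<float>(nFrameWidth) / nModelWidth;
    const float fScaleY = static_cast<float>(nFrameHeight) / nModelHeight;
    return {static_cast<int>((fCenterX - 0.5f * fWidth) * fScaleX),
            static_cast<int>((fCenterY - 0.5f * fHeight) * fScaleY),
            static_cast<int>(fWidth * fScaleX),
            static_cast<int>(fHeight * fScaleY)};
}
```

For example, a 64x64 box centered at (320, 320) in a 640x640 model input maps to a 128x72 box with its top-left corner at (576, 324) in a 1280x720 frame.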

The documentation for this class was generated from the following file: