Master's Thesis

FPGA Implementation of a CNN for Oriented Object Detection in Aerial Images

Francisco Miguel Correia Torcato Carrilho, 2024

Key information

Authors:

Francisco Miguel Correia Torcato Carrilho

Supervisors:

Mário Pereira Véstias; Horácio Cláudio De Campos Neto

Published in

06/19/2024

Abstract

The objective of this work is to design and implement a hardware/software system for oriented object detection in aerial images. The system is based on a convolutional neural network (CNN) detector and targets a throughput of one aerial image per second on a system-on-chip field-programmable gate array (SoC FPGA). Object detection in aerial images and videos is a challenging computer vision problem with important real-world applications such as emergency rescue, disaster relief, and surveillance. Oriented object detection considers both the position of the object and its rotation angle, or orientation, which makes detections significantly more accurate but also more computationally intensive. Most target applications must run locally on unmanned aerial vehicles (UAVs), which requires efficient solutions on edge devices, namely SoC FPGAs. The hardware/software system implements an optimized version of an oriented object detection model based on the YOLO object detection algorithm. The original YOLO model was optimized and quantized, with both weights and activations represented in a specific 8-bit fixed-point format, to provide an efficient, hardware-friendly solution. The system is composed of a dedicated hardware accelerator, which speeds up inference of the main layers of the CNN model by executing 256 multiply-accumulate operations in parallel, and a software processor that executes the less computationally intensive functions. The final hardware/software system, implemented on a Zynq SoC FPGA, runs R-YOLOv4 inference at a frame rate close to 1 FPS with a power consumption of only 7.3 W.
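
The 8-bit fixed-point quantization and multiply-accumulate (MAC) step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state the exact fixed-point format, so the fractional bit widths (`w_frac`, `a_frac`) and all function names here are hypothetical, chosen only to show the general idea of quantizing to signed 8-bit integers, accumulating products in a wider integer, and rescaling at the end.

```python
def quantize(x, frac_bits, total_bits=8):
    """Map a real value to a signed fixed-point integer, saturating on overflow.

    frac_bits is an assumed number of fractional bits (hypothetical format).
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))      # -128 for 8 bits
    hi = (1 << (total_bits - 1)) - 1   # +127 for 8 bits
    q = int(round(x * scale))          # Python's round: ties go to even
    return max(lo, min(hi, q))


def dequantize(q, frac_bits):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def mac(weights_q, acts_q):
    """Multiply-accumulate over integer operands in a wide accumulator.

    A hardware accelerator would perform many of these products in
    parallel; here they are simply summed in a loop.
    """
    acc = 0
    for w, a in zip(weights_q, acts_q):
        acc += w * a
    return acc


def dot_fixed(weights, acts, w_frac=6, a_frac=4):
    """Quantize both operands, run the integer MAC, rescale the result."""
    wq = [quantize(w, w_frac) for w in weights]
    aq = [quantize(a, a_frac) for a in acts]
    return mac(wq, aq) / (1 << (w_frac + a_frac))
```

For example, `dot_fixed([0.5, 0.25], [1.0, 2.0])` quantizes the weights with 6 fractional bits and the activations with 4, accumulates the integer products, and divides by 2^10 to return to real units. Values outside the representable range are clipped by `quantize` rather than wrapped.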

Publication details

Fields of Science and Technology (FOS)

Electrical engineering, electronic engineering, information engineering

Publication language (ISO code)

eng - English

Rights type:

Embargo lifted

Date available:

04/14/2025

Institution name

Instituto Superior Técnico