Abstract

Deep Convolutional Neural Networks (CNNs) play an important role in computer vision applications. However, they are computationally intensive and resource-consuming, which makes them hard to integrate into embedded platforms. The FPGA is one of the most promising platforms for accelerating Deep CNNs due to its configurability, high performance, low power consumption, and shorter development cycles. In this research, we introduce a versatile High Level Synthesis (HLS) C++ Deep CNN compiler that generates highly efficient implementations for accelerating the inference task of Deep CNNs on FPGAs. We developed HLS C++ implementations of the Deep CNN building blocks that efficiently utilize the available FPGA resources to achieve maximum inference performance. The developed compiler automates the customization of the accelerator's HLS implementation to best fit the selected Deep CNN model. The proposed work is demonstrated with implementations of two different Deep CNN models, ResNet-50 and VGG16, on a Xilinx Zynq MPSoC using the SDSoC development environment.
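To make the idea of an HLS C++ CNN building block concrete, the following is a minimal, hypothetical sketch of a single 2D convolution layer written the way HLS tools expect: fixed compile-time loop bounds and a pipeline directive so the tool can schedule one output per clock cycle. All sizes, names, and the directive placement are illustrative assumptions, not the compiler's actual generated code.

```cpp
#include <array>
#include <cstddef>

// Illustrative, fixed-size feature-map dimensions (assumed, not from the thesis).
constexpr std::size_t IN  = 6;           // input feature-map height/width
constexpr std::size_t K   = 3;           // convolution kernel size
constexpr std::size_t OUT = IN - K + 1;  // "valid" convolution output size

using FMapIn  = std::array<std::array<float, IN>, IN>;
using Kernel  = std::array<std::array<float, K>, K>;
using FMapOut = std::array<std::array<float, OUT>, OUT>;

// One CNN building block: a single-channel 2D convolution with static
// bounds, so an HLS tool can fully pipeline the accumulation loops.
FMapOut conv2d(const FMapIn& in, const Kernel& w) {
    FMapOut out{};
    for (std::size_t r = 0; r < OUT; ++r) {
        for (std::size_t c = 0; c < OUT; ++c) {
#pragma HLS PIPELINE II=1  // HLS directive; a standard compiler ignores it
            float acc = 0.0f;
            for (std::size_t i = 0; i < K; ++i)
                for (std::size_t j = 0; j < K; ++j)
                    acc += in[r + i][c + j] * w[i][j];
            out[r][c] = acc;
        }
    }
    return out;
}
```

A model compiler in this spirit would emit one such block per layer, with the constants (`IN`, `K`, channel counts) specialized to the selected network, which is what lets the synthesized hardware match the model exactly.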