Consultancy and Total Solutions Training Provider for Embedded Systems, Electronics and Electrical Engineering, Programming, Computing, Operations, ISO9000, ISO14000 and Management.

Bridging the Gap

Hardware Accelerators for High Performance Computing

Course id: 0018


The digital age demands vast data-processing and number-crunching capability. Applications such as video compression/decompression, voice recognition and 3D graphics typically require electronic devices to incorporate some form of hardware acceleration to achieve the required processing throughput.

This course covers the essential concepts in the structured development of hardware accelerators, including resource management, partitioning, communication and optimization. The course interlaces lectures with hands-on practicals that go from equation to efficient implementation.

Course highlight
Participants will gain practical design experience using the Altera DE2 FPGA development board and Altera's Quartus II development software.

What you will learn

This course concentrates on the theoretical and practical knowledge participants need to achieve the following learning outcomes. Upon completing the course, participants will be able to:
  • Develop FPGA-based systems with soft-core processors using the Altera Quartus II environment
  • Partition computations for execution in software and hardware
  • Use numerical methods to solve complex computations such as differential equations
  • Estimate speed improvements and hardware resource tradeoffs of various computational methods
  • Compute equations purely in hardware
  • Optimize equations to reduce hardware resources and/or improve computation speed
  • Implement partial and complete loop unrolling for maximum performance
  • Implement the pipelining paradigm to maximize computation throughput

Who should attend

Engineers and researchers who are developing, testing and/or debugging heterogeneous computing systems and platforms.


Participants must be familiar with the C programming language. A basic understanding of the HDL design flow and of languages such as Verilog and VHDL is beneficial.

Course methodology

This course is presented in a workshop style with lectures interlaced with demonstrations and practicals for maximum understanding.

Course duration

4 days.

Course structure

  • Introduction 1
    • Heterogeneous computing
    • System-on-Chip
    • Hands-on practical 1: Introduction to the NIOS II
  • Introduction 2
    • Hardware-software co-design
    • Hands-on practical 2: Neural Network Implementation (Software)
  • Partitioning 1
    • Amdahl's Law
    • Partitioning
    • Resource estimation
    • Hands-on practical 3: Resource and Performance Estimation
  • Partitioning 2
    • Partitioning
    • Incrementing functionality
    • Decrementing functionality
    • Hands-on practical 4: Neural Network Implementation (Hardware)
  • Communication 1
    • Hardware
    • Signals
    • Hands-on practical 5: Inter-Module Communication
  • Communication 2
    • Bus
    • AMBA
    • Software
    • Shared memory
    • Message passing
    • Hands-on practical 6: Shared Memory
  • Optimisation 1
    • Introduction
    • Constant folding
    • Strength reduction
    • Dead code elimination
    • Dead store elimination
    • Inline expansion
    • Removing recursion
    • Loop reversal
    • Loop unrolling
    • Hands-on practical 7: Loop Unrolling
  • Optimisation 2
    • Chaining
    • Multicycling
    • Pipelining
    • Hands-on practical 8: Pipelining, Chaining and Multicycling


Dr Royan Ong

Course Schedule

No public course currently scheduled.

  ProvenPac Sdn. Bhd.
  C-4-3 Gembira Park,
  Jalan Riang, 58200
  Kuala Lumpur, Malaysia

  Tel: +603 03 5889 5889

