Lessons and Learning Resources


Note: Please see the Amathuba site for the latest lecture slides and course handout.
The course structure and pracs are being reworked for 2024.


Week 1: Intro to HPES and Essential Terms (12-16 Feb)
Lecture 1 PDF Lecture 01

This is the first lecture for this course, mainly a Meet & Greet and an explanation of how the course is structured and what is involved. Have a look through this lecture and then give Quiz0 a try.

Read L01: Asanovic et al., The landscape of parallel computing research: A view from Berkeley
Read L01b: Asanovic et al., A view of the parallel computing landscape (recommended easier read)

Lecture 2 PDF Lecture 02

Terms, Validation vs. Verification, commonly used verification, Amdahl’s Law, dealing with reading assignments. Then a discussion bridging to Prac1 and the use of a golden measure. Discuss reading R01.
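As a quick illustration of Amdahl's Law from this lecture, here is a minimal Python sketch (not taken from the slides). The function name and the example fraction of 0.90 are assumptions chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Speedup predicted by Amdahl's Law for a fixed problem size.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    n_processors: number of processors applied to the parallel part.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Hypothetical program in which 90% of the run time is parallelizable.
    for n in (1, 2, 4, 8, 1024):
        print(f"n={n:5d}  speedup={amdahl_speedup(0.90, n):.2f}")
```

With a 10% serial part the speedup can never reach 10, no matter how many processors are added, which is the main point of the law.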

Slides with notes view: Lecture2(notes).pdf (lecture printed in notes view).

Watch: Linux Magazine Video: Understanding Parallel Computing (Part 1): Amdahl's Law
Watch (optional): Understanding Parallel Computing (Part 2): The Lawn Mower Law

Week 2: Towards Digital Accelerators. Benchmarking Techniques. (19-23 Feb)
Lecture 3 PDF Lecture 03

Considerations for edge computing and the relevance of HPES to developing such systems.
- Microprocessor-based vs. FPGA-based solutions for reconfigurable computing
- Platform & tools we will use
- GPUs (issues and benefits) and programming of these

Reading tasks

Lecture 4 PDF Lecture 04

OpenCL overview and how to program in OpenCL. You may want to jump to slide 42 if you want to get going with Prac2 as soon as possible. (Some clickable voice annotations in the PowerPoint slides have been updated.)
OpenCL example: OpenCL_mmexample.zip

Lecture 5 PDF Lecture 05

Performance benchmarking (Part 1). Basic wall-clock benchmarking techniques.

Code mentioned in the slides:
Cycle.h Cycles.c
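
For a feel of basic wall-clock benchmarking against a golden measure, here is a minimal Python sketch. It is not the Cycle.h / Cycles.c code from the slides; the helper names and the sum-of-squares workload are hypothetical and only illustrate the idea of timing a reference version and a candidate version and checking that their results agree.

```python
import time


def time_best_of(func, repeats=5):
    """Return the shortest wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def golden_sum_of_squares(n=100_000):
    """Straightforward reference version (the golden measure)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def candidate_sum_of_squares(n=100_000):
    """Alternative version whose speed is compared to the golden measure."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Check correctness first, then compare timings.
    assert golden_sum_of_squares() == candidate_sum_of_squares()
    t_golden = time_best_of(golden_sum_of_squares)
    t_candidate = time_best_of(candidate_sum_of_squares)
    print(f"golden:    {t_golden * 1e3:.3f} ms")
    print(f"candidate: {t_candidate * 1e3:.3f} ms")
    print(f"speedup:   {t_golden / t_candidate:.2f}x")
```

Taking the best of several runs is one common way to reduce noise from other processes; the slides may recommend a different approach.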


Week 3: Parallel programming & Performance Benchmarks (27 Feb – 3 Mar)
Lecture 6 PDF Lecture 06

Performance benchmarking (Part 2).
- Metrics of Performance.
- Average Cycles Per Instruction (ACPI)
- Cost vs. Performance
- SWAP
- Profiling code (Reading: Valgrind)
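
The average cycles per instruction metric can be sketched in a few lines of Python. The instruction mix and clock rate below are made-up values for illustration only, and the function names are assumptions rather than anything defined in the slides.

```python
def average_cpi(mix):
    """Average cycles per instruction for an instruction mix.

    mix: iterable of (instruction_count, cycles_per_instruction) pairs.
    """
    mix = list(mix)
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cpi for count, cpi in mix)
    return total_cycles / total_instructions


def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds: instructions x cycles/instruction / clock rate."""
    return instruction_count * cpi / clock_hz


if __name__ == "__main__":
    # Hypothetical mix: 500 one-cycle, 300 two-cycle and 200 four-cycle instructions.
    mix = [(500, 1), (300, 2), (200, 4)]
    acpi = average_cpi(mix)
    print(f"average CPI = {acpi:.2f}")
    print(f"time at 1 MHz = {execution_time(1000, acpi, 1e6) * 1e3:.2f} ms")
```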

Lecture 7 PDF Lecture 07

Parallel Computing Fundamentals. Large Scale Parallelism. Mainstream parallel. Classic Parallel approaches. Flynn's Taxonomy. Some calculations: effective parallelism, parallel efficiency, Gustafson's Law.

L7_learning_activity_A.pdf: Considering whether SPMD or MPMD is better than SPSD
L7_learning_activity_B.pdf: Estimation for maximum effective parallelism

(Excel document that illustrates the solution discussed in the slides: L7-activity calc.xlsx)
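
Two of the calculations named for this lecture, Gustafson's Law and parallel efficiency, can be tried out with the minimal Python sketch below. It uses the usual textbook forms (scaled speedup = n - s(n - 1), efficiency = speedup / n); the slides and the Excel sheet may use different notation, and the 10% serial fraction is an assumed example value.

```python
def gustafson_speedup(serial_fraction, n_processors):
    """Scaled speedup from Gustafson's Law: n - s * (n - 1)."""
    return n_processors - serial_fraction * (n_processors - 1)


def parallel_efficiency(speedup, n_processors):
    """Fraction of the ideal n-fold speedup that is actually achieved."""
    return speedup / n_processors


if __name__ == "__main__":
    s = 0.10  # assumed serial fraction of the scaled workload
    for n in (2, 10, 100):
        sp = gustafson_speedup(s, n)
        eff = parallel_efficiency(sp, n)
        print(f"n={n:3d}  scaled speedup={sp:.2f}  efficiency={eff:.3f}")
```

Unlike Amdahl's Law, Gustafson's Law assumes the problem size grows with the number of processors, so the predicted speedup keeps rising as n increases.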

Lecture 8 PDF Lecture 08

Amdahl's Law for the Multicore Era and Base Core Equivalents (BCE).

- Understanding Amdahl's Law for the Multicore Era
- Base Core Equivalents (BCEs)
- Calculating performance using BCEs

Reading: Hill and Marty 2008: "Amdahl's Law in the Multicore Era" source: https://ieeexplore.ieee.org/abstract/document/4563876
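
To experiment with BCE-based performance calculations, here is a minimal Python sketch of the symmetric and asymmetric multicore speedup formulas from the Hill and Marty paper, using the paper's assumption that a core built from r BCEs has sequential performance sqrt(r). The function names and the example values (f = 0.9, n = 64) are illustrative assumptions; the lecture slides may present the model in a different notation.

```python
import math


def perf(r):
    """Sequential performance of one core built from r BCEs (assumed sqrt(r))."""
    return math.sqrt(r)


def symmetric_speedup(f, n, r):
    """Chip with n BCEs in total, organised as n/r identical cores of r BCEs each.

    f is the parallelizable fraction of the workload.
    """
    return 1.0 / ((1.0 - f) / perf(r) + (f * r) / (perf(r) * n))


def asymmetric_speedup(f, n, r):
    """Chip with one large core of r BCEs plus (n - r) single-BCE cores."""
    return 1.0 / ((1.0 - f) / perf(r) + f / (perf(r) + n - r))


if __name__ == "__main__":
    f, n = 0.9, 64  # assumed parallel fraction and BCE budget
    for r in (1, 4, 16, 64):
        print(f"r={r:2d}  symmetric={symmetric_speedup(f, n, r):6.2f}  "
              f"asymmetric={asymmetric_speedup(f, n, r):6.2f}")
```

With r = 1 both formulas reduce to the classic Amdahl's Law for n processors, which is a handy sanity check.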

Week 4: Design of Parallel Systems and Programs (6-10 Mar)
Lecture 9 PDF Lecture 09

Steps in parallelizing programs. Understand the problem. Partitioning. Granularity. Identify data dependencies. This presentation covers steps 1-3: 1) Understand the problem, 2) Partitioning (i.e. separation into main tasks), 3) Granularity (assessing granularity of the problem vs. the parallelism).

Note: the description has been refined to explain the difference between 'granularity of problem' and 'granularity of parallelism'. (The discussion brings in issues and thinking points from the GA2 Conceptual Assignment.)

Lecture 10 PDF Lecture 10

Continuing the steps of designing parallel systems. Covering Steps 5-7: 5) Synchronization, 6) Load balancing, 7) Performance analysis and tuning. (Step 4 is a rather elaborate step, involving communication and integration of subsystems and is therefore covered in a lecture on its own).

Week 5: Communication and Memory Architecture (13-17 Mar)
Lecture 11 PDF Lecture 11

Step 4: communication: Factors related to Communication; Cost of communications; Latency vs. Bandwidth; Baud rate vs. Bandwidth; Effective bandwidth; Visibility of Communications; Synchronous vs. asynchronous; Scope of communications; Collective communications; Efficiency of communications.
Class Activity 1 (Effective Bandwidth)
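
The interplay of latency and bandwidth can be explored with the minimal Python sketch below. It assumes a simple model in which transfer time = latency + size / bandwidth and effective bandwidth = size / transfer time; the class activity may define effective bandwidth differently, and the link parameters are made-up example values.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message: fixed latency plus size over raw bandwidth."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def effective_bandwidth(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes actually delivered per second once latency is included."""
    return size_bytes / transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s)


if __name__ == "__main__":
    latency = 50e-6    # assumed 50 microsecond latency
    bandwidth = 100e6  # assumed 100 MB/s raw bandwidth
    for size in (100, 10_000, 1_000_000):
        eff = effective_bandwidth(size, latency, bandwidth)
        print(f"{size:9d} bytes  effective = {eff / 1e6:7.2f} MB/s")
```

Small messages are dominated by latency, so their effective bandwidth is far below the raw figure; large messages approach it.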

Lecture 12 PDF Lecture 12

Distributed and Shared Memory Architectures: Distributed memory infrastructure; Shared memory infrastructure; Hybrid memory infrastructure; Warming up for MMU, DMA and memory considerations (to be delved into next term).

Week 6: Intro to YODA Project and Verilog Refresher (20-24 Mar)
Lecture 13 PDF Lecture 13

Your Own Digital Accelerator (YODA) Project; Discussion of Digital Accelerators.
(If the annotations are not working in the pptx, try this mp4 version.)

Lecture 14 PDF Lecture 14

Programmable Logic and FPGA Internals. Topics covered:

  • Programmable Logic Devices
  • What is so special about FPGAs
  • FPGA internals
  • Xilinx Slices {know what a slice is conceptually; no details or specifics need be known for tests}.

Recommended preparatory reading (especially if you have not used Verilog previously):

Lecture 15 PDF Lecture 15

Coding in Verilog: Basics of Verilog coding. Exercise in Verilog. Verilog simulators. Intro to Verilog in Vivado/ISE. Test bench. Generating Verilog from Schematic Editors. (This presentation does recap various aspects covered in EEE3096S ES-II).

Recommended: give EDA Playground (www.edaplayground.com) a try; you may want to set up your own login for using the tool. Take a look at the useful tutorials and support available on ASIC World at www.asic-world.com/examples/verilog.

Week 7: Class Test. YODA Project Planning & Term 1 catch-up (3-7 Apr)

Test 1. Progress and planning of your YODA projects. Catch up on practical assignments.

Week 8: More Verilog HDL and Simulation Techniques (10-14 Apr)
Lecture 16 PDF Lecture 16

Topics include: Brief recap (of Verilog items from Lecture 15); Busses and Endianness; Functions in Verilog; Implementing state machines.
The zip file contains the code for AliveFSM mentioned in the slides and the 'CosmicDetector' C starting point for the voluntary learning activity described at the end of the lecture.
AliveFSM and activity (state machine example and more involved CosmicDetector class activity)

Blocking and non-blocking simulation example.
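
As a small aside on the endianness topic listed for this lecture, the Python sketch below shows how the same 32-bit value is laid out byte by byte in big-endian and little-endian order. It uses Python's standard struct module purely as an illustration and is not part of the Verilog material; the helper name and the value 0x12345678 are arbitrary choices.

```python
import struct


def byte_layouts(value):
    """Return (big_endian_bytes, little_endian_bytes) for a 32-bit unsigned value."""
    big = struct.pack(">I", value)     # most significant byte first
    little = struct.pack("<I", value)  # least significant byte first
    return big, little


if __name__ == "__main__":
    big, little = byte_layouts(0x12345678)
    print("big-endian:   ", big.hex(" "))
    print("little-endian:", little.hex(" "))
    # Reading big-endian bytes as if they were little-endian scrambles the value.
    misread = struct.unpack("<I", big)[0]
    print(f"misread value: 0x{misread:08X}")
```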

Lecture 17 PDF Lecture 17

Topics covered: Attributes; Constraints; UCF Files; Xilinx Design Constraints

{Many of these slides won't be examinable. Slides relevant to exams: 4-7. You don't need to know the syntax for UCF (skip 8-13). Be familiar with what XDC is, but you won't be expected to know how to use this tool, and you don't need to worry about the UCF tutorial, although it is useful to try if you plan to use a Xilinx FPGA board.}

Week 9: FPGA systems and related architecture (17-21 Apr)
Lecture 18 PDF Lecture 18

FPGA and CPU Performance Comparison (audio annotations in pptx).
Topics covered: FPGA performance evaluation; FPGA vs CPU performance; FPGA families; YODA issues (not examined).
Class Activity (manually place and route a bitfile)
Class Activity Solution

Lecture 19 PDF Lecture 19

HDL Imitation method. Using standard benchmarks for FPGAs. Amdahl's Law for FPGAs.

The 'HDL Imitation' method is a suggested approach to quickly get together both a golden measure and a start on an HDL solution. Various case studies are used in these slides as examples of how the methods are applied.

Project Tips PDF Project MS-1

Tips on strategizing your YODA projects, and deciding roles for the team members. Plans on marking and review of MS1 submission and onward to MS2 and expectations for that milestone.

Week 10: Config architecture, Interconnects (24-28 Apr)
Lecture 20 PDF Lecture 20

More Verilog; Configuration Architectures; RC Building Blocks (for IP Cores); Basic handshaking, latches and other interface ingredients.
This lecture provides theory on how subsystems might cooperate in more robust and dependable ways. (Includes non-examined case studies of unusual FPGA platform designs.) This might feel like 3 lectures in one... it effectively is, but it has been streamlined, and the exam syllabus will identify the aspects of this lecture to focus on when preparing for the exam.
Example of handshaking in Verilog: GenResult
GenResults.v testbench: GenResults_tb.v

Week 11: More HDL methods; RC Design Process (1-5 May)
Lecture 21 PDF Lecture 21

On-chip interconnection bus topologies; Interfacing standards; Memory types; Memory Control Unit (MCU) (Part 1 of 2); Using Memory and MCU in Verilog

Example single-port memory unit in Verilog: RAM MCU
ramc.v testbench: ramc_tb.v

Lecture 22 PDF Lecture 22

Memory Control Units (part 2 of 2): Dual-port memory control unit, Setting up memory in code and in simulation; FIFO and LIFO. {optional reading: On-chip Interfacing Standards: Wishbone and how it works, The Altera/Intel Avalon Bus}.
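
The difference between FIFO and LIFO access order can be seen in the minimal Python sketch below. It only illustrates the ordering behaviour in software and is not related to the Verilog memory units that accompany this lecture; the function names are assumptions.

```python
from collections import deque


def fifo_order(items):
    """Push all items into a FIFO (queue), then pop them all: first in, first out."""
    fifo = deque()
    for item in items:
        fifo.append(item)      # write at the tail
    return [fifo.popleft() for _ in range(len(fifo))]  # read from the head


def lifo_order(items):
    """Push all items onto a LIFO (stack), then pop them all: last in, first out."""
    lifo = []
    for item in items:
        lifo.append(item)      # push on top
    return [lifo.pop() for _ in range(len(lifo))]      # pop from the top


if __name__ == "__main__":
    data = ["a", "b", "c"]
    print("FIFO read order:", fifo_order(data))
    print("LIFO read order:", lifo_order(data))
```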

Example half-duplex single-port memory unit in Verilog: HDP-RAM MCU
hdp_ram.v testbench: hdp_ram_tb.v

Example full-duplex single-port memory unit in Verilog: FDP-RAM MCU
fdp_ram.v testbench: fdp_ram_tb.v

Week 12: HPES and Reconfigurable Computing Design Process; Softcores (8-12 May)
Lecture 23 PDF Lecture 23

HPES Development Process and Management Aspects: Where work is done; Division of Labour (DoL); HPES development process; Setting system objectives; Costs and risks; Monitoring progress; Documentation; Effort, Productivity and Progress. (Indeed, design and development team factors that university graduates working on advanced embedded systems should be aware of.)

(Optional extra slides re intro to Doxygen)

Lecture 24 PDF Lecture 24

Note that slides 15 onward are optional reading and will not be examined this year.

Comments/study tips: '[Software] Quality Assurance'. NB: For the exam, you can skip most of this lecture; it is largely optional additional reading and will not be examined! However, as per my earlier comment, it is good for a university graduate to know about these issues. The '[software]' is in square brackets because, while many of these slides do relate to software, many of the aspects also relate to HPES and FPGA-based HDL design work.

Topics: What is quality?; Software quality; Software quality assurance (SQA); Software quality systems; Consequences of bad SQA; An evolutionary model of SQA in organizations; Kaizen of software and Formal Specifications as getting closer to accurately specifying quality.

Summary: Summary and study tips (17 May)

Summary slides will be provided soon.