The FPGA-Ignite Summer School series is kindly supported by Heidelberg University, Carl-Zeiss-Stiftung, the FORTE Project and BMBF.

The summer school is a one-week opportunity for networking, exciting lectures and a hackathon on designing a custom RISC-V SoC (which will subsequently be taped out thanks to the kind support of efabless).

Registration

Thanks to support from the Bundesministerium für Bildung und Forschung and Carl-Zeiss-Stiftung, FPGA Ignite 2024 is free of charge to attend. Attendees will have to arrange and pay for travel and accommodation on their own. The Central Hotel and the Ibis are within walking distance of the main train station and the venue and are usually reasonably priced.

Please fill in the form at this link as soon as possible, as the number of seats is limited.

Registration-Form

Important Update: All in-person and online seats are now taken and unfortunately we cannot accept more attendees at this time.

Venue

FPGA Ignite 2024 will take place in the European Institute for Neuromorphic Computing (EINC). Street address: Im Neuenheimer Feld 225a, 69120 Heidelberg

For public transport: get off at the bus/tram stop Bunsengymnasium (Bus 31, 37; Tram 21, 24, 25). Google Maps works reasonably well; alternatively, install the VGN app on your phone.


Hotels

It is usually best to book a hotel directly through a major portal. Hotel Central and the IBIS are close to the main train station and within walking distance of the campus.

Organizers

Dirk Koch, Riadh Ben Abdelhamid (Novel Computing Technologies, ZITI, Heidelberg University)

Stefan Wallentowitz (Stefan's home page), Hochschule München University of Applied Sciences

Instructions

The virtual machine image contains all the software required for the summer school. Here you can find instructions for downloading and installing virtual machine software such as VirtualBox and KVM/QEMU, as well as the corresponding VM images.

Quick notes on the installation of Virtual Machines (VirtualBox and KVM/QEMU)
  1. Virtual Machine Software (Host)

    The virtual machine (VM) image is available in VDI format for VirtualBox and in QCOW2 format for KVM/QEMU. If you already have VirtualBox or KVM/QEMU installed, continue to step 2. Otherwise, decide which virtual machine software to install; if unsure, use VirtualBox (easier installation). Download VirtualBox for your system here: Download links for VirtualBox (install it by following the instructions at the link provided).
  2. Link for the Virtual Machine Images:
    The VM image is available in two formats, and you only need the one that matches the virtual machine software you installed. The image is compressed in 7zip format and available as a VirtualBox image (VDI) and a KVM image (QCOW2). Get it here:
    Compressed KVM VM-image: KVM link
    Compressed VDI VM-image: VDI link
  3. Extract Virtual Machine

    The VM image needs to be extracted before starting. If you don't have 7zip installed, install it using the appropriate binary provided in the package (Linux, macOS, Windows; 32- and 64-bit). You can download 7-zip from here: Download links for 7-zip. Then extract the image using 7zip.
  4. Start Virtual Machine

    Using the virtual machine software you chose, create a new virtual machine which has at least two CPUs and 20 GB of RAM (more is better), and open the corresponding image. Then start the virtual machine; you should see the Debian system boot to the graphical login screen. Done!
    To log into the VM use:
    Username: user
    Password/root password: FPGAIgnite
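
The four steps above can be sketched as a small Python helper that only builds the command lines and prints them for review (it does not run anything). This is a minimal sketch under assumptions: the archive and image file names, the VM name `fpga-ignite` and the controller name `SATA` are placeholders rather than names taken from the download page, and `7z`, `VBoxManage` and `qemu-system-x86_64` are assumed to be on the PATH.

```python
import shlex

VM_NAME = "fpga-ignite"          # hypothetical VM name
ARCHIVE = "fpga-ignite.vdi.7z"   # hypothetical archive file name
CPUS = 2                         # minimum CPU count from step 4
RAM_MB = 20 * 1024               # 20 GB of RAM from step 4, in megabytes


def virtualbox_commands(vdi_path):
    """Build the commands to extract the image and set up a VirtualBox VM."""
    return [
        ["7z", "x", ARCHIVE],  # step 3: extract the compressed image
        ["VBoxManage", "createvm", "--name", VM_NAME,
         "--ostype", "Debian_64", "--register"],
        ["VBoxManage", "modifyvm", VM_NAME,
         "--cpus", str(CPUS), "--memory", str(RAM_MB)],
        ["VBoxManage", "storagectl", VM_NAME, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", VM_NAME, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", vdi_path],
        ["VBoxManage", "startvm", VM_NAME],  # step 4: boot to the login screen
    ]


def qemu_command(qcow2_path):
    """Build a single KVM/QEMU command that boots the QCOW2 image."""
    return ["qemu-system-x86_64", "-enable-kvm",
            "-smp", str(CPUS), "-m", f"{RAM_MB}M",
            "-drive", f"file={qcow2_path},format=qcow2"]


if __name__ == "__main__":
    # Print the commands so they can be checked before running them by hand.
    for cmd in virtualbox_commands("fpga-ignite.vdi"):
        print(shlex.join(cmd))
    print(shlex.join(qemu_command("fpga-ignite.qcow2")))
```

Only one of the two variants is needed, depending on which virtual machine software was installed in step 1. The graphical VirtualBox wizard achieves the same result.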

Program

Day 1 (Monday, August 5th)
Day 2 (Tuesday, August 6th)
Day 3 (Wednesday, August 7th)
  • 9:00 – 10:30

    RISC-V Customization - Software Analysis Tooling (identify custom instruction), Extending LLVM and Cores (Stefan Wallentowitz) lab materials
  • 10:30 – 11:00

    Coffee break
  • 11:00 – 12:30

    Reconfigurable Instruction Set Extensions using FABulous eFPGAs - when and how (introduction and hands-on tutorial) (Dirk Koch) slides and Tutorial
  • 12:30 – 13:30

    Lunch
  • 13:30 – 15:00

    Boosting the efficiency of RISC-V cores: Fine-grain multi-threading and custom instructions, from concepts to implementation (Riadh Ben Abdelhamid) Tutorial-and-Labs
  • 15:00 – 15:30

    Coffee break
  • 15:30 – 17:00

    Boosting the efficiency of RISC-V cores: Fine-grain multi-threading and custom instructions, from concepts to implementation (Riadh Ben Abdelhamid)
  • 19:00

    Surprise Reception
Day 4 (Thursday, August 8th)
Hackathon Day 1
Day 5 (Friday, August 9th)
Hackathon Day 2
  • 9:00 – 16:30

    Hackathon
  • 16:30

    Best Project Award and Wrap-Up

Presentations

Poster Number | Poster Authors | Affiliation | Title | Abstract | PDF
1 | Chao Qian and Professor Gregor Schiele | University of Duisburg-Essen, Germany | Energy-Efficient Deep Learning Accelerators with Workload Awareness for Embedded FPGAs | My poster is about my PhD dissertation, which focuses on developing an energy-efficient deep learning accelerator on embedded FPGAs. Central to my work is the hypothesis that integrating efficient inference with workload awareness is essential for optimizing energy efficiency in embedded systems. Workload awareness guides the selection of FPGA models, optimization techniques during accelerator design, and adaptation of FPGA behavior during idle periods. | PDF
2 | Meinhard Kissich and Marcel Baunach | Graz University of Technology, Austria | FazyRV - A Scalable RISC-V Core | FazyRV is an inherently scalable RISC-V RV32I core for control-oriented applications and the IoT. It aims to minimize the area demand while fulfilling performance requirements and to close the gap between prevalent 32-bit and 1-bit-serial RISC-V cores. Thus, the data path can be synthesized to a width of either 1, 2, 4, or 8 bits to process smaller "chunks" of the operands in each clock cycle. | PDF
3 | Czea Sie Chuah | DENSO Automotive Deutschland GmbH, Technical University of Munich | Pre-Silicon Formal Verification and Post-Silicon Assertion Checker on RISC-V Processor | Due to the high security demands of upcoming applications, several famous bugs and security vulnerabilities found in processors in past years, and the openness of the RISC-V Instruction Set Architecture, we are researching methods to improve and harden hardware security. We identify and formally verify security-critical properties on a RISC-V processor for pre-silicon verification and subsequently adapt and optimize those properties to be converted into synthesizable form for fabrication along with the design for post-silicon verification. | PDF
4 | Dr. Hossam O. Ahmed | American University of the Middle East (AUM) | | This project presents novel True Random Number Generator (TRNG) modules implemented on various Intel FPGA chips. The verified output throughput ranges from 300 Mbps to 2.4 Gbps. These TRNG modules comply with both the US NIST SP 800-90B standard and the BSI AIS-31 PTG.2 standard, ensuring high-quality secure random number generation. | PDF (Online)
5 | Hugh Squires-Parkin | Newcastle University | Active Autonomy Drone Module | An autonomous drone module as part of an undergraduate project. It used an FPGA onboard as an accelerator, but it is mostly about the mmWave radar and is dated midway through the project. | Onsite
6 | Tim Fernandez-Hart, Dr James Knight, Prof. Tatiana Kalganova | Brunel University London | | This study compares a new arithmetic system - posits - against the established Floating-Point (FP) when training Spiking Neural Networks (SNNs) using only 8 bits. To emulate a future, reduced-precision SNN training accelerator, we quantise not just network parameters, but also the gradients, loss, accumulators, momentum and the neuron state variable membrane voltage at every stage of the training process. We not only show that posits are able to do this out-of-the-box, where FP fails, but we also show why this is the case. | Onsite
7 | Zheyu Liu, Christos Kotselidis | Advanced Processor Technology Research Group, Department of Computer Science, University of Manchester | Graph Neural Networks Acceleration with Adaptive Dataflow Architectures & FPGA | FPGA and reconfigurable dataflow architectures (RDAs) provide superior solutions for accelerating machine learning at various granularities. By leveraging heterogeneous reconfigurability, higher computational capabilities and energy efficiency can be achieved in complex graph tasks. | PDF
8 | Jan Zielasko(1,2), Rune Krauss(1), Rolf Drechsler | (1) Institute of Computer Science, University of Bremen, Germany; (2) Cyber-Physical Systems, DFKI GmbH, Germany | RISC-V Opt-VP: An Application Analysis Platform Using Bounded Execution Trees | RISC-V Opt-VP is a Virtual Prototype driven binary analysis platform. By analyzing the execution, it identifies instruction sequences that are promising candidates for application-specific hardware optimization. The poster gives an overview of the tool and presents our current and planned future work. | PDF
9 | Georgios Mentzos, Prof. Jörg Henkel | Karlsruhe Institute of Technology | Approximate Tiny Machine Learning on Lightweight FPGAs | The proliferation of tiny devices performing complex machine learning tasks has necessitated the development of new acceleration methods that meet strict energy and latency requirements, while maintaining acceptable accuracy. TinyML has emerged as the perfect candidate for bringing computations closer to the edge, allowing secure, private and near-instant responses. The inherently error-resilient nature of ML tasks forms a perfect match with Approximate Computing, enabling performance optimizations across the stack. At the same time, the flexible and efficient FPGA architecture is an excellent candidate to exploit the benefits of Approximate Computing, potentially offering significant optimization opportunities for lightweight embedded FPGAs. | PDF
10 | Morteza Rezaalipour(1), Marco Biasion(1), Cristian Tirelli(1), Mohammad Atwi(1), Lorenzo Ferretti(2), Rodrigo Otoni(1), Professor George A. Constantinides(3), Laura Pozzi(1) | (1) Università della Svizzera italiana, Switzerland; (2) Micron Technology, US; (3) Imperial College London | Approximate Logic Synthesis via Iterative SMT-based Subcircuit Rewriting and Through a Parametrizable FPGA or ASIC Template | Approximate Logic Synthesis via Iterative SMT-based Subcircuit Rewriting and Through a Parametrizable FPGA or ASIC Template | PDF
11 | M. Andjelkovic, J. Chen, R. T. Syed, F. Vargas, M. Ulbricht | IHP – Leibniz-Institut für innovative Mikroelektronik, Im Technologie Park 25, Frankfurt (Oder), Germany | Resilient Processing Platform for 6G Mobile Networks | Resilient Processing Platform for 6G Mobile Networks | PDF
12 | Bea Healy and Tobias Grosser | University of Cambridge | | Arcilator is an open-source hardware simulator in the CIRCT project built around the Arc Dialect, a multi-level intermediate representation that gradually lowers designs from hardware to software. The first level in this representation refines designs into a flattened graph of state elements (a generalization of registers) and the pure combinational transform arcs between them. This poster will discuss (a purely hypothetical idea for) how this structure could be used to accelerate simulation on processors with reconfigurable instruction extensions by outsourcing the most expensive transform arcs to dedicated instructions. | Onsite
13 | Farhad EbrahimiAzandaryani and Dietmar Fey | Friedrich-Alexander-Universität Erlangen-Nürnberg | Boosting Computations in RISC-V Cores with CSD-Encoded Operands | Performance bottlenecks, particularly those associated with the binary representation carry chain problem, hinder RISC-V cores' efficiency. In our research, we are working on a synthesizable design method that integrates ternary encoding into the μ-architecture of a RISC-V processor to address the computation bottleneck. | Onsite
14 | Manuel Jirsak, Adrian Pitterling, Jonas Lienke, Georg Gläser | IMMS Institut für Mikroelektronik- und Mechatronik-Systeme gemeinnützige GmbH (IMMS GmbH) | To Clock or not to Clock: Clock Gate Insertion with a Yosys-based Netlist Modification Tool | Conventional tools automatically insert clock gates during the synthesis process, but often rely on heuristics for the identification of insertion points and control signals. We present a more aggressive approach for hierarchically gating the clock of all FFs. This allows for better power reduction or on-demand clocking at the expense of higher area overhead. Using the open-source synthesis tool Yosys, we translate the circuit into a generic JSON netlist, insert the clock gates, and convert back to Verilog. | PDF
15 | Qaisar Farooq | | Efficiency and Performance Tradeoffs in FPGA based Embedded Computer Vision Applications | | PDF
16 | Tarik Ibrahimović, Chili.CHIPS*ba, doc. dr. Nedim Osmić | Chili.CHIPS*ba, Faculty of Electrical Engineering, Department of Automatic Control and Electronics, University of Sarajevo | eduBOS5: RISC-V uController Dev Experience: From Combo to Pipelined, optimising PPA metrics for FPGA | This poster highlights the architecture and features of a RISC-V micro-controller of configurable pipeline length, custom-tailored for FPGAs. The poster demonstrates the development process, thinking and motivation that led to a gradual transition from combo to pipelined design, optimizing Performance/Power/Area (PPA) metrics. The key takeaway is its impressive throughput for the minuscule physical footprint. | PDF
17 | Justin Cott | | Silicon Implementation of a RISC-V Processor for the AstroPix Multi-Channel Readout Controller | | PDF
18 | Filip Brkic and Ruediger Willenberg | Mannheim University of Applied Sciences, Germany | It's a Bug, not a Feature: Pitfalls of initialized BlockRAM use in soft processor systems | Dedicated RAM blocks on configured FPGAs are already initialized when booting on-chip embedded systems. This potential benefit can lead to software malfunction if used improperly. We identify how problematic design choices in the AMD/Xilinx toolflow for the Microblaze soft processor can lead to corruption of initial program state. We suggest fixes to prevent this flawed behaviour. Furthermore, we survey other FPGA vendors' soft processor toolchains to probe for similar problems. | Onsite
19 | Al-Harith Farhad, Fatos Gashi | | Embedded AI Optimization Approaches and Best Practices | | PDF
20 | Vijay Srinivas Tida (1), Thomas VanDenEinde (1), Sai Venkatesh Chilukoti (2), Md. Imran Hossen (2), Liqun Shan (2), Sonya Hsu (2) and Xiali Hei (2) | (1) College of St. Benedict and St. John's University; (2) University of Louisiana at Lafayette | Implementation of Kernel Segregated Transpose Convolution Operation | Transpose convolution has shown prominence in many deep-learning applications. However, transpose convolution layers are computationally intensive due to the increased feature map size, which is due to adding zeros after each element in each row and column. Thus, convolution operation on the expanded input feature map leads to poor utilization of hardware resources. The main reason for unnecessary multiplication operations is zeros at predefined positions in the input feature map. Thus, we propose a kernel-segregated transpose convolution mechanism to overcome this problem. The implementation of the proposed method shows substantial savings in area and power consumption with reduced delay in the computation process. In the future, we will implement the transpose convolution layer using Xilinx Artix-7 and Intel Agilex-7 FPGA boards. | PDF (Online)

Slides and Materials

FPGA Ignite 2024 Opening and Program Introduction | Slides | Video

FOSSi Foundation + Open Source Chip Design (Philipp Wagner) | Slides | Video

Cocotb (Marc-André Tétrault) | Slides | Lab Materials | Video

RISC-V Customization - Basics (Guy Lemieux and Stefan Wallentowitz) | Slides | Lab Materials | Video

Testing Instruction Extensions with Cocotb & Icarus (Stefan Wallentowitz and Philipp Wagner) | Video

RISC-V Customization - Software Analysis Tooling (identify custom instruction), Extending LLVM and Cores (Stefan Wallentowitz) | Lab Materials | Video

Reconfigurable Instruction Set Extensions using FABulous eFPGAs - when and how (introduction and hands-on tutorial) (Dirk Koch) | Slides | Tutorial | Video

Boosting the efficiency of RISC-V cores: Fine-grain multi-threading and custom instructions, from concepts to implementation (Riadh Ben Abdelhamid) | Slides | Video

OpenLane Intro and Demo (Nguyen Dao and Asma Mohsin) | Slides | Lab Materials 1 | Lab Materials 2 | Video

Playlist Recordings

Past FPGA Ignite Summer School Events

FPGA Ignite Summer School 2023, ZITI, Heidelberg University

FPGA Ignite Summer School 2022, ZITI, Heidelberg University

FPGA Ignite 2024 Award

The best project award for the FPGA Ignite 2024 Summer School and Hackathon went to the project "VGA Ignite" designed by
Hugh Squires-Parkin, Chao Qian, Florian Feltz, Mengbi Yu, Justin Cott, Al-Harith Farhad, Jan Zielasko, Qaisar Farooq, Jelle Biesmans, Asma Mohsin
Congratulations to the whole team!

Contact

Prof. Dirk Koch
Novel Computing Technologies
Universität Heidelberg
Institut für Technische Informatik (ZITI)
Im Neuenheimer Feld 368, 69120 Heidelberg
dirk.koch@ziti.uni-heidelberg.de

Dr. Riadh Ben Abdelhamid
Novel Computing Technologies
Universität Heidelberg
Institut für Technische Informatik (ZITI)
Im Neuenheimer Feld 368, 69120 Heidelberg
riadh.benabdelhamid@ziti.uni-heidelberg.de

Keynote

Keynote Lecturer 1
Guy Lemieux
VectorBlox: A Startup Story

Following the life cycle of a startup, we will explore application acceleration, business and technical requirements for success, custom instructions in Nios-II, and then extending these to MicroBlaze, MIPS, ARM, and RISC-V. An introduction to vector instructions and vector programming in RISC-V and VectorBlox MXP will also be given.

Keynote Lecturer 2
Philipp Wagner
Free and Open Source Silicon is here to stay!

Long gone are the days when open source software was a niche. Today, even the most risk-averse companies use open source software. A little bit unnoticed by some, the same success story is repeating itself for chip design right now. In this talk, Philipp will introduce the ideas around free and open source silicon, where we are today, and what a future might look like. You’ll also learn some best practices on how to make the most out of free and open source silicon.

Lecturers

Lecturer 1

Guy Lemieux

Guy is a Professor in Computer Engineering at the University of British Columbia where he teaches advanced digital design and computer systems/architecture related courses. His research focuses on improving FPGA devices and CAD tools, in particular making them easier to use and more efficient for computing tasks. His research has shown how to design FPGA interconnect to be more efficient, that CAD tool performance can be enhanced through parallelism, and that overlays are a much easier way to program FPGAs. His latest research is attempting to make machine learning and artificial intelligence applications more efficient on FPGAs through low-precision arithmetic, such as TinBiNN for binary neural networks, and using custom accelerator interfaces.

Prof. Lemieux graduated from the University of Toronto where he was part of a team that designed NUMAchine, a cache-coherent multiprocessor built from scratch using MIPS R4400 CPUs, custom PCBs and FPGAs. Throughout his career, he's designed or co-designed various soft processors and accelerators (MIPS and NIOS clones, VIPERS, VEGAS, VENICE, ORCA RISC-V, and Saturn-V) and co-founded VectorBlox Computing which developed the MXP (Matrix Processor), a tensor accelerator. Within RISC-V International, he is an elected member of the Technical Steering Committee and chair of the SoftCPU Special Interest Group. Throughout the COVID-19 pandemic, he worked with a small team from the SIG to develop new technology for managing custom instructions without namespace collisions in the ISA, resulting in Google's CFU-Playground as well as CFU interfaces in Lattice's RISC-V RX IP core and Efinix's Titanium FPGAs. After extending this further to provide virtualization and protection (like virtual memory), it is being offered as a basis specification for the new Composable Extensions (CX) Task Group which he helped launch. He was also a member of the RISC-V Vector and the early Cache Management Operations committees. Finally, he serves as a Voting Member and Editor within the Working Group for IEEE P3109 Standard for Arithmetic Formats for Machine Learning which has been releasing updates through its Interim Report.

Lecturer 2

Stefan Wallentowitz

Stefan Wallentowitz is a professor at Munich University of Applied Sciences. He holds a PhD in Electrical Engineering from Technische Universität München and an MBA from RWTH Aachen University. He is currently a member of the RISC-V Board of Directors, a Director at the Free and Open Source Silicon Foundation, a Member of the Board of the Security Network Munich, an Advisor on “chip innovation” at TUM Venture Labs and a member of the Steering Committee of the RISC-V Summit Europe held in Munich in June 2024.

Stefan will talk about identifying candidates for ISA extensions and how to integrate those into the RISC-V ecosystem.
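
As a rough illustration of what "identifying candidates" can mean, the Python sketch below counts how often two instructions execute back to back in a trace; frequent pairs are possible candidates for fusing into one custom instruction. This is a deliberately simple assumption-based example, not the tooling used in the lecture: the function name `frequent_pairs` and the trace contents are invented for illustration.

```python
from collections import Counter


def frequent_pairs(trace, top=3):
    """Return the most common adjacent mnemonic pairs in an executed trace."""
    pairs = Counter(zip(trace, trace[1:]))
    return pairs.most_common(top)


# Hypothetical trace of executed RISC-V mnemonics (e.g. from a simulator log).
trace = ["slli", "add", "lw", "slli", "add", "lw",
         "addi", "bne", "slli", "add", "lw"]

if __name__ == "__main__":
    for (first, second), count in frequent_pairs(trace):
        print(f"{first} -> {second}: {count}x")
```

In this made-up trace the shift-add-load sequence dominates, which would suggest looking at an indexed-load style instruction. A realistic analysis would also check data dependencies between the instructions and weigh the hardware cost.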

Lecturer 3

Marc-André Tétrault

Marc-André Tétrault is a professor of electrical and computer engineering at Université de Sherbrooke. He has built an expertise in the design, modeling and implementation of distributed data acquisition systems, which include digital/mixed signal ASICs, FPGAs, system on chip (SoC) and control software. His main application field involves radiation instrumentation for medical imaging systems. His current interests are developing time-of-flight (ToF) capable detectors and associated DAQ, targeting applications such as ToF-Positron Emission Tomography and ToF-Computed Tomography.
His class on Cocotb aims to provide a primer on using Python as a verification language for HDL designs. The major part of the class will be hands-on exercises, exploring how to modify the environment for a custom project, basic Cocotb features, how to set up a modern IDE to debug the test bench, and simple verification methodology concepts.

Lecturer 4

Philipp Wagner

Philipp Wagner is a verification engineer at IBM by day, and a Director at the FOSSi Foundation and cocotb maintainer by night. He’s been involved in open source software and hardware ever since he could type and still can’t get enough of it. Philipp was awarded a PhD (Dr.-Ing.) in Electrical Engineering from Technical University Munich for work on software observability in embedded systems.

Lecturer 5

Riadh Ben Abdelhamid

Riadh is a former Synopsys FPGA engineer who worked on their flagship FPGA emulation system ZEBU. In 2017, he received a Japanese Government (MEXT) Scholarship, under which he obtained his Master's and PhD degrees in Engineering (Computer Science) from the University of Tsukuba in March 2020 and 2023, respectively. His research revolves around many-core processor architectures and overlays, High-Performance Computing and reconfigurable accelerators. He is also enthusiastic about founding his own many-core processor chip start-up. Riadh is currently a postdoctoral researcher at the Novel Computing Technologies Group at Heidelberg University, Germany, where he designed the SPARKLE architecture that was implemented on a VU9P FPGA with 1,024 RISC-V Barrel Processor cores and 16,384 interleaved Hardware Threads, delivering 400 GIPS on a single FPGA device.

Lecturer 6

Nguyen Dao

Nguyen Dao obtained a Bachelor's degree in Vietnam in 2007. He began his professional journey at Renesas Electronic Vietnam, where he served as a senior hardware design engineer from 2007 to 2010. He then earned both a Master's degree (2012) and a Ph.D. degree (2017) in Sydney, Australia. In 2018, Nguyen joined The University of Manchester as a Research Associate. During this period, his focus was on the integration of memristor ReRAM on Digital Reconfiguration Systems - FPGAs. After his tenure at the university, he worked with Withsecure Ltd. for a year (2022) before joining Agile Analog Ltd. as a staff hardware design engineer. His research and work focus on ASIC designs and implementations, aiming to make significant contributions to the field of ASIC-FPGA designs and continue driving innovation in the industry.

Lecturer 7

Dirk Koch

Dirk Koch is an expert in FPGA technology. His group maintains the FABulous open-source eFPGA framework and various other FPGA-related projects.

His class will briefly introduce the basic concepts of eFPGAs and the FABulous framework. We will then investigate the methods of using ISA subsetting, hardened custom instructions, reconfigurable custom instructions and software emulated instructions. We will also learn in particular how reconfigurable custom instructions can be made available in a CPU core. With this knowledge, we will run a lab where we will define a custom eFPGA and implement custom instructions on that fabric using the open-source tools Yosys and nextpnr.
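
To make "custom instruction" concrete on the software side, the Python sketch below packs an R-type instruction word using the RISC-V custom-0 major opcode (0001011), which the base ISA reserves for non-standard extensions. It is a minimal sketch and not taken from the lab material: the helper name `encode_r_type` and the chosen funct3/funct7 values are assumptions, and what the instruction computes depends entirely on the logic mapped onto the eFPGA.

```python
CUSTOM_0 = 0b0001011  # RISC-V "custom-0" major opcode (bits 6..0)


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM_0):
    """Pack R-type fields (funct7|rs2|rs1|funct3|rd|opcode) into 32 bits."""
    fields = [(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value  # append the field on the right
    return word


if __name__ == "__main__":
    # Hypothetical custom instruction: rd = a0 (x10), rs1 = a1 (x11), rs2 = a2 (x12).
    word = encode_r_type(funct7=0, rs2=12, rs1=11, funct3=0, rd=10)
    print(f".word 0x{word:08x}")
```

The printed `.word` line can be placed in an assembly file to emit the instruction without modifying the assembler, which is often enough for a first experiment.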
