
Hardware/Software Co-design Methodologies for Efficient AI Systems and Applications

August 9, 2024 @ 2:30 pm - 3:30 pm PDT

Name: Mohanad Odema

Chair: Prof. Mohammad Abdullah Al Faruque

Date: August 9, 2024

Time: 2:30 PM

Location: EH 2430

Committee: Prof. Marco Levorato, Prof. Hyouk Jun Kwon

Title: Hardware/Software Co-design Methodologies for Efficient AI Systems and Applications

Abstract: The landscape of AI research is dominated by the search for powerful deep learning models and architectures that enable fascinating applications from the edge to the cloud. Indeed, we have witnessed the emergence of efficient, on-device deep learning models that facilitate smart edge applications (autonomous vehicles, AR/VR systems), as well as billion-parameter foundation models and LLMs that excel at tasks once thought achievable only through human-level understanding. At the same time, calls for more advanced hardware and systems continue to grow, both to keep pace with the scale at which deep learning workloads evolve and to facilitate sustainable, efficient model operation across the various application contexts.

This suggests a natural way to design deep learning models and their systems: namely, through hardware/software co-design methodologies that capture the interplay and mutual dependencies across the various HW/SW layers of the computing stack to guide design choices. From the algorithmic side, an awareness of the target platform’s compute capabilities and resources guides the model’s architectural and optimization choices (e.g., compression) towards maximizing performance efficiency on the target hardware at deployment time. From the hardware side, understanding deep learning workloads and computing kernels can shape future AI hardware architectures that improve efficiency from the lower levels (as seen in the trend toward customized accelerators).

As hardware and software continue to evolve, this dissertation investigates relevant emergent technologies and challenges at this unified research frontier to guide the design of future AI systems and models. The dissertation focuses on characterizing nascent design spaces, exploring various optimization opportunities, and developing new methodologies to maximize the impact of such innovations. In brief, this dissertation covers the following topics:

– Understanding the benefits of dynamic neural networks for efficient inference, and how to optimize their design for target platform deployment

– Understanding how multi-model workloads can be scheduled and co-located on multi-chip AI accelerator modules based on 2.5D chiplet technology while accounting for the workloads’ diversity, affinities, and memory access patterns

– Exploring new methodologies to maximize the impact of split computing inference in edge-cloud architectures and to improve the resource efficiency of edge devices

Short Bio: Mohanad received a B.Sc. degree in Electronics and Communications and an M.Sc. degree in Computer Engineering from Ain Shams University, Egypt. He is a PhD candidate in Computer Engineering at UCI. He was an intern at MediaTek with the AI Architecture and Algorithm team. His research focuses on HW/SW co-design solutions for efficient AI, bridging model- and system-level optimizations to enhance performance for core industry AI workloads and applications from the edge to the cloud.

Venue

EH 2430
Engineering Hall, University of California, Irvine
Irvine, CA 92697 United States