Dr. Leo

Dr. Fu Li (Leo)

LinkedIn Profile: www.linkedin.com/in/leoustc
GitHub: https://github.com/leoustc

Professional Summary

Dr. Leo is an AI Infrastructure Architect with 15+ years of experience in AI infrastructure, high-performance computing, distributed systems, and advanced networking. His work has focused on architecting and benchmarking large-scale AI systems across GPU clusters, RDMA fabrics, NVLink environments, and cloud superclusters, with hands-on experience in CUDA, MPI, FPGA, PCIe, and CXL-based architectures. He is currently experimenting with a practical approach he refers to as Guided AI Engineering, which combines progressive context and goal setting, harness engineering, and deterministic automated workflows to help AI coding systems address real-world problems and frontier engineering challenges.

Core Skills

Problem Solving • Guided AI Engineering • AI Infrastructure • Heterogeneous Computing • CUDA/MPI GPU Computing • RDMA/NVLink Fabrics • PCIe/CXL Architectures • Performance Benchmarking • Superclusters • Systems Innovation

Selected Projects

Professional Experience

  • March 2025 – present, Singapore, AI Infra Architect at Center of Excellence, Oracle
  • 2022 – 2023, Technical Steering Committee Member, Akraino, Linux Foundation
  • March 2014 – Oct 2024, Startups and Projects
    • Project: High-performance interconnection protocol using CXL and CXL Switch (RDMA over CXL)
      Directed a skilled team to resolve PCIe and CXL memory mapping challenges and developed PCIe/CXL routing modules enabling networking functionality for next-generation AI systems.
    • Project: Scalable memory pooling system using CXL Switch
      Led the hardware and logic team in creating a scalable FPGA-based memory pooling system with a two-layer networking architecture and a lightweight protocol akin to CXL 3.1.
    • Project: High-performance interconnection protocol using PCIe switch SoC (RDMA over PCIe)
      Pioneered protocols including PCIe Net (TCP/IP over PCIe Fabric) and RDMA over PCIe for up to 64 nodes, as well as an NVLink system with PCIe SoC.
    • Project: Next-generation Server with Shared NIC (AWS Graviton Servers)
      Led the hardware team designing multi-way CPU systems with a shared NIC architecture, an alternative to AWS Graviton Servers.
    • Project: High-Performance Computing for Media Creation
  • May 2012 – March 2014, Cisco Systems, Inc., Software Engineer III
    • Project: Cat4500 Switch ASIC Bringup
  • Jul 2011 – May 2012, FutureWei Technologies Inc.
    • Project: GPU-Accelerated LPM Switch Design and Implementation

Education

  • Dec. 2011, Ph.D., University of Wisconsin – Madison
  • Dec. 2008, M.S., University of Wisconsin – Madison
  • Jul. 2006, B.S., University of Science and Technology of China

Professional Affiliations and Memberships

  • Voting member of Linux Foundation Edge and Akraino Project, 2022–2023
  • Vice Director of the Film Advanced Technology Committee of CSMPTE, 2018
  • Industry Professorship at Jiangnan University, 2017–2021

Patents

No. ID Title
1 US20120290696 Method and System for Longest Prefix Matching of Variable-Sized Hierarchical Names by Treelets
2 CN20150605 Method and System for File Transfer based on Named Data Networking Caching Algorithm
3 CN114827151A Method and Device for Heterogeneous Clustered Devices and Servers Based on PCIe, CXL, and UCIe Physical Links
4 CN114745325A Method and Device for MAC in MAC Network Encoding Based on PCIe, CXL, and UCIe Physical Links
5 CN110891081A Method and Device for Packet Sending, Routing, Broadcasting and Receiving
6 CN 20150605CN Method and Device for Vehicular Networking Based on Content-Centric Networking
7 CN111027396A Method and System for Assisted Driving, Apparatus, Onboard Terminal and Cloud Server
8 CN110929087A Method and System for Audio Classification, Apparatus, Electronic Device, and Storage Medium
9 CN106708749A Method and System for Fast Searching based on Fractional Algorithms
10 CN109688204A Method and System for File Download Based on Named Data Networking, Node, and Terminal
11 CN109448684A Method and System for Intelligent Music Composition
12 CN111209098A Method and System for Intelligent Rendering Scheduling, Server, Management Node and Storage Medium
13 CN110955515A Method and System for Processing Files, Apparatus, Electronic Devices, and Storage Medium
14 CN111178151A Method and System for Recognizing Micro-Expressions in Facial Changes Based on AI Technology
15 CN106095996B Method and System for Text and Content Classification
16 CN110944034A Method and System for Web-based Resumable Transmission, Device, Electronic Device and Storage Medium
17 CN111125045A Method and System for a Lightweight ETL Processing Platform
18 2017211770803 Method and System for a Portable Mobile Video Content Accelerated Transmission Device
19 CN107819704A Method and System of Scalable Wireless Media Application for Edge Computing