Cheng Tan
Google & ASU
I am currently a software engineer at Google working on the Edge TPU machine learning compiler. I am also a faculty member in the School of Electrical, Computer and Energy Engineering at ASU. I received my Ph.D. from the National University of Singapore (co-supervised by Prof. Tulika Mitra and Prof. Li-Shiuan Peh) and my B.E. from Shandong University, both in Computer Science. I previously worked at Microsoft, Pacific Northwest National Laboratory, and Cornell University, on the Brainwave Machine Learning Accelerator/Compiler, HW/SW co-design, and Network-on-Chip, respectively. I have published first-author papers at all the top-tier architecture conferences. My research interests include:
- Machine Learning Compilation
- Many-Core Architecture
- Hardware/Software Co-Design
- Reconfigurable Accelerator
- Network-on-Chip
I am dedicated to democratizing domain-specific reconfigurable acceleration, aiming at a push-button solution for compilation, architecture design, and synthesis. Most of my research is open source at https://github.com/tancheng. Let me know if you are interested!
Selected Publications
- [ASPLOS] Enhancing CGRA Efficiency through Aligned Compute and Communication Provisioning. Zhaoying Li, Pranav Dangi, Chenyang Yin, Thilini Kaushalya Bandara, Rohan Juneja, Cheng Tan, Tulika Mitra. ACM International Conference on Architectural Support for Programming Languages and Operating Systems. Rotterdam, The Netherlands, April 2025.
- [MICRO] ICED: An Integrated CGRA Framework Enabling DVFS-Aware Acceleration. Cheng Tan, Miaomiao Jiang, Deepak Patil, Yanghui Ou, Zhaoying Li, Lei Ju, Tulika Mitra, Hyunchul Park, Antonino Tumeo, Jeff (Jun) Zhang. IEEE/ACM International Symposium on Microarchitecture. Austin, TX, Nov 2024.
- [IEEE Micro] Bridging Python to Silicon: The SODA Toolchain. Nicolas Bohm Agostini, Serena Curzel, Jeff (Jun) Zhang, Ankur Limaye, Cheng Tan, Vinay Amatya, Marco Minutoli, Vito Giovanni Castellana, Joseph Manzano, David Brooks, Gu-Yeon Wei, Antonino Tumeo. IEEE Micro, 2022. Best Paper Award.
- [HPCA] DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs. Cheng Tan, Nicolas Bohm Agostini, Tong Geng, Chenhao Xie, Jiajia Li, Ang Li, Kevin Barker, Antonino Tumeo. The 28th IEEE International Symposium on High-Performance Computer Architecture, Seoul, South Korea, February 2022.
- [ICCD] DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications. Cheng Tan, Tong Geng, Chenhao Xie, Nicolas Bohm Agostini, Jiajia Li, Ang Li, Kevin Barker, Antonino Tumeo. The 39th IEEE International Conference on Computer Design, October 2021. Best Paper Award.
- [ISCA] Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables. Cheng Tan, Manupa Karunaratne, Tulika Mitra, Li-Shiuan Peh. 45th ACM/IEEE International Symposium on Computer Architecture, June 2018.
- [CASES] LOCUS: Low-Power Customizable Many-Core Architecture for Wearables. Cheng Tan, Aditi Kulkarni, Vanchinathan Venkataramani, Manupa Karunaratne, Tulika Mitra, Li-Shiuan Peh. ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, October 2016. Best Paper Nomination.
Open-Source Projects
- CGRA-Flow, an integrated framework for compilation, exploration, synthesis, and development of spatial accelerators.
- CGRA-Mapper, an LLVM pass that generates CDFGs and maps them onto a customizable CGRA.
- MLIR-CGRA, MLIR dialects/passes to enable the efficient acceleration of ML models on CGRAs.
- OpenCGRA, a CGRA framework for modeling, testing, and evaluating CGRAs.
- VectorCGRA, an updated version of OpenCGRA, which supports customized/fusible vectorization in the tiles.
- PyMTL3-net, a unified Network-on-Chip framework.
- ARENA, an asynchronous data-centric programming model.
- IoT-kernels, an IoT application benchmark suite.
Academic Service
- Program Committee: HPCA’25/’24, MICRO’22, ICCAD’24/’23/’22/’21, ISPASS’25, CODES+ISSS’24/’23/’22, ICCD’23/’22/’21, RAW’24, CGRA4HPC’24/’23/’22, CFW’24.
- External Review Committee: ISCA’24, MICRO’24, HPCA’22, ASPLOS’22.
- Artifact Evaluation Committee: PPOPP’21, MICRO’21.
- Session Chair: MICRO’24, ICCAD’23/’22/’21, ICCD’21/’19, ISQED’24.
- Journal Reviewer: CAL’24, TC’22, TECS’23/’21, MicroSI’22/’21, TSUSC’21, TCAD’21, TPDS’23/’21/’20, TACO’23/’21, TNNLS’21/’20, TVLSI’24/’23/’22/’21/’19, PARCO’21/’20, JSA’23, JCSC’24, SUSCOM’22.
- Secondary Reviewer: IPDPS’22, MLBench’22, SC’21, LCTES’21, ICS’20, FPT’18, ICPADS’18, DAC’17, ISCA’17, CASES’16, MICRO’16.
- Student Volunteer: ASP-DAC’14.
Teaching
- CS4223 Multi-Core Architectures, Teaching Assistant, School of Computing, National University of Singapore, Fall’15.
Awards and Honors
- Best Paper Award, IEEE Micro, 2022.
- Outstanding Performance Award, PNNL, 2022.
- Best Paper Award, The 39th IEEE International Conference on Computer Design (ICCD), 2021.
- Outstanding Postdoc, PNNL, 2020.
- Best Paper Nomination, ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), 2016.
- NUS Research Scholarship, National University of Singapore, 2013.
- Excellent Undergraduate Thesis Award, Shandong University, 2013.
- National Scholarship, China, 2012.
- Google Scholarship, Google, China, 2011.
- The First Prize Scholarship, Shandong University, 2010.
Talks
- ICED: An Integrated CGRA Framework Enabling DVFS-Aware Acceleration, MICRO’24.
- VecPAC: A Vectorizable and Precision-Aware CGRA, ICCAD’23.
- ASAP: Automatic Synthesis of Area-Efficient and Precision-Aware CGRAs, ICS’22.
- DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs, HPCA’22.
- DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications, ICCD’21.
- Democratizing Coarse-Grained Reconfigurable Arrays, NUS’21.
- OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays, ASAP’21.
- AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators, DATE’21.
- OpenCGRA: An Open-Source Unified Framework for Modeling, Testing, and Evaluating CGRAs, CIRCT’20.
- OpenCGRA: An Open-Source Unified Framework for Modeling, Testing, and Evaluating CGRAs, ICCD’20.
- PyOCN: A Unified Framework for Modeling, Testing, and Evaluating On-Chip Networks, ICCD’19.
- Low-Power Many-Core Architectures for the Next-Generation Wearables, Cornell’19.
- Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables, ISCA’18.
- LOCUS: Low-Power Customizable Many-Core Architecture for Wearables, ESWEEK’16.
- Approximation-Aware Scheduling on Heterogeneous Multi-core Architectures, ASP-DAC’15.