PACT 2022   October 10–12, 2022

Main Conference Program

Monday, October 10, 2022

Time What Where

Registration opens

Please allow sufficient time to clear building security.

7:30–8:20 Continental Breakfast Discovery Room, DPI
8:20–8:30 Welcome from the Chairs Discovery Room, DPI
8:30–9:30 Keynote: Closing the Gap between Quantum Algorithms and Machines with Hardware-Software Co-Design Discovery Room, DPI
9:30–10:00 Coffee Break Discovery Room, DPI

Track 1: Compilers for ever

Session Chair: Nelson Amaral

  • 10:00–10:30: ReACT: Redundancy-Aware Code Generation for Tensor Expressions  (#295) T. Zhou, R. Tian, R. Ashraf, R. Gioiosa, G. Kestor, V. Sarkar
  • 10:30–11:00: Com-CAS: Effective Cache Apportioning Under Compiler Guidance  (#12) B. Chatterjee, S. Khan, S. Pande
  • 11:00–11:30: Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation  (#267) P. Gibson, J. Cano
  • 11:30–12:00: HBMax: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures  (#26) X. Chen, M. Minutoli, J. Tian, M. Halappanavar, A. Kalyanaraman, D. Tao
Discovery Room, DPI

Track 2: Optimizing the execution of GNNs

Session Chair: Antonino Tumeo

  • 10:00–10:30: Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators  (#520) M. Yoo, J. Song, H. Lee, J. Lee, N. Kim, Y. Kim, J. Lee
  • 10:30–11:00: GNNear: Accelerating Full-Batch Training of Graph Neural Networks with Near-Memory Processing  (#148) Z. Zhou, C. Li, X. Wei, X. Wang, G. Sun
  • 11:00–11:30: T-GCN: A Sampling Based Streaming Graph Neural Network System With Hybrid Architecture  (#31) C. Huan, S. Song, Y. Liu, H. Zhang, H. Liu, C. He, K. Chen, J. Jiang, Y. Wu
  • 11:30–12:00: Optimizing Aggregate Computation of Graph Neural Networks with On-GPU Interpreter-Style Programming  (#403) Z. Ji, C. Wang
Orange & Blue Room, IC
12:00–13:30 Lunch (Attendees on their own)
12:30–13:30 Steering Committee Meeting Illini Room, IC

Track 1: Getting more out of your memory

Session Chair: Jose Moreira

  • 13:30–14:00: FlatPack: Flexible Compaction of Compressed Memory  (#66) A. Eldstål-Ahrens, A. Arelakis, I. Sourdis
  • 14:00–14:30: Pavise: Integrating Fault Tolerance Support for Persistent Memory Applications  (#118) H. Qiu, S. Liu, X. Song, S. Khan, G. Pekhimenko
  • 14:30–15:00: Efficient Atomic Durability on eADR-enabled Persistent Memory  (#199) T. Zhou, Y. Du, F. Yang, X. Liao, Y. Lu
Discovery Room, DPI

Track 2: Sparse matrix computations

Session Chair: Gagan Agrawal

  • 13:30–14:00: Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs  (#416) R. Castro, D. Andrade, B. Fraguela
  • 14:00–14:30: Squaring the circle: Executing Sparse Matrix Computations on FlexTPU—a TPU-like processor  (#133) X. He, K. Chen, S. Feng, H. Kim, D. Blaauw, R. Dreslinski, T. Mudge
  • 14:30–15:00: Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations  (#139) M. Horro, L. Pouchet, G. Rodríguez, J. Tourino
Orange & Blue Room, IC
15:00–15:30 Coffee Break Discovery Room, DPI

Track 1: Graph processing

Session Chair: Vivek Sarkar

  • 15:30–16:00: Batched Graph Community Detection on GPUs  (#72) H. Chou, S. Ghosh
  • 16:00–16:30: SampleMine: A Framework for Applying Random Sampling to Subgraph Pattern Mining through Loop Perforation  (#85) P. Jiang, Y. Wei, J. Su, R. Wang, B. Wu
  • 16:30–17:00: Decoupling Scheduler, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing  (#308) S. Jeong, Y. Lee, J. Lee, H. Choi, S. Song, J. Lee, Y. Kim, H. Kim
Discovery Room, DPI

Track 2: Miscellaneous

Session Chair: Jose Moreira

  • 15:30–16:00: Tiered Hashing: Revamping Hash Indexing under a Unified Memory-Storage Hierarchy  (#58) J. Zhou, J. Wu, W. Huang, Y. Zhou, F. Wu, L. Shi, X. Zhang, K. Wang, F. Zhu, S. Li, W. Wang
  • 16:00–16:30: Understanding and Reaching the Performance Limit of Schedule Tuning on Stable Synchronization Determinism  (#145) Q. Zhao, Z. Qiu, S. Shao, X. Hui, H. Khan, G. Jin
  • 16:30–17:00: VoxelCache: Accelerating Online Mapping in Robotics and 3D Reconstruction Tasks  (#183) S. Durvasula, R. Kiguru, S. Mathur, J. Xu, J. Lin, N. Vijaykumar
Orange & Blue Room, IC
17:00–19:00 Poster Session / Reception Classroom B, DPI

Keynote: Closing the Gap between Quantum Algorithms and Machines with Hardware-Software Co-Design

Fred Chong (Department of Computer Science, University of Chicago, Chicago, IL)

Quantum computing is at an inflection point, where 127-qubit machines are deployed, and 1000-qubit machines are perhaps only a few years away. These machines have the potential to fundamentally change our concept of what is computable and demonstrate practical applications in areas such as quantum chemistry, optimization, and quantum simulation. Yet a significant resource gap remains between practical quantum algorithms and real machines. A promising approach to closing this gap is to design software that is aware of the key physical properties of emerging quantum technologies. I will illustrate this approach with some of our recent work that focuses on techniques that break traditional abstractions and inform hardware design, including compiling programs directly to analog control pulses, computing with ternary quantum bits, 2.5D architectures for surface codes, and exploiting long-distance communication and tolerating atom loss in neutral-atom machines.

Fred Chong

Fred Chong is the Seymour Goodman Professor in the Department of Computer Science at the University of Chicago and the Chief Scientist for Quantum Software at ColdQuanta. He is also Lead Principal Investigator for the EPiQC Project (Enabling Practical-scale Quantum Computing), an NSF Expedition in Computing. Chong is a member of the National Quantum Initiative Advisory Committee (NQIAC), which provides advice to the President and Secretary of Energy on the National Quantum Initiative Program. In 2020, he co-founded a quantum software company, which was acquired by ColdQuanta in 2022. Chong received his Ph.D. from MIT in 1996 and was a faculty member and Chancellor's Fellow at UC Davis from 1997 to 2005. He was also a Professor of Computer Science, Director of Computer Engineering, and Director of the Greenscale Center for Energy-Efficient Computing at UCSB from 2005 to 2015. He is a recipient of the NSF CAREER Award, the Intel Outstanding Researcher Award, and 13 best paper awards.

Tuesday, October 11, 2022

Time What Where

Registration opens

Please allow sufficient time to clear building security.

7:30–8:25 Continental Breakfast Discovery Room, DPI
8:25–8:30 PACT 2023 in Vienna: A Preview Discovery Room, DPI
8:30–9:30 Keynote: MemComputing: Fundamentals and Applications Discovery Room, DPI
9:30–10:00 Coffee Break Discovery Room, DPI
10:00–12:00 ACM SRC Poster Session Discovery Room, DPI

Track 1: Better neural networks

Session Chair: Jose Cano Reyes

  • 10:00–10:30: Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs  (#178) Y. Xu, Q. Yuan, E. Barton, R. Li, P. Sadayappan, A. Sukumaran-Rajam
  • 10:30–11:00: High-performance Architecture Aware Sparse Convolutional Neural Networks for GPUs  (#136) L. Xiang, P. Sadayappan, A. Sukumaran-Rajam
  • 11:00–11:30: Weightless Neural Networks for Efficient Edge Inference  (#256) Z. Susskind, A. Arora, I. Miranda, L. Villon, R. Katopodis, L. de Araújo, D. Dutra, P. Lima, F. França, M. Breternitz Jr., L. John
  • 11:30–12:00: Q-gym: An Equality Saturation Framework for DNN Inference Exploiting Weight Repetition  (#176) C. Fu, H. Huang, B. Wasti, C. Cummins, R. Baghdadi, K. Hazelwood, Y. Tian, J. Zhao, H. Leather
Orange & Blue Room, IC
12:00–13:30 Lunch (Attendees on their own)

Track 1: Getting more out of your GPU

Session Chair: Perry Gibson

  • 13:30–14:00: Locality-aware Optimizations for Improving Remote Memory Latency in Multi-GPU Systems  (#43) L. Belayneh, H. Ye, K. Chen, D. Blaauw, T. Mudge, R. Dreslinski, N. Talati
  • 14:00–14:30: GPUPool: A Holistic Approach to Fine-Grained GPU Sharing in the Cloud  (#50) X. Tan, P. Golikov, N. Vijaykumar, G. Pekhimenko
  • 14:30–15:00: NaviSim: A Highly Accurate GPU Simulator for AMD RDNA GPUs  (#135) Y. Bao, Y. Sun, Z. Feric, M. Shen, M. Weston, J. Abellán, T. Baruah, J. Kim, A. Joshi, D. Kaeli
Discovery Room, DPI

Track 2: Better hardware

Session Chair: Sushant Kondguli

  • 13:30–14:00: mu-grind: A Framework for Dynamically Instrumenting HLS-Generated RTL  (#158) P. Vahdatnia, A. Sharifian, R. Hojabr, A. Shriraman
  • 14:00–14:30: Athena: An Early-Fetch Architecture To Reduce On-Chip Page Walk Latencies  (#276) S. Ghahani, S. Khadirsharbiyani, J. Kotra, M. Kandemir
  • 14:30–15:00: DSDP: Dual Stream Data Prefetcher  (#204) M. He, H. Wang, K. Zhou, K. Cui, H. Yan, C. Guo, R. He
Orange & Blue Room, IC
15:00–15:30 Coffee Break Discovery Room, DPI

Track 1: Task parallelism

Session Chair: Santosh Pande

  • 15:30–16:00: Efficient task-mapping of parallel applications using a space-filling curve  (#83) O. Kwon, J. Kang, S. Lee, W. Kim, J. Song
  • 16:00–16:30: Auto-Partitioning Heterogeneous Task-Parallel Programs with StreamBlocks  (#103) M. Emami, E. Bezati, J. Janneck, J. Larus
Discovery Room, DPI

Track 2: Optimization

Session Chair: Nicolas Agostini

  • 15:30–16:00: Optimizing Regular Expressions via Rewrite-Guided Synthesis  (#127) J. McClurg, M. Claver, J. Garner, J. Vossen, J. Schmerge, M. Belviranli
  • 16:00–16:30: Combining Run-time Checks and Compile-time Analysis to Improve Control Flow Auto-Vectorization  (#120) B. Liu, A. Laird, W. Tsang, B. Mahjour, M. Dehnavi
Orange & Blue Room, IC

Travel to boat dock

The dock is a 30-minute walk from DPI. Please make sure to allow sufficient time.
(Attendees on their own)
17:00–20:30 Banquet / Excursion: Architecture Boat Tour (boarding starts 17:15, vessel departs 17:30 sharp) Wendella West Dock 4

Keynote: MemComputing: Fundamentals and Applications

Massimiliano Di Ventra (Department of Physics, University of California San Diego, La Jolla, CA)

MemComputing is a new physics-based approach to computation that employs time non-locality (memory) to both process and store information in the same physical location. (M. Di Ventra, MemComputing: Fundamentals and Applications, Oxford University Press, 2022.) Its digital version is designed to solve combinatorial optimization problems. A practical realization of digital memcomputing machines (DMMs) can be accomplished via circuits of non-linear dynamical systems with memory, engineered so that periodic orbits and chaos are avoided. A given logic (or algebraic) problem is first mapped into this type of dynamical system, whose point attractors represent the solutions of the original problem. A DMM then finds the solution via a succession of elementary avalanches (instantons) whose role is to eliminate configurations of logical inconsistency ("logical defects") from the circuit. I will discuss the physics behind MemComputing and show many examples of its applicability to various combinatorial optimization problems, Machine Learning, and Quantum Mechanics, demonstrating its advantages over traditional approaches and even quantum computing. Work supported by DARPA, DOE, NSF, CMRR, and MemComputing, Inc.

Massimiliano Di Ventra

Massimiliano Di Ventra obtained his undergraduate degree in Physics summa cum laude from the University of Trieste (Italy) in 1991 and did his Ph.D. studies at the Swiss Federal Institute of Technology in Lausanne from 1993 to 1997. He is now a professor of Physics at the University of California, San Diego. Di Ventra's research interests are in condensed-matter theory and unconventional computing. He has been invited to deliver more than 300 talks worldwide on these topics. He has published more than 200 papers in refereed journals and 4 textbooks, and holds 7 granted patents (3 foreign). He is a fellow of the IEEE, the American Physical Society, and the Institute of Physics, and a foreign member of Academia Europaea. In 2018 he was named a Highly Cited Researcher by Clarivate Analytics; he is the recipient of the 2020 Feynman Prize for theory in Nanotechnology and a 2022 IEEE Nanotechnology Council Distinguished Lecturer. He is the co-founder of MemComputing, Inc.

Wednesday, October 12, 2022

Time What Where

Registration opens

Please allow sufficient time to clear building security.

7:30–8:30 Continental Breakfast Discovery Room, DPI
8:30–9:30 Keynote: AI Acceleration: Co-optimizing Algorithms, Hardware, and Software Discovery Room, DPI

Talks: ACM SRC Finalists

  • Understanding Correlated Error Events in Quantum Computers Michael Schleppy & Arpan Gupta (undergrad)
  • Independent Tenancy Model Boyang Wang (undergrad)
  • A GPU Acceleration Flow for Parallel RTL Simulation and Hardware Testing Dian-Lun Lin (grad)
  • SuperB-NoC: A Superconducting Buffering NoC Rhys Gretsch (grad)
  • Automatically Translating Non-Affine Codes Avery Laird (grad)
Discovery Room, DPI
10:30–11:00 Coffee Break Discovery Room, DPI

Track 1: GPU algorithms

Session Chair: Jose Moreira

  • 11:00–11:30: Parallelizing Neural Network Models Effectively on GPU by Implementing Reductions Atomically  (#78) J. Zhao, C. Bastoul, Y. Yi, J. Hu, W. Nie, R. Zhang, Z. Geng, C. Li, T. Tachon, Z. Gan
  • 11:30–12:00: GAP: GPU Adaptive In-situ Parallel Analytics  (#114) H. Xing, G. Agrawal, R. Ramnath
  • 12:00–12:30: A GPU Multiversion B-Tree  (#258) M. Awad, S. Porumbescu, J. Owens
Discovery Room, DPI

Track 2: Portable performance

Session Chair: P. Sadayappan

  • 11:00–11:30: Breaking the Vendor Lock: Performance Portable Programming Through OpenMP as Target Independent Runtime Layer  (#312) J. Doerfert, M. Jasper, J. Huber, K. Abdelaal, G. Georgakoudis, T. Scogland, K. Parasyris
  • 11:30–12:00: BenchPress: A Deep Active Benchmark Generator  (#10) F. Tsimpourlas, P. Petoumenos, M. Xu, C. Cummins, K. Hazelwood, A. Rajan, H. Leather
  • 12:00–12:30: Collage: Seamless Integration of Deep Learning Backends with Automatic Placement  (#52) B. Jeon, S. Park, P. Liao, S. Xu, T. Chen, Z. Jia
Orange & Blue Room, IC
12:30–12:45 Conference Closing Discovery Room, DPI

Keynote: AI Acceleration: Co-optimizing Algorithms, Hardware, and Software

Vijayalakshmi Srinivasan (IBM Research, Yorktown Heights, NY)

The combination of growth in compute capabilities and availability of large datasets has led to a rebirth of deep learning. Deep Neural Networks (DNNs) have become state-of-the-art in a variety of machine learning tasks spanning domains across vision, speech, and machine translation. Deep Learning (DL) achieves high accuracy in these tasks at the expense of hundreds of ExaOps of computation. Hardware specialization and acceleration are key enablers for improving the operational efficiency of DNNs, in turn requiring synergistic cross-layer design across algorithms, hardware, and software.

In this talk I will present the holistic approach adopted in the design of a multi-TOPs AI hardware accelerator. Key advances at the AI algorithm/application level exploit approximate computing techniques to derive low-precision DNN models that maintain the same level of accuracy. Hardware performance-aware design space exploration is critical during compilation to map DNNs with diverse computational characteristics systematically and optimally while preserving familiar programming and user interfaces. The opportunity to co-optimize the algorithms, hardware, and software provides the roadmap to continue to deliver superior performance over the next decade.

Vijayalakshmi Srinivasan

Viji Srinivasan is a Distinguished Research Staff Member and a manager of the accelerator architectures and compilers group at the IBM T.J. Watson Research Center in Yorktown Heights. At IBM, she has worked on various aspects of data management including energy-efficient processor designs, microarchitecture of the memory hierarchies of large-scale servers, cache coherence management of symmetric multiprocessors, accelerators for data analytics applications and more recently end-to-end accelerator solutions for AI. Many of her research contributions have been incorporated into IBM's Power and System-z Enterprise-class servers.

Important Dates and Deadlines

Conference Papers:

  • Abstracts: April 18, 2022
  • Full Papers: April 25, 2022
  • Round 1 Rebuttal: June 6–9, 2022
  • Round 2 Rebuttal: July 11–14, 2022
  • Author Notification: July 29, 2022
  • Camera Ready Papers: August 26, 2022


Posters:

  • Poster Submission Deadline: September 1, 2022
  • Author Notification: September 15, 2022
  • Extended Abstract: September 29, 2022
  • Poster Session: October 10, 2022

ACM Student Research Competition:

  • Abstract Submission Deadline: September 8, 2022
  • Author Notification: September 16, 2022
  • SRC Poster Session: October 11, 2022
  • SRC Finalist Presentations: October 12, 2022

Student Travel Awards:

  • Application Deadline: October 5, 2022

Workshops and Tutorials:

  • Workshops/Tutorials: October 8–9, 2022

Conference: October 10–12, 2022

IEEE Computer Society