With the advent of global-scale data-processing platforms, understanding how distributed systems work is essential.
This year, the program includes the following topics:
All classes are taught in English; some include exercises and hands-on projects.
We are delighted to see the keen interest in the theory and practice of distributed computing. While there are no strict prerequisites, basic math and programming skills would enhance your ability to grasp the material.
A Journey into the World of Persistent Memory
Persistent memory is a new byte-addressable durable device included in the cache-coherence domain of processors. Since persistent memory is almost as fast as volatile memory, accessing it through a file-system interface squanders much of its performance. In this lesson, we will study how to access persistent memory efficiently without a file-system interface. This is difficult because a crash may happen at any time, and the state stored in persistent memory must remain consistent after the crash. The lesson will introduce new programming abstractions for persistent memory and show how to build them starting from simple low-level instructions. We will also study how to implement these abstractions efficiently in a practical lab (you will only need a Linux system with a C compiler).
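The lab itself uses C, but the core crash-consistency discipline — persist the data before persisting the flag that marks it valid, with an ordering point between the two — can be sketched in a few lines. The following is a minimal illustration using Java's memory-mapped files, where force() plays the role of the low-level flush-and-fence instructions; the byte layout is invented for the example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class DurableWrite {
    // Hypothetical layout: byte 0 = "valid" flag, bytes 8.. = payload.
    static final int VALID = 0, PAYLOAD = 8;

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pmem-demo", ".bin");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64);

            buf.putLong(PAYLOAD, 42L);  // 1. write the data
            buf.force();                // 2. persist it (analogue of flush + fence)
            buf.put(VALID, (byte) 1);   // 3. only now publish the valid flag
            buf.force();                // 4. persist the flag

            System.out.println(buf.get(VALID) + " " + buf.getLong(PAYLOAD));
        }
        Files.deleteIfExists(file);
    }
}
```

If a crash strikes between steps 2 and 4, recovery sees valid = 0 and ignores the half-written record; reordering steps 2 and 3 would break exactly this guarantee.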
Efficient Algorithms for Machine Learning
In this talk, I will delve into the pivotal role efficient algorithms play in machine learning, with particular emphasis on deep neural network inference and training.
Specifically, parallelism and concurrency are among the main driving forces behind efficient DNN training and inference, reducing both the resources used and the time to result. Another key aspect, for both training and inference, is the memory hierarchy and efficient communication. I will emphasize methods for asynchronous and communication-efficient distributed training, their convergence guarantees, and methods for model compression that enable faster inference. The talk is self-contained and requires only basic knowledge of mathematical and distributed computing concepts.
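To make "communication-efficient" concrete, here is a toy sketch (my own illustration, not a method from the talk) of one widely used idea: lossy gradient compression by uniform 8-bit quantization, which shrinks each gradient entry from 4 bytes to 1 before it is sent over the network.

```java
import java.util.Arrays;

public class GradientQuantization {
    // Quantize each gradient entry to 8 bits: q = round(g / scale),
    // with scale = max|g| / 127. On the wire: one float plus n bytes.
    static byte[] compress(float[] g, float[] scaleOut) {
        float max = 0f;
        for (float v : g) max = Math.max(max, Math.abs(v));
        float scale = (max == 0f) ? 1f : max / 127f;
        scaleOut[0] = scale;
        byte[] q = new byte[g.length];
        for (int i = 0; i < g.length; i++) q[i] = (byte) Math.round(g[i] / scale);
        return q;
    }

    static float[] decompress(byte[] q, float scale) {
        float[] g = new float[q.length];
        for (int i = 0; i < q.length; i++) g[i] = q[i] * scale;
        return g;
    }

    public static void main(String[] args) {
        float[] grad = {0.5f, -1.0f, 0.25f, 0.0f};
        float[] scale = new float[1];
        byte[] wire = compress(grad, scale);       // 4 bytes instead of 16
        float[] restored = decompress(wire, scale[0]);

        float maxErr = 0f;
        for (int i = 0; i < grad.length; i++)
            maxErr = Math.max(maxErr, Math.abs(grad[i] - restored[i]));
        System.out.println(maxErr < 0.01);         // quantization error stays small
    }
}
```

The interesting part, covered in the talk, is why training still converges despite this lossy step — the per-entry error is bounded by scale/2, and such bounded-error schemes admit convergence guarantees.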
Cache-Conscious Concurrent Data Structures for Near-Memory Computing
The performance gap between memory and CPU has grown exponentially. To bridge this gap, hardware architects have proposed near-memory computing (NMC), where a lightweight processor (called an NMC core) is located close to memory. Due to its proximity to memory, memory access from an NMC core is much faster than from a CPU core. New advances in 3D integration and die-stacked memory make NMC viable in the near future. Prior work has shown significant performance improvements by using NMC for embarrassingly parallel and data-intensive applications, as well as for pointer-chasing traversals in sequential data structures. However, current server machines have hundreds of cores, and algorithms for concurrent data structures exploit these cores to achieve high throughput and scalability, with significant benefits over sequential data structures. Thus, it is important to examine how NMC performs with respect to modern concurrent data structures and understand how concurrent data structures can be developed to take advantage of NMC.
This talk focuses on specific examples of cache-optimized data structures, such as skiplists and B+ trees, where lookups begin at a small number of top-level nodes and diverge to many different node paths as they move down the hierarchy. These data structures exploit a memory layout split into a host-managed portion consisting of the higher-level nodes and an NMC-managed portion consisting of the remaining lower-level nodes.
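The host/NMC split can be pictured with a toy two-tier index (my own stand-in, not the talk's actual data structure): a small host-side top level that stays hot in CPU caches, and leaf arrays standing in for the NMC-managed lower levels, where a real system would offload the search to the NMC core.

```java
import java.util.Arrays;
import java.util.TreeMap;

public class TwoTierIndex {
    // Host-managed top levels: a small index mapping the smallest key of
    // each leaf to that leaf. In a real NMC design these nodes stay cached.
    private final TreeMap<Integer, int[]> top = new TreeMap<>();

    // NMC-managed bottom: sorted leaves, modeled here as plain arrays.
    void addLeaf(int[] sortedLeaf) { top.put(sortedLeaf[0], sortedLeaf); }

    boolean contains(int key) {
        var e = top.floorEntry(key);        // host-side traversal of top levels
        if (e == null) return false;
        // In a real system this leaf search would run on the NMC core,
        // sparing the CPU the cache misses of chasing lower-level nodes.
        return Arrays.binarySearch(e.getValue(), key) >= 0;
    }

    public static void main(String[] args) {
        TwoTierIndex idx = new TwoTierIndex();
        idx.addLeaf(new int[]{1, 3, 5});
        idx.addLeaf(new int[]{10, 12, 14});
        System.out.println(idx.contains(12) + " " + idx.contains(4));
    }
}
```

The design choice mirrors the talk's observation: lookups share the few top-level nodes (worth keeping host-side) but diverge below, which is exactly where near-memory execution pays off.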
Joint work with Jiwon Choe, Andrew Crotty, Tali Moreshet, and Iris Bahar.
Modern Techniques for Data Availability and Durability
The scale and intensity of today's economic exchanges require modern web services to be always-on and responsive. One of the key concerns when building these services is the durability and availability of their data. Data must remain accessible despite network, machine, or data-center-scale outages, whether temporary or permanent. This lecture explores some of the most recent techniques to achieve this.
First, we will focus on geo-replicated state machines. Geo-replication places several copies of a logical data item across different data centers to improve access locality and fault tolerance. State-machine replication (SMR) ensures that these copies stay in sync. Recent advances in SMR focus on leaderless protocols that sidestep the availability and performance limitations of traditional Paxos-based solutions. We will detail several leaderless protocols and compare their pros and cons under different application workloads.
The second part of this lecture focuses on non-volatile main memory (NVMM). NVMM is a new tier in the memory hierarchy that offers the durability of spinning disks, near-DRAM speed, and byte addressability. We present J-NVM, a fully fledged interface to use NVMM in Java. Internally, J-NVM relies on proxy objects that mediate direct off-heap access to NVMM. We will present the internals of J-NVM and show how to use it in the context of a modern distributed data store.
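J-NVM's actual API is not reproduced here, but the proxy-object pattern it builds on can be sketched with the JDK alone: a java.lang.reflect.Proxy whose handler routes every field access to an off-heap buffer (a stand-in for an NVMM region). All names below are illustrative.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.nio.ByteBuffer;

public class OffHeapProxy {
    // Hypothetical persistent type; J-NVM's real interface may differ.
    interface Counter {
        long get();
        void set(long v);
    }

    public static void main(String[] args) {
        // Off-heap buffer standing in for a region of NVMM.
        ByteBuffer nvm = ByteBuffer.allocateDirect(8);

        InvocationHandler h = (proxy, method, a) -> {
            switch (method.getName()) {
                case "get": return nvm.getLong(0);
                case "set": nvm.putLong(0, (Long) a[0]); return null;
                default: throw new UnsupportedOperationException();
            }
        };
        Counter c = (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(), new Class<?>[]{Counter.class}, h);

        c.set(7);                     // writes go straight to the off-heap region
        System.out.println(c.get());  // reads come back from it
    }
}
```

The payoff of this indirection is that object state never lives on the garbage-collected heap, so it can survive restarts when the backing region is真 durable memory rather than a DirectByteBuffer.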
Database Isolation Levels
Transaction isolation specifications determine which database states transactions can observe during their execution, and thus which results queries are allowed to return. This isolation corresponds to the letter "I" in the famous ACID acronym. Different "ACID" systems actually provide a spectrum of isolation levels, which expose the concurrency inside the database to the application programmer to a greater or lesser extent. Examples of such levels are serializability, snapshot isolation, and read committed.
Historically, the definitions of isolation levels have often been ambiguous or tied to database implementation internals. As a result, transaction isolation remains a difficult topic, even for seasoned database professionals. This is a problem because understanding the isolation level provided by a given database is necessary to use it correctly. In this tutorial, I will explain the concept of transaction isolation and highlight pitfalls to be wary of when navigating the spectrum of isolation levels while also explaining some of the levels provided by modern database systems.
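One pitfall of the weaker levels can be shown in a few deterministic lines. The following toy (my own illustration, not any particular database's behavior) simulates the classic lost-update anomaly, which serializability rules out but weaker levels can permit.

```java
public class LostUpdate {
    public static void main(String[] args) {
        // Two "transactions" each want to add 10 to a balance of 100.
        int balance = 100;

        int t1Read = balance;      // T1 reads 100
        int t2Read = balance;      // T2 reads 100, before T1 commits
        balance = t1Read + 10;     // T1 commits: balance = 110
        balance = t2Read + 10;     // T2 commits: balance = 110 -- T1's update is lost

        System.out.println(balance);   // 110, not the expected 120
        // Under serializability this interleaving is forbidden: one
        // transaction must observe the other's write, yielding 120.
    }
}
```

Which anomalies a level permits — lost updates, write skew, non-repeatable reads — is precisely the kind of distinction the tutorial maps out.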
A Universal, Sound, and Complete Forward Reasoning Technique for Machine-Verified Proofs of Linearizability
I will present a simple, universal, sound, and complete proof method for producing machine-verifiable proofs of linearizability and strong linearizability. Universality means that our method works for any object type, soundness means that an algorithm can be proved correct by our method only if it is linearizable, and completeness means that any linearizable implementation can be proved using our method. We will demonstrate the simplicity and power of our method by producing proofs of linearizability for the Herlihy-Wing queue and Jayanti's single-scanner snapshot, as well as a proof of strong linearizability of the Jayanti-Tarjan union-find object. All three of these proofs are machine-verified by TLAPS (the Temporal Logic of Actions Proof System).
Modular Construction of Live Byzantine Consensus Protocols
In this tutorial, we will consider the question of constructing fault-tolerant distributed protocols in the presence of arbitrary (or Byzantine) process failures and timing uncertainty. This class of protocols is an attractive design choice for mission-critical systems, such as distributed ledgers, thanks to their ability to deal with a wide variety of adversarial behaviors. Our specific focus will be on Byzantine fault-tolerant (BFT) protocols for consensus and state-machine replication (SMR), which are widely used by distributed ledger systems to construct and maintain a replicated sequence of transaction blocks, or blockchain. While many existing BFT consensus and SMR protocols do a good job of guaranteeing that the blockchain is always consistent and never forks, until recently, the process of ensuring that it also makes progress (i.e., that an honest participant is eventually able to add a new block) was poorly understood. This led to implementations that were unnecessarily complex and prone to subtle bugs.
As part of the tutorial, I will introduce a new modular framework, which facilitates constructing BFT consensus and SMR protocols with well-defined progress guarantees. This is achieved by delegating the intricacies of handling timing uncertainty (which is key for ensuring progress) to a new pluggable abstraction of a view synchronizer. I will discuss the view synchronizer's properties and implementation, and then show how it can be used to obtain versions of several well-known BFT consensus and SMR protocols (such as PBFT, HotStuff, and Tendermint) that are simple, efficient, and provably correct.
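The shape of such a pluggable abstraction can be sketched as an interface; the names and signatures below are illustrative guesses, not the tutorial's actual definitions.

```java
import java.util.function.IntConsumer;

public class Views {
    // A view synchronizer: a process calls advance() when the current view
    // appears stuck (e.g., its leader timed out); the synchronizer eventually
    // moves all correct processes into a common new view and notifies them.
    interface ViewSynchronizer {
        int currentView();
        void advance();                   // ask to leave the current view
        void onNewView(IntConsumer cb);   // callback: "you entered view v"
    }

    // Trivial single-process implementation, just to show the contract;
    // the real abstraction coordinates view entry across processes.
    static class LocalSynchronizer implements ViewSynchronizer {
        private int view = 0;
        private IntConsumer cb = v -> {};
        public int currentView() { return view; }
        public void advance() { view++; cb.accept(view); }
        public void onNewView(IntConsumer c) { cb = c; }
    }

    public static void main(String[] args) {
        ViewSynchronizer sync = new LocalSynchronizer();
        sync.onNewView(v -> System.out.println("entered view " + v));
        sync.advance();   // e.g., triggered by a timeout on view 0's leader
    }
}
```

The consensus protocol built on top then only has to be safe in every view and live in any view with a correct leader; the synchronizer carries all the timing-related reasoning.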
Lincheck: A Practical Framework for Testing Concurrent Data Structures on the JVM
This talk will present Lincheck, a practical and user-friendly framework for testing concurrent algorithms on the JVM. Lincheck provides a simple and declarative way to write concurrent tests. Instead of describing how to perform the test, users specify what to test by declaring all the operations to examine, and the framework automatically handles the rest. As a result, tests written with Lincheck are concise and easy to understand. The framework automatically generates a set of concurrent scenarios, examines them using stress-testing or bounded model checking, and verifies that the results of each invocation are correct. Notably, if an error is detected via model checking, Lincheck provides an easy-to-follow trace to reproduce it, significantly simplifying bug investigation. During the talk, we will discuss the main Lincheck features through various real-world examples.
October 30 – November 2, 2023
All classes will be held at the Coral Beach Hotel & Resort. The hotel provides a discount on accommodation specifically for the school’s student participants. More details regarding this discount will be communicated by email after registration.