The school covers both fundamental and state-of-the-art topics on the theory and practice of distributed systems.
Speakers: Gael Thomas, Dan Alistarh, Gregory Chockler, Alexey Gotsman, Pierre Sutra, Maurice Herlihy, Prasad Jayanti, Nikita Koval, Sergio Rajsbaum, and Panagiota Faturu.
With the advent of global-scale data-processing platforms, understanding how distributed systems work is essential.
This year, the program includes the following topics:
All classes are in English, some including exercises and hands-on projects.
We are delighted to see the keen interest in the theory and practice of distributed computing. While there are no strict prerequisites, basic math and programming skills would enhance your ability to grasp the material.
A journey into the World of Persistent Memory
Persistent memory is a new byte-addressable durable device included in the cache-consistency domain of processors. Since persistent memory is almost as efficient as volatile memory, using a file system interface to access persistent memory is inefficient. In this lesson, we will study how we can efficiently access persistent memory without using a file system interface. This is difficult because a crash may happen at any time, and the state stored in persistent memory has to remain consistent after the crash. The lesson will introduce new programming abstractions for persistent memory, and present how we can build these abstractions by starting from simple low-level instructions. We will also study how we can implement these abstractions efficiently in a practical lab (you will only need a Linux system with a C compiler).
Efficient Algorithms for Machine Learning
In this talk, I will delve into the pivotal role efficient algorithms play in machine learning, with particular emphasis on deep neural network inference and training.
Specifically, parallelism and concurrency act as some of the main driving forces behind efficient DNN training and inference, reducing both the resources used and time to result. Another pivotal aspect, for both training and inference tasks, is given by the memory hierarchy and efficient communication. I will emphasize methods for asynchronous and communication-efficient distributed training, convergence guarantees, and methods for model compression to enable faster inference. The talk is self-contained and only requires basic knowledge of mathematical and distributed computing concepts.
Cache-Conscious Concurrent Data Structures for Near-Memory Computing
The performance gap between memory and CPU has grown exponentially. To bridge this gap, hardware architects have proposed near-memory computing (NMC), where a lightweight processor (called an NMC core) is located close to memory. Due to its proximity to memory, memory access from an NMC core is much faster than from a CPU core. New advances in 3D integration and die-stacked memory make NMC viable in the near future. Prior work has shown significant performance improvements by using NMC for embarrassingly parallel and data-intensive applications, as well as for pointer-chasing traversals in sequential data structures. However, current server machines have hundreds of cores, and algorithms for concurrent data structures exploit these cores to achieve high throughput and scalability, with significant benefits over sequential data structures. Thus, it is important to examine how NMC performs with respect to modern concurrent data structures and understand how concurrent data structures can be developed to take advantage of NMC.
This talk focuses on specific examples of cache-optimized data structures, such as skiplists and B+ trees, where lookups begin at a small number of top-level nodes, and diverge to many different node paths as they move down the hierarchy. These data structures expoint a memory structure split into a host-managed portion consisting of higher-level nodes and an NMP-managed portion consisting of the remaining lower-level nodes.
Joint work with Jiwon Choe, Andriew Crotty, Tali Moreshet, and Iris Bahar.
Modern Techniques for Data Availability and Durability
The scale and intensity of today's economic exchanges require modern web services to be always-on and responsive. One of the key concerns when building these services is the durability and availability of their data. Data must remain accessible despite network, machine, or data center-scale outages, either temporary or permanent. This lecture explores some of the most recent techniques to achieve this.
First, we will focus on geo-replicated state machines. Geo-replication places several copies of a logical data item across different data centers to improve access locality and fault-tolerance. State-machine replication (SMR) ensures that these copies stay in sync. Recent advances in SMR focus on leaderless protocols that sidestep the availability and performance limitations of traditional Paxos-based solutions. We will also detail several leaderless protocols and compare their pros and cons under different applicative workloads.
The second part of this lecture focuses on non-volatile main memory (NVMM). NVMM is a new tier in the memory hierarchy that offers the durability of spinning disks, near-DRAM speed, and byte addressability. We present J-NVM, a fully-fledged interface to use NVMM in Java. Internally, J-NVM relies on proxy objects that intermediate direct off-heap access to NVMM. We will present the internals of J-NVM and how to use it in the context of a modern distributed data store.
Database Isolation Levels
Transaction isolation specifications determine which database states transactions can observe during their execution, and thus, which results queries are allowed to return. This isolation corresponds to the letter "I" in the famous ACID acronym. Different "ACID" systems actually provide a spectrum of isolation levels, which expose the concurrency inside the database to the application programmer to a lesser or greater extent. Examples of such levels are serializability, snapshot isolation, and read committed.
Historically, the definitions of isolation levels have often been ambiguous or tied to database implementation internals. As a result, transaction isolation remains a difficult topic, even for seasoned database professionals. This is a problem because understanding the isolation level provided by a given database is necessary to use it correctly. In this tutorial, I will explain the concept of transaction isolation and highlight pitfalls to be wary of when navigating the spectrum of isolation levels while also explaining some of the levels provided by modern database systems.
A Universal, Sound, and Complete Forward Reasoning Technique for Machine-Verified Proofs of Linearizability
I will present a simple, universal, sound, and complete proof of methods for producing machine-verifiable proofs of linearizability and strong linearizability. Universality means that our method works for any object type, soundness means that an algorithm can be proved correct by our method only if it is linearizable, and completeness means that any linearizable implementation can be proved using our method. We will demonstrate the simplicity and power of our method by producing proofs of linearizability for the Herlihy-Wing queue and Jayanti's single-scanner snapshot, as well as a proof of strong linearizability of the Jayanti-Tarjan union-find object. All three of these proofs are machine-verified by TLAPS (the Temporal Logic of Actions Proof System).
Modular Construction of Live Byzantine Consensus Protocols
In this tutorial, we will consider the question of constructing fault-tolerant distributed protocols in the presence of arbitrary (or Byzantine) process failures and timing uncertainty. This class of protocols is an attractive design choice for mission-critical systems, such as distributed ledgers, thanks to their ability to deal with a wide variety of adversarial behaviors. Our specific focus will be on Byzantine fault-tolerant (BFT) protocols for consensus and state-machine replication (SMR), which are widely used by the distributed ledger systems to construct and maintain a replicated sequence of transaction blocks or blockchain. While many existing BFT consensus or SMR protocols do a good job of guaranteeing that the blockchain is always consistent and never forks, until recently, the process of ensuring that it also makes progress (i.e. an honest participant is eventually able to add a new block) has been poorly understood. This led to implementations that were unnecessarily complex and prone to subtle bugs.
As part of the tutorial, I will introduce a new modular framework, which facilitates constructing BFT consensus and SMR protocols with well-defined progress guarantees. This is achieved by delegating the intricacies of handling timing uncertainties (which is key for ensuring progress) into a new pluggable abstraction of a view synchronizer. I will discuss the view synchronizer properties and its implementation, and then show how it can be used to obtain versions of several well-known BFT consensus and SMR protocols (such as PBFT, HotStuff, and Tendermint), which are simple, efficient, and provably correct.
Lincheck: A Practical Framework for Testing Concurrent Data Structures on the JVM
This talk will present Lincheck, a practical and user-friendly framework for testing concurrent algorithms on the JVM. Lincheck provides a simple and declarative way to write concurrent tests. Instead of describing how to perform the test, users specify what to test by declaring all the operations to examine, and the framework automatically handles the rest. As a result, tests written with Lincheck are concise and easy to understand. The framework automatically generates a set of concurrent scenarios, examines them using stress-testing or bounded model checking, and verifies that the results of each invocation are correct. Notably, if an error is detected via model checking, Lincheck provides an easy-to-follow trace to reproduce it, significantly simplifying bug investigation. During the talk, we will discuss the main Lincheck features through various real-world examples.
On the Origins of Blockchains
Blockchains have served as decentralized platforms to transform the way we think about databases, distributed systems, and finance in a world where people can interact without knowing each other and across borders without government intermediaries. Nevertheless, a historic perspective of blockchains can be traced back to the birth of decentralized systems in the 1970s, with roots in research in cryptology, distributed systems, and databases. This talk will examine the fascinating history of how new notions were invented along the way, requiring the design of new algorithms and the understanding of scientific principles about what can and cannot be done by a collection of computers communicating with each other.
Multi-Version Concurrent Data Structures
Multi-versioning is widely used in databases, transactional memory, and other domains. This talk will examine the power of using multi-versioning to support multi-point queries on top of concurrent data structures.
Any system that maintains multiple versions of each object needs a way of efficiently reclaiming them. In this talk, you’ll learn theoretically and practically efficient schemes for multi-version garbage collection.
October 30 – November 2, 2023
Paphos, Cyprus
All classes will be held at the Coral Beach Hotel & Resort. The hotel provides a discount on accommodation specifically for the school’s student participants. More details regarding this discount will be communicated by email after registration.
Registration completed.
About school
The school will take place from October 30 – November 2, 2023.
The classes will be held daily from 10 am until the evening. On one of the days, a social program will be planned for the morning. More specific information on the schedule will be provided closer to the start of the school.
Absolutely anyone who is interested can participate. We are delighted to see keen interest in the theory and practice of distributed computing. There will be no entrance exams.
The fee includes two coffee breaks per day, one dinner for all participants, and one social activity.
Slides from presentations will be published. We will also do our best to upload videos.
All school activities will be held at the Coral Beach Hotel & Resort in Paphos, Cyprus.
Registration fees for school participation are:
Students: €100
Non-students: €200
While there are no strict prerequisites, having basic math and programming skills would enhance your ability to grasp the material.
No, but a special offer for accommodation in a double room at the Coral Beach Hotel & Resort will be available to all participants. Each participant will receive this information after registration, and we will also post a spreadsheet to help with searching for roommates.
Please fill out a short form on the main page. After submitting the form, you will receive information on how to pay the registration fee. We will also provide details regarding the accommodation options at the hotel where the school will be held, as well as other options available to the school’s participants.
About Cyprus
From the UNESCO World Heritage Sites in Paphos to the Troodos Mountain hiking trails, Cyprus offers travelers a plethora of historic sites, beautiful beaches, and stunning landscapes. For a comprehensive list of attractions, visit the official Cyprus Tourism Portal.
You should plan your trip to land at an airport in the southern part of Cyprus, which is under the control of the Republic of Cyprus. These airports are Larnaca International Airport and Paphos International Airport. Entry into the Republic of Cyprus via the northern part of the island is considered illegal by Cypriot authorities. Please plan your travels accordingly.
The options to get from Larnaca International Airport to the Coral Beach Hotel & Resort are similar to those from Paphos. The primary difference is the travel time, as Larnaca is further from the Coral Beach Hotel & Resort than Paphos. Please consider this when planning your journey and check the most recent schedules and prices online before your arrival.
Cyprus is part of the European Union, and EU roaming regulations apply, meaning citizens of EU countries can use their mobile phones at domestic prices. If you're not from an EU country or if your plan doesn't include roaming, you might consider getting an eSIM from a universal service provider ahead of your trip. As always, it's a good idea to check with your home carrier about any potential roaming charges before traveling.
Visa requirements vary by nationality. Many countries, including EU members, the UK, USA, Canada, and Australia, typically do not require a visa for short stays. However, always check the current visa policies for your country. If a visa is required, obtaining it is the participant's responsibility, but we can provide a participation confirmation letter if needed. Check the Cyprus Ministry of Foreign Affairs website for the most up-to-date information.
There are several options to get from Paphos International Airport to the Coral Beach Hotel & Resort:
– Taxi: The most convenient option is to take a taxi. Taxis can be found directly at the airport, or you can book one in advance via online services. The journey typically takes about 30 minutes.
– Bus: There are bus routes connecting the airport with the city center of Paphos. From the city center, you can then take another bus to the hotel. You can check the bus routes and schedules on the Cyprus By Bus website.
– Car rental: If you have a driver’s license, you can rent a car through any available online service. Please bear in mind that this should be arranged in advance.
While most places in Cyprus accept credit cards, we recommend having some cash on hand for small expenses, such as public transportation or small shops and cafes that may not accept cards. It's also good to be prepared in case of any unforeseen circumstances. Please note that the currency in Cyprus is the Euro (€).
If you have other questions, please write to us at info@sptdc.org
Connect with us