The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. Commonly used log archival and compression tools like Gzip provide high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. In the Ethereum network, decentralized Ethereum clients reach consensus through transitioning to the same blockchain states according to the Ethereum specification. Poor data locality hurts an application's performance. We present application studies for 8 applications, improving requests-per-second (RPS) by 7.7% and reducing RAM usage 2.4%. Consensus bugs are extremely rare but can be exploited for network split and theft, which cause reliability and security-critical issues in the Ethereum ecosystem. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. Pollux simultaneously considers both aspects. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. Here, we focus on hugepage coverage. For conference information, . Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. The program co-chairs will use this information at their discretion to preserve the anonymity of the review process without jeopardizing the outcome of the current OSDI submission. Zeph executes privacy-adhering data transformations in real-time and scales to thousands of data sources, allowing it to support large-scale low-latency data stream analytics. Log search and log archiving, despite being critical problems, are mutually exclusive. This is the first OSDI in an odd year as OSDI moves to a yearly cadence. The copyback-aware block allocation considers different copy costs at different copy paths within the SSD. With her students, she had led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots, and mobile service robots. The paper review process is double-blind. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. This distinction forces a re-design of the scheduler. Collaboration: You have a collaboration on a project, publication, grant proposal, program co-chairship, or editorship within the past two years (December 2018 through March 2021). In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Concretely, Dorylus is 1.22 faster and 4.83 cheaper than GPU servers for massive sparse graphs. Foreshadow was chosen as an IEEE Micro Top Pick. This talk will discuss several examples with very different solutions. Authors must make a good faith effort to anonymize their submissions, and they should not identify themselves or their institutions either explicitly or by implication (e.g., through the references or acknowledgments). The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. Session Chairs: Sebastian Angel, University of Pennsylvania, and Malte Schwarzkopf, Brown University, Ishtiyaque Ahmad, Yuntian Yang, Divyakant Agrawal, Amr El Abbadi, and Trinabh Gupta, University of California Santa Barbara. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 1416, 2021. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. See the USENIX Conference Submissions Policy for details. With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. Nico Lehmann and Rose Kunkel, UC San Diego; Jordan Brown, Independent; Jean Yang, Akita Software; Niki Vazou, IMDEA Software Institute; Nadia Polikarpova, Deian Stefan, and Ranjit Jhala, UC San Diego. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. Please identify yourself as a presenter and include your mailing address in your email. Compared to a state-of-the-art fuzzer, Fluffy improves the fuzzing throughput by 510 and the code coverage by 2.7 with various optimizations: in-process fuzzing, fuzzing harnesses for Ethereum clients, and semantic-aware mutation that reduces erroneous test cases. Advisor: You have a past or present association as thesis advisor or advisee. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. Furthermore, by combining SanRazor with an existing sanitizer reduction tool ASAP, we show synergistic effect by reducing the runtime cost to only 7.0% with a reasonable tradeoff of security. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. If you have any questions about conflicts, please contact the program co-chairs. We built a functional NFSv3 server, called GoNFS, to use GoJournal. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. Manuela will present examples and discuss the scope of AI in her research in the finance domain. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. Just using Lambdas on top of CPU servers offers up to 2.75 more performance-per-dollar than training only with CPU servers. KEVIN combines a fast, lightweight, and POSIX compliant file system with a key-value storage device that performs in-storage indexing. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. The wire-to-wire RPC response time through the nanoPU is just 69ns, an order of magnitude quicker than the best-of-breed, low latency, commercial NICs. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh . Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses. First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. Novel system designs, thorough empirical work, well-motivated theoretical results, and new application areas are all . PET discovers and applies program transformations that improve computation efficiency but only maintain partial functional equivalence. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. Camera-ready submission (all accepted papers): 2 April 2021; Main conference program: 27-28 April 2021; All deadline times are . Evaluations show that Vegito can perform 1.9 million TPC-C NewOrder transactions and 24 TPC-H-equivalent queries per second simultaneously, which retain the excellent performance of specialized OLTP and OLAP counterparts (e.g., DrTM+H and MonetDB). Grand Rapids, Michigan, United States . All papers will be available online to registered attendees before the conference. VLDB 2021: Venue Tivoli Hotel & Congress Center Arni Magnussons Gade 2 1577 Copenhagen, Denmark +45 3268 4300 In-person attendees can purchase tickets for the park / gardens with a 15% discount, which is a special offer by Tivoli Hotel & Congress Center to VLDB 2021 attendees. Notification of conditional accept/reject for revisions: 3 March 2022. 64 papers accepted out of 341 submitted. For conference information, see: . Alas, existing profiling techniques incur high overhead when used to identify data locality problems and cannot be deployed in production, where programs may exhibit previously-unseen performance problems. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. We implement DeSearch for two existing decentralized services that handle over 80 million records and 240 GBs of data, and show that DeSearch can scale horizontally with the number of workers and can process 128 million search queries per day. Accepted papers will be allowed 14 pages in the proceedings, plus references. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Based on this observation, P3 proposes a new approach for distributed GNN training. Submitted papers must be no longer than 12 single-spaced 8.5 x 11 pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7 wide x 9 deep. Second, it innovates on the underlying cryptographic machinery and constructs a new private information retrieval scheme, FastPIR, that reduces the time to process oblivious access requests for mailboxes. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. We also propose two file system techniques for ZNS+-aware LFS. An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99-th percentile message latency of 726 msa 7 improvement over a prior system for text messaging in the same threat model. USENIX NSDI, 2021 Acceptance Rate: 15.99% Fluid: Resource-Aware Hyperparameter Tuning Engine P. Yu*, J. Liu*, M. Chowdhury (*Equal contribution) MLSys, 2021 Acceptance Rate: 23.53% NetLock: Fast, Centralized Lock Management Using Programmable Switches Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, X. Jin ACM SIGCOMM, 2020 Acceptance Rate: 21.6% When registering your abstract, you must provide information about conflicts with PC members. Her robot soccer teams have been RoboCup world champions several times, and the CoBot mobile robots have autonomously navigated for more than 1,000km in university buildings. The novel aspect of the nanoPU is the design of a fast path between the network and applications---bypassing the cache and memory hierarchy, and placing arriving messages directly into the CPU register file. Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. We propose Marius, a system for efficient training of graph embeddings that leverages partition caching and buffer-aware data orderings to minimize disk access and interleaves data movement with computation to maximize utilization. Taking place in Carlsbad, CA from 11-13 July, OSDI is a highly selective flagship conference in computer science, especially on the topic of computer systems. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while just incurring little performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. sosp ACM Symposium on Operating Systems Principles. This approach misses possible optimization opportunities as transformations that only preserve equivalence on subsets of the output tensors are excluded. Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. Table of Contents | In particular, responses must not include new experiments or data, describe additional work completed since submission, or promise additional work to follow. How can we design systems that will be reliable despite misbehaving participants? First, GNNAdvisor explores and identifies several performance-relevant features from both the GNN model and the input graph, and use them as a new driving force for GNN acceleration. However, the existing one-size-fits-all GNN implementations are insufficient to catch up with the evolving GNN architectures, the ever-increasing graph size, and the diverse node embedding dimensionality. (Visa applications can take at least 30 working days to process.) Papers must be in PDF format and must be submitted via the submission form. Writing a correct operating system kernel is notoriously hard. Message from the Program Co-Chairs. Her specialties include network routing protocols and network security. In some cases, the quality of these artifacts is as important as that of the document itself. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. All submissions will be treated as confidential prior to publication on the USENIX OSDI 21 website; rejected submissions will be permanently treated as confidential. Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). 23 artifacts received the Artifacts Functional badge (88%). Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. Call for Papers. Overall, the OSDI PC accepted 31 out of 165 submissions. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with insti tutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. After three years working on web-based collaboration systems at a startup in North Carolina, he joined Sprint's Advanced Technology Lab in Burlingame, California, in 1998, working on cloud computing and network monitoring. We convert five state-of-the-art PM indexes using Nap. Authors may upload supplementary material in files separate from their submissions. We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. Lukas Burkhalter, Nicolas Kchler, Alexander Viand, Hossein Shafagh, and Anwar Hithnawi, ETH Zrich. Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning, Oort: Efficient Federated Learning via Guided Participant Selection, PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections, Modernizing File System through In-Storage Indexing, Nap: A Black-Box Approach to NUMA-Aware Persistent Memory Indexes, Rearchitecting Linux Storage Stack for s Latency and High Throughput, Optimizing Storage Performance with Calibrated Interrupts, ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction, DMon: Efficient Detection and Correction of Data Locality Problems Using Selective Profiling, CLP: Efficient and Scalable Search on Compressed Text Logs, Polyjuice: High-Performance Transactions via Learned Concurrency Control, Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing, The nanoPU: A Nanosecond Network Stack for Datacenters, Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator, Scalable Memory Protection in the PENGLAI Enclave, NrOS: Effective Replication and Sharing in an Operating System, Addra: Metadata-private voice communication over fully untrusted infrastructure, Bringing Decentralized Search to Decentralized Services, Finding Consensus Bugs in Ethereum via Multi-transaction Differential Fuzzing, MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation, Zeph: Cryptographic Enforcement of End-to-End Data Privacy, It's Time for Operating Systems to Rediscover Hardware, DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols, GoJournal: a verified, concurrent, crash-safe journaling system, STORM: Refinement Types for Secure Web Applications, Horcrux: Automatic JavaScript Parallelism for Resource-Efficient Web Computation, SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs, Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads, GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs, Marius: Learning Massive Graph Embeddings on a Single Machine, P3: Distributed Deep Graph Learning at Scale. Simultaneous submission of the same work to multiple venues, submission of previously published work, or plagiarism constitutes dishonesty or fraud. PET then automatically corrects results to restore full equivalence. P3 exposes a simple API that captures many different classes of GNN architectures for generality. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. In this paper, we propose Oort to improve the performance of federated training and testing with guided participant selection. Sat, Aug 7, 2021 3 min read researches review. Authors are also encouraged to contact the program co-chairs, osdi21chairs@usenix.org, if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues. For general conference information, see https://www . Our evaluation on the SPEC benchmarks shows that SanRazor can reduce the overhead of sanitizers significantly, from 73.8% to 28.062.0% for AddressSanitizer, and from 160.1% to 36.6124.4% for UndefinedBehaviorSanitizer (depending on the applied reduction scheme). We propose a new framework for computing the embeddings of large-scale graphs on a single machine. We compare Marius against two state-of-the-art industrial systems on a diverse array of benchmarks. The file system performance of the proposed ZNS+ storage system was 1.33--2.91 times better than that of the normal ZNS-based storage system. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de-facto approach to apply modern machine learning on graphs. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. Important Dates Abstract registrations due: Thursday, December 3, 2020, 3:00 pm PST Complete paper submissions due: Thursday, December 10, 2020, 3:00pm PST Author Response Period The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation.