Skip to content

SOSP

2023

One Simple API Can Cause Hundreds of Bugs An Analysis of Refcounting Bugs in All Modern Linux Kernels

reference counting, bug, Linux kernel

Acto: Automatic End-to-End Testing for Operation Correctness of Cloud System Management

Kubernetes, operation, system management, cloud, reliability, operation correctness, operator

state-reconciliation: An operation declares a desired system state and the operator automatically reconciles the system to that declared state.

acto models operations as state transitions and realizes state-transition sequences.

motivation study:

  1. Unit tests cannot validate operation correctness. Only 34% contains e2e tests.
  2. Existing e2e tests cover only 1.27 - 2.15% of supported properties.
  3. State-based assertions in existing e2e tests cover only 0.24–10.90% of managed systems’ state-object fields.

design:

acto support black box mode and white box mode. Each takes two inputs: 1) a building minifest, 2) specification of state declaration. white box mode additionally requires the source code for program analysis.

Black box mode infer the relationship between interface declaration and resource based on composite types (e.g. class). However, such design cannot infer primitive types like int, float etc. White box mode build the relationship based on taint analysis

After building relationship between interface declaration and resource objects, acto implement consistency oracle and differential oracle to find false values and inconsistent items respectively.

2021

Boki: Stateful Serverless Computing with Shared Logs

TODO

ghOSt: Fast & Flexible User-Space Delegation of Linux Scheduling

TODO

Exploiting Nil-Externality for Fast Replicated Storage

TODO

Syrup: User-Defined Scheduling Across the Stack

TODO

TwinVisor: Hardware-isolated Confidential Virtual Machines for ARM

TODO

Formal Verification of a Multiprocessor Hypervisor on Arm Relaxed Memory Hardware

TODO

Rudra: Finding Memory Safety Bugs in Rust at the Ecosystem Scale

Faster and Cheaper Serverless Computing on Harvested Resources

Harvest VM: defined by its minimum size and how many harvested physical cores they are capable of using. The workload running on the Harvest VM can query the number of physical cores assigned to it in /proc in Linux

SoK of improving the serverless infrastructure (performance up & cost down):

  1. scheduling policies for making serverless platforms cost-effective and performant
  2. performance-aware and cost-effective storage
  3. secure and lightweight container infrastructure
  4. characterization of existing serverless workloads
  5. enabling applications to run in a serverless-native manner

TODO

The Demikernel Datapath OS Architecture for Microsecond-scale Datacenter Systems

TODO

An Analysis of Performance Evolution of Linux’s Core Operations

TODO

2017

Komodo: Using verification to disentangle secure-enclave hardware from software

TODO

2014

Eidetic Systems

TODO

2011

An Empirical Study on Configuration Errors in Commercial and Open Source Systems