Blogs

Occasionally, I write down what I learned, what I thought, what surprised me, and what I wanted to remember. I’ve been sloppy on these blogs, but still trying my best to keep up!

Technical

An Effective Algorithm for On-line Linearizability Checking
28 Jul 2024 - Guanzhou Hu

This post describes a simple yet effective algorithm for an on-line linearizability checker for concurrent Put/Get operations from a known number of nodes. The core idea is to maintain a set of still-possible states (i.e., possibilities) given the operation results observed so far. If this set ever becomes empty after feeding in an operation result, then linearizability has been violated. Check out this repo for a Rust crate implementation of this algorithm.
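To make the "set of possibilities" idea concrete, here is a minimal Python sketch for a single register. It is not the algorithm from the post or the Rust crate; the class name `OnlineChecker`, its methods, and the single-register model are all assumptions made for illustration, and it ignores the known-number-of-nodes aspect entirely.

```python
class OnlineChecker:
    """Toy on-line linearizability checker for one Put/Get register.

    Hypothetical sketch: names and structure are illustrative only.
    """

    def __init__(self, initial=None):
        # A possibility is (register value, frozenset of (op_id, observed))
        # where the frozenset holds ops that already took effect but have
        # not returned yet. `observed` is None for a Put, and the value
        # read for a Get.
        self.possibilities = {(initial, frozenset())}
        self.pending = {}  # op_id -> (kind, argument)

    def invoke(self, op_id, kind, value=None):
        """Record that an operation ("put" or "get") has started."""
        self.pending[op_id] = (kind, value)

    def _closure(self):
        """All possibilities reachable by letting pending ops take effect."""
        seen = set(self.possibilities)
        frontier = list(seen)
        while frontier:
            value, done = frontier.pop()
            done_ids = {op_id for op_id, _ in done}
            for op_id, (kind, arg) in self.pending.items():
                if op_id in done_ids:
                    continue
                if kind == "put":
                    nxt = (arg, done | {(op_id, None)})
                else:
                    nxt = (value, done | {(op_id, value)})
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    def respond(self, op_id, result=None):
        """Feed in an operation result; return False on a violation."""
        kind, _ = self.pending[op_id]
        expected = (op_id, result if kind == "get" else None)
        # Keep only possibilities in which this op took effect with the
        # observed result, then forget the op since it has returned.
        self.possibilities = {
            (value, done - {expected})
            for value, done in self._closure()
            if expected in done
        }
        del self.pending[op_id]
        return bool(self.possibilities)
```

With this sketch, a Get that returns the old value while a Put is still in flight is accepted, whereas a Get that returns the old value after the Put has already returned empties the set and makes `respond` return `False`.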
Practical SMR-style TLA+ Specification of the MultiPaxos Protocol
19 Feb 2024 - Guanzhou Hu

The attached files present a practical TLA+ specification of MultiPaxos that very closely models how a real state machine replication (SMR) system would implement this protocol. I did not find anything similar on the web, so I’d like to share it with anyone interested.
Emulating a Distributed Network on a Single Linux Host
28 Oct 2023 - Guanzhou Hu

Recently, I needed to benchmark a lightweight distributed system codebase on a single host for my current research project. I wanted control over the network performance parameters (including delay, jitter distribution, rate, loss, etc.) and the ability to test a wide range of parameter values; meanwhile, I wanted to avoid pure software-based simulation. Thus, I opted for kernel-supported network emulation. In this post, I document what I tried and what finally worked.
Revisiting My Distributed Replication Consistency Models Post
05 Apr 2023 - Guanzhou Hu

Previously, I made a blog post about common consistency models in distributed state machine replication (SMR). As I have recently been picking up my scattered knowledge about distributed replication systems, I found some inaccuracies and ambiguities in that old post. This short post lists some patches and complementary material I revisited on this convoluted topic.
Understanding Hierarchical Locking in Database Systems
06 Oct 2022 - Guanzhou Hu

Described in this classic paper by Jim Gray et al., hierarchical locking has been a well-studied idea in database management systems (DBMS). Despite its long history, I found the theoretical notion of lock modes unintuitive and hard to understand upon first encounter. This post tries to distill the core motivations of hierarchical locking, break its design down into three pieces, and describe them progressively, to hopefully clarify this beautiful idea.
Systems for AI and AI for Systems: Some Chitter-Chatter
21 May 2022 - Guanzhou Hu

This is a short post where I note down some of my insignificant thoughts about the interaction between AI and systems. With the rapid evolution of AI technologies, especially in the field of machine learning (ML), there is now a rising interest in studying the intersection between AI and computer systems design. The combination of the two can further be categorized into two directions: building systems for AI applications (Sys for AI) and using AI to empower smarter systems (AI for Sys).
Formal Description of File System Crash Consistency Techniques
26 Dec 2021 - Guanzhou Hu

Crash consistency is one of the most essential guarantees that a storage system needs to make to ensure correctness. In a file system (FS) setting, consistency techniques must be carefully designed, integrated with the layout of blocks, and deployed in the procedure of updates. This post summarizes the three classic FS consistency techniques: journaling, shadow paging (CoW), and log-structuring, in a formal way and analyzes their pros & cons.
Multicore Locking Design & A Partial List of Lock Implementations
31 May 2021 - Guanzhou Hu

Concurrency plays a significant role in modern multi-core operating systems. We want a locking mechanism that is efficient (low latency), scalable (increasing the number of threads does not degrade performance too badly), and fair (considers the order of acquirement and does not make any one thread wait too long). This post summarizes a bit on hardware atomic instructions which modern locks are built upon, a comparison between spinning and blocking locks, and a partial list of representative lock implementations.
Types of Operating System Kernel Structures and Virtual Machines
24 May 2021 - Guanzhou Hu

This post summarizes the different types of operating system kernel structures (kernel models) and virtual machine structures. Apart from the best-known monolithic kernel model, OS kernels may also take the form of microkernel, semi-microkernel, exokernel, kernel bypassing library for certain subsystems, or disaggregated kernel. Virtualization of OS environment as a whole (i.e., virtual machines) has become popular with the rapid trend towards cloud computing. Virtual machines can be categorized as type-1a vs. type-1b vs. type-2.
Building a Custom Linux Kernel & Debugging via QEMU + GDB
02 Jan 2021 - Guanzhou Hu

When doing systems research, we sometimes need to modify or add new stuff to the Linux kernel. This post lists a successful workflow for building and installing a custom Linux kernel under an Ubuntu 18.04/20.04 environment (deb), along with steps to debug the Linux kernel by running it on the QEMU emulator and attaching GDB.
Modern Storage Hierarchy: From NAND SSD to 3D XPoint (Optane) PM
01 Jan 2021 - Guanzhou Hu

As the miniaturization and cell density of traditional 2D NAND SSDs reach a manufacturing bottleneck, 3D NAND SSDs have come on the market. They push block capacity a little bit forward, but suffer from more severe write amplification and are more expensive, and thus are not a perfect solution. Intel 3D XPoint (officially branded as Optane), a hybrid design sitting in between DRAM and NAND flash storage, adds a new possibility to the storage hierarchy.
Brief Summary of Cache Modes & Cache Eviction Algorithms
07 Aug 2020 - Guanzhou Hu

Caching is an essential technique used broadly in computer system hierarchies. This post briefly summarizes existing cache mode configurations and cache eviction algorithms. This serves as a shallow review of cache systems before I go deeper into this field.
Decentralized Trust: Essential Ideas Behind Blockchain Technology
05 Jul 2020 - Guanzhou Hu

Blockchain has been a buzzword in the past few years. Despite the controversy behind some of its applications such as virtual currency, blockchain itself is actually an appealing proposal towards decentralized trust over the Internet. It is worth looking into when studying modern distributed systems, especially as a good example of the design and implementation of decentralized systems.
Write Buffering, LSM Tree, & Journaling Summarized
13 Jun 2020 - Guanzhou Hu

In file system & database design, write buffering (write grouping or coalescing) is a commonly used technique to avoid in-place updates and only expose sequential writes to disks. Log-Structured Merge Tree (LSM tree) is a modern practical solution which sacrifices a little bit of read performance to enable efficient write buffering. Journaling (write-ahead logging) is another file system term which is sometimes confused with write buffering. In short, write buffering is for write performance and journaling is for crash recovery - they are different, but can be combined.
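The distinction between the two, and how they can be combined, can be illustrated with a toy Python sketch. Everything here is hypothetical and not taken from the post: the `ToyStore` class, its fields, and the flush threshold are made up, and plain in-memory lists and dicts stand in for the on-disk journal and main data structure.

```python
class ToyStore:
    """Toy key-value store combining a write buffer with a journal.

    Hypothetical sketch: `journal` and `disk` are in-memory stand-ins
    for an append-only on-disk log and the main on-disk structure.
    """

    def __init__(self, flush_threshold=3):
        self.journal = []     # append-only log, assumed to survive crashes
        self.buffer = {}      # volatile in-memory write buffer
        self.disk = {}        # main structure, assumed to survive crashes
        self.disk_writes = 0  # number of batched flushes performed
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # Journaling: log the update first so it can be recovered.
        self.journal.append((key, value))
        # Write buffering: coalesce repeated updates to the same key.
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered updates out as one key-sorted batch.
        for key in sorted(self.buffer):
            self.disk[key] = self.buffer[key]
        self.disk_writes += 1
        self.buffer.clear()
        # Flushed updates no longer need the journal (a checkpoint).
        self.journal.clear()

    def crash_and_recover(self):
        # A crash loses the volatile buffer; replaying the journal
        # rebuilds it.
        self.buffer = {}
        for key, value in self.journal:
            self.buffer[key] = value

    def get(self, key):
        # The newest data may still be in the buffer, so check it first.
        return self.buffer.get(key, self.disk.get(key))
```

In this sketch the buffer alone decides how many batched writes reach `disk`, while the journal alone decides what survives `crash_and_recover`, which mirrors the performance versus recovery split described above.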
CPU Cache Side-Channel Attacks: Meltdown & Spectre
10 Jun 2020 - Guanzhou Hu

Side-channel attacks are among the most dangerous kinds of security attacks, since they fall outside the designed threat model. Meltdown & Spectre, the most recent side-channel vulnerabilities found on modern microprocessors, are a good demonstration of the sneakiness and danger of side-channel attacks. These attacks combine CPU speculative execution with a cache timing side channel.
Serializable Distributed Transactions over Sharded Scenario
31 May 2020 - Guanzhou Hu

Sharding is a common distributed system design to scale out and achieve better performance. Distributed transactions (concurrency control + atomic commits) are used to coordinate sharded nodes. It is important to implement serializable distributed transactions for such a system to act correctly.
Consistency Models for Distributed Replicated State Machines
23 May 2020 - Guanzhou Hu

NOTE: this post is outdated and contains some of my early misunderstandings, so please read skeptically. A new post series on understandable categorization and in-depth analysis of consistency models is coming out soon, which will serve as the theoretical foundation of my ongoing research.
Notes on Consensus Algorithms - Paxos, Multi-Paxos, and Raft
04 Apr 2020 - Guanzhou Hu

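In distributed systems, basic consensus algorithms aim to let all nodes reach strong consistency (linearizability) on a sequence of client proposals, even when nodes may crash or restart and network messages between nodes may be reordered, lost, or duplicated. This enables replicated state machines and effective fault tolerance.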
Brief Summary of I/O Interface and Some Related Acronyms
17 Oct 2019 - Guanzhou Hu

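Studying storage systems inevitably means running into many hardware-level terms and acronyms, covering hardware devices, interfaces, and transfer and control protocols. While competing in a supercomputing contest, it occurred to me that I should organize them into a written summary. This post was originally written in March and revised in October. The image URLs still point to the original CNBlogs site and have not been migrated; I will swap in newer images whenever the links break.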
About the Design of the Rust Language and Its Pros & Cons
03 Sep 2019 - Guanzhou Hu

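As an emerging systems programming language, Rust draws on the strengths and weaknesses of earlier systems programming languages in its design. Aiming to be both safe and fast, it pioneered a novel language model in which nearly all safety checks are performed at compile time. It is fair to say that Rust carries the ideals of its creator Graydon Hoare and of the many developers in the new Internet era who have grown to care about safety, and it has also drawn the attention of cutting-edge systems researchers such as those at MIT (see PDOS PhD student Jon Gjengset).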
GDB Installation & Code-sign Steps on macOS X
21 Apr 2019 - Guanzhou Hu

TL;DR: Use lldb instead of GNU gdb on macOS >= 10.14 Mojave directly (the app verification scheme on newer macOS gets really complicated). If you really wanna make it work, the following procedure is what finally worked for me. It summarizes the steps that succeeded after I stumbled through installing and using the GDB debugger on Mojave.

Personal

Raspberry Pi As Campus GlobalProtect VPN Proxy Server
22 Dec 2021 - Guanzhou Hu

Wisc campus VPN and our CS departmental VPN both use GlobalProtect. On the user side, GlobalProtect clients cannot configure VPN split tunneling, meaning that once connected, all outbound traffic from my host machine goes through the VPN. I have a daily need to access my lab machine sitting behind the departmental VPN, yet I would like all other traffic (e.g., searching Google) to bypass the VPN. I came up with a solution of using one or two Raspberry Pi chips as an always-on SSH proxy server.
System Building Rules & Tips from the OSTEP Book
15 Jun 2021 - Guanzhou Hu

This short post is a summary list of all the system building tips/rules/laws boxes in the OSTEP book (also see my reading note). Without proper context, these tips make little sense, so I included the chapter numbers as well for easier back-tracing.
Noting Down Some Knowledge Fragments Encountered
11 Jun 2020 - Guanzhou Hu

Knowledge fragments I encountered while studying, mostly outside my major fields. Since they are not in my main areas, I have not studied or organized them systematically; I am noting them down just as a memo and casual notes.
Personal Dev Environment & Tools Configuration Record
10 Feb 2020 - Guanzhou Hu

WARNING: this post is seriously outdated, and my dev setup has been evolving a lot with time. For a partial glance, you may check out my assets/dev-env/auto-setup.sh.

[Gaming + Workstation] M-ATX PC Build Log 2020
17 Jun 2020 - Guanzhou Hu

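I have more or less settled down in the US. I may end up staying in this small apartment for 5 years, so I bit the bullet and built a mid-to-high-end 2020 [gaming + workstation] PC as a little home for my 5 years of study. Here I document my first hands-on PC build in as much detail as I can.
Some Random Thoughts While Quarantined in a Small Room Far from Home (#3)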
01 Apr 2020 - Guanzhou Hu

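This three-part series collects some random thoughts from being quarantined in a small rented room in Boston during the 2020 COVID-19 pandemic. Both the world and my life are going through an important transition, so I wrote these reflections in the hope of looking back on them someday and seeing progress. This is part #3.
Some Random Thoughts While Quarantined in a Small Room Far from Home (#2)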
31 Mar 2020 - Guanzhou Hu

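This three-part series collects some random thoughts from being quarantined in a small rented room in Boston during the 2020 COVID-19 pandemic. Both the world and my life are going through an important transition, so I wrote these reflections in the hope of looking back on them someday and seeing progress. This is part #2.
Some Random Thoughts While Quarantined in a Small Room Far from Home (#1)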
27 Mar 2020 - Guanzhou Hu

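This three-part series collects some random thoughts from being quarantined in a small rented room in Boston during the 2020 COVID-19 pandemic. Both the world and my life are going through an important transition, so I wrote these reflections in the hope of looking back on them someday and seeing progress. This is part #1.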

	guanzhou.hu (at) wisc.edu
	josehu (at) cs.wisc.edu