My name is Zach. My pronouns are he/him/his. I’m currently a researcher at MongoDB in the Cryptography Research Group.
My research goal is to advance the state-of-the-art in privacy-preserving, efficient systems for multi-party computation, data processing, and retrieval. To do this, I combine a broad set of interests in cryptography, data structures, formal methods and optimization, and algorithms. I also like to study cryptanalysis attacks that are motivated by practical, real-world systems but grounded in theoretically interesting formalisms. As a corollary to all of these goals, I also like to ask questions about the practicality and usability of cryptography research itself—specifically, how we can build practical tooling and infrastructure to enable researchers to perform research (both in theory and in practice).
I completed a concurrent Sc.B. and Sc.M. at Brown University with a focus in security, systems, and theory. My studies were generously supported by grants from Brown CS, CrowdStrike, (ISC)2, and the CIT Group. I was also affiliated with the following research groups:
I am passionate about teaching computer security. While at Brown, I spent a lot of time thinking about creatively designing effective mechanisms for developing security mindsets. I was the course designer and Head Teaching Assistant of the Computer Science department’s flagship computer systems security course from 2019 to 2021.
Previously, I worked as a security engineer at D. E. Shaw. & Co. and interned at Google and Order.
I spend a lot of time long distance running and thinking about things like: theatrical lighting design, board games, public transit, speech and debate, and Dance Dance Revolution.
⁂ denotes authors listed alphabetically. Click abstracts to expand.
Encrypted data structures are essential for designing efficient encrypted search algorithms and secure databases. However, a critical aspect that has not been adequately addressed is the concurrent nature of modern databases, which allow multiple operations to be executed simultaneously. Agarwal, Kamara, and Moataz (Asiacrypt 2024) recently initiated the study of concurrent encrypted data structures, introducing formalisms for their design and analysis.
Building on their foundational work, we adapt their security definitions to support sequential consistency instead of linearizability. While linearizability offers a strong correctness guarantee by ensuring operations appear to occur instantaneously, sequential consistency allows operations to be executed in a consistent order without immediate synchronization across clients, making it more efficient for concurrent environments. We present a new concurrent encrypted multimap (EMM), denoted as SCM, which achieves sequential consistency and provides significant improvements in both asymptotic and empirical efficiency compared to their linearizable EMM scheme, TST.
We introduce a framework based on Bayesian statistical inference for analyzing leakage and its vulnerability to inference attacks. Our framework naturally integrates auxiliary information, defines a notion of adversarial advantage, and provides information-theoretic measures that capture the security of leakage patterns against both full and functional recovery attacks.
We present two main theorems that bound the advantage of powerful inference techniques: the maximum a posteriori (MAP), the maximum likelihood estimate (MLE) and the MAP test. Specifically, we show that the advantage of these methods is exponentially bounded by new entropy measures that capture the susceptibility of leakage patterns to inference.
To demonstrate the applicability of our framework, we design and implement an automated leakage attack engine, Bleak, which leverages a novel inference algorithm that efficiently computes MAP estimates for a large class of i.i.d. leakage models. These models include, for example, query equality, the combination of query equality and volume, and leakage patterns arising from naive conjunctions.
Data analytics is a core part of modern decision making, especially in public policy. However, there exists a tension between data privacy and otherwise socially beneficial analytics when data sources contain personal information. We design Synq, a system that supports analytics over encrypted data while accounting for the usability considerations institutions may have when conducting studies that affect public policy. We specifically use an application-centric approach and model Synq’s design requirements from a large-scale series of studies conducted on the opioid epidemic in Massachusetts. We systematize the design considerations of the public policy context and demonstrate how the combination of design considerations that Synq addresses is novel through a survey of the literature. We then present our protocol which combines structured encryption, somewhat homomorphic encryption, and oblivious pseudorandom functions to support a complex query language that includes filtering (retrieving rows by attribute/value pairs), linking (merging rows from different tables that represent the same individual) and aggregate functions (sum, count, average, variance, regression). We formally express the security of our protocol and show that Synq is efficient in practice while satisfying usability considerations that are critical to deployment in the setting of public policy studies.
This work addresses expressive queries over encrypted data by presenting the first systematic study of multi-attribute range search on a symmetrically encrypted database outsourced to an honest-but-curious server. Prior work includes a thorough analysis of single-attribute range search schemes (e.g. Demertzis et al. 2016) and a proposed high-level approach for multi-attribute schemes (De Capitani di Vimercati et al. 2021). We first introduce a flexible framework for building secure range search schemes over multiple attributes (dimensions) by adapting a broad class of geometric search data structures to operate on encrypted data. Our framework encompasses widely used data structures such as multi-dimensional range trees and quadtrees, and has strong security properties that we formally prove. We then develop six concrete highly parallelizable range search schemes within our framework that offer a sliding scale of efficiency and security tradeoffs to suit the needs of the application. We evaluate our schemes with a formal complexity and security analysis, a prototype implementation, and an experimental evaluation on real-world datasets.
In this work, we present the first database reconstruction attacks against response-hiding private range search schemes on encrypted databases of arbitrary dimensions. Falzon et al. (VLDB 2022) present a number of range-supporting schemes on arbitrary dimensions exhibiting different security and efficiency trade-offs. Additionally, they characterize a form of leakage, structure pattern leakage, also present in many one-dimensional schemes e.g., Demertzis et al. (SIGMOD 2016) and Faber et al. (ESORICS 2015). We present the first systematic study of this leakage and attack a broad collection of schemes, including schemes that allow the responses to contain false-positives (often considered the gold standard in security). We characterize the information theoretic limitations of a passive persistent adversary. Our work shows that for range queries, structure pattern leakage can be as vulnerable to attacks as access pattern leakage. We give a comprehensive evaluation of our attacks with a complexity analysis, a prototype implementation, and an experimental assessment on real-world datasets.
We present ARQ, a systematic framework for creating cryptographic schemes that handle range aggregate queries (sum, minimum, median, and mode) over encrypted datasets. Our framework does not rely on trusted hardware or specialized cryptographic primitives such as property-preserving or homomorphic encryption. Instead, ARQ unifies structures from the plaintext data management community with existing structured encryption primitives. We prove how such combinations yield efficient (and secure) constructions in the encrypted setting. We also propose a series of domain reduction techniques that can improve the space efficiency of our schemes against sparse datasets at the cost of small leakage. As part of this work, we designed and implemented a new, open-source, encrypted search library called Arca and implemented the ARQ framework using this library in order to evaluate ARQ’s practicality. Our experiments on real-world datasets demonstrate the efficiency of the schemes derived from ARQ in comparison to prior work.
I served as a teaching assistant every semester I was at Brown University, sometimes even during semesters I wasn't enrolled. ⁂ denotes a Head Teaching Assistant role.
Software exploitation techniques and state-of-the-art mechanisms for hardening software. With Vasileios Kemerlis.
An introduction to principles of computer security from an applied viewpoint and provides hands-on experience on security threats and countermeasures. Topics include cryptosystems, web security, network security, malware, code execution vulnerabilities, access control, cryptocurrencies, machine learning, and human and social issues. With Roberto Tamassia (2019, 2020) and Bernardo Palazzi (2021).
Explores the principles of modern programming languages by implementation; studies data and their types, including polymorphism, type inference, and type soundness; examines compiler and run-time system topics: continuation-passing style and garbage collection. With Shriram Krishnamurthi.
Functional programming, data structures, and algorithms in Racket and Pyret. Includes a summer component taught using the first half of How to Design Programs, then transitions to content based on portions of Programming and Programming Languages during the semester. With Shriram Krishnamurthi.
Introduction to programming in MATLAB and Python, with an emphasis on STEM data analysis and simulation problems. With Dan Potter.
Data-focused introduction to computer science using Pyret. With Kathi Fisler.