CyLab faculty, students to present at NDSS Symposium 2024
By Michael Cunningham
Carnegie Mellon faculty and students will present on a wide range of topics at the 31st Annual Network and Distributed System Security (NDSS) Symposium. Held at the Catamaran Resort Hotel & Spa in San Diego from February 26 through March 1, the event fosters information exchange among researchers and practitioners of network and distributed system security.
Bringing together hundreds of security educators, researchers, and practitioners from all over the world, the NDSS Symposium encourages and enables the Internet community to apply, deploy, and advance the state of available security technologies.
Here, we've compiled a list of the papers co-authored by CyLab Security and Privacy Institute members that are being presented at the event.
Attributions for ML-based ICS Anomaly Detection: From Theory to Practice
Clement Fung, Carnegie Mellon University; Eric Zeng, Carnegie Mellon University; Lujo Bauer, Carnegie Mellon University
Abstract: Industrial Control Systems (ICS) govern critical infrastructure like power plants and water treatment plants. ICS can be attacked through manipulations of their sensor or actuator values, causing physical harm. A promising technique for detecting such attacks is machine-learning-based anomaly detection, but it does not identify which sensor or actuator was manipulated, making it difficult for ICS operators to diagnose the anomaly's root cause. Prior work has proposed using attribution methods to identify what features caused an ICS anomaly-detection model to raise an alarm, but it is unclear how well these attribution methods work in practice. In this paper, we compare state-of-the-art attribution methods for the ICS domain with real attacks from multiple datasets. We find that attribution methods for ICS anomaly detection do not perform as well as suggested in prior work and identify two main reasons. First, anomaly detectors often detect attacks either immediately or well after the attack starts; we find that attributions computed at these detection points are inaccurate. Second, attribution accuracy varies greatly across attack properties, and attribution methods struggle with attacks on categorical-valued actuators. Despite these challenges, we find that ensembles of attributions can compensate for weaknesses in individual attribution methods. Towards practical use of attributions for ICS anomaly detection, we provide recommendations for researchers and practitioners, such as the need to evaluate attributions with diverse datasets and the potential for attributions in non-real-time workflows.
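To give a flavor of the ensembling idea the abstract mentions, here is a minimal, hypothetical sketch (not the paper's code): it rank-normalizes per-feature attribution scores produced by several methods and averages them to rank candidate sensors and actuators. The feature names and scores are invented for illustration.

```python
# Hypothetical sketch: combining per-feature attributions from
# several methods into a simple ensemble ranking.
import numpy as np

def ensemble_attribution(attributions: list[np.ndarray]) -> np.ndarray:
    """Average rank-normalized attribution scores across methods.

    Each element of `attributions` is a vector of per-feature scores
    (e.g., from a saliency-map or SHAP-style method) for one alarm.
    Rank-normalizing first keeps one method's scale from dominating.
    """
    normalized = []
    for scores in attributions:
        ranks = np.argsort(np.argsort(np.abs(scores)))  # 0 = least suspicious
        normalized.append(ranks / (len(scores) - 1))
    return np.mean(normalized, axis=0)

# Invented example: three methods scoring four ICS features.
features = ["pump_1", "valve_2", "level_sensor", "flow_sensor"]
scores = [np.array([0.1, 0.9, 0.2, 0.3]),
          np.array([0.2, 0.7, 0.6, 0.1]),
          np.array([0.0, 0.8, 0.1, 0.4])]
combined = ensemble_attribution(scores)
print(features[int(np.argmax(combined))])  # -> "valve_2"
```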
Flow Correlation Attacks on Tor Onion Service Sessions with Sliding Subset Sum
Daniela Lopes, INESC-ID / IST, Universidade de Lisboa; Jin-Dong Dong, Carnegie Mellon University; Pedro Medeiros, INESC-ID / IST, Universidade de Lisboa; Daniel Castro, INESC-ID / IST, Universidade de Lisboa; Diogo Barradas, University of Waterloo; Bernardo Portela, INESC TEC / Universidade do Porto; João Vinagre, INESC TEC / Universidade do Porto; Bernardo Ferreira, LASIGE, Faculdade de Ciências, Universidade de Lisboa; Nicolas Christin, Carnegie Mellon University; Nuno Santos, INESC-ID / IST, Universidade de Lisboa
Abstract: Tor is one of the most popular anonymity networks in use today. Its ability to defend against flow correlation attacks is essential for providing strong anonymity guarantees. However, the feasibility of flow correlation attacks against Tor onion services (formerly known as "hidden services") has remained an open challenge. In this paper, we present an effective flow correlation attack that can deanonymize onion service sessions in the Tor network. Our attack is based on a novel distributed technique named Sliding Subset Sum (SUMo), which can be deployed by a group of colluding ISPs worldwide in a federated fashion. These ISPs collect Tor traffic at multiple vantage points in the network, and analyze it through a pipelined architecture based on machine learning classifiers and a novel similarity function based on the classic subset sum decision problem. These classifiers enable SUMo to deanonymize onion service sessions effectively and efficiently. We also analyze possible countermeasures that the Tor community can adopt to hinder the efficacy of these attacks.
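The similarity function at the heart of SUMo builds on the classic subset sum decision problem. The sketch below is a simplification, not the SUMo implementation: it applies the textbook dynamic-programming subset-sum test over sliding windows of per-interval traffic volumes, asking whether some subset of the volumes seen at one vantage point sums to a total seen at another. All values are hypothetical.

```python
# Simplified sketch of a subset-sum similarity test over traffic windows.

def subset_sum_exists(volumes: list[int], target: int) -> bool:
    """Classic DP subset-sum: can some subset of `volumes` sum to `target`?"""
    reachable = {0}
    for v in volumes:
        reachable |= {s + v for s in reachable if s + v <= target}
    return target in reachable

def sliding_match(client_volumes: list[int], service_total: int,
                  window: int) -> bool:
    """Slide a window over one side's per-interval volumes and test whether
    any window contains a subset matching the other side's total."""
    for start in range(len(client_volumes) - window + 1):
        if subset_sum_exists(client_volumes[start:start + window],
                             service_total):
            return True
    return False

# Invented example: per-interval byte counts at two vantage points.
print(sliding_match([300, 120, 560, 90, 410], service_total=650, window=3))
# -> True (120 + 560 ... no; 560 + 90 = 650 in the second window)
```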
Group-based Robustness: A General Framework for Customized Robustness in the Real World
Weiran Lin, Carnegie Mellon University; Keane Lucas, Carnegie Mellon University; Neo Eyal, Tel Aviv University; Lujo Bauer, Carnegie Mellon University; Michael K. Reiter, Duke University; Mahmood Sharif, Tel Aviv University
Abstract: Machine-learning models are known to be vulnerable to evasion attacks that perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing attacks. Specifically, we find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model's ability to withstand attacks from one set of source classes to another set of target classes. To address the shortcomings of existing methods, we formally define a new metric, group-based robustness, that complements existing metrics and is better suited for evaluating model performance in certain attack scenarios. We show empirically that group-based robustness allows us to distinguish between models' vulnerability to specific threat models in situations where traditional robustness metrics do not apply. Moreover, to measure group-based robustness efficiently and accurately, we 1) propose two loss functions and 2) identify three new attack strategies. We show empirically that with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes, and finding evasive samples using our new attack strategies saves time by up to 99% compared to brute-force search methods. Finally, we propose a defense method that increases group-based robustness by up to 3.52 times.
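One way to picture a group-targeted evasion loss, as a hypothetical sketch rather than either of the paper's actual loss functions: the attacker succeeds if the model predicts any class in a chosen target group, so a natural loss penalizes the margin between the best logit outside the group and the best logit inside it.

```python
# Hypothetical sketch of a group-targeted evasion loss in PyTorch.
import torch

def group_targeted_loss(logits: torch.Tensor,
                        target_group: torch.Tensor) -> torch.Tensor:
    """Margin between the best logit outside the target group and the best
    logit inside it; minimizing this drives predictions into the group."""
    group_mask = torch.zeros(logits.shape[-1], dtype=torch.bool)
    group_mask[target_group] = True
    best_in_group = logits[..., group_mask].max(dim=-1).values
    best_outside = logits[..., ~group_mask].max(dim=-1).values
    return (best_outside - best_in_group).clamp(min=0).mean()

# Invented usage, as it might appear inside a PGD-style attack loop:
logits = torch.tensor([[2.0, 0.5, 1.0, -0.3]], requires_grad=True)
loss = group_targeted_loss(logits, torch.tensor([1, 2]))  # group = {1, 2}
loss.backward()  # the gradient steers the input toward classes 1 or 2
```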
MPCDiff: Testing and Repairing MPC-Hardened Deep Learning Models
Qi Pang, Carnegie Mellon University; Yuanyuan Yuan, HKUST; Shuai Wang, HKUST
Abstract: Secure multi-party computation (MPC) has recently gained prominence as a way for multiple parties to perform privacy-preserving machine learning without leaking sensitive data or details of pre-trained models to one another. Industry and the community have been actively developing and promoting high-quality MPC frameworks (e.g., based on TensorFlow and PyTorch) to enable the usage of MPC-hardened models, greatly easing the development cycle of integrating deep learning models with MPC primitives.
Despite the prosperous development and adoption of MPC frameworks, a principled and systematic understanding of the correctness of those MPC frameworks does not yet exist. To fill this critical gap, this paper introduces MPCDiff, a differential testing framework that effectively uncovers inputs on which MPC-hardened models and their plaintext versions produce deviating outputs. We further develop techniques to localize error-causing computation units in MPC-hardened models and automatically repair those defects.
We evaluate MPCDiff using real-world popular MPC frameworks for deep learning developed by Meta (Facebook), Alibaba Group, Cape Privacy, and OpenMined. MPCDiff successfully detected over one thousand inputs that result in substantially deviating outputs. These deviation-triggering inputs are (visually) meaningful in comparison to regular inputs, indicating that such defects could cause real confusion in the everyday use of MPC frameworks. After localizing and repairing error-causing computation units, the robustness of MPC-hardened models can be notably enhanced without sacrificing accuracy and with negligible overhead.
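The core differential-testing loop is simple to picture. Below is a hypothetical sketch of the idea, not MPCDiff's actual API: mutate seed inputs, run each through a plaintext model and its MPC-hardened counterpart, and flag inputs whose outputs disagree beyond a threshold. `plain_forward` and `mpc_forward` are stand-ins for the two model variants (in practice the latter would come from an MPC framework such as Meta's CrypTen).

```python
# Hypothetical sketch of differential testing between a plaintext model
# and its MPC-hardened counterpart.
import numpy as np

def find_deviation_inputs(plain_forward, mpc_forward, seeds, trials=100,
                          noise=0.05, threshold=1e-2):
    """Mutate seed inputs and keep those where the plaintext and MPC
    outputs disagree by more than `threshold` (max absolute error)."""
    rng = np.random.default_rng(0)
    deviating = []
    for seed in seeds:
        for _ in range(trials):
            x = seed + rng.normal(0.0, noise, size=seed.shape)
            gap = np.max(np.abs(plain_forward(x) - mpc_forward(x)))
            if gap > threshold:
                deviating.append((x, gap))
    return deviating
```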
TrustSketch: Trustworthy Sketch-based Telemetry on Cloud Hosts
Zhuo Cheng, Carnegie Mellon University; Maria Apostolaki, Princeton University; Zaoxing Liu, University of Maryland; Vyas Sekar, Carnegie Mellon University
Abstract: Cloud providers deploy telemetry tools in software to perform end-host network analytics. Recent efforts show that sketches, a kind of approximate data structure, are a promising basis for software-based telemetry, as they provide high fidelity for many statistics with a low resource footprint. However, an attacker can compromise sketch-based telemetry results via software vulnerabilities and consequently nullify the use of telemetry, e.g., by avoiding attack detection or inducing accounting discrepancies. In this paper, we formally define the requirements for trustworthy sketch-based telemetry and show that prior work cannot meet those requirements due to sketches' probabilistic nature and performance constraints. We present the design and implementation of TRUSTSKETCH, a general framework for trustworthy sketch telemetry that can support a wide spectrum of sketching algorithms. We show that TRUSTSKETCH is able to detect a wide range of attacks on sketch-based telemetry in a timely fashion while incurring only minimal overhead.
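For readers unfamiliar with sketches, here is a minimal count-min sketch in Python, illustrating the kind of approximate structure at stake; this is an educational example only, unrelated to TrustSketch's implementation.

```python
# Minimal count-min sketch: a compact, approximate frequency counter.
import hashlib

class CountMinSketch:
    def __init__(self, width: int = 1024, depth: int = 4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: str) -> int:
        # One independent hash per row, derived from SHA-256.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key: str) -> int:
        # Collisions only inflate counters, so the row-wise minimum
        # is an upper bound on the true count.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

cms = CountMinSketch()
cms.add("10.0.0.1->10.0.0.2", 3)
print(cms.estimate("10.0.0.1->10.0.0.2"))  # -> 3 (approximate upper bound)
```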