Accepted Papers

The following technical papers have been accepted for this year's program.

SESSION: Digital Forensics and Malware Detection

The Tangled Genealogy of IoT Malware

Emanuele Cozzi
Pierre-Antoine Vervier
Matteo Dell'Amico
Yun Shen
Leyla Bilge
Davide Balzarotti

The recent emergence of consumer off-the-shelf embedded (IoT) devices and the rise of large-scale IoT botnets has dramatically increased the volume and sophistication of Linux malware observed in the wild. The security community has put a lot of effort to document these threats but analysts mostly rely on manual work, which makes it difficult to scale and hard to regularly maintain. Moreover, the vast amount of code reuse that characterizes IoT malware calls for an automated approach to detect similarities and identify the phylogenetic tree of each family.

In this paper we present the largest measurement of IoT malware to date. We systematically reconstruct – through the use of binary code similarity – the lineage of IoT malware families, and track their relationships, evolution, and variants. We apply our technique on a dataset of more than 93k samples submitted to VirusTotal over a period of 3.5 years. We discuss the findings of our analysis and present several case studies to highlight the tangled relationships of IoT malware.

Spotlight: Malware Lead Generation at Scale

Fabian Kaczmarczyck
Bernhard Grill
Luca Invernizzi
Jennifer Pullman
Cecilia M. Procopiuc
David Tao
Borbala Benko
Elie Bursztein

Malware is one of the key threats to online security today, with applications ranging from phishing mailers to ransomware and trojans. Due to the sheer size and variety of the malware threat, it is impractical to combat it as a whole. Instead, governments and companies have instituted teams dedicated to identifying, prioritizing, and removing specific malware families that directly affect their population or business model. The identification and prioritization of the most disconcerting malware families (known as malware hunting) is a time-consuming activity, accounting for more than 20% of the work hours of a typical threat intelligence researcher, according to our survey. To save this precious resource and amplify the team’s impact on users’ online safety we present Spotlight, a large-scale malware lead-generation framework. Spotlight first sifts through a large malware data set to remove known malware families, based on first and third-party threat intelligence. It then clusters the remaining malware into potentially-undiscovered families, and prioritizes them for further investigation using a score based on their potential business impact.

We evaluate Spotlight on 67M malware samples, to show that it can produce top-priority clusters with over 99% purity (i.e., homogeneity), which is higher than simpler approaches and prior work. To showcase Spotlight’s effectiveness, we apply it to ad-fraud malware hunting on real-world data. Using Spotlight’s output, threat intelligence researchers were able to quickly identify three large botnets that perform ad fraud.

App-Agnostic Post-Execution Semantic Analysis of Android In-Memory Forensics Artifacts

Aisha Ali-Gombe
Alexandra Tambaoan
Angela Gurfolino
Golden G. Richard III

Over the last decade, userland memory forensics techniques and algorithms have gained popularity among practitioners, as they have proven to be useful in real forensics and cybercrime investigations. These techniques analyze and recover objects and artifacts from process memory space that are of critical importance in investigations. Nonetheless, the major drawback of existing techniques is that they cannot determine the origin and context within which the recovered object exists without prior knowledge of the application logic.

Thus, in this research, we present a solution to close the gap between application-specific and application-generic techniques. We introduce OAGen, a post-execution and app-agnostic semantic analysis approach designed to help investigators establish concrete evidence by identifying the provenance and relationships between in-memory objects in a process memory image. OAGen utilizes Points-to analysis to reconstruct a runtime’s object allocation network. The resulting graph is then fed as an input into our semantic analysis algorithms to determine objects’ origin, context, and scope in the network. The results of our experiments exhibit OAGen’s ability to effectively create an allocation network even for memory-intensive applications with thousands of objects, like Facebook. The performance evaluation of our approach across fourteen different Android apps shows OAGen can efficiently search and decode nodes, and identify their references with a modest throughput rate. Further practical application of OAGen demonstrated in two case studies shows that our approach can aid investigators in the recovery of deleted messages and the detection of malware functionality in post-execution program analysis.

AVclass2: Massive Malware Tag Extraction from AV Labels

Silvia Sebastián
Juan Caballero

Tags can be used by malware repositories and analysis services to enable searches for samples of interest across different dimensions. Automatically extracting tags from AV labels is an efficient approach to categorize and index massive amounts of samples. Recent tools like AVclass and Euphony have demonstrated that, despite their noisy nature, it is possible to extract family names from AV labels. However, beyond the family name, AV labels contain much valuable information such as malware classes, file properties, and behaviors.

This work presents AVclass2, an automatic malware tagging tool that given the AV labels for a potentially massive number of samples, extracts clean tags that categorize the samples. AVclass2 uses, and helps building, an open taxonomy that organizes concepts in AV labels, but is not constrained to a predefined set of tags. To keep itself updated as AV vendors introduce new tags, it provides an update module that automatically identifies new taxonomy entries, as well as tagging and expansion rules that capture relations between tags. We have evaluated AVclass2 on 42M samples and showed how it enables advanced malware searches and to maintain an updated knowledge base of malware concepts in AV labels.

Advanced Windows Methods on Malware Detection and Classification

Dima Rabadi
Sin G. Teo

Application Programming Interfaces (APIs) are still considered the standard accessible data source and core wok of the most widely adopted malware detection and classification techniques. API-based malware detectors highly rely on measuring API’s statistical features, such as calculating the frequency counter of calling specific API calls or finding their malicious sequence pattern (i.e., signature-based detectors). Using simple hooking tools, malware authors would help in failing such detectors by interrupting the sequence and shuffling the API calls or deleting/inserting irrelevant calls (i.e., changing the frequency counter). Moreover, relying on API calls (e.g., function names) alone without taking into account their function parameters is insufficient to understand the purpose of the program. For example, the same API call (e.g., writing on a file) would act in two ways if two different arguments are passed (e.g., writing on a system versus user file). However, because of the heterogeneous nature of API arguments, most of the available API-based malicious behavior detectors would consider only the API calls without taking into account their argument information (e.g., function parameters). Alternatively, other detectors try considering the API arguments in their techniques, but they acquire having proficient knowledge about the API arguments or powerful processors to extract them. Such requirements demand a prohibitive cost and complex operations to deal with the arguments. To overcome the above limitations, with the help of machine learning and without any expert knowledge of the arguments, we propose a light-weight API-based dynamic feature extraction technique, and we use it to implement a malware detection and type classification approach. To evaluate our approach, we use reasonable datasets of 7774 benign and 7105 malicious samples belonging to ten distinct malware types. Experimental results show that our type classification module could achieve an accuracy of , where our malware detection module could reach an accuracy of over , and outperforms many state-of-the-art API-based malware detectors.

SESSION: Anonymity and Privacy

Betrayed by the Guardian: Security and Privacy Risks of Parental Control Solutions

Suzan Ali
Mounir Elgharabawy
Quentin Duchaussoy
Mohammad Mannan
Amr Youssef

For parents of young children and adolescents, the digital age has introduced many new challenges, including excessive screen time, inappropriate online content, cyber predators, and cyberbullying. To address these challenges, many parents rely on numerous parental control solutions on different platforms, including parental control network devices (e.g., WiFi routers) and software applications on mobile devices and laptops. While these parental control solutions may help digital parenting, they may also introduce serious security and privacy risks to children and parents, due to their elevated privileges and having access to a significant amount of privacy-sensitive data. In this paper, we present an experimental framework for systematically evaluating security and privacy issues in parental control software and hardware solutions. Using the developed framework, we provide the first comprehensive study of parental control tools on multiple platforms including network devices, Windows applications, Chrome extensions and Android apps. Our analysis uncovers pervasive security and privacy issues that can lead to leakage of private information, and/or allow an adversary to fully control the parental control solution, and thereby may directly aid cyberbullying and cyber predators.

Talek: Private Group Messaging with Hidden Access Patterns

Raymond Cheng
William Scott
Elisaweta Masserova
Irene Zhang
Vipul Goyal
Thomas Anderson
Arvind Krishnamurthy
Bryan Parno

Talek is a private group messaging system that sends messages through potentially untrustworthy servers, while hiding both data content and the communication patterns among its users. Talek explores a new point in the design space of private messaging; it guarantees access sequence indistinguishability, which is among the strongest guarantees in the space, while assuming an anytrust threat model, which is only slightly weaker than the strongest threat model currently found in related work. Our results suggest that this is a pragmatic point in the design space, since it supports strong privacy and good performance: we demonstrate a 3-server Talek cluster that achieves throughput of 9,433 messages/second for 32,000 active users with 1.7-second end-to-end latency. To achieve its security goals without coordination between clients, Talek relies on information-theoretic private information retrieval. To achieve good performance and minimize server-side storage, Talek introduces new techniques and optimizations that may be of independent interest, e.g., a novel use of blocked cuckoo hashing and support for private notifications. The latter provide a private, efficient mechanism for users to learn, without polling, which logs have new messages.

Towards a Practical Differentially Private Collaborative Phone Blacklisting System

Daniele Ucci
Roberto Perdisci
Jaewoo Lee
Mustaque Ahamad

Spam phone calls have been rapidly growing from nuisance to an increasingly effective scam delivery tool. To counter this increasingly successful attack vector, a number of commercial smartphone apps that promise to block spam phone calls have appeared on app stores, and are now used by hundreds of thousands or even millions of users. However, following a business model similar to some online social network services, these apps often collect call records or other potentially sensitive information from users’ phones with little or no formal privacy guarantees.

In this paper, we study whether it is possible to build a practical collaborative phone blacklisting system that makes use of local differential privacy (LDP) mechanisms to provide clear privacy guarantees. We analyze the challenges and trade-offs related to using LDP, evaluate our LDP-based system on real-world user-reported call records collected by the FTC, and show that it is possible to learn a phone blacklist using a reasonable overall privacy budget and at the same time preserve users’ privacy while maintaining utility for the learned blacklist.

Towards Realistic Membership Inferences: The Case of Survey Data

Luke A. Bauer
Vincent Bindschaedler

We consider the problem of membership inference attacks on aggregate survey data through the use of several real-world datasets and a published study as a model for the survey. We apply membership inference attacks from the literature, and discover that methodological assumptions of existing attacks produce a misleading picture of the risk. When using a more realistic methodology, experiments reveal a more nuanced picture of the risk: membership inferences do succeed, but only a small subset of individuals are highly vulnerable to them. In fact, if the adversary wishes to avoid a high false positive rate, she should perform membership inferences only when she has a reason to believe that the target participated in the survey. However, our results do not imply that publishing survey data is inherently safe. Indeed, when applying membership inference not to individuals but to hospitals, we find that highly accurate inferences are possible.

Quantifying measurement quality and load distribution in Tor

Andre Greubel
Steffen Pohl
Samuel Kounev

Tor is a widely used anonymization network. Traffic is routed over different relay nodes to conceal the communication partners. However, if a single relay handles too much traffic, de-anonymization attacks are possible. The Tor Load Balancing Mechanism (TLBM) is responsible for balanced and secure load distribution. It must verify that relays cannot attract more traffic than they should by lying about their self-reported bandwidth. This work shows that the current bandwidth measurement method used for bandwidth verification is not suitable to verify the bandwidth of many relays. Most importantly, multiple measurements of high-bandwidth relays are uncorrelated to each other. Furthermore, we analyze the current load distribution in Tor. We show that the current load distribution reduces the resources necessary for several large-scale de-anonymization attacks by more than 80%. Additionally, as Tor favors fast relays during path selection, verifiable relays only handle a small fraction of Tor’s traffic. More precisely, we show that only 7.21% of all paths consist of entry and exit relays verifiable by measurements. We discuss these results’ security implications and argue that future TLBM research should focus at least as much on secure load distribution as on high traffic performance.

SESSION: Enterprise Security Management

SAIBERSOC: Synthetic Attack Injection to Benchmark and Evaluate the Performance of Security Operation Centers

Martin Rosso
Michele Campobasso
Ganduulga Gankhuyag
Luca Allodi

In this paper we introduce SAIBERSOC, a tool and methodology enabling security researchers and operators to evaluate the performance of deployed and operational Security Operation Centers (SOCs) (or any other security monitoring infrastructure). The methodology relies on the MITRE ATT&CK Framework to define a procedure to generate and automatically inject synthetic attacks in an operational SOC to evaluate any output metric of interest (e.g., detection accuracy, time-to-investigation, etc.). To evaluate the effectiveness of the proposed methodology, we devise an experiment with n = 124 students playing the role of SOC analysts. The experiment relies on a real SOC infrastructure and assigns students to either a BADSOC or a GOODSOC experimental condition. Our results show that the proposed methodology is effective in identifying variations in SOC performance caused by (minimal) changes in SOC configuration. We release the SAIBERSOC tool implementation as free and open source software.

Measurements of the Most Significant Software Security Weaknesses

Carlos Cardoso Galhardo
Peter Mell
Irena Bojanova
Assane Gueye

In this work, we provide a metric to calculate the most significant software security weaknesses as defined by an aggregate metric of the frequency, exploitability, and impact of related vulnerabilities. The Common Weakness Enumeration (CWE) is a well-known and used list of software security weaknesses. The CWE community publishes such an aggregate metric to calculate the ‘Most Dangerous Software Errors’. However, we find that the published equation highly biases frequency and almost ignores exploitability and impact in generating top lists of varying sizes. This is due to the differences in the distributions of the component metric values. To mitigate this, we linearize the frequency distribution using a double log function. We then propose a variety of other improvements, provide top lists of the most significant CWEs for 2019, provide an analysis of the identified software security weaknesses, and compare them against previously published top lists.

This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage

Wajih Ul Hassan
Ding Li
Kangkook Jee
Xiao Yu
Kexuan Zou
Dawei Wang
Zhengzhang Chen
Zhichun Li
Junghwan Rhee
Jiaping Gui
Adam Bates

Recent advances in the causal analysis can accelerate incident response time, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, creating greater opportunity for attackers to hide their attack footprint, gain persistency, and propagate to other machines. To address that limitation, we present Swift, a threat investigation system that provides high-throughput causality tracking and real-time causal graph generation capabilities. We design an in-memory graph database that enables space-efficient graph storage and online causality tracking with minimal disk operations. We propose a hierarchical storage system that keeps forensically-relevant part of the causal graph in main memory while evicting rest to disk. To identify the causal graph that is likely to be relevant during the investigation, we design an asynchronous cache eviction policy that calculates the most suspicious part of the causal graph and caches only that part in the main memory. We evaluated Swift on a real-world enterprise to demonstrate how our system scales to process typical event loads and how it responds to forensic queries when security alerts occur. Results show that Swift is scalable, modular, and answers forensic queries in real-time even when analyzing audit logs containing tens of millions of events.

CDL: Classified Distributed Learning for Detecting Security Attacks in Containerized Applications

Yuhang Lin
Olufogorehan Tunde-Onadele
Xiaohui Gu

Containers have been widely adopted in production computing environments for its efficiency and low overhead of isolation. However, recent studies have shown that containerized applications are prone to various security attacks. Moreover, containerized applications are often highly dynamic and short-lived, which further exacerbates the problem. In this paper, we present CDL, a classified distributed learning framework to achieve efficient security attack detection for containerized applications. CDL integrates online application classification and anomaly detection to overcome the challenge of lacking sufficient training data for dynamic short-lived containers while considering diversified normal behaviors in different applications. We have implemented a prototype of CDL and evaluated it over 33 real world vulnerability attacks in 24 commonly used server applications. Our experimental results show that CDL can reduce the false positive rate from over 12% to 0.24% compared to traditional anomaly detection schemes without aggregating training data. By introducing application classification into container behavior learning, CDL can improve the detection rate from catching 20 attacks to 31 attacks before those attacks succeed. CDL is light-weight, which can complete application classification and anomaly detection for each data sample within a few milliseconds.

On the Forensic Validity of Approximated Audit Logs

Noor Michael
Jaron Mink
Jason Liu
Sneha Gaur
Wajih Ul Hassan
Adam Bates

Auditing is an increasingly essential tool for the defense of computing systems, but the unwieldy nature of log data imposes significant burdens on administrators and analysts. To address this issue, a variety of techniques have been proposed for approximating the contents of raw audit logs, facilitating efficient storage and analysis. However, the security value of these approximated logs is difficult to measure—relative to the original log, it is unclear if these techniques retain the forensic evidence needed to effectively investigate threats. Unfortunately, prior work has only investigated this issue anecdotally, demonstrating sufficient evidence is retained for specific attack scenarios.

In this work, we address this gap in the literature through formalizing metrics for quantifying the forensic validity of an approximated audit log under differing threat models. In addition to providing quantifiable security arguments for prior work, we also identify a novel point in the approximation design space—that log events describing typical (benign) system activity can be aggressively approximated, while events that encode anomalous behavior should be preserved with lossless fidelity. We instantiate this notion of Attack-Preserving forensic validity in LogApprox, a new approximation technique that eliminates the redundancy of voluminous file I/O associated with benign process activities. We evaluate LogApprox alongside a corpus of exemplar approximation techniques from prior work and demonstrate that LogApprox achieves comparable log reduction rates while retaining 100% of attack-identifying log events. Additionally, we utilize this evaluation to illuminate the inherent trade-off between performance and utility within existing approximation techniques. This work thus establishes trustworthy foundations for the design of the next generation of efficient auditing frameworks.

SESSION: Usability and Human-centric Security

More Than Just Good Passwords? A Study on Usability and Security Perceptions of Risk-based Authentication

Stephan Wiefling
Markus Dürmuth
Luigi Lo Iacono

Risk-based Authentication (RBA) is an adaptive security measure to strengthen password-based authentication. RBA monitors additional features during login, and when observed feature values differ significantly from previously seen ones, users have to provide additional authentication factors such as a verification code. RBA has the potential to offer more usable authentication, but the usability and the security perceptions of RBA are not studied well.

We present the results of a between-group lab study (n=65) to evaluate usability and security perceptions of two RBA variants, one 2FA variant, and password-only authentication. Our study shows with significant results that RBA is considered to be more usable than the studied 2FA variants, while it is perceived as more secure than password-only authentication in general and comparably secure to 2FA in a variety of application types. We also observed RBA usability problems and provide recommendations for mitigation. Our contribution provides a first deeper understanding of the users’ perception of RBA and helps to improve RBA implementations for a broader user acceptance.

Double Patterns: A Usable Solution to Increase the Security of Android Unlock Patterns

Tim Forman
Adam Aviv

Android unlock patterns are still commonly used, and roughly 25% of the respondents to our study use a pattern when unlocking their phone. Despite security issues, the design of the patterns have remained unchanged. We propose Double Patterns (DPatts), a natural advancement on Android unlock patterns that maintains the core design but instead of selecting a single pattern, a user selects two patterns entered one-after-the-other super-imposed on the same 3x3 grid. We evaluated DPatts for both security and usability through an online study (n = 634) with three treatments: a control a first pattern entry blocklist, and a blocklist for both patterns. We find that in all settings, user chosen DPatts are more secure than traditional patterns based on standard guessability metrics, more similar to that of 4-/6-digit PINs, and even more difficult to guess for a simulated attacker. Users express positive sentiments in qualitative feedback, particularly those who currently (or previously) used Android unlock patterns, and overall, participants found the DPatts interface quite usable, with high recall retention and comparable entry times to traditional patterns. In particular, current Android pattern users, the target population for DPatts, reported SUS scores in the 80th percentile and high perceptions of security and usability in responses to open- and closed-questions. Based on these findings, we would recommend adding DPatts as an advancement to Android patterns, much like allowing for added PIN length.

Understanding User Perceptions of Security and Privacy for Group Chat: A Survey of Users in the US and UK

Sean Oesch
Ruba Abu-Salma
Oumar Diallo
Juliane Krämer
James Simmons
Justin Wu
Scott Ruoti

Secure messaging tools are an integral part of modern society. While there is a significant body of secure messaging research generally, there is a lack of information regarding users’ security and privacy perceptions and requirements for secure group chat. To address this gap, we conducted a survey of 996 respondents in the US and UK. The results of our study show that group chat presents important security and privacy challenges, some of which are not present in one-to-one chat. For example, users need to be able to manage and monitor group membership, establish trust for new group members, and filter content that they share in different chat contexts. Similarly, we find that the sheer volume of notifications that occur in group chat makes it extremely likely that users ignore important security or privacy notifications. We also find that respondents lack mechanisms for determining which tools are secure and instead rely on non-technical strategies for protecting their privacy—for example, self-filtering what they post and carefully tracking group membership. Based on these findings, we provide recommendations on how to improve the security and usability of secure group chat.

Widely Reused and Shared, Infrequently Updated, and Sometimes Inherited: A Holistic View of PIN Authentication in Digital Lives and Beyond

Hassan Khan
Jason Ceci
Jonah Stegman
Adam J. Aviv
Rozita Dara
Ravi Kuber

Personal Identification Numbers (PINs) are widely used as an access control mechanism for digital assets (e.g., smartphones), financial assets (e.g., ATM cards), and physical assets (e.g., locks for garage doors or homes). Using semi-structured interviews (n=35), participants reported on PIN usage for different types of assets, including how users choose, share, inherit, and reuse PINs, as well as behaviour following the compromise of a PIN. We find that memorability is the most important criterion when choosing a PIN, more so than security or concerns of reuse. Updating or changing a PIN is very uncommon, even when a PIN is compromised. Participants reported sharing PINs for one type of asset with acquaintances but inadvertently reused them for other assets, thereby subjecting themselves to potential risks. Participants also reported using PINs originally set by previous homeowners for physical devices (e.g., alarm or keypad door entry systems). While aware of the risks of not updating PINs, this did not always deter participants from using inherited PINs, as they were often missing instructions on how to update them. Given the expected increase in PIN-protected assets (e.g., loyalty cards, smart locks, and web apps), we provide suggestions and future research directions to better support users with multiple digital and non-digital assets and more secure human-device interaction when utilizing PINs.

Up2Dep: Android Tool Support to Fix Insecure Code Dependencies

Duc Cuong Cuong Nguyen
Erik Derr
Michael Backes
Sven Bugiel

Third-party libraries, especially outdated versions, can introduce and multiply security & privacy related issues to Android applications. While prior work has shown the need for tool support for developers to avoid libraries with security problems, no such a solution has yet been brought forward to Android. It is unclear how such a solution would work and which challenges need to be solved in realizing it.

In this work, we want to make a step forward in this direction. We propose Up2Dep, an Android Studio extension that supports Android developers in keeping project dependencies up-to-date and in avoiding insecure libraries. To evaluate the technical feasibility of Up2Dep, we publicly released Up2Dep and tested it with Android developers (N=56) in their daily tasks. Up2Dep has delivered quick-fixes that mitigate 108 outdated dependencies and 8 outdated dependencies with security problems in 34 real projects. It was perceived by those developers as being helpful. Our results also highlight technical challenges in realizing such support, for which we provide solutions and new insights.

Our results emphasize the urgent need for designated tool support to detect and update insecure outdated third-party libraries in Android apps. We believe that Up2Dep has provided a valuable step forward to improving the security of the Android ecosystem and encouraging results for tool support with a tangible impact as app developers have an easy means to fix their outdated and insecure dependencies.

SESSION: Network and Wireless Security

On the Feasibility of Automating Stock Market Manipulation

Carter Yagemann
Simon P. Chung
Erkam Uzun
Sai Ragam
Brendan Saltaformaggio
Wenke Lee

This work presents the first findings on the feasibility of using botnets to automate stock market manipulation. Our analysis incorporates data gathered from SEC case files, security surveys of online brokerages, and dark web marketplace data. We address several technical challenges, including how to adapt existing techniques for automation, the cost of hijacking brokerage accounts, avoiding detection, and more. We consolidate our findings into a working proof-of-concept, man-in-the-browser malware, Bot2Stock, capable of controlling victim email and brokerage accounts to commit fraud. We evaluate our bots and protocol using agent-based market simulations, where we find that a 1.5% ratio of bots to benign traders yields a 2.8% return on investment (ROI) per attack. Given the short duration of each attack (< 1 minute), achieving this ratio is trivial, requiring only 4 bots to target stocks like IBM. 1,000 bots, cumulatively gathered over 1 year, can turn $100,000 into $1,022,000, placing Bot2Stock on par with existing botnet scams.

Dragonblood is Still Leaking: Practical Cache-based Side-Channel in the Wild

Daniel De Almeida Braga
Pierre-Alain Fouque
Mohamed Sabt

Recently, the Dragonblood attacks have attracted new interests on the security of WPA-3 implementation and in particular on the Dragonfly code deployed on many open-source libraries. One attack concerns the protection of users passwords during authentication. In the Password Authentication Key Exchange (PAKE) protocol called Dragonfly, the secret, namely the password, is mapped to an elliptic curve point. This operation is sensitive, as it involves the secret password, and therefore its resistance against side-channel attacks is of utmost importance. Following the initial disclosure of Dragonblood, we notice that this particular attack has been partially patched by only a few implementations.

In this work, we show that the patches implemented after the disclosure of Dragonblood are insufficient. We took advantage of state-of-the-art techniques to extend the original attack, demonstrating that we are able to recover the password with only a third of the measurements needed in Dragonblood attack. We mainly apply our attack on two open-source projects: iwd (iNet Wireless Daemon) and FreeRADIUS, in order underline the practicability of our attack. Indeed, the iwd package, written by Intel, is already deployed in the Arch Linux distribution, which is well-known among security experts, and aims to offer an alternative to wpa_supplicant. As for FreeRADIUS, it is widely deployed and well-maintained upstream open-source project. We publish a full Proof of Concept of our attack, and actively participated in the process of patching the vulnerable code. Here, in a backward compatibility perspective, we advise the use of a branch-free implementation as a mitigation technique, as what was used in hostapd, due to its quite simplicity and its negligible incurred overhead.

DeepSIM: GPS Spoofing Detection on UAVs using Satellite Imagery Matching

Nian Xue
Liang Niu
Xianbin Hong
Zhen Li
Larissa Hoffaeller
Christina Pöpper

Unmanned Aerial Vehicles (UAVs), better known as drones, have significantly advanced fields such as aerial surveillance, military reconnaissance, cadastral surveying, disaster monitoring, and delivery services. However, UAVs rely on civilian (unauthenticated) GPS for navigation which can be trivially spoofed.

In this paper, we present DeepSIM, a satellite imagery matching approach to detect GPS spoofing attacks against UAVs based on deep learning. We make use of the camera(s) a typical UAV is equipped with, and present a system that compares historical satellite images of its GPS-based position (spaceborne photography) with real-time aerial images from its cameras (airborne imagery). Historical images are taken from, e. g., Google Earth or NASA WorldWind. To detect GPS spoofing attacks, we investigate different deep neural network models that compare the real-time camera images with the historical satellite images. To train and test the models, we have constructed the SatUAV dataset (consisting of 967 image pairs), partially by using real UAVs such as the DJI Phantom 4 Advanced. Real-world experimental results show that our best model has a success rate of about 95% in detecting GPS spoofing attacks within less than 100 milliseconds. Our approach does not require any modification of the existing GPS infrastructures and relies only on public satellite imagery, making it a practical solution for many everyday scenarios.

Certified Copy? Understanding Security Risks of Wi-Fi Hotspot based Android Data Clone Services

Siqi Ma
Hehao Li
Wenbo Yang
Juanru Li
Surya Nepal
Elisa Bertino

Wi-Fi hotspot-based data clone services are increasingly used by Android users to transfer their user data and preferred configurations while upgrading obsolete phones to new models. Unfortunately, since the data clone services need to manipulate sensitive information protected by the Android system, vulnerabilities in the design or implementation of these services may result in data privacy breaches. In this paper we present an empirical security analysis of eight widely used Wi-Fi hotspot-based data clone services deployed to millions of Android phones. Our study evaluates those services with respect to data export/import, data transmission, and Wi-Fi configuration with respect to security requirements that the data clone procedure should satisfy. Since data clone services are closed source, we design Poirot, an analysis system to recover workflows of the data clone services and detect potential flaws. Our study reveals a series of critical security issues in the data clone services. We demonstrate two types of attacks that exploit the data clone service as a new attack surface. A vulnerable data clone service allows attackers to retrieve sensitive user data without permissions, and even inject malicious contents to compromise the system.

DPIFuzz: A Differential Fuzzing Framework to Detect DPI Elusion Strategies for QUIC

Gaganjeet Singh Reen
Christian Rossow

QUIC is an emerging transport protocol that has the potential to replace TCP in the near future. As such, QUIC will become an important target for Deep Packet Inspection (DPI). Reliable DPI is essential, e.g., for corporate environments, to monitor traffic entering and leaving their networks. However, elusion strategies threaten the validity of DPI systems, as they allow attackers to carefully design traffic to fool and thus evade on-path DPI systems. While such elusion strategies for TCP are well documented, it is unclear if attackers will be able to elude QUIC-based DPI systems. In this paper, we systematically explore elusion methodologies for QUIC. To this end, we present DPIFuzz: a differential fuzzing framework which can automatically detect strategies to elude stateful DPI systems for QUIC. We use DPIFuzz to generate and mutate QUIC streams in order to compare (and find differences in) the server-side interpretations of five popular open-source QUIC implementations. We show that DPIFuzz successfully reveals DPI elusion strategies, such as using packets with duplicate packet numbers or exploiting the diverging handling of overlapping stream offsets by QUIC implementations. DPIFuzz additionally finds four security-critical vulnerabilities in these QUIC implementations.

SESSION: Software Security Techniques

A Flexible Framework for Expediting Bug Finding by Leveraging Past (Mis-)Behavior to Discover New Bugs

Sanjeev Das
Kedrian James
Jan Werner
Manos Antonakakis
Michalis Polychronakis
Fabian Monrose

Among various fuzzing approaches, coverage-guided grey-box fuzzing is perhaps the most prominent, due to its ease of use and effectiveness. Using this approach, the selection of inputs focuses on maximizing program coverage, e.g., in terms of the different branches that have been traversed. In this work, we begin with the observation that selecting any input that explores a new path, and giving equal weight to all paths, can lead to severe inefficiencies. For instance, although seemingly “new” crashes involving previously unexplored paths may be discovered, these often have the same root cause and actually correspond to the same bug.

To address these inefficiencies, we introduce a framework that incorporates a tighter feedback loop to guide the fuzzing process in exploring truly diverse code paths. Our framework employs (i) a vulnerability-aware selection of coverage metrics for enhancing the effectiveness of code exploration, (ii) crash deduplication information for early feedback, and (iii) a configurable input culling strategy that interleaves multiple strategies to achieve comprehensiveness. A novel aspect of our work is the use of hardware performance counters to derive coverage metrics. We present an approach for assessing and selecting the hardware events that can be used as a meaningful coverage metric for a target program. The results of our empirical evaluation using real-world programs demonstrate the effectiveness of our approach: in some cases, we explore fewer than 50% of the paths compared to a base fuzzer (AFL, MOpt, and Fairfuzz), yet on average, we improve new bug discovery by 31%, and find the same bugs (as the base) 3.3 times faster. Moreover, although we specifically chose applications that have been subject to recent fuzzing campaigns, we still discovered 9 new vulnerabilities.

Cupid : Automatic Fuzzer Selection for Collaborative Fuzzing

Emre Güler
Philipp Görz
Elia Geretto
Andrea Jemmett
Sebastian Österlund
Herbert Bos
Cristiano Giuffrida
Thorsten Holz

Combining the strengths of individual fuzzing methods is an appealing idea to find software faults more efficiently, especially when the computing budget is limited. In prior work, EnFuzz introduced the idea of ensemble fuzzing and devised three heuristics to classify properties of fuzzers in terms of diversity. Based on these heuristics, the authors manually picked a combination of different fuzzers that collaborate.

In this paper, we generalize this idea by collecting and applying empirical data from single, isolated fuzzer runs to automatically identify a set of fuzzers that complement each other when executed collaboratively. To this end, we present Cupid, a collaborative fuzzing framework allowing automated, data-driven selection of multiple complementary fuzzers for parallelized and distributed fuzzing. We evaluate the automatically selected target-independent combination of fuzzers by Cupid on Google’s fuzzer-test-suite, a collection of real-world binaries, as well as on the synthetic Lava-M dataset. We find that Cupid outperforms two expert-guided, target-specific and hand-picked combinations on Google’s fuzzer-test-suite in terms of branch coverage, and improves bug finding on Lava-M by 10%. Most importantly, we improve the latency for obtaining 95% and 99% of the coverage by 90% and 64%, respectively. Furthermore, Cupid reduces the amount of CPU hours needed to find a high-performing combination of fuzzers by multiple orders of magnitude compared to an exhaustive evaluation.

Probabilistic Naming of Functions in Stripped Binaries

James Patrick-Evans
Lorenzo Cavallaro
Johannes Kinder

Debugging symbols in binary executables carry the names of functions and global variables. When present, they greatly simplify the process of reverse engineering, but they are almost always removed (stripped) for deployment. We present the design and implementation of punstrip, a tool which combines a probabilistic fingerprint of binary code based on high-level features with a probabilistic graphical model to learn the relationship between function names and program structure. As there are many naming conventions and developer styles, functions from different applications do not necessarily have the exact same name, even if they implement the exact same functionality. We therefore evaluate punstrip across three levels of name matching: exact; an approach based on natural language processing of name components; and using Symbol2Vec, a new embedding of function names based on random walks of function call graphs. We show that our approach is able to recognize functions compiled across different compilers and optimization levels and then demonstrate that punstrip can predict semantically similar function names based on code structure. We evaluate our approach over open source C binaries from the Debian Linux distribution and compare against the state of the art.

Guide Me to Exploit: Assisted ROP Exploit Generation for ActionScript Virtual Machine

Fadi Yilmaz
Meera Sridhar
Wontae Choi

Automatic exploit generation (AEG) is the challenge of determining the exploitability of a given vulnerability by exploring all possible execution paths that can result from triggering the vulnerability. Since typical AEG implementations might need to explore an unbounded number of execution paths, they usually utilize a fuzz tester and a symbolic execution tool to facilitate this task. However, in the case of language virtual machines, such as the ActionScript Virtual Machine (AVM), AEG implementations cannot leverage fuzz testers or symbolic execution tools for generating the exploit script, because of two reasons: (1) fuzz testers cannot efficiently generate grammatically correct executables for the AVM due to the improbability of randomly generating highly-structured executables that follow the complex grammar rules and (2) symbolic execution tools encounter the well-known program-state-explosion problem due to the enormous number of control paths in early processing stages of a language virtual machine (e.g., lexing and parsing).

This paper presents GuidExp, a guided (semi-automatic) exploit generation tool for AVM vulnerabilities. GuidExp synthesizes an exploit script that exploits a given ActionScript vulnerability. Unlike other AEG implementations, GuidExp leverages exploit deconstruction, a technique of splitting the exploit script into many smaller code snippets. GuidExp receives hints from security experts and uses them to determine places where the exploit script can be split. Thus, GuidExp can concentrate on synthesizing these smaller code snippets in sequence to obtain the exploit script instead of synthesizing the entire exploit script at once. GuidExp does not rely on fuzz testers or symbolic execution tools. Instead, GuidExp performs exhaustive search adopting four optimization techniques to facilitate the AEG process: (1) exploit deconstruction, (2) operand stack verification, (3) instruction tiling, and (4) feedback from the AVM. A running example highlights how GuidExp synthesizes the exploit script for a real-world AVM use-after-free vulnerability. In addition, GuidExp’s successful generation of exploits for ten other AVM vulnerabilities is reported.

Practical Fine-Grained Binary Code Randomization†

Soumyakant Priyadarshan
Huan Nguyen
R. Sekar

Despite its effectiveness against code reuse attacks, fine-grained code randomization has not been deployed widely due to compatibility as well as performance concerns. Previous techniques often needed source code access to achieve good performance, but this breaks compatibility with today’s binary-based software distribution and update mechanisms. Moreover, previous techniques break C++ exceptions and stack tracing, which are crucial for practical deployment. In this paper, we first propose a new, tunable randomization technique called LLR(k) that is compatible with these features. Since the metadata needed to support exceptions/stack-tracing can reveal considerable information about code layout, we propose a new entropy metric that accounts for leaks of this metadata. We then present a novel metadata reduction technique to significantly increase entropy without degrading exception handling. This enables LLR(k) to achieve strong entropy with a low overhead of 2.26%.

SESSION: System and Hardware Security

Faulty Point Unit: ABI Poisoning Attacks on Intel SGX

Fritz Alder
Jo Van Bulck
David Oswald
Frank Piessens

This paper analyzes a previously overlooked attack surface that allows unprivileged adversaries to impact supposedly secure floating-point computations in Intel SGX enclaves through the Application Binary Interface (ABI). In a comprehensive study across 7 widely used industry-standard and research enclave shielding runtimes, we show that control and state registers of the x87 Floating-Point Unit (FPU) and Intel Streaming SIMD Extensions (SSE) are not always properly sanitized on enclave entry. First, we abuse the adversary’s control over precision and rounding modes as a novel “ABI-level fault injection” primitive to silently corrupt enclaved floating-point operations, enabling a new class of stealthy, integrity-only attacks that disturb the result of SGX enclave computations. Our analysis reveals that this threat is especially relevant for applications that use the older x87 FPU, which is still being used under certain conditions for high-precision operations by modern compilers like gcc. We exemplify the potential impact of ABI-level quality-degradation attacks in a case study of an enclaved machine learning service and in a larger analysis on the SPEC benchmark programs. Second, we explore the impact on enclave confidentiality by showing that the adversary’s control over floating-point exception masks can be abused as an innovative controlled channel to detect FPU usage and to recover enclaved multiplication operands in certain scenarios. Our findings, affecting 5 out of the 7 studied runtimes, demonstrate the fallacy and challenges of implementing high-assurance trusted execution environments on contemporary x86 hardware. We responsibly disclosed our findings to the vendors and were assigned two CVEs, leading to patches in the Intel SGX-SDK, Microsoft OpenEnclave, the Rust compiler’s SGX target, and Go-TEE.

Reboot-Oriented IoT: Life Cycle Management in Trusted Execution Environment for Disposable IoT devices

Kuniyasu Suzaki
Akira Tsukamoto
Andy Green
Mohammad Mannan

Many IoT devices are geographically distributed without human administrators, which are maintained by a remote server to enforce security updates, ideally through machine-to-machine (M2M) management. However, malware often terminates the remote control mechanism immediately after compromise and hijacks the device completely. The compromised device has no way to recover and becomes part of a botnet. Even if the IoT device remains uncompromised, it is required to update due to recall or other reasons. In addition, the device is desired to be automatically disposable after the expiration of its service, software, or device hardware to prevent being cyber debris.

We present Reboot-Oriented IoT (RO-IoT), which updates the total OS image autonomously to recover from compromise (rootkit or otherwise), and manages the life cycle of the device using Trusted Execution Environment (TEE) and PKI-based certificates (i.e., CA, server, and client certificates which are linked to device, software, and service). RO-IoT is composed of three TEE-protected components: the secure network bootloader, periodic memory forensics, and life cycle management. The secure network bootloader downloads and verifies the OS image by the TEE. The periodic memory forensics causes a hardware system-reset (i.e., reboot) after detecting any un-registered binary or a time-out, which depends on a TEE-protected watchdog timer. The life cycle management checks the expiration of PKI-based certificates for the device, software, and service, and deactivates the device if necessary. These features complement each other, and all binaries and certificates are encrypted or protected by TEE. We implemented a prototype of RO-IoT on an ARM Hikey board with the open source trusted OS OP-TEE. The design and implementation take account of availability (over 99.9%) and scalability (less than 100MB traffic for a full OS update, and estimated at a cent per device), making the current prototype specifically suitable for the AI-Edge (Artificial Intelligence on the Edge) IoT devices.

RusTEE: Developing Memory-Safe ARM TrustZone Applications

Shengye Wan
Mingshen Sun
Kun Sun
Ning Zhang
Xu He

In the past decade, Trusted Execution Environment (TEE) provided by ARM TrustZone is becoming one of the primary techniques for enhancing the security of mobile devices. The isolation enforced by TrustZone can protect the trusted applications running in the TEE against malicious software in the untrusted rich execution environment (REE). However, TrustZone cannot completely prevent vulnerabilities in trusted applications residing in the TEE, which can then be used to attack other trusted applications or even the trusted OS. Previously, a number of memory corruption vulnerabilities have been reported on different TAs, which are written in memory-unsafe languages like C.

Recently, various memory-safe programming languages have emerged to mitigate the prevalent memory corruption bugs. In this paper, we propose RusTEE, a trusted application mechanism that leverages Rust, a newly emerged memory-safe language, to enhance the security of TAs. Though the high-level idea is quite straight-forwarding, we resolve several challenges on adopting Rust in mobile TEEs. Specifically, since Rust currently does not support any TrustZone-assisted TEE systems, we extend the existing Rust compiler for providing such support. Also, we apply comprehensive security mechanisms to resolve two security issues of trusted applications, namely, securely invoking high-privileged system services and securely communicating with untrusted REE. We implement a prototype of RusTEE as the trusted applications’ SDK, which supports both emulator and real hardware devices. The experiment shows that RusTEE can compile applications with close-to-C performance on the evaluated platforms.

HeapExpo: Pinpointing Promoted Pointers to Prevent Use-After-Free Vulnerabilities

Zekun Shen
Brendan Dolan-Gavitt

Use-after-free (UAF) vulnerabilities, in which dangling pointers remain after memory is released, remain a persistent problem for applications written in C and C++. In order to protect legacy code, prior work has attempted to track pointer propagation and invalidate dangling pointers at deallocation time, but this work has gaps in coverage, as it lacks support for tracking program variables promoted to CPU registers. Moreover, we find that these gaps can significantly hamper detection of UAF bugs: in a preliminary study with OSS-Fuzz, we found that more than half of the UAFs in real-world programs we examined (10/19) could not be detected by prior systems due to register promotion. In this paper, we introduce HeapExpo, a new system that fills this gap in coverage by parsimoniously identifying potential dangling pointer variables that may be lifted into registers by the compiler and marking them as volatile. In our experiments, we find that HeapExpo effectively detects UAFs missed by other systems with an overhead of 35% on the majority of SPEC CPU2006 and 66% when including two benchmarks that have high amounts of pointer propagation.

ρFEM: Efficient Backward-edge Protection Using Reversed Forward-edge Mappings

Paul Muntean
Mathias Neumayer
Zhiqiang Lin
Gang Tan
Jens Grossklags
Claudia Eckert

In this paper, we propose reversed forward-edge mapper (ρFEM), a Clang/LLVM compiler-based tool, to protect the backward edges of a program’s control flow graph (CFG) against runtime control-flow hijacking (e.g., code reuse attacks). It protects backward-edge transfers in C/C++ originating from virtual and non-virtual functions by first statically constructing a precise virtual table hierarchy, with which to form a precise forward-edge mapping between callees and non-virtual calltargets based on precise function signatures, and then checks each instrumented callee return against the previously computed set at runtime. We have evaluated ρFEM using the Chrome browser, NodeJS, Nginx, Memcached, and the SPEC CPU2017 benchmark. Our results show that ρFEM enforces less than 2.77 return targets per callee in geomean, even for applications heavily relying on backward edges. ρFEM’s runtime overhead is less than 1% in geomean for the SPEC CPU2017 benchmark and 3.44% in geomean for the Chrome browser.

SESSION: Distributed Systems and Cloud Security

Constrained Concealment Attacks against Reconstruction-based Anomaly Detectors in Industrial Control Systems

Alessandro Erba
Riccardo Taormina
Stefano Galelli
Marcello Pogliani
Michele Carminati
Stefano Zanero
Nils Ole Tippenhauer

Recently, reconstruction-based anomaly detection was proposed as an effective technique to detect attacks in dynamic industrial control networks. Unlike classical network anomaly detectors that observe the network traffic, reconstruction-based detectors operate on the measured sensor data, leveraging physical process models learned a priori.

In this work, we investigate different approaches to evade prior-work reconstruction-based anomaly detectors by manipulating sensor data so that the attack is concealed. We find that replay attacks (commonly assumed to be very strong) show bad performance (i.e., increasing the number of alarms) if the attacker is constrained to manipulate less than 95% of all features in the system, as hidden correlations between the features are not replicated well. To address this, we propose two novel attacks that manipulate a subset of the sensor readings, leveraging learned physical constraints of the system. Our attacks feature two different attacker models: A white box attacker, which uses an optimization approach with a detection oracle, and a black box attacker, which uses an autoencoder to translate anomalous data into normal data. We evaluate our implementation on two different datasets from the water distribution domain, showing that the detector’s Recall drops from 0.68 to 0.12 by manipulating 4 sensors out of 82 in WADI dataset. In addition, we show that our black box attacks are transferable to different detectors: They work against autoencoder-, LSTM-, and CNN-based detectors. Finally, we implement and demonstrate our attacks on a real industrial testbed to demonstrate their feasibility in real-time.

Workflow Integration Alleviates Identity and Access Management in Serverless Computing

Arnav Sankaran
Pubali Datta
Adam Bates

As serverless computing continues to revolutionize the design and deployment of web services, it has become an increasingly attractive target to attackers. These adversaries are developing novel tactics for circumventing the ephemeral nature of serverless functions, exploiting container reuse optimizations and achieving lateral movement by “living off the land” provided by legitimate serverless workflows. Unfortunately, the traditional security controls currently offered by cloud providers are inadequate to counter these new threats.

In this work, we propose will.iam,1 a workflow-aware access control model and reference monitor that satisfies the functional requirements of the serverless computing paradigm. will.iam encodes the protection state of a serverless application as a permissions graph that describes the permissible transitions of its workflows, associating web requests with a permissions set at the point of ingress according to a graph-based labeling state. By proactively enforcing the permissions requirements of downstream workflow components, will.iam is able to avoid the costs of partially processing unauthorized requests and reduce the attack surface of the application. We implement the will.iam framework in Go and evaluate its performance as compared to recent related work against the well-established Nordstrom “Hello, Retail!” application. We demonstrate that will.iam imposes minimal burden to requests, averaging 0.51% overhead across representative workflows, but dramatically improves performance when handling unauthorized requests (e.g., DDoS attacks) as compared to past solutions. will.iam thus demonstrates an effective and practical alternative for authorization in the serverless paradigm.

Privacy-Preserving Production Process Parameter Exchange

Jan Pennekamp
Erik Buchholz
Yannik Lockner
Markus Dahlmanns
Tiandong Xi
Marcel Fey
Christian Brecher
Christian Hopmann
Klaus Wehrle

Nowadays, collaborations between industrial companies always go hand in hand with trust issues, i.e., exchanging valuable production data entails the risk of improper use of potentially sensitive information. Therefore, companies hesitate to offer their production data, e.g., process parameters that would allow other companies to establish new production lines faster, against a quid pro quo. Nevertheless, the expected benefits of industrial collaboration, data exchanges, and the utilization of external knowledge are significant.

In this paper, we introduce our Bloom filter-based Parameter Exchange (BPE), which enables companies to exchange process parameters privacy-preservingly. We demonstrate the applicability of our platform based on two distinct real-world use cases: injection molding and machine tools. We show that BPE is both scalable and deployable for different needs to foster industrial collaborations. Thereby, we reward data-providing companies with payments while preserving their valuable data and reducing the risks of data leakage.

Efficient Oblivious Substring Search via Architectural Support

Nicholas Mainardi
Davide Sampietro
Alessandro Barenghi
Gerardo Pelosi

Performing private and efficient searches over encrypted outsourced data enables a flourishing growth of cloud based services managing sensitive data as the genomic, medical and financial ones. We tackle the problem of building an efficient indexing data structure, enabling the secure and private execution of substring search queries over an outsourced document collection. Our solution combines the efficiency of an index-based substring search algorithm with the secure-execution features provided by the SGX technology and the access pattern indistinguishability guarantees provided by an Oblivious RAM. To prevent the information leakage from the access pattern side-channel vulnerabilities affecting SGX based applications, we redesign three ORAM algorithms, and perform a comparative evaluation to find the best engineering trade-offs for a privacy-preserving index-based substring search protocol. The practicality of our solution is supported by a response time of about 1 second to retrieve all the positions of a protein in the 3 GB string of the human genome.

SERENIoT: Distributed Network Security Policy Management and Enforcement for Smart Homes

Corentin Thomasset
David Barrera

Selectively allowing network traffic has emerged as a dominant approach for securing consumer IoT devices. However, determining what the allowed behavior of an IoT device should be remains an open challenge. Proposals to date have relied on manufacturers and trusted parties to provide allow lists of network traffic, but these proposals require manufacturer involvement or placing trust in an additional stakeholder. Alternatively, locally monitoring devices can allow building allow lists of observed behavior, but devices may not exhaust their functionality set during the observation period, and the behavior may change following a software update which requires re-training. This paper proposes a blockchain-based system for determining whether an IoT device is behaving like other devices of the same type. Our system, SERENIoT, overcomes the challenge of initially determining the correct behavior for a device. Nodes in the SERENIoT public blockchain submit summaries of the network behavior observed for connected IoT devices and build allow lists of behavior observed by the majority of nodes. Changes in behavior through software updates are automatically added to the allow list once the update is broadly deployed. Through a proof-of-concept implementation of SERENIoT on a small IoT network and a large-scale Amazon EC2 simulation, we evaluate the security, scalability, and performance of our system.

SESSION: Software Security and Data Protection

Effect of Security Controls on Patching Window: A Causal Inference based Approach

Aditya Kuppa
Lamine Aouad
Nhien-An Le-Khac

In many organisations there are up to 15 security controls that help defenders accurately identify and prioritise information security risks. Due to the lack of clarity into the effectiveness and capabilities of these defences, and poor visibility to overall risk posture has led to a crisis of prioritisation. Lately, organisations rely on scenario based red teaming exercises which test the contribution of a security control to the security preparedness of the organisation, and testing the resilience of a control. However, these assessments don’t quantify the effect of controls on the security policies already in place. Measuring this effect can help stakeholders to re-calibrate and effectively prioritise their risks.

In this work, we propose a causal inference based approach to understand the influence of security control on patching behaviour in the organisations. We introduce a novel scoring function for security controls based on 6 criteria to evaluate its effectiveness. Utilising the scoring function and state of art causal inference methods we estimate the average effect (in days) of a control in patching policy of an organisation. We also assess the influence of individual control for CVE’s which have high vs low CVSS scores.

We validate the proposed method on observational data collected from 2000 organisations with varied asset sizes. We estimate that on an average there is a delay of 9.5 days in the patching of a CVE due to the presence of security controls on an asset. We also analyse the assumptions and algorithms with refuting methods to validate the predicted estimates and generalisation of the observed outcomes.

NoSQL Breakdown: A Large-scale Analysis of Misconfigured NoSQL Services

Dario Ferrari
Michele Carminati
Mario Polino
Stefano Zanero

In the last years, NoSQL databases have grown in popularity due to their easy-to-deploy, reliable, and scalable storage mechanism. While most NoSQL services offer access control mechanisms, their default configurations grant access without any form of authentication, resulting in misconfigurations that may expose data to the Internet, as demonstrated by the recent high-profile data leaks.

In this paper, we investigate the usage of the most popular NoSQL databases, focusing on automatically analyzing and discovering misconfigurations that may lead to security and privacy issues. We developed a tool that automatically scans large IP subnets to detect the exposed services and performs security analyses without storing nor exposing sensitive data.

We analyzed 67,725,641 IP addresses between October 2019 and March 2020, spread across several Cloud Service Providers (CSPs), and found 12,276 misconfigured databases. The risks associated with exposed services range from data leaking, which may pose a significant menace to users’ privacy, to data tampering of resources stored in the vulnerable databases, which may pose a relevant threat to a web service reputation. Regarding the last point, we found 742 potentially vulnerable websites linked to misconfigured instances with the write permission enabled to anonymous users.

GuardSpark++: Fine-Grained Purpose-Aware Access Control for Secure Data Sharing and Analysis in Spark

Tao Xue
Yu Wen
Bo Luo
Boyang Zhang
Yang Zheng
Yanfei Hu
Yingjiu Li
Gang Li
Dan Meng

With the development of computing and communication technologies, extremely large amount of data has been collected, stored, utilized, and shared, while new security and privacy challenges arise. Existing platforms do not provide flexible and practical access control mechanisms for big data analytics applications. In this paper, we present GuardSpark++, a fine-grained access control mechanism for secure data sharing and analysis in Spark. In particular, we first propose a purpose-aware access control (PAAC) model, which introduces new concepts of data processing/operation purposes to conventional purpose-based access control. An automatic purpose analysis algorithm is developed to identify purposes from data analytics operations and queries, so that access control could be enforced accordingly. Moreover, we develop an access control mechanism in Spark Catalyst, which provides unified PAAC enforcement for heterogeneous data sources and upper-layer applications. We evaluate GuardSpark++ with five data sources and four structured data analytics engines in Spark. The experimental results show that GuardSpark++ provides effective access control functionalities with a very small performance overhead (average 3.97%).

Understanding Promotion-as-a-Service on GitHub

Kun Du
Hao Yang
Yubao Zhang
Haixin Duan
Haining Wang
Shuang Hao
Zhou Li
Min Yang

As the world’s leading software development platform, GitHub has become a social networking site for programmers and recruiters who leverage its social features, such as star and fork, for career and business development. However, in this paper, we found a group of GitHub accounts that conducted promotion services in GitHub, called “promoters”, by performing paid star and fork operations on specified repositories. We also uncovered a stealthy way of tampering with historical commits, through which these promoters are able to fake commits retroactively. By exploiting such a promotion service, any GitHub user can pretend to be a skillful developer with high influence.

To understand promotion services in GitHub, we first investigated the underground promotion market of GitHub and identified 1,023 suspected promotion accounts from the market. Then, we developed an SVM (Support Vector Machine) classifier to detect promotion accounts from all active users extracted from GH Archive ranging from 2015 to 2019. In total, we detected 63,872 suspected promotion accounts. We further analyzed these suspected promotion accounts, showing that (1) a hidden functionality in GitHub is abused to boost the reputation of an account by forging historical commits and (2) a group of small businesses exploit GitHub promotion services to promote their products. We estimated that suspicious promoters could have made a profit of $3.41 million and $4.37 million in 2018 and 2019, respectively.

Query-Efficient Black-Box Attack Against Sequence-Based Malware Classifiers

Ishai Rosenberg
Asaf Shabtai
Yuval Elovici
Lior Rokach

In this paper, we present a generic, query-efficient black-box attack against API call-based machine learning malware classifiers. We generate adversarial examples by modifying the malware’s API call sequences and non-sequential features (printable strings), and these adversarial examples will be misclassified by the target malware classifier without affecting the malware’s functionality. In contrast to previous studies, our attack minimizes the number of malware classifier queries required. In addition, in our attack, the attacker must only know the class predicted by the malware classifier; attacker knowledge of the malware classifier’s confidence score is optional. We evaluate the attack effectiveness when attacks are performed against a variety of malware classifier architectures, including recurrent neural network (RNN) variants, deep neural networks, support vector machines, and gradient boosted decision trees. Our attack success rate is around 98% when the classifier’s confidence score is known and 64% when just the classifier’s predicted class is known. We implement four state-of-the-art query-efficient attacks and show that our attack requires fewer queries and less knowledge about the attacked model’s architecture than other existing query-efficient attacks, making it practical for attacking cloud-based malware classifiers at a minimal cost.

SESSION: Web and Network Security

FPSelect: Low-Cost Browser Fingerprints for Mitigating Dictionary Attacks against Web Authentication Mechanisms

Nampoina Andriamilanto
Tristan Allard
Gaëtan Le Guelvouit

Browser fingerprinting consists into collecting attributes from a web browser. Hundreds of attributes have been discovered through the years. Each one of them provides a way to distinguish browsers, but also comes with a usability cost (e.g., additional collection time). In this work, we propose FPSelect, an attribute selection framework allowing verifiers to tune their browser fingerprinting probes for web authentication. We formalize the problem as searching for the attribute set that satisfies a security requirement and minimizes the usability cost. The security is measured as the proportion of impersonated users given a fingerprinting probe, a user population, and an attacker that knows the exact fingerprint distribution among the user population. The usability is quantified by the collection time of browser fingerprints, their size, and their instability. We compare our framework with common baselines, based on a real-life fingerprint dataset, and find out that in our experimental settings, our framework selects attribute sets of lower usability cost. Compared to the baselines, the attribute sets found by FPSelect generate fingerprints that are up to 97 times smaller, are collected up to 3,361 times faster, and with up to 7.2 times less changing attributes between two observations, on average.

Security Study of Service Worker Cross-Site Scripting.

Phakpoom Chinprutthiwong
Raj Vardhan
GuangLiang Yang
Guofei Gu

Nowadays, modern websites are utilizing service workers to provide users with app-like functionalities such as offline mode and push notifications. To handle such features, the service worker is equipped with special privileges including HTTP traffic manipulation. Thus, it is designed with security as a priority. However, we find that many websites introduce a questionable practice that can jeopardize the security of a service worker.

In this work, we demonstrate how this practice can result in a cross-site scripting (XSS) attack inside a service worker, allowing an attacker to obtain and leverage service worker privileges. Due to the uniqueness of these privileges, such attacks can lead to more severe consequences compared to a typical XSS attack. We term this type of vulnerability as Service Worker based Cross-Site Scripting (SW-XSS). To assess the real-world security impact, we develop a tool called SW-Scanner and use it to analyze top websites in the wild. Our findings reveal a worrisome trend. In total, we find 40 websites vulnerable to this attack including several popular and high ranking websites. Finally, we discuss potential defense solutions to mitigate the SW-XSS vulnerability.

CAPS: Smoothly Transitioning to a More Resilient Web PKI

Stephanos Matsumoto
Jay Bosamiya
Yucheng Dai
Paul van Oorschot
Bryan Parno

Many recent proposals to increase the resilience of the Web PKI against misbehaving CAs face significant obstacles to deployment. These hurdles include (1) the requirement of drastic changes to the existing PKI players and their interactions, (2) the lack of signaling mechanisms to protect against downgrade attacks, (3) the lack of an incremental deployment strategy, and (4) the use of inflexible mechanisms that hinder recovery from misconfiguration or from the loss or compromise of private keys. As a result, few of these proposals have seen widespread deployment, despite their promise of a more secure Web PKI. To address these roadblocks, we propose Certificates with Automated Policies and Signaling (CAPS), a system that leverages the infrastructure of the existing Web PKI to overcome the aforementioned hurdles. CAPS offers a seamless and secure transition away from today’s insecure Web PKI and towards present and future proposals to improve the Web PKI. Crucially, with CAPS, domains can take simple steps to protect themselves from MITM attacks in the presence of one or more misbehaving CAs, and yet the interaction between domains and CAs remains fundamentally the same. We implement CAPS and show that it adds at most 5% to connection establishment latency.

dStyle-GAN: Generative Adversarial Network based on Writing and Photography Styles for Drug Identification in Darknet Markets

Yiming Zhang
Yiyue Qian
Yujie Fan
Yanfang Ye
Xin Li
Qi Xiong
Fudong Shao

Despite the persistent effort by law enforcement, illicit drug trafficking in darknet markets has shown great resilience with new markets rapidly appearing after old ones being shut down. In order to more effectively detect, disrupt and dismantle illicit drug trades, there’s an imminent need to gain a deeper understanding toward the operations and dynamics of illicit drug trading activities. To address this challenge, in this paper, we design and develop an intelligent system (named dSytle-GAN) to automate the analysis for drug identification in darknet markets, by considering both content-based and style-aware information. To determine whether a given pair of posted drugs are the same or not, in dStyle-GAN, based on the large-scale data collected from darknet markets, we first present an attributed heterogeneous information network (AHIN) to depict drugs, vendors, texts and writing styles, photos and photography styles, and the rich relations among them; and then we propose a novel generative adversarial network (GAN) based model over AHIN to capture the underlying distribution of posted drugs’ writing and photography styles to learn robust representations of drugs for their identifications. Unlike existing approaches, our proposed GAN-based model jointly considers the heterogeneity of network and relatedness over drugs formulated by domain-specific meta-paths for robust node (i.e., drug) representation learning. To the best of our knowledge, the proposed dStyle-GAN represents the first principled GAN-based solution over graphs to simultaneously consider writing and photography styles as well as their latent distributions for node representation learning. Extensive experimental results based on large-scale datasets collected from six darknet markets and the obtained ground-truth demonstrate that dStyle-GAN outperforms the state-of-the-art methods. Based on the identified drug pairs in the wild by dStyle-GAN, we perform further analysis to gain deeper insights into the dynamics and evolution of illicit drug trading activities in darknet markets, whose findings may facilitate law enforcement for proactive interventions.

Session Key Distribution Made Practical for CAN and CAN-FD Message Authentication

Yang Xiao
Shanghao Shi
Ning Zhang
Wenjing Lou
Y. Thomas Hou

Automotive communication networks, represented by the CAN bus, are acclaimed for enabling real-time communication between vehicular ECUs but also criticized for their lack of effective security mechanisms. Various attacks have demonstrated that this security deficit renders a vehicle vulnerable to adversarial control that jeopardizes passenger safety. A recent standardization effort led by AUTOSAR has provided general guidelines for developing next-generation automotive communication technologies with built-in security mechanisms. A key security mechanism is message authentication between ECUs for countering message spoofing and replay attack. While many message authentication schemes have been proposed by previous work, the important issue of session key establishment with AUTOSAR compliance was not well addressed. In this paper, we fill this gap by proposing an AUTOSAR-compliant key management architecture that takes into account practical requirements imposed by the automotive environment. Based on this architecture, we describe a baseline session key distribution protocol called SKDC that realizes all designed security functionalities, and propose a novel secret-sharing-based protocol called SSKT that yields improved communication efficiency. Both SKDC and SSKT are customized for CAN/CAN-FD bus deployment. We implemented the two protocols on commercial microcontroller boards and evaluated their performance with hardware experiment and extrapolation analysis. The result shows while both protocols are performant, SSKT achieves superior computation and communication efficiency at scale.

SESSION: Embedded System and IoT Security

LeakyPick: IoT Audio Spy Detector

Richard Mitev
Anna Pazii
Markus Miettinen
William Enck
Ahmad-Reza Sadeghi

Manufacturers of smart home Internet of Things (IoT) devices are increasingly adding voice assistant and audio monitoring features to a wide range of devices including smart speakers, televisions, thermostats, security systems, and doorbells. Consequently, many of these devices are equipped with microphones, raising significant privacy concerns: users may not always be aware of when audio recordings are sent to the cloud, or who may gain access to the recordings. In this paper, we present the LeakyPick architecture that enables the detection of the smart home devices that stream recorded audio to the Internet in response to observing a sound. Our proof-of-concept is a LeakyPick device that is placed in a user’s smart home and periodically “probes” other devices in its environment and monitors the subsequent network traffic for statistical patterns that indicate audio transmission. Our prototype is built on a Raspberry Pi for less than USD $40 and has a measurement accuracy of 94% in detecting audio transmissions for a collection of 8 devices with voice assistant capabilities. Furthermore, we used LeakyPick to identify 89 words that an Amazon Echo Dot misinterprets as its wake-word, resulting in unexpected audio transmission. LeakyPick provides a cost effective approach to help regular consumers monitor their homes for sound-triggered devices that unexpectedly transmit audio to the cloud.

IvoriWatch: Exploring Transparent Integrity Verification of Remote User Input Leveraging Wearables

Prakash Shrestha
Zengrui Liu
Nitesh Saxena

Several sensitive operations, such as financial transactions, email construction, configurations of safety-critical devices (e.g., medical devices or smart home systems), are often performed via web interfaces from a host machine, usually a desktop or laptop PC. It is typically easy to secure the communication link between the local host machine and the remote server, for example, via a standard cryptographic protocol (e.g., TLS). However, if the host machine itself is compromised with a trojan or malware, the malicious adversary can manipulate the user-provided input (e.g., money transfer information, email content and configuration data) that can lead to severe consequences, including financial loss, damage of reputation, security breach, and even put human lives in danger.

In this paper, we introduce the notion of integrity verification for the user-provided input leveraging a wrist-worn wearable device (e.g., a watch or a bracelet). Specifically, we propose IvoriWatch1, a transparent and secure integrity verification mechanism, that inspects the user-provided input from a compromised host machine to a remote server for its integrity before acting upon the input. IvoriWatch requires the user to wear a wrist-wearable (either on one hand or both hands for better security). It verifies the validity of the payload/input received at the remote server by comparing it (i.e., the corresponding sequence of keyboard regions – left or right) with the predicted ones based on the wrist motions captured by the wrist-wearable. Only when the user input sufficiently correlates with the wrist motion data, the input is considered legitimate. We build a prototype implementation of IvoriWatch on an Android smartwatch as the wrist-wearable and a desktop PC terminal as a host machine, and evaluate it under benign and adversarial settings. Our results suggest that IvoriWatch can correctly detect the legitimacy of the input in the benign setting, and the manipulated as well as unintended input from a malicious program in the adversarial settings with minimal errors. Although IvoriWatch uses wrist movements for integrity verification, it is not a biometric scheme.

Verify&Revive: Secure Detection and Recovery of Compromised Low-end Embedded Devices

Mahmoud Ammar
Bruno Crispo

Tiny and specialized computing platforms, so-called embedded or Internet of Things (IoT) devices, are increasingly used in safety- and privacy-critical application scenarios. A significant number of such devices offer limited or no security features, making them attractive targets for a wide variety of cyber attacks, exemplified by malware infestations. One key component in securing these devices is establishing a root of trust, which is typically attained via remote attestation (RA), a security service that aims to ascertain the current state of a remote device and detect any malicious tampering. Although several (software-based, hardware-based, and hybrid) RA approaches have been proposed to address this problem, two main issues remain, regardless of the type of RA. First, all but one of the existing RA approaches are vulnerable to Time-Of-Check Time-Of-Use (TOCTOU) attack, where a transient malware may infect the corresponding embedded device between two consecutive RA routines without being detected. Second, little attention has been devoted to efficiently and securely rescuing devices that are determined to be compromised, increasing the maintenance cost of IoT deployments, especially in industrial control systems, where (re-)deploying a new device is often a cost-sensitive operation.

Motivated by the fact that many low-end devices neither support hardware-based RA nor can afford hardware modifications required by hybrid approaches, we tackle the aforementioned issues by proposing Verify&Revive, the first reliable pure-software approach to remote attestation with recovery techniques, targeting the low-end range of IoT devices. It consists of two components: Verify and Revive. Verify is a TOCTOU-secure RA scheme with a built-in secure erasure module that is automatically executed as a countermeasure in case of detection of a malware infection on the IoT device. Revive is a secure code update scheme that is executed upon request to install regular updates or as a recovery technique to restore the last benign settings of the cleaned, yet non-functioning, IoT device. A proof of attestation, erasure, and update/recovery is obtained relying on trustworthy software, leveraging and extending a formally-verified software-based memory isolation technique, called the Security MicroVisor (SμV). We implement and evaluate Verify&Revive on industrial resource-constrained IoT devices, showing very low overhead in terms of a memory footprint, performance, and battery lifetime.

FirmAE: Towards Large-Scale Emulation of IoT Firmware for Dynamic Analysis

Mingeun Kim
Dongkwan Kim
Eunsoo Kim
Suryeon Kim
Yeongjin Jang
Yongdae Kim

One approach to assess the security of embedded IoT devices is applying dynamic analysis such as fuzz testing to their firmware in scale. To this end, existing approaches aim to provide an emulation environment that mimics the behavior of real hardware/peripherals. Nonetheless, in practice, such approaches can emulate only a small fraction of firmware images. For example, Firmadyne, a state-of-the-art tool, can only run 183 (16.28%) of 1,124 wireless router/IP-camera images that we collected from the top eight manufacturers. Such a low emulation success rate is caused by discrepancy in the real and emulated firmware execution environment.

In this study, we analyzed the emulation failure cases in a large-scale dataset to figure out the causes of the low emulation rate. We found that widespread failure cases often avoided by simple heuristics despite having different root causes, significantly increasing the emulation success rate. Based on these findings, we propose a technique, arbitrated emulation, and we systematize several heuristics as arbitration techniques to address these failures. Our automated prototype, FirmAE, successfully ran 892 (79.36%) of 1,124 firmware images, including web servers, which is significantly (≈ 4.8x) more images than that run by Firmadyne. Finally, by applying dynamic testing techniques on the emulated images, FirmAE could check 320 known vulnerabilities (306 more than Firmadyne), and also find 12 new 0-days in 23 devices.

Device-agnostic Firmware Execution is Possible: A Concolic Execution Approach for Peripheral Emulation

Chen Cao
Le Guan
Jiang Ming
Peng Liu

With the rapid proliferation of IoT devices, our cyberspace is nowadays dominated by billions of low-cost computing nodes, which are very heterogeneous to each other. Dynamic analysis, one of the most effective approaches to finding software bugs, has become paralyzed due to the lack of a generic emulator capable of running diverse previously-unseen firmware. In recent years, we have witnessed devastating security breaches targeting low-end microcontroller-based IoT devices. These security concerns have significantly hamstrung further evolution of the IoT technology. In this work, we present Laelaps, a device emulator specifically designed to run diverse software of microcontroller devices. We do not encode into our emulator any specific information about a device. Instead, Laelaps infers the expected behavior of firmware via symbolic-execution-assisted peripheral emulation and generates proper inputs to steer concrete execution on the fly. This unique design feature makes Laelaps capable of running diverse firmware with no a priori knowledge about the target device. To demonstrate the capabilities of Laelaps, we applied dynamic analysis techniques on top of our emulator. We successfully identified both self-injected and real-world vulnerabilities.

SESSION: Applied Cryptography

Set It and Forget It! Turnkey ECC for Instant Integration

Dmitry Belyavsky
Billy Bob Brumley
Jesús-Javier Chi-Domínguez
Luis Rivera-Zamarripa
Igor Ustinov

Historically, Elliptic Curve Cryptography (ECC) is an active field of applied cryptography where recent focus is on high speed, constant time, and formally verified implementations. While there are a handful of outliers where all these concepts join and land in real-world deployments, these are generally on a case-by-case basis: e.g. a library may feature such X25519 or P-256 code, but not for all curves. In this work, we propose and implement a methodology that fully automates the implementation, testing, and integration of ECC stacks with the above properties. We demonstrate the flexibility and applicability of our methodology by seamlessly integrating into three real-world projects: OpenSSL, Mozilla’s NSS, and the GOST OpenSSL Engine, achieving roughly 9.5x, 4.5x, 13.3x, and 3.7x speedup on any given curve for key generation, key agreement, signing, and verifying, respectively. Furthermore, we showcase the efficacy of our testing methodology by uncovering flaws and vulnerabilities in OpenSSL, and a specification-level vulnerability in a Russian standard. Our work bridges the gap between significant applied cryptography research results and deployed software, fully automating the process.

Practical Over-Threshold Multi-Party Private Set Intersection

Rasoul Akhavan Mahdavi
Thomas Humphries
Bailey Kacsmar
Simeon Krastnikov
Nils Lukas
John A. Premkumar
Masoumeh Shafieinejad
Simon Oya
Florian Kerschbaum
Erik-Oliver Blass

Over-Threshold Multi-Party Private Set Intersection (OT-MP-PSI) is the problem where several parties, each holding a set of elements, want to know which elements appear in at least t sets, for a certain threshold t, without revealing any information about elements that do not meet this threshold. This problem has many practical applications, but current solutions require a number of expensive operations exponential in t and thus are impractical.

In this work we introduce two new OT-MP-PSI constructions using more efficient techniques. Our more refined scheme, which we call , runs in three communication rounds. achieves communication complexity that is linear in the number of parties, the number of elements they hold, and the intersection threshold. The computational cost of is still exponential in t, but it relies on cheap linear operations and thus it is still practical. We implement our new constructions to validate their practicality for varying thresholds, number of parties, and dataset size.

Secure and Verifiable Inference in Deep Neural Networks

Guowen Xu
Hongwei Li
Hao Ren
Jianfei Sun
Shengmin Xu
Jianting Ning
Haomiao Yang
Kan Yang
Robert H. Deng

Outsourced inference service has enormously promoted the popularity of deep learning, and helped users to customize a range of personalized applications. However, it also entails a variety of security and privacy issues brought by untrusted service providers. Particularly, a malicious adversary may violate user privacy during the inference process, or worse, return incorrect results to the client through compromising the integrity of the outsourced model. To address these problems, we propose SecureDL to protect the model’s integrity and user’s privacy in Deep Neural Networks (DNNs) inference process. In SecureDL, we first transform complicated non-linear activation functions of DNNs to low-degree polynomials. Then, we give a novel method to generate sensitive-samples, which can verify the integrity of a model’s parameters outsourced to the server with high accuracy. Finally, We exploit Leveled Homomorphic Encryption (LHE) to achieve the privacy-preserving inference. We shown that our sensitive-samples are indeed very sensitive to model changes, such that even a small change in parameters can be reflected in the model outputs. Based on the experiments conducted on real data and different types of attacks, we demonstrate the superior performance of SecureDL in terms of detection accuracy, inference accuracy, computation, and communication overheads.

ZeroAUDIT

Aman Luthra
James Cavanaugh
Hugo Renzzo Oclese
Rina M. Hirsch
Xiang Fu

Consider the problem of auditing an investment fund. This usually involves inspecting each transaction in its trading history, and accumulating its capital gains and losses, so that its net asset value can be computed precisely to avoid financial frauds. We present ZeroAUDIT, a confidential and privacy preserving auditing platform, which accomplishes this goal without having to know any of a transaction’s details. Sitting at the heart of the system is a zero knowledge proof protocol, in the discrete logarithm setting, which allows one to reason about the elements of a Merkle tree. Using it, we can prove that a trading transaction is occurring at a fair market price without disclosing which securities are being traded. We have implemented the system on the Hyperledger Fabric platform and we report the use of batch verification techniques in improving its efficiency.

Policy-based Chameleon Hash for Blockchain Rewriting with Black-box Accountability

Yangguang Tian
Nan Li
Yingjiu Li
Pawel Szalachowski
Jianying Zhou

Policy-based chameleon hash is a useful primitive for blockchain rewriting. It allows a party to create a transaction associated with an access policy, while another party who possesses enough rewriting privileges satisfying the access policy can rewrite the transaction. However, it lacks accountability. The chameleon trapdoor holder may abuse his/her rewriting privilege and maliciously rewrite the hashed object in the transaction without being identified. In this paper, we introduce policy-based chameleon hash with black-box accountability (PCHBA). Black-box accountability allows an attribute authority to link modified transactions to responsible transaction modifiers in case of dispute, in which any public user identifies those transaction modifiers from interacting with an access device/blackbox. We first present a generic framework of PCHBA. Then, we present a practical instantiation, showing its practicality through implementation and evaluation analysis.

SESSION: Security of Voice Assistants

WearID: Low-Effort Wearable-Assisted Authentication of Voice Commands via Cross-Domain Comparison without Training

Cong Shi
Yan Wang
Yingying Chen
Nitesh Saxena
Chen Wang*

Due to the open nature of voice input, voice assistant (VA) systems (e.g., Google Home and Amazon Alexa) are vulnerable to various security and privacy leakages (e.g., credit card numbers, passwords), especially when issuing critical user commands involving large purchases, critical calls, etc. Though the existing VA systems may employ voice features to identify users, they are still vulnerable to various acoustic-based attacks (e.g., impersonation, replay, and hidden command attacks). In this work, we propose a training-free voice authentication system, WearID, leveraging the cross-domain speech similarity between the audio domain and the vibration domain to provide enhanced security to the ever-growing deployment of VA systems. In particular, when a user gives a critical command, WearID exploits motion sensors on the user’s wearable device to capture the aerial speech in the vibration domain and verify it with the speech captured in the audio domain via the VA device’s microphone. Compared to existing approaches, our solution is low-effort and privacy-preserving, as it neither requires users’ active inputs (e.g., replying messages/calls) nor to store users’ privacy-sensitive voice samples for training. In addition, our solution exploits the distinct vibration sensing interface and its short sensing range to sound (e.g., 25cm) to verify voice commands. Examining the similarity of the two domains’ data is not trivial. The huge sampling rate gap (e.g., 8000Hz vs. 200Hz) between the audio and vibration domains makes it hard to compare the two domains’ data directly, and even tiny data noises could be magnified and cause authentication failures. To address the challenges, we investigate the complex relationship between the two sensing domains and develop a spectrogram-based algorithm to convert the microphone data into the lower-frequency “ motion sensor data” to facilitate cross-domain comparisons. We further develop a user authentication scheme to verify that the received voice command originates from the legitimate user based on the cross-domain speech similarity of the received voice commands. We report on extensive experiments to evaluate the WearID under various audible and inaudible attacks. The results show WearID can verify voice commands with 99.8% accuracy in the normal situation and detect 97.2% fake voice commands from various attacks, including impersonation/replay attacks and hidden voice/ultrasound attacks.

Imperio: Robust Over-the-Air Adversarial Examples for Automatic Speech Recognition Systems

Lea Schönherr
Thorsten Eisenhofer
Steffen Zeiler
Thorsten Holz
Dorothea Kolossa

Automatic speech recognition (ASR) systems can be fooled via targeted adversarial examples, which induce the ASR to produce arbitrary transcriptions in response to altered audio signals. However, state-of-the-art adversarial examples typically have to be fed into the ASR system directly, and are not successful when played in a room. Previously published over-the-air adversarial examples fall into one of three categories: they are either handcrafted examples, they are so conspicuous that human listeners can easily recognize the target transcription once they are alerted to its content, or they require precise information about the room where the attack takes place, and are hence not transferable to other rooms.

In this paper, we demonstrate the first algorithm that produces generic adversarial examples against hybrid ASR systems, which remain robust in an over-the-air attack that is not adapted to the specific environment. Hence, no prior knowledge of the room characteristics is required. Instead, we use room impulse responses (RIRs) to compute robust adversarial examples for arbitrary room characteristics and employ the ASR system Kaldi to demonstrate the attack. Further, our algorithm can utilize psychoacoustic methods to hide changes of the original audio signal below the human thresholds of hearing. In practical experiments, we show that the adversarial examples work for varying room setups, and that no direct line-of-sight between speaker and microphone is necessary. As a result, an attacker can create inconspicuous adversarial examples for any target transcription and apply these to arbitrary room setups without any prior knowledge.

Measuring the Effectiveness of Privacy Policies for Voice Assistant Applications

Song Liao
Christin Wilson
Long Cheng
Hongxin Hu
Huixing Deng

Voice Assistants (VA) such as Amazon Alexa and Google Assistant are quickly and seamlessly integrating into people’s daily lives. The increased reliance on VA services raises privacy concerns such as the leakage of private conversations and sensitive information. Privacy policies play an important role in addressing users’ privacy concerns and informing them about the data collection, storage, and sharing practices. VA platforms (both Amazon Alexa and Google Assistant) allow third-party developers to build new voice-apps and publish them to app stores. Voice-app developers are required to provide privacy policies to disclose their apps’ data practices. However, little is known whether these privacy policies are informative and trustworthy or not on emerging VA platforms. On the other hand, many users invoke voice-apps through voice and thus there exists a usability challenge for users to access these privacy policies.

In this paper, we conduct the first large-scale data analytics to systematically measure the effectiveness of privacy policies provided by voice-app developers on two mainstream VA platforms. We seek to understand the quality and usability issues of privacy policies provided by developers in the current app stores. We analyzed 64,720 Amazon Alexa skills and 16,002 Google Assistant actions. Our work also includes a user study to understand users’ perspectives on privacy policies of voice-apps. Our findings reveal a worrisome reality of privacy policies in two mainstream voice-app stores. For the 17,952 skills and 9,955 actions that have privacy policies, there are many voice-apps with incorrect privacy policy URLs or broken links. We found that 1,755 Alexa skills and 192 Google actions provide a broken privacy policy URL. Amazon Alexa has more than 56% of skills with duplicate privacy policy URLs. While the Google Assistant platform has 9.0% of actions with duplicate privacy policy URLs. There are also skills/actions with inconsistency between the privacy policy and description. 6,047 Google actions do not have a privacy policy although they are required to provide one. Google and Amazon even have official voice-apps violating their own requirements regarding the privacy policy. We have reported our findings to both Amazon Alexa and Google Assistant teams, and received acknowledgments from both vendors.

Voicefox: Leveraging Inbuilt Transcription to Enhance the Security of Machine-Human Speaker Verification against Voice Synthesis Attacks

Maliheh Shirvanian
Manar Mohammed
Nitesh Saxena
S Abhishek Anand

In this paper, we propose Voicefox1, a defense against the threat of automated voice synthesis attacks in machine-based and human-based speaker verification applications. Voicefox is based on a hitherto undiscovered potential of speech-to-text transcription, already built into these applications. Voicefox relies on the premise that while the synthesized samples might be falsely accepted by the speaker verification systems and human listeners, they cannot be transcribed as accurately as a natural human voice by transcribers. Voicefox is not a speaker verification system, but rather an independent module that can be integrated with any speaker verification system to enhance its security against voice synthesis attacks.

To test our premise and as an essential pre-requisite for building Voicefox, we ran an extensive study that measures the accuracy of off-the-shelf speech-to-text techniques when confronted with the synthesized samples generated by the state-of-the-art speech synthesis techniques. Our results show that the transcription error rate for the synthesized voices is significantly higher, on average 2-3x, than the error rate for natural voices. This study quantitatively proves our hypothesis that human voices are transcribed more accurately than synthesized voices. We further propose several post-transcription rules in designing Voicefox, including acceptance of transcribed text even if up to a certain number of words are not transcribed correctly, and ignoring the words not available in the reference dictionary. Using these rules, Voicefox can effectively reduce the false rejection rates to as low as 1.20-4.69% depending on the application and the transcriber used, and reduce the false accept rates to 0% for dictionaries with phonetically-distinct words.

VibLive: A Continuous Liveness Detection for Secure Voice User Interface in IoT Environment

Linghan Zhang
Sheng Tan
Zi Wang
Yili Ren
Zhi Wang
Jie Yang

The voice user interface (VUI) has been progressively used to authenticate users to numerous devices and applications. Such massive adoption of VUIs in IoT environments like individual homes and businesses arises extensive privacy and security concerns. Latest VUIs adopting traditional voice authentication methods are vulnerable to spoofing attacks, where a malicious party spoofs the VUIs with pre-recorded or synthesized voice commands of the genuine user. In this paper, we design VibLive, a continuous liveness detection system for secure VUIs in IoT environments. The underlying principle of VibLive is to catch the dissimilarities between bone-conducted vibrations and air-conducted voices when human speaks for liveness detection. VibLive is a text-independent system that verifies live users and detects spoofing attacks without requiring users to enroll specific passphrases. Moreover, VibLive is practical and transparent as it requires neither additional operations nor extra hardwares, other than a loudspeaker and a microphone that are commonly equipped on VUIs. Our evaluation with 25 participants under different IoT intended experiment settings shows that VibLive is highly effective with over 97% detection accuracy. Results also show that VibLive is robust to various use scenarios.

SESSION: Machine Learning Security

Februus: Input Purification Defense Against Trojan Attacks on Deep Neural Network Systems

Bao Gia Doan
Ehsan Abbasnejad
Damith C. Ranasinghe

We propose Februus; a new idea to neutralize highly potent and insidious Trojan attacks on Deep Neural Network (DNN) systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted in a deep neural network model using a secret trigger, a Trojan, applied to any input to alter the model’s decision to a target prediction—a target determined by and only known to the attacker. Februus sanitizes the incoming input by surgically removing the potential trigger artifacts and restoring the input for the classification task. Februus enables effective Trojan mitigation by sanitizing inputs with no loss of performance for sanitized inputs, Trojaned or benign. Our extensive evaluations on multiple infected models based on four popular datasets across three contrasting vision applications and trigger types demonstrate the high efficacy of Februus. We dramatically reduced attack success rates from 100% to near 0% for all cases (achieving 0% on multiple cases) and evaluated the generalizability of Februus to defend against complex adaptive attacks; notably, we realized the first defense against the advanced partial Trojan attack. To the best of our knowledge, Februus is the first backdoor defense method for operation at run-time capable of sanitizing Trojaned inputs without requiring anomaly detection methods, model retraining or costly labeled data.

NoiseScope: Detecting Deepfake Images in a Blind Setting

Jiameng Pu
Neal Mangaokar
Bolun Wang
Chandan K Reddy
Bimal Viswanath

Recent advances in Generative Adversarial Networks (GANs) have significantly improved the quality of synthetic images or deepfakes. Photorealistic images generated by GANs start to challenge the boundary of human perception of reality, and brings new threats to many critical domains, e.g., journalism, and online media. Detecting whether an image is generated by GAN or a real camera has become an important yet under-investigated area. In this work, we propose a blind detection approach called NoiseScope for discovering GAN images among other real images. A blind approach requires no a priori access to GAN images for training, and demonstrably generalizes better than supervised detection schemes. Our key insight is that, similar to images from cameras, GAN images also carry unique patterns in the noise space. We extract such patterns in an unsupervised manner to identify GAN images. We evaluate NoiseScope on 11 diverse datasets containing GAN images, and achieve up to 99.68% F1 score in detecting GAN images. We test the limitations of NoiseScope against a variety of countermeasures, observing that NoiseScope holds robust or is easily adaptable.

StegoNet: Turn Deep Neural Network into a Stegomalware

Tao Liu
Zihao Liu
Qi Liu
Wujie Wen
Wenyao Xu
Ming Li

Deep Neural Networks (DNNs) are now presenting human-level performance on many real-world applications, and DNN-based intelligent services are becoming more and more popular across all aspects of our lives. Unfortunately, the ever-increasing DNN service implies a dangerous feature which has not yet been well studied–allowing the marriage of existing malware and DNN model for any pre-defined malicious purpose. In this paper, we comprehensively investigate how to turn DNN into a new breed evasive self-contained stegomalware, namely StegoNet, using model parameter as a novel payload injection channel, with no service quality degradation (i.e. accuracy) and the triggering event connected to the physical world by specified DNN inputs. A series of payload injection techniques which take advantage of a variety of unique neural network natures like complex structure, high error resilience capability and huge parameter size, are developed for both uncompressed models (with model redundancy) and deeply compressed models tailored for resource-limited devices (no model redundancy), including LSB substitution, resilience training, value mapping, and sign-mapping. We also proposed a set of triggering techniques like logits trigger, rank trigger and fine-tuned rank trigger to trigger StegoNet by specific physical events under realistic environment variations. We implement the StegoNet prototype on Nvidia Jetson TX2 testbed. Extensive experimental results and discussions on the evasiveness, integrity of proposed payload injection techniques, and the reliability and sensitivity of the triggering techniques, well demonstrate the feasibility and practicality of StegoNet.

SEEF-ALDR: A Speaker Embedding Enhancement Framework via Adversarial Learning based Disentangled Representation

Jianwei Tai
Xiaoqi Jia
Qingjia Huang
Weijuan Zhang
Haichao Du
Shengzhi Zhang

Speaker verification, as a biometric authentication mechanism, has been widely used due to the pervasiveness of voice control on smart devices. However, the task of “in-the-wild” speaker verification is still challenging, considering the speech samples may contain lots of identity-unrelated information, e.g., background noise, reverberation, emotion, etc. Previous works focus on optimizing the model to improve verification accuracy, without taking into account the elimination of the impact from the identity-unrelated information. To solve the above problem, we propose SEEF-ALDR, a novel Speaker Embedding Enhancement Framework via Adversarial Learning based Disentangled Representation, to reinforce the performance of existing models on speaker verification. The key idea is to retrieve as much speaker identity information as possible from the original speech, thus minimizing the impact of identity-unrelated information on the speaker verification task by using adversarial learning. Experimental results demonstrate that the proposed framework can significantly improve the performance of speaker verification by 20.3% and 23.8% on average over 13 tested baselines on dataset Voxceleb1 and 8 tested baselines on dataset Voxceleb2 respectively, without adjusting the structure or hyper-parameters of them. Furthermore, the ablation study was conducted to evaluate the contribution of each module in SEEF-ALDR. Finally, porting an existing model into the proposed framework is straightforward and cost-efficient, with very little effort from the model owners due to the modular design of the framework.

Attacking Graph-Based Classification without Changing Existing Connections

Xuening Xu
Xiaojiang Du
Qiang Zeng

In recent years, with the rapid development of machine learning in various domains, more and more studies have shown that machine learning models are vulnerable to adversarial attacks. However, most existing researches on adversarial machine learning study non-graph data, such as images and text. Though some previous works on graph data have shown that adversaries can make graph-based classification methods unreliable by adding perturbations to features or adjacency matrices of existing nodes, these kinds of attacks sometimes have limitations for real-world applications. For example, to launch such attacks in real social networks, the attacker cannot force two good users to change (e.g., remove) the connection between them, which means that the attacker can not launch such attacks. In this paper, we propose a novel attack on collective classification methods by adding fake nodes into existing graphs. Our attack is more realistic and practical than the attack mentioned above. For instance, in a real social network, an attacker only needs to create some fake accounts and connect them to existing users without modifying the connections among existing users. We formulate the new attack as an optimization problem and utilize a gradient-based method to generate edges of newly added fake nodes. Our extensive experiments show that the attack can not only make new fake nodes evade detection, but also make the detector misclassify most of the target nodes. The proposed new attack is very effective and can achieve up to 100% False Negative Rates (FNRs) for both the new node set and the target node set.