Robust Keystroke Transcription from the Acoustic Side-Channel
The acoustic emanations from keyboards provide a side channel from which an attacker can recover sensitive user information, such as passwords and personally identifiable information. Previous work has shown the feasibility of these attacks given isolated keystrokes, but has not demonstrated robust keystroke detection and segmentation in the presence of realistic noise and fast typing speeds. Common problems include background noise, such as doors closing or speech, and overlapping keystroke waveforms. Prior work has assumed that the waveforms of individual keystrokes can be isolated with near 100% accuracy, but we show that these techniques generate a large number of misses and false positives, drastically impacting the downstream keystroke classification task.
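To illustrate why isolating individual keystrokes is fragile, here is a minimal sketch of a threshold-based detector. It is an assumption made for illustration, not necessarily any of the prior techniques evaluated in this work; the function name `detect_events`, the per-frame energy values, and the threshold are all hypothetical.

```python
from typing import List, Tuple


def detect_events(energies: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges where energy stays above the threshold.

    Hypothetical baseline: every contiguous loud run is treated as one keystroke.
    """
    events = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i  # a loud run begins
        elif energy <= threshold and start is not None:
            events.append((start, i))  # the loud run ends (end is exclusive)
            start = None
    if start is not None:
        events.append((start, len(energies)))  # run reaches the end of the signal
    return events


# Made-up frame energies: one isolated keystroke (frames 2-3), then two
# keystrokes typed so quickly that their waveforms overlap (frames 6-9).
fast_typing = [0, 0, 5, 6, 0, 0, 5, 6, 5, 6, 0]
print(detect_events(fast_typing, threshold=1))

# A non-keystroke noise burst, such as a door closing, is also reported.
door_slam = [0, 9, 9, 0]
print(detect_events(door_slam, threshold=1))
```

In this toy input, the two overlapping keystrokes come back as a single event (a miss), and the noise burst comes back as an event of its own (a false positive). Both kinds of error are then passed on to whatever classifier consumes the segments.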
To solve this problem, we present a deep learning system, leveraging related state-of-the-art techniques from speech transcription, that performs end-to-end, audio-to-keystroke transcription with superior performance. The recurrent architecture enables it to robustly handle overlapping waveforms and adapt to local noise profiles. Furthermore, the joint approach to keystroke detection and classification enables us both to train without ground truth keystroke timings and to outperform standard classification approaches even when they have ground truth timings. Due to the paucity of existing datasets, we collected a novel acoustic and keylogger dataset comprising 17 users and 86k keystrokes across various real-world typing tasks. On this dataset, we reduce the end-to-end character error rate on English text from 36.0% to 7.41% for known typists and from 41.3% to 15.41% for unknown typists. The keystroke acoustic side-channel attack remains dangerously feasible.
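The character error rate reported above is commonly defined as the Levenshtein (edit) distance between the reference text and the transcription, divided by the reference length. The sketch below assumes that standard definition; the abstract does not spell out the exact formula used, and the function names and sample strings are illustrative only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty string takes i deletions
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete ref_char
                curr[j - 1] + 1,     # insert hyp_char
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# One missed keystroke in an 8-character reference gives 1/8 = 12.5% CER.
print(character_error_rate("password", "pasword"))
```

Because the metric compares whole transcriptions, a missed or spurious keystroke counts as an error in the same way as a misclassified one, so it captures detection and classification mistakes together.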