T5. Finding Data Leaks in Applications, Network Protocols, and Systems with Open Source Computer Forensics Tools
Tuesday, 10 December 2013
08:30 - 12:00
Many kinds of data leaks and security flaws are easy to find if you just look. Hard-coded usernames and passwords, weak or missing cryptography, and logfiles containing inappropriately sensitive information are easy to spot - provided that you know what sensitive data looks like, and provided that you're using the right tools. Although many privacy and security auditors restrict themselves to reading privacy policies, written specifications and architectural diagrams, experienced investigators know that there is no substitute for looking at the data as well. This is especially true of cyber-physical systems, which were historically developed by programmers who had little training in information security. Frequently these developers don't realize that security snafus are hiding in their own code.
This course teaches how to use the open source tools bulk_extractor and tcpflow to analyze application files, databases, network packet traces, memory dumps, and entire operating systems for data leaks and related security problems. It teaches the student how to recognize sensitive data encoded in a variety of formats, and how to extend the open source tools when presented with data in a proprietary form. (Such proprietary formats are common in process control systems.) It presents famous cases in which sensitive data was left behind by application programs, operating systems, programmers and users in PDF files, databases, network connections, and system memory. Finally, it presents programming patterns for eliminating leakage of sensitive information.
Prerequisites: Basic knowledge of scripting languages (e.g. Python) and cryptography (hash functions, symmetric algorithms such as AES, asymmetric algorithms such as RSA, and PKI).
Auditing for Data Leakage: (1) What is Data Leakage? (2) What is Auditing? (3) Advantages: You can find things that the vendor / designer / programmer doesn't tell you about or doesn't know about. (4) Limitations: You can't find it all. (5) Kinds of auditing: black box / grey box / white box.
What are we looking for? (1) PII - Personally Identifiable Information (email addresses, names, credit card numbers (CCNs), etc.); (2) Plaintext passwords; (3) Hardcoded passwords; (4) Hardcoded URLs and IP addresses; (5) Examples
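As an illustration of what searching for such items can look like, the following minimal Python sketch scans a byte buffer for email addresses and for digit runs that pass the Luhn checksum used by payment card numbers. The regular expressions and the function names (luhn_ok, find_candidates) are assumptions made for this example; they are deliberately simple and are not the scanners that bulk_extractor implements.

```python
import re

# Simplified patterns (assumptions for this sketch, not a complete grammar).
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CCN_RE = re.compile(rb"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_candidates(data: bytes):
    """Return (emails, card_numbers) found in a raw byte buffer."""
    emails = [m.group().decode("ascii") for m in EMAIL_RE.finditer(data)]
    ccns = []
    for m in CCN_RE.finditer(data):
        digits = re.sub(rb"[ -]", b"", m.group()).decode("ascii")
        if luhn_ok(digits):     # discard digit runs that fail the checksum
            ccns.append(digits)
    return emails, ccns


if __name__ == "__main__":
    sample = b"user=alice@example.com card=4111 1111 1111 1111 id=1234567890123456"
    print(find_candidates(sample))
```

The Luhn check is what keeps arbitrary 16-digit identifiers from being reported as card numbers; a real audit would still need to review each hit in context.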
What Data Looks Like: (1) ASCII (2) Unicode (3) Hex Dumps (4) Strings (5) Numbers (6) Magic Numbers (7) Encodings (Base16, Base64, Base85, compression)
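To make the point about encodings concrete, here is a small Python sketch that renders one secret in several of the forms listed above and reports which forms occur in a buffer. The function names (encoded_forms, search_all) and the sample secret are hypothetical, chosen only for this illustration.

```python
import base64
import zlib


def encoded_forms(secret: str) -> dict:
    """Return the same secret rendered in several common encodings."""
    raw = secret.encode("ascii")
    return {
        "ascii": raw,
        "utf-16-le": secret.encode("utf-16-le"),  # common in Windows software
        "base16": base64.b16encode(raw),          # uppercase hex
        "base64": base64.b64encode(raw),
        "base85": base64.b85encode(raw),
        "zlib": zlib.compress(raw),
    }


def search_all(haystack: bytes, secret: str) -> list:
    """Return the names of the encodings under which the secret appears."""
    return [name for name, form in encoded_forms(secret).items()
            if form in haystack]


if __name__ == "__main__":
    for name, form in encoded_forms("hunter2").items():
        print(f"{name:10s} {form!r}")
```

This exact-match approach only finds a secret that was encoded on its own. When the secret is embedded in a longer Base64 or compressed stream, its encoded bytes depend on alignment and surrounding data, so the stream has to be decoded first and then searched.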
Analyzing Application Programs at Rest: (1) Strings; (2) Disassemblers for x86, ARM and Java; (3) Why you want to avoid disassembling
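The "strings" approach can be sketched in a few lines of Python. The version below is an assumption-laden simplification: it reports runs of printable ASCII and of UTF-16LE-encoded ASCII with their byte offsets, and the function name extract_strings and the default minimum length of 4 are choices made for this example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return sorted (offset, text) pairs for printable runs in data.

    Looks for printable ASCII and for the same characters stored as
    UTF-16LE (each character followed by a zero byte).
    """
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    results = []
    for m in ascii_re.finditer(data):
        results.append((m.start(), m.group().decode("ascii")))
    for m in utf16_re.finditer(data):
        results.append((m.start(), m.group().decode("utf-16-le")))
    return sorted(results)


if __name__ == "__main__":
    blob = (b"\x00\x01password=hunter2\x00\xff"
            + "secret".encode("utf-16-le") + b"\x00\x00")
    for offset, text in extract_strings(blob):
        print(offset, text)
```

Searching for both encodings matters because a scan for plain ASCII alone would miss the second string in the sample buffer entirely.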
Introduction - Using bulk_extractor: (1) What it is; (2) How it works
Using bulk_extractor: (1) To analyze programs and program installations; (2) To analyze running computer systems; (3) To analyze memory
Introduction: (1) Brief introduction to IP networks, TCP protocols, and encryption protocols; (2) Live vs. captured analysis; (3) Packet file formats; (4) Wireless vs. wired networks; (5) Making network captures with tcpdump; (6) Using Wireshark to analyze individual packets; (7) Using tcpflow to analyze TCP streams
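As one example of a packet file format, the classic libpcap capture file written by tcpdump begins with a 24-byte global header whose first four bytes are a magic number that also reveals the byte order. The following Python sketch decodes that header; the function name parse_pcap_header and the returned dictionary layout are assumptions for illustration, and the newer pcapng format is not handled.

```python
import struct

# Magic number as it appears on disk -> (struct byte-order prefix, nanosecond?)
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", False),  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": (">", False),  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": ("<", True),   # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": (">", True),   # big-endian, nanosecond timestamps
}


def parse_pcap_header(header: bytes) -> dict:
    """Decode the 24-byte global header of a classic pcap file."""
    if len(header) < 24:
        raise ValueError("pcap global header is 24 bytes")
    try:
        endian, nanos = _MAGICS[header[:4]]
    except KeyError:
        raise ValueError("not a classic pcap file") from None
    # version major/minor, timezone offset, sigfigs, snaplen, link type
    major, minor, _tz, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", header[4:24])
    return {
        "byte_order": "little" if endian == "<" else "big",
        "nanosecond": nanos,
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,  # 1 means Ethernet
    }


if __name__ == "__main__":
    demo = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    print(parse_pcap_header(demo))
```

Checking the magic number first is the same habit the outline describes under "Magic Numbers": identify the format from its leading bytes before trusting any of the fields that follow.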
Using tcpflow: (1) Breaking a packet capture into
Dealing with encryption: (1) Decrypting SSL with decrypting proxies; (2) Decrypting SSL with server keys
Extending these tools
Post-processing with Python modules. Working with Unicode and large files with Python. Writing extensions for bulk_extractor and tcpflow in C++.
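A minimal sketch of Python post-processing follows. It assumes that a feature file is line-oriented text in which comment lines start with "#" and each remaining line holds tab-separated offset, feature, and context fields; that layout should be checked against the output of the bulk_extractor version in use. The function names and the sample lines are hypothetical.

```python
from collections import Counter


def parse_feature_lines(lines):
    """Yield (offset, feature, context) tuples from feature-file lines.

    Assumed layout: '#' comment lines, otherwise tab-separated fields.
    The offset is kept as a string because it may not be a plain number.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue            # skip lines that do not match the layout
        context = parts[2] if len(parts) > 2 else ""
        yield parts[0], parts[1], context


def feature_histogram(lines) -> Counter:
    """Count how often each feature occurs, ignoring case."""
    return Counter(feature.lower()
                   for _, feature, _ in parse_feature_lines(lines))


if __name__ == "__main__":
    sample = [
        "# hypothetical feature file for illustration\n",
        "1024\talice@example.com\tfrom: alice@example.com to\n",
        "2048\tAlice@Example.com\tcc: Alice@Example.com\n",
        "4096\tbob@example.org\treply-to bob@example.org\n",
    ]
    print(feature_histogram(sample).most_common())
```

Because the parser is a generator that consumes any iterable of lines, it can be pointed at an open file object and will process a very large file one line at a time instead of loading it into memory.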
About the Instructor:
Dr. Simson L. Garfinkel is an Associate Professor at the Naval Postgraduate School. Based in Arlington VA, Garfinkel's research interests include computer forensics, the emerging field of usability and security, personal information management, privacy, information policy and terrorism. He holds six US patents for his computer-related research and has published dozens of journal and conference papers in security and computer forensics. Garfinkel is the author or co-author of fourteen books on computing. He is perhaps best known for his book Database Nation: The Death of Privacy in the 21st Century. Garfinkel's most successful book, Practical UNIX and Internet Security (co-authored with Gene Spafford), has sold more than 250,000 copies and been translated into more than a dozen languages since the first edition was published in 1991. Garfinkel received three Bachelor of Science degrees from MIT in 1987, a Master's of Science in Journalism from Columbia University in 1988, and a Ph.D. in Computer Science from MIT in 2005. Garfinkel is the primary developer and maintainer of bulk_extractor and tcpflow, the two primary tools that will be used in this course.