Automatic Reassembly of Document Fragments via Context Based Statistical Models

Kulesh Shanmugasundaram, Nasir Memon
Polytechnic University

Reassembly of fragmented objects from a collection of randomly mixed fragments is a common problem in classical forensics. In this paper we address the digital forensic equivalent, i.e., reassembly of document fragments, using statistical modelling tools from data compression. We propose a general process model for automatically analyzing a collection of fragments and reconstructing the original document by placing the fragments in their proper order. Context modelling techniques from data compression are used to assign a probability to the event that two given fragments are adjacent in the original document. The problem of finding the optimal ordering is shown to be equivalent to finding a maximum weight Hamiltonian path in a complete graph. Because that problem is computationally hard, we design and explore heuristics for it, and we provide implementation results that demonstrate the validity of the proposed technique.
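
As a rough illustration of the ordering step described in the abstract, the following Python sketch treats fragments as nodes of a complete directed graph and builds an ordering greedily. It is a minimal sketch under stated assumptions, not the heuristics evaluated in the paper: the greedy rule, the function names `greedy_reassembly` and `path_weight`, and the numbers in `EXAMPLE_WEIGHTS` are all hypothetical, and the first fragment is assumed to be known. In practice the weights would come from a context model rather than being typed in by hand.

```python
from typing import List, Sequence

# Hypothetical adjacency weights: EXAMPLE_WEIGHTS[i][j] stands in for the
# probability that fragment j directly follows fragment i. These values are
# invented for illustration and do not come from any context model.
EXAMPLE_WEIGHTS = [
    [0.0, 0.1, 0.8, 0.1],
    [0.2, 0.0, 0.1, 0.1],
    [0.1, 0.2, 0.0, 0.7],
    [0.1, 0.9, 0.0, 0.0],
]


def greedy_reassembly(weights: Sequence[Sequence[float]], start: int = 0) -> List[int]:
    """Build a Hamiltonian path greedily (an assumed heuristic).

    Starting from `start`, repeatedly append the unused fragment with the
    highest adjacency weight from the last fragment placed. The result is
    a valid ordering of all fragments, but not necessarily the maximum
    weight one.
    """
    n = len(weights)
    order = [start]
    used = {start}
    while len(order) < n:
        last = order[-1]
        # Choose the unused fragment most likely to follow `last`.
        nxt = max(
            (j for j in range(n) if j not in used),
            key=lambda j: weights[last][j],
        )
        order.append(nxt)
        used.add(nxt)
    return order


def path_weight(weights: Sequence[Sequence[float]], order: Sequence[int]) -> float:
    """Sum the adjacency weights along consecutive pairs of the ordering."""
    return sum(weights[a][b] for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    order = greedy_reassembly(EXAMPLE_WEIGHTS)
    print(order, path_weight(EXAMPLE_WEIGHTS, order))
```

A greedy choice can commit early to a locally strong edge and miss a better overall path, which is one reason to compare several heuristics rather than rely on a single one.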

Keywords: Forensics, Document Reassembly
