Document Reassembly Research in LSU GVC Lab

Project Overview

Reassembling the remnants of destroyed (shredded or ripped) documents enables recovery of valuable information in forensic investigations and archival research. In the field of information security, reassembling documents could also help us understand our limitations against adversaries' attempts to gain access to information. A computerized system capable of automatically restoring fragmented archives to their complete digital form is not only a significant time saver, but also a valuable tool for converting these physical remnants into machine-readable digital representations for subsequent machine-assisted analysis.

We are developing fragment matching algorithms/metrics to support the local alignment computation of document pieces. We are also developing efficient composition algorithms to find a global solution that best reassembles all the fragments.
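To illustrate how local pairwise matching scores can feed a global composition step, here is a minimal sketch of a generic greedy edge-selection heuristic. It is not the word-path metric or the solver from our paper: the function name `greedy_compose` and the `cost` matrix are hypothetical, and the sketch assumes `cost[i][j]` is some precomputed score (lower is better) for placing stripe `j` immediately to the right of stripe `i`.

```python
from itertools import permutations


def greedy_compose(cost):
    """Return a left-to-right stripe order from a pairwise cost matrix.

    cost[i][j] is an assumed score (lower = better) for stripe j
    sitting immediately to the right of stripe i.
    """
    n = len(cost)
    if n == 0:
        return []

    # Every ordered pair (i, j) with i != j, cheapest first.
    edges = sorted((cost[i][j], i, j) for i, j in permutations(range(n), 2))

    right = [None] * n      # right[i]: stripe placed to the right of i
    left = [None] * n       # left[j]: stripe placed to the left of j
    parent = list(range(n))  # union-find parents, one set per chain

    def find(x):
        # Find the chain representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    joined = 0
    for _, i, j in edges:
        if joined == n - 1:
            break
        # Each stripe has at most one right and one left neighbour.
        if right[i] is not None or left[j] is not None:
            continue
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # joining would close a cycle
        right[i], left[j] = j, i
        parent[rj] = ri
        joined += 1

    # Walk the single remaining chain from its leftmost stripe.
    k = next(s for s in range(n) if left[s] is None)
    order = []
    while k is not None:
        order.append(k)
        k = right[k]
    return order


# Toy example: the intended order is 2, 0, 3, 1.
toy = [[9] * 4 for _ in range(4)]
toy[2][0], toy[0][3], toy[3][1] = 1, 2, 3
print(greedy_compose(toy))
```

A greedy pass like this is fast but can lock in an early wrong pairing, which is one reason the quality of the local matching metric matters so much for the final result.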

We maintain a public benchmark dataset of digital document stripe puzzles for comparative studies.

Team Members

Publications

Reassembling Shredded Document Stripes Using Word-path Metric and Greedy Composition Optimal Matching Solver

Yongqing Liang and Xin Li

IEEE Transactions on Multimedia, 2019

[Paper] [Code and Data on GitHub]

Dataset

The DocDataset contains:

  • 60 striped document puzzles at four complexity levels: 20, 30, 40, and 60 stripes. They are named “doc*_*”.
  • 3 physically shredded document puzzles. They are named “real*_*”.
  • 1 randomly oriented puzzle, named “doc3_36”.

Click here to download the DocDataset. Unzip the package and copy the “gt” and “stripes” folders into the “/data/” folder of the repository.
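The copy step above can also be scripted. The following is a minimal sketch, assuming the unzipped package has “gt” and “stripes” as top-level folders and that “/data/” refers to a `data` folder directly under the repository root; the function name `install_dataset` and both path arguments are hypothetical.

```python
import shutil
from pathlib import Path


def install_dataset(package_dir, repo_dir):
    """Copy the 'gt' and 'stripes' folders into <repo_dir>/data/.

    package_dir: folder produced by unzipping the DocDataset package
                 (assumed to contain 'gt' and 'stripes' at its top level).
    repo_dir:    local checkout of the code repository (assumed layout).
    """
    package_dir = Path(package_dir)
    data_dir = Path(repo_dir) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    for name in ("gt", "stripes"):
        src = package_dir / name
        if not src.is_dir():
            raise FileNotFoundError(f"expected folder not found: {src}")
        # dirs_exist_ok (Python 3.8+) lets the copy be re-run safely.
        shutil.copytree(src, data_dir / name, dirs_exist_ok=True)

    return data_dir


# Example call with placeholder paths:
# install_dataset("path/to/unzipped/DocDataset", "path/to/repository")
```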

Demo Video

The following is a demo of our Stripe Document Reassembly pipeline (IEEE Trans. Multimedia 2019 paper). This video is also available at https://youtu.be/dBlXH8XppoQ

