ARN

Recovering PDF redaction

PDF redaction exposed by security researcher.
Carl Jongsma (Computerworld)  09 May, 2008 10:08:57

Unintentional exposure of sensitive data through Word files has caused problems for companies in the past, especially when people forget that Track Changes lets document recipients view information that was deleted or sanitised before release.

PDF files have led to similar unintended disclosures. In some cases the attempt to redact information amounted to nothing more than placing a black square or rectangle over the text, which made recovering the original text a simple process.

Didier Stevens, who gained attention for his recent discoveries relating to hiding content in PDF files, has again discovered a side effect of creating PDF files that might lead to unexpected information disclosure for the unaware.

The concept of an Incremental Update in PDF files is relatively well known: changes to an existing PDF document don't result in the PDF file being completely rewritten on saving. How an incremental update is actually represented in the raw PDF file is less well known. Essentially, the amended data is appended to the original document, and the process repeats for each subsequent update. Stevens discovered that stripping away an update and recovering the original content is extremely simple. This means that for documents that have been redacted or otherwise modified by replacing text, rather than by drawing a black rectangle over it, the deleted or replaced text can be recovered along with the original unmodified document in a simple one-step procedure. Making the process even simpler, it can often be achieved with a text editor, and it doesn't matter if the PDF content has been encrypted.
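
To make the mechanism concrete, here is a minimal sketch in Python. It is not Stevens's own tool. It assumes that each saved revision of a PDF ends with a `%%EOF` marker, as the PDF specification describes, so that cutting the file off just after an earlier marker yields a candidate earlier revision. The function names `list_revisions` and `previous_revision` are hypothetical, and the sketch ignores complications such as linearised files (which carry an extra `%%EOF` near the start) or the marker appearing by chance inside binary stream data.

```python
import re
from typing import List, Optional

# Each incremental save appends new data and ends with its own %%EOF marker.
# Assumption: the marker is followed by optional blanks and one line ending.
EOF_MARKER = re.compile(rb"%%EOF[ \t]*(?:\r\n|\r|\n)?")


def list_revisions(data: bytes) -> List[bytes]:
    """Return the file truncated just after each %%EOF marker, oldest first."""
    return [data[: match.end()] for match in EOF_MARKER.finditer(data)]


def previous_revision(data: bytes) -> Optional[bytes]:
    """Return the candidate revision before the last save, or None if there is only one."""
    revisions = list_revisions(data)
    return revisions[-2] if len(revisions) >= 2 else None
```

Reading a file with `open(path, "rb").read()` and writing the result of `previous_revision` to a new file would give a candidate copy of the document as it stood before the last incremental save. Whether a viewer opens that copy cleanly depends on the file in question.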

There are some efforts to increase awareness of the risk of document metadata, but this recent rediscovery adds another item to check before releasing documents for wider consumption. It is also another simple tool for forensic researchers to help in recovering original data from a document. A saving grace appears to be that many applications that export to PDF as part of their Save process do not support incremental updates. If you want to redact data, do it in the original application and then export the redacted version.

It is nothing that can't be gained from reading the PDF specification, but who takes the time to read in depth the technical specification for the data format that they are using?
