Weaponized File Formats

This is a series of articles about file formats and related security issues. In 2003 I presented an article in French on this subject at the SSTIC conference: [SSTIC03]. The following articles provide an updated version in English, with more information about common file formats.

The original location of this book is http://www.decalage.info/file_formats_security.

Each file format will be described with the following information:

In the future I plan to cover common file formats such as PDF, MS Office (binary and Open XML), HTML, XML, RTF, ZIP, JPEG, EXE, etc. Stay tuned! ;-)

MS Office Open XML formats security (docx, xlsx, pptx, ...)

This article describes the Microsoft Office Open XML file formats (docx, xlsx, pptx), related security issues and useful resources. [WORK IN PROGRESS]

For now, see http://www.decalage.info/opendocument_openxml

Example of known vulnerabilities:

ODF / OpenDocument format security

This article describes the OpenDocument file format (ODF), related security issues and useful resources. [WORK IN PROGRESS]

For now, see http://www.decalage.info/opendocument_openxml

Weaponized MS Office 97-2003 legacy/binary formats (doc, xls, ppt, ...)

This article describes the Microsoft Office 97-2003 legacy/binary file formats (doc, xls, ppt), related security issues and useful resources.

The original location of this page is http://www.decalage.info/file_formats_security/office.

Last update: 2014-11-19 (created 2010-03-08)

File format description

MS Office binary formats are widely used:

Except for very old MS Office versions, all these formats share the same basic container structure, variously called OLE2, OLECF, structured storage, or compound file/document.
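Because the container is shared, a file can be recognized as OLE2 from its first bytes, whatever its extension. The following is a minimal sketch, assuming only the well-known 8-byte OLE2 signature; the function names `is_ole2` and `is_ole2_file` are hypothetical helpers chosen for illustration, not part of any existing tool.

```python
import io

# Signature found in the first 8 bytes of an OLE2 / compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def is_ole2(stream) -> bool:
    """Return True if the binary stream starts with the OLE2 signature."""
    header = stream.read(len(OLE2_MAGIC))
    return header == OLE2_MAGIC


def is_ole2_file(path: str) -> bool:
    """Convenience wrapper (hypothetical helper): check a file on disk."""
    with open(path, "rb") as f:
        return is_ole2(f)


if __name__ == "__main__":
    # In-memory samples: a fake OLE2 header and a non-OLE2 header.
    print(is_ole2(io.BytesIO(OLE2_MAGIC + b"\x00" * 504)))
    print(is_ole2(io.BytesIO(b"PK\x03\x04")))
```

This check only identifies the container. Telling a doc from an xls or a ppt requires parsing the container and looking at the streams it holds.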

MS Office also contains other applications, such as MS Access, which use different file formats not based on the OLE2 format.

Since MS Office 2007, new file formats based on XML (docx, xlsx, pptx) are used by default. See the article about MS Office Open XML.

Main client applications

The main applications used to open MS Office files are part of the MS Office suite:

Many alternative applications are also able to open MS Office files, such as OpenOffice, StarOffice, GNOME Office and KOffice.

Main security issues

Format specifications and technical information

Specifications for the OLE2 Compound File format:

Specifications for the MS Office legacy formats:

Publications about MS Office formats security issues

Examples of known vulnerabilities and exploits

Analysis techniques

Useful analysis tools

Parsing tools and libraries

Filtering tools and libraries

Weaponized PDF

This article describes the PDF file format, related security issues and useful resources. [WORK IN PROGRESS]

The original location of this article is http://www.decalage.info/file_formats_security/pdf

Last update: 2013-11-15 (created 2010-02-13)

File format description

PDF (Portable Document Format) is a file format designed by Adobe. It is mainly used to publish final versions of documents on the Internet, by e-mail or on CD-ROMs. Its main purpose is to display or print documents with a fixed layout. The PDF format may also be used to create electronic forms.

More info: http://en.wikipedia.org/wiki/Portable_Document_Format

Main client applications

The main application used to open PDF files for display is Adobe Reader. Many alternative applications are also able to display PDF files, such as Preview on Mac OS X and Foxit Reader on Windows.

Adobe Acrobat is one of the applications which can create and edit PDF documents.

Main security issues

PDF is usually considered a static and safe format for document exchange, but this perception is wrong.

The PDF format is in fact very complex, and contains several features which may lead to security issues:
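As a rough illustration of how such features can be spotted in a file, the sketch below counts a few PDF names commonly associated with active content. The list `SUSPICIOUS_NAMES` and the function `count_pdf_names` are assumptions made for this example: the list is not exhaustive, and a non-zero count is only a hint that a document deserves a closer look, not proof that it is malicious.

```python
import re

# Hypothetical, non-exhaustive list of PDF names tied to active content:
# JavaScript, automatic actions, program launch, embedded files.
SUSPICIOUS_NAMES = (
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/AA",
    b"/Launch",
    b"/EmbeddedFile",
)


def count_pdf_names(data: bytes) -> dict:
    """Count raw occurrences of each name in the file bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Reject matches followed by a letter or digit, so that /AA
        # does not also match a longer name such as /AAxyz.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
        b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    )
    for name, count in count_pdf_names(sample).items():
        print(name, count)
```

A plain byte search like this is easy to evade: names can be written with hex escapes (for example `/J#53`), and objects can be stored in compressed streams, so dedicated analysis tools normalize and decompress the content first.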

Potential solutions

Format specifications and technical information

Publications about PDF security issues

Examples of known vulnerabilities and exploits

Obfuscation techniques

Before analyzing malicious documents, it's good to know your enemy. Here are a few hand-picked blog posts and articles that explain known obfuscation and anti-analysis techniques:

Analysis techniques

Useful analysis tools

(listed in no particular order)

Command-line

GUI

Linux distributions

Online

Parsing tools and libraries

Filtering tools and libraries