rtfobj is a Python module to extract embedded objects, such as OLE objects, from RTF files. It can be used as a Python library or as a command-line tool. It is part of the oletools package.
See the oletools page for more info.
News
- 2013-04-18 v0.02: fixed bug in rtfobj, added documentation
- 2012-11-09 v0.01: 1st version of rtfobj, used by pyxswf
- See changelog in source code for more info.
Download
The archive is available on the project page.
Usage
Usage: rtfobj.py <file.rtf>
The command-line tool extracts and decodes every data block encoded as hexadecimal in the RTF document, and saves each one as a file named "object_xxxx.bin", where xxxx is the location of the object in the RTF file.
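To illustrate what "decoding a hexadecimal data block" means, here is a minimal sketch using only the standard library. It is not rtfobj's actual implementation: the name iter_hex_blocks, the rule that a block is at least four two-digit hex bytes, and the requirement that a block does not start in the middle of a word are all assumptions made for this example.

```python
import binascii
import re

# Assumption: a data block is a run of at least 4 two-digit hex bytes,
# optionally separated by whitespace, that does not start in the middle
# of a word. rtfobj's own detection rules may differ.
HEX_RUN = re.compile(br'(?<![0-9A-Za-z])(?:[0-9A-Fa-f]{2}\s*){4,}')


def iter_hex_blocks(rtf_bytes):
    """Yield (index, decoded_bytes) for each hex run found in rtf_bytes."""
    for match in HEX_RUN.finditer(rtf_bytes):
        # Drop line breaks and spaces before converting hex text to bytes
        hex_digits = re.sub(br'\s+', b'', match.group())
        yield match.start(), binascii.unhexlify(hex_digits)


# Tiny hand-made RTF fragment whose hex data spans two lines
sample = b'{\\rtf1{\\object\\objdata 48656c6c6f\r\n20776f726c6421}}'
for index, data in iter_hex_blocks(sample):
    print('found object size %d at index %08X' % (len(data), index))
```

The index reported for each block is the byte offset where the hex run starts, which is the kind of location used in the "object_xxxx.bin" file names.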
Usage as a Python module: rtf_iter_objects(filename) is an iterator that yields tuples (index, object), giving the index of each hexadecimal stream in the RTF file and the corresponding decoded object. Example:
import rtfobj

# Iterate over every hex-encoded object found in the RTF file
for index, data in rtfobj.rtf_iter_objects("myfile.rtf"):
    print('found object size %d at index %08X' % (len(data), index))
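The (index, data) pairs can also be written to disk from Python, much as the command-line tool does. The following is a rough sketch under stated assumptions: save_objects is a hypothetical helper that is not part of rtfobj, and the file name is assumed to render the location as eight hexadecimal digits.

```python
import os


def save_objects(objects, out_dir='.'):
    """Write each (index, data) pair to out_dir and return the file paths.

    Hypothetical helper, not part of rtfobj.
    """
    paths = []
    for index, data in objects:
        # Assumption: the location is rendered as 8 uppercase hex digits,
        # matching the %08X format used in the example above.
        path = os.path.join(out_dir, 'object_%08X.bin' % index)
        with open(path, 'wb') as f:
            f.write(data)
        paths.append(path)
    return paths
```

With rtfobj installed, it could be called as save_objects(rtfobj.rtf_iter_objects("myfile.rtf")). Taking any iterable of pairs keeps the helper independent of how the objects were found.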