<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title>ClamAV File Types</title>
<style type="text/css">code{white-space: pre;}</style>
<link rel="stylesheet" href="/en/github.css" type="text/css" />
</head>
<body>
<h1 id="clamav-file-types">ClamAV File Types</h1>
<p>ClamAV maintains its own file typing format and assigns these types using one of the following:</p>
<ul>
<li>Evaluation of a unique sequence of bytes at the start of a file (<a href="../../UserManual/Signatures/FileTypeMagic.html">File Type Magic</a>).</li>
<li>File type indicators when parsing container files.
<ul>
<li>For example: <code>CL_TYPE_SCRIPT</code> may be assigned to data contained in a PDF when the PDF indicates that a stream of bytes is "JavaScript".</li>
</ul></li>
<li>File type determination based on the names or characteristics contained within the file.
<ul>
<li>For example: <code>CL_TYPE_OOXML_WORD</code> may be assigned to a Zip file containing files with specific names.</li>
</ul></li>
</ul>
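<p>The first approach, matching a byte sequence at the start of a file, can be sketched as follows. This is a minimal illustration only: the prefix table uses a few widely known format markers rather than ClamAV's actual File Type Magic rules, and <code>MAGIC_PREFIXES</code>, <code>guess_type</code>, and the fallback to <code>CL_TYPE_BINARY_DATA</code> are assumptions made for the example.</p>
<pre><code class="python"># Illustrative magic-byte table. These prefixes are well-known format
# markers, not ClamAV's actual File Type Magic rules.
MAGIC_PREFIXES = [
    (b"%PDF-", "CL_TYPE_PDF"),
    (b"PK\x03\x04", "CL_TYPE_ZIP"),
    (b"\x7fELF", "CL_TYPE_ELF"),
    (b"MZ", "CL_TYPE_MSEXE"),
]


def guess_type(data):
    """Return the first CL_TYPE whose prefix matches the start of data."""
    for prefix, cl_type in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return cl_type
    # Assumed fallback for this sketch when no prefix matches.
    return "CL_TYPE_BINARY_DATA"
</code></pre>
<p>For instance, <code>guess_type(b"%PDF-1.7")</code> returns <code>CL_TYPE_PDF</code> in this sketch, while unrecognized bytes fall through to the generic type.</p>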
<hr />
<h2 id="target-types">Target Types</h2>
<p>A Target Type is an integer that indicates which kind of file the signature will match against. Target Type notation was first created for the purpose of writing efficient signatures. A signature with a Target Type of <code>0</code> is run against every file type, which is inefficient. However, the Target Type notation is limited, so using <code>0</code> may be unavoidable.</p>
<p>Although the newer CL_TYPE string name notation has replaced the Target Type for some signature formats, many signature formats still require a Target Type number.</p>
<p>This is the current list of available Target Types:</p>
<ul>
<li>0 = any file</li>
<li>1 = Portable Executable, both 32- and 64-bit.</li>
<li>2 = OLE2 containers, including their specific macros. The OLE2 format is primarily used by MS Office and MSI installation files.</li>
<li>3 = HTML (normalized)</li>
<li>4 = Mail file</li>
<li>5 = Graphics</li>
<li>6 = ELF</li>
<li>7 = ASCII text file (normalized)</li>
<li>8 = Unused</li>
<li>9 = Mach-O files</li>
<li>10 = PDF files</li>
<li>11 = Flash files</li>
<li>12 = Java class files</li>
</ul>
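<p>As a rough illustration of how a tool might interpret these numbers, the sketch below maps each Target Type from the list above to its description and reads the Target Type from a signature line. It assumes a colon-separated layout in which the Target Type is the second field (<code>Name:TargetType:Offset:HexSignature</code>). That layout, the helper name <code>describe_target</code>, and the sample signature are assumptions for the example, not something this page defines.</p>
<pre><code class="python"># Target Type numbers and descriptions, copied from the list above.
TARGET_TYPES = {
    0: "any file",
    1: "Portable Executable",
    2: "OLE2 container",
    3: "HTML (normalized)",
    4: "Mail file",
    5: "Graphics",
    6: "ELF",
    7: "ASCII text file (normalized)",
    8: "Unused",
    9: "Mach-O",
    10: "PDF",
    11: "Flash",
    12: "Java class",
}


def describe_target(signature_line):
    """Describe the Target Type of a colon-separated signature line.

    Assumes the Target Type is the second field, as in
    Name:TargetType:Offset:HexSignature.
    """
    fields = signature_line.strip().split(":")
    target = int(fields[1])
    return TARGET_TYPES.get(target, "unknown target type")
</code></pre>
<p>With the hypothetical line <code>Example.Test.Sig:1:*:4d5a9000</code>, <code>describe_target</code> returns <code>Portable Executable</code>. A value of <code>0</code> in the same position would mean the signature is evaluated against every file.</p>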
<p><em>Important</em>: HTML, ASCII, and JavaScript are all normalized.</p>
<ul>
<li>ASCII:
<ul>
<li>All lowercase.</li>
</ul></li>
<li>HTML:
<ul>
<li>Whitespace transformed to spaces, tags/tag attributes normalized, all lowercase.</li>
</ul></li>
<li>JavaScript:
<ul>
<li>All strings are normalized (hex encoding is decoded), numbers are parsed and normalized, local variables/function names are normalized to <code>n001</code> format, the argument to <code>eval()</code> is parsed as JS again, <code>unescape()</code> is handled, some simple JS packers are handled, and the output is whitespace normalized.</li>
</ul></li>
</ul>
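<p>One practical consequence is that a signature for a normalized target is matched against the normalized bytes, not the original ones. The sketch below approximates only the two simplest rules above (lowercasing, and whitespace turned into spaces). It is not ClamAV's normalizer, the function names are hypothetical, and replacing each whitespace character with exactly one space is an assumption about what "transformed to spaces" means.</p>
<pre><code class="python">import re


def normalize_ascii(text):
    """Lowercase the text, as described for the ASCII target."""
    return text.lower()


def normalize_html_whitespace(text):
    """Turn each whitespace character into a space, then lowercase."""
    return re.sub(r"\s", " ", text).lower()


def hex_pattern_for(text):
    """Hex-encode the normalized form a signature body would have to match."""
    return normalize_ascii(text).encode("ascii").hex()
</code></pre>
<p>Here <code>hex_pattern_for("EVAL(")</code> yields <code>6576616c28</code>, the hex encoding of the lowercase <code>eval(</code>. A pattern built from the uppercase bytes would not match content that has already been lowercased.</p>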
<hr />
<h2 id="cl_types">CL_TYPEs</h2>
<p>ClamAV Types are prefixed with <code>CL_TYPE_</code>. The following is an exhaustive list of all current CL_TYPEs.</p>
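<table>
<thead>
<tr><th>CL_TYPE</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td><code>CL_TYPE_7Z</code></td><td>7-Zip Archive</td></tr>
<tr><td><code>CL_TYPE_7ZSFX</code></td><td>Self-Extracting 7-Zip Archive</td></tr>
<tr><td><code>CL_TYPE_APM</code></td><td>Disk Image - Apple Partition Map</td></tr>
<tr><td><code>CL_TYPE_ARJ</code></td><td>ARJ Archive</td></tr>
<tr><td><code>CL_TYPE_ARJSFX</code></td><td>Self-Extracting ARJ Archive</td></tr>
<tr><td><code>CL_TYPE_AUTOIT</code></td><td>AutoIt Automation Executable</td></tr>
<tr><td><code>CL_TYPE_BINARY_DATA</code></td><td>binary data</td></tr>
<tr><td><code>CL_TYPE_BINHEX</code></td><td>BinHex Macintosh 7-bit ASCII email attachment encoding</td></tr>
<tr><td><code>CL_TYPE_BZ</code></td><td>BZip Compressed File</td></tr>
<tr><td><code>CL_TYPE_CABSFX</code></td><td>Self-Extracting Microsoft CAB Archive</td></tr>
<tr><td><code>CL_TYPE_CPIO_CRC</code></td><td>CPIO Archive (CRC)</td></tr>
<tr><td><code>CL_TYPE_CPIO_NEWC</code></td><td>CPIO Archive (NEWC)</td></tr>
<tr><td><code>CL_TYPE_CPIO_ODC</code></td><td>CPIO Archive (ODC)</td></tr>
<tr><td><code>CL_TYPE_CPIO_OLD</code></td><td>CPIO Archive (OLD, Little Endian or Big Endian)</td></tr>
<tr><td><code>CL_TYPE_CRYPTFF</code></td><td>Files encrypted by CryptFF malware</td></tr>
<tr><td><code>CL_TYPE_DMG</code></td><td>Apple DMG Archive</td></tr>
<tr><td><code>CL_TYPE_EGG</code></td><td>ESTSoft EGG Archive, new in 0.102</td></tr>
<tr><td><code>CL_TYPE_ELF</code></td><td>ELF Executable (Linux/Unix program or library)</td></tr>
<tr><td><code>CL_TYPE_GIF</code></td><td>GIF Graphics File, new in 0.103</td></tr>
<tr><td><code>CL_TYPE_GPT</code></td><td>Disk Image - GUID Partition Table</td></tr>
<tr><td><code>CL_TYPE_GRAPHICS</code></td><td>Other graphics files; BMP, JPEG2000</td></tr>
<tr><td><code>CL_TYPE_GZ</code></td><td>GZip Compressed File</td></tr>
<tr><td><code>CL_TYPE_HTML_UTF16</code></td><td>Wide-Character / UTF16 encoded HTML</td></tr>
<tr><td><code>CL_TYPE_HTML</code></td><td>HTML data</td></tr>
<tr><td><code>CL_TYPE_HWP3</code></td><td>Hangul Word Processor (3.X)</td></tr>
<tr><td><code>CL_TYPE_HWPOLE2</code></td><td>Hangul Word Processor embedded OLE2</td></tr>
<tr><td><code>CL_TYPE_INTERNAL</code></td><td>Internal properties</td></tr>
<tr><td><code>CL_TYPE_ISHIELD_MSI</code></td><td>Windows Install Shield MSI installer</td></tr>
<tr><td><code>CL_TYPE_ISO9660</code></td><td>ISO 9660 file system for optical disc media</td></tr>
<tr><td><code>CL_TYPE_JAVA</code></td><td>Java Class File</td></tr>
<tr><td><code>CL_TYPE_JPEG</code></td><td>JPEG Graphics File, new in 0.103.1</td></tr>
<tr><td><code>CL_TYPE_LNK</code></td><td>Microsoft Windows Shortcut File</td></tr>
<tr><td><code>CL_TYPE_MACHO_UNIBIN</code></td><td>Universal Binary/Java Bytecode</td></tr>
<tr><td><code>CL_TYPE_MACHO</code></td><td>Apple/NeXTSTEP Mach-O Executable file format</td></tr>
<tr><td><code>CL_TYPE_MAIL</code></td><td>Email file</td></tr>
<tr><td><code>CL_TYPE_MBR</code></td><td>Disk Image - Master Boot Record</td></tr>
<tr><td><code>CL_TYPE_MHTML</code></td><td>MHTML Saved Web Page</td></tr>
<tr><td><code>CL_TYPE_MSCAB</code></td><td>Microsoft CAB Archive</td></tr>
<tr><td><code>CL_TYPE_MSCHM</code></td><td>Microsoft CHM help archive</td></tr>
<tr><td><code>CL_TYPE_MSEXE</code></td><td>Microsoft EXE / DLL Executable file</td></tr>
<tr><td><code>CL_TYPE_MSOLE2</code></td><td>Microsoft OLE2 Container file</td></tr>
<tr><td><code>CL_TYPE_MSSZDD</code></td><td>Microsoft Compressed EXE</td></tr>
<tr><td><code>CL_TYPE_NULSFT</code></td><td>NullSoft Scripted Installer program</td></tr>
<tr><td><code>CL_TYPE_OLD_TAR</code></td><td>TAR archive (old)</td></tr>
<tr><td><code>CL_TYPE_OOXML_HWP</code></td><td>Hangul Office Open Word Processor (5.X)</td></tr>
<tr><td><code>CL_TYPE_OOXML_PPT</code></td><td>Microsoft Office Open XML PowerPoint</td></tr>
<tr><td><code>CL_TYPE_OOXML_WORD</code></td><td>Microsoft Office Open Word 2007+</td></tr>
<tr><td><code>CL_TYPE_OOXML_XL</code></td><td>Microsoft Office Open Excel 2007+</td></tr>
<tr><td><code>CL_TYPE_PART_HFSPLUS</code></td><td>Apple HFS+ partition</td></tr>
<tr><td><code>CL_TYPE_PDF</code></td><td>Adobe PDF document</td></tr>
<tr><td><code>CL_TYPE_PNG</code></td><td>PNG Graphics File, new in 0.103</td></tr>
<tr><td><code>CL_TYPE_POSIX_TAR</code></td><td>TAR archive</td></tr>
<tr><td><code>CL_TYPE_PS</code></td><td>Postscript</td></tr>
<tr><td><code>CL_TYPE_RAR</code></td><td>RAR Archive</td></tr>
<tr><td><code>CL_TYPE_RARSFX</code></td><td>Self-Extracting RAR Archive</td></tr>
<tr><td><code>CL_TYPE_RIFF</code></td><td>Resource Interchange File Format container formatted file</td></tr>
<tr><td><code>CL_TYPE_RTF</code></td><td>Rich Text Format document</td></tr>
<tr><td><code>CL_TYPE_SCRENC</code></td><td>Files encrypted by ScrEnc malware</td></tr>
<tr><td><code>CL_TYPE_SCRIPT</code></td><td>Generic type for scripts (Javascript, Python, etc)</td></tr>
<tr><td><code>CL_TYPE_SIS</code></td><td>Symbian OS Software Installation Script Archive</td></tr>
<tr><td><code>CL_TYPE_SWF</code></td><td>Adobe Flash File (LZMA, Zlib, or uncompressed)</td></tr>
<tr><td><code>CL_TYPE_TEXT_ASCII</code></td><td>ASCII text</td></tr>
<tr><td><code>CL_TYPE_TEXT_UTF16BE</code></td><td>UTF-16BE text</td></tr>
<tr><td><code>CL_TYPE_TEXT_UTF16LE</code></td><td>UTF-16LE text</td></tr>
<tr><td><code>CL_TYPE_TEXT_UTF8</code></td><td>UTF-8 text</td></tr>
<tr><td><code>CL_TYPE_TIFF</code></td><td>TIFF Graphics File (Little or Big Endian), new in 0.103.1</td></tr>
<tr><td><code>CL_TYPE_TNEF</code></td><td>Microsoft Outlook &amp; Exchange email attachment format</td></tr>
<tr><td><code>CL_TYPE_UUENCODED</code></td><td>UUEncoded (Unix-to-Unix) binary file (Unix email attachment)</td></tr>
<tr><td><code>CL_TYPE_XAR</code></td><td>XAR Archive</td></tr>
<tr><td><code>CL_TYPE_XDP</code></td><td>Adobe XDP - Embedded PDF</td></tr>
<tr><td><code>CL_TYPE_XML_HWP</code></td><td>Hangul Word Processor XML (HWPML) Document</td></tr>
<tr><td><code>CL_TYPE_XML_WORD</code></td><td>Microsoft Word 2003 XML Document</td></tr>
<tr><td><code>CL_TYPE_XML_XL</code></td><td>Microsoft Excel 2003 XML Document</td></tr>
<tr><td><code>CL_TYPE_XZ</code></td><td>XZ Archive</td></tr>
<tr><td><code>CL_TYPE_ZIP</code></td><td>Zip Archive</td></tr>
<tr><td><code>CL_TYPE_ZIPSFX</code></td><td>Self-Extracting Zip Archive</td></tr>
</tbody>
</table>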
</body>
</html>