Posted by Javier Guerrero, September 8th, 2010

Sometimes when writing my posts, I get the urge to forget about malware for a while and talk about the other “side”: antivirus software. Specifically, I like to stress the difficulty involved in certain aspects of developing anti-malware products; I think it’s an interesting subject, and one that is not widely understood.

False positives
False positives

And so now, I’d like to talk about a problem that affects all malware detection software: false positives… So what are they?

A false positive occurs when an antivirus erroneously identifies a legitimate file or process as malware. This can happen with signature-based scans as well as behavior analysis.

An antivirus identifies malware basically using one of two methods: signature-based scanning or analysis of behavior. In the first instance, the scanner looks for a specific pattern of bytes, which has been previously catalogued as malicious, or at least suspicious, and may correspond to a sequence of malware commands, a univocal value that identifies the file (known as a hash) or other values that may be used for identification.

In the case of behavior analysis, actions are detected which, although on their own may not be malicious, when they are correlated with others represent a symptom of malicious activity.

The problem is that neither of these methods is infallible: the hash of a file is useless, for example, against polymorphic viruses, or expackers. Moreover, a sequence of instructions classified as suspicious could easily be contained in a legitimate file, as after all, we are talking about executable code.

The same thing occurs with behavior analysis: The process that generates an executable file, which later writes a registry entry referring to the executable, could be an intruder inserting a rootkit on the system, but also the installer of a bona fide application.

The consequences of false positives can be serious: If an antivirus erroneously deletes a file which is vital to the functioning of the computer, the system could be rendered unusable, and this does actually happen, with grave repercussions.

Fortunately, false positives are not frequent (particularly in relation to the immense amount of files that anti-viruses have to scan) and security companies implement strict quality control to avoid them.

In any event, as I mentioned in the beginning, all developers suffer from this problem, which, I believe, demonstrates how challenging it is to develop and anti-malware product.