Wednesday, January 28, 2009

Can you test AV using VirusTotal?

Just a little post to bait Kurt :-)

Many people are up in arms about the idea of submitting a sample to VirusTotal and interpreting the (usually rather poor) detection count. A few links to get you started:

Now, if you read through those, you might think that such numbers are entirely without merit and are produced only by ambulance-chasing amateurs who don't know better or want to create sensationalist headlines. The truth is of course somewhere in the middle, but many of the arguments given by the anti-VT-numbers camp don't seem bulletproof to me:

  • The biggest argument seems to be that full-blown AV installations have "other means" of detecting malware which are not incorporated in the command-line scanners. Here is my problem with this argument:
    • Nobody seems to be able to point the finger at what technology that should be. Is it a firewall which asks "program X wants to connect to the Internet. Allow / Deny"? Then the AV might just be perfect :-). Didier mentions McAfee's ScriptScan, but he is the one pointing out that it is easily circumvented (and I didn't even mention the issue that the technology is IE-specific AFAIK, so Firefox / Opera users don't have the extra protection).
    • This is basically the same configuration which runs on e-mail gateways / http proxies, so if this doesn't catch it, neither will your proxy!
    • Another magical pixie dust feature would be "sandboxing", which is "not used" on VirusTotal. In fact it is! Many AV products include an x86 execution engine and use it for unpacking or detecting different behavior. Of course this is limited by time, but it is still "executing in an emulated environment". (Two products which rely heavily on this and are present on VT are Norman and ESET NOD32.) So you have your sandboxing.
    The gist of the matter is: if it can't detect it "offline", you're in very muddy waters, praying that the "online" detection will catch it before it can do real damage. For example, it is conceivable that such online detection works by accumulating scores, which would mean that the file has already performed several dubious actions before it accumulated a high enough score. And then there is the whole issue of bugs in this software.
  • Virus signatures are not updated frequently - well, they are updated more frequently than the ones on the average user's computer.
  • AV software is not configured properly - the guys at VT are very good about getting back to you, and I'm sure that if this were a real concern, companies would have already advised them on how to do it.
  • The fact that no detection is returned might mean that the AV timed out - so what? Most desktop AV programs contain hard limits on execution time and file size (to avoid making the computer unusable) and if either of those is tripped, the file is not scanned!
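
The score-accumulation worry mentioned above can be made concrete with a small sketch. This is a hypothetical model, not how any particular product works: the action names, the weights and the threshold are all assumptions made up for illustration. It only shows that a monitor which waits for a cumulative score lets the earlier actions complete before it steps in.

```python
# Hypothetical weights for suspicious runtime actions (illustrative only,
# not taken from any real AV product).
ACTION_WEIGHTS = {
    "open_network_connection": 2,
    "write_to_system_dir": 3,
    "add_autorun_key": 4,
    "inject_into_process": 5,
}
BLOCK_THRESHOLD = 8  # assumed score at which the monitor intervenes


def actions_completed_before_block(actions, threshold=BLOCK_THRESHOLD):
    """Return the actions that ran to completion before the score hit the threshold."""
    score = 0
    completed = []
    for action in actions:
        score += ACTION_WEIGHTS.get(action, 0)
        if score >= threshold:
            # The monitor blocks this action; everything in `completed`
            # has already happened on the machine.
            break
        completed.append(action)
    return completed


sample_behaviour = [
    "open_network_connection",
    "write_to_system_dir",
    "add_autorun_key",
    "inject_into_process",
]
already_done = actions_completed_before_block(sample_behaviour)
print(already_done)
```

With these made-up numbers the sample is stopped at its third action, after it has already opened a connection and written to a system directory.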

So there you have it: the detection in the "real world" might be slightly better than on VT, but not so much better that you can disqualify the results.

9 comments:

  1. in order
    1) while you may have a problem with the idea that not all of an av product's detective capabilities are present in the command-line scanner, it's still a fact rather than an argument
    1a) what technology is missing varies depending on the vendor, but as a general rule any detective capability based on runtime behavioural detection will not be present (because scanners do not run their targets)
    1b) as someone who is in the av industry, i would have thought you already knew the detective capability of gateway scanning doesn't equal that of the collection of endpoint detection technologies that a product deploys on the desktop machines - so yes, it's the same as gateway scanning and yes gateway scanning has less detective capability than that which an end user would see with a desktop product
    1c) you're arguing that the products virustotal uses include these technologies (and they do) but you have failed to show that those technologies remain enabled in the configuration used by virustotal (nor can you show this without detailed knowledge of the configurations used by virustotal, something the hispasec folks do have, by comparison)
    1summary) yes there are problems with behavioural detection, but there are also limits to what known-malware scanning can do on its own - that's why it's getting complemented with additional technologies like behavioural detection
    2) the fact that many users act to block the updating of their own av software doesn't change the fact that the av software is capable of detecting more than it is detecting, it only points to the fact that many users fail to use av properly
    3) i would reword that to say the av software is not configured OPTIMALLY, but that just goes back to previous points and it's not something virustotal can necessarily fix... the virustotal service is a file processing service, on live machines there's more than just file processing going on and the resources required to test the samples in that sort of context is far beyond what is needed for a simple file processing service and perhaps even beyond what we can reasonably expect to be offered for free
    4) timeout constraints set for a webservice like virustotal are necessarily much more strict than those for a desktop av due to the scale of the operation

    summary) "real world" testing is just as ridiculous... you need proper testing to see what an av product's detective capability can be when used properly, and then you need to use it properly to achieve those results in real life...

  2. by the way - why bait me? am i not already your top commenter by a wide margin?

  3. With the baiting: it was just a lighthearted joke - I really should start putting smileys in my posts :-)

  4. maybe i should put smileys in my comments then :P

  5. I'm not convinced (isn't that a surprise [grin]). Here are my counter-counter arguments:

    - I'm 100% with you that not all capabilities of all products are present. I'm just arguing that (empirically) this makes up a very small percent of the detections and AFAIK nobody was able until now to quantify these features.

    - Getting back to the "how are those scanners configured" question - well, most of the tests omit the exact configuration of the products. Again, given how VT is not in the "testing" business, there isn't any reason why they wouldn't configure the engines the way companies ask them, which is much better than most of the testers do (who usually use "default" configurations to test).

    - Another argument in favor of VT is the flux of samples they get. Let's say that they process 10 000 malware samples a day (a conservative estimate). Av-comparatives works with a collection of ~1 000 000 files spanning the last 6 months. During the same period (roughly 180 days) VT would have seen ~1 800 000 samples. Of course some of those are duplicates, some of those are damaged and so on, but still, the numbers favor VT.

    I'm still of the opinion that the numbers VT sees (but doesn't publish for political reasons) are very relevant and probably close to the ones seen by organizations like av-comparatives or av-test. Of course, uploading a couple of random samples and using the results to declare "AV is dead" is not valid, but uploading known malware by SANS incident handlers and seeing the poor detection rate is a good indication of the reaction time for AV products.

  6. in order:
    1) that would imply that the non-scanner based detections offer very little improvement over just using a plain scanner alone - or in other words simple scanning alone is nearly as good as it gets... not only does that not make sense when you consider the kinds of results we've seen in the past for retrospective testing, but that's also something that proponents of alternative detection technologies (the people trash-talking av) are very unlikely to accept...

    2) indeed, i fully expect vt does configure them the way av companies suggest, but av companies are constrained by the operating environment and they aren't being particularly transparent about what sacrifices they're making when they're giving vt configuration suggestions...

    3) argumentum ad numerum - more isn't necessarily better, especially when we don't know for sure those are really malware samples... i blogged recently about someone using the output from metasploit in a demonstration with virustotal - and it's clear to me at least that the output from metasploit is NOT malware...

    summary) i don't trust 'incident handlers' as far as i can throw them when it comes to malware - at least not since i witnessed the complete inability of isc incident handlers to recognize and properly parse a fully caro compliant malware name... just because someone's job is 'incident handler' doesn't mean they're qualified to determine something is/isn't malware... besides that, virustotal is still missing the non-scanner based detection capabilities of av products and not even i accept that the difference that makes is negligible...

  7. Anonymous, 11:15 AM

    Kurt Wismer = A**Hol*

  8. @Anonymous: I posted your comment because it is my policy not to moderate comments other than spam, but please refrain from ad hominem attacks. They bring nothing useful to the discussion.

    Also, while I don't know Kurt personally, I consider him a very knowledgeable person in the field of (anti-)malware who has opinions which are well founded.

  9. @cdman83
    thanks for that, though i do find the input valuable and suspect there are strong arguments for calling me an a-hole. i often don't tread as lightly as i probably should. i can get condescending and snide and a bunch of other things that i should probably try to keep a better lid on, if only it would occur to me at the time.
