

Malware detection is in a coevolutionary arms race where the attackers and defenders are constantly seeking advantage. This arms race is asymmetric: detection is harder and more expensive than evasion. White hats must be conservative to avoid false positives when searching for malicious behaviour. Most of the time, black hats need only make incremental changes to evade detection. On occasion, white hats make a disruptive move and find a new technique that forces black hats to work harder. Examples include system calls, signatures and machine learning. We present a method, called Hothouse, that combines simulation and search to accelerate the white hat’s ability to counter the black hat’s incremental moves, thereby forcing black hats to perform disruptive moves more often. To realise Hothouse, we evolve EEE, an entropy-based polymorphic packer for Windows executables. Playing the role of a black hat, EEE uses evolutionary computation to disrupt the creation of malware signatures. We enter EEE into the detection arms race with VirusTotal, the most prominent cloud service for running anti-virus tools on software. During our 6-month study, we continually improved EEE in response to VirusTotal, eventually learning a packer that produces packed malware whose median detection proportion falls from an initial 51.8% to 19.6%. We report both how well VirusTotal learns to detect EEE-packed binaries and how well VirusTotal forgets in order to reduce false positives. VirusTotal’s tools learn and forget fast, in about 3 days. We also show where VirusTotal focuses its detection efforts, by analysing EEE’s variants.

The malware detection arms race favours black hats, who can effectively respond to detection efforts with low-cost incremental moves. White hats are trapped chasing black hats’ moves. To get ahead, white hats must make a disruptive move. A good example of such a disruptive move is forcing black hats to evade machine learning-based detectors.

Once black hats have adapted to the new detection technique, the race reverts to a status quo where black hats need only make trivial modifications to a malware sample. This has already happened with malware detection, where black hats can easily and effectively add opaque predicates to evade machine learning algorithms. Black hats are exploiting the new status quo with a flood of cheap, evasive variants and are provoking white hats to develop new disruptive techniques; white hats’ efforts currently centre on semantics-based detection methods and general improvements to their machine learning-based detectors.

Simulating these black hats’ incremental moves will speed white hats’ ability to detect variants based on them, and force black hats to make disruptive moves instead of incremental ones. We study this arms race as a coevolution. Search and, specifically, genetic algorithms simulate evolutionary behaviour, usually with a restricted palette so as to make the simulation tractable. We use genetic algorithms to place evolutionary stress on state-of-the-art malware detectors so that they have to improve their detection and quickly improve their ability to detect incrementally produced variants. This evolutionary stress ‘hothouses’ the detector, speeding its acquisition of detection power, modulo the last disruptive move (Section 2). The question that provoked this research is “Can we engineer a situation in which artificial evolutionary behaviour can assist in ‘hothousing’ the actual malware detection arms race to the benefit of the detection side?”. In this coevolution, the detector and the evader try to defeat each other. VirusTotal’s existence makes answering our question possible. VirusTotal is a framework to which different vendors contribute anti-virus (AV) engines. The plan was to treat the AV engines of VirusTotal as black boxes with which we interact in real time. We measured success in terms of driving down their median detection proportion across the malware submitted to anti-virus engines. Our long-term aim is, of course, to strengthen them through coevolution. In addition, we studied their responses individually and in aggregate to learn what the black boxes were doing.
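
The success measure used in this study, the median detection proportion across submitted malware, can be illustrated with a small sketch. It rests on an assumption about how the measure is computed: each sample's detection proportion is taken to be the fraction of engines that flag it, and the median is then taken over samples. The report format (a mapping from engine name to a boolean verdict) and the function names are hypothetical, chosen for illustration only; they are not VirusTotal's API or the authors' tooling.

```python
from statistics import median


def detection_proportion(report):
    """Fraction of engines in one report that flagged the sample.

    `report` is a hypothetical mapping: engine name -> True if flagged.
    """
    if not report:
        raise ValueError("report has no engine verdicts")
    flagged = sum(1 for verdict in report.values() if verdict)
    return flagged / len(report)


def median_detection_proportion(reports):
    """Median of the per-sample detection proportions (assumed definition)."""
    return median(detection_proportion(r) for r in reports)


# Three made-up samples scanned by four made-up engines.
reports = [
    {"engine_a": True, "engine_b": True, "engine_c": False, "engine_d": False},
    {"engine_a": True, "engine_b": False, "engine_c": False, "engine_d": False},
    {"engine_a": True, "engine_b": True, "engine_c": True, "engine_d": False},
]
print(median_detection_proportion(reports))
```

Under this reading, a falling median means that more than half of the submitted variants are being flagged by a smaller share of engines than before, which makes the measure robust to a few variants that are detected by nearly every engine.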
