Carnegie Mellon University


September 13, 2023

Researchers develop adversarial training methods to improve machine learning-based malware detection software

By Ryan Noone

Over the past several years, machine learning has rapidly transformed the way many computer-related tasks are thought about and performed. Its ability to find patterns and process large amounts of data lends itself to countless applications.

When it comes to malware detection, machine learning has streamlined a once daunting task, enabling antivirus software to detect potential attacks more efficiently and with a higher success rate. Years ago, antivirus software relied on knowledge of prior attacks, essentially comparing a program's code against an extensive list of known malicious binaries to determine which programs might cause harm. Today, machine learning leverages behavioral and static artifacts to identify ever-evolving malware attacks, improving the effectiveness of antivirus software.
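
For contrast, that older signature-style check can be sketched in a few lines of Python; the signature collection below is a hypothetical placeholder, not a real threat feed. Its weakness is also visible in the sketch: any change to a binary's bytes defeats an exact-match lookup.

```python
# A minimal sketch of signature-based detection: hash a program's
# bytes and look the digest up among previously seen malicious
# binaries. The empty set is a placeholder for illustration only.
import hashlib

KNOWN_MALICIOUS_SHA256 = set()  # would hold digests of binaries from prior attacks

def is_known_malware(binary: bytes) -> bool:
    # Flags a program only if its exact bytes were seen before;
    # even a one-byte change produces a different digest.
    digest = hashlib.sha256(binary).hexdigest()
    return digest in KNOWN_MALICIOUS_SHA256
```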

However, with new technology come many unknowns, leaving it to researchers to uncover potential vulnerabilities.

"For some of the newest machine learning technologies, like generative AI, we don't fully understand how they can be attacked, so the first step is to figure out what the threat model looks like," said Lujo Bauer, professor in Carnegie Mellon’s Electrical and Computer Engineering and Software and Societal Systems departments.

In 2021, Bauer and a team of researchers demonstrated that machine learning-based malware detectors could be misled by creating variants of malicious binaries, referred to as adversarial examples, that are transformed in a functionality-preserving way to evade detection.
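
One concrete instance of such a transformation is appending unused bytes to the end of a Windows executable: the loader ignores this overlay data, so the program runs exactly as before, yet a detector reading raw bytes sees a different file. The sketch below illustrates the idea under that assumption; `score_malware`, `append_overlay`, and `evades` are hypothetical names, and the attacks studied in the paper use more sophisticated transformations.

```python
# A minimal sketch of a functionality-preserving evasion attempt:
# pad the end of a binary with bytes the loader never executes,
# then check whether a raw-byte classifier's score drops below
# its detection threshold. Illustrative only.
import random

def append_overlay(binary: bytes, n_bytes: int = 1024, seed: int = 0) -> bytes:
    # Appending past the end of a PE file does not alter execution,
    # so the variant keeps the original program's functionality.
    rng = random.Random(seed)
    padding = bytes(rng.randrange(256) for _ in range(n_bytes))
    return binary + padding

def evades(score_malware, binary: bytes, threshold: float = 0.5, tries: int = 100):
    # Search over random paddings for a variant scored as benign.
    for seed in range(tries):
        variant = append_overlay(binary, seed=seed)
        if score_malware(variant) < threshold:
            return variant  # functionality preserved, detection evaded
    return None
```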

"It's just like any other machine learning classifier. If you know how it works, you may be able to tweak the input so the classifier makes the wrong decision in recognizing an attack," said Bauer.

In their latest paper, "Adversarial Training for Raw-Binary Malware Classifiers," the researchers investigate the effectiveness of adversarial training methods in creating malware-detection models that are more robust to some state-of-the-art attacks. To train these models, the study's authors found ways to increase the efficiency and scale of creating adversarial examples, making adversarial training practical.
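
At a high level, adversarial training folds evasive variants back into the training data so the classifier learns to label them correctly. The sketch below, assuming a PyTorch model over raw bytes and a pluggable `make_adversarial` attack (such as the padding search above), shows the shape of that loop; it is an illustration, not the authors' implementation.

```python
# A minimal sketch of an adversarial-training loop, assuming a
# PyTorch classifier over raw-byte tensors. `make_adversarial`
# stands in for any functionality-preserving attack.
import torch
import torch.nn.functional as F

def adversarial_train(model, loader, make_adversarial, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for xb, yb in loader:  # xb: raw-byte batches, yb: 0/1 labels
            # Generate adversarial variants of the batch and train on
            # both, so evasive inputs are classified correctly too.
            x_adv = make_adversarial(model, xb, yb)
            x_all = torch.cat([xb, x_adv])
            y_all = torch.cat([yb, yb])
            loss = F.cross_entropy(model(x_all), y_all)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```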

"This type of research hasn't been done in the malware domain before, mainly because of how expensive it is to create adversarial examples," said Bauer. "For this type of training, you need to create hundreds of thousands of examples, which, in the past, hasn't been practically feasible."

To overcome this challenge, the researchers used fewer attack iterations, optimized their code, parallelized the work, and increased the pool of attack-eligible binaries, allowing them to create adversarial examples quickly and affordably. They then analyzed the effects of varying the length of adversarial training and the impact of training with various types of attacks.
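
As a rough illustration of the parallelization and reduced-iteration steps, the sketch below farms a capped-budget attack out across worker processes; `attack_once` is a hypothetical per-binary attack function, not the authors' code.

```python
# A minimal sketch of generating adversarial examples in parallel
# with a reduced per-binary iteration budget. `attack_once` must be
# a module-level function so it can be pickled for worker processes.
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def generate_examples(binaries, attack_once, max_iters=10, workers=8):
    # Capping max_iters trades attack strength for throughput,
    # which is what makes large-scale generation affordable.
    job = partial(attack_once, max_iters=max_iters)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(job, binaries))
```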

The authors' findings show that while augmentation alone does not deter state-of-the-art attacks, using an adapted version of a generic gradient-guided method can improve robustness. The results revealed that, in most cases, models can be made more robust to malware-domain attacks by adversarially training them with lower-effort versions of the same attack. In the best case, the success rate of one state-of-the-art attack was reduced by 85 percent. The authors also found that training with some types of attacks can increase a model's robustness to other attacks.

Overall, their work shows that it is possible to defend against adversarial examples in the raw-binary malware-detection domain through adversarial training. It serves as an encouraging development and a guide for creating more robust raw-binary malware classifiers that are less susceptible to evasion attacks.

Paper reference:

Adversarial Training for Raw-Binary Malware Classifiers
Presented at the 32nd USENIX Security Symposium