What is Code Insight? It is a new tool integrated into the VirusTotal online service that analyzes the potentially malicious behavior of suspicious files using generative artificial intelligence. Google has also presented Sec-PaLM, the large language model on which VirusTotal's new tool is based.
VirusTotal is among the best known and most widely used online services for scanning files and Internet addresses for malware.
In our guide to VirusTotal we saw that it is a free online service (paid plans offering more advanced features are also available) that analyzes files and URLs to detect malware, viruses, trojans and other types of computer threats. It was created in 2004 by a Spanish company, was acquired by Google in 2012 and is now maintained by a subsidiary (Chronicle).
Users can upload files or submit URLs to VirusTotal, and the service scans them with dozens of third-party antivirus and anti-malware engines, including well-known names like Sophos, BitDefender, ESET, TrendMicro, Malwarebytes, F-Secure, AVG, Avast, McAfee, Kaspersky, Dr.Web, Symantec and many more. After the scan, VirusTotal provides a detailed report indicating how many antivirus solutions detected malware in the scanned file or URL.
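As a rough illustration of how such a report boils down to a detection ratio, the sketch below tallies per-engine verdicts. The `report` dictionary, its verdict labels and the `detection_ratio` helper are invented for this example and do not reproduce VirusTotal's actual report format.

```python
from collections import Counter

# Hypothetical, simplified report: engine name -> verdict label.
# The engine names come from the article; the verdicts are made up.
report = {
    "Sophos": "malicious",
    "BitDefender": "malicious",
    "ESET": "undetected",
    "Kaspersky": "malicious",
    "Malwarebytes": "undetected",
}


def detection_ratio(verdicts: dict[str, str]) -> tuple[int, int]:
    """Return (engines that flagged the item, engines that scanned it)."""
    counts = Counter(verdicts.values())
    return counts["malicious"], len(verdicts)


flagged, total = detection_ratio(report)
print(f"{flagged}/{total} engines flagged this item")  # 3/5 engines flagged this item
```

A ratio like this only says how many engines agree, not why; that gap is what a content-based explanation is meant to fill.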
VirusTotal is widely used by computer security professionals, malware researchers and ordinary users interested in checking the security of a file or a website before opening or visiting it. The goal is to provide a fast and accurate analysis service, to help identify potential cyber threats and improve the security of online and offline information.
VirusTotal Code Insight, threat analysis with generative artificial intelligence
At the RSA Conference in April 2023, VirusTotal announced that it had adopted a generative model developed specifically for malware detection and for determining the behavior of potentially harmful items.
The new feature integrated into VirusTotal is called Code Insight and is in turn based on the Google Cloud Security AI Workbench (featured in this YouTube video), a newly unveiled platform that uses Sec-PaLM, a large language model (LLM) developed specifically for cybersecurity needs and able to “understand” what the files being scanned or running on a system do.
Initially, Code Insight was configured to analyze a subset of the PowerShell files uploaded to VirusTotal. The system excludes files that are very similar to those already processed as well as items that are too large. This approach makes efficient use of analysis resources, ensuring that only the most relevant files (such as PowerShell scripts in PS1 format) are subject to scrutiny.
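The selection logic described above can be pictured with a minimal sketch. The article does not say how VirusTotal measures similarity or where it sets the size limit, so everything here is an assumption: `MAX_BYTES` and `SIMILARITY_CAP` are made-up values, `should_analyze` is a hypothetical helper, and Python's `difflib.SequenceMatcher` merely stands in for whatever near-duplicate detection the service really uses.

```python
from difflib import SequenceMatcher

MAX_BYTES = 4096        # hypothetical size cap, not VirusTotal's real limit
SIMILARITY_CAP = 0.9    # hypothetical threshold for "very similar"


def should_analyze(name: str, text: str, already_seen: list[str]) -> bool:
    """Decide whether a script is worth sending to the (costly) model."""
    if not name.lower().endswith(".ps1"):
        return False    # the initial rollout covered PowerShell only
    if len(text.encode("utf-8")) > MAX_BYTES:
        return False    # item is too large
    for previous in already_seen:
        if SequenceMatcher(None, text, previous).ratio() >= SIMILARITY_CAP:
            return False    # near-duplicate of a file already processed
    return True


seen = ["Get-Process | Where-Object { $_.CPU -gt 100 }"]
near_copy = "Get-Process | Where-Object { $_.CPU -gt 101 }"
different = "Invoke-WebRequest -Uri $url -OutFile $path"

print(should_analyze("a.ps1", near_copy, seen))   # False: near-duplicate
print(should_analyze("b.ps1", different, seen))   # True: new content
```

Comparing every new file against every earlier one would not scale to a service of this size; the pairwise loop is kept only to make the idea readable.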
Code Insight also helps to identify false positives and false negatives, since its analysis is completely independent of the associated metadata (such as antivirus results): only the content of the file is examined.
“The integration of the LLM into the arsenal of code analysis tools represents a significant advance that allows security professionals to obtain valuable insights into the structure and behavior of potentially malicious code, improving threat detection and response efficiency”, commented Bernardo Quintero, founder of VirusTotal.
VirusTotal engineers had said in recent days that they would add support for other file formats alongside PS1, with the precise aim of extending the reach of this new feature. No sooner said than done.
Code Insight can now also analyze batch files in BAT and CMD format, Shell scripts (SH) and VBScript files (VBS), the latter being scripts that Microsoft intends to disable soon in Windows 11. Although there is no official confirmation yet, the new version of Code Insight also appears to work with AutoHotkey (AHK) and Python (PY) files. Quintero added that the system can now scan files twice as large as before.
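For quick reference, the formats mentioned above can be summarized as a small lookup table. This is only a sketch built from the article's own list: the `SUPPORTED` table and the `script_type` helper are illustrative names, and matching on the file extension is an assumption about how a client might pre-sort files, not a description of how Code Insight recognizes them.

```python
from pathlib import Path

# Extension -> (script type, support status), as reported in the article.
# "unconfirmed" marks formats that only appear to work.
SUPPORTED = {
    ".ps1": ("PowerShell", "confirmed"),
    ".bat": ("Batch", "confirmed"),
    ".cmd": ("Batch", "confirmed"),
    ".sh": ("Shell", "confirmed"),
    ".vbs": ("VBScript", "confirmed"),
    ".ahk": ("AutoHotkey", "unconfirmed"),
    ".py": ("Python", "unconfirmed"),
}


def script_type(filename: str):
    """Return (script type, support status), or None for other formats."""
    return SUPPORTED.get(Path(filename).suffix.lower())


for name in ("install.PS1", "setup.cmd", "macro.ahk", "photo.jpg"):
    print(name, "->", script_type(name))
```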
As is the case with any other LLM, it should be noted that the output returned by Code Insight can contain errors and its accuracy varies: security analysts should therefore interpret the information generated by Code Insight in light of the context data for each file being analyzed.