Applying AI Analysis to PDF Threats

In our previous post we extended VirusTotal Code Insights to browser extensions and supply-chain artifacts. A key finding from that analysis was how our AI could apply contextual knowledge to its evaluation. It wasn’t just analyzing code in isolation, it was correlating a package’s stated purpose (its name and description) with its actual behavior, flagging malicious logic that contradicted its public description. We’re now applying the same idea to one of the most common file formats in the world, the PDF.


Audio version of this post, created with NotebookLM Deep Dive


PDFs are multi-layered. There’s the object tree (catalog, pages, objects, streams, actions, embedded files) and there’s the visible layer (text/images the user reads). Code Insights analyzes both, then correlates: does the document content, claims, and branding make sense given its internal behaviors? That lets us surface not only classic PDF exploitation (e.g., auto-actions, JS, external launches) but also pure social engineering (phishing, vishing, QR-lures) even when the file has no executable logic. This dual approach allows the AI not only to detect malicious code but also to identify sophisticated scams.

Let’s look at real-world samples surfaced by Code Insights during its initial testing phase. We’ll start with cases where the PDF contains no malicious code, which traditional engines often miss because there’s no executable payload to detect. This is where Code Insights proves useful, identifying clear signs of fraud and social engineering that aim to manipulate the user, not the machine.

Case 1 – Fake debt collection targeting financial fraud

This PDF is a real-world sample sent to VirusTotal and captured by Code Insights during early testing. It was flagged as malicious based entirely on its visible content, without relying on any embedded code or execution logic. The file was marked as clean by all other engines, likely because it contains no scripts, exploits, or embedded payloads.

Read the original article: