IRIS

IRIS is a neurosymbolic framework that combines LLMs with static analysis for security vulnerability detection. IRIS uses LLMs to generate source and sink specifications and to filter false positive vulnerable paths.

Overview

At a high level, IRIS takes a project and a CWE (vulnerability class, such as path traversal vulnerability or CWE-22) as input, statically analyzes the project, and outputs a set of potential vulnerabilities (of type CWE) in the project.

iris workflow

Key Features

Combines LLMs with static analysis for enhanced vulnerability detection
Supports multiple vulnerability classes (CWEs)
Works with various LLM models
Provides detailed analysis and results
Easy to set up with Docker or native installation

Getting Started

To get started with IRIS:

Check out the Environment Setup guide
Follow our Quickstart tutorial
Learn about Supported CWEs and Models
If you are interested in contributing, read our Guidelines

Resources

Workflow

IRIS accepts a codebase and a CWE (Common Weakness Enumeration) as input, and uses a neurosymbolic approach to identify security vulnerabilities of type CWE in the project. To achieve this, IRIS uses the following steps:

iris workflow

1. First, we create CodeQL queries to collect the external APIs used in the project and all internal function parameters.
2. We use an LLM to classify the external APIs as potential sources, sinks, or taint propagators. In a separate query, we use an LLM to classify the internal function parameters as potential sources. We call these labels taint specifications.
3. Using the taint specifications from step 2, we build a project-specific and CWE-specific (e.g., for CWE-22) CodeQL query.
4. We then run the query to find vulnerabilities in the given project and post-process the results.
5. We provide the post-processed results to the LLM to filter out false positives and determine whether a CWE is detected.
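The labeling step (step 2) can be pictured with a small sketch. This is an illustration only, not the IRIS implementation: `build_taint_specs`, `keyword_classifier`, and the label names are hypothetical, and the keyword rule merely stands in for the LLM call.

```python
from typing import Callable, Dict, List

# Hypothetical label set mirroring the roles named in step 2.
LABELS = ("source", "sink", "taint-propagator", "none")


def build_taint_specs(apis: List[str], classify: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group candidate APIs by the label the classifier assigns to each one."""
    specs: Dict[str, List[str]] = {label: [] for label in LABELS}
    for api in apis:
        label = classify(api)
        if label not in specs:
            label = "none"  # treat unexpected answers as "not relevant"
        specs[label].append(api)
    return specs


def keyword_classifier(api: str) -> str:
    """Stand-in for the LLM; a real run would prompt a model instead."""
    if api.endswith("getName"):
        return "source"
    if "FileInputStream" in api:
        return "sink"
    return "none"


specs = build_taint_specs(
    [
        "java.util.zip.ZipEntry.getName",
        "java.io.FileInputStream.FileInputStream",
        "java.lang.Math.abs",
    ],
    keyword_classifier,
)
print(specs["source"], specs["sink"])
```

Passing the classifier in as a callable keeps the grouping logic independent of whichever model produces the labels.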

Dataset

We have curated a dataset of Java projects, containing 127 real-world previously known vulnerabilities across 11 popular vulnerability classes. The dataset is also available to use with the Hugging Face datasets library.

CWE-Bench-Java

CWE-Bench-Java on Hugging Face

Results

These results are from the ICLR 2025 version of IRIS, and do not include newly added CWEs or projects.

Results on the effectiveness of IRIS across 121 projects and 9 LLMs can be found at /results. Each model has its own CSV file, structured as in the following example.

CWE ID	CVE	Author	Package	Tag	Recall	Alerts	Paths	TP Alerts	TP Paths	Precision	F1
CWE-022	CVE-2016-10726	DSpace	DSpace	4.4	0	31	63	0	0	0	0

None refers to data that was not collected, while N/A refers to a measure that cannot be calculated, either because of missing data or a division by zero.
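As a rough sketch of how such a file might be read, the snippet below parses the example row and maps both None and N/A to Python's `None`. The function name `parse_results` is hypothetical, and the delimiter is a parameter because the example above is shown tab-separated while the files themselves are described as CSV.

```python
import csv
import io

# Header and example row copied from the table above (tab-separated here).
SAMPLE = (
    "CWE ID\tCVE\tAuthor\tPackage\tTag\tRecall\tAlerts\tPaths\t"
    "TP Alerts\tTP Paths\tPrecision\tF1\n"
    "CWE-022\tCVE-2016-10726\tDSpace\tDSpace\t4.4\t0\t31\t63\t0\t0\t0\t0\n"
)

NUMERIC = ("Recall", "Alerts", "Paths", "TP Alerts", "TP Paths", "Precision", "F1")
MISSING = {"None", "N/A", ""}  # not collected, or not computable


def parse_results(text, delimiter=","):
    """Parse a results table and convert the numeric columns to floats."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        for key in NUMERIC:
            value = row[key]
            row[key] = None if value in MISSING else float(value)
        rows.append(row)
    return rows


rows = parse_results(SAMPLE, delimiter="\t")
print(rows[0]["CVE"], rows[0]["Alerts"], rows[0]["Paths"])
```

Keeping missing values as `None` rather than 0 avoids skewing averages when a measure could not be calculated.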

Using Docker (Recommended)

docker build -f Dockerfile --platform linux/x86_64 -t iris:latest .
docker run --platform=linux/amd64 -it iris:latest

Note: Read the instructions for "Native Setup" ahead if you intend to configure Java build tools (JDK, Maven, Gradle) or CodeQL.

Native Setup (Mac/Linux)

Step 1: Setup Conda environment

conda env create -f environment.yml
conda activate iris

If you have a CUDA-capable GPU and want to enable hardware acceleration, install the appropriate CUDA toolkit, for example:

conda install pytorch-cuda=12.1 -c nvidia -c pytorch

Replace 12.1 with the CUDA version compatible with your GPU and drivers, if needed.

Step 2: Configure Java build tools

To apply IRIS to Java projects, you need to specify the paths to your Java build tools (JDK, Maven, Gradle) in the dep_configs.json file in the project root.

The versions of these tools required by each project are specified in data/build_info.csv. For instance, perwendel__spark_CVE-2018-9159_2.7.1 requires JDK 8 and Maven 3.5.0. You can install and manage these tools easily using SDKMAN!.

# Install SDKMAN!
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"

# Install Java 8 and Maven 3.5.0
sdk install java 8.0.452-amzn
sdk install maven 3.5.0

Step 3: Configure CodeQL

IRIS relies on the CodeQL Action bundle, which includes CLI utilities and pre-defined queries for various CWEs and languages ("QL packs").

If you already have CodeQL installed, specify its location via the CODEQL_DIR environment variable in src/config.py. Otherwise, download an appropriate version of the CodeQL Action bundle from the CodeQL Action releases page.

For the latest version: Visit the latest release and download the appropriate bundle for your OS:
- codeql-bundle-osx64.tar.gz for macOS
- codeql-bundle-linux64.tar.gz for Linux
For a specific version (e.g., 2.15.0): Go to the CodeQL Action releases page, find the release tagged codeql-bundle-v2.15.0, and download the appropriate bundle for your platform.

After downloading, extract the archive in the project root directory:

tar -xzf codeql-bundle-<platform>.tar.gz

This should create a sub-directory codeql/ with the executable codeql inside.

Lastly, add the path of this executable to your PATH environment variable:

export PATH="$PWD/codeql:$PATH"

Note: Also adjust the environment variable CODEQL_QUERY_VERSION in src/config.py according to the instructions therein. For instance, for CodeQL v2.15.0, this should be 0.8.0.
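To confirm that the executable is reachable after updating PATH, a small check along these lines may help. The helper names `find_codeql` and `codeql_version` are hypothetical; the sketch relies only on the standard library and on the CLI's `codeql version` subcommand.

```python
import shutil
import subprocess


def find_codeql(path=None):
    """Return the full path of the codeql executable, or None if it is not found."""
    return shutil.which("codeql", path=path)


def codeql_version(executable):
    """Ask the CLI for its version string."""
    result = subprocess.run(
        [executable, "version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    location = find_codeql()
    if location is None:
        print("codeql was not found on PATH")
    else:
        print(location)
        print(codeql_version(location))
```

The reported version is also a convenient input when choosing the matching CODEQL_QUERY_VERSION value.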

Quickstart

Make sure you have followed all of the environment setup instructions before proceeding!

To quickly try IRIS on the example project perwendel__spark_CVE-2018-9159_2.7.1, run the following commands:

# Build the project
python scripts/fetch_and_build.py --filter perwendel__spark_CVE-2018-9159_2.7.1

# Generate the CodeQL database
python scripts/build_codeql_dbs.py --project perwendel__spark_CVE-2018-9159_2.7.1

# Run IRIS analysis
python src/iris.py --query cwe-022wLLM --run-id test --llm qwen2.5-coder-7b perwendel__spark_CVE-2018-9159_2.7.1

This will build the project, generate the CodeQL database, and analyze it for CWE-022 vulnerabilities using the specified LLM (qwen2.5-coder-7b). The output of these three steps will be stored under data/build-info/, data/codeql-dbs/, and output/ respectively.
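If you prefer to drive the three steps from a single script, a thin wrapper such as the following could work. It is a sketch: `quickstart_commands` and `run_quickstart` are hypothetical helpers, and they assume the commands are launched from the repository root with `python` on PATH.

```python
import subprocess


def quickstart_commands(project, query="cwe-022wLLM", run_id="test", llm="qwen2.5-coder-7b"):
    """Return the three quickstart commands as argument lists."""
    return [
        ["python", "scripts/fetch_and_build.py", "--filter", project],
        ["python", "scripts/build_codeql_dbs.py", "--project", project],
        ["python", "src/iris.py", "--query", query, "--run-id", run_id, "--llm", llm, project],
    ]


def run_quickstart(project, **options):
    """Run each step in order and stop at the first failure."""
    for command in quickstart_commands(project, **options):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in quickstart_commands("perwendel__spark_CVE-2018-9159_2.7.1"):
        print(" ".join(command))
```

Using `check=True` means a failed build prevents the later steps from running against stale or missing data.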

Supported CWEs

The following CWEs are supported. Specify one of them as the argument to --query when using src/iris.py.

cwe-022wLLM - CWE-022 (Path Traversal)
cwe-078wLLM - CWE-078 (OS Command Injection)
cwe-079wLLM - CWE-079 (Cross-Site Scripting)
cwe-089wLLM - CWE-089 (SQL Injection)
cwe-094wLLM - CWE-094 (Code Injection)
cwe-295wLLM - CWE-295 (Improper Certificate Validation)
cwe-352wLLM - CWE-352 (Cross-Site Request Forgery)
cwe-502wLLM - CWE-502 (Deserialization of Untrusted Data)
cwe-611wLLM - CWE-611 (Improper Restriction of XML External Entity Reference)
cwe-807wLLM - CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
cwe-918wLLM - CWE-918 (Server-Side Request Forgery)
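The query names above follow a regular pattern: the CWE number zero-padded to three digits, wrapped in `cwe-` and `wLLM`. A small helper can derive the name from a CWE number; `query_name` is a hypothetical function, not part of IRIS, and it does not check whether the CWE is actually supported.

```python
import re


def query_name(cwe):
    """Turn 22, "22", "CWE-22", or "cwe-022" into "cwe-022wLLM"."""
    match = re.fullmatch(r"(?:CWE-)?(\d+)", str(cwe).strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a CWE identifier: {cwe!r}")
    return f"cwe-{int(match.group(1)):03d}wLLM"


for cwe in (22, "CWE-79", "cwe-918"):
    print(cwe, "->", query_name(cwe))
```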

Supported Models

We support the following models through our models API wrapper (found in src/models). Listed below are the arguments you can pass to --llm when using src/iris.py. You're free to instantiate models your own way or to extend the existing library. Some models require your own API key or a license agreement on Hugging Face.

List of Models

Codegen

codegen-16b-multi
codegen25-7b-instruct
codegen25-7b-multi

Codellama

Standard Models

codellama-70b-instruct
codellama-34b
codellama-34b-python
codellama-34b-instruct
codellama-13b-instruct
codellama-7b-instruct

CodeT5p

codet5p-16b-instruct
codet5p-16b
codet5p-6b
codet5p-2b

DeepSeek

deepseekcoder-33b
deepseekcoder-7b
deepseekcoder-v2-15b

Gemini

All Gemini models are supported, including those listed below. Other model names can be found in the Gemini API documentation.
gemini-1.5-pro
gemini-1.5-flash
gemini-pro
gemini-pro-vision
gemini-1.0-pro-vision

Gemma

gemma-7b
gemma-7b-it
gemma-2b
gemma-2b-it
codegemma-7b-it
gemma-2-27b
gemma-2-9b

GPT

gpt-4
gpt-3.5
gpt-4-1106
gpt-4-0613

LLaMA

LLaMA-2

llama-2-7b-chat
llama-2-13b-chat
llama-2-70b-chat
llama-2-7b
llama-2-13b
llama-2-70b

LLaMA-3

llama-3-8b
llama-3.1-8b
llama-3-70b
llama-3.1-70b
llama-3-70b-tai

Mistral

mistral-7b-instruct
mixtral-8x7b-instruct
mixtral-8x7b
mixtral-8x22b
mistral-codestral-22b

Qwen

qwen2.5-coder-7b
qwen2.5-coder-1.5b
qwen2.5-14b
qwen2.5-32b
qwen2.5-72b

StarCoder

starcoder
starcoder2-15b

WizardLM

WizardCoder

wizardcoder-15b
wizardcoder-34b-python
wizardcoder-13b-python

WizardLM Base

wizardlm-70b
wizardlm-13b
wizardlm-30b

Ollama

You need to install the ollama package manually.

qwen2.5-coder:latest
qwen2.5:32b
llama3.2:latest
deepseek-r1:32b
deepseek-r1:latest

IRIS Results Visualizer

A web-based visualizer for IRIS static analysis results stored in SARIF format.

iris visualizer

Features

Interactive visualization of IRIS analysis results
Code flow visualization with clickable nodes
Source code display with line highlighting
Filtering by project, CWE, and trace
Configurable paths and settings
User-driven project selection
Project metadata panel (CVE, CWE, GitHub repo, commits)

Configuration

The visualizer uses a config.json file to configure paths and settings. The configuration file is automatically created with default values if it doesn't exist.

Configuration Options

{
    "server": {
        "port": 8000,
        "host": "localhost"
    },
    "paths": {
        "outputs_dir": "../output",
        "project_sources_dir": "../data/project-sources",
        "project_info_csv": "../data/project_info.csv"
    },
    "ui": {
        "max_source_code_height": "600px",
        "default_project": "perwendel__spark_CVE-2018-9159_2.7.1"
    }
}

Path Configuration

outputs_dir: Path to the directory containing IRIS analysis outputs (SARIF files)
project_sources_dir: Path to the directory containing project source code
project_info_csv: Path to the CSV file that stores project metadata (CVE, CWE, repo, commits)

Server Configuration

port: Port number for the HTTP server (default: 8000)
host: Host address for the server (default: localhost)

UI Configuration

max_source_code_height: Maximum height for the source code display area
default_project: Project that will be pre-selected when the UI first loads
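The options above could be loaded with a defaults-then-overrides approach, as in the following sketch. It is not taken from server.py: `load_config` and `merge` are hypothetical names, and the merge behavior for partially filled files is an assumption. Only the default values and the create-if-missing behavior come from the description above.

```python
import copy
import json
from pathlib import Path

# Default values as listed in the configuration options above.
DEFAULT_CONFIG = {
    "server": {"port": 8000, "host": "localhost"},
    "paths": {
        "outputs_dir": "../output",
        "project_sources_dir": "../data/project-sources",
        "project_info_csv": "../data/project_info.csv",
    },
    "ui": {
        "max_source_code_height": "600px",
        "default_project": "perwendel__spark_CVE-2018-9159_2.7.1",
    },
}


def merge(defaults, overrides):
    """Recursively overlay user-provided values on top of the defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path="config.json"):
    """Load the config file, creating it with default values if it is missing."""
    config_path = Path(path)
    if not config_path.exists():
        config_path.write_text(json.dumps(DEFAULT_CONFIG, indent=4))
        return copy.deepcopy(DEFAULT_CONFIG)
    return merge(DEFAULT_CONFIG, json.loads(config_path.read_text()))
```

With this approach a config.json that sets only `server.port` still yields a complete configuration.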

Usage

Configure paths: Edit config.json to point to your outputs and source directories
Start the server: Run python3 server.py
Open in browser: Navigate to http://localhost:8000
Select a project: Choose a project from the dropdown to load its analysis results
Filter and explore: Use the CWE and model filters to explore specific vulnerabilities

User Flow

Project Selection: The visualizer starts with an empty state. Users must first select a project from the dropdown.
Data Loading: Once a project is selected, the visualizer loads all available SARIF files for that project.
Filtering: Users can then filter by CWE and model to focus on specific vulnerability types.
Trace Exploration: Click on traces to view detailed information, code flows, and source code.

Directory Structure

The visualizer expects the following directory structure:

output/
├── project1/
│   ├── cwe-352/
│   ├── cwe-352wLLM/
│   │   ├── results.sarif
│   │   ├── api_labels.json
│   │   └── ...
│   └── cwe-352wLLM-final/
│   └── ...
└── project2/
    ├── cwe-022/
    └── cwe-022wLLM/

data/project-sources/
├── project1/
│   ├── src/
│   │   └── main.java
│   └── ...
└── project2/
    └── ...
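To check whether an outputs directory matches this layout, you can scan it for results.sarif files two levels below the root. The sketch below assumes the `output/{project}/{query}/results.sarif` layout shown above; `find_sarif_files` is a hypothetical helper and not part of the visualizer.

```python
from pathlib import Path


def find_sarif_files(outputs_dir):
    """Map each project to {query directory name: path of its results.sarif}."""
    found = {}
    for sarif in sorted(Path(outputs_dir).glob("*/*/results.sarif")):
        project, query = sarif.parts[-3], sarif.parts[-2]
        found.setdefault(project, {})[query] = sarif
    return found


if __name__ == "__main__":
    for project, queries in find_sarif_files("../output").items():
        print(project, sorted(queries))
```

Projects that print nothing here have no SARIF results for the visualizer to load.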

API Endpoints (server-side)

GET /api/projects - List available projects
GET /api/cwes - List available CWEs
GET /api/sarif/{path} - Get SARIF file content
GET /api/source/{project}/{file} - Get source code file
GET /api/project_metadata/{project_slug} - Get metadata (CVE/CWE/GitHub info) for a project
GET /api/models?project={slug}&cwe={id} - List models that produced results for a given project/CWE
GET /api/project_cwes?project={slug} - List CWEs that have results for a given project
GET /api/source_projects - List source code projects available on disk
GET /api/local_file/{project}/{relative_path}?line={n} - Render a source file in HTML with optional line highlight
GET /api/dir?project={slug}&path={subdir} - Directory listing for the in-browser file explorer
GET /api/config - Get client configuration
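These endpoints can be exercised from a script as well as from the browser. The sketch below builds request URLs with the standard library. `api_url` and `get_json` are hypothetical helpers, the base URL assumes the default host and port, and `get_json` assumes that the endpoint returns JSON, which this document does not confirm for every endpoint.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default host and port from config.json


def api_url(endpoint, *path_parts, base_url=BASE_URL, **params):
    """Build a URL such as /api/models?project=...&cwe=..."""
    url = f"{base_url}/api/{endpoint}"
    for part in path_parts:
        url += "/" + quote(str(part), safe="/")
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(url):
    """Fetch a URL and decode the body as JSON (requires a running server)."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    print(api_url("projects"))
    print(api_url("project_cwes", project="perwendel__spark_CVE-2018-9159_2.7.1"))
```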

Customization

To use the visualizer with different data sources:

Update the paths section in config.json
Ensure your outputs directory follows the expected structure
Ensure your project sources directory contains the source code files referenced in the SARIF files

Troubleshooting

404 errors: Check that the paths in config.json are correct and the directories exist
Empty project list: Verify that the outputs directory contains the expected folder structure
Source code not loading: Ensure the project sources directory contains the referenced files
No data after project selection: Check that SARIF files exist for the selected project

Development

The visualizer consists of:

server.py - Python HTTP server with API endpoints
index.html - Main HTML interface
app.js - Frontend JavaScript logic
styles.css - CSS styling
config.json - Configuration file

File Structure

visualizer/
├── index.html          # Main HTML file
├── styles.css          # CSS styling
├── app.js             # Main JavaScript application
├── server.py          # Python HTTP server
└── README.md          # This file

Configuration

The server is configured to look for:

Outputs directory: ../output (contains SARIF files)
Project sources: ../data/project-sources (contains source code)

You can modify these default paths in server.py:

OUTPUTS_DIR = "../output"
PROJECT_SOURCES_DIR = "../data/project-sources"

Data Format

The visualizer expects:

SARIF Files: Located in output/{project}/{run-id}/cwe-{cwe_id}/results.sarif
Source Code: Located in data/project-sources/{project}/

SARIF Structure

The visualizer parses SARIF files and extracts:

Vulnerability traces (results)
Code flows (data flow paths)
File locations and line numbers
Severity levels and rule information
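For readers who want to inspect a results.sarif file outside the visualizer, the sketch below pulls the same kinds of fields from a SARIF 2.1.0 document. The property names (`runs`, `results`, `codeFlows`, `threadFlows`, `locations`, `physicalLocation`) come from the SARIF standard. The function `extract_traces` and the tiny example document are illustrative and do not reproduce the logic in app.js.

```python
def extract_traces(sarif):
    """Return one dict per SARIF result, with code flows as lists of (uri, line)."""
    traces = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            code_flows = []
            for code_flow in result.get("codeFlows", []):
                for thread_flow in code_flow.get("threadFlows", []):
                    steps = []
                    for entry in thread_flow.get("locations", []):
                        physical = entry.get("location", {}).get("physicalLocation", {})
                        steps.append((
                            physical.get("artifactLocation", {}).get("uri"),
                            physical.get("region", {}).get("startLine"),
                        ))
                    code_flows.append(steps)
            traces.append({
                "rule_id": result.get("ruleId"),
                "level": result.get("level"),  # optional in SARIF, may be None
                "message": result.get("message", {}).get("text"),
                "code_flows": code_flows,
            })
    return traces


def _step(uri, line):
    """Build one thread-flow location for the example document below."""
    return {"location": {"physicalLocation": {
        "artifactLocation": {"uri": uri}, "region": {"startLine": line}}}}


# Minimal made-up document with one result and one two-step code flow.
EXAMPLE = {"runs": [{"results": [{
    "ruleId": "example/rule",
    "message": {"text": "Example finding"},
    "codeFlows": [{"threadFlows": [{"locations": [
        _step("src/main.java", 10), _step("src/main.java", 42)]}]}],
}]}]}

for trace in extract_traces(EXAMPLE):
    print(trace["rule_id"], trace["code_flows"])
```

To use it on a real file, load the file with `json.load` and pass the resulting dictionary to `extract_traces`.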

Project Structure

output/
└── crate__crate_5.5.1_CVE-2023-51982_5.5.1/
    ├── cwe-352/
    │   ├── api_labels_gemini-1.5-flash.json
    │   ├── Spec.yml
    │   └── MySinks.qll
    ├── cwe-352wLLM/
    ├── cwe-352wLLM-final/
    └── cwe-352wLLM-posthoc-filter/

Usage Guide

1. Filtering Traces

CWE Type: Filter by specific vulnerability types (e.g., CWE-22, CWE-78)
Project: Filter by specific projects
Model: Filter by AI model used for analysis

2. Exploring Traces

Click on any trace in the left panel
View the trace description and severity
Examine the code flow steps
Click on flow steps to view source code

3. Viewing Source Code

Source code is displayed with syntax highlighting
Vulnerable lines are highlighted in red
Click on code flow steps to jump to specific lines
Line numbers are shown for easy navigation

4. Understanding the Results

Code Flow: Shows the path of data from source to sink
Source Code: Displays the actual vulnerable code
Severity: Indicates the security impact level
Rule ID: Shows the specific vulnerability rule

Troubleshooting

Common Issues

Server won't start:
- Make sure Python 3 is installed
- Check that the outputs and project-sources directories exist
- Verify port 8000 is not in use
No traces shown:
- Check that SARIF files exist in the outputs directory
- Verify the file structure matches the expected format
- Check browser console for error messages
Source code not loading:
- Ensure project sources are in the correct directory
- Check that file paths in SARIF match source file locations
- Verify file permissions

Debug Mode

Open browser developer tools (F12) and check the console for:

API request errors
Data loading issues
JavaScript errors

The visualizer exports a global object window.irisVisualizer for debugging:

// Access loaded data
console.log(window.irisVisualizer.allTraces);
console.log(window.irisVisualizer.currentTrace);

Development

Adding New Features

New Filter Types: Add to the HTML and update populateFilters() in app.js

Adding CWEs

We are always open to supporting new CWEs. We recommend any of the CWEs in the CWE Top 25 that we don't currently support.

To add a CWE, you will need to provide the CodeQL queries and add the CWE queries to queries.py.

Typically the structure of the queries would be

cwe-*
├── cwe-*wLLM.ql
│   
└── My[CodeQLCWEQueryModuleName].qll

cwe-*wLLM.ql is the wrapper query that imports the module *.qll file. The *.qll file is the module library - this is where the logic for the sources and sinks is implemented.

Find the CWE definition on the Mitre CWE site. A strong understanding of the CWE will help you in the following steps.
We recommend using CodeQL's CWE queries as examples. You can find them in the CodeQL GitHub repository. In java/ql/src/Security/CWE, locate the CWE you're interested in adding. Within each CWE directory, locate the .ql file. Often there are multiple .ql files; a quick heuristic is to pick the .ql file whose name is the most general and the most similar to the CWE name.

For example, CWE-022 has TaintedPath.ql and ZipSlip.ql. We used TaintedPath.ql.

Once you've found the corresponding .ql file for the CWE, make a note of it. This will be the wrapper query. Within the file, there should be an import statement that refers to the module related to the CWE. Often it is prefixed with semmle.code.java.security and ends with Query. Within the CodeQL repository, find the module in codeql/java/ql/lib/semmle/code/java/security.
Within the cwe-queries directory of IRIS, create a new folder titled cwe-[CWE number]. Copy the .ql and .qll files into this folder and rename them with the prefix My. The .qll file may contain multiple modules suffixed with Config. Find the Config whose name includes the .qll file name; this is where the source and sink predicates are defined.

Within the module, replace the predicates with the following

predicate isSource(DataFlow::Node source) {
  isGPTDetectedSource(source)
}

predicate isSink(DataFlow::Node sink) {
  isGPTDetectedSink(sink)
}

predicate isBarrier(DataFlow::Node sanitizer) {
  sanitizer.getType() instanceof BoxedType or
  sanitizer.getType() instanceof PrimitiveType or
  sanitizer.getType() instanceof NumberType
}

predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
  isGPTDetectedStep(n1, n2)
}

Also add the following imports:

 import MySources
 import MySinks
 import MySummaries

Remove the former predicate definitions and anything else in the file related to the former predicates. Now in the .ql file, update the imports to refer to the renamed .qll module.

Now, within the queries.py file, add the CWE and its queries to the QUERIES dictionary. Note: if the CWE number has only two digits, pad the ID with a leading zero (0[number]). For example, CWE 22 becomes cwe-022. Use the following format, shown here for CWE-22:

"cwe-[number]wLLM": {
    "name": "cwe-[number]wLLM",
    "type": "cwe-query",
    "cwe_id": "022",
    "cwe_id_short": "22",
    "cwe_id_tag": "CWE-22",
    "desc": "Path Traversal or Zip Slip",
    "queries": [
      "cwe-queries/cwe-022/cwe-022wLLM.ql",
      "cwe-queries/cwe-022/MyTaintedPathQuery.qll",
    ],
    "prompts": {
      "cwe_id": "CWE-022",
      "desc": "Path Traversal or Zip Slip",
      "long_desc": """\
        A path traversal vulnerability allows an attacker to access files \
        on your web server to which they should not have access. They do this by tricking either \
        the web server or the web application running on it into returning files that exist outside \
        of the web root folder. Another attack pattern is that users can pass in malicious Zip file \
        which may contain directories like "../". Typical sources of this vulnerability involves \
        obtaining information from untrusted user input through web requests, getting entry directory \
        from Zip files. Sinks will relate to file system manipulation, such as creating file, listing \
        directories, and etc.""",
      "examples": [
        {
          "package": "java.util.zip",
          "class": "ZipEntry",
          "method": "getName",
          "signature": "String getName()",
          "sink_args": [],
          "type": "source",
        },
        {
          "package": "java.io",
          "class": "FileInputStream",
          "method": "FileInputStream",
          "signature": "FileInputStream(File file)",
          "sink_args": ["file"],
          "type": "sink",
        },
        {
          "package": "java.net",
          "class": "URL",
          "method": "URL",
          "signature": "URL(String url)",
          "sink_args": [],
          "type": "taint-propagator",
        },
        {
          "package": "java.io",
          "class": "File",
          "method": "File",
          "signature": "File(String path)",
          "sink_args": [],
          "type": "taint-propagator",
        },
      ]
    }
  },

For the long_desc key - look up definitions of the CWE and find a clear description that summarizes what the CWE is and how it's exploited.

For the examples, you will need to provide sources and sinks. A CodeQL source is a value that an attacker can use for malicious operations within a system. A CodeQL sink is a program point that accepts data from a malicious source and ends up using it. You can use the GitHub Advisory Database to find examples of the CWE. Alternatively, the CWE definition may describe common abstractions; you can then search Java's most used libraries for APIs that implement those abstractions.
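Before running IRIS with a new entry, a quick structural check can catch missing keys or misspelled example types. The helper below is a hypothetical sketch and not part of queries.py. Its lists of required keys and allowed example types are inferred from the CWE-22 entry shown above, so treat them as assumptions.

```python
REQUIRED_KEYS = {"name", "type", "cwe_id", "cwe_id_short", "cwe_id_tag",
                 "desc", "queries", "prompts"}
EXAMPLE_KEYS = {"package", "class", "method", "signature", "sink_args", "type"}
EXAMPLE_TYPES = {"source", "sink", "taint-propagator"}


def check_entry(key, entry):
    """Return a list of human-readable problems found in one QUERIES entry."""
    problems = []
    missing = sorted(REQUIRED_KEYS - set(entry))
    if missing:
        problems.append(f"{key}: missing keys {missing}")
    if entry.get("name") != key:
        problems.append(f"{key}: 'name' should match the dictionary key")
    for index, example in enumerate(entry.get("prompts", {}).get("examples", [])):
        missing = sorted(EXAMPLE_KEYS - set(example))
        if missing:
            problems.append(f"{key}: example {index} is missing keys {missing}")
        if example.get("type") not in EXAMPLE_TYPES:
            problems.append(f"{key}: example {index} has unknown type {example.get('type')!r}")
    return problems


# Shortened version of the CWE-22 entry, used only to exercise the check.
ENTRY = {
    "name": "cwe-022wLLM", "type": "cwe-query", "cwe_id": "022",
    "cwe_id_short": "22", "cwe_id_tag": "CWE-22",
    "desc": "Path Traversal or Zip Slip",
    "queries": ["cwe-queries/cwe-022/cwe-022wLLM.ql"],
    "prompts": {"examples": [{
        "package": "java.util.zip", "class": "ZipEntry", "method": "getName",
        "signature": "String getName()", "sink_args": [], "type": "source",
    }]},
}

print(check_entry("cwe-022wLLM", ENTRY))
```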

Add a hint related to the CWE for the contextual analysis prompt in prompts.py. Hints are stored in POSTHOC_FILTER_HINTS. The key should be the CWE number, and the value should include sentences that describe extra details to look out for when detecting the CWE. Sites that define the CWE often have more specific guidance on it.
Test out the query. You can provide the --test-run parameter when running iris.py to see if the CodeQL queries compile. Afterwards, you can try a test run with a small model on one of the Java projects associated with the CWE. The GitHub Advisory Database is an easy way to find a vulnerable project given the CWE.

Contributing and Feedback

Feel free to address any open issues or add your own issue and fix. We love feedback! Please adhere to the following guidelines.

Create a GitHub issue outlining the piece of work. Solicit feedback from anyone who has recently contributed to the component of the repository you plan to contribute to.
Check out a branch from main, preferably named [github username]/[brief description of contribution].
Create a pull request that refers to the created GitHub issue in the commit message.
To link to the GitHub issue, add the issue number to your commit message, for example: [what the PR does briefly] #[issue number]
Then, when you push your commit and create your pull request, GitHub will automatically link the commit back to the issue. Add more details in the pull request, and request reviews from anyone who has recently modified related code.
After one approval, merge your pull request.

Changelog

IRIS v2 (unreleased)

Features:

Adds support for several new CWEs:
- CWE-089 (SQL Injection)
- CWE-295 (Improper Certificate Validation)
- CWE-352 (Cross-Site Request Forgery)
- CWE-502 (Deserialization of Untrusted Data)
- CWE-611 (Improper Restriction of XML External Entity Reference)
- CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
- CWE-918 (Server-Side Request Forgery)
Reworks scripts to depend on OpenJDK instead of Oracle
Adds support for Gemini

IRIS v1

Features:

Introduces IRIS
Adds support for 4 CWEs:
- CWE-022 (Path Traversal)
- CWE-078 (OS Command Injection)
- CWE-079 (Cross-Site Scripting)
- CWE-094 (Code Injection)

Team

IRIS is a collaborative effort between researchers at the University of Pennsylvania and Cornell University. Please reach out to us if you have questions about IRIS.

Students

Claire Wang, University of Pennsylvania

Amartya Das, Ward Melville High School

Derin Gezgin, Connecticut College

Zhengdong (Forest) Huang, Southern University of Science and Technology

Nevena Stojkovic, Massachusetts Institute of Technology

Faculty

Ziyang Li, Johns Hopkins University, previously PhD student at the University of Pennsylvania

Saikat Dutta, Cornell University

Mayur Naik, University of Pennsylvania

Citation

Consider citing our ICLR'25 paper:

@inproceedings{li2025iris,
title={LLM-Assisted Static Analysis for Detecting Security Vulnerabilities},
author={Ziyang Li and Saikat Dutta and Mayur Naik},
booktitle={International Conference on Learning Representations},
year={2025},
url={https://arxiv.org/abs/2405.17238}
}
