IRIS

IRIS is a neurosymbolic framework that combines LLMs with static analysis for security vulnerability detection. IRIS uses LLMs to generate source and sink specifications and to filter false positive vulnerable paths.

Overview

At a high level, IRIS takes a project and a CWE (a vulnerability class, such as path traversal, CWE-22) as input, statically analyzes the project, and outputs a set of potential vulnerabilities of that CWE in the project.

[Figure: IRIS workflow]

Key Features

  • Combines LLMs with static analysis for enhanced vulnerability detection
  • Supports multiple vulnerability classes (CWEs)
  • Works with various LLM models
  • Provides detailed analysis and results
  • Easy to set up with Docker or native installation

Getting Started

To get started with IRIS:

  1. Check out the Environment Setup guide
  2. Follow our Quickstart tutorial
  3. Learn about Supported CWEs and Models
  4. If you are interested in contributing, read our Guidelines

Resources

Workflow

IRIS accepts a codebase and a CWE (Common Weakness Enumeration) as input, and uses a neurosymbolic approach to identify security vulnerabilities of that CWE in the project. To achieve this, IRIS uses the following steps:

[Figure: IRIS workflow]

  1. First we create CodeQL queries to collect external APIs in the project and all internal function parameters.
  2. We use an LLM to classify the external APIs as potential sources, sinks, or taint propagators. In another query, we use an LLM to classify the internal function parameters as potential sources. We call these taint specifications.
  3. Using the taint specifications from step 2, we build a project-specific and CWE-specific (e.g., for CWE-22) CodeQL query.
  4. Then we run the query to find vulnerabilities in the given project and post-process the results.
  5. We provide the LLM the post-processed results to filter out false positives and determine whether a CWE is detected.
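
Steps 2 and 3 can be pictured as a small data transformation. The sketch below is illustrative only and is not the IRIS implementation: `classify_api` is a hypothetical stand-in for the LLM call, `build_taint_specs` is a made-up helper name, and the candidate fields simply mirror the example format used later in queries.py.

```python
from collections import defaultdict

# Labels an LLM may assign to a candidate API (step 2 of the workflow).
LABELS = ("source", "sink", "taint-propagator")


def classify_api(candidate):
    """Hypothetical stand-in for the LLM call; a real run would prompt a model."""
    known = {
        ("java.util.zip", "ZipEntry", "getName"): "source",
        ("java.io", "FileInputStream", "FileInputStream"): "sink",
        ("java.io", "File", "File"): "taint-propagator",
    }
    key = (candidate["package"], candidate["class"], candidate["method"])
    return known.get(key)  # None means "not relevant to this CWE"


def build_taint_specs(candidates):
    """Group labelled candidates into taint specifications keyed by label."""
    specs = defaultdict(list)
    for candidate in candidates:
        label = classify_api(candidate)
        if label in LABELS:
            specs[label].append(candidate)
    return dict(specs)


candidates = [
    {"package": "java.util.zip", "class": "ZipEntry", "method": "getName"},
    {"package": "java.io", "class": "FileInputStream", "method": "FileInputStream"},
    {"package": "java.lang", "class": "Math", "method": "abs"},
]
specs = build_taint_specs(candidates)
print(sorted(specs))
```

In this sketch, candidates that receive no label are dropped, so only APIs relevant to the chosen CWE would end up in the generated query.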

Dataset

We have curated a dataset of Java projects, containing 127 real-world previously known vulnerabilities across 11 popular vulnerability classes. The dataset is also available to use with the Hugging Face datasets library.

CWE-Bench-Java

CWE-Bench-Java on Hugging Face

Results

These results are from the ICLR 2025 version of IRIS and do not include newly added CWEs or projects.

Results on the effectiveness of IRIS across 121 projects and 9 LLMs can be found at /results. Each model has its own CSV file; an example row is shown below.

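| CWE ID  | CVE            | Author | Package | Tag | Recall | Alerts | Paths | TP Alerts | TP Paths | Precision | F1 |
|---------|----------------|--------|---------|-----|--------|--------|-------|-----------|----------|-----------|----|
| CWE-022 | CVE-2016-10726 | DSpace | DSpace  | 4.4 | 0      | 31     | 63    | 0         | 0        | 0         | 0  |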

None refers to data that was not collected, while N/A refers to a measure that cannot be calculated, either because of missing data or a division by zero.
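
When loading these files programmatically, the None and N/A markers need to be handled before any arithmetic. The following is a minimal sketch, assuming the files are comma-separated with the column names shown above; `parse_metric` and `load_results` are hypothetical helper names, and the sample row is the example from this page.

```python
import csv
import io

# Sample row taken from the example above; the comma delimiter is an assumption.
SAMPLE = """CWE ID,CVE,Author,Package,Tag,Recall,Alerts,Paths,TP Alerts,TP Paths,Precision,F1
CWE-022,CVE-2016-10726,DSpace,DSpace,4.4,0,31,63,0,0,0,0
"""

NUMERIC_COLUMNS = ("Recall", "Alerts", "Paths", "TP Alerts", "TP Paths", "Precision", "F1")


def parse_metric(value):
    """Convert a CSV cell to float, mapping 'None' and 'N/A' to Python None."""
    text = value.strip()
    if text in ("None", "N/A", ""):
        return None
    return float(text)


def load_results(handle):
    """Read a results CSV and convert the metric columns to numbers."""
    rows = []
    for row in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            row[column] = parse_metric(row[column])
        rows.append(row)
    return rows


rows = load_results(io.StringIO(SAMPLE))
print(rows[0]["CVE"], rows[0]["Alerts"], rows[0]["Paths"])
```

Note that this sketch collapses None and N/A into the same value; keep them separate if the distinction between uncollected and uncomputable data matters for your analysis.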

Environment Setup

Docker Setup

docker build -f Dockerfile --platform linux/x86_64 -t iris:latest .
docker run --platform=linux/amd64 -it iris:latest

Note: Read the instructions for "Native Setup" ahead if you intend to configure Java build tools (JDK, Maven, Gradle) or CodeQL.

Native Setup (Mac/Linux)

Step 1: Setup Conda environment

conda env create -f environment.yml
conda activate iris

If you have a CUDA-capable GPU and want to enable hardware acceleration, install the appropriate CUDA toolkit, for example:

conda install pytorch-cuda=12.1 -c nvidia -c pytorch

Replace 12.1 with the CUDA version compatible with your GPU and drivers, if needed.

Step 2: Configure Java build tools

To apply IRIS to Java projects, you need to specify the paths to your Java build tools (JDK, Maven, Gradle) in the dep_configs.json file in the project root.

The versions of these tools required by each project are specified in data/build_info.csv. For instance, perwendel__spark_CVE-2018-9159_2.7.1 requires JDK 8 and Maven 3.5.0. You can install and manage these tools easily using SDKMAN!.

# Install SDKMAN!
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"

# Install Java 8 and Maven 3.5.0
sdk install java 8.0.452-amzn
sdk install maven 3.5.0

Step 3: Configure CodeQL

IRIS relies on the CodeQL Action bundle, which includes CLI utilities and pre-defined queries for various CWEs and languages ("QL packs").

If you already have CodeQL installed, specify its location via the CODEQL_DIR environment variable in src/config.py. Otherwise, download an appropriate version of the CodeQL Action bundle from the CodeQL Action releases page.

  • For the latest version: Visit the latest release and download the appropriate bundle for your OS:

    • codeql-bundle-osx64.tar.gz for macOS
    • codeql-bundle-linux64.tar.gz for Linux
  • For a specific version (e.g., 2.15.0): Go to the CodeQL Action releases page, find the release tagged codeql-bundle-v2.15.0, and download the appropriate bundle for your platform.

After downloading, extract the archive in the project root directory:

tar -xzf codeql-bundle-<platform>.tar.gz

This should create a sub-directory codeql/ with the executable codeql inside.

Lastly, add the path of this executable to your PATH environment variable:

export PATH="$PWD/codeql:$PATH"

Note: Also adjust the environment variable CODEQL_QUERY_VERSION in src/config.py according to the instructions therein. For instance, for CodeQL v2.15.0, this should be 0.8.0.
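
Before moving on, it can help to confirm that the codeql executable is actually reachable. The snippet below is a small sketch of such a check and is not part of IRIS; `find_executable` and `codeql_version` are hypothetical helper names, and it relies only on the `codeql version` subcommand of the CodeQL CLI.

```python
import shutil
import subprocess


def find_executable(name="codeql"):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def codeql_version(path):
    """Ask the CodeQL CLI for its version string."""
    result = subprocess.run(
        [path, "version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


path = find_executable()
if path is None:
    print("codeql not found on PATH; check the export step above")
else:
    print(codeql_version(path))
```

The reported version is the one to compare against when choosing CODEQL_QUERY_VERSION.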

Quickstart

Make sure you have followed all of the environment setup instructions before proceeding!

To quickly try IRIS on the example project perwendel__spark_CVE-2018-9159_2.7.1, run the following commands:

# Build the project
python scripts/fetch_and_build.py --filter perwendel__spark_CVE-2018-9159_2.7.1

# Generate the CodeQL database
python scripts/build_codeql_dbs.py --project perwendel__spark_CVE-2018-9159_2.7.1

# Run IRIS analysis
python src/iris.py --query cwe-022wLLM --run-id test --llm qwen2.5-coder-7b perwendel__spark_CVE-2018-9159_2.7.1

This will build the project, generate the CodeQL database, and analyze it for CWE-022 vulnerabilities using the specified LLM (qwen2.5-coder-7b). The output of these three steps will be stored under data/build-info/, data/codeql-dbs/, and output/ respectively.
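
If you want to run the same three steps for several projects, they can be wrapped in a short script. This is a sketch under the assumption that it is run from the repository root; `quickstart_commands` and `run_all` are hypothetical helpers, and the flags are exactly those shown above.

```python
import subprocess
import sys


def quickstart_commands(project, query="cwe-022wLLM", llm="qwen2.5-coder-7b", run_id="test"):
    """Return the three Quickstart commands as argument lists."""
    return [
        [sys.executable, "scripts/fetch_and_build.py", "--filter", project],
        [sys.executable, "scripts/build_codeql_dbs.py", "--project", project],
        [sys.executable, "src/iris.py", "--query", query,
         "--run-id", run_id, "--llm", llm, project],
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failing step


commands = quickstart_commands("perwendel__spark_CVE-2018-9159_2.7.1")
run_all(commands)  # dry run: prints the commands without executing them
```

Pass `dry_run=False` to actually execute the steps; `check=True` makes the script stop as soon as one step fails.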

Supported CWEs

The following CWEs are supported. Pass the corresponding identifier as the argument to --query when running src/iris.py.

  • cwe-022wLLM - CWE-022 (Path Traversal)
  • cwe-078wLLM - CWE-078 (OS Command Injection)
  • cwe-079wLLM - CWE-079 (Cross-Site Scripting)
  • cwe-089wLLM - CWE-089 (SQL Injection)
  • cwe-094wLLM - CWE-094 (Code Injection)
  • cwe-295wLLM - CWE-295 (Improper Certificate Validation)
  • cwe-352wLLM - CWE-352 (Cross-Site Request Forgery)
  • cwe-502wLLM - CWE-502 (Deserialization of Untrusted Data)
  • cwe-611wLLM - CWE-611 (Improper Restriction of XML External Entity Reference)
  • cwe-807wLLM - CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
  • cwe-918wLLM - CWE-918 (Server-Side Request Forgery)
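
The identifiers above follow a fixed pattern: the CWE number zero-padded to three digits, followed by wLLM. A minimal sketch of a helper that derives the --query value from a CWE number is shown below; `query_name` and `SUPPORTED_CWES` are hypothetical names, not part of IRIS.

```python
import re

# CWE numbers taken from the list above.
SUPPORTED_CWES = (22, 78, 79, 89, 94, 295, 352, 502, 611, 807, 918)


def query_name(cwe):
    """Map 22, "22", "CWE-22" or "cwe-022" to the --query value "cwe-022wLLM"."""
    match = re.fullmatch(r"(?:cwe-?)?0*(\d+)", str(cwe).strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a CWE identifier: {cwe!r}")
    number = int(match.group(1))
    if number not in SUPPORTED_CWES:
        raise ValueError(f"CWE-{number} is not in the supported list")
    return f"cwe-{number:03d}wLLM"


print(query_name("CWE-22"))
print(query_name(918))
```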

Supported Models

We support the following models through our model API wrapper (found in src/models). The names listed below are the values you can pass to --llm when running src/iris.py. You are free to instantiate models your own way or to extend the existing library. Some models require your own API key or acceptance of a license agreement on Hugging Face.

List of Models

Codegen

  • codegen-16b-multi
  • codegen25-7b-instruct
  • codegen25-7b-multi

Codellama

Standard Models

  • codellama-70b-instruct
  • codellama-34b
  • codellama-34b-python
  • codellama-34b-instruct
  • codellama-13b-instruct
  • codellama-7b-instruct

CodeT5p

  • codet5p-16b-instruct
  • codet5p-16b
  • codet5p-6b
  • codet5p-2b

DeepSeek

  • deepseekcoder-33b
  • deepseekcoder-7b
  • deepseekcoder-v2-15b

Gemini

  • All Gemini models are supported, including those listed below. Other model names can be found in the Gemini API documentation.
  • gemini-1.5-pro
  • gemini-1.5-flash
  • gemini-pro
  • gemini-pro-vision
  • gemini-1.0-pro-vision

Gemma

  • gemma-7b
  • gemma-7b-it
  • gemma-2b
  • gemma-2b-it
  • codegemma-7b-it
  • gemma-2-27b
  • gemma-2-9b

GPT

  • gpt-4
  • gpt-3.5
  • gpt-4-1106
  • gpt-4-0613

LLaMA

LLaMA-2

  • llama-2-7b-chat
  • llama-2-13b-chat
  • llama-2-70b-chat
  • llama-2-7b
  • llama-2-13b
  • llama-2-70b

LLaMA-3

  • llama-3-8b
  • llama-3.1-8b
  • llama-3-70b
  • llama-3.1-70b
  • llama-3-70b-tai

Mistral

  • mistral-7b-instruct
  • mixtral-8x7b-instruct
  • mixtral-8x7b
  • mixtral-8x22b
  • mistral-codestral-22b

Qwen

  • qwen2.5-coder-7b
  • qwen2.5-coder-1.5b
  • qwen2.5-14b
  • qwen2.5-32b
  • qwen2.5-72b

StarCoder

  • starcoder
  • starcoder2-15b

WizardLM

WizardCoder

  • wizardcoder-15b
  • wizardcoder-34b-python
  • wizardcoder-13b-python

WizardLM Base

  • wizardlm-70b
  • wizardlm-13b
  • wizardlm-30b

Ollama

You need to install the ollama package manually.

  • qwen2.5-coder:latest
  • qwen2.5:32b
  • llama3.2:latest
  • deepseek-r1:32b
  • deepseek-r1:latest

IRIS Results Visualizer

A web-based visualizer for IRIS static analysis results stored in SARIF format.

[Figure: IRIS visualizer]

Features

  • Interactive visualization of IRIS analysis results
  • Code flow visualization with clickable nodes
  • Source code display with line highlighting
  • Filtering by project, CWE, and trace
  • Configurable paths and settings
  • User-driven project selection
  • Project metadata panel (CVE, CWE, GitHub repo, commits)

Configuration

The visualizer uses a config.json file to configure paths and settings. The configuration file is automatically created with default values if it doesn't exist.

Configuration Options

{
    "server": {
        "port": 8000,
        "host": "localhost"
    },
    "paths": {
        "outputs_dir": "../output",
        "project_sources_dir": "../data/project-sources",
        "project_info_csv": "../data/project_info.csv"
    },
    "ui": {
        "max_source_code_height": "600px",
        "default_project": "perwendel__spark_CVE-2018-9159_2.7.1"
    }
}

Path Configuration

  • outputs_dir: Path to the directory containing IRIS analysis outputs (SARIF files)
  • project_sources_dir: Path to the directory containing project source code
  • project_info_csv: Path to the CSV file that stores project metadata (CVE, CWE, repo, commits)

Server Configuration

  • port: Port number for the HTTP server (default: 8000)
  • host: Host address for the server (default: localhost)

UI Configuration

  • max_source_code_height: Maximum height for the source code display area
  • default_project: Project that will be pre-selected when the UI first loads
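
One way to read such a file is to merge it over the defaults so that a partial config.json still works. The sketch below illustrates that idea and is not the visualizer's actual loader; `merge` and `load_config` are hypothetical names, and unlike the behavior described above it does not write a missing file to disk.

```python
import copy
import json
from pathlib import Path

# Default values copied from the configuration shown above.
DEFAULT_CONFIG = {
    "server": {"port": 8000, "host": "localhost"},
    "paths": {
        "outputs_dir": "../output",
        "project_sources_dir": "../data/project-sources",
        "project_info_csv": "../data/project_info.csv",
    },
    "ui": {
        "max_source_code_height": "600px",
        "default_project": "perwendel__spark_CVE-2018-9159_2.7.1",
    },
}


def merge(defaults, overrides):
    """Recursively overlay `overrides` on a copy of `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path="config.json"):
    """Load config.json if present, falling back to the defaults otherwise."""
    config_path = Path(path)
    if not config_path.exists():
        return copy.deepcopy(DEFAULT_CONFIG)
    with config_path.open(encoding="utf-8") as handle:
        return merge(DEFAULT_CONFIG, json.load(handle))


custom = merge(DEFAULT_CONFIG, {"server": {"port": 9000}})
print(custom["server"])
```

With this approach, overriding only the port leaves the host and all paths at their default values.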

Usage

  1. Configure paths: Edit config.json to point to your outputs and source directories
  2. Start the server: Run python3 server.py
  3. Open in browser: Navigate to http://localhost:8000
  4. Select a project: Choose a project from the dropdown to load its analysis results
  5. Filter and explore: Use the CWE and model filters to explore specific vulnerabilities

User Flow

  1. Project Selection: The visualizer starts with an empty state. Users must first select a project from the dropdown.
  2. Data Loading: Once a project is selected, the visualizer loads all available SARIF files for that project.
  3. Filtering: Users can then filter by CWE and model to focus on specific vulnerability types.
  4. Trace Exploration: Click on traces to view detailed information, code flows, and source code.

Directory Structure

The visualizer expects the following directory structure:

output/
├── project1/
│   ├── cwe-352/
│   ├── cwe-352wLLM/
│   │   ├── results.sarif
│   │   ├── api_labels.json
│   │   └── ...
│   └── cwe-352wLLM-final/
│   └── ...
└── project2/
    ├── cwe-022/
    └── cwe-022wLLM/

data/project-sources/
├── project1/
│   ├── src/
│   │   └── main.java
│   └── ...
└── project2/
    └── ...
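
To check that an outputs directory matches this layout, you can scan it for results.sarif files. The following is a minimal sketch, assuming the two-level project/run layout shown above; `find_sarif_files` is a hypothetical helper, and the demo builds a throwaway directory rather than touching real data.

```python
import tempfile
from pathlib import Path


def find_sarif_files(outputs_dir):
    """Return {project: {run_dir: sarif_path}} for every results.sarif found."""
    found = {}
    root = Path(outputs_dir)
    for sarif in sorted(root.glob("*/*/results.sarif")):
        project, run_dir = sarif.parts[-3], sarif.parts[-2]
        found.setdefault(project, {})[run_dir] = sarif
    return found


# Demo on a temporary directory that imitates the layout above.
with tempfile.TemporaryDirectory() as tmp:
    sarif = Path(tmp) / "project1" / "cwe-352wLLM" / "results.sarif"
    sarif.parent.mkdir(parents=True)
    sarif.write_text("{}", encoding="utf-8")
    (Path(tmp) / "project1" / "cwe-352").mkdir()  # run directory without SARIF
    layout = find_sarif_files(tmp)
    runs = sorted(layout["project1"])

print(runs)
```

Run directories that contain no results.sarif, such as cwe-352 in the demo, are simply skipped.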

API Endpoints (server-side)

  • GET /api/projects - List available projects
  • GET /api/cwes - List available CWEs
  • GET /api/sarif/{path} - Get SARIF file content
  • GET /api/source/{project}/{file} - Get source code file
  • GET /api/project_metadata/{project_slug} - Get metadata (CVE/CWE/GitHub info) for a project
  • GET /api/models?project={slug}&cwe={id} - List models that produced results for a given project/CWE
  • GET /api/project_cwes?project={slug} - List CWEs that have results for a given project
  • GET /api/source_projects - List source code projects available on disk
  • GET /api/local_file/{project}/{relative_path}?line={n} - Render a source file in HTML with optional line highlight
  • GET /api/dir?project={slug}&path={subdir} - Directory listing for the in-browser file explorer
  • GET /api/config - Get client configuration
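
These endpoints can also be called from a script. The sketch below builds request URLs with the standard library; it assumes the server runs at the default http://localhost:8000 and that responses are JSON, which is not confirmed here. `endpoint` and `get_json` are hypothetical helpers, and the value format expected for the cwe parameter is also an assumption.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default host and port from config.json


def endpoint(path, **params):
    """Build a full URL for an API path plus optional query parameters."""
    url = BASE_URL + quote(path)
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(url):
    """Fetch a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url) as response:
        return json.load(response)


url = endpoint(
    "/api/models",
    project="perwendel__spark_CVE-2018-9159_2.7.1",
    cwe="22",  # assumed value format
)
print(url)
# With the server running: models = get_json(url)
```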

Customization

To use the visualizer with different data sources:

  1. Update the paths section in config.json
  2. Ensure your outputs directory follows the expected structure
  3. Ensure your project sources directory contains the source code files referenced in the SARIF files

Troubleshooting

  • 404 errors: Check that the paths in config.json are correct and the directories exist
  • Empty project list: Verify that the outputs directory contains the expected folder structure
  • Source code not loading: Ensure the project sources directory contains the referenced files
  • No data after project selection: Check that SARIF files exist for the selected project

Development

The visualizer consists of:

  • server.py - Python HTTP server with API endpoints
  • index.html - Main HTML interface
  • app.js - Frontend JavaScript logic
  • styles.css - CSS styling
  • config.json - Configuration file

File Structure

visualizer/
├── index.html          # Main HTML file
├── styles.css          # CSS styling
├── app.js             # Main JavaScript application
├── server.py          # Python HTTP server
└── README.md          # This file

Configuration

The server is configured to look for:

  • Outputs directory: ../output (contains SARIF files)
  • Project sources: ../data/project-sources (contains source code)

You can modify these default paths in server.py:

OUTPUTS_DIR = "../output"
PROJECT_SOURCES_DIR = "../data/project-sources"

Data Format

The visualizer expects:

  1. SARIF Files: Located in output/{project}/{run-id}/cwe-{cwe_id}/results.sarif
  2. Source Code: Located in data/project-sources/{project}/

SARIF Structure

The visualizer parses SARIF files and extracts:

  • Vulnerability traces (results)
  • Code flows (data flow paths)
  • File locations and line numbers
  • Severity levels and rule information
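
The same information can be pulled out of a SARIF file with a few lines of Python. This sketch follows the SARIF 2.1.0 layout (runs, results, codeFlows, threadFlows, locations) and is not the visualizer's own parser; `iter_traces` is a hypothetical name, and the sample document, including its file path and rule id, is made up for illustration.

```python
def iter_traces(sarif):
    """Yield one dict per result with its rule id, message and code-flow steps.

    Steps from all code flows of a result are concatenated into one list.
    """
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            steps = []
            for flow in result.get("codeFlows", []):
                for thread in flow.get("threadFlows", []):
                    for entry in thread.get("locations", []):
                        physical = entry.get("location", {}).get("physicalLocation", {})
                        steps.append((
                            physical.get("artifactLocation", {}).get("uri"),
                            physical.get("region", {}).get("startLine"),
                        ))
            yield {
                "rule_id": result.get("ruleId"),
                "message": result.get("message", {}).get("text"),
                "steps": steps,
            }


def location(uri, line):
    """Build a minimal threadFlow location entry for the sample below."""
    return {"location": {"physicalLocation": {
        "artifactLocation": {"uri": uri}, "region": {"startLine": line}}}}


# Hypothetical minimal SARIF document used only to exercise the parser.
SAMPLE = {"runs": [{"results": [{
    "ruleId": "example/rule",
    "message": {"text": "example finding"},
    "codeFlows": [{"threadFlows": [{"locations": [
        location("src/Example.java", 10),
        location("src/Example.java", 42),
    ]}]}],
}]}]}

traces = list(iter_traces(SAMPLE))
print(traces[0]["rule_id"], traces[0]["steps"])
```

For a real file, load it with `json.load` and pass the resulting dictionary to `iter_traces`.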

Project Structure

output/
└── crate__crate_5.5.1_CVE-2023-51982_5.5.1/
    ├── cwe-352/
    │   ├── api_labels_gemini-1.5-flash.json
    │   ├── Spec.yml
    │   └── MySinks.qll
    ├── cwe-352wLLM/
    ├── cwe-352wLLM-final/
    └── cwe-352wLLM-posthoc-filter/

Usage Guide

1. Filtering Traces

  • CWE Type: Filter by specific vulnerability types (e.g., CWE-22, CWE-78)
  • Project: Filter by specific projects
  • Model: Filter by AI model used for analysis

2. Exploring Traces

  • Click on any trace in the left panel
  • View the trace description and severity
  • Examine the code flow steps
  • Click on flow steps to view source code

3. Source Code Navigation

  • Source code is displayed with syntax highlighting
  • Vulnerable lines are highlighted in red
  • Click on code flow steps to jump to specific lines
  • Line numbers are shown for easy navigation

4. Understanding the Results

  • Code Flow: Shows the path of data from source to sink
  • Source Code: Displays the actual vulnerable code
  • Severity: Indicates the security impact level
  • Rule ID: Shows the specific vulnerability rule

Troubleshooting

Common Issues

  1. Server won't start:

    • Make sure Python 3 is installed
    • Check that the outputs and project-sources directories exist
    • Verify port 8000 is not in use
  2. No traces shown:

    • Check that SARIF files exist in the outputs directory
    • Verify the file structure matches the expected format
    • Check browser console for error messages
  3. Source code not loading:

    • Ensure project sources are in the correct directory
    • Check that file paths in SARIF match source file locations
    • Verify file permissions

Debug Mode

Open browser developer tools (F12) and check the console for:

  • API request errors
  • Data loading issues
  • JavaScript errors

The visualizer exports a global object window.irisVisualizer for debugging:

// Access loaded data
console.log(window.irisVisualizer.allTraces);
console.log(window.irisVisualizer.currentTrace);

Development

Adding New Features

  1. New Filter Types: Add to the HTML and update populateFilters() in app.js

Adding CWEs

We are always open to supporting new CWEs. We recommend any of the CWEs in the CWE Top 25 that we don't currently support.

To add a CWE, you will need to provide the CodeQL queries and add the CWE queries to queries.py.

Typically the structure of the queries would be

cwe-*
├── cwe-*wLLM.ql
│   
└── My[CodeQLCWEQueryModuleName].qll

cwe-*wLLM.ql is the wrapper query that imports the module *.qll file. The *.qll file is the module library - this is where the logic for the sources and sinks is implemented.

  1. Find the CWE definition on the Mitre CWE site. A strong understanding of the CWE will help you in the following steps.
  2. We recommend using CodeQL's CWE queries as examples. You can find them in the CodeQL GitHub repository. In java/ql/src/Security/CWE, locate the CWE you're interested in adding. Within each CWE directory, locate the .ql file. Often there are multiple .ql files; a quick heuristic is to pick the one with the most general name that is most similar to the CWE name.

For example - CWE-022 has TaintedPath.ql and ZipSlip.ql. We used TaintedPath.ql.

  3. Once you've found the corresponding .ql file for the CWE, make a note of it; this will be the wrapper query. Within the file, there should be an import statement that refers to the module related to the CWE. Often it will be prefixed with semmle.code.java.security and end with Query. Within the CodeQL repository, find the module in codeql/java/ql/lib/semmle/code/java/security.
  4. Within the cwe-queries directory of IRIS, create a new folder named cwe-[CWE number]. Copy the .ql and .qll files into it and rename them with the prefix My. The .qll file may contain multiple modules suffixed with Config. Find the Config whose name includes the .qll file name; this is where the source and sink predicates are defined.

Within the module, replace the predicates with the following

  predicate isSource(DataFlow::Node source) {
    isGPTDetectedSource(source)
  }

  predicate isSink(DataFlow::Node sink) {
    isGPTDetectedSink(sink)
  }

  predicate isBarrier(DataFlow::Node sanitizer) {
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof NumberType
  }

  predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
    isGPTDetectedStep(n1, n2)
  }

Also add the following imports:

 import MySources
 import MySinks
 import MySummaries

Remove the former predicate definitions and anything else in the file related to the former predicates. Now in the .ql file, update the imports to refer to the renamed .qll module.

  5. Now, within the queries.py file, add the CWE and its queries to the QUERIES dictionary. If the CWE number has two digits, prefix it with 0 in the id; for example, CWE-22 becomes cwe-022. Use the following format, shown here for CWE-22:
"cwe-[number]wLLM": {
    "name": "cwe-[number]wLLM",
    "type": "cwe-query",
    "cwe_id": "022",
    "cwe_id_short": "22",
    "cwe_id_tag": "CWE-22",
    "desc": "Path Traversal or Zip Slip",
    "queries": [
      "cwe-queries/cwe-022/cwe-022wLLM.ql",
      "cwe-queries/cwe-022/MyTaintedPathQuery.qll",
    ],
    "prompts": {
      "cwe_id": "CWE-022",
      "desc": "Path Traversal or Zip Slip",
      "long_desc": """\
        A path traversal vulnerability allows an attacker to access files \
        on your web server to which they should not have access. They do this by tricking either \
        the web server or the web application running on it into returning files that exist outside \
        of the web root folder. Another attack pattern is that users can pass in malicious Zip file \
        which may contain directories like "../". Typical sources of this vulnerability involves \
        obtaining information from untrusted user input through web requests, getting entry directory \
        from Zip files. Sinks will relate to file system manipulation, such as creating file, listing \
        directories, and etc.""",
      "examples": [
        {
          "package": "java.util.zip",
          "class": "ZipEntry",
          "method": "getName",
          "signature": "String getName()",
          "sink_args": [],
          "type": "source",
        },
        {
          "package": "java.io",
          "class": "FileInputStream",
          "method": "FileInputStream",
          "signature": "FileInputStream(File file)",
          "sink_args": ["file"],
          "type": "sink",
        },
        {
          "package": "java.net",
          "class": "URL",
          "method": "URL",
          "signature": "URL(String url)",
          "sink_args": [],
          "type": "taint-propagator",
        },
        {
          "package": "java.io",
          "class": "File",
          "method": "File",
          "signature": "File(String path)",
          "sink_args": [],
          "type": "taint-propagator",
        },
      ]
    }
  },

For the long_desc key - look up definitions of the CWE and find a clear description that summarizes what the CWE is and how it's exploited.

For the examples, you will need to provide sources and sinks. A CodeQL source is a value that an attacker can use for malicious operations within a system. A CodeQL sink is a program point that accepts data from a malicious source and ends up using it. You can use the GitHub Advisory Database to find examples of the CWE. Alternatively, the CWE definition may describe common abstractions, which you can then look for in Java's most widely used libraries.
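
Before running IRIS, a new entry can be sanity-checked against the format above. The checks in this sketch are inferred from the CWE-22 example rather than taken from IRIS itself, so treat them as assumptions; `check_entry`, `REQUIRED_KEYS` and `EXAMPLE_TYPES` are hypothetical names.

```python
# Keys and example types observed in the CWE-22 entry above (assumed to be required).
REQUIRED_KEYS = {"name", "type", "cwe_id", "cwe_id_short", "cwe_id_tag",
                 "desc", "queries", "prompts"}
EXAMPLE_TYPES = {"source", "sink", "taint-propagator"}


def check_entry(key, entry):
    """Return a list of problems found in one QUERIES entry (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if entry.get("name") != key:
        problems.append("'name' should match the dictionary key")
    cwe_id = entry.get("cwe_id", "")
    if not (cwe_id.isdigit() and len(cwe_id) >= 3):
        problems.append("'cwe_id' should be zero-padded to three digits")
    for example in entry.get("prompts", {}).get("examples", []):
        if example.get("type") not in EXAMPLE_TYPES:
            problems.append(f"unknown example type: {example.get('type')!r}")
        if example.get("type") != "sink" and example.get("sink_args"):
            problems.append(f"non-sink example {example.get('method')!r} has sink_args")
    return problems


entry = {
    "name": "cwe-022wLLM", "type": "cwe-query", "cwe_id": "022",
    "cwe_id_short": "22", "cwe_id_tag": "CWE-22", "desc": "Path Traversal or Zip Slip",
    "queries": ["cwe-queries/cwe-022/cwe-022wLLM.ql"],
    "prompts": {"examples": [
        {"method": "getName", "sink_args": [], "type": "source"},
        {"method": "FileInputStream", "sink_args": ["file"], "type": "sink"},
    ]},
}
print(check_entry("cwe-022wLLM", entry))
```

An empty list means the entry passed these inferred checks; it does not guarantee that the CodeQL queries compile.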

  6. Add a hint for the CWE to the contextual analysis prompt in prompts.py. Hints are stored in POSTHOC_FILTER_HINTS. The key should be the CWE number, and the value should contain sentences describing extra details to look out for when detecting the CWE. Sites that define the CWE often provide more specific guidance.

  7. Test the query. You can pass the --test-run parameter when running iris.py to check whether the CodeQL queries compile. Afterwards, you can try a test run with a small model on one of the Java projects associated with the CWE. The GitHub Advisory Database is an easy way to find a vulnerable project for a given CWE.

Contributing and Feedback

Feel free to address any open issues or add your own issue and fix. We love feedback! Please adhere to the following guidelines.

  1. Create a GitHub issue outlining the piece of work. Solicit feedback from anyone who has recently contributed to the component of the repository you plan to contribute to.
  2. Check out a branch from main, preferably named [github username]/[brief description of contribution].
  3. Create a pull request that refers to the created GitHub issue in the commit message.
  4. To link to the GitHub issue, add the issue reference to your commit message, for example: [what the PR does briefly] #[issue number]
  5. When you push your commit and create your pull request, GitHub will automatically link the commit back to the issue. Add more details in the pull request, and request reviews from anyone who has recently modified related code.
  6. After one approval, merge your pull request.

Changelog

IRIS v2 (unreleased)

Features:

  • Adds support for several new CWEs:
    • CWE-089 (SQL Injection)
    • CWE-295 (Improper Certificate Validation)
    • CWE-352 (Cross-Site Request Forgery)
    • CWE-502 (Deserialization of Untrusted Data)
    • CWE-611 (Improper Restriction of XML External Entity Reference)
    • CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
    • CWE-918 (Server-Side Request Forgery)
  • Reworks scripts to depend on OpenJDK instead of Oracle
  • Adds support for Gemini

IRIS v1

Features:

  • Introduces IRIS
  • Adds support for 4 CWEs:
    • CWE-022 (Path Traversal)
    • CWE-078 (OS Command Injection)
    • CWE-079 (Cross-Site Scripting)
    • CWE-094 (Code Injection)

Team

IRIS is a collaborative effort between researchers at the University of Pennsylvania and Cornell University. Please reach out to us if you have questions about IRIS.

Students

Claire Wang, University of Pennsylvania

Amartya Das, Ward Melville High School

Derin Gezgin, Connecticut College

Zhengdong (Forest) Huang, Southern University of Science and Technology

Nevena Stojkovic, Massachusetts Institute of Technology

Faculty

Ziyang Li, Johns Hopkins University, previously PhD student at the University of Pennsylvania

Saikat Dutta, Cornell University

Mayur Naik, University of Pennsylvania


Citation

Consider citing our ICLR'25 paper:

@inproceedings{li2025iris,
title={LLM-Assisted Static Analysis for Detecting Security Vulnerabilities},
author={Ziyang Li and Saikat Dutta and Mayur Naik},
booktitle={International Conference on Learning Representations},
year={2025},
url={https://arxiv.org/abs/2405.17238}
}

arXiv: https://arxiv.org/abs/2405.17238