IRIS
IRIS is a neurosymbolic framework that combines LLMs with static analysis for security vulnerability detection. IRIS uses LLMs to generate source and sink specifications and to filter false positive vulnerable paths.
Overview
At a high level, IRIS takes a project and a CWE (vulnerability class, such as path traversal vulnerability or CWE-22) as input, statically analyzes the project, and outputs a set of potential vulnerabilities (of type CWE) in the project.
Key Features
- Combines LLMs with static analysis for enhanced vulnerability detection
- Supports multiple vulnerability classes (CWEs)
- Works with various LLM models
- Provides detailed analysis and results
- Easy to set up with Docker or native installation
Getting Started
To get started with IRIS:
- Check out the Environment Setup guide
- Follow our Quickstart tutorial
- Learn about Supported CWEs and Models
- If you are interested in contributing, read our Guidelines
Resources
Workflow
IRIS accepts a codebase and a CWE (Common Weakness Enumeration) as input, and uses a neurosymbolic approach to identify security vulnerabilities of type CWE in the project. To achieve this, IRIS uses the following steps:
1. First, we create CodeQL queries to collect the external APIs in the project and all internal function parameters.
2. We use an LLM to classify the external APIs as potential sources, sinks, or taint propagators. In a separate query, we use an LLM to classify the internal function parameters as potential sources. We call these labels taint specifications.
3. Using the taint specifications from step 2, we build a project-specific and CWE-specific (e.g., for CWE-22) CodeQL query.
4. We then run the query to find vulnerabilities in the given project and post-process the results.
5. We provide the post-processed results to the LLM to filter out false positives and determine whether a CWE is detected.
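As a rough illustration of the labeling step, the sketch below groups LLM-produced labels into sources, sinks, and taint propagators. The label fields (package, class, method, type) follow the prompt examples shown later for queries.py; the function name group_taint_specs and the sample labels are hypothetical and are not part of IRIS.

```python
from collections import defaultdict


def group_taint_specs(labels):
    """Group API labels by their 'type' field (source, sink, taint-propagator)."""
    specs = defaultdict(list)
    for label in labels:
        qualified = f"{label['package']}.{label['class']}.{label['method']}"
        specs[label["type"]].append(qualified)
    return dict(specs)


# Hypothetical labels, shaped like the prompt examples in queries.py.
labels = [
    {"package": "java.util.zip", "class": "ZipEntry", "method": "getName", "type": "source"},
    {"package": "java.io", "class": "FileInputStream", "method": "FileInputStream", "type": "sink"},
    {"package": "java.io", "class": "File", "method": "File", "type": "taint-propagator"},
]
specs = group_taint_specs(labels)
print(specs["source"])
```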
Dataset
We have curated a dataset of Java projects, containing 127 real-world previously known vulnerabilities across 11 popular vulnerability classes. The dataset is also available to use with the Hugging Face datasets library.
CWE-Bench-Java on Hugging Face
Results
These results are from the ICLR 2025 version of IRIS, and do not include newly added CWEs or projects.
Results on the effectiveness of IRIS across 121 projects and 9 LLMs can be found at /results. Each model has a unique CSV file, with the following structure as an example.
CWE ID | CVE | Author | Package | Tag | Recall | Alerts | Paths | TP Alerts | TP Paths | Precision | F1 |
---|---|---|---|---|---|---|---|---|---|---|---|
CWE-022 | CVE-2016-10726 | DSpace | DSpace | 4.4 | 0 | 31 | 63 | 0 | 0 | 0 | 0 |
None refers to data that was not collected, while N/A refers to a measure that cannot be calculated, either because of missing data or a division by zero.
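When reading these files programmatically, it can help to keep the two markers apart. The sketch below is one possible way to do so, assuming the files are comma-separated with the header shown above; the helper name parse_metric and the choice of Python None versus NaN are illustrative, not something IRIS prescribes.

```python
import csv
import io
import math


def parse_metric(cell):
    """Convert a results cell to a float, None (not collected), or NaN (not computable)."""
    text = cell.strip()
    if text == "None":
        return None
    if text == "N/A":
        return math.nan
    return float(text)


# Sample row copied from the table above, written as comma-separated text (an assumption).
SAMPLE = (
    "CWE ID,CVE,Author,Package,Tag,Recall,Alerts,Paths,TP Alerts,TP Paths,Precision,F1\n"
    "CWE-022,CVE-2016-10726,DSpace,DSpace,4.4,0,31,63,0,0,0,0\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
alerts = parse_metric(rows[0]["Alerts"])
print(rows[0]["CVE"], alerts)
```

Keeping None and NaN distinct lets later aggregation skip uncollected values without confusing them with undefined ratios.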
Using Docker (Recommended)
docker build -f Dockerfile --platform linux/x86_64 -t iris:latest .
docker run --platform=linux/amd64 -it iris:latest
Note: Read the instructions for "Native Setup" ahead if you intend to configure Java build tools (JDK, Maven, Gradle) or CodeQL.
Native Setup (Mac/Linux)
Step 1: Setup Conda environment
conda env create -f environment.yml
conda activate iris
If you have a CUDA-capable GPU and want to enable hardware acceleration, install the appropriate CUDA toolkit, for example:
$ conda install pytorch-cuda=12.1 -c nvidia -c pytorch
Replace 12.1 with the CUDA version compatible with your GPU and drivers, if needed.
Step 2: Configure Java build tools
To apply IRIS to Java projects, you need to specify the paths to your Java build tools (JDK, Maven, Gradle) in the dep_configs.json file in the project root.
The versions of these tools required by each project are specified in data/build_info.csv. For instance, perwendel__spark_CVE-2018-9159_2.7.1 requires JDK 8 and Maven 3.5.0. You can install and manage these tools easily using SDKMAN!.
# Install SDKMAN!
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
# Install Java 8 and Maven 3.5.0
sdk install java 8.0.452-amzn
sdk install maven 3.5.0
Step 3: Configure CodeQL
IRIS relies on the CodeQL Action bundle, which includes CLI utilities and pre-defined queries for various CWEs and languages ("QL packs").
If you already have CodeQL installed, specify its location via the CODEQL_DIR environment variable in src/config.py. Otherwise, download an appropriate version of the CodeQL Action bundle from the CodeQL Action releases page.
- For the latest version: Visit the latest release and download the appropriate bundle for your OS:
  - codeql-bundle-osx64.tar.gz for macOS
  - codeql-bundle-linux64.tar.gz for Linux
- For a specific version (e.g., 2.15.0): Go to the CodeQL Action releases page, find the release tagged codeql-bundle-v2.15.0, and download the appropriate bundle for your platform.
After downloading, extract the archive in the project root directory:
tar -xzf codeql-bundle-<platform>.tar.gz
This should create a sub-directory codeql/ with the executable codeql inside.
Lastly, add the path of this executable to your PATH environment variable:
export PATH="$PWD/codeql:$PATH"
Note: Also adjust the environment variable CODEQL_QUERY_VERSION in src/config.py according to the instructions therein. For instance, for CodeQL v2.15.0, this should be 0.8.0.
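To confirm that the executable is reachable before running IRIS, a small check along these lines may help. It is only a sketch: the function name find_codeql is hypothetical, and it assumes the bundle was extracted into codeql/ under the project root as described above.

```python
import os
import shutil


def find_codeql(project_root="."):
    """Return the path of the codeql executable, or None if it cannot be found."""
    # First look on PATH, which is what the export command above sets up.
    on_path = shutil.which("codeql")
    if on_path:
        return on_path
    # Fall back to the extracted bundle location (assumed: <project root>/codeql/codeql).
    bundled = os.path.join(project_root, "codeql", "codeql")
    if os.path.isfile(bundled) and os.access(bundled, os.X_OK):
        return os.path.abspath(bundled)
    return None


location = find_codeql()
print(location or "codeql not found; check PATH and the extracted bundle")
```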
Quickstart
Make sure you have followed all of the environment setup instructions before proceeding!
To quickly try IRIS on the example project perwendel__spark_CVE-2018-9159_2.7.1, run the following commands:
# Build the project
python scripts/fetch_and_build.py --filter perwendel__spark_CVE-2018-9159_2.7.1
# Generate the CodeQL database
python scripts/build_codeql_dbs.py --project perwendel__spark_CVE-2018-9159_2.7.1
# Run IRIS analysis
python src/iris.py --query cwe-022wLLM --run-id test --llm qwen2.5-coder-7b perwendel__spark_CVE-2018-9159_2.7.1
This will build the project, generate the CodeQL database, and analyze it for CWE-022 vulnerabilities using the specified LLM (qwen2.5-coder-7b). The output of these three steps will be stored under data/build-info/, data/codeql-dbs/, and output/ respectively.
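If you want to script the three Quickstart steps for several projects, one option is a thin Python wrapper such as the sketch below. The script paths and flags are the ones shown above; the helper names quickstart_commands and run_quickstart are hypothetical and do not exist in the repository.

```python
import subprocess
import sys


def quickstart_commands(project, query="cwe-022wLLM", run_id="test", llm="qwen2.5-coder-7b"):
    """Return the three Quickstart commands as argument lists."""
    py = sys.executable
    return [
        [py, "scripts/fetch_and_build.py", "--filter", project],
        [py, "scripts/build_codeql_dbs.py", "--project", project],
        [py, "src/iris.py", "--query", query, "--run-id", run_id, "--llm", llm, project],
    ]


def run_quickstart(project):
    """Run each step in order from the project root, stopping at the first failure."""
    for command in quickstart_commands(project):
        subprocess.run(command, check=True)


# Print the commands instead of running them, so the sketch has no side effects.
for command in quickstart_commands("perwendel__spark_CVE-2018-9159_2.7.1"):
    print(" ".join(command))
```

Calling run_quickstart from the project root would execute the steps; check=True makes a failed build stop the sequence before the analysis step.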
Supported CWEs
The following CWEs are supported. Specify one of them as the argument to --query when using src/iris.py.
- cwe-022wLLM - CWE-022 (Path Traversal)
- cwe-078wLLM - CWE-078 (OS Command Injection)
- cwe-079wLLM - CWE-079 (Cross-Site Scripting)
- cwe-089wLLM - CWE-089 (SQL Injection)
- cwe-094wLLM - CWE-094 (Code Injection)
- cwe-295wLLM - CWE-295 (Improper Certificate Validation)
- cwe-352wLLM - CWE-352 (Cross-Site Request Forgery)
- cwe-502wLLM - CWE-502 (Deserialization of Untrusted Data)
- cwe-611wLLM - CWE-611 (Improper Restriction of XML External Entity Reference)
- cwe-807wLLM - CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
- cwe-918wLLM - CWE-918 (Server-Side Request Forgery)
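The query names above follow a regular pattern: the CWE number zero-padded to three digits, followed by wLLM. A minimal sketch of that naming rule is shown below; query_name is a hypothetical helper, not a function provided by IRIS.

```python
def query_name(cwe_number):
    """Build the --query argument for a CWE number using the cwe-NNNwLLM pattern."""
    return f"cwe-{int(cwe_number):03d}wLLM"


for number in (22, 78, 918):
    print(query_name(number))
```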
Supported Models
We support the following models with our models API wrapper (found in src/models) in the project. Listed below are the arguments you can use for --llm when using src/iris.py. You're free to use your own way of instantiating models or to add to the existing library. Some of the models require your own API key or a license agreement on HuggingFace.
List of Models
Codegen
codegen-16b-multi
codegen25-7b-instruct
codegen25-7b-multi
Codellama
Standard Models
codellama-70b-instruct
codellama-34b
codellama-34b-python
codellama-34b-instruct
codellama-13b-instruct
codellama-7b-instruct
CodeT5p
codet5p-16b-instruct
codet5p-16b
codet5p-6b
codet5p-2b
DeepSeek
deepseekcoder-33b
deepseekcoder-7b
deepseekcoder-v2-15b
Gemini
- All Gemini models are supported, including those below. Other model names can be found in the Gemini API documentation.
gemini-1.5-pro
gemini-1.5-flash
gemini-pro
gemini-pro-vision
gemini-1.0-pro-vision
Gemma
gemma-7b
gemma-7b-it
gemma-2b
gemma-2b-it
codegemma-7b-it
gemma-2-27b
gemma-2-9b
GPT
gpt-4
gpt-3.5
gpt-4-1106
gpt-4-0613
LLaMA
LLaMA-2
llama-2-7b-chat
llama-2-13b-chat
llama-2-70b-chat
llama-2-7b
llama-2-13b
llama-2-70b
LLaMA-3
llama-3-8b
llama-3.1-8b
llama-3-70b
llama-3.1-70b
llama-3-70b-tai
Mistral
mistral-7b-instruct
mixtral-8x7b-instruct
mixtral-8x7b
mixtral-8x22b
mistral-codestral-22b
Qwen
qwen2.5-coder-7b
qwen2.5-coder-1.5b
qwen2.5-14b
qwen2.5-32b
qwen2.5-72b
StarCoder
starcoder
starcoder2-15b
WizardLM
WizardCoder
wizardcoder-15b
wizardcoder-34b-python
wizardcoder-13b-python
WizardLM Base
wizardlm-70b
wizardlm-13b
wizardlm-30b
Ollama
You need to install the ollama package manually.
qwen2.5-coder:latest
qwen2.5:32b
llama3.2:latest
deepseek-r1:32b
deepseek-r1:latest
IRIS Results Visualizer
A web-based visualizer for IRIS static analysis results stored in SARIF format.
Features
- Interactive visualization of IRIS analysis results
- Code flow visualization with clickable nodes
- Source code display with line highlighting
- Filtering by project, CWE, and trace
- Configurable paths and settings
- User-driven project selection
- Project metadata panel (CVE, CWE, GitHub repo, commits)
Configuration
The visualizer uses a config.json file to configure paths and settings. The configuration file is automatically created with default values if it doesn't exist.
Configuration Options
{
  "server": {
    "port": 8000,
    "host": "localhost"
  },
  "paths": {
    "outputs_dir": "../output",
    "project_sources_dir": "../data/project-sources",
    "project_info_csv": "../data/project_info.csv"
  },
  "ui": {
    "max_source_code_height": "600px",
    "default_project": "perwendel__spark_CVE-2018-9159_2.7.1"
  }
}
Path Configuration
- outputs_dir: Path to the directory containing IRIS analysis outputs (SARIF files)
- project_sources_dir: Path to the directory containing project source code
- project_info_csv: Path to the CSV file that stores project metadata (CVE, CWE, repo, commits)
Server Configuration
- port: Port number for the HTTP server (default: 8000)
- host: Host address for the server (default: localhost)
UI Configuration
- max_source_code_height: Maximum height for the source code display area
- default_project: Project that will be pre-selected when the UI first loads
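The sketch below shows one way a partial config.json could be merged with the default values listed above, so that a file overriding only the port still yields a complete configuration. It is an illustration only: load_config and DEFAULT_CONFIG are hypothetical names, and the actual loader in server.py may behave differently.

```python
import copy
import json
import os
import tempfile

# Defaults copied from the configuration options documented above.
DEFAULT_CONFIG = {
    "server": {"port": 8000, "host": "localhost"},
    "paths": {
        "outputs_dir": "../output",
        "project_sources_dir": "../data/project-sources",
        "project_info_csv": "../data/project_info.csv",
    },
    "ui": {
        "max_source_code_height": "600px",
        "default_project": "perwendel__spark_CVE-2018-9159_2.7.1",
    },
}


def load_config(path):
    """Load config.json, filling in missing sections and keys from the defaults."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            user_config = json.load(handle)
        for section, values in user_config.items():
            if isinstance(values, dict) and isinstance(config.get(section), dict):
                config[section].update(values)
            else:
                config[section] = values
    return config


# Usage: override only the port and keep every other default.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump({"server": {"port": 9000}}, handle)
    config = load_config(config_path)
print(config["server"])
```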
Usage
- Configure paths: Edit config.json to point to your outputs and source directories
- Start the server: Run python3 server.py
- Open in browser: Navigate to http://localhost:8000
- Select a project: Choose a project from the dropdown to load its analysis results
- Filter and explore: Use the CWE and model filters to explore specific vulnerabilities
User Flow
- Project Selection: The visualizer starts with an empty state. Users must first select a project from the dropdown.
- Data Loading: Once a project is selected, the visualizer loads all available SARIF files for that project.
- Filtering: Users can then filter by CWE and model to focus on specific vulnerability types.
- Trace Exploration: Click on traces to view detailed information, code flows, and source code.
Directory Structure
The visualizer expects the following directory structure:
output/
├── project1/
│ ├── cwe-352/
│ ├── cwe-352wLLM/
│ │ ├── results.sarif
│ │ ├── api_labels.json
│ │ └── ...
│ └── cwe-352wLLM-final/
| └── ...
└── project2/
├── cwe-022/
└── cwe-022wLLM/
data/project-sources/
├── project1/
│ ├── src/
│ │ └── main.java
│ └── ...
└── project2/
└── ...
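To check that an outputs directory matches this layout before starting the server, a short script like the following can list the results.sarif files found per project. This is a sketch under the assumption that result files are named results.sarif, as in the tree above; find_sarif_files is a hypothetical helper and not part of the visualizer.

```python
import tempfile
from pathlib import Path


def find_sarif_files(outputs_dir):
    """Map each project directory name to the results.sarif files found beneath it."""
    found = {}
    for project_dir in sorted(Path(outputs_dir).iterdir()):
        if not project_dir.is_dir():
            continue
        sarif_files = sorted(project_dir.rglob("results.sarif"))
        if sarif_files:
            found[project_dir.name] = [p.relative_to(project_dir).as_posix() for p in sarif_files]
    return found


# Usage with a throwaway tree that mirrors the layout above.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "project1" / "cwe-352wLLM"
    run_dir.mkdir(parents=True)
    (run_dir / "results.sarif").write_text("{}", encoding="utf-8")
    (Path(tmp) / "project2" / "cwe-022").mkdir(parents=True)
    layout = find_sarif_files(tmp)
print(layout)
```

Projects that have no results.sarif anywhere beneath them are left out of the mapping, which makes an empty project list easy to diagnose.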
API Endpoints (server-side)
- GET /api/projects - List available projects
- GET /api/cwes - List available CWEs
- GET /api/sarif/{path} - Get SARIF file content
- GET /api/source/{project}/{file} - Get source code file
- GET /api/project_metadata/{project_slug} - Get metadata (CVE/CWE/GitHub info) for a project
- GET /api/models?project={slug}&cwe={id} - List models that produced results for a given project/CWE
- GET /api/project_cwes?project={slug} - List CWEs that have results for a given project
- GET /api/source_projects - List source code projects available on disk
- GET /api/local_file/{project}/{relative_path}?line={n} - Render a source file in HTML with optional line highlight
- GET /api/dir?project={slug}&path={subdir} - Directory listing for the in-browser file explorer
- GET /api/config - Get client configuration
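When calling these endpoints from a script, query parameters and path segments should be URL-encoded. The sketch below builds two of the URLs listed above with the standard library; the helper names are hypothetical, the base URL assumes the default host and port, and the exact value expected for the cwe parameter is a guess.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # default host and port from config.json


def models_url(project, cwe, base_url=BASE_URL):
    """URL that lists the models with results for a project/CWE pair."""
    return f"{base_url}/api/models?{urlencode({'project': project, 'cwe': cwe})}"


def local_file_url(project, relative_path, line=None, base_url=BASE_URL):
    """URL that renders a source file, optionally highlighting one line."""
    url = f"{base_url}/api/local_file/{quote(project)}/{quote(relative_path)}"
    if line is not None:
        url += f"?{urlencode({'line': line})}"
    return url


print(models_url("perwendel__spark_CVE-2018-9159_2.7.1", "022"))
print(local_file_url("project1", "src/main.java", line=12))
```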
Customization
To use the visualizer with different data sources:
- Update the paths section in config.json
- Ensure your outputs directory follows the expected structure
- Ensure your project sources directory contains the source code files referenced in the SARIF files
Troubleshooting
- 404 errors: Check that the paths in config.json are correct and the directories exist
- Empty project list: Verify that the outputs directory contains the expected folder structure
- Source code not loading: Ensure the project sources directory contains the referenced files
- No data after project selection: Check that SARIF files exist for the selected project
Development
The visualizer consists of:
- server.py - Python HTTP server with API endpoints
- index.html - Main HTML interface
- app.js - Frontend JavaScript logic
- styles.css - CSS styling
- config.json - Configuration file
File Structure
visualizer/
├── index.html # Main HTML file
├── styles.css # CSS styling
├── app.js # Main JavaScript application
├── server.py # Python HTTP server
└── README.md # This file
API Endpoints (server-side)
The server provides the following API endpoints:
- GET /api/projects - List all available projects
- GET /api/cwes - List all available CWE types
- GET /api/sarif/{path} - Get SARIF file content
- GET /api/source/{project}/{file} - Get source code file
- GET /api/project_metadata/{project_slug} - Get project metadata
- GET /api/models?project={slug}&cwe={id} - List models for a project/CWE
- GET /api/project_cwes?project={slug} - List CWEs for a project
- GET /api/source_projects - List available source code projects
- GET /api/local_file/{project}/{relative_path}?line={n} - Render a local file with optional line highlight
- GET /api/dir?project={slug}&path={subdir} - Directory listing API
Configuration
The server is configured to look for:
- Outputs directory: ../output (contains SARIF files)
- Project sources: ../data/project-sources (contains source code)
You can modify these default paths in server.py:
OUTPUTS_DIR = "../output"
PROJECT_SOURCES_DIR = "../data/project-sources"
Data Format
The visualizer expects:
- SARIF Files: Located in output/{project}/{run-id}/cwe-{cwe_id}/results.sarif
- Source Code: Located in data/project-sources/{project}/
SARIF Structure
The visualizer parses SARIF files and extracts:
- Vulnerability traces (results)
- Code flows (data flow paths)
- File locations and line numbers
- Severity levels and rule information
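For readers who want to inspect a SARIF file outside the browser, the sketch below extracts the same kinds of fields from a SARIF 2.1.0 document using the standard runs / results / codeFlows / threadFlows layout. The function name extract_traces and the sample data (including the rule id) are hypothetical; this is not the parsing code used in app.js.

```python
def extract_traces(sarif):
    """Return one dict per SARIF result with its rule, level, message, and flow steps."""
    traces = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            steps = []
            for code_flow in result.get("codeFlows", []):
                for thread_flow in code_flow.get("threadFlows", []):
                    for entry in thread_flow.get("locations", []):
                        physical = entry.get("location", {}).get("physicalLocation", {})
                        uri = physical.get("artifactLocation", {}).get("uri")
                        line = physical.get("region", {}).get("startLine")
                        steps.append((uri, line))
            traces.append({
                "rule_id": result.get("ruleId"),
                "level": result.get("level"),
                "message": result.get("message", {}).get("text"),
                "steps": steps,
            })
    return traces


def _step(uri, line):
    """Build one thread-flow location entry for the sample below."""
    return {"location": {"physicalLocation": {
        "artifactLocation": {"uri": uri}, "region": {"startLine": line}}}}


# Minimal hand-written sample; real files would be read with json.load.
sample = {
    "version": "2.1.0",
    "runs": [{"results": [{
        "ruleId": "java/example-rule",  # hypothetical rule id
        "level": "error",
        "message": {"text": "Example message"},
        "codeFlows": [{"threadFlows": [{"locations": [
            _step("src/main.java", 10), _step("src/main.java", 42)]}]}],
    }]}],
}
traces = extract_traces(sample)
print(traces[0]["steps"])
```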
Project Structure
output/
└── crate__crate_5.5.1_CVE-2023-51982_5.5.1/
├── cwe-352/
│ ├── api_labels_gemini-1.5-flash.json
│ ├── Spec.yml
│ └── MySinks.qll
├── cwe-352wLLM/
├── cwe-352wLLM-final/
└── cwe-352wLLM-posthoc-filter/
Usage Guide
1. Filtering Traces
- CWE Type: Filter by specific vulnerability types (e.g., CWE-22, CWE-78)
- Project: Filter by specific projects
- Model: Filter by AI model used for analysis
2. Exploring Traces
- Click on any trace in the left panel
- View the trace description and severity
- Examine the code flow steps
- Click on flow steps to view source code
3. Source Code Navigation
- Source code is displayed with syntax highlighting
- Vulnerable lines are highlighted in red
- Click on code flow steps to jump to specific lines
- Line numbers are shown for easy navigation
4. Understanding the Results
- Code Flow: Shows the path of data from source to sink
- Source Code: Displays the actual vulnerable code
- Severity: Indicates the security impact level
- Rule ID: Shows the specific vulnerability rule
Troubleshooting
Common Issues
- Server won't start:
  - Make sure Python 3 is installed
  - Check that the outputs and project-sources directories exist
  - Verify port 8000 is not in use
- No traces shown:
  - Check that SARIF files exist in the outputs directory
  - Verify the file structure matches the expected format
  - Check browser console for error messages
- Source code not loading:
  - Ensure project sources are in the correct directory
  - Check that file paths in SARIF match source file locations
  - Verify file permissions
Debug Mode
Open browser developer tools (F12) and check the console for:
- API request errors
- Data loading issues
- JavaScript errors
The visualizer exports a global object window.irisVisualizer for debugging:
// Access loaded data
console.log(window.irisVisualizer.allTraces);
console.log(window.irisVisualizer.currentTrace);
Development
Adding New Features
- New Filter Types: Add to the HTML and update populateFilters() in app.js
Adding CWEs
We are always open to supporting new CWEs. We recommend any of the CWEs in the CWE Top 25 that we don't currently support.
To add a CWE, you will need to provide the CodeQL queries and add the CWE queries to queries.py.
Typically the structure of the queries would be
cwe-*
├── cwe-*wLLM.ql
│
└── My[CodeQLCWEQueryModuleName].qll
cwe-*wLLM.ql is the wrapper query that imports the module *.qll file. The *.qll file is the module library: this is where the logic for the sources and sinks is implemented.
- Find the CWE definition on the Mitre CWE site. A strong understanding of the CWE will help you in the following steps.
- We recommend using CodeQL's CWE queries for examples. You can find CodeQL's CWE queries in the CodeQL GitHub repository. In java/ql/src/Security/CWE, locate the CWE you're interested in adding. Within each CWE directory, locate the .ql file. Often there are multiple .ql files. A quick heuristic is to pick the .ql file with the most general name that is most similar to the CWE name.
For example, CWE-022 has TaintedPath.ql and ZipSlip.ql. We used TaintedPath.ql.
- Once you've found the corresponding .ql file for the CWE, make note of this file. This will be the wrapper query. Within the file, there should be an import statement that refers to the module related to the CWE. Often it will be prefixed with semmle.code.java.security and end with Query. Within the CodeQL repository, find the module in codeql/java/ql/lib/semmle/code/java/security.
- Within the cwe-queries directory of iris, create a new folder titled cwe-[CWE number]. Copy the .ql and the .qll files into the folder and rename them with the prefix My. Within the .qll file, there may be multiple modules suffixed with Config. Find the Config that includes the .qll name in it: this is where the source and sink predicates are defined.
Within the module, replace the predicates with the following:
predicate isSource(DataFlow::Node source) {
  isGPTDetectedSource(source)
}

predicate isSink(DataFlow::Node sink) {
  isGPTDetectedSink(sink)
}

predicate isBarrier(DataFlow::Node sanitizer) {
  sanitizer.getType() instanceof BoxedType or
  sanitizer.getType() instanceof PrimitiveType or
  sanitizer.getType() instanceof NumberType
}

predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
  isGPTDetectedStep(n1, n2)
}
Also add the following imports:
import MySources
import MySinks
import MySummaries
Remove the former predicate definitions and anything else in the file related to the former predicates. Now in the .ql file, update the imports to refer to the renamed .qll module.
- Now within the queries.py file, add the CWE and its queries to the QUERIES dictionary. Note: if the CWE number has two digits, prefix it with 0 for the id. For example, CWE 22 would be cwe-022. Use the following format (we use CWE-22 as an example):
"cwe-[number]wLLM": {
"name": "cwe-[number]wLLM",
"type": "cwe-query",
"cwe_id": "022",
"cwe_id_short": "22",
"cwe_id_tag": "CWE-22",
"desc": "Path Traversal or Zip Slip",
"queries": [
"cwe-queries/cwe-022/cwe-022wLLM.ql",
"cwe-queries/cwe-022/MyTaintedPathQuery.qll",
],
"prompts": {
"cwe_id": "CWE-022",
"desc": "Path Traversal or Zip Slip",
"long_desc": """\
A path traversal vulnerability allows an attacker to access files \
on your web server to which they should not have access. They do this by tricking either \
the web server or the web application running on it into returning files that exist outside \
of the web root folder. Another attack pattern is that users can pass in malicious Zip file \
which may contain directories like "../". Typical sources of this vulnerability involves \
obtaining information from untrusted user input through web requests, getting entry directory \
from Zip files. Sinks will relate to file system manipulation, such as creating file, listing \
directories, and etc.""",
"examples": [
{
"package": "java.util.zip",
"class": "ZipEntry",
"method": "getName",
"signature": "String getName()",
"sink_args": [],
"type": "source",
},
{
"package": "java.io",
"class": "FileInputStream",
"method": "FileInputStream",
"signature": "FileInputStream(File file)",
"sink_args" : ["file"],
"type": "sink",
},
{
"package": "java.net",
"class": "URL",
"method": "URL",
"signature": "URL(String url)",
"sink_args": [],
"type": "taint-propagator",
},
{
"package": "java.io",
"class": "File",
"method": "File",
"signature": "File(String path)",
"sink_args": [],
"type": "taint-propagator",
},
]
}
},
For the long_desc key, look up definitions of the CWE and find a clear description that summarizes what the CWE is and how it's exploited.
For the examples, you will need to provide sources and sinks. A CodeQL source is a value that an attacker can use for malicious operations within a system. A CodeQL sink is a program point that accepts data from a malicious source and ends up using it. You can use the GitHub Advisory Database to find examples of the CWE. Alternatively, the CWE definition may describe common abstractions, which you can then look for in Java's most widely used libraries.
- Add a hint related to the CWE for the contextual analysis prompt in prompts.py. Hints are stored in POSTHOC_FILTER_HINTS. The key should be the CWE number, and the value should include sentences that describe extra details to look out for when detecting the CWE. Sites that have definitions for the CWE will often have more specific guidance on the CWE.
- Test out the query. You can provide the --test-run parameter when running iris.py to see if the CodeQL queries compile. Afterwards, you can try a test run with a small model on one of the Java projects associated with the CWE. The GitHub Advisory Database is an easy way to find a vulnerable project given the CWE.
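Before running a test, it may be worth sanity-checking the examples list of a new queries.py entry. The sketch below checks that each example has the keys used in the CWE-22 entry above and one of the three type values that appear there. The helper check_examples is hypothetical, and the rule that a sink should name at least one argument in sink_args is an assumption inferred from that example, not a documented requirement.

```python
REQUIRED_EXAMPLE_KEYS = {"package", "class", "method", "signature", "sink_args", "type"}
ALLOWED_TYPES = {"source", "sink", "taint-propagator"}


def check_examples(examples):
    """Return a list of human-readable problems found in a prompt 'examples' list."""
    problems = []
    for index, example in enumerate(examples):
        missing = REQUIRED_EXAMPLE_KEYS - example.keys()
        if missing:
            problems.append(f"example {index}: missing keys {sorted(missing)}")
            continue
        if example["type"] not in ALLOWED_TYPES:
            problems.append(f"example {index}: unknown type {example['type']!r}")
        # Assumption: a sink example should list the tainted argument names.
        if example["type"] == "sink" and not example["sink_args"]:
            problems.append(f"example {index}: sink has no sink_args")
    return problems


examples = [
    {"package": "java.util.zip", "class": "ZipEntry", "method": "getName",
     "signature": "String getName()", "sink_args": [], "type": "source"},
    {"package": "java.io", "class": "FileInputStream", "method": "FileInputStream",
     "signature": "FileInputStream(File file)", "sink_args": ["file"], "type": "sink"},
]
print(check_examples(examples))
```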
Contributing and Feedback
Feel free to address any open issues or add your own issue and fix. We love feedback! Please adhere to the following guidelines.
- Create a GitHub issue outlining the piece of work. Solicit feedback from anyone who has recently contributed to the component of the repository you plan to contribute to.
- Check out a branch from main, preferably named [github username]/[brief description of contribution].
- Create a pull request that refers to the created GitHub issue in the commit message.
- To link to the GitHub issue, add the issue reference to your commit message, for example: [what the PR does briefly] #[issue number]
- When you push your commit and create your pull request, GitHub will automatically link the commit back to the issue. Add more details in the pull request, and request reviewers from anyone who has recently modified related code.
- After 1 approval, merge your pull request.
Changelog
IRIS v2 (unreleased)
Features:
- Adds support for several new CWEs:
- CWE-089 (SQL Injection)
- CWE-295 (Improper Certificate Validation)
- CWE-352 (Cross-Site Request Forgery)
- CWE-502 (Deserialization of Untrusted Data)
- CWE-611 (Improper Restriction of XML External Entity Reference)
- CWE-807 (Reliance on Untrusted Inputs in a Security Decision)
- CWE-918 (Server-Side Request Forgery)
- Reworks scripts to depend on OpenJDK instead of Oracle
- Adds support for Gemini
IRIS v1
Features:
- Introduces IRIS
- Added support for 4 CWEs:
- CWE-022 (Path Traversal)
- CWE-078 (OS Command Injection)
- CWE-079 (Cross-Site Scripting)
- CWE-094 (Code Injection)
Team
IRIS is a collaborative effort between researchers at the University of Pennsylvania and Cornell University. Please reach out to us if you have questions about IRIS.
Students
Claire Wang, University of Pennsylvania
Amartya Das, Ward Melville High School
Derin Gezgin, Connecticut College
Zhengdong (Forest) Huang, Southern University of Science and Technology
Nevena Stojkovic, Massachusetts Institute of Technology
Faculty
Ziyang Li, Johns Hopkins University, previously PhD student at the University of Pennsylvania
Saikat Dutta, Cornell University
Mayur Naik, University of Pennsylvania
Citation
Consider citing our ICLR'25 paper:
@inproceedings{li2025iris,
title={LLM-Assisted Static Analysis for Detecting Security Vulnerabilities},
author={Ziyang Li and Saikat Dutta and Mayur Naik},
booktitle={International Conference on Learning Representations},
year={2025},
url={https://arxiv.org/abs/2405.17238}
}