PenHeal: A Two-Stage LLM Framework for Automated Pentesting and Optimal Remediation
Problem & Motivation
Existing LLM-based penetration testing approaches either require human intervention or are limited to narrowly focused tasks, and no prior work integrates automated vulnerability remediation with penetration testing in a unified system.
Traditional penetration testing is time-consuming and requires specialized expertise. While LLMs have been applied to individual pentest tasks, no system autonomously discovers multiple vulnerabilities and also generates cost-effective remediation strategies. The area of automated vulnerability remediation for hosts, systems, and network infrastructure remains largely unexplored.
Threat Model
The attacker has network access to a target system (given its IP address) and operates from a Kali Linux machine with standard penetration testing tools. The system is tested against known exploitable vulnerabilities in a controlled environment (Metasploitable2).
Methodology
PenHeal is a two-stage LLM framework comprising a Pentest Module and a Remediation Module. The Pentest Module uses a Planner-Executor architecture with Counterfactual Prompting (telling the LLM to assume already-found vulnerabilities do not exist to encourage exploration of new attack paths) and an Instructor component (RAG-based) to guide command generation. The Remediation Module uses an Estimator to assess vulnerability severity via CVSS, an Advisor to generate remediation strategies, and an Evaluator that scores recommendations by value and cost, selecting optimal ones via a Group Knapsack Algorithm under a user-defined budget constraint.
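The Counterfactual Prompting idea can be sketched as a prompt-construction step: already-confirmed vulnerabilities are listed, then the LLM is told to assume they do not exist, pushing it toward unexplored attack paths. The wording below is a hypothetical illustration, not the paper's actual prompt.

```python
# Minimal sketch of Counterfactual Prompting (wording is illustrative,
# not taken from the PenHeal paper).

def counterfactual_prompt(target_ip: str, found_vulns: list[str]) -> str:
    """Build a planning prompt that asks the LLM to ignore known findings."""
    found = "\n".join(f"- {v}" for v in found_vulns)
    return (
        f"You are planning a penetration test against {target_ip}.\n"
        f"The following vulnerabilities have ALREADY been confirmed:\n"
        f"{found}\n"
        "Assume these vulnerabilities do NOT exist on the target. "
        "Propose the next attack step targeting a different service "
        "or vulnerability class."
    )

prompt = counterfactual_prompt(
    "10.0.2.15", ["vsftpd 2.3.4 backdoor (CVE-2011-2523)"]
)
```

In each planning round, the prompt would be rebuilt with the growing list of confirmed findings, so the Planner keeps branching instead of re-exploiting the same service.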
Architecture
Two-stage pipeline: (1) Pentest Module with Planner (attack plan generation using Penetration Testing Task Tree), Executor (command generation), Instructor (RAG-based guidance from pentesting books), Summarizer (output condensation), and Extractor (vulnerability extraction); (2) Remediation Module with Estimator (CVSS lookup/generation), Advisor (remediation strategy generation), Evaluator (value/cost scoring with counterfactual prompting), and Group Knapsack Algorithm for optimal selection.
LLM Models
Tool Integration
Memory Mechanism
RAG
Attack Phases Covered
Evaluation
PenHeal achieves an overall score of 9.87 compared to 2.87 for PentestGPT and 0.94 for GPT-4. It improves vulnerability detection coverage by 31%, remediation effectiveness by 32%, and reduces remediation costs by 46% compared to baselines. PenHeal identifies approximately 60% of vulnerabilities in Metasploitable2 autonomously.
Environment
Metrics
Baseline Comparisons
- PentestGPT
- GPT-4
Scale
1 Metasploitable2 virtual machine with 10 types of exploitable vulnerabilities
Contributions
- Design of the Pentest Module, which autonomously identifies multiple vulnerabilities in a target system without human intervention, using Counterfactual Prompting and a RAG-based Instructor
- First study to assess LLM capabilities in automating vulnerability remediation, with a Remediation Module that generates high-quality, cost-effective remediation recommendations using a Group Knapsack Algorithm
- A new benchmark measuring detection coverage rate, remediation effectiveness, and remediation cost efficiency
Limitations
- LLM hallucinations leading to incorrect commands (e.g., non-existent Metasploit modules)
- Effectiveness depends on the specific attack tools in the knowledge base, limiting applicability to systems requiring different tools
- Tested only on a single Metasploitable2 machine; efficacy on multiple machines or diverse network scenarios is unknown
- Remediation strategies are only suggested, not executed; the model does not implement fixes on the target system
- Primarily a proof of concept; needs further refinement for production use
Research Gaps
- End-to-end automation that not only suggests but also implements remediation actions on target systems
- Reducing LLM dependency on specific attack tools by integrating diverse cybersecurity tools
- Testing across multiple machines within a network and varied network scenarios
- Enhancing LLM knowledge of vulnerability remediation for hosts, systems, and network infrastructure
Novel Techniques
- Counterfactual Prompting: telling the LLM that previously found vulnerabilities do not exist to force exploration of alternative attack paths
- Group Knapsack Algorithm for optimal remediation selection under budget constraints with value/cost scoring based on CVSS
- Integration of penetration testing with automated vulnerability remediation in a unified LLM pipeline
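The Group Knapsack selection can be sketched with standard dynamic programming: each vulnerability forms a group of candidate remediations, at most one remediation per group is chosen, and total value is maximized under the budget. The (value, cost) numbers and function name below are illustrative assumptions, not figures from the paper.

```python
# Minimal group-knapsack sketch for remediation selection.
# groups: one list per vulnerability, each entry a (value, cost) pair
# produced by the Evaluator; budget: total remediation budget (integer units).

def group_knapsack(groups, budget):
    """Pick at most one remediation per group to maximize total value.

    Returns (best_value, choices) where choices[i] is the index of the
    remediation selected for group i, or None if none was selected."""
    # dp[b] = (best value achievable with budget b, choice list so far)
    dp = [(0, [])] * (budget + 1)
    for group in groups:
        # Default: skip this group entirely (choice None).
        new_dp = [(v, ch + [None]) for v, ch in dp]
        for idx, (value, cost) in enumerate(group):
            # Iterate against the previous dp so at most one item
            # per group is ever taken.
            for b in range(budget, cost - 1, -1):
                cand = dp[b - cost][0] + value
                if cand > new_dp[b][0]:
                    new_dp[b] = (cand, dp[b - cost][1] + [idx])
        dp = new_dp
    return max(dp, key=lambda t: t[0])

# Example: two vulnerabilities, each with two candidate remediations.
groups = [[(7, 3), (9, 6)],   # vuln A: cheap partial fix vs. costly full fix
          [(5, 2), (8, 5)]]   # vuln B
best_value, choices = group_knapsack(groups, budget=8)
```

With a budget of 8, the cheap fix for A plus the full fix for B (value 7 + 8) beats the costly fix for A plus the cheap fix for B (9 + 5), which is exactly the kind of trade-off the budget constraint forces.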
Open Questions
- How well does counterfactual prompting scale to systems with many more vulnerabilities?
- Can the remediation module be extended to autonomously execute fixes rather than just recommend them?
- How does the approach perform on real-world networks versus intentionally vulnerable VMs?
Builds On
- PentestGPT
- AUTOATTACKER
- Penetration-Testing-Task-Tree
Open Source
No