A comprehensive, step-by-step guide to running penetration tests the way that actually works — faster, deeper, and more powerful — after hundreds of engagements and countless tools.
Introduction
Penetration testing hasn't changed because of AI. The kill chain is still the kill chain: scope → recon → enumeration → exploitation → post-exploitation → reporting. What has changed is the execution layer — how fast you can move, how consistently you can capture artifacts, and how well you can turn raw output into decisions.
Most "AI pentesting" content today is either:
- a chatbot generating one-off commands with no state, or
- a flashy demo that collapses the real workflow into a single prompt.
That's not how real engagements work.
Real engagements are won by pipeline discipline: every phase produces artifacts (hosts, ports, screenshots, creds, notes), and those artifacts drive the next phase. The bottleneck isn't knowing what to do — it's doing it reliably, documenting it, and iterating without losing context.
That's why this guide is centered around a specific model that consistently works in practice:
Cursor as the control plane + MCP as the tool bus + a modern PT toolset as the execution layer.
Cursor isn't "another chat." It's an IDE that can see your project, your files, your outputs, and your knowledge base. Add MCP, and the same interface can orchestrate real tooling — recon, scanning, Burp workflows, exploit chains — while keeping everything inside a repeatable project structure. The result is not "automatic hacking." It's manual methodology with amplified throughput: faster recon loops, better correlation of findings, cleaner artifacts, and dramatically improved reporting velocity.
This post is a step-by-step playbook for building that workflow:
- how to set up a PT machine and keep it stable,
- how to structure engagements so the AI has correct context,
- how to run the kill chain as a pipeline (not random commands),
- and how to use AI where it's strongest: orchestration, analysis, iteration, and narrative building — while you stay in control of scope, judgment, and risk.
If you want a workflow that feels like a high-quality manual PT — just faster, deeper, and harder to derail — this is the one.
Table of Contents
- Why Cursor? The Short Answer After Many PTs
- How AI-Driven PT Works: The Big Picture
- Building and Configuring Your PT Toolset
- Starting a New PT Project: Structure and Knowledge Base
- The Kill Chain as a Pipeline: Step by Step
- Examples and Deep Dives
- References and Further Reading
1. Why Cursor? The Short Answer After Many PTs
After running many penetration tests with many different tools — Nmap, Burp Suite, Metasploit, Hydra, HexStrike-AI, HackerAI, standalone LLMs, and MCP in isolation — the approach that consistently delivers the best results is working with Cursor as the central interface.
Why Cursor wins:
- Single surface for everything. You stay in one environment: natural language, code generation, script execution, and tool orchestration. No constant switching between terminal, browser, and chat.
- Full workflow, not just "chat with a tool." Cursor is an IDE with an AI that sees your project, your files, and your context. When you add MCP (Model Context Protocol), the same AI can drive HexStrike-AI, Burp, Shodan, and other tools. That's not a chatbot with plugins — it's an ops copilot that can run a full PT from recon to report. See: HexStrike + Gemini vs. HackerAI: "Ops Copilot" vs. "Chatbot with Tools".
- Project and knowledge base in one place. You create a project folder, add scope docs, previous reports, and tool outputs. The AI uses all of that as context. That's how you get "like a regular manual PT, just faster and deeper."
- Pipeline thinking. Each phase of the kill chain produces outputs (scan results, credentials, screenshots). You feed those into the next phase. Cursor + MCP lets you do that in a structured way: same rigor as manual PT, with automation and AI-assisted decisions.
So: the best way to work, based on real experience, is Cursor + MCP + your PT toolset (e.g. HexStrike-AI, Burp MCP) — with a clean toolset, a dedicated project layout, and a project-specific knowledge base. The rest of this guide details how to build that.
2. How AI-Driven PT Works: The Big Picture
AI-driven PT is still a manual methodology in spirit: you follow a kill chain, you analyze each step, you decide when to go deeper or go back. The difference is how you execute and analyze.
- You work in Cursor (or another MCP-capable client) connected to your tools via MCP (e.g. HexStrike-AI, Burp Suite MCP).
- You give high-level instructions in natural language ("enumerate this subnet," "find web vulns on this host," "crack this hash with context from the client name").
- The AI proposes and runs tool invocations (Nmap, Nuclei, Burp, Hydra, etc.), reads outputs, and suggests next steps.
- You review results at each stage. You can ask to re-run with different options, go back to recon, or drill into a specific finding.
- Outputs form a pipeline: recon → scanning → exploitation → post-exploitation → reporting. Each step's artifacts (files, logs, screenshots) are stored in the project and used as context for the next.
So: same phases as a regular manual PT, with AI and automation making each phase faster and more thorough. For a full lab example, see HexStrike + Cursor (MCP): From Single Target → Full Subnet Compromise (Lab PT Walkthrough).
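The pipeline idea above can be sketched in a few lines. This is a minimal illustration, not part of any tool: the `Engagement` class and its method names are hypothetical, and the phase names simply mirror the ones used in this guide.

```python
from dataclasses import dataclass, field

# Phase names follow the kill chain used in this guide.
PHASES = ["recon", "scan", "exploit", "postex", "report"]


@dataclass
class Engagement:
    """Hypothetical in-memory model of per-phase artifacts."""
    artifacts: dict = field(default_factory=lambda: {p: [] for p in PHASES})

    def record(self, phase: str, artifact: str) -> None:
        """Register an artifact (e.g. a file path) produced by a phase."""
        if phase not in self.artifacts:
            raise ValueError(f"unknown phase: {phase}")
        self.artifacts[phase].append(artifact)

    def context_for(self, phase: str) -> dict:
        """Return the artifacts of every phase that precedes `phase`."""
        idx = PHASES.index(phase)
        return {p: list(self.artifacts[p]) for p in PHASES[:idx]}
```

Asking for `context_for("exploit")` returns only recon and scan artifacts, which is the context you would hand to the AI before picking an exploit.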
3. Building and Configuring Your PT Toolset
Before you run a single engagement, your environment must be ready. That means: base OS (Kali, Arch, or custom), tools updated, API keys set, wordlists and rainbow tables in place, and dependencies installed. Do this once (and maintain it); then every new PT starts from a known-good state.
3.1 Choose and Harden Your Base System
- Kali Linux — Easiest: most tools preinstalled, good for getting started. See HexStrike on Kali Linux 2025.4: A Comprehensive Guide.
- Arch (BlackArch or plain Arch + tools) — Rolling updates, minimal by default; you add only what you need.
- Custom (e.g. Ubuntu/Debian + your own tool list) — Full control; you maintain updates and dependencies yourself.
Regardless of base:
- Use a dedicated VM or physical machine for PT (never on a production or corporate main workstation without policy approval).
- Snapshot the VM after a clean toolset build so you can revert if something breaks.
- Keep the system updated regularly (apt update && apt full-upgrade or equivalent).
3.2 Install and Update Core Tools
Install or verify these (versions and names may differ by distro):
- Recon / scanning: Nmap, Masscan, Nikto, Nuclei, theHarvester, Sublist3r, Amass, SpiderFoot. Refs: Nmap, theHarvester, Sublist3r, OWASP Amass, SpiderFoot.
- Web: Burp Suite (Pro or Community), OWASP ZAP, SQLMap, Dirb/Gobuster/FFuf. Refs: Burp Suite, Getting More from Burp Suite with LLMs.
- Exploitation: Metasploit Framework, searchsploit, custom PoC scripts. Ref: Metasploit.
- Credential testing: Hydra, Medusa, Ncrack, John the Ripper, Hashcat. Refs: Hydra, John the Ripper, Hashcat.
- Wireless (if in scope): Aircrack-ng suite. Ref: Wifi cracking with Aircrack-ng.
- AI/MCP stack: HexStrike-AI (or your chosen MCP server), Cursor (or another MCP client). Refs: HexStrike AI: Install, Configure, and Run MCP with Gemini, OpenAI, Cursor, Llama, Burp Suite MCP + Gemini CLI.
After install, run tool-specific update commands (e.g. nuclei -update-templates, searchsploit -u, and msfupdate on standalone Metasploit installs; on Kali, Metasploit is updated through apt instead).
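A quick way to confirm the toolset is in place is to check that each binary is on the PATH. The sketch below assumes the usual binary names for the tools listed above; names can differ by distro, so treat `CORE_TOOLS` as an editable assumption.

```python
import shutil

# Assumed binary names; adjust for your distro or custom installs.
CORE_TOOLS = [
    "nmap", "masscan", "nikto", "nuclei", "sqlmap", "gobuster", "ffuf",
    "msfconsole", "searchsploit", "hydra", "john", "hashcat",
]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the tools.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(CORE_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All core tools found on PATH.")
```

Running this after every snapshot restore or system upgrade gives a fast signal that the machine is still in its known-good state.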
3.3 API Keys and External Services
Configure once; reuse for every project:
- Shodan — Recon and exposed-asset discovery. Create a key at shodan.io and set it in the environment (e.g. SHODAN_API_KEY) or in the HexStrike/Shodan config. Refs: Shodan — you can find everything, Integrating Shodan with HexStrike-AI Using Gemini-CLI.
- VirusTotal (optional) — Hash/file/URL lookups. Use env or config.
- Censys / ZoomEye / etc. (optional) — Same idea: key in env or tool config so MCP/AI can use them when needed.
Store keys in a secure store (e.g. env file not in git, or a secrets manager) and document in your internal runbook where each tool reads them from.
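A small pre-flight check can confirm that keys are present before an engagement starts. In this sketch only `SHODAN_API_KEY` comes from the text above; the optional variable names are hypothetical placeholders for whatever your tools actually read.

```python
import os

REQUIRED_KEYS = ["SHODAN_API_KEY"]
# Hypothetical names; use whatever your tools and configs expect.
OPTIONAL_KEYS = ["VIRUSTOTAL_API_KEY", "CENSYS_API_KEY"]


def check_keys(env, required, optional):
    """Return (missing_required, missing_optional) without exposing values."""
    missing_required = [k for k in required if not env.get(k)]
    missing_optional = [k for k in optional if not env.get(k)]
    return missing_required, missing_optional


if __name__ == "__main__":
    req, opt = check_keys(os.environ, REQUIRED_KEYS, OPTIONAL_KEYS)
    for name in req:
        print(f"MISSING (required): {name}")
    for name in opt:
        print(f"missing (optional): {name}")
```

The function only reports variable names, never values, so its output is safe to paste into notes or a runbook.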
3.4 Wordlists, Username Lists, and Directory Dictionaries
Create a single, well-organized directory (e.g. /opt/wordlists or ~/wordlists) and keep it updated:
Best resource: https://weakpass.com/wordlists/
- Passwords:
  - RockYou (and variants), SecLists Passwords/, weak passwords from breaches, company/context-specific lists.
  - For AI-assisted cracking: HexStrike + Gemini. AI-Assisted SMB Credential Brute-Force, AI-Driven ZIP Password Recovery with HexStrike-AI and Gemini-CLI.
- Usernames:
  - SecLists Usernames/, generated lists from OSINT (names, emails, IDs). Feed these into Hydra/Medusa and into AI context for smarter guessing.
- Web/directory:
  - SecLists Discovery/Web-Content/, DirBuster lists, custom lists for tech stacks (e.g. admin, wp-admin, .git, API paths).
- Subdomains:
  - SecLists, Amass wordlists, custom from previous engagements (sanitized).
Point HexStrike and your scripts at this directory so the AI can "choose the right list" based on target (e.g. AI-Driven PDF/Office Password Recovery, Office Documents).
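One way to make "choose the right list" concrete is a small lookup from task type to a path under the shared directory. The mapping below is a sketch: the file layout under /opt/wordlists is an assumption and should be adjusted to match your own tree.

```python
from pathlib import Path

# Hypothetical layout under the shared wordlist directory.
WORDLISTS = {
    "passwords": "passwords/rockyou.txt",
    "usernames": "SecLists/Usernames/top-usernames-shortlist.txt",
    "web": "SecLists/Discovery/Web-Content/common.txt",
    "subdomains": "SecLists/Discovery/DNS/subdomains-top1million-5000.txt",
}


def pick_wordlist(task: str, root: Path = Path("/opt/wordlists")) -> Path:
    """Map a task type to a wordlist path under the shared root."""
    try:
        return root / WORDLISTS[task]
    except KeyError:
        raise ValueError(f"no wordlist registered for task {task!r}") from None
```

Keeping this mapping in a file inside the project gives both your scripts and the AI one place to look up the default list for a task.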
3.5 Rainbow Tables and Hash Cracking Resources
- Rainbow tables: For LM/NTLM and other algorithms where you've decided to use them (storage-heavy). Store in a known path and document which hash types they cover.
- Hashcat/John rules: Install and update rule sets (e.g. Hashcat rules, John rules) so the AI can suggest "try rule X with this wordlist" for credential and file-cracking tasks. Refs: John the Ripper, Hashcat.
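To illustrate the "try rule X with this wordlist" pattern, here is a sketch that builds a Hashcat command line as an argument list. It uses straight (dictionary) attack mode and an optional rule file; the file paths in the usage note are assumptions for your environment.

```python
def hashcat_command(hash_file, wordlist, mode, rule_file=None):
    """Build a hashcat dictionary-attack command as an argument list.

    mode is the hashcat hash-type number (for example 1000 for NTLM).
    The list can be passed to subprocess.run() once you have reviewed it.
    """
    cmd = ["hashcat", "-m", str(mode), "-a", "0", hash_file, wordlist]
    if rule_file:
        cmd += ["-r", rule_file]
    return cmd
```

For example, `hashcat_command("postex/ntlm.txt", "/opt/wordlists/passwords/rockyou.txt", 1000, "/usr/share/hashcat/rules/best64.rule")` yields an NTLM run with a rule file applied; the rule path shown is where Kali typically ships it and may differ on your system.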
3.6 Dependencies and Runtimes
- Python 3 — Many tools and HexStrike depend on it. Use a venv per project if you have custom scripts.
- Ruby (Metasploit), Go (many modern tools) — Install as required by your distro.
- GPU drivers (for Hashcat) — Install and test if you use GPU cracking.
- MCP server dependencies — Whatever HexStrike-AI or Burp MCP need (Node/Python, etc.); document in your setup guide.
After this section, your "PT machine" is a single, repeatable environment: base OS + updated tools + API keys + wordlists + cracking resources + dependencies. Then every new engagement starts from the same foundation.
4. Starting a New PT Project: Structure and Knowledge Base
For every new penetration test, do the following before running any attacks. This keeps engagements consistent and lets the AI use project-specific context.
4.1 Create a Project Directory
Create one folder per engagement, for example:
~/pt-projects/
  client-name-2025-01/
    scope.txt    # In-scope IPs/domains, OOB, rules of engagement
    notes.md     # Timeline, credentials found, open questions
    recon/
    scan/
    exploit/
    postex/
    report/
    kb/          # Knowledge base (see below)

- scope.txt (or scope.md): In-scope targets, out-of-scope, and any constraints (e.g. "no DoS," "only these subnets").
- notes.md: Running log: what you did, what you found, credentials, next steps. The AI can use this file when you ask "what have we tried on this host?"
- recon / scan / exploit / postex / report: Store phase outputs here (see pipeline below). Use subfolders per host or per phase if needed (e.g. scan/nmap/, scan/burp/).
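The layout above is easy to scaffold with a short script so every engagement starts identically. This is a minimal sketch; the `scaffold` function name and the placeholder file headers are illustrative choices, not a fixed convention.

```python
from pathlib import Path

PHASE_DIRS = ["recon", "scan", "exploit", "postex", "report", "kb"]

# Placeholder contents for the two top-level files.
STARTER_FILES = {
    "scope.txt": "# In-scope IPs/domains, out-of-scope items, rules of engagement\n",
    "notes.md": "# Notes\n\n",
}


def scaffold(root: Path) -> Path:
    """Create the engagement folder layout; never overwrite existing files."""
    root.mkdir(parents=True, exist_ok=True)
    for name in PHASE_DIRS:
        (root / name).mkdir(exist_ok=True)
    for filename, header in STARTER_FILES.items():
        target = root / filename
        if not target.exists():
            target.write_text(header, encoding="utf-8")
    return root
```

Calling `scaffold(Path.home() / "pt-projects" / "client-name-2025-01")` is safe to repeat, since existing notes and scope files are left untouched.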
4.2 Build a Project-Specific Knowledge Base (KB)
Put in kb/ (or a dedicated subfolder) everything that should guide this engagement:
- Official docs: Product/version docs for in-scope systems (e.g. CMS, API framework, VPN). PDFs or cleaned HTML are fine.
- Previous reports: Sanitized excerpts or summaries from past PTs for the same client (if allowed). Helps the AI suggest "last time we saw X, try Y."
- Scope and RoE: Copy or link scope, rules of engagement, and any communication constraints.
- Asset list (if provided): IPs, hostnames, critical apps. The AI can use this to prioritize and avoid out-of-scope targets.
You don't need to paste every doc into the chat — Cursor (and the AI) can read files in the project. So you can say: "Using the docs in kb/, suggest the best way to test this API." This is how you get "like a regular manual PT": the same kind of context a human would use, but available to the model.
4.3 Open the Project in Cursor and Connect MCP
- Open the project root (e.g. ~/pt-projects/client-name-2025-01) in Cursor.
- Ensure MCP is configured so Cursor talks to HexStrike-AI (and Burp MCP if you use it). Ref: HexStrike AI: Install, Configure, and Run MCP with Gemini, OpenAI, Cursor, Llama.
From here on, all recon, scanning, exploitation, and note-taking can be driven from Cursor with the project and KB as context.
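As a rough illustration of the MCP wiring, the sketch below writes a project-level config file. It assumes Cursor reads project MCP servers from `.cursor/mcp.json` under an `mcpServers` key, which should be verified against the current Cursor documentation. The server name, command, and script path in the usage note are placeholders; take the real launch command from your MCP server's own install guide.

```python
import json
from pathlib import Path


def write_mcp_config(project_root: Path, servers: dict) -> Path:
    """Write a project-level MCP config (assumed location: .cursor/mcp.json).

    `servers` maps a server name to its launch definition, for example
    {"command": "...", "args": [...]}.
    """
    cfg_dir = project_root / ".cursor"
    cfg_dir.mkdir(parents=True, exist_ok=True)
    cfg_path = cfg_dir / "mcp.json"
    cfg_path.write_text(json.dumps({"mcpServers": servers}, indent=2), encoding="utf-8")
    return cfg_path
```

A call such as `write_mcp_config(project, {"pt-tools": {"command": "python3", "args": ["/path/to/your/mcp_server.py"]}})` keeps the tool wiring inside the engagement folder, so it travels with the project.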
5. The Kill Chain as a Pipeline: Step by Step
Run the engagement as a pipeline: each phase produces outputs that feed the next. After each phase, analyze results; if something is unclear or you need more data, go back to an earlier step (e.g. more recon or a different scan). This is the same discipline as a manual PT — just faster and with AI assistance.
5.1 Reconnaissance (Recon)
- Goals: Discover in-scope targets, subdomains, tech stack, and public exposure.
- Actions:
  - Passive: Shodan, theHarvester, Amass, Sublist3r, SpiderFoot (with API keys where needed).
  - Active (only within RoE): light Nmap (e.g. -sV -sC on allowed IPs).
- Outputs to save in project: recon/domains.txt, recon/subdomains.txt, recon/ips.txt, recon/shodan_*.json (or similar), recon/theharvester_*.json.
- AI use: Ask Cursor/HexStrike to run Shodan/theHarvester/Amass for the scope, then "summarize findings and suggest next scans." Refs: Integrating Shodan with HexStrike-AI, HexStrike + Cursor for OSINT: From One Email to a Full Exposure Map.
- Pipeline: Recon outputs become the target list for scanning (hosts, ports, services).
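Before recon output becomes a scan target list, it is worth filtering it against scope. The sketch below assumes scope.txt holds one IP or CIDR per line with optional `#` comments; domain entries are not handled here and would need separate resolution.

```python
import ipaddress


def parse_scope(lines):
    """Parse IP/CIDR entries, ignoring blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def split_by_scope(candidates, scope_networks):
    """Split candidate IPs into (in_scope, out_of_scope) lists."""
    in_scope, out_of_scope = [], []
    for candidate in candidates:
        ip = ipaddress.ip_address(candidate.strip())
        if any(ip in net for net in scope_networks):
            in_scope.append(str(ip))
        else:
            out_of_scope.append(str(ip))
    return in_scope, out_of_scope
```

Feeding only the in-scope list into scanning, and logging the out-of-scope list in notes.md, keeps the AI from wandering into hosts that recon happened to surface.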
5.2 Scanning and Enumeration (Scan)
- Goals: Open ports, services, versions, and potential vulns (e.g. Nuclei, Nikto, default creds).
- Actions:
  - Nmap (version + scripts) on targets from recon.
  - Nuclei with appropriate templates.
  - Web: Burp/ZAP crawls and scans; Dirb/Gobuster/FFuf for paths.
  - Optional: vulnerability scanners (Nessus, etc.) if in scope.
- Outputs to save: scan/nmap/<host>.xml and .nmap, scan/nuclei/*.json, scan/burp/*.xml or project file, scan/dirb_*.txt.
- AI use: "From recon/ips.txt and scope, run Nmap and Nuclei; put results in scan/. Then prioritize by criticality." Refs: Reinventing Recon: Nmap Meets ChatGPT, Getting More from Burp Suite with LLMs, AI-Driven Web Application Pentesting with HexStrike-AI.
- Pipeline: Scan results (CVEs, services, paths) feed into exploitation (exploit selection, payload design).
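Saved Nmap XML is easy to turn into a compact service table for the next phase. The following sketch reads the XML produced by `nmap -oX` and keeps only open ports; it takes the first `address` element per host, which is normally the IP address.

```python
import xml.etree.ElementTree as ET


def open_services(xml_text: str) -> list:
    """Extract open ports and service details from Nmap XML output."""
    root = ET.fromstring(xml_text)
    rows = []
    for host in root.findall("host"):
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else "unknown"
        for port in host.findall("ports/port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue
            svc = port.find("service")
            name = svc.get("name", "") if svc is not None else ""
            parts = [svc.get("product"), svc.get("version")] if svc is not None else []
            rows.append({
                "host": addr,
                "port": int(port.get("portid")),
                "proto": port.get("protocol"),
                "service": name,
                "version": " ".join(p for p in parts if p),
            })
    return rows
```

The resulting rows can be written to a summary file under scan/ so that exploit selection starts from a short table instead of raw scanner output.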
5.3 Exploitation
- Goals: Gain initial access (or prove impact) using findings from scan + recon.
- Actions:
  - Metasploit, searchsploit, custom PoCs.
  - Web: Burp-based exploitation (SQLi, XSS, SSRF, etc.).
  - Credential stuffing / brute-force where allowed (using wordlists from your toolset).
  - Wireless: only if in scope (AI-Driven Wireless Penetration Testing. One Prompt WIFI cracking).
- Outputs to save: exploit/<host>_<service>_notes.md, screenshots, hashes/creds in notes.md (or a creds file with care).
- AI use: "Given this Nmap/Nuclei output in scan/, suggest and run Metasploit modules or PoCs; document results in exploit/." Ref: HexStrike+OpenAI Codex. AI-Driven Exploitation of Metasploitable.
- Pipeline: Obtained access and credentials feed post-exploitation (lateral movement, privilege escalation, persistence).
5.4 Post-Exploitation
- Goals: Escalate privileges, move laterally, and (if in scope) demonstrate full impact (e.g. domain compromise).
- Actions:
  - Local enumeration (WinPEAS, LinPEAS, etc.), credential dumping, pass-the-hash, Kerberos attacks (e.g. Kerberoasting), AD CS abuse (e.g. ESC8, an NTLM relay to the web enrollment endpoint), DCSync.
  - Pivot and re-scan from compromised host (using project context and scope).
- Outputs to save: postex/<host>_*.md, dumps (stored securely), screenshots.
- AI use: "We have access to host X and creds Y; suggest next steps for privilege escalation and lateral movement given scope." You can reference internal docs (e.g. ADCS, BloodHound) in kb/.
- Pipeline: Post-ex findings and evidence feed reporting.
5.5 Analysis and Iteration
- After each phase:
  - Review both the AI's suggestions and the raw tool outputs.
  - If results are insufficient or ambiguous, go back: e.g. more recon, different Nmap options, or a deeper Burp scan.
  - Update notes.md and phase folders so the next step has full context.
This loop (recon → scan → exploit → postex → analyze → possibly back) is exactly like a manual PT; the difference is that the AI can propose and run many of the commands and correlate findings for you. For an end-to-end example: AI-Driven Pentesting at Home: Using HexStrike-AI for Full Network Discovery and Exploitation.
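Keeping notes.md current is easier when every entry has the same shape. Below is a minimal sketch of a timestamped, append-only logger; the entry format and the `log_note` name are illustrative assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_note(notes_file: Path, phase: str, text: str, now=None) -> str:
    """Append one timestamped line to notes.md and return the entry.

    `now` can be supplied for reproducible output; it defaults to current UTC.
    """
    moment = now or datetime.now(timezone.utc)
    stamp = moment.strftime("%Y-%m-%d %H:%M UTC")
    entry = f"- [{stamp}] ({phase}) {text}\n"
    with notes_file.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

Because entries carry the phase name, a later question such as "what have we tried on this host?" can be answered by searching a single file.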
5.6 Reporting
- Goals: Clear, evidence-based report (findings, impact, remediation).
- Inputs: All artifacts in recon/, scan/, exploit/, postex/, and notes.md.
- AI use: "Draft the executive summary and findings sections from notes.md and the evidence in the project; use scope in scope.txt for in-scope summary." You then edit and approve. Ref: Augmenting Digital Forensics with AI: How ChatGPT Transforms Investigation Workflows (same idea: AI helps turn technical output into structured narrative).
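A structured findings list makes the drafting step more predictable. The sketch below sorts findings by severity and renders a Markdown skeleton; the dictionary keys (`title`, `severity`, `host`, `evidence`) are an assumed schema, not a standard.

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def findings_to_markdown(findings: list) -> str:
    """Render findings as a Markdown section, most severe first."""
    ordered = sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)),
    )
    lines = ["## Findings", ""]
    for number, finding in enumerate(ordered, start=1):
        severity = finding["severity"].capitalize()
        lines.append(f"### {number}. {finding['title']} ({severity})")
        lines.append(f"- Host: {finding['host']}")
        lines.append(f"- Evidence: {finding['evidence']}")
        lines.append("")
    return "\n".join(lines)
```

The output is a starting skeleton under report/; impact and remediation text still need to be written and reviewed by the tester.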
6. Examples and Deep Dives
- Full lab PT (single host → subnet): HexStrike + Cursor (MCP): From Single Target → Full Subnet Compromise (Lab PT Walkthrough).
- OSINT from one email: HexStrike + Cursor for OSINT: From One Email to a Full Exposure Map.
- Web + cloud with Cursor and MCP: AI-Assisted Web and Cloud Penetration Testing with Cursor + MCP HexStrike and Burp Suite MCP.
- Credential attacks (SMB/SSH): HexStrike + Gemini. AI-Assisted SMB Exposure Credential Brute-Force, HexStrike + Gemini. AI-Assisted SSH Credential Brute-Force.
- Password/file recovery (ZIP, PDF, Office): AI-Driven ZIP Password Recovery with HexStrike-AI and Gemini-CLI, AI-Driven PDF Password Recovery, AI-Driven Office Documents Password Recovery.
- Tool dev and env setup with Cursor: Hacker Tool Development Workflow: Android Rubber Ducky Payloads in Cursor AI, Building a USB Rubber Ducky with Arduino Leonardo with Cursor.
- Threat hunting (defender side, same mindset): Endpoint Threat Hunting: Proactive Detection on Windows, Linux, and macOS, Protocol-Level Network Threat Hunting: A Wireshark-Centric Guide, Threat Hunting with the Pyramid of Pain.
- Broader picture: HexStrike-AI: A Force Multiplier for Red Teams — and a Dangerous Shift in the Threat Landscape, The 20x Employee: A Strategic Framework for Unlocking Hyper-Productivity with Artificial Intelligence.
7. References and Further Reading
My Articles (by topic)
- Setup and MCP: HexStrike AI: Install, Configure, and Run MCP with Gemini, OpenAI, Cursor, Llama, HexStrike on Kali Linux 2025.4: A Comprehensive Guide.
- Burp + AI/MCP: Getting More from Burp Suite with LLMs, Burp Suite MCP + Gemini CLI.
- Recon and OSINT: Reinventing Recon: Nmap Meets ChatGPT, Shodan, theHarvester, Sublist3r, OWASP Amass, SpiderFoot.
- Core tools (no AI): Nmap, Burp Suite, Metasploit, Hydra, John the Ripper, Hashcat, Aircrack-ng.
External frameworks and references
- MITRE ATT&CK: attack.mitre.org — Map your actions and findings to tactics and techniques.
- PTES (Penetration Testing Execution Standard): ptes.org — High-level PT methodology.
- OWASP: owasp.org — Web and API testing (e.g. Top 10, Testing Guide).
- NIST SP 800-115 (Technical Guide to Information Security Testing and Assessment): For scope and methodology alignment in formal engagements.
Summary
- Best way to work after many PTs: Cursor + MCP + your PT toolset (e.g. HexStrike-AI, Burp MCP), with a clean, repeatable environment and a project-based workflow.
- Environment: One base system (Kali/Arch/custom), all tools updated, API keys set, wordlists and cracking resources in one place, dependencies documented.
- Per engagement: One project directory, scope and RoE, a small knowledge base (docs, previous reports), and phase folders (recon → scan → exploit → postex → report).
- Execution: Run the kill chain as a pipeline: each phase's outputs feed the next; analyze after each step and go back when needed; use the AI to propose and run tools and to draft reports from your artifacts.
Same rigor as a manual PT — faster, deeper, and more powerful.
This guide is based on many real penetration tests and the articles linked above. For the full narrative on AI in cybersecurity and the evolution from simple LLM use to Cursor + MCP, see The AI Revolution in Cybersecurity: A Comprehensive Journey Through Modern AI-Driven Security Operations (and the summary on my Medium).
Andrey Pautov