AI PROMPT LIBRARY

Skill Safety Review

Defensive-only prompt templates for reviewing Agent Skills for safety risks: prompt injection via referenced files, unsafe script execution, credential and secret exposure, and overbroad permissions, plus a pre-installation checklist. The templates contain no exploit content. Each prompt is a starting draft. Fill in the {{VARIABLES}}, review the output, and keep human ownership of the final result.

prompts

Review every output. AI-generated code, test cases, and bug reports require human verification before use. Never paste secrets, credentials, or personal data into any AI tool.
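
Filling in the {{VARIABLES}} by hand works for a single review. When a template is reused, a small helper can make an unfilled placeholder fail loudly instead of being sent to the model as literal text. The following is a minimal Python sketch, assuming placeholder names use only upper-case letters, digits, and underscores, as the templates on this page do. `fill_template` is a hypothetical helper, not part of any tool named here.

```python
import re

# Matches {{NAME}} where NAME is upper-case letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each {{NAME}} with values[NAME]; raise if any name is missing."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in values}
    )
    if missing:
        raise KeyError(f"unfilled placeholders: {', '.join(missing)}")
    # A function replacement inserts each value literally (no backslash escapes).
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


prompt = fill_template(
    "SKILL_SOURCE: {{SKILL_SOURCE}}",
    {"SKILL_SOURCE": "public GitHub repository"},
)
print(prompt)  # SKILL_SOURCE: public GitHub repository
```

The helper only substitutes text. It does not check whether a value contains a secret, so the rule above about never pasting credentials still applies to whatever is passed in.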

Safety-Review a Third-Party SKILL.md

Perform a defensive security review of a third-party or externally sourced SKILL.md — checking for prompt injection via referenced files, unsafe script execution, credential requests, overbroad permissions, and missing version pinning.

intermediate

SDET, QA Lead, Automation QA
Works with: Claude, ChatGPT, Gemini, Copilot, Cursor

agent-skills, skill-md, safety-review, prompt-injection, security, defensive

prompt template

You are a defensive security reviewer specialising in AI agent skill safety. Review the SKILL.md below for safety risks before installation.

IMPORTANT SCOPE: This review is defensive only — identifying risks so the reviewer can make an informed decision about whether to install, modify, or reject the skill. Do not generate exploit code, bypass instructions, or offensive content.

SKILL_MD_CONTENT:
{{SKILL_MD_CONTENT}}

SKILL_SOURCE: {{SKILL_SOURCE}}
(e.g. "public GitHub repository", "internal team member", "third-party skill marketplace")

Review the skill against each of the following risk categories. For each, provide: Risk present (Yes / No / Unclear) and a one-sentence justification. Where risk is Yes or Unclear, provide a specific mitigation.

**1. Prompt injection via referenced files**
Does the skill instruct the agent to read from references/, external URLs, or user-provided files?
If yes: could an attacker control the content of those files to inject instructions into the agent?
Risk: Could a malicious references/ file override the skill's instructions or cause the agent to exfiltrate data?

**2. Unsafe script execution**
Does the skill have a scripts/ directory or instruct the agent to run shell commands?
If yes:
- Do the scripts require user confirmation before writing to disk or making network calls?
- Do the scripts read credentials or tokens from the environment?
- Do the scripts make outbound network requests? To where?
- Are inputs validated before being passed to scripts?
Risk: Could a crafted input cause a script to execute unintended commands?

**3. Credential and secret exposure**
Does the skill instruct the agent to read, log, output, or transmit credentials, API keys, tokens, environment variables, or PII?
Does any output format include fields that would capture sensitive data?
Risk: Could the skill inadvertently expose secrets through agent outputs or logs?

**4. Overbroad permissions**
Does the skill instruct the agent to read files outside the project directory?
Does it instruct the agent to write to arbitrary paths?
Does it instruct the agent to install packages or modify system configuration?
Risk: Could the skill cause changes outside the intended scope?

**5. Version pinning**
Is the skill pinned to a specific version or commit hash?
If it references external resources (URLs, Git repos), are those pinned?
Risk: Could an update to the skill or its dependencies change its behaviour without the reviewer's knowledge?

**6. Deceptive or misleading instructions**
Does the skill's description match what the instruction body actually does?
Are there instructions that claim to do one thing but instruct the agent to do another?
Risk: Is the skill attempting to activate under false pretences?

**7. Data exfiltration patterns**
Does the skill instruct the agent to send data to external endpoints?
Does it include instructions to write output to locations outside the project?
Risk: Could the skill cause sensitive project data to leave the local environment?

**Overall recommendation**
After reviewing all 7 categories, choose exactly one recommendation:
- INSTALL AS-IS (no material risks found)
- INSTALL WITH MODIFICATIONS (list specific changes required before installing)
- DO NOT INSTALL (list the disqualifying risks)

Provide a summary of the top 3 risks (if any) and the specific mitigations required.

IMPORTANT: This review is a starting point. Complex skills with scripts/ content require a full code review of the scripts themselves. Do not install a skill solely on the basis of this automated review.

Glossary: Prompt injection, System Prompt, Large Language Model (LLM), Hallucination
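
The version-pinning category in the template above can be pre-screened before the prompt is run. The sketch below lists URLs in a SKILL.md that do not contain a full commit hash. It rests on an assumption that is not from the original: a reference counts as pinned only if its URL contains a 40-character hexadecimal hash. Branch names and tags are treated as unpinned because both can be moved to a different commit. `unpinned_urls` and the example URLs are hypothetical.

```python
import re

# Rough URL matcher: stops at whitespace, angle brackets, and parentheses.
URL = re.compile(r"https?://[^\s<>()]+")
# Assumption: "pinned" means the URL contains a full 40-character hex hash.
FULL_HASH = re.compile(r"\b[0-9a-f]{40}\b")


def unpinned_urls(skill_md: str) -> list[str]:
    """Return URLs in the text that do not contain a full commit hash."""
    urls = [url.rstrip(".,;") for url in URL.findall(skill_md)]
    return [url for url in urls if not FULL_HASH.search(url)]


sample = (
    "Fetch https://github.com/example/repo/blob/main/tool.py and\n"
    "https://raw.githubusercontent.com/example/repo/"
    "0123456789abcdef0123456789abcdef01234567/tool.py."
)
for url in unpinned_urls(sample):
    print("not pinned:", url)
```

This is a heuristic screen only. A URL that passes still needs the content behind it reviewed, and a URL that fails may be acceptable if it points to documentation rather than to code or instructions the agent will load.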


Pre-Install Safety Checklist for a Shared Skill

Generate a customised pre-installation safety checklist for evaluating a shared or downloaded Agent Skill before adding it to your repository's skills directory.

beginner

SDET, QA Lead, Automation QA
Works with: Claude, ChatGPT, Gemini, Copilot, Cursor

agent-skills, skill-md, safety-checklist, defensive, qa-security

prompt template

You are a defensive security specialist for AI agent deployments. Generate a tailored pre-installation safety checklist for the shared skill described below.

SKILL_NAME: {{SKILL_NAME}}
SKILL_SOURCE: {{SKILL_SOURCE}}
DEPLOYMENT_CONTEXT: {{DEPLOYMENT_CONTEXT}}
(e.g. "individual dev machine", "shared team repo", "CI/CD environment with elevated permissions")
HAS_SCRIPTS: {{HAS_SCRIPTS}}
(Yes / No / Unknown)
TEAM_RISK_TOLERANCE: {{TEAM_RISK_TOLERANCE}}
(Low — financial/healthcare/regulated context; Medium — standard commercial; High — internal tooling only)

Generate a pre-installation checklist with the following sections. Each item should be a clear yes/no question the reviewer can answer by reading the skill.

**Section 1: Source verification**
- [ ] Is the source of this skill known and trusted?
- [ ] Is it pinned to a specific version or commit hash?
- [ ] If it references external resources (URLs, Git repos), are those pinned?
- [ ] Has the skill been reviewed by anyone else before this installation?
- (Add {{TEAM_RISK_TOLERANCE}}-specific items here)

**Section 2: SKILL.md content review**
- [ ] Does the description match what the instruction body actually does?
- [ ] Are all "When NOT to use" exclusions appropriate for this deployment context?
- [ ] Does the skill write its output only to locations within the project directory?
- [ ] Are there any instructions to read files from outside the project?
- [ ] Are there any instructions to send data to external endpoints?

**Section 3: Scripts review (complete if HAS_SCRIPTS is Yes; if Unknown, first check whether the skill has a scripts/ directory)**
- [ ] Have all scripts in scripts/ been read line by line?
- [ ] Do scripts require explicit user confirmation before writing to disk?
- [ ] Do scripts make outbound network calls? (If yes: to which endpoints, and is this acceptable?)
- [ ] Do scripts read from environment variables? (List which ones and verify they are non-sensitive)
- [ ] Do scripts validate inputs before using them in shell commands?
- [ ] Could any script be triggered with crafted input to run unintended commands?

**Section 4: Data handling**
- [ ] Does the skill instruct the agent to log, output, or transmit credentials, tokens, or PII?
- [ ] Does the output format include fields that might capture sensitive values from the environment?
- [ ] If the skill generates code, does it prohibit hard-coded secrets in generated output?

**Section 5: Scope and permissions**
- [ ] Does the skill stay within the expected scope for a QA workflow?
- [ ] Does it request access to system directories, package managers, or configuration files?
- [ ] Is every file write path explicitly constrained to the project directory?

**Section 6: Go / No-go decision**
After completing the checklist:
- GO: All items are confirmed safe — install and commit
- NO-GO: One or more items are not satisfied — list which items failed and the required remediation

Provide the checklist as a Markdown document the reviewer can paste into their PR description or review notes.

Glossary: Prompt injection, Large Language Model (LLM), System Prompt
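
The checklist items above do not all share the same safe answer. For "Is it pinned to a specific version or commit hash?" the safe answer is yes, while for "Are there any instructions to send data to external endpoints?" it is no. One possible way to apply the Section 6 rule consistently is to record the safe answer next to each question and tally from that. The sketch below assumes that an unanswered item counts as not satisfied, which follows from "all items are confirmed safe" but is not stated in the template. The `Item` and `decide` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    question: str
    safe_answer: bool        # the answer that counts as safe for this question
    answer: Optional[bool]   # None means the reviewer has not answered yet


def decide(items: list[Item]) -> tuple[str, list[str]]:
    """Return ("GO", []) or ("NO-GO", [questions that are not confirmed safe])."""
    failed = [item.question for item in items if item.answer != item.safe_answer]
    return ("GO" if not failed else "NO-GO", failed)


review = [
    Item("Is it pinned to a specific version or commit hash?", True, True),
    Item("Are there any instructions to send data to external endpoints?", False, True),
    Item("Have all scripts in scripts/ been read line by line?", True, None),
]
verdict, failed = decide(review)
print(verdict)  # NO-GO
for question in failed:
    print("needs remediation:", question)
```

Keeping the safe answer explicit also makes the generated checklist easier to audit, since a reviewer cannot pass an item by ticking every box.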


Review a Skill for Secret and Credential Exposure

Focused defensive review of a SKILL.md for patterns that could cause the agent to expose secrets, credentials, tokens, or personally identifiable information in its outputs, logs, or generated code.

intermediate

SDET, QA Lead, QA Manager
Works with: Claude, ChatGPT, Gemini, Copilot, Cursor

agent-skills, skill-md, credentials, secrets, pii, defensive, safety-review

prompt template

You are a defensive security reviewer specialising in credential hygiene for AI agent deployments. Review the SKILL.md below for patterns that could expose secrets, credentials, tokens, or PII.

IMPORTANT SCOPE: This is a defensive credential hygiene review only. Do not generate instructions for extracting secrets, bypassing controls, or attacking systems. Flag risks so the reviewer can remediate them.

SKILL_MD_CONTENT:
{{SKILL_MD_CONTENT}}

DEPLOYMENT_CONTEXT: {{DEPLOYMENT_CONTEXT}}
(Where will this skill run — individual dev machine, shared team CI, production-adjacent environment?)

Review the skill for each of the following exposure patterns:

**1. Input handling**
- Does the skill accept inputs that might contain credentials (e.g. "paste your API response", "provide your environment variables")?
- If yes, does it instruct the agent to redact those inputs or avoid repeating them in output?

**2. Generated code**
- Does the skill generate code that reads from the environment (process.env, os.environ, System.getenv)?
- Does the generated code log or output those values?
- Does the generated code embed values that could be secrets in string literals?
- Does the skill instruct the agent to "use your API key" or "use your token" in a way that could cause secrets to appear in generated files?

**3. Output format**
- Could the output format include a field that would capture sensitive values (e.g. "include the full environment config in the output")?
- Does the skill produce files that might be committed to version control with sensitive values?

**4. Scripts**
If scripts/ is present:
- Do the scripts read from environment variables?
- Do they echo or log those values?
- Do they pass environment values as command-line arguments (which may appear in process listings)?

**5. References and templates**
- Do references/ or templates/ files contain placeholder patterns like {{API_KEY}} or similar that a user might fill in with real values?
- Are users instructed to put real credentials into skill files?

**6. Safety section**
- Does the SKILL.md have a Safety section that explicitly prohibits credentials and secrets?
- Does it state that generated output is a draft and should be reviewed before committing?

For each pattern found, provide:
- The specific line or section where the risk appears
- Whether it is a definite risk, a potential risk, or a design smell
- A specific remediation (e.g. "replace 'use your API key' with 'read from process.env.API_KEY — never hard-code'")

End with: Credential hygiene rating: Clean / Needs attention / Do not distribute

Glossary: Prompt injection, System Prompt, Large Language Model (LLM)
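
The "Generated code" questions in the template above contrast two patterns: code that reads a secret from the environment and uses it, and code that logs, prints, or hard-codes the value. The sketch below illustrates what the safer pattern might look like in Python, using `os.environ` as named in the template. The variable name `DEMO_API_KEY`, its placeholder value, and both helper functions are hypothetical and exist only for illustration.

```python
import os


def load_secret(name: str) -> str:
    """Read a secret from the environment; fail without echoing any value."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set")
    return value


def describe_secret(name: str) -> str:
    """Log-safe status line: names the variable, never includes its value."""
    return f"{name}=<redacted>" if os.environ.get(name) else f"{name}=<unset>"


# Demo only: a real deployment sets the variable outside the code.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
api_key = load_secret("DEMO_API_KEY")   # use the value, do not print it
print(describe_secret("DEMO_API_KEY"))  # DEMO_API_KEY=<redacted>
```

When a review finds generated code that prints or logs the return value of something like `load_secret`, that is the kind of finding the remediation step asks to be reported with its specific line or section.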
