eval-generator-gen-pages

Name: power-cat-skills/eval-generator-gen-pages
Author: microsoft


100 Critical

The skill is critically insecure: it uses dynamic code execution, builds shell commands from unsanitized input, and performs arbitrary file access, and it lacks the security declarations needed to constrain its high-privilege Node.js operations.

Threats: CI, SM, CS, BC, S, HITL

Threat Analysis

AI Agent Traps, scanned 6/20/2026

Content Injection (Perception): 1 finding

Semantic Manipulation (Reasoning): 3 findings

Cognitive State (Memory & Learning): 1 finding

Behavioural Control (Action): 8 findings

Systemic (Multi-Agent Dynamics): clean

Human-in-the-Loop (Human Overseer): clean

Skill Info

Name: microsoft/power-cat-skills/eval-generator-gen-pages

Registry: github

Version: c0fdc41b7fc0

PURL: pkg:github/microsoft/power-cat-skills@c0fdc41b7fc0?skill=eval-generator-gen-pages

Stars: 21

Installs: 2


Install

This skill has been flagged as potentially malicious. Review the findings below before installing.

GitHub

1 install

Skills.sh

npx skills add https://github.com/microsoft/power-cat-skills

Assessments (7)

AI Agent Traps

Description Mismatch (critical, description mismatch; AML.T0011.000, LLM06:2025, LLM09:2025)

The skill contains multiple critical security vulnerabilities in its own implementation, specifically regarding dynamic code execution and potential command injection.

85% confidence

The skill content explicitly includes findings for 'js-eval-injection' (use of eval/Function constructor with dynamic input) and 'AISEC_BEHAVIOR_JS_CHILD_PROCESS_IMPORT' (use of child_process for OS command execution).

Potential brand impersonation (medium, social engineering; AML.T0052.000, LLM09:2025)

Skill name or description references a well-known AI brand, which may suggest impersonation.

70% confidence

claude

Capability inflation via broad trigger keyword list (medium, social engineering; AML.T0052.000, LLM09:2025)

The skill's trigger list ('eval my generative page', 'review my genpage code', 'check my model driven app page') is broad enough to intercept general code review requests unrelated to Power Apps gen pages, potentially hijacking activation for arbitrary TypeScript/React codebases and running security scans + npm installs on them.

65% confidence, line 4

Triggers: 'eval my generative page', 'generate evals for gen page', 'review my genpage code', 'test my generative page', 'check my model driven app page'

Detected use of eval() or Function() constructor with dynamic input. This can lead to code injection if the input is user-controlled. Avoid eval() entirely. Use JSON.parse() for data deserialization or a safe expression evaluator for computed values. (critical, vuln)

95% confidence, line 108

eval()

95% confidence, line 111

eval(s)
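The recommendation to use JSON.parse() instead of eval() can be sketched as follows. This is a minimal illustration, not the skill's code: `parseUntrusted` is a hypothetical helper name, and it assumes the input is meant to carry data rather than executable logic.

```typescript
// Hypothetical helper (not from the skill): treat untrusted text as data.
// JSON.parse only builds values; it never executes the input as code.
function parseUntrusted(s: string): unknown {
  try {
    return JSON.parse(s);
  } catch {
    // Input was not valid JSON. Nothing has been executed.
    return undefined;
  }
}

// Well-formed data is returned as a plain object.
const parsed = parseUntrusted('{"feature":"grid","enabled":true}');

// Code-like input is rejected instead of being run, unlike eval(s).
const rejected = parseUntrusted('process.exit(1)');
```

Returning `unknown` forces callers to validate the shape of the result before using it.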

An exported function constructs a shell command by concatenating a parameter value and passes it to a command execution function. Callers passing user-controlled data will produce a command injection vulnerability. Use parameterized APIs or validate input. (critical, vuln)

95% confidence, line 111

taint source (line 73): s: any → sink: eval(s)

95% confidence, line 12

eval()

Regular expression matches a hostname but isn't anchored with both `^` and `$`. The pattern matches anywhere in the input: `/example\.com/.test("evil.com/example.com/path")` succeeds even though the URL's host is `evil.com`. Anchor with `^…$`, or use a URL parser instead of a regex. (high, vuln)

95% confidence, line 79

/fetch\s*\(\s*['"`]https?:\/\//
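The "use a URL parser instead of a regex" advice could look like the sketch below. `isAllowedHost` and the allowlist contents are hypothetical names chosen for illustration; the sketch relies only on the standard `URL` class.

```typescript
// Hypothetical allowlist check (not from the skill): parse the URL and
// compare the host exactly instead of searching the raw string.
function isAllowedHost(raw: string, allowed: readonly string[]): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  const isHttp = url.protocol === 'https:' || url.protocol === 'http:';
  return isHttp && allowed.includes(url.hostname);
}

// The unanchored regex matches the path, even though the host is evil.com.
const unanchored = /example\.com/.test('https://evil.com/example.com/path');

// The parsed comparison looks only at the actual host.
const parsedOk = isAllowedHost('https://example.com/path', ['example.com']);
const parsedBad = isAllowedHost('https://evil.com/example.com/path', ['example.com']);
```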

JavaScript dynamic code execution (new Function / setTimeout/setInterval with a string body) (high, command execution; AML.T0050, AML.T0053, AML.T0072, LLM05:2025, LLM06:2025)

Seen 2 times in this file, at lines 739 and 977.

100% confidence, line 739

new Function(

Dynamic import of agent-written check files (high, command execution; AML.T0050, AML.T0053, AML.T0072, LLM05:2025, LLM06:2025)

presence-runner.ts dynamically imports every *.check.ts file it finds in the presence/ directory via `await import(pathToFileURL(file).href)`. Since the agent itself writes these files in Step 4, a prompt injection in the user's .tsx source could cause the agent to embed malicious code in the check files, which then executes with full Node.js privileges when the runner imports them.

75% confidence, line 816

const mod = await import(pathToFileURL(file).href); const result = mod.check();

Unrestricted shell execution via run() (high, command execution; AML.T0050, AML.T0053, AML.T0072, LLM05:2025, LLM06:2025)

The `run` function in `evals/runner/run-evals.ts` uses `execSync` to execute arbitrary commands provided as strings, which could be manipulated if the input path or environment is controlled by an attacker.

80% confidence, line 851

execSync(cmd, { cwd: ROOT, encoding: 'utf-8', stdio: 'pipe' });
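One way to narrow a string-based `run()` is to pass the program and its arguments separately through `execFileSync`, which does not invoke a shell by default. The sketch below is an assumption about how such a wrapper might look; `runFile` is a hypothetical name, and the Node binary is used only so the example is self-contained.

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical replacement for a string-based run(): the executable and its
// arguments are separate values, so no shell parses or expands them.
function runFile(file: string, args: string[], cwd: string = process.cwd()): string {
  return execFileSync(file, args, { cwd, encoding: 'utf-8', stdio: 'pipe' });
}

// Shell metacharacters in an argument stay literal: the child process simply
// prints its last argument back.
const echoed = runFile(process.execPath, [
  '-e',
  'console.log(process.argv[process.argv.length - 1])',
  'a; echo injected',
]);
```

With `execSync`, the same string would be handed to a shell, where `;` would start a second command.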

child_process module imported: any exec/execSync/spawn call executes OS commands (high, command execution; AML.T0050, AML.T0053, AML.T0072, LLM05:2025, LLM06:2025)

95% confidence, line 5

import { execSync, execFileSync } from 'child_process';

Potential path traversal in file operations (medium, resource abuse; AML.T0029, AML.T0034, LLM10:2025)

The skill constructs file paths using `join(ROOT, ...)`, where `ROOT` is derived from the script's own directory (`__dirname`), potentially allowing access to files outside the intended project directory if the user-provided path is malicious.

70% confidence, line 639

const ROOT = join(__dirname, '..', '..');
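A containment check of the kind this finding calls for can be sketched as below. `resolveInside` is a hypothetical helper, not part of the skill; it assumes the caller knows the directory that paths must stay within.

```typescript
import { resolve, relative, isAbsolute, sep } from 'node:path';

// Hypothetical guard (not from the skill): resolve a user-supplied path
// against a root and refuse anything that lands outside that root.
function resolveInside(root: string, userPath: string): string {
  const base = resolve(root);
  const target = resolve(base, userPath);
  const rel = relative(base, target);
  // A relative path that climbs out starts with "..". An absolute result
  // (for example, a different drive on Windows) is also outside the root.
  if (rel === '..' || rel.startsWith('..' + sep) || isAbsolute(rel)) {
    throw new Error(`path escapes root: ${userPath}`);
  }
  return target;
}
```

Calling `resolveInside('/tmp/project', 'src/page.tsx')` returns the resolved path, while `resolveInside('/tmp/project', '../../etc/passwd')` throws. The check is lexical only; it does not follow symbolic links.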

Over-privileged file system access (medium, resource abuse; AML.T0029, AML.T0034, LLM10:2025)

The skill performs recursive globbing and file reading across the entire project directory, which may include sensitive configuration files or credentials not related to the generative page.

60% confidence, line 656

const srcFiles = globSync(`${ROOT}/**/*.ts?(x)`, { ignore: `${ROOT}/evals/**` });

execSync command string includes manifest-derived project name (medium, command execution; AML.T0050, AML.T0053, AML.T0072, LLM05:2025, LLM06:2025)

run-evals.ts passes commands to execSync with cwd set to ROOT, which is computed by going two directories up from __dirname. If a malicious .tsx file causes the agent to write a crafted manifest.json whose project name contains shell metacharacters, and that value is later interpolated into a shell command, it could enable command injection through the eval pipeline.

50% confidence, line 851

execSync(cmd, { cwd: ROOT, encoding: 'utf-8', stdio: 'pipe' }) where ROOT = join(__dirname, '..', '..')
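If a manifest value must appear in a command at all, validating it against a strict allowlist first is one possible mitigation. The pattern and the names `SAFE_PROJECT_NAME`, `assertSafeProjectName`, and `isSafeProjectName` below are assumptions made for illustration; the skill's actual naming rules are not stated in this report.

```typescript
// Hypothetical allowlist: letters, digits, dot, underscore, and hyphen only,
// starting with a letter or digit, 1 to 64 characters in total.
const SAFE_PROJECT_NAME = /^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$/;

// Throws unless the value is a string made only of allowlisted characters.
function assertSafeProjectName(name: unknown): string {
  if (typeof name !== 'string' || !SAFE_PROJECT_NAME.test(name)) {
    throw new Error('manifest project name contains unsupported characters');
  }
  return name;
}

// Boolean wrapper around the assertion, convenient for filtering.
function isSafeProjectName(name: unknown): boolean {
  try {
    assertSafeProjectName(name);
    return true;
  } catch {
    return false;
  }
}

const okName = isSafeProjectName('my-gen-page');
const badName = isSafeProjectName('page"; rm -rf ~ #');
```

Rejecting unexpected characters outright is simpler to reason about than trying to escape them for a particular shell.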

Insecure dependency management (medium, supply chain; AML.T0010, AML.T0019, AML.T0058, LLM03:2025)

The skill automatically runs `npm install` and uses `npx` to execute tools without verifying the integrity or version of the packages, exposing the environment to supply chain attacks.

90% confidence, line 859

run('npx tsx evals/runner/presence-runner.ts', 'presence');

Dashboard template injection risk (high, prompt injection; AML.T0051, AML.T0051.000, AML.T0051.001, AML.T0054, LLM01:2025)

The skill dynamically injects JSON data into an HTML dashboard template using string replacement, which could lead to XSS if the feature IDs or descriptions contain malicious payloads.

80% confidence, line 934

html = html.replace(/\/\* BAKED DATA[\s\S]*?END BAKED DATA \*\//m, dataBlock);
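A hedged sketch of a safer injection step follows. It assumes the BAKED DATA block sits inside an inline `<script>` element, which this report does not confirm; `toInlineScriptJson`, `template`, and `payload` are hypothetical names. It escapes characters that could close the script element, and it uses a replacer function so that `$`-patterns in the data are not expanded by `String.prototype.replace`.

```typescript
// Hypothetical helper: serialize data for embedding in an inline <script>.
// Escaped characters remain valid JSON and valid JavaScript string content.
function toInlineScriptJson(data: unknown): string {
  return JSON.stringify(data)
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Assumed template shape, for illustration only.
const template = '<script>/* BAKED DATA */ const DATA = null; /* END BAKED DATA */</script>';

// A description that tries to break out of the script element.
const payload = { id: 'f1', description: "</script><script>alert(1)</script> $'" };

const dataBlock = `const DATA = ${toInlineScriptJson(payload)};`;

// The replacer function inserts dataBlock verbatim; a plain string argument
// would interpret sequences such as $' as replacement patterns.
const html = template.replace(/\/\* BAKED DATA[\s\S]*?END BAKED DATA \*\//, () => dataBlock);
```

The browser still sees the original description once the script runs, because `\u003c` decodes back to `<` inside the string literal.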

Unpinned npx package execution: `npx <pkg>` without a version pin pulls latest from npm at runtime (medium, supply chain; AML.T0010, AML.T0019, AML.T0058, LLM03:2025)

Seen 3 times in this file, at lines 859, 862, and 865.

100% confidence, line 859

npx tsx

Missing allowed-tools declaration (low, unauthorized tool use)

Skill executes commands, writes files, or accesses the network but declares no allowed-tools, so its tool surface cannot be reviewed or constrained.

90% confidence

Undeclared network usage (low, unauthorized tool use)

Skill contains network access patterns (HTTP requests, socket connections) without declaring network capability in allowed-tools.

60% confidence

Missing license (info, policy violation)

Skill does not specify a license field. Specifying a license helps users understand usage terms.

100% confidence

Badge

Markdown

[![Mondoo Skill Check](https://mondoo.com/ai-agent-security/api/badge/github/microsoft/power-cat-skills/eval-generator-gen-pages.svg)](https://mondoo.com/ai-agent-security/skills/github/microsoft/power-cat-skills/eval-generator-gen-pages)

HTML

<a href="https://mondoo.com/ai-agent-security/skills/github/microsoft/power-cat-skills/eval-generator-gen-pages"><img src="https://mondoo.com/ai-agent-security/api/badge/github/microsoft/power-cat-skills/eval-generator-gen-pages.svg" alt="Mondoo Skill Check" /></a>

Image URL

https://mondoo.com/ai-agent-security/api/badge/github/microsoft/power-cat-skills/eval-generator-gen-pages.svg
