ai-writing-detector is a TypeScript/Node.js CLI for analyzing prose with a
rule-based AI-writing heuristic. It looks for vocabulary, structural patterns,
vague claims, promotional language, and several statistical signals that are
commonly associated with AI-generated text.
The project is designed around explainable scoring rather than model inference. Instead of sending text to an external service, it inspects the input locally and builds a score from explicit detectors and linguistic analyzers.
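As a minimal sketch of this explainable-scoring idea, each detector could emit a named signal and the overall score could be a normalized weighted sum. The names, shapes, and weights below are illustrative assumptions, not the project's actual API:

```typescript
// Hypothetical signal shape: each detector reports what fired and how strongly.
interface Signal {
  name: string;   // which detector produced this signal
  score: number;  // strength of the signal, 0..1
  weight: number; // how much this detector contributes overall
}

// Combine detector signals into a single 0..1 score (illustrative only).
function aggregate(signals: Signal[]): number {
  const totalWeight = signals.reduce((sum, s) => sum + s.weight, 0);
  if (totalWeight === 0) return 0;
  const weighted = signals.reduce((sum, s) => sum + s.score * s.weight, 0);
  return weighted / totalWeight;
}
```

Because every signal is named, a report can show exactly which detectors contributed to the final score.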
The repository currently includes:
- AI-associated vocabulary and phrase matching
- Structural patterns such as rule-of-three, negative parallelism, outline-style conclusions, and false ranges
- Vague attribution, superficial phrasing, and overgeneralization
- Promotional language, excessive emphasis, and elegant variation
- Statistical analysis including lexical diversity, sentence-length variation, passive voice, transition density, readability, punctuation, and rare-word usage
- Score normalization and classification into:
  - Likely Human-Written
  - Possibly AI-Generated
  - Likely AI-Generated
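Classification into these labels can be sketched as simple thresholding over the normalized score. The cutoff values below are assumptions for illustration; the real thresholds live in src/scoring/:

```typescript
type Verdict =
  | "Likely Human-Written"
  | "Possibly AI-Generated"
  | "Likely AI-Generated";

// Map a normalized 0..1 score to a verdict (cutoffs are hypothetical).
function classify(score: number): Verdict {
  if (score < 0.4) return "Likely Human-Written";
  if (score < 0.7) return "Possibly AI-Generated";
  return "Likely AI-Generated";
}
```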
See ROADMAP.md for implementation status and REQUIREMENTS.md for the original challenge brief.
This repository was built with a custom agent loop documented in LOOP.md.
The important part is that the loop is agent-driven, not a static shell script:
- An LLM agent running in OpenCode executes /opsx-loop
- The loop works phase-by-phase from ROADMAP.md
- For each phase, the agent creates or updates openspec artifacts, writes tests, implements code, runs quality checks, updates docs, and archives the phase
- Progress is resumable, so re-running the loop continues from the next unchecked phase
Typical loop commands:
```bash
/opsx-loop
/opsx-loop 2
/opsx-loop 3-5
```

At a high level, each phase follows this pattern:
- Read the next incomplete phase from the roadmap.
- Create or reuse one openspec change for that phase.
- Write specs, design notes, and tasks for the phase as a unit.
- Add tests before implementation.
- Implement the phase tasks and mark them complete.
- Run checks, update docs, commit, and archive the change.
That loop is why the repo contains both implementation code and the supporting
spec artifacts under openspec/.
```bash
npm install

# Build
npm run build

# Test
npm test
npm run test:watch

# Lint and format
npm run lint
npm run lint:fix
npm run format

# Type checking
npm run typecheck
```

Build first:
```bash
npm run build
```

Then run the CLI against a file:
```bash
npm run start -- analyze samples/human-written/technology.txt
```

Or pass text on stdin:
```bash
cat samples/ai-generated/business.txt | npm run start -- analyze ignored.txt --stdin
```

Current command surface:
```bash
npm run start -- analyze <file>
npm run start -- analyze <file> --stdin
```

The current CLI entrypoint validates input and prints basic text statistics for the supplied text:
- character count
- word count
This behavior comes from src/cli.ts and src/output/display.ts.
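The statistics above can be sketched in a few lines; this is an illustrative stand-in, and the actual logic lives in src/cli.ts and src/output/display.ts:

```typescript
// Compute the basic stats the current CLI prints (hypothetical helper).
function basicStats(text: string): { characters: number; words: number } {
  const trimmed = text.trim();
  return {
    characters: text.length,
    // Split on any run of whitespace; empty input has zero words.
    words: trimmed === "" ? 0 : trimmed.split(/\s+/).length,
  };
}
```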
The repository also contains the full scoring and report-generation pipeline in
src/scoring/ and src/report/, including classification and formatted report
output. Those modules are implemented and tested, but they are not yet wired
into the default CLI command.
src/
├── analyzers/ # Statistical and linguistic analysis
├── detectors/ # Rule-based pattern detectors
├── input/ # File and stdin handling
├── output/ # Console display helpers
├── report/ # Report assembly and CLI formatting
├── scoring/ # Score aggregation, normalization, classification
└── utils/ # Tokenization and statistics helpers
- This tool analyzes natural-language writing, not source code authorship.
- It is heuristic-based, so results should be treated as signals, not proof.
- Short or heavily edited text can reduce accuracy.
Sample inputs for manual testing are included in:
- samples/ai-generated/
- samples/human-written/