Human Taste: Content
Evaluate writing and content through human taste -- the trained judgment that detects whether text is clear, authentic, appropriately dense, and tuned to its audience.
This complements the UX-focused human-taste and code-focused human-taste-code skills. For full citations see references/research-sources.md.
Why This Matters
AI can generate fluent text, but fluency is not quality. Studies show AI text detectors achieve only ~58% accuracy -- meaning AI-generated content often passes surface inspection. The real differentiator is taste: the ability to detect when writing is generic, over-hedged, structurally predictable, or tonally flat, even when it is grammatically perfect.
Key insight: AI slop is not a vocabulary problem. It is a thought-structure problem -- predictable arcs, tidy conclusions, and the absence of genuine voice.
Quick Start
When asked to evaluate content:
- Identify the content type -- UI microcopy, documentation, marketing, blog post, error messages, or long-form
- Run the rubric below across all six dimensions
- Produce a Human Taste: Content Report using the output template
- Quote specific text -- never give vague feedback like "needs more personality"
Evaluation Rubric
Score each dimension 1-5. Anchor every score with quoted text from the content.
1. Clarity (weight: high)
Does every sentence earn its place and say exactly what it means?
| Score | Meaning |
|---|---|
| 1 | Muddled -- reader must re-read to extract meaning |
| 2 | Fuzzy -- intent is guessable but buried in filler |
| 3 | Functional -- gets the point across with some waste |
| 4 | Sharp -- each sentence has a clear job; no wasted words |
| 5 | Precise -- could not be said more clearly in fewer words |
Look for: filler phrases, throat-clearing openers, redundant qualifiers, passive voice overuse, sentence length variation.
2. Voice Authenticity (weight: high)
Does this sound like a specific human wrote it -- or like a language model?
| Score | Meaning |
|---|---|
| 1 | AI slop -- predictable structure, generic phrasing, no personality |
| 2 | Synthetic -- grammatically correct but reads like a template |
| 3 | Neutral -- neither distinctly human nor obviously AI |
| 4 | Personal -- has a recognizable voice with occasional character |
| 5 | Distinctive -- unmistakably human; you could identify the author |
Look for: AI slop markers (see checklist below), genuine opinions vs hedged statements, specific examples vs generic claims, humor or personality.
3. Information Density (weight: high)
Is the ratio of insight to words high?
| Score | Meaning |
|---|---|
| 1 | Padded -- says in a paragraph what could be said in a sentence |
| 2 | Diluted -- useful content buried in filler |
| 3 | Adequate -- reasonable density with some bloat |
| 4 | Dense -- most sentences carry new information |
| 5 | Concentrated -- every sentence teaches or moves the reader forward |
Look for: repeated ideas in different words, setup sentences that add nothing, bullet points that could be a single sentence, word-to-insight ratio.
4. Tone Fit (weight: medium)
Does the tone match the audience, context, and moment?
| Score | Meaning |
|---|---|
| 1 | Jarring -- tone is wrong for the situation (casual in crisis, formal in onboarding) |
| 2 | Off -- partially appropriate but inconsistent |
| 3 | Acceptable -- reasonable tone, no major clashes |
| 4 | Tuned -- tone adapts to context within a consistent voice |
| 5 | Perfect pitch -- tone is exactly right for this audience at this moment |
Look for: formality level vs audience, emotional calibration (error messages should be helpful not cute), consistency across touchpoints, tone spectrum (NN/g's four dimensions: formal/casual, serious/funny, respectful/irreverent, matter-of-fact/enthusiastic).
5. Structure (weight: medium)
Does the content architecture serve the reader's needs?
| Score | Meaning |
|---|---|
| 1 | Disorganized -- no clear flow; reader is lost |
| 2 | Jumbled -- some structure but information is in wrong places |
| 3 | Logical -- follows a reasonable order |
| 4 | Guided -- structure anticipates what the reader needs next |
| 5 | Effortless -- scannable, progressive, leads the reader exactly where they need to go |
Look for: heading hierarchy, progressive disclosure, scannability, front-loading of key information, logical flow between sections.
6. Specificity (weight: low)
Does the content use concrete details instead of abstract claims?
| Score | Meaning |
|---|---|
| 1 | Vague -- all abstract claims, no evidence or examples |
| 2 | Generic -- occasional examples but mostly platitudes |
| 3 | Mixed -- some concrete details, some hand-waving |
| 4 | Grounded -- claims are backed by specific examples or data |
| 5 | Vivid -- uses precise details that make abstract ideas tangible |
Look for: numbers vs "many/some/several," named examples vs generic references, specific scenarios vs broad claims, showing vs telling.
AI Slop Checklist
Flag any of these patterns when you detect them. No single pattern proves AI generation, but a cluster of them indicates the content lacks human taste.
Word-level markers:
- "Delve," "realm," "underscore," "meticulous," "commendable"
- "Seamless," "robust," "cutting-edge," "future-ready," "leverage"
- "Utilize" instead of "use," "implement" instead of "start/do"
- Excessive em-dashes used as all-purpose connectors
Sentence-level patterns:
- "In today's fast-paced world..." or "In the ever-evolving landscape of..."
- "It's not about X, it's about Y" binary contrasts
- "No X. No Y. Just Z." dramatic fragmentation
- Triple grouping: "fast, efficient, and reliable"
- Throat-clearing openers that delay the point
Structure-level patterns:
- Every section follows the same arc (setup, explanation, conclusion)
- Tidy paragraph structure with no variation in length
- Lists of exactly 3-5 items everywhere
- Conclusions that restate the introduction
Voice-level markers:
- Hedging without commitment ("It's worth considering...")
- Generic enthusiasm without specifics ("This is a game-changer!")
- Corporate verbs: "highlighting," "facilitating," "driving impact"
- No genuine opinions -- everything is balanced to the point of saying nothing
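A first pass over the word-level and sentence-level markers can be automated. The sketch below is illustrative only: the marker lists are copied from the checklist, but the function names (`find_slop_markers`, `is_cluster`) and the cluster threshold of three distinct markers are assumptions, not part of this skill. Structure-level and voice-level patterns still need human judgment.

```python
import re
from collections import Counter

# Word-level markers taken from the checklist; not an exhaustive list.
WORD_MARKERS = [
    "delve", "realm", "underscore", "meticulous", "commendable",
    "seamless", "robust", "cutting-edge", "future-ready", "leverage",
    "utilize",
]

# Sentence-level openers from the checklist (straight or curly apostrophe).
OPENER_PATTERNS = [
    r"in today['’]s fast-paced world",
    r"in the ever-evolving landscape of",
]


def find_slop_markers(text: str) -> Counter:
    """Count checklist markers in text, case-insensitively."""
    lowered = text.lower()
    hits: Counter = Counter()
    for word in WORD_MARKERS:
        # \w* also catches simple suffixes such as "delves" or "realms".
        count = len(re.findall(r"\b" + re.escape(word) + r"\w*", lowered))
        if count:
            hits[word] = count
    for pattern in OPENER_PATTERNS:
        count = len(re.findall(pattern, lowered))
        if count:
            hits[pattern] = count
    return hits


def is_cluster(hits: Counter, threshold: int = 3) -> bool:
    """Hypothetical rule: three or more distinct markers count as a cluster."""
    return len(hits) >= threshold


sample = "In today's fast-paced world, we leverage a robust, seamless platform."
print(is_cluster(find_slop_markers(sample)))  # True
```

A hit from this script is a prompt to quote the passage in the AI Slop Flags section, not a verdict on its own.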
Output Template
Produce your evaluation in this format:
# Human Taste: Content Report
**Subject:** [what was evaluated]
**Content type:** [microcopy / docs / marketing / blog / error messages / long-form]
**Date:** [date]
**Overall Score:** [weighted average, 1-5, one decimal] / 5
## Scores
| Dimension | Score | Key Evidence |
|-----------|-------|-------------|
| Clarity | X/5 | [quoted text as evidence] |
| Voice Authenticity | X/5 | [quoted text as evidence] |
| Information Density | X/5 | [quoted text as evidence] |
| Tone Fit | X/5 | [specific observation] |
| Structure | X/5 | [specific observation] |
| Specificity | X/5 | [quoted text as evidence] |
## AI Slop Flags
- [pattern detected] -- [quoted example from text]
(or "None detected" if clean)
## Strengths
- [concrete strength with quoted evidence]
- [concrete strength with quoted evidence]
## Issues
- **[severity: Critical/Major/Minor]**: [specific issue] -- [why it weakens the content] -- [rewrite suggestion]
## Verdict
[2-3 sentences: what works, what does not, and the single highest-impact edit]
Weighted average formula: (Clarity*3 + VoiceAuthenticity*3 + InformationDensity*3 + ToneFit*2 + Structure*2 + Specificity*1) / 14
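The formula can be checked with a few lines of Python. This is a minimal sketch: the weights come from the formula above, while the function name `overall_score`, the dictionary keys, and the example scores are assumptions made for illustration.

```python
# Weights from the formula: high = 3, medium = 2, low = 1 (they sum to 14).
WEIGHTS = {
    "clarity": 3,
    "voice_authenticity": 3,
    "information_density": 3,
    "tone_fit": 2,
    "structure": 2,
    "specificity": 1,
}


def overall_score(scores: dict[str, int]) -> float:
    """Return the weighted average of six 1-5 scores, rounded to one decimal."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)


# Hypothetical scores for one piece of content.
example = {
    "clarity": 4,
    "voice_authenticity": 3,
    "information_density": 4,
    "tone_fit": 5,
    "structure": 3,
    "specificity": 2,
}
print(overall_score(example))  # 3.6
```

For the example scores the weighted sum is 12 + 9 + 12 + 10 + 6 + 2 = 51, and 51 / 14 rounds to 3.6.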
Comparing Content Versions
When comparing two drafts:
- Run the rubric on each independently
- Add a Comparison Table with side-by-side scores
- Quote the strongest and weakest passage from each
- Recommend which version to use as the starting point and what to borrow from the other
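The side-by-side table can be generated from two sets of rubric scores. The column layout, the function name `comparison_table`, and the default labels "Draft A" and "Draft B" below are assumptions; this skill only specifies that the table shows scores side by side.

```python
# Dimension names as they appear in the report template.
DIMENSIONS = [
    "Clarity",
    "Voice Authenticity",
    "Information Density",
    "Tone Fit",
    "Structure",
    "Specificity",
]


def comparison_table(
    scores_a: dict[str, int],
    scores_b: dict[str, int],
    label_a: str = "Draft A",
    label_b: str = "Draft B",
) -> str:
    """Build a Markdown table with one row per rubric dimension."""
    lines = [
        f"| Dimension | {label_a} | {label_b} |",
        "|---|---|---|",
    ]
    for dim in DIMENSIONS:
        lines.append(f"| {dim} | {scores_a[dim]}/5 | {scores_b[dim]}/5 |")
    return "\n".join(lines)


# Hypothetical scores for two drafts.
draft_a = dict(zip(DIMENSIONS, [4, 3, 4, 5, 3, 2]))
draft_b = dict(zip(DIMENSIONS, [3, 4, 3, 4, 4, 4]))
print(comparison_table(draft_a, draft_b))
```

The table only summarizes scores; the quoted strongest and weakest passages still have to be chosen by the reviewer.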
Content-Type-Specific Guidance
UI Microcopy: Brevity is paramount. Weight Clarity and Tone Fit most heavily. Error messages should be helpful, specific, and action-oriented -- never blame the user.
Documentation: Information Density and Structure matter most. Judge the page by whether a reader can find and use the information without reading all of it.
Marketing copy: Voice Authenticity and Specificity are critical. Generic claims ("world-class solution") score low. Concrete benefits ("saves 4 hours per week on invoice processing") score high.
Blog / long-form: All dimensions matter equally. Pay extra attention to the AI Slop Checklist -- long-form AI content tends to accumulate structural predictability.
When Not to Use This Skill
- Code review (use human-taste-code)
- Visual/UX design evaluation (use human-taste)
- Grammar/spelling checks (those are automated)
- Translation quality (requires source language comparison)
Additional Resources
- For full research citations and sources, see references/research-sources.md
- For worked examples of the rubric in action, see examples.md