Propaganda Detection with ChatGPT: Techniques, Prompts, and Real-World Examples (2025)


Aug 26, 2025 Artificial Intelligence Isabella Hartley

Propaganda spreads faster than fact-checks, and the gap isn’t shrinking on its own. The win right now is pairing human judgment with language models that recognise manipulative patterns at scale. This guide shows how to turn ChatGPT into a practical spotter for loaded language, whataboutism, scapegoating, and more, without pretending it’s a magic lie detector. You’ll get prompts, rules of thumb, real examples, and guardrails so you can test this in a newsroom, classroom, or moderation queue.

TL;DR

  • ChatGPT won’t prove truth, but it’s strong at flagging rhetorical techniques. Anchor it to clear labels (e.g., loaded language, name-calling) and require short, structured justifications.
  • Use few-shot prompts, a JSON schema, and confidence scores. Combine model output with lightweight rules for better precision on high-risk content.
  • Always pair analysis with evidence checks. Use external sources or a retrieval step; don’t let the model guess facts.
  • Measure accuracy on a small, labelled sample before rollout. Track precision/recall and agreement between human reviewers.
  • Keep bias safeguards: judge techniques, not ideologies. Escalate close calls to humans. Log decisions for audit.
  • Best quick wins: headline triage, ad copy review, political claims pre-screening, and classroom media literacy exercises.

Build a Smarter Propaganda Detector with ChatGPT (Step-by-Step)

If you clicked this, your jobs-to-be-done likely look like this: spot manipulation early, scale review without drowning your team, set fair rules that hold up under scrutiny, surface evidence, and avoid false positives that erode trust. Here’s a practical build that maps to those goals.

1) Define what you’ll detect (shared language matters)

When reviewers label the same thing differently, your model will wobble. Use a simple, published taxonomy and keep it consistent. The SemEval-2020 Task 11 organisers catalogued techniques often used in news and opinion. These are reliable building blocks:

  • Loaded language: emotionally charged wording to sway feelings (e.g., “invasion,” “flood,” “traitor”).
  • Name-calling and labeling: attaching negative tags to people or groups.
  • Appeal to fear/prejudice: triggering anxiety or stereotypes to shortcut thinking.
  • Bandwagon: “everyone knows,” “the country agrees,” aiming for herd effect.
  • Black-and-white fallacy: no middle ground; false dichotomies.
  • Whataboutism: deflecting criticism by pointing to someone else’s fault.
  • Red herring: shifting the topic to distract from the core issue.
  • Flag-waving: tying a stance to patriotism, identity, or loyalty.
  • Exaggeration/minimisation: inflating or downplaying facts to steer reaction.
  • Repetition: restating claims for effect, not clarity.
  • Transfer/association: borrowing prestige or stigma to influence feelings.
  • Causal oversimplification: blaming one cause without evidence.

Pick 6-10 you actually need. Fewer, clearer labels beat an encyclopedic list that weakens agreement.
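A shortlist like this works best when prompts, reviewer tooling, and post-processing all read from one definition. A minimal sketch in Python (tag names and one-line definitions here are illustrative; trim to your own shortlist):

```python
# Shared technique taxonomy: one-line definitions reused in prompts and review UIs.
# The tags and wording below are examples; keep only the 6-10 you actually need.
TECHNIQUE_TAGS = {
    "loaded_language": "Emotionally charged wording used to sway feelings.",
    "name_calling": "Attaching negative labels to people or groups.",
    "appeal_to_fear": "Triggering anxiety or stereotypes to shortcut thinking.",
    "bandwagon": "Claiming everyone agrees to create a herd effect.",
    "black_and_white": "Presenting a false dichotomy with no middle ground.",
    "whataboutism": "Deflecting criticism by pointing at someone else's fault.",
    "red_herring": "Shifting the topic to distract from the core issue.",
    "flag_waving": "Tying a stance to patriotism, identity, or loyalty.",
}

def tag_definitions_block() -> str:
    """Render the shortlist as one-line definitions for a prompt."""
    return "\n".join(f"- {tag}: {desc}" for tag, desc in TECHNIQUE_TAGS.items())
```

Because every layer imports the same dictionary, renaming or retiring a tag is a one-line change.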

2) Set a simple labelling rubric

  • Technique tags: one or more from your shortlist.
  • Severity: low (subtle), medium, high (overt or repeated and consequential).
  • Confidence: 0-100, where 50 = “could go either way.”
  • Rationale: one sentence quoting or paraphrasing the cue words or structure that triggered the tag.
  • Evidence needed: yes/no plus what to check (e.g., crime stats by year, original quote).

This structure helps reviewers coach the model and lets you defend decisions later.
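The rubric maps naturally onto a small record type that reviewers and the model's parsed output can share. A sketch, with field names assumed rather than prescribed:

```python
from dataclasses import dataclass

@dataclass
class Label:
    """One judgment (human or model) under the rubric.
    Field names are illustrative; adapt them to your own schema."""
    techniques: list[str]          # tags from the shortlist
    severity: str                  # "low" | "medium" | "high"
    confidence: int                # 0-100, where 50 = "could go either way"
    rationale: str                 # one sentence quoting the cue words
    evidence_needed: str = "none"  # "none" | "data" | "source_quote"

    def __post_init__(self):
        # Fail fast on rubric violations so bad records never reach storage.
        assert self.severity in {"low", "medium", "high"}
        assert 0 <= self.confidence <= 100
```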

3) Craft a prompt that enforces the rubric

Models drift when prompts are vague. Use a system instruction that tells the model to judge rhetoric, not ideology, and to avoid guessing facts. Require a strict output format. Add two or three few-shot examples that match your setting (headlines, short posts, long articles, or ad copy).

Here’s a compact template you can adapt:

  • Instruction: “Identify propaganda techniques in the text. Judge language and logic only. Do not assume facts. If uncertain, explain why and lower confidence.”
  • Allowed tags: your shortlist, each defined in 1 line.
  • Output schema: {"techniques": [..], "severity": "low|medium|high", "confidence": 0-100, "rationale": "...", "evidence_needed": "..."}
  • Few-shot examples: 2-3 short pairs of input → desired JSON output, including one “not propaganda” case.

Set temperature low (0.1-0.3) for consistent labelling. Ask for brief rationales, not long essays; that keeps the model focused and reduces drift.
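The pieces above (instruction, tag definitions, schema, few-shot pairs) can be assembled programmatically so they never drift apart. A sketch; the instruction text and schema hint follow this guide, while the function shape is an assumption:

```python
import json

INSTRUCTION = (
    "Identify propaganda techniques in the text. Judge language and logic only. "
    "Do not assume facts. If uncertain, explain why and lower confidence. "
    "Reply with valid JSON only."
)

SCHEMA_HINT = (
    '{"techniques": [], "severity": "low|medium|high", '
    '"confidence": 0, "rationale": "", "evidence_needed": "none|data|source_quote"}'
)

def build_prompt(tag_lines: list[str], few_shot: list[tuple[str, dict]], text: str) -> str:
    """Assemble instruction, allowed tags, schema, few-shot examples, and the input."""
    parts = [INSTRUCTION, "Allowed tags:"] + tag_lines
    parts += ["Output schema:", SCHEMA_HINT]
    for example_text, example_json in few_shot:
        parts += [f"Text: {example_text}", f"Output: {json.dumps(example_json)}"]
    parts += [f"Text: {text}", "Output:"]
    return "\n".join(parts)
```

Keeping the few-shot pool in data (not hard-coded strings) makes the weekly refresh described later a config change rather than a code change.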

4) Add lightweight pre- and post-processing

  • Language and translation: detect language, then translate to English for a single rubric. Keep the original text for quotes.
  • Segmentation: long articles? Split into paragraphs and aggregate results (e.g., if 3+ sections show the same tag, raise severity).
  • De-duplication: remove repeated posts before analysis; repetition is itself a tag you can capture once.
  • Whitelist/quote detection: don’t mark quoted historical material as propaganda if the framing is neutral. Add a rule: “If the text is a direct quote with neutral reporting verbs, lower severity.”
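The segmentation rule (raise severity when 3+ sections repeat a tag) can be sketched as a small aggregation step over per-paragraph outputs; the threshold and dict shapes are illustrative:

```python
from collections import Counter

def aggregate_paragraph_tags(paragraph_results: list[dict], repeat_threshold: int = 3) -> dict:
    """Merge per-paragraph model outputs for one article.
    If any tag appears in `repeat_threshold`+ paragraphs, raise severity to high.
    Input dicts follow the JSON schema from the prompt section."""
    counts = Counter(t for r in paragraph_results for t in r.get("techniques", []))
    rank = {"low": 0, "medium": 1, "high": 2}
    severities = [r.get("severity", "low") for r in paragraph_results]
    top = max(severities, key=rank.get, default="low")  # keep the worst paragraph's severity
    if any(n >= repeat_threshold for n in counts.values()):
        top = "high"  # repetition across sections is itself a signal
    return {"techniques": sorted(counts), "severity": top, "tag_counts": dict(counts)}
```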

5) Evidence checks (where truth-handling actually happens)

ChatGPT can’t be your final fact source. Pair it with basic evidence retrieval. That can be manual (for small teams) or automated (for scale). Good sources to anchor claims include official statistics offices, court records, and transparent NGOs. In the UK, Ofcom’s 2024 News Consumption report is a useful baseline on where people get information and common misinfo vectors. EUvsDisinfo and the Poynter Institute’s IFCN network document recurring narratives. Cite sources by name in your notes so others can re-check.

6) Workflow that actually fits a team

  1. Ingest: pull posts/headlines/articles into a queue with timestamps and source metadata.
  2. Pre-process: language detect, translate if needed, split long texts.
  3. Model pass: run the prompt; collect JSON outputs, not free text.
  4. Rules overlay: apply business rules (e.g., if “appeal to fear” and “high severity,” escalate to senior review).
  5. Human review: a second pair of eyes checks the rationale and searches for evidence if the model flagged “evidence needed.”
  6. Decision: label as propaganda, persuasive but fair, or not applicable. Store the decision, the model’s output, and reviewer notes.
  7. Feedback: feed corrected cases back into your few-shot pool every week so the model reflects your edge cases.
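Step 4's rules overlay is just a routing function over the model's JSON. A sketch; the tag names and thresholds stand in for your own policy:

```python
def route(output: dict) -> str:
    """Business-rules overlay on top of the model's parsed JSON (step 4).
    The combinations and thresholds below are illustrative, not policy."""
    techniques = set(output.get("techniques", []))
    severity = output.get("severity", "low")
    if "appeal_to_fear" in techniques and severity == "high":
        return "senior_review"      # high-risk combination: escalate
    if output.get("evidence_needed", "none") != "none":
        return "evidence_check"     # a reviewer must trace the claim
    if severity == "low" and output.get("confidence", 0) < 60:
        return "log_only"           # weak, low-stakes flag: just record it
    return "standard_review"
```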

7) Why LLMs here? A quick comparison

  • Rule-based: precise for known phrases (“traitors,” “infestation”), brittle to rewording, easy to explain, low maintenance once set.
  • Traditional ML (n-grams, classic classifiers): decent on patterns with enough labelled data, but expensive to maintain across languages and domains.
  • LLMs (ChatGPT): strong zero/low-shot classification of rhetorical techniques, adapts across topics, needs guardrails and calibration to manage variance.

A hybrid works best: rules for must-catch cues (compliance), LLM for nuance, human review for impact.
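One way to wire the hybrid: a tiny regex lexicon for must-catch cues OR'd with the LLM's high-risk tags. The patterns and tag set below are examples, not a vetted list:

```python
import re

# Must-catch lexicon for the rule-based layer; entries are examples only.
MUST_CATCH = [r"\btraitors?\b", r"\binfestation\b"]

def hybrid_flag(text: str, llm_tags: list[str]) -> bool:
    """Hybrid decision: flag if the lexicon fires (compliance) OR the LLM
    reports a high-risk tag. Switching OR to AND trades recall for precision."""
    lexicon_hit = any(re.search(p, text, re.IGNORECASE) for p in MUST_CATCH)
    llm_hit = bool(set(llm_tags) & {"appeal_to_fear", "name_calling"})
    return lexicon_hit or llm_hit
```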

Prompts, Real-World Examples, and How to Measure Accuracy


A. Solid prompt you can copy

Instruction (short): “You are a media analyst. Identify propaganda techniques in the text. Judge rhetoric, not ideology or truth. If you suspect a technique, quote or paraphrase the cue words. Keep the output short, valid JSON, and include a confidence score.”

Allowed tags (define briefly): loaded_language, name_calling, appeal_to_fear, bandwagon, black_and_white, whataboutism, red_herring, flag_waving, exaggeration_minimization, repetition, transfer_association, causal_oversimplification.

Schema: {"techniques":[],"severity":"low|medium|high","confidence":0-100,"rationale":"","evidence_needed":"none|data|source_quote"}

Few-shot example 1 (headline): “Our city is being overrun by outsiders who don’t share our values.” → Techniques: loaded_language, appeal_to_fear. Rationale: “overrun,” “outsiders,” identity framing. Severity: high. Confidence: 86. Evidence_needed: data (if it also makes factual claims like crime rates).

Few-shot example 2 (tweet): “You complain about pollution, but what about countries X and Y? Hypocrites.” → Techniques: whataboutism, name_calling. Severity: medium. Confidence: 78. Evidence_needed: none (rhetorical move, not a factual claim).

Few-shot example 3 (neutral report): “The minister said ‘crime rose 5%’; the statistics office will release its report on Friday.” → Techniques: none. Severity: low. Confidence: 90. Rationale: neutral framing; attribution present. Evidence_needed: source_quote.

Drop two or three of these directly into your prompt. It grounds the model and reduces random drift.

B. Real-world style examples

Example 1: “Immigrants are flooding our streets, bringing crime and chaos. We must take our city back.”

  • Likely tags: loaded_language (“flooding,” “chaos”), appeal_to_fear, flag_waving (“our city”), causal_oversimplification (if crime is implied to be caused by immigration with no data).
  • Severity: high (multiple techniques, strong emotional push).
  • Confidence: high, if the cues are explicit.
  • Evidence needed: data, if any numeric claim is made. Otherwise, note that the emotional language isn’t proof.

Example 2: “Why discuss our climate plan when Country Z burns more coal? Fix them first.”

  • Likely tags: whataboutism, red_herring (deflects local accountability).
  • Severity: medium.
  • Evidence needed: none for the technique; data only if it asserts facts about Country Z.

Example 3: “Real patriots stand with Brand Q. Buy now to show you care about this nation.”

  • Likely tags: flag_waving, transfer_association (national identity → purchase), bandwagon if it suggests “everyone is doing it.”
  • Severity: medium to high depending on repetition and context.
  • Evidence needed: none; this is about rhetoric, not factual claims.

Example 4 (edge case): Quoting a historical speech with strong rhetoric inside a balanced article.

  • If the framing is neutral and attribution is clear, mark techniques as “quoted context” and set severity low. Your rules layer should down-rank quotes used for analysis or history.
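The quote down-rank rule can be sketched as a heuristic check plus a severity adjustment; the reporting-verb list and regex below are rough assumptions, not a production detector:

```python
import re

# Neutral reporting verbs that suggest attribution, not endorsement (illustrative list).
NEUTRAL_VERBS = {"said", "stated", "reported", "wrote", "noted"}

def is_quoted_context(text: str) -> bool:
    """Heuristic: a direct quote accompanied by a neutral reporting verb."""
    has_quote = bool(re.search(r'["\u201c].+?["\u201d]', text))
    has_neutral_verb = any(v in text.lower().split() for v in NEUTRAL_VERBS)
    return has_quote and has_neutral_verb

def downrank_if_quoted(label: dict, text: str) -> dict:
    """Lower severity when strong rhetoric appears only inside quoted material."""
    if is_quoted_context(text) and label.get("severity") != "low":
        label = {**label, "severity": "low",
                 "rationale": label.get("rationale", "") + " (quoted context)"}
    return label
```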

C. Pitfalls and how to handle them

  • Sarcasm and satire: add a rule: if the author signals satire (“satire,” “spoof,” or clear comedic framing), lower severity and ask for human review.
  • Memes and images: LLM text alone won’t parse visual rhetoric. If possible, extract alt text or captions. For image-based propaganda, you’ll need a vision model or manual review.
  • Multilingual nuance: translation can blunt idioms. Keep a short list of common local phrases that carry propaganda weight in each target language and use bilingual reviewers for spot checks.
  • Quotations: misattribution is common. Require the model to flag “source_quote” when a quote drives the claim so a reviewer can trace it.
  • Ideology bias: force the prompt to focus on techniques. Add a check: “Would I apply the same label if the stance were reversed?”

D. Measuring accuracy without a PhD

You don’t need a giant lab to check if this works for you. Build a small gold set and measure simple metrics.

  • Assemble 150-300 items from your real stream (headlines, posts, paragraphs).
  • Have two reviewers label techniques and severity using your rubric. Blind them to each other’s decisions.
  • Compute agreement (Cohen’s kappa). Aim for 0.6+ before you even judge the model. If humans don’t agree, the model can’t be stable.
  • Run the model. Compare its labels to the agreed human labels. Track precision (how often the model’s flags are right) and recall (how many true cases it finds). Adjust severity thresholds to fit your risk appetite.
  • Re-test monthly or when your content mix shifts (elections, crises, platform changes).
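Precision, recall, and Cohen's kappa need no special tooling; a few lines of plain Python cover the whole evaluation loop described above:

```python
def precision_recall(gold: list[bool], pred: list[bool]) -> tuple[float, float]:
    """Per-tag precision and recall against the agreed human labels."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Agreement between two reviewers, corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0
```

Run the kappa on your two reviewers first; if it is below ~0.6, fix the rubric before judging the model.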

Useful reference points

  • SemEval-2020 Task 11 (propaganda detection in news) provides technique definitions and benchmark baselines.
  • PHEME and LIAR datasets capture rumor and claim veracity, helpful when you extend into fact-centric checks.
  • Ofcom’s 2024 UK News Consumption report gives context on channels where propaganda spreads (e.g., short-form video, private messaging).
  • EUvsDisinfo and IFCN fact-checkers document recurring narratives you can mimic in your few-shot examples.

E. Tuning tips that usually pay off

  • Short output, strong schema: force JSON with 1-3 tags and a one-sentence rationale. This improves consistency.
  • Separate “technique” from “truth”: two passes beat one. First pass labels rhetorical techniques; second pass, if needed, lists claims that require evidence.
  • Confidence calibration: map the model’s 0-100 confidence to actions. For example: 80+ = auto-escalate, 60-79 = queue for review, under 60 = log only.
  • Few-shot maintenance: refresh examples every two weeks with your newest false positives/negatives.
  • Ensemble it: combine the LLM with a small lexicon for must-catch phrases and a toxicity classifier for safety. Use AND/OR rules to control false positives.
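The confidence-to-action mapping is a one-function policy; the thresholds below mirror the example in the calibration bullet and should be tuned to your risk appetite:

```python
def action_for_confidence(confidence: int) -> str:
    """Map the model's 0-100 confidence to an operational action.
    Thresholds (80, 60) follow the example above; adjust per risk appetite."""
    if confidence >= 80:
        return "auto_escalate"
    if confidence >= 60:
        return "queue_for_review"
    return "log_only"
```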

Checklists, Safeguards, Mini‑FAQ, and Next Steps


Prompt design checklist

  • Instruction explicitly says: judge rhetoric, not ideology or truth.
  • Clear, short tag definitions your team agrees on.
  • Few-shot examples include: strong propaganda, borderline case, and “not propaganda.”
  • Strict output schema with confidence and rationale.
  • Low temperature for stability; short rationales to reduce drift.

Human review checklist

  • Does the rationale quote actual cues (words/phrases) that fit the tag?
  • Is the severity justified by frequency, strength, or placement?
  • If a factual claim drives impact, is there a plan to verify it?
  • Could a reversed-stance version receive the same label?
  • Is this a quote for context rather than the author’s voice?

Red-team checklist (try to break it)

  • Swap ideology but keep the tactic: does the label hold?
  • Hide loaded language inside quotes and see if the model over-flags.
  • Add statistics without sources: does the model demand evidence?
  • Use coded language or euphemisms common in your locale: does it miss them?
  • Combine two techniques subtly (e.g., minimisation + whataboutism): does it catch both?

Quick triage flow you can teach a team

  • Is the text trying to make you feel first and think later? If yes, scan for loaded terms or identity cues.
  • Is it dodging the main point? Look for topic shifts (red herring) or whataboutism.
  • Is it forcing a false choice? Mark black-and-white fallacies.
  • Is it blaming a single cause without proof? Flag causal oversimplification and ask for evidence.
  • Is it tying virtue or patriotism to compliance? That’s flag-waving/association.

Mini‑FAQ

  • What’s the difference between propaganda and normal persuasion? Persuasion can be fair and evidence-led. Propaganda leans on emotional shortcuts and logical fallacies to bypass scrutiny. Focus your labels on the tactic, not the stance.
  • Can ChatGPT detect lies? It can’t verify truth on its own. It can highlight claims that need checking and the rhetoric used. Verification needs outside sources.
  • Will the model hallucinate? It can. Reduce this by forbidding new facts in the prompt, forcing short rationales, and requiring quotes from the input.
  • What about memes and video? You’ll need a vision model or human review. You can still analyse captions and comments for techniques.
  • How do we avoid political bias? Use mirrored tests (swap ideologies) and judge tactics, not topics. Measure agreement across reviewers with different viewpoints.
  • Can this run in education? Yes: teachers can run short texts through the prompt and ask students to critique the model’s rationale. It builds media literacy.

Next steps by scenario

  • Newsrooms: start with headline triage. Label loaded language and whataboutism. Add a weekly review meeting to refresh few-shot examples with your toughest cases.
  • Platform moderation: focus on high-risk tags (appeal to fear, dehumanising name-calling). Tie severity to escalation policies. Log every automated decision with rationale.
  • Brand safety/ads: scan ad copy for flag-waving, black-and-white claims, and manipulative urgency. Set thresholds: reject high severity; human-review medium.
  • Classrooms: pick three articles on the same topic with different tones. Have students guess the label first, then compare to the model’s output. Discuss where it’s wrong and why.

Troubleshooting

  • Too many false positives on heated but fair opinion: tighten tag definitions and add few-shot examples of strong opinion that isn’t propaganda (clear evidence, fair wording).
  • Missed subtle propaganda: include more borderline examples in few-shot, lower the trigger threshold for loaded language, and add a small lexicon for local euphemisms.
  • Domain shift (election season starts): refresh examples weekly and re-run your 200-item evaluation. Expect drift and catch it early.
  • Speed and cost: batch inputs, use shorter contexts (paragraphs not full articles), and cache results for duplicate content.
  • Multilingual errors: add bilingual few-shot examples. Where possible, run the model on the original language and the translation, then reconcile results.
  • Audit needs: store input text, model JSON, reviewer decision, and timestamp. Keep a short policy note explaining your definitions and escalation rules.

Ethics and trust anchors

  • Transparency: tell users or readers when an automated system helped triage content. It builds credibility.
  • Proportionality: use automated flags to prioritise review, not to silence speech outright, unless your policy explicitly allows it for clear harms.
  • Appeals: create a path to challenge labels. A simple form with examples goes a long way.
  • Continuous learning: incorporate corrections quickly; propaganda evolves with the news cycle.

When you frame the problem as technique detection, rather than truth arbitration, you get practical wins fast. You catch the emotional shortcuts and logical sidesteps that make propaganda sticky, you standardise how your team talks about them, and you leave truth claims to be verified with sources. That balance is where ChatGPT propaganda detection actually helps in 2025: it spots the moves, you decide what to do with them, and your audience stays better informed.