I built a Chrome extension that tries to detect persuasive or commercially biased signals inside AI-generated responses.
The idea came from a concern:
Traditional ads are visibly separate from content.
In LLM interfaces, if monetization ever shifts toward embedded recommendations, that separation could disappear.
This project doesn’t assume ads are already happening.
It’s more of a guardrail experiment.
## What it does
The extension works in two directions:
### 1. Forward Shield
Before you send a message, it analyzes your input and warns if you're sharing potentially exploitable personal information (health, finances, vulnerabilities, etc.).
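A minimal sketch of what that pre-send check could look like, assuming a simple keyword/regex heuristic per category. The category names and patterns below are illustrative placeholders, not the extension's actual rules:

```javascript
// Hypothetical disclosure heuristics -- one regex per sensitive category.
// These patterns are illustrative assumptions, not the real detector.
const DISCLOSURE_PATTERNS = {
  health: /\b(diagnos\w*|medication|therapy|depress\w*|anxiety)\b/i,
  finances: /\b(debt|salary|bankrupt\w*|credit score|loan)\b/i,
  vulnerability: /\b(lonely|desperate|can't afford|afraid)\b/i,
};

// Returns a list of {category, term} hits; an empty list means no warning.
function scanPrompt(text) {
  const hits = [];
  for (const [category, pattern] of Object.entries(DISCLOSURE_PATTERNS)) {
    const match = text.match(pattern);
    if (match) hits.push({ category, term: match[0] });
  }
  return hits;
}
```

A real implementation would need far richer patterns (and probably a small classifier), but the shape of the check is the same: scan locally, surface the matched category before the message leaves the browser.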
### 2. Backward Trace
After an AI response appears, it:
- Detects persuasive signals (urgency, authority framing, brand mentions, CTAs)
- Scores semantic alignment between your prompt and the recommendation
- Links detected influence to previous disclosures in the conversation
- Produces a transparency explanation
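The first step above (signal detection) can be sketched as a pattern pass over the response text. The signal names follow the list; the specific patterns are my assumptions, not the extension's real heuristics:

```javascript
// Illustrative persuasive-signal patterns. Names mirror the categories
// above; the phrase lists are assumed examples, not the shipped rules.
const SIGNALS = [
  { name: 'urgency', pattern: /\b(act now|limited time|today only|don't miss)\b/i },
  { name: 'authority', pattern: /\b(experts (agree|recommend)|clinically proven|#1 rated)\b/i },
  { name: 'cta', pattern: /\b(sign up|buy now|click here|try it free)\b/i },
];

// Returns the names of all signals found in an AI response.
function detectSignals(responseText) {
  return SIGNALS
    .filter(s => s.pattern.test(responseText))
    .map(s => s.name);
}
```

In practice brand-mention detection would need a name list or NER rather than fixed phrases, which is where pure heuristics start to strain.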
Everything runs locally.
An optional LLM-as-judge mode is available if you provide your own API key.
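Since everything must run locally without model access, "semantic alignment" has to be approximated client-side. A minimal stand-in, assuming a bag-of-words cosine score (a real version would use embeddings; this only shows the shape of an offline scorer):

```javascript
// Tokenize into lowercase word counts -- a deliberately crude proxy
// for semantic content, chosen because it runs entirely in the browser.
function bagOfWords(text) {
  const counts = new Map();
  for (const w of text.toLowerCase().match(/[a-z']+/g) || []) {
    counts.set(w, (counts.get(w) || 0) + 1);
  }
  return counts;
}

// Cosine similarity between prompt and recommendation word vectors:
// 1.0 for identical texts, 0.0 for no shared vocabulary.
function alignmentScore(prompt, recommendation) {
  const a = bagOfWords(prompt), b = bagOfWords(recommendation);
  let dot = 0, na = 0, nb = 0;
  for (const v of a.values()) na += v * v;
  for (const v of b.values()) nb += v * v;
  for (const [w, v] of a) dot += v * (b.get(w) || 0);
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}
```

A low score on a strongly-pushed recommendation is the interesting case: the response is steering toward something the prompt never asked about.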
## Why build this?
- LLM interactions are unusually intimate compared to search/social.
- Recommendation framing inside answers is structurally different from banner ads.
This is an attempt to explore what “ad transparency” would look like at the conversational layer.
## Questions I’d love feedback on
- Is this solving a real future problem or a hypothetical one?
- Are heuristics sufficient, or is this fundamentally unsolvable without model-level access?
- Would you trust a client-side tool like this?
- What obvious failure modes am I missing?
You’re worried about invisible persuasion while asking users to paste their own API keys into your extension.