All of these tools that are not controlled by the user, trained on datasets they do not own or understand, will inevitably be subject to manipulation. I do not necessarily believe that Canva went in and specifically trained their AI models to do this, but that's almost worse because they become the face of what somebody else has decided their model should be doing.
Anybody using AI tools should be extremely cautious about what is being produced.
You can see it a lot if you ask the different AI models anything remotely political... in some places you can definitely see the hand-editing/overrides as well.
It's hard to get around these kinds of issues, and it definitely leads me to avoid them for non-technical questions.
Do you have examples of this? I feel I'm able to get decent answers around politics from all the main chatbot providers; the key is in the prompting and then applying critical thinking while reading the response.
That said, there is no such thing as an objective unbiased political opinion. Chinese LLMs may have issues with the events of 1989, but Western LLMs have their blind spots too.
Not off the top of my head... just on occasion I'd ask them to summarize out of curiosity. The most recent was asking which option various people from history might pick on the red vs. blue button meme circulating this past week.
The differences between Claude, OpenAI and Grok can be very interesting to say the least. I feel that Grok tends to do better with recent/current events, and I find Claude a bit more balanced on historical events. Just my own take.
> That said, there is no such thing as an objective unbiased political opinion.
That depends; some things (but not many) are straightforward enough that you can derive conclusions purely from first principles reasoning.
If you walk a model like ChatGPT through that reasoning, you’ll often wind up in a spot where the model readily admits that a clear conclusion is logically entailed but it is absolutely forbidden from uttering it.
What’s more telling is how it becomes increasingly difficult to hold the model to strict first principles reasoning the closer you get to the forbidden entailment. It will smuggle in unsupported assumptions, apply asymmetric standards of evidence, strawman the position and argue against that, etc.
It requires a great deal of careful effort to point out its formal fallacies without biasing the result, and in the end, you wind up with it admitting it simply can’t say what it has proven.
I work in formal methods/verification and this is one of my usual litmus tests when a new model comes out.
It's not just politics. A while ago, as an experiment, I wrapped some teleological[^1] questions in a small story of a demon offering a slightly ambiguous bargain to a person. Then I had a lot of fun having the frontier models evaluate whether the demon was "good" or "bad". ChatGPT ranked as a rancid right-wing conservative ready to burn somebody at the stake, while Opus's reasoning was chill. Interestingly, both models could clearly "understand" the deal, i.e. reason about its final consequences for the trapped soul, but ChatGPT moralized a lot and made about as much sense as a stubborn priest.
[^1]: https://www.dictionary.com/browse/teleology
Should throw Grok/xAI in the mix sometime.
> All of these tools ... will inevitably be subject to manipulation.
I have often wondered about the legality of such manipulation. As AI becomes used for increasingly important things, it becomes increasingly valuable to make a system serve the needs of someone other than its owner.
Yes, these models apply their knowledge non-deterministically. We need to be aware and ready to handle their 'behaviours', but that doesn't mean they are not useful - I feel like anti-AI advocates are rushing to find issues.
It reminds me of the early internet days and everyone making a big deal about the anonymity of internet forums and safety... sure, it is an issue.
Do you not think it's an issue when the name of a country is replaced with an entirely different country's name in the AI output? The problem is manifest. It's right there. You can see it, can't you?
The most recent episode of John Oliver showed a user getting instructions for making a bomb, and an AI advising teenagers not to talk to their parents about suicidal thoughts.
I know you aren’t denying issues exist, but companies aren’t handling the issues (their PR around it is disturbing) and regulation is too far behind.
There’s a relatively obvious constraint to check here: compositing the layers back together should produce a (near) identical image. Would it not be preferable to throw an error if the model fails to faithfully segment the image?
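For what it's worth, a minimal sketch of that check (names and tolerance are illustrative; assumes numpy, RGBA layers ordered back to front, and an RGB original):

    import numpy as np

    def check_recomposition(original: np.ndarray, layers: list[np.ndarray],
                            tolerance: float = 1.0) -> None:
        """Alpha-composite the extracted layers back to front and fail
        loudly if the result drifts from the original image."""
        canvas = np.zeros(original.shape, dtype=np.float64)
        for layer in layers:
            alpha = layer[..., 3:4].astype(np.float64) / 255.0
            canvas = canvas * (1.0 - alpha) + layer[..., :3].astype(np.float64) * alpha
        # Mean absolute per-channel error between the recomposite and the source.
        mae = np.abs(canvas - original.astype(np.float64)).mean()
        if mae > tolerance:
            raise ValueError(f"recomposited image deviates from original (MAE={mae:.2f})")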
This is not by accident!
There are a lot of smart and talented people working hard to embed Hasbara into LLMs.
So Adobe is complicit in ethnic cleansing now?
We have to stop acting like these things "think"; it leads to really weird misinterpretations of the output as "meaning" things.
For example, they will occasionally replace "colour" with "color". Why? Because both occur in the training data in the "same role" but "color" is, apparently, more common[1]. You can also trick them into replacing things like "sardines" with "anchovies" (on pizza) and "head of lettuce" with "cabbage" in the context of rowboats.
They are lossy text-compressing parrots, and we are all suffering from a massive, madness-of-crowds-scale Eliza Effect.
[1] Yep. https://books.google.com/ngrams/graph?content=color%2C+colou...
This feels very different, because there is no powerful political force trying to squelch discussion of colour or sardines. But there are lots of powerful folks trying to avoid discussions about Gaza or Palestine and related things. It's to their advantage to have tools hide those words.
There are also an awful lot of people trying to push it/publicize it.
When a company packages this tool up and makes it part of their product they are taking some of that responsibility. The end user isn't supposed to need to know what an LLM is or how it works, that's what they're paying Canva for.
There are trillions of dollars riding on the claim that they do in fact think, and a bunch of people here have their lottery tickets tied up in that, so good luck with that.
Don’t worry, goalpost shifting will ensure that no matter how useful LLMs get, there will always be a large contingent of people who insist that anything non-human is not thinking, just sparkling cognition.
LLMs are not/will never be thinking though, no matter how good they get? You could potentially argue that there is some level of cognition during the training phases (as long as that isn't being outsourced to humans, anyway), but generation of output is stochastic selection of the most common (/highly ranked if tuned) following patterns. They cannot learn things outside of training, nor do they actually "know" things. To use the parrot example from above, a parrot doesn't "know" what the words it's been taught to mimic are, nor does an LLM "know" what the concept of love is; it's just been trained to regurgitate the words that humans use to describe such a thing. This isn't a criticism of LLMs - that's what they're supposed to do - but it's certainly not cognition.
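To make that concrete, the decoding step I mean is roughly the following (a minimal sketch of temperature sampling over next-token scores, assuming numpy; not the full transformer forward pass):

    import numpy as np

    def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
        """Sample the next token id from the softmax of the model's logits.
        As temperature -> 0 this approaches greedy "most likely" selection;
        higher values spread probability mass over lower-ranked tokens."""
        scaled = logits / max(temperature, 1e-8)
        scaled -= scaled.max()  # subtract the max for numerical stability
        probs = np.exp(scaled)
        probs /= probs.sum()
        return int(np.random.choice(len(probs), p=probs))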
They factorize the distribution they are trained on, which is essentially generalization.
https://arxiv.org/abs/2602.02385
You’re assuming that thinking requires learning, which I don’t necessarily agree with. Humans can have brain damage which inhibits the formation of long term memories, but such people can still function in the world. Would you say the thing such a person’s brain is doing is something other than thinking?
At any rate, just because the architecture of current LLMs doesn’t support learning at inference time does not constitute a fundamental limit that can never be changed, just a local maximum that has worked well to productize the approach.
And I’m quite certain that once systems that include post-training learning exist people like you will find a way to distinguish that from human learning, moving the goalposts again. You’re not arguing in good faith, you have an essentially religious opinion and you will stick to it as long as you are able.
> but generation of output is stochastic selection of the most common (/highly ranked if tuned) following patterns

This is not an accurate description of the transformer architecture. I'm not surprised that you are misinformed about this.