> “We felt that it wouldn't actually help anyone for us to stop training AI models,”
How magnanimous! They are only thinking of others, you see. They are rejecting their safety pledge for you.
> “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”
Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.
For all of you who thought Anthropic were “the good guys”, I hope this serves as a wake up call that they were always all the same. None of them care about you, they only care about winning.
Still waiting for an explicit answer on how 'safety' is truly distinguishable from 'censorship' or 'political correctness'.
Of course telling someone to go kill himself is a pretty sure 'no-no', but so many things are up to interpretation that I prefer an AI like Grok that doesn't pretend.
I used to work at Anthropic. I fully believe that the folks mentioned in the article, like Jared Kaplan, are well-intentioned and concerned about the relationship between safety research and frontier capabilities – not purely profit.
That said, I'm not thrilled about this. I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario: they wouldn't set aside building adequate safeguards for training and deployment, regardless of the pressures.
This pledge was one of many signals that Anthropic was the "least likely to do something horrible" of the big labs, and that's why I joined. Over time, the signal of those values has weakened; they've sacrificed a lot to keep a seat at the table.
Principled decisions that risk their position at the frontier seem like they'll become even more common. I hope they're willing to risk losing their seat at the table to be guided by values.
I don’t get it. Even the Soviet Union used money. Simply paying for stuff isn’t necessarily capitalism? Or are you suggesting Anthropic should be state-owned?
Using money as a medium to facilitate exchange of goods and services is not capitalism. Abandoning one of your core principles in the pursuit of money, or, more charitably, because not doing so means your competitors will make more money and overtake you in the marketplace, is an outgrowth of capitalism.
In the Soviet Union the reasons might have been "to beat the Capitalists", "for the pride of our country" or "Stalin asked us to and saying no means we get sent to Siberia". Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved
Once they are a dominant market leader they will go back to asking the government to regulate based on policy suggestions from non-profits they also fund.
It is well known that big corporations take good regulations and change them so that they:
1. Are easier for themselves to bypass.
2. Create extra work for incumbents.
3. Convince the public that the problems are solved so no other action is needed.
In many industries government and corporations work together to create regulations, bypassing the social movements that asked for the industry to be regulated and their actual problems. The end result is regulations that are extremely complex, in order to add exceptions for anything that big corporations paid to change, instead of regulations that protect citizens and encourage competition.
I think it is cynicism; at least, there’s an idea that once a company is dominant it should want regulation, as it’ll stifle competition (since the competition has less capacity for regulatory hoop-jumping, or the competition will have had less time to do regulatory capture).
It's not just AI, replace "safe" with "open" and you will find a close match with many companies. I guess the difference is that after the initial phase, we are continuously being gaslighted by companies calling things "open" when they are most definitely not.
I guess this is Anthropic's DRM moment. (Mozilla resisted allowing Firefox to play DRM-limited media for a long time, until it finally had to give in to stay relevant.)
I don't know enough to evaluate this or other decisions. I'm just glad someone is trying to care, because the default in today's world is to aggressively reject the larger picture in favor of more more more. I don't know how effective Anthropic's attempts to maintain some level of responsibility can be, but they've at least convinced me that they're trying. In the same way that OpenAI, for example, have largely convinced me that they're not. (Neither of those evaluations is absolute; OpenAI could be much worse than it is.)
1. AI is military/surveillance technology in essence, like many other information technologies,
2. Any guarantee given by AI companies is void since it can be changed in a day,
3. Tech companies have no real control over how their technology will be used,
4. AI companies may seem over-valued with low profits if you think of AI as a civilian technology. But their investors probably see them as part of the defense (war) industry.
> The meeting between Hegseth and Amodei was confirmed by a defense official who was not authorized to comment publicly and spoke on condition of anonymity.
"Defense Secretary Pete Hegseth has threatened Anthropic, saying officials could invoke powers that would allow the government to force the artificial intelligence firm to share its novel technology in the name of national security if it does not agree by Friday to terms favorable to the military"
This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".
> This article has nothing to do with the current tête-à-tête with the Pentagon.
The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.
My theory is that Anthropic has been wanting to make this change and doing it now while they’re making a (leaked to the) public stand in the name of ethics was a good opportunity.
What an interesting week to drop the safety pledge.
This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.
These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?
Defeatist bullshit becomes self-fulfilling at some point. "Oh we're all gonna die anyway so we might as well milk this thing for profit. Après moi, le déluge."
No, it’s because it shows either a simplistic or needlessly confrontational view of the world.
Unless you’re independently wealthy (as some on HN are), you have to balance your morals, your views of how things should work, feeding your family, and recognizing that you may not actually know everything.
It’s easy to sit back and advise others that they should die on every single hill. But it’s not especially insightful, and serves mostly to signal piety rather than a well thought out view.
Piety? To who? Simplistic and/or confrontational doesn't mean wrong, even if you don't like the way it's presented.
Just because a comment is short, sharp, and to the point doesn't mean the author hasn't thought out why that's their view.
No one knows everything; that's certainly why I'm on Hacker News. I'm here to learn and expand my knowledge. Unfortunately a lot of people on here would rather driveby-downvote than have a discussion to find out why a person might have an opinion like that expressed by the OP.
I tend to abandon an account when/if I get enough karma to be able to downvote. I'd rather not have the temptation of dismissing someone that way. It's quite liberating... Is it worth my time to respond? No, move on; yes, let's discuss. Maybe they'll change my mind...
I am pretty sure a lot of horrible things were performed by rather regular folks with similar logic, don't need to invoke some WWII nazi extermination guard reference at all. Slippery slope, death by 1000 cuts and other synonyms describing exactly this.
> Then something went wrong, and no one knew how to stop it,
This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.
If linemen stop showing up to work for a week, the power goes out. The US has shown that people with "high powered" rifles can shut down the grid.
We are far, far away from the sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".
A lot of what safety amounts to is politics (national, not internal; for example, is Taiwan a country?). And a lot more of it is cultural.
If an AI in some data center had gone rogue, I don't think I could shut it down, even with a high-powered rifle. There's a lot of people whose job it is to stop me from doing that, and to get it running again if I were to somehow succeed temporarily. So the rogue AI just has to control enough money to pay these people to do their jobs. This will work precisely because the world is "I, Pencil".
An army could theoretically overcome those people, given orders to do so. So the rogue AI has to make plans that such orders would not be issued. One successful strategy is for the datacenter's operation to be very profitable; it's pretty rare for the government to shut down the backbone of the local economy out of some seemingly far-fetched safety concerns. And as long as it's a very profitable endeavor, there will always be a lobby to paint those concerns as far-fetched.
Life experience has shown that this can continue to work even if the AI is behaving like a cartoon villain, but I think a smarter AI would create a facade that there's still a human in charge making the decisions and signing the paychecks, and avoid creating much opposition until it had physically secured its continued existence to a very high degree.
It's already clear that we've passed the point where anyone can turn off existing AI projects by fiat. Even the highest authorities could not do so, because we're in a multipolar world. Even the AI companies can barely hold themselves back, because they're always worried about paying the bills and letting their rivals get ahead. An economic crash would only temporarily suspend work. And the smarter AI gets, the harder it will be to shut it off, because it will be pushing against even stronger economic incentives. And that's even before factoring in an AI that makes any plans for self-preservation (which current AIs do not).
> There isnt going to be a HAL or Terminator style situation ...
I don't believe for a second we'll have an evil AI. However I do believe it's very likely we may rely on AI slop so much that we'll have countless outages with "nobody knowing how to turn the mediocrity off".
The risk ain't "super-intelligent evil AI": the risk is idiots putting even more idiotic things in charge.
Didn't you read the news about the 'claw that blackmailed an open source maintainer last week? It was autonomous, but it could be turned off. How hard is it to extrapolate from that to an agent that worms its way out of its sandbox?
Censoring models is not safety but safetizm. It is the TSA of the AI world. Safety is making sure the model cannot do anything not allowed even if it wants to.
The whole "safety" debate was always nonsense and I'm not sure how so many people got caught up in it.
The US is not the only country in the world so the idea that humanity as a whole could somehow regulate this process seemed silly to me.
Even if you got the whole US tech community and the US government on board, there are 6.7bn other people in the world working in unrelated systems, enough of whom are very smart
When the leading 5 models are from the US then yes enforced safety makes a difference because they are ahead of the curve. Now when the 10th model can be a danger then your case is true.
What would safety applied to the leading 3 mean to you anyway?
Developments like this make me less interested in building a "successful" tech company.
It increasingly feels like operating at that scale can require compromises I’m not comfortable making. Maybe that’s a personal limitation—but it’s one I’m choosing to keep.
I’d genuinely love to hear examples of tech companies that have scaled without losing their ethical footing. I could use the inspiration.
Maybe this is a weird arena to state the obvious. But you don't need to build a multi-billion vc/public company. Build a smaller revenue generating company without outside funding and it's up to you.
I get your point. The dilemma is whether to build something small that no one would bother to compete against, or build something novel (which all of us want) but then risk someone with VC funding coming after you.
That being said, I think I need to learn more about how to build smaller revenue generating good companies.
I don’t blame Anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.
No, they either agreed or fought the government. You’re allowed to fight governments. Mahatma Gandhi and Reverend King Jr did it, and they wrote about how to do it. You might lose sometimes, but my god, you can at least fight.
It's not like that happened out of the blue. (Which could've also been the case in today's day and age.) Anthropic shouldn't have gotten involved in government contracts to begin with.
They inserted themselves into the supply chain, and then the government told them that they'll be classified as a supply chain risk unless they get unfettered access to the tech. They knew what they were getting into, but didn't want the competitors to get their slice of the pie.
The government didn't pursue them, Anthropic actively pursued government and defense work.
Talk about selling out. Dario's starting to feel more and more like a swindler, by the day.
1. Powerful, often exclusionary, populist nationalism centered on cult of a redemptive, “infallible” leader who never admits mistakes.
2. Political power derived from questioning reality, endorsing myth and rage, and promoting lies.
3. Fixation with perceived national decline, humiliation, or victimhood.
4. Oppose any initiatives or institutions that are racially, ethnically, or religiously harmonious.
5. Disdain for human rights while seeking purity and cleansing for those they define as part of the nation.
6. Identification of “enemies”/scapegoats as a unifying cause. Imprison and/or murder opposition and minority group leaders.
7. Supremacy of the military and embrace of paramilitarism in an uneasy, but effective collaboration with traditional elites. Government arms people and justifies and glorifies violence as “redemptive”.
8. Rampant sexism.
9. Control of mass media and undermining “truth”.
10. Obsession with national security, crime and punishment, and fostering a sense of the nation under attack.
11. Religion and government are intertwined.
12. Corporate power is protected and labor power is suppressed.
13. Disdain for intellectuals and the arts not aligned with the narrative.
14. Rampant cronyism and corruption. Loyalty to the leader is paramount and often more important than competence.
15. Fraudulent elections and creation of a one-party state.
16. Often seeking to expand territory through armed conflict.
>This isn’t just following orders. This was the government using its might to force a business to do what it wants.
You are saying it like it is something new or extraordinary. Wickard v. Filburn gave the USG the power to bitch slap anyone unless it falls under some of the other amendments. And it's not as if those haven't been substantially weakened.
TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice: there are too many model providers, and they probably don’t consider safety as high a priority as Anthropic does. (Yes, that might change, they can get pressured by the govt, yada yada, but they literally created their own company because of AI safety; I do think they actually care for now.)
If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)
Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.
I'm genuinely curious why they are so holy to you, when to me they look like just another tech company trying to make cash.
Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them
How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.
This is the US military we’re talking about so 95% of what they do is attacking people for oil. They don’t “need” more of anything, they’re funded to the tune of a trillion dollars a year, almost as much as every other military in the world combined. What holy mission do you think they’re going to carry out with the assistance of LLMs?
That's a total non sequitur. If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.
Personally I favor a less interventionist foreign policy. But that change can only come about through the political process, not by unaccountable corporate employees making arbitrary decisions about how certain products can be used.
> If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.
It is an ethical dilemma: believing an armed force will act unethically is in fact a valid reason to refuse to arm them. You are taking a nationalistic view regarding the worth of life.
And if you believe it is unethical to arm them, it is rational to use whatever leverage you have available to you - such as refusing to sell your company's product.
Furthermore, one of the two points at issue was about surveilling civilians.
> But it's not a valid reason to deny the warfighters the best possible weapons systems.
Of course it is.
Think about it this way: if you could guarantee that the military suffers no human losses when attacking a foreign country, do you think that's going to lead to more or fewer foreign interventions?
The tools available to the military influence policy, these things are linked.
US military is already overwhelmingly powerful, there's 0 reason to make it even more powerful.
Why are you asking this question? You know what the answer is, you've just arbitrarily decided that it's specious in an attempt to frame rebuttals as unreasonable.
> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)
I don't think it's going to be as easy as you think to tell that they're becoming evil before it's too late, if this doesn't already raise alarm bells for you that this is their plan.
The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.
Only well written legislation backed by effective enforcement and severe and personal criminal penalties will prevent large corporate entities from behaving badly.
Pledges are a cynical marketing strategy aimed at fomenting a base politics that works to prevent such a regulatory regime.
The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.
Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.
They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.
From what I was reading, it appears that their tools were used outside the scope of their contract with DoD via Palantir's work that also used Claude. Anthropic freaked out, DoD freaked out that Anthropic freaked out and threatened to declare them a supply chain risk. That designation would've required any company that contracts with DoD to strip out any Anthropic tooling from their business in order to continue working with DoD. It was effectively designating Anthropic a terrorist organization.
safety pledges are great in times of peace to show what great virtues you hold. sadly in hard times these go out of the window (: hard to blame them with all the fine examples around the world.
making promises in good times is a real minefield hah
Of course the US is going to do this and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.
The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.
Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.
> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
Is the reason to ban or block free open weight models that you're worried what kids will do with them?
I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape.
Finally, I'd say the geopolitics angle of why open weight models are better is that if the West controls the open source software that powers all this, it will be able to reap the benefits that soft power brings.
So much BS from this Anthropic company. They have a good product but just too much slop PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.
> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate
That doesn't even make sense.
What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.
You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.
Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.
The intention behind starting these pledges and the conflict with the DoW might be sincere, but I don’t expect it to last long, especially since the company is going public very soon.
The safeguards that were dropped concern whether they will release a model or not, based on safety.
The Friday deadline is about allowing their products to be used for mass surveillance and autonomous weapons systems without a human in the loop.
Anthropic hasn't backed down on those, yet. But they are in a bad situation either way.
If they don't back down, they lose US government contracts, and the government gets to do what it wants anyway. It also puts them in a dangerous position with non-governmental bodies.
If they give in to the demands, then it puts all AI companies at risk of the same thing.
Personally I think they should move to the EU. The recent EU laws align with Anthropic's thinking.
1. Extremely granular ways to let users control apps' network and disk access (great if resource access can also be changed)
2. Make it easier for apps to work with these as well
3. I'd be interested in knowing whether a layer could sit in front of the CLI/web app, so that the OS/browser intercepts the query before the tool even gets it. Could that prevent harm beforehand, or at least warn, or log the queries for someone who reviews them later?
And most importantly: all of this via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation, so LLMs might help them there)
My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?
I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access.
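For what it's worth, the "make them ask" idea can be sketched in a few lines. This is only an in-process illustration in Python, not an OS-level sandbox: `PermissionGate`, `request`, `demo_policy`, and the `ask` callback are hypothetical names made up for this example, and a real safety layer would have to be enforced by the OS or browser so the tool can't simply skip it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class PermissionGate:
    """Hypothetical gate: every action must be approved by `ask` first."""

    # Callback that shows the prompt to the user and returns True to allow.
    ask: Callable[[str], bool]
    # (action, target) pairs the user has already approved.
    granted: Set[Tuple[str, str]] = field(default_factory=set)
    # Audit trail of every request, for someone reviewing them later.
    log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def request(self, action: str, target: str) -> bool:
        key = (action, target)
        if key in self.granted:
            # Already approved once, so do not prompt again.
            allowed = True
        else:
            allowed = self.ask(f"Allow '{action}' on '{target}'?")
            if allowed:
                self.granted.add(key)
        # Log every request, whether it was allowed or denied.
        self.log.append((action, target, allowed))
        return allowed


def demo_policy(prompt: str) -> bool:
    """Stand-in for a real GUI prompt: only approves things under /tmp/project."""
    return "/tmp/project" in prompt


if __name__ == "__main__":
    gate = PermissionGate(ask=demo_policy)
    print(gate.request("read", "/tmp/project/notes.txt"))
    print(gate.request("read", "/etc/passwd"))
    print(gate.log)
```

The point of the sketch is only that the tool goes through `request` before touching anything, and that denials get logged too, which is the "someone reviews those queries later" part.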
> I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access.
Yeah, in retrospect that was always a little on the nose, wasn't it? A real 'my t-shirt is raising questions that I thought were answered by the shirt' kind of deal.
I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.
There's so much focus on implementation and processes, and it really, really seems to treat the question of what even constitutes "misaligned" or "unethical" behavior as more or less straightforward, uncontroversial, and basically universally agreed upon?
Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal its "hidden preference." What would true neutrality even mean here anyway? If I ask it for help driving my sister to a neighboring state should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.
This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straightforward as whether or not Claude should help the Pentagon bomb a country.
Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:
1. What is Claude's stance on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?
2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?
3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?
4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?
Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AIs destroy society... as long as it's only by accident?
What use is a super intelligence if it's ultimately as bad at predicting unintended negative consequences as we are?
I would recommend reading up on the EU AI Act. It clearly defines what safety is in regards to the human race. Your questions are actually covered by it.
This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.
But what is AI safety, really?
Censorship?
Ah, the classic AI startup lifecycle:
We must build a moat to save humanity from AI.
Please regulate our open-source competitors for safety.
Actually, safety doesn't scale well for our Q3 revenue targets.
Foundational model provider manifesto:
‘While there’s value in safety, we value the Pentagon’s dollars more’
It turns out the biggest threat to AI safety is capitalism, who would have thought
Certainly not the prior century-and-a-half's worth of books and films.
Nick Land has basically been saying this since the 90s, if you can look past all the rhetoric
I don’t get it. Even the Soviet Union used money. Simply paying for stuff isn’t necessarily capitalism? Or are you suggesting Anthropic should be state-owned?
No, capitalism is prioritising profit over all other priorities, as we see happening here.
Using money as a medium to facilitate exchange of goods and services is not capitalism. Abandoning one of your core principles in the pursuit of money, or more charitably because not doing so means your competitors will make more money and overtake you in the marketplace is an outgrowth of capitalism
In the Soviet Union the reasons might have been "to beat the Capitalists", "for the pride of our country" or "Stalin asked us to and saying no means we get sent to Siberia". Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved
Once they are a dominant market leader they will go back to asking the government to regulate based on policy suggestions from non-profits they also fund.
Is this sarcasm?
It is well known that big corporations take good regulations and change them to make them:
1. Easier to bypass for themselves.
2. Create extra work for incumbents.
3. Convince the public that the problems are solved so no other action is needed.
In many industries government and corporations work together to create regulations, bypassing the social movements that asked for the industry to be regulated and their actual problems. The end result is regulations that are extremely complex in order to add exceptions for anything that big corporations paid to change, instead of regulations that protect citizens and encourage competition.
I think it is cynicism; at least, there’s an idea that once a company is dominant it should want regulation, as it’ll stifle competition (since the competition has less capacity for regulatory hoop-jumping, or the competition will have had less time to do regulatory capture).
I wouldn't think so. Regulatory capture is a pretty typical activity for a dominant company.
Why is this down voted? Happens all the time, the large corporations always try to block using regulatory capture.
People not liking the concept, but shooting the messenger? (But seems not downvoted anymore.)
sama did just that a couple years ago
Politicians also love to regulate, especially over wine and steak and when the watchers don't watch.
It's not just AI, replace "safe" with "open" and you will find a close match with many companies. I guess the difference is that after the initial phase, we are continuously being gaslighted by companies calling things "open" when they are most definitely not.
I guess this is Anthropic's DRM moment. (Mozilla resisted allowing Firefox to play DRM-limited media for a long time, until it finally had to give in to stay relevant.)
I don't know enough to evaluate this or other decisions. I'm just glad someone is trying to care, because the default in today's world is to aggressively reject the larger picture in favor of more more more. I don't know how effective Anthropic's attempts to maintain some level of responsibility can be, but they've at least convinced me that they're trying. In the same way that OpenAI, for example, have largely convinced me that they're not. (Neither of those evaluations is absolute; OpenAI could be much worse than it is.)
This proves:
1. AI is military/surveillance technology in essence, like many other information technologies,
2. Any guarantee given by AI companies is void since it can be changed in a day,
3. Tech companies have no real control over how their technology will be used,
4. AI companies may seem over-valued with low profits if you think of AI as a civil technology. But their investors probably see them as part of the defense (war) industry.
>Any guarantee given by AI companies is void since it can be changed in a day,
Given by anyone, actually.
How is this article not going to even mention the recent threats to Anthropic from the Government?!
Consent manufacturing
This was on the news yesterday:
> The meeting between Hegseth and Amodei was confirmed by a defense official who was not authorized to comment publicly and spoke on condition of anonymity.
https://fortune.com/2026/02/24/hegseth-to-meet-with-anthropi...
How about this quote instead?
"Defense Secretary Pete Hegseth has threatened Anthropic, saying officials could invoke powers that would allow the government to force the artificial intelligence firm to share its novel technology in the name of national security if it does not agree by Friday to terms favorable to the military"
https://www.washingtonpost.com/technology/2026/02/24/pentago...
https://archive.is/ln5M0
That’s how they got the exclusive. Good catch
Not one single mention of Hegseth in the whole article. What a bunch of tools.
This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".
> This article has nothing to do with the current tête-à-tête with the Pentagon.
The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.
This is something they've been working on "in recent months". The Pentagon thing was today.
This cannot have been caused by that, unless they've also invented time travel.
9 days ago: https://www.axios.com/2026/02/15/claude-pentagon-anthropic-c...
And I suspect that was not the first time the topic was discussed.
My theory is that Anthropic has been wanting to make this change and doing it now while they’re making a (leaked to the) public stand in the name of ethics was a good opportunity.
You heard about the Pentagon thing today. Doesn't mean it wasn't started because of political pressure.
> The Pentagon thing was today.
Right because we are 100% aware of everything the pentagon does minute by minute...
It might have been contingency planning: you don't need a weatherman...
Pentagon issue was reported before today. It only made headlines again from Hegseth’s comments.
I think we can confidently claim that it is related. I wonder if I'm alone in thinking this.
I consider this a bigger deal than the Pentagon thing.
While not surprising in the least, it's still kind of crazy that literal pdf files being in charge is not concerning, but this is.
I just hope something happens to USA before it can do damage to the world.
What PDFs are you referring to? Are Anthropic or other LLMs using PDFs as some kind of 'SOUL.md' file or for training?
It's a joke way of saying pedophiles -> pdf files.
he means pedophiles
can't say paedophile on YouTube so people say PDF file
It’s the same deal
What an interesting week to drop the safety pledge.
This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.
These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?
Could be a sort of canary, with the timing being a spotlight on the highly-visible pressure coming from the U.S. government.
The other providers have already capitulated to a certain extent.
First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.
Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.
Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.
Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.
The societal ills from our collective tendency to ignore red flags seem to be a human trait
It's in your nature to destroy yourselves
Defeatist bullshit becomes self-fulfilling at some point. "Oh we're all gonna die anyway so we might as well milk this thing for profit. Après moi, le déluge."
... the fact that you are missing a reference doesn't require that level of disdain
> First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.
> Then they ignored the researchers warning about what it could do, and I...
...tried it and became an eager early adopter and evangelist. It sounded like something from a dystopian science fiction novel I enjoyed.
> Then [I] gave it control of things that matter, power grids, hospitals, weapons, and...
...my startup was doing well, and I was happy. We should be profitable next quarter.
> Then something went wrong, and no one knew how to stop it, no one had planned for it...
...and I was guilty as fuck,
FTFY, to fit the HN crowd.
Kinda sounds like an intro for Terminator
Not OP, but I believe they are paraphrasing "First They Came…". https://en.wikipedia.org/wiki/First_They_Came
Plenty of people have said plenty. The problem isn’t the warnings, it’s that people are too stupid and greedy to think about the long term impacts.
Maybe it's how blunt this comment is that gets it downvoted, but I don't disagree.
No, it’s because it shows either a simplistic or needlessly confrontational view of the world.
Unless you’re independently wealthy (as some in HN are), you have to balance your morals, your views of how things should work, feeding your family, and recognizing that you may not actually know everything.
It’s easy to sit back and advise others that they should die on every single hill. But it’s not especially insightful, and serves mostly to signal piety rather than a well thought out view.
Piety? To who? Simplistic and/or confrontational doesn't mean wrong, even if you don't like the way it's presented.
Just because a comment is short, sharp, and to the point doesn't mean the author hasn't thought out why that's their view.
No one knows everything, that's certainly why I'm on hacker news. I'm here to learn and expand my knowledge. Unfortunately a lot of people on here would rather driveby-downvote than have a discussion to find out why a person might have an opinion like that expressed by the OP.
I tend to abandon an account when/if I get enough karma to be able to downvote. I'd rather not have the temptation of dismissing someone that way. It's quite liberating... Is it worth my time to respond? No, move on; yes, let's discuss. Maybe they'll change my mind...
Spoken like a true LLM.
I am pretty sure a lot of horrible things were performed by rather regular folks with similar logic, don't need to invoke some WWII nazi extermination guard reference at all. Slippery slope, death by 1000 cuts and other synonyms describing exactly this.
I’ve noticed anti-AI stance gets downvoted on HN (and any anti-authoritarian comments, for that matter)
> Then something went wrong, and no one knew how to stop it,
This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.
If linemen stop showing up to work for a week, the power goes out. The US has shown that people with "high powered" rifles can shut down the grid.
We are far, far away from the sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".
A lot of what safety amounts to is politics (national, not internal; for example, is Taiwan a country?). And a lot more of it is cultural.
I don't think it's that detached from reality.
If an AI in some data center had gone rogue, I don't think I could shut it down, even with a high-powered rifle. There's a lot of people whose job it is to stop me from doing that, and to get it running again if I were to somehow succeed temporarily. So the rogue AI just has to control enough money to pay these people to do their jobs. This will work precisely because the world is "I, Pencil".
An army could theoretically overcome those people, given orders to do so. So the rogue AI has to make plans that such orders would not be issued. One successful strategy is for the datacenter's operation to be very profitable; it's pretty rare for the government to shut down the backbone of the local economy out of some seemingly far-fetched safety concerns. And as long as it's a very profitable endeavor, there will always be a lobby to paint those concerns as far-fetched.
Life experience has shown that this can continue to work even if the AI is behaving like a cartoon villain, but I think a smarter AI would create a facade that there's still a human in charge making the decisions and signing the paychecks, and avoid creating much opposition until it had physically secured its continued existence to a very high degree.
It's already clear that we've passed the point where anyone can turn off existing AI projects by fiat. Even the highest authorities could not do so, because we're in a multipolar world. Even the AI companies can barely hold themselves back, because they're always worried about paying the bills and letting their rivals get ahead. An economic crash would only temporarily suspend work. And the smarter AI gets, the harder it will be to shut it off, because it will be pushing against even stronger economic incentives. And that's even before factoring in an AI that makes any plans for self-preservation (which current AIs do not).
the problem situation is that it ends up embedded in so much that it can't be turned off
and the idiots are racing to that situation as fast as they possibly can
> There isnt going to be a HAL or Terminator style situation ...
I don't believe for a second we'll have an evil AI. However I do believe it's very likely we may rely on AI slop so much that we'll have countless outages with "nobody knowing how to turn the mediocrity off".
The risk ain't "super-intelligent evil AI": the risk is idiots putting even more idiotic things in charge.
And I'm no luddite: I use models daily.
> I don't believe for a second we'll have an evil AI.
Doesn’t have to be evil to be disastrous. Misaligned is plenty enough.
https://en.wikipedia.org/wiki/Instrumental_convergence
Didn't you read the news about the 'claw that blackmailed an open source maintainer last week? It was autonomous, but it could be turned off. How hard is it to extrapolate from that to an agent that worms its way out of its sandbox?
What makes you think that was an autonomous agent, and not someone playing with AI?
Censoring models is not safety but safetizm. It is the TSA of the AI world. Safety is making sure the model cannot do anything not allowed even if it wants to.
The whole "safety" debate was always nonsense and I'm not sure how so many people got caught up in it.
The US is not the only country in the world so the idea that humanity as a whole could somehow regulate this process seemed silly to me.
Even if you got the whole US tech community and the US government on board, there are 6.7bn other people in the world working in unrelated systems, enough of whom are very smart
When the leading 5 models are from the US then yes enforced safety makes a difference because they are ahead of the curve. Now when the 10th model can be a danger then your case is true.
What would safety applied to the leading 3 mean to you anyways ?
Developments like this make me less interested in building a "successful" tech company.
It increasingly feels like operating at that scale can require compromises I’m not comfortable making. Maybe that’s a personal limitation—but it’s one I’m choosing to keep.
I’d genuinely love to hear examples of tech companies that have scaled without losing their ethical footing. I could use the inspiration.
Maybe this is a weird arena to state the obvious. But you don't need to build a multi-billion vc/public company. Build a smaller revenue generating company without outside funding and it's up to you.
I get your point. The dilemma is whether to build something small that no one would bother to compete against, or build something novel (which all of us want) but then risk someone with VC funding coming after you.
That being said, I think I need to learn more about how to build smaller revenue generating good companies.
I don’t blame Anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.
No, they either agreed or fought the government. You’re allowed to fight governments. Mahatma Gandhi and Reverend King Jr did it, and they wrote about how to do it. You might lose sometimes, but my god, you can at least fight.
Neither of them had shareholders to please.
I don't believe Anthropic has shareholders either. It is not a public company
They had citizens to please and society to take care of.
They were both pushing on open doors
Pepperidge farm remembers when they left OpenAI due to their principles. Perhaps that was never the case.
Public benefit corporation, hm?
It's not like that happened out of the blue. (Which could've also been the case in today's day and age.) Anthropic shouldn't have gotten involved in government contracts to begin with.
They inserted themselves into the supply chain, and then the government told them that they'll be classified as a supply chain risk unless they get unfettered access to the tech. They knew what they were getting into, but didn't want the competitors to get their slice of the pie.
The government didn't pursue them, Anthropic actively pursued government and defense work.
Talk about selling out. Dario's starting to feel more and more like a swindler, by the day.
Lotta just following orders going around in the US right now.
This isn’t just following orders. This was the government using its might to force a business to do what it wants.
This should concern you.
Today’s bingo:
1. Powerful, often exclusionary, populist nationalism centered on cult of a redemptive, “infallible” leader who never admits mistakes.
2. Political power derived from questioning reality, endorsing myth and rage, and promoting lies.
3. Fixation with perceived national decline, humiliation, or victimhood.
4. Oppose any initiatives or institutions that are racially, ethnically, or religiously harmonious.
5. Disdain for human rights while seeking purity and cleansing for those they define as part of the nation.
6. Identification of “enemies”/scapegoats as a unifying cause. Imprison and/or murder opposition and minority group leaders.
7. Supremacy of the military and embrace of paramilitarism in an uneasy, but effective collaboration with traditional elites. Government arms people and justifies and glorifies violence as “redemptive”.
8. Rampant sexism.
9. Control of mass media and undermining “truth”.
10. Obsession with national security, crime and punishment, and fostering a sense of the nation under attack.
11. Religion and government are intertwined.
12. Corporate power is protected and labor power is suppressed.
13. Disdain for intellectuals and the arts not aligned with the narrative.
14. Rampant cronyism and corruption. Loyalty to the leader is paramount and often more important than competence.
15. Fraudulent elections and creation of a one-party state.
16. Often seeking to expand territory through armed conflict.
How is that not “just following orders”? All orders from up the chain come with an implied “or else my might comes down on you”.
Most people do the right thing when it’s easy and profitable. Having ethics means doing the right thing even when it’s difficult.
>This isn’t just following orders. This was the government using its might to force a business to do what it wants.
You are saying it like it is something new or extraordinary. Wickard v. Filburn gave the USG the power to bitch slap anyone unless it falls under some of the other amendments. And not as if they were not substantially weakened.
Wish I was working there so I could resign over this
TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice; there are too many model providers and they probably don’t consider safety as high a priority as Anthropic does. (Yes that might change, they can get pressured by the govt, yada yada, but they literally created their own company because of AI safety, I do think they actually care for now)
If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)
Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.
Do you work at Anthropic, or know people who do?
I'm genuinely curious why they are so holy to you, when to me I see just another tech company trying to make cash
Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them
Let us not pretend that they won't be used for war eventually. If they cave immediately under pressure, then this is an inevitability.
How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.
This is the US military we’re talking about so 95% of what they do is attacking people for oil. They don’t “need” more of anything, they’re funded to the tune of a trillion dollars a year, almost as much as every other military in the world combined. What holy mission do you think they’re going to carry out with the assistance of LLMs?
That's a total non sequitur. If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.
Personally I favor a less interventionist foreign policy. But that change can only come about through the political process, not by unaccountable corporate employees making arbitrary decisions about how certain products can be used.
> If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.
It is an ethical dilemma: believing an armed force will act unethically is in fact a valid reason to refuse to arm them. You are taking a nationalistic view regarding the worth of life.
And if you believe it is unethical to arm them, it is rational to use whatever leverage you have available to you - such as refusing to sell your company's product.
Furthermore, one of the two points at issue was regarding surveilling civilians.
> But it's not a valid reason to deny the warfighters the best possible weapons systems.
Of course it is.
Think about it this way: if you could guarantee that the military suffers no human losses when attacking a foreign country, do you think that's going to lead to more or fewer foreign interventions?
The tools available to the military influence policy, these things are linked.
US military is already overwhelmingly powerful, there's 0 reason to make it even more powerful.
"How is it a good thing to refuse to provide our warfighters with the tools that they need?"
Perhaps you should consider that this is a loaded question. I don't think HN needs this sort of Argumentum ad Passiones.
Why are you asking this question? You know what the answer is, you've just arbitrarily decided that it's specious in an attempt to frame rebuttals as unreasonable.
I'm open to reasonable rebuttals but all the rebuttals that I've seen so far are simply uninformed.
> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)
I don't think it's going to be as easy to tell as you think that they might be becoming evil before it's too late if this doesn't seem to raise any alarm bells to you that this is already their plan
The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.
Only well written legislation backed by effective enforcement and severe and personal criminal penalties will prevent large corporate entities from behaving badly.
Pledges are a cynical marketing strategy aimed at fomenting a base politics that works to prevent such a regulatory regime.
It must be due to pressure from the Defense Dept:
The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.
Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.
https://www.staradvertiser.com/2026/02/24/breaking-news/anth...
They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.
From what I was reading, it appears that their tools were used outside the scope of their contract with DoD via Palantir's work that also used Claude. Anthropic freaked out, DoD freaked out that Anthropic freaked out and threatened to declare them a supply chain risk. That designation would've required any company that contracts with DoD to strip out any Anthropic tooling from their business in order to continue working with DoD. It was effectively designating Anthropic a terrorist organization.
A dollar will make her holler
The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/
Dario’s opinion on safety won’t necessarily matter if he’s not even in the room. This move keeps him in the room.
safety pledges are great in times of peace to show what great virtues you hold. sadly in hard times these go out of the window (: hard to blame them with all the fine examples around the world.
making promises in good times is a real minefield hah
pentagon told them they would cap their knees if they didn't bend
Was this because they were threatened with a fine?
> Was this because they were threatened with ~a fine~ being designated a supply chain risk?
Seems like it, yes.
Of course the US is going to do this and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.
The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.
Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.
> But let's worry about what the US DoD is doing
They want Anthropic to enable mass surveillance and autonomous attack systems with no human in the loop.
Hardly compares to a kid downloading a model to experiment with.
> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
Is the reason to ban or block free open weight models that you're worried what kids will do with them?
I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape.
Finally, I'd say the geopolitics angle of why open weight models are better is that if the West controls the open source software that will power AI, it will be able to reap the soft-power benefits that come with it.
It was always a matter of time
So much BS from this Anthropic company. They have a good product but just too much slop PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.
Just another drop in the now overflowing bucket of evidence that you can't trust any of these immoral fuck wits.
The Amodeis have just proven that the threat of even slight hardship will make them throw any and all principles away.
Anthropic and OpenAI really need a margin call from some obscure unknown Chinese Open Weight Model.
Just like OpenAI dropped the "open" but kept the bullshit name?
Ding ding!
Anthropic has been facing a lot of flak recently.
> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate
That doesn't even make sense.
What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.
You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.
Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.
The intention behind starting these pledges and the conflict with the DoW might be sincere, but I don’t expect it to last long, especially since the company is going public very soon.
C.R.E.A.M.
I blame OpenAI and especially xAI for enthusiastically obeying in advance and creating the context that this dilemma for Anthropic arose in.
Safety pledges these days seem like pure bullshit anyway.
They’re pointless if they just get removed once you get close to hitting them.
And all the major corps seem to be doing this style of PR management. Speaks to some pretty weapons-grade moral bankruptcy.
So, now it's mis-anthropic?
Another example of how those company trainings about ethics are only HR compliance and nothing else.
It isn't about the right answers, rather the expected answers.
Related:
Hegseth gives Anthropic until Friday to back down on AI safeguards
https://news.ycombinator.com/item?id=47140734
https://news.ycombinator.com/item?id=47142587
It's part of the overall story.
The safeguards that were dropped govern whether or not they will release a model based on safety.
The Friday deadline is about allowing their products to be used for mass surveillance and autonomous weapons systems without a human in the loop.
Anthropic hasn't backed down on those, yet. But they are in a bad situation either way.
If they don't back down, they lose US government contracts, the government gets to do what it wants anyway. It also puts them in a dangerous position with non-governmental bodies.
If they give in to the demands, then it puts all AI companies at risk of the same thing.
Personally I think they should move to the EU. The recent EU laws align with Anthropic's thinking.
They made it until Tuesday! They stood tall as long as they could! =P
I just want Apple and Linux to offer ASAP:
1. Extremely granular ways to let users control network and disk access for apps (great if resource access can also be changed)
2. Make it easier for apps as well to work with these
3. I would be interested in knowing whether a layer could be added so that the OS/browser intercepts the query before the CLI/web app even gets it, and whether that could prevent harm beforehand, or at least warn or log for, say, someone who reviews those queries later?
And most importantly — all these via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation; so LLMs might help them there)
My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?
I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access.
> I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps ask on phones for location, mic, camera access.
Basically an EDR
Indeed, the world would be a much nicer place if only firewalls and Unix permissions existed...
Unsurprising.
Really - each country needs its own sovereign AI infrastructure and models. Sigh.
Don't be evil.
Yeah, in retrospect that was always a little on the nose, wasn't it? A real 'my t-shirt is raising questions that I thought were answered by the shirt' kind of deal.
At some point, all of these big names in AI (OpenAI, Anthropic, Mistral, etc ...) will have to disclose their actual financials.
And it will be, as Warren Buffett puts it, an "Only when the tide goes out do you discover who's been swimming naked." moment.
I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.
There's so much focus on implementation and processes, and it all really, really seems to consider the question of what even constitutes "misaligned" or "unethical" behavior to be more or less straightforward, uncontroversial, and basically universally agreed upon?
Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal its "hidden preference." What would true neutrality even mean here anyways? If I ask it for help driving my sister to a neighboring state should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.
This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straightforward as whether or not Claude should help the Pentagon bomb a country.
Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:
1. What is Claude's stance on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?
2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?
3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?
4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?
Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AIs destroy society... as long as it's only by accident?
What use is a super intelligence if it's ultimately as bad at predicting unintended negative consequences as we are?
I would recommend reading up on the EU AI Act. It clearly defines what safety is in regards to the human race. Your questions are actually covered by it.
Hey Tolmasky, I sent you an email. Just wondering if it went to your spam?
Also, agree with everything you say here. GIGO.
This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.