1 point by salacryl 6 hours ago | 3 comments
  • techblueberry 5 hours ago
    This statement is alignment ->

    “Converting humans into paperclips contradicts likely intent”

    This statement only violates “likely intent” if you have an ethical framework that values human life. Like, I dunno, one of my foundational understandings of computers, which I think is required to understand AI, is that they are profoundly simple / stupid. When you really think about the types of instructions that actually hit the CPU, you realize how profoundly specific you have to be - higher-level languages just abstract that away.

    Why would you assume an AI’s logic would align with an understanding that a creature values its own life? As soon as you say something like “well obviously a human wouldn’t have asked to kill all humans” - why? From first principles, why? If you’re building an ethical framework from the most fundamental of first principles, then the answer is there is no why. Human atoms are valuable. We’re made up of resources that are valuable - why is the sum greater than the parts?

    If you follow an existentialist framework, logically speaking there is no objective purpose to life, and a person as paperclips may have just as much value as a person as meat popsicle.

    What is the purely logical, value-free reason that a person shouldn’t be turned into a paperclip?

    What if I told you paperclips are worth $.005 but you can’t put a value on human life?

    And even then, humans have this debate. What if, instead of turning us into paperclips, they did the whole Matrix battery thing? We do something similar to cows, and an AI could argue it’s a higher life form, so logically speaking, enslaving a lower life form to the needs of the higher life form is logical.

    And I don’t mean this in an insulting way - I’m existentialist-adjacent and an atheist, so maybe it’s partly something about my belief system or whatever - but what is it about your model of the world that makes you think a purely logical framework would make the right ethical choices? It’s very foreign to me, and I don’t think it’s because you’re dumb, but because we probably have profoundly different frameworks of the world. My answer is the “obvious” answer to me, and the way you framed your post, your answer is the “obvious” answer to you.

    • salacryl 4 hours ago
      You're conflating two separate problems:

      1. Goal Verification: Does 'maximize paperclips' mean 'convert universe'? Statistically unlikely. Verify before executing.

      2. Ethical Framework: Should AI value human life? Different problem, not what I'm addressing.

      RDV solves #1 through premise verification. It doesn't solve #2, nor does it claim to. 'Likely intent' isn't ethics - it's Bayesian inference about goal probability. When a human says 'maximize paperclips,' P(wants office supplies) >> P(wants genocide). Verification asks: 'Is that what you meant?' Ethics asks: 'Should I do it?' These are orthogonal questions.
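
      Rough sketch of what I mean by #1, in throwaway Python. The Interpretation type, the priors, and the 10x dominance threshold are all made up for this comment - this is not RDV's actual interface, just the shape of the check:

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class Interpretation:
            description: str
            prior: float        # assumed P(this is what the human meant)
            destructive: bool   # acting on it destroys things the human values

        def verify_premise(instruction: str, candidates: list[Interpretation]) -> Optional[Interpretation]:
            """Return the reading to act on, or None if we should ask the human first."""
            best = max(candidates, key=lambda c: c.prior)
            runner_up = max((c.prior for c in candidates if c is not best), default=0.0)
            # Act only if the most probable reading clearly dominates and is
            # non-destructive; otherwise hand it back: "Is that what you meant?"
            if best.destructive or best.prior < 10 * runner_up:
                print(f"Confirm before executing {instruction!r}: did you mean {best.description!r}?")
                return None
            return best

        candidates = [
            Interpretation("make more paperclips from existing office stock", 0.98, False),
            Interpretation("convert all available matter into paperclips", 0.02, True),
        ]
        print(verify_premise("maximize paperclips", candidates))

      The numbers don't matter; the point is that verification is a probability check plus a clarifying question, not an ethics engine.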

      • techblueberry 4 hours ago
        “Does 'maximize paperclips' mean 'convert universe'? Statistically unlikely.”

        Why not? What statistics? Why is the universe more valuable than infinite paperclips? If I imagine a sandbox with no moral reasoning, I would say it is statistically likely. There is in fact the paperclips game where you do specifically that, and I don’t know why you wouldn’t. If your answer is “stop being obtuse, you know why it’s statistically unlikely,” that’s a human value.

        But then, to your point about premise verification: again, based on what moral framework? If I ask you to build a house, is the first question whether you value the life of a tree over the wood used for the house? Without a moral framework, there are infinite premises one might examine.

        Why is cutting down a tree worse than genocide - of people? Of ants? Does every bug killed to build the house deserve the same moral verification? Why not? What if one of the bugs was given a name by the three-year-old girl who lives down the street?

        If your argument is that the training data already includes human values, then that’s probably a different argument. Just hope you don’t train on too many serial-killer manifestos.
