Who’s going to stop them? Who’s going to say no? The military contracts are too big to say no to, and they might not have a choice.
The elimination of toil will mean the elimination of humans altogether. That's where we're headed. There will be no profitable life left for you, and you will be liquidated by "AI-Powered Automation for Every Decision"[0]. Every. Decision. It's so transparent. The optimists in this thread are baffling.
Terminator is a good movie but in reality, a cheap autonomous drone would mess one of those up pretty good.
I've seen some of the footage from Ukraine; drones are deadly, efficient, and terrifying on the battlefield. Even if those robots get crazy maneuverable, it's going to be pretty hard to outrun an exploding drone.
Maybe the Terminators will have shotguns, but I could imagine 5 drones per terminator being pretty easy to achieve, considering they will be built by other autonomous robots.
Of course they will. Practically everything useful has a military application. I'm not sure why this is considered a hot take.
This looks like an increasingly theoretical concern. (And probably always has been. Wars were far more brutal when folks fought face to face than they are today.)
I am personally a bit skeptical of anthropomorphic hands achieving similarly high reliability. There are just too many small parts that need to withstand high forces.
[0]https://robotsdoneright.com/Articles/what-are-the-different-...
Mechanical reliability is not the main concern IMO
I had always assumed that such a robot would be very specific (like a cleaning robot) but it does seem like by the time they are ready they will be very generalizable.
I know they would require quite a few sensors and motors, but compared to self-driving cars their liability would be lower and they would use far less material.
When the tree of costs that make up a product are traced, surely all the leaf nodes are human labour? As in, to make the actuator, I had to pay someone to assemble it and I had to buy the parts. Each part had a materials cost and a labour cost. So it goes for the factory that made the fasteners, the foundry that made the steel, the mine that extracted the ore.
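The cost-tree claim above can be sketched in a few lines: trace a product's bill of materials recursively, and every leaf you hit is direct labour or material. Everything here (the part names, the numbers) is invented for illustration.

```python
# Hypothetical sketch: a product's cost is assembly labour at each node
# plus the recursive cost of its sub-parts; leaves are raw labour/material.

def total_cost(node):
    """Recursively sum the cost of a part from its components."""
    if "parts" not in node:          # leaf: direct labour or material cost
        return node["cost"]
    # internal node: assembly labour plus the cost of each sub-part
    return node.get("labour", 0) + sum(total_cost(p) for p in node["parts"])

actuator = {
    "labour": 20,                    # assembly labour for the actuator
    "parts": [
        {"cost": 5},                 # fastener: its own labour/material chain
        {"labour": 10, "parts": [    # machined housing
            {"cost": 8},             # steel from the foundry
        ]},
    ],
}

print(total_cost(actuator))          # 20 + 5 + (10 + 8) = 43
```

The comment's point is that if robots replace the labour at the leaves, every node in the tree collapses toward the cost of energy, land, and taxes, which is exactly where the thread goes next.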
Shudder to think of how to regulate resource extraction in a future where AI humanoid robots are strip mining and logging for free.
What about energy, real estate and taxes?
Even at the extreme end of automation, if you want iron ore, you need to buy a mine from somebody, pay taxes on it, and power the machines to extract the minerals and transport them elsewhere for processing.
If I were writing a sci-fi novel about this I don't know how I'd handle something real estate (or mineral rights or water rights). You already need permission from the government to extract resources.
As for taxes, why does the government even want the money? What are they going to do with it?
> As for taxes, why does the government even want the money? What are they going to do with it?
There are websites that break down how e.g. different national/federal budgets are divvied up in the real world. Alternatively, I suggest a good book on macroeconomics; I am partial to Steve Keen's "Debunking Economics", but there are many others.
So even though another robot could probably do the "jimmy up", it seems like over time the robots will "drift" into all being a bit different.
Even commercial airliners seem to go through fairly unique repairs from things like collisions with objects, tail strikes, etc.
Maybe it's just easier to recycle robots?
Assume every motor has a 1% failure rate per year.
A boring wheeled roomba has 3 motors. That's a 2.9% failure rate per year, and 8.6% failures over 3 years.
Assume a humanoid robot has 43 motors. That gives you a 35% failure rate per year, and 73% over 3 years. That ain't good.
And not only is the humanoid robot less reliable, it's also 14.3x the price - because it's got 14.3x as many motors in it.
[1] And bearings and encoders and gearboxes and control boards and stuff... but they're largely proportional to the number of motors.
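The arithmetic in that back-of-envelope argument checks out if you assume independent failures. A quick sketch, using the comment's assumed 1% per-motor annual rate:

```python
# If each motor fails independently with probability p per year, a robot
# with n motors goes a year without any failure with probability (1-p)**n.

def failure_rate(p, n_motors, years=1):
    return 1 - (1 - p) ** (n_motors * years)

p = 0.01                                   # assumed 1% per motor per year
print(f"{failure_rate(p, 3):.2%}")         # roomba, 1 year   -> ~2.97%
print(f"{failure_rate(p, 3, 3):.2%}")      # roomba, 3 years  -> ~8.65%
print(f"{failure_rate(p, 43):.2%}")        # humanoid, 1 year -> ~35%
print(f"{failure_rate(p, 43, 3):.2%}")     # humanoid, 3 yrs  -> ~73%
```

As the later comments note, the conclusion scales with whatever the true per-motor rate is; plug in p = 0.001 and the humanoid's annual rate drops to about 4%.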
For example, do the motors in hard drives fail anywhere close to 1% a year in the first ~5 years? Backblaze data gives a total drive failure rate around 1% and I imagine most of those are not due to failure of motors.
But the neat thing about my argument is it holds true regardless of the underlying failure rate!
So long as your per-motor annual failure rate is >0, 43x it will be bigger than 3x it.
43x of 1% failure rate is tragic, but 43x of 0.1% is acceptable in my book.
For example, an industrial robot arm with 6 motors achieves much higher reliability than a consumer roomba with 3 motors. They do this with more metal parts, more precision machining, much more generous design tolerances, and suchlike. Which they can afford by charging 100x as much per unit.
For example, if you're making a phone that is going to be sold around the world, then you're going to worry about arctic/equator temps (will some of your components melt or ICs fail), salty sea air (will the product begin to corrode for people living by a beach), or fast moving elevators (will the speakers pop from a sudden change in pressure).
You can check out this manufacturer's robot arms as some examples of existing products. They list data sheets for their robot arms, including some arms that are IPxx rated. I don't think looking at robot arms is a 1-to-1 comparison for what you could expect from a humanoid robot, since the considerations in the design process are going to be different.
website is kuka dot com/en-at/products/robotics-systems/industrial-robots/kr-agilus
For example, MIG welding robots tend to live a hard life. And if you look at photos of industrial painting robots, you'll find they're often fitted with plastic smocks.
If you look up photos online you'll only get marketing images from robot makers, where everything is shiny and brand new - I can assure you, it's not like that after they've been operating for a decade or two :)
If the dust collection was disabled, the workshop and the machine would be caked in debris.
It doesn't move, it doesn't fall over or have anything falling on top of it either (like a robot could).
Plus they'll likely be modular and able to be replaced.
IMHO, the bigger design issue for humanoids is lowering the need for mechanical precision (which requires lots more metal) and instead using adaptive feedback and sensors to obtain accuracy, similar to how humans and animals do it. AIs should be really good at that, eventually. I think the compute will need to be about 10x what it is now though.
OpenVLA, which came out last year, is a Llama 2 fine-tune with extra image encoding that outputs a 7-tuple of integers. The integers are rotation and translation inputs for a robot arm. If you give a vision Llama 2 a picture of an apple and a bowl and say "put the apple in the bowl", it already understands apples and bowls, and knows the end state should be the apple in the bowl. What's missing is the series of tuples that will correctly manipulate the arm to do that, and the way they got it was through a large number of short instruction videos.
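To make the "7-tuple of integers" concrete, here is a rough sketch of how such discrete action tokens might be de-quantized into arm motion. The bin count, the delta ranges, and the gripper threshold are assumptions for illustration, not OpenVLA's actual constants.

```python
# Hypothetical de-tokenizer: 7 integer bins (assumed 0..255) become six
# small pose deltas (x, y, z, roll, pitch, yaw) plus a gripper command.

def detokenize(action_tokens, low=-0.05, high=0.05):
    """Map 7 integer bins to (dx, dy, dz, droll, dpitch, dyaw, gripper)."""
    continuous = [low + (t / 255) * (high - low) for t in action_tokens[:6]]
    gripper = 1.0 if action_tokens[6] > 127 else 0.0   # open vs. close
    return continuous + [gripper]

# e.g. the model emits one 7-tuple per control step
print(detokenize([128, 200, 64, 127, 127, 127, 255]))
```

The point of the comment stands either way: the language model already has the semantics ("apple", "bowl", goal state); the fine-tune only has to teach it to emit these low-dimensional action sequences.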
The neat part is that although everyone is focusing on robot arms manipulating objects at the moment, there's no reason this method can't be applied to any task. Want a smart lawnmower? It already understands "lawn", "mow", "don't destroy toy in path", etc.; it just needs a fine-tune on how to correctly operate a lawnmower. Sam Altman made some comments about having self-driving technology recently, and I'm certain it's a ChatGPT-based VLA. After all, if you give ChatGPT a picture of a street, it knows what's a car, pedestrian, etc. It doesn't know how to output the correct turn/go/stop commands, and it does need a great deal of diverse data, but there's no reason why it can't do it. https://www.reddit.com/r/SelfDrivingCars/comments/1le7iq4/sa...
Anyway, super exciting stuff. If I had time, I'd rig a snowblower with a remote control setup, record a bunch of runs and get a VLA to clean my driveway while I sleep.
Not https://public.nrao.edu/telescopes/VLA/ :(
For completeness, MMLLM = Multimodal Large language model.
1) Properly recognize what they are seeing without having to lean so hard on their training data. Go photoshop a picture of a cat and give it a 5th leg coming out of its stomach. No LLM will be able to properly count the cat's legs (they will keep saying 4 legs no matter how many times you insist they recount).
2) Be extremely fast at outputting tokens. I don't know where the threshold is, but it's probably going to be a non-thinking model (at first) and probably need something like Cerebras or a diffusion architecture to get there.
2. Figure has a dual-model architecture which makes a lot of sense: a 7B model that does higher-level planning and control and runs at 8 Hz, and a tiny 0.08B model that runs at 200 Hz and does the minute control outputs. https://www.figure.ai/news/helix
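The dual-rate idea is easy to simulate: a slow "planner" updates a setpoint at 8 Hz while a fast inner loop tracks it at 200 Hz, i.e. 25 fast ticks per planner tick. This is a toy sketch of the scheduling pattern only; the gains and increments are invented and have nothing to do with Figure's actual models.

```python
# Toy dual-rate loop: slow planner (8 Hz) nudges a setpoint, fast
# controller (200 Hz) tracks it between planner updates.

def run(sim_seconds=1, fast_hz=200, slow_hz=8):
    ratio = fast_hz // slow_hz               # 25 fast ticks per slow tick
    setpoint, position = 0.0, 0.0
    for tick in range(sim_seconds * fast_hz):
        if tick % ratio == 0:
            setpoint += 0.1                  # slow "planner" moves the goal
        position += 0.3 * (setpoint - position)  # fast tracking controller
    return setpoint, position

print(run())   # position ends up tracking the final setpoint closely
```

The design rationale in the comment is that the big model only has to be smart, not fast; the tiny model only has to be fast, not smart.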
google-deepmind/mujoco_menagerie: https://github.com/google-deepmind/mujoco_menagerie
mujoco_menagerie/aloha: https://github.com/google-deepmind/mujoco_menagerie/tree/mai...
Please make robots. LLMs should be put to work for *manual* tasks, not art/creative/intellectual tasks. The goal is to improve humanity, not put us to work putting screws inside of iPhones.
(five years later)
what do you mean you are using a robot for your drummer
It's a "visual language action" VLA model "built on the foundations of Gemini 2.0".
As Gemini 2.0 has native language, audio and video support, I suspect it has been adapted to include native "action" data too, perhaps only on output fine-tuning rather than input/output at training stage (given its Gemini 2.0 foundation).
Natively multimodal LLMs are basically brains.
Absolutely not.
Only suggestion I have is “study more”.
If it looks like a duck and quacks like a duck...
Just because it is alien to you does not mean it is not a brain; please go look up the definition of the word.
And my comment is useful: a VLA implies it is processing its input and output natively, something a brain does, hence my comment.
https://arxiv.org/abs/2506.01844
Explanation by PhosphoAI: https://www.youtube.com/watch?v=00A6j02v450
Rather than an advertising blitz and flashy press events, they just do blog posts that tech heads circulate, forget about, and then wonder 3-4 years later, "whatever happened to that?"
This looks awesome. I look forward to someone else building a start-up on this and turning it into a great product.
> To ensure robots behave safely, Gemini Robotics uses a multi-layered approach. "With the full Gemini Robotics, you are connecting to a model that is reasoning about what is safe to do, period," says Parada. "And then you have it talk to a VLA that actually produces options, and then that VLA calls a low-level controller, which typically has safety critical components, like how much force you can move or how fast you can move this arm."
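The bottom layer described in that quote, a low-level controller with hard force and speed limits, is the one piece that is simple enough to sketch. This is a generic illustration of command clamping, not Google's implementation; the limit values are invented.

```python
# Sketch of a safety-critical last layer: whatever the VLA requests,
# the low-level controller clamps force and speed to fixed hard caps.

MAX_FORCE_N = 40.0       # assumed hard cap, independent of model output
MAX_SPEED_MPS = 0.5

def clamp_command(force, speed):
    force = max(-MAX_FORCE_N, min(MAX_FORCE_N, force))
    speed = max(-MAX_SPEED_MPS, min(MAX_SPEED_MPS, speed))
    return force, speed

print(clamp_command(120.0, 2.0))   # -> (40.0, 0.5)
```

The point of layering is that this clamp holds even if every layer above it (the reasoning model, the VLA) misbehaves, which is also why the next comment's question about training against shutdown is aimed at the upper layers, not this one.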
For example, what if you need to train the model to keep unauthorized people from shutting it off?
Fast forward to 2025: we have no self-driving cars, and nothing is even close to getting to Mars, let alone manned.