So, it is very hands-off, but also very expensive, and it is never clear if optimizing the fitness function is worth it, because the fitness function itself may be insufficiently or incorrectly specified.
However, I do think that people should try, even with just a whiteboard or a notebook, to design a fitness function for their problem as if they were going to try to evolve it, because (1) it forces them to explicate their correctness constraints, and (2) they may discover that the program that they are trying to write _is equivalent_ to the fitness function.
I'll give you an example for point 2. Many years ago, I had to parse a gnarly language, and I chose to do it via Chomsky Grammars (that automatically build a tree based on the grammar-spec). Chomsky Grammars are cool, in that they are basically just a state-machine, but they are incredibly difficult to debug: when they work, they might work incorrectly (malformed tree), and when they fail, they give no reason for failure (because even with a trace, you are trying to figure out which backtrack should not have happened). So, out of desperation, I started to consider using genetic programming to just evolve a correct Chomsky Grammar. It became clear that there are only 2 possible fitness functions (1) a function that tests a hand-picked input against a hand-crafted tree-output (which is vulnerable to over-fitting), and (2) a function that is not (well, is much less) vulnerable to over-fitting, but is effectively a pre-existing, correct grammar that can produce those trees.
If you are in situation 2, then the genetic programming is not necessary, unless you are trying to create an optimized (or obfuscated) parser, and even then the optimization may be overfit to the test-inputs (even if they are generated test-inputs from the grammar itself). If you are in situation 1, then you are better off re-evaluating your approach (I abandoned the Chomsky Grammar notation, and invented one that is much easier to understand and debug, without losing any of the expressiveness -- it also happens to be slower, but fast-and-broken is worthless compared to not-so-fast-and-works-fine).
One place where genetic programming has been consistently awesome is in parameter-search style problems (e.g. your genome is a long list of floats, representing weights and/or anti-weights, and you need to find out which weights give you more fitness (or less error)). I hear good things about variable-neighborhood-search, but have yet to try it.
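To make the "genome is a long list of floats" case concrete, here is a minimal sketch of that kind of parameter search. Everything in it is an assumption for illustration: the `TARGET` weights, the squared-error fitness, truncation selection, one-point crossover, and Gaussian mutation are just one plausible set of choices, not a description of any particular system.

```python
import random

def fitness(genome, target):
    # Negative squared error: higher is better, 0 is a perfect match.
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(target, pop_size=40, generations=200, sigma=0.1, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(target)
    population = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the best quarter unchanged (elitism).
        population.sort(key=lambda g: fitness(g, target), reverse=True)
        survivors = population[: pop_size // 4]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation on every weight.
            children.append([w + rng.gauss(0, sigma) for w in child])
        population = survivors + children
    return max(population, key=lambda g: fitness(g, target))

# Hypothetical "true" weights that the search is trying to recover.
TARGET = [0.5, -0.25, 0.75, 0.0]
best = evolve(TARGET)
```

In a real problem the fitness function would not know the target weights; it would score each genome by running it against data or a simulator, which is usually where the cost goes.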
Starting with: https://sci-hub.ru/storage/moscow/4324/11d145b2c2c3ab320f70b...
The section with oscilloscope traces showing the progression of the “designs” over time was extremely interesting - I’d love to see what the 10x10 grid of functions looked like at each snapshot.
Thank you!
Adrian Thompson's research in the 90s evolved FPGAs that did signal analysis with bizarre features:
- A tiny number of cells (far fewer than expected)
- No clock, despite performing signal analysis
- FPGA cells that were logically disconnected, but when removed caused the device to stop working
Even then their approach was taking advantage of the physics in the FPGA. One can only imagine how effective this could be when applied to circuit design with the compute budget of a frontier lab.
https://cacm.acm.org/research/analysis-of-unconventional-evo...
I seem to recall legal commentators reacting with an eyeroll—apparently judges split much finer hairs than these for a living—but it was a cute stunt.
[1] https://m.youtube.com/watch?v=sJtm0MoOgiU and https://www.the-independent.com/tech/music-copyright-algorit...
Which is not to say let’s not do it anyway and see!
Against the claim that this wouldn't be searchable, we can just observe that this is about the size of the US patent database. Does this mean patents are not searchable? In that case, aren't all patent infringements excused?
Otherwise someone could copyright every combination of words.
My point was that it’s hard to imagine citing something that could not be patented as prior art. It would be like citing a phone book as proof that a software program can’t be copyrighted (“the exact bytes appear in the 1973 Albany NY white pages, therefore it wasn’t original”)
There is no need for it to be patentable (or patented). Prior art only requires that it be described and be made publicly available. It doesn't even require the originator of the information to be identified (traditional knowledge is prior art.)
That would be really sad..
I think you're going too far with this. Most people understand scientific theories to be an approximation. F=ma is approximately true, in the sense that it's only accurate within the newtonian regime and each of those terms includes so many asterisks that you will only ever measure it approximately.
The latter is the jokes about the physicists "assuming a perfectly spherical cow."
In fact that's kinda the whole point of the "unreasonable effectiveness of mathematics" essay. It is unreasonable that mathematical approximations are so good at describing our world.
Not to detract from your point at all, but I only ever heard this joke about mathematicians!
Biology is incredibly well oiled!
The parent is talking more about elegant simplicity vs. sprawling, seemingly haphazard complexity, and you're talking more about durability to failure points and 'completeness'.
Likewise, in code, a lot of the most durable, battle tested software looks extremely inelegant and duct taped, as 90% of the code is dedicated to handling one-off patches and weird edge cases.
That's the layman's idea of physics theories. They are beautiful and elegant only on the surface; that's why they're technically models and approximations of the real world. The standard model renormalization techniques are a mess of patches and ad-hoc heuristics, pretty far from "this Lagrangian literally contains all physics". Generally you just _ignore_ higher order terms and call it a day. The famous E=mc^2 is just the first term of a Taylor expansion. The beautiful form of physics is what you would call "good enough" and often just a pedagogical tool.
Is this actually true? My understanding was that E=mc^2 is exact for a particle at rest.
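For what it's worth, both readings can be reconciled with the standard special-relativity formulas (textbook results, not taken from anything linked in this thread):

```latex
% Energy-momentum relation (exact for a free particle of mass m):
E^2 = (pc)^2 + (mc^2)^2
% At rest, p = 0, so E = mc^2 holds exactly.

% For a particle moving at speed v, with \gamma = (1 - v^2/c^2)^{-1/2}:
E = \gamma m c^2
  = mc^2 + \tfrac{1}{2} m v^2 + \tfrac{3}{8} \frac{m v^4}{c^2}
    + O\!\left(\frac{m v^6}{c^4}\right)
```

So mc^2 is exact as the rest energy, and it is also the leading term when the total energy of a moving particle is expanded in powers of v/c.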
Up until the present it has been a nearly uniform march of revealed symmetries, collapsed privileged frames of reference, and other such (in the deepest sense) simplifications in our model of reality that has improved its fidelity to the measurable.
I hang a qualifier on these developments being simplifying because the result isn't simple in the details: quantum chromodynamics is a daunting subject! But it's not just an enumeration of details and contradictions. The particle zoo that preceded the Eightfold Way looked like line noise; now, in indexed notation, the Lagrangian of the entire Standard Model fits on a page (or so I've been told; I've never actually seen the page).
It's almost tautological that the frontier where it's still messy involves an unrevealed symmetry or a persistent privileged frame of reference, that's what frontier means, we don't see past it to the seam where it folds up.
Personally I suspect AI systems will be a great deal more inclined to discard the parochial axioms that have at every point placed human ego above simplicity.
It doesn't resolve all of the open problems in physics if you amputate consciousness, free will, agency, persistent identity, and an unambiguous arrow of time.
But it starts looking possible to make progress.
Occam's Razor is a useful heuristic, but it biases us towards simpler explanations.
I think of elegance as not having to add epicycles, not that everything in the system has to be simple.
Also, without a working theory, the space of possible solutions is near infinite. LLMs manage to pluck out the space of comprehensible English strings from n-dimensional hell. Even if this is done with a black box of billions of parameters, it’s still elegance in the sense that such a space even exists and was found.
Gaining expertise is always the hard part and our new LLM overlords are making that much harder. So the simple “pure” functions as a teaching aid have never been more important.
End users have never cared about how the sausage is made though.
LLMs can explain complex things to humans with tons of specific context that you don’t find in textbooks or even a google search.
It’s probably never been easier to grasp a large codebase than it is today for example. You can probe and ask specific questions without going through a maze of imports and relationships and config files yourself.
Learning things will always be up to the person, it’s still a choice and dedication to a craft can still be taught.
I keep meeting people who think this and have enormous understanding gaps in the topics they've had an LLM teach them.
The absolute worst judge of how well someone understands a complex topic is the novice themselves.
When gaining mastery is not a requirement to doing novice-level work, many fewer people will get there. It takes more dedication than it did before.
When you attempt to hyper-optimize, even with humans in the loop, you end up a mess. You're lucky if you can find clean guiding principles anywhere. If you can hyper-optimize hyper quickly, you end up with an extra layer of mess.
Imagine that it's maybe the 1800s and you're asking why somebody who has already survived smallpox is not susceptible to becoming infected again. Suppose you offered an explanation involving tiny detectives wandering around and collecting evidence which they present to each other and decide whether to multiply... one in which the tolerance comes from the detectives from the previous fight still hanging around in your lymph nodes ready to spring into action if they run across the right kind of evidence. Well, that would probably be a more complicated explanation than anybody at the time would offer, and it would also be correct.
https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_induc...
I think your point is more that we might be able to initially describe complex phenomena as messy, horrible complex equations, that doesn’t mean we shouldn’t work to simplify them and make them more understandable to us.
Some are useful.
Having theories that only give answers, but that you can't reason about, is not as useful. Having a theory where you don't know the limits of its applicability can be very dangerous.
At least in the physical realm there is not yet anything that combines relativity with QM so they can only be approximations. Even in math so far there seem to be similar challenges using programmatic and "AI" driven solutions and proofs.
Still, I know that LLMs will be useful for Verilog/VHDL and particularly with verification, where they are already heavily used. Defined outputs and complete test coverage are already such a big part of digital/ASIC design that I'd be surprised if they aren't used a lot more. Many software people would say that hardware is badly written copy-pasta, as it is. That said, higher velocity slop and hardware "technical debt" isn't something you can fix with an update. And no matter how fast you "ship", you won't get parts back in less than a few months. Poorly used, it will lead to expensive failures.
OP's argument is that reality is expressed by very complex equations and interactions; this is not outside of Solomonoff induction, because it’s easy to imagine that this accurate model is by definition the shortest algorithmic explanation, and it’s just orders of magnitude more complex than our current approximations.
I guess the argument from OP would look like: "Yes, now imagine we poke and extend our universe as far as we can. How much bigger do you think our final 'shortest description' would be? I imagine it may be orders of magnitude more complex."
Well, I can imagine a squared circle... doesn't mean the math checks out. I would reply that you do not have to imagine, you can go about looking at different mathematically possible universes in Tegmark IV and find the expected number of bits for the one you actually exist in. Which is ~0 bits more complex than the shortest description based on the data you currently have.
Also, note that Newtonian mechanics is not actually a very short theory for building a universe, because you have to instantiate every object in the universe. You actually get a lot more of the structure for free with general relativity (re: Wigner's classification of the particles). An observer in a presumed-Newtonian universe calling it a simple theory would be like saying, "I compressed Wikipedia to one byte, just by putting it all in the decompressor!"
Though I also imagine that that is the point.
Maybe I'm just significantly and unrepresentatively unlucky, but Claude is significantly more intelligent than the average human around me on most any metric I can think of.
By any meaningful measure of intelligence, the latest models are much smarter than the bulk of the population.
How would you define intelligence?
That is a meaningful measure of intelligence that every LLM completely fails at.
It may seem similarly vague, but it does in fact open interesting, productive, and necessary questions. A "computer" was a professional crunching numbers - "replaced", "easily" because of the deterministic procedural nature of said work, but what about the technical effort to arrive there, and what about the less "mechanical" jobs? When do "processes" become "intelligence"?
Some of us had studied AI originally to study the mind - "how do we formalize thought". It's the interdisciplinary, transversal nature of the area.
Also maybe compare that with that large and important intersection between CS and Economics - the "science of optimization" and its implementation in efficient IT systems. The effort in terms of that different discipline may not be evident, yet lots of engineering is "optimizing" and the generalization of those solutions we call Economics (see the book Algorithms to live by).
So: the term "Artificial Intelligence" may not be important as CS solutions to practical problems are built (you just focus on the better solution), but there is relevance to the "side discipline" of AI, and from that perspective that is the cone, the scope anyway. "How would an intelligent solver approach the problem".
But as you point out, we used to have human calculators. So is a simple desk calculator a form of "AI"? If so, what type of software isn't AI?
If what it does is "taking care of the carry", it represents a pretty minimal requirement for intelligence - it does replace a professional that could do it, but that professional does not have to apply too much proficiency and cleverness to do its job. It is improper AI.
> what type of software isn't AI
That which would not correspond to the job of an intelligent entity. Maybe blitting bitmaps around a screen?
As I tried to convey, it is more of a matter of perspective: the area of "implementing ways to solve problems as an intelligent entity would". It is a discipline that intersects others - engineering, logic, brain science, philosophy, epistemology, maybe again economics (as "the science of optimality and efficiency" - as an intelligent solver would do)... Consider it a special discipline that spans many other realms.
Okay, that makes sense. Even so:
> If what it does is "taking care of the carry", it represents a pretty minimal requirement for intelligence - it does replace a professional that could do it, but that professional does not have to apply too much proficiency and cleverness to do its job. It is improper AI.
I think you're underselling how much mental work is required to solve complex arithmetic. Yes, it's simple for a computer, but (1) even basic computers are extremely complex in absolute terms, and (2) even the most complex computing tasks could be considered simple once you break them down far enough—for example, a large language model is "just" fancy matrix multiplication.
So I feel like there's a "sufficiently advanced technology is indistinguishable from magic" element here. Something becomes AI once it seems sufficiently advanced. But then time passes and it doesn't feel that advanced anymore.
I understand that human language doesn't always have a super precise definition, and I'm not trying to be pedantic. I think the term "artificial intelligence" is under-specified to the point of having virtually no meaning. To the extent that it is useful (obviously, a lot of people are using it in conversation, so something is getting communicated), it's because it's possible to infer from context what someone is referring to (i.e. "the student used AI to write her essay" is clearly referring to an LLM, not Eliza).
We'd all be better off if we used words that describe what we're actually talking about.
Defining a procedure for arithmetic is easy. Implementing it in silicon is not. To carry on the procedure for the former has low relevance to intelligence. To carry on the job of the latter does have high relevance to intelligence. If the latter is performed by a professional it is intelligence. If it is performed by an algorithm it is artificial intelligence. "Automating finding out good ways to implement ALUs" is AI; the ALUs running are not.
So, studying AI, asking ourselves which new "devices" (abstract sense) we can find so that our algorithms have aspects of cleverness, is productive as it simply and plainly pushes, invests in the production of that class of algorithms.
Surely there is a continuity between "sort" and "genetic alg." - but the direction counts: it is in that direction that we strove to produce the producers.
So, it's very much not about the complexity of the product («sufficiently advanced technology»): it is in the complexity of the intermediate that built the final product, when that intermediate is not human. The pocket calculator is majestic, yes - but there is the strong point: it was human made. That is human intelligence at work. Study how to have it blueprinted by a machine, and if it works properly, you'll attribute a simulation of intelligence to the automated blueprinter - that is artificial intelligence.
> used words that describe what we're actually talking about
Look, people who follow me here know I place radical importance to language and to the awareness of language. It should be one of the aspects I would be most dreaded for.
Surely, most people are unaware of what they say to a large extent.
But in the case of "Artificial Intelligence", it seems you are underplaying the concept of directions - "simple algorithms" vs "advanced algorithms"; "houses" vs "skyscrapers"; "flying machines" vs "air force fighters". There is continuity and yet different position. And intelligence surely can be implemented at different levels.
Another thing (I am strongly selecting what I could reply, and I am forced to be concise). There is also a concept of "unintelligence" - the dire opposite of intelligence is also a thing (if Eliza is ~0, you can go below that). Understanding what intelligence is helps recognize its opposite, which is an experienced pitfall in the area.
The computer can now literally talk to you in natural language and then perfectly produce sophisticated actions in response to completely arbitrary and unstructured input. It trivially passes the Turing test. By any definition prior to the year 2023 we are living with Artificial General Intelligence and it’s here now.
Remember, the interrogator is allowed to be hostile, so they would obviously employ all known prompt injections and typical LLM 'gotchas' to figure out who the AI is.
People aren't trying to communicate accurately if their first priority is getting you excited about the thing!
AI is not a real thing or a natural kind but a perspective. Whether something qualifies as "AI" or not cannot be decided by the objective features of the thing. Ergo, it can be defined at the author's pleasure.
> conflating established and morally neutral activities in ML
LLMs are no more or less morally neutral than other ML techniques.
It's really getting annoying having to have these conversations.
The real question is how much compute do you need. With LLMs getting popular, so is compute. That's the real win for non-LLM technologies. The sheer availability of GPU capacity. Yes, it's expensive, but time in a GB300 supercomputer isn't even possible if they don't exist.
AlexNet succeeded for many reasons, but a big one is that computers got good enough to apply those algorithms and techniques in practice. Outside of LLMs, what new AI/ML systems await us in the future? The LLM bubble popping, if it ever does, is going to leave us with supercomputer capacity going unused and available for cheap, meaning experiments that were once infeasibly expensive become practical. I can't afford $10 million to run a weather simulation, but at $1,000 for the same amount of compute, a lot more experimentation becomes practical.
Also, CFCs.
in the journal articles they did show measurements of real devices which agreed fine with predictions, but i didn't find them addressing it explicitly in the text. also, some systems they presented contained subblocks that were conventionally designed that could be carrying some of the weight.
or maybe i'm just sour that they're coming for my job? or maybe that's what they want us to think?
i think what wins in practice is simple ideas that can work in spite of all manufacturing and environment variations, and model limitations -- think stuff like feedback and symmetry. and what they show here is the opposite of that. i've done blind optimization of circuit parameters sometimes, only to end up realizing some pretty simple such ideas that i'd missed (like "you need symmetry here" or "you just need more bandwidth here") and that made complete sense when you thought about them. so i wonder if we can't tweak a few pixels in their structures and reveal something simpler.
also, obligatory mention: "genetic antennas"
Yes, this is exactly what bothers me about this article and about a few similar articles published in the past, that they do not contain any evidence that their claims about the usefulness of AI in design are true.
In TFA it says that the role of AI is replacing the electromagnetic simulator in the optimization process, by guessing the behavior of the structure, which is many orders of magnitude faster than a simulation.
This sounds plausible, but in order to believe this I would want to see the differences between AI guesses and real measurements, in the case of structures with geometries that are very different from those used in the training of the AI.
Also I would want to see exactly with which simulators they have compared the speed of the AI model.
There are various simulation approaches for electromagnetic fields and electronic circuits, that can trade-off accuracy for speed, so I am not convinced that AI inference takes necessarily much less time than some faster low-accuracy methods of simulation, which would still be more accurate and more reliable than AI guesses.
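The kind of evidence being asked for here can be sketched with a toy. This is an assumption-laden illustration only: `simulate` is a made-up stand-in for a slow field solver, and `InterpSurrogate` is a deliberately simple learned model (piecewise-linear interpolation), not the technique from the article. The point is the comparison itself: error on inputs like the training set versus error on inputs unlike it.

```python
import math
from bisect import bisect_left

def simulate(x):
    """Stand-in for a slow, trusted simulator (purely hypothetical)."""
    return math.sin(3.0 * x) + 0.5 * x * x

class InterpSurrogate:
    """Cheap surrogate: piecewise-linear interpolation over training samples."""

    def __init__(self, xs):
        self.xs = sorted(xs)
        self.ys = [simulate(x) for x in self.xs]

    def predict(self, x):
        # Outside the training range, clamp to the nearest sample:
        # the surrogate has no information there.
        if x <= self.xs[0]:
            return self.ys[0]
        if x >= self.xs[-1]:
            return self.ys[-1]
        i = bisect_left(self.xs, x)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def max_abs_error(model, points):
    return max(abs(model.predict(p) - simulate(p)) for p in points)

train = [i / 20 for i in range(21)]            # training samples in [0, 1]
surrogate = InterpSurrogate(train)
in_dist = [0.025 + i / 20 for i in range(20)]  # midpoints inside the training range
out_dist = [1.5 + i / 20 for i in range(11)]   # inputs unlike anything in training

err_in = max_abs_error(surrogate, in_dist)
err_out = max_abs_error(surrogate, out_dist)
```

In this toy the surrogate is very accurate between its training points and badly wrong outside them, which is exactly the gap a paper would need to report against real measurements before the speed advantage means much.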
Since you beat me to it, I'll add something that relates to what you were saying on "realizing some pretty simple... ideas".
I think a big plus of computer aided design like this is "innovization"[1]. Somewhat awkward term. But, a system like this leading one to deeper understanding of a particular process is the general idea. It's a fun feeling in practice.
> How generalizable are these methods? Can they consistently deliver truly high performance? Can we get to a place where AI produces designs that maximize every conceivable trade-off, holistically optimizing every parameter to its most ideal physical state? .... AI can hallucinate a design that creates bad circuits that don’t work. This means verification methods need to remain under human oversight.
And they are essentially correct. We need better validation and verification methods, both software and hardware to keep in check the mistakes of automated random processes.
Maybe it doesn't matter?
I mean, of course it matters. But most of this sort of design space is effectively NP-complete, where the creation starts with a blank schematic page and has an impossibly large search space, but where the checking of the design is much simpler.
> also, obligatory mention: "genetic antennas"
Exactly. How does this work? When confronted with the question, of course, everybody gets all excited about the constrained randomness of the GA, but if you think about it, what really makes it work is that there is a comparatively cheap test for fitness for purpose.
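The asymmetry between searching and checking can be sketched with a textbook example. Subset-sum is used here purely as an assumed stand-in for a design problem; `check`, `search`, and `NUMBERS` are illustrative names, and the brute-force loop is not a genetic algorithm, just the simplest possible search wrapped around a cheap test.

```python
from itertools import combinations

def check(indices, numbers, target):
    """Cheap verification: one pass over the proposed subset."""
    distinct = len(set(indices)) == len(indices)
    return distinct and sum(numbers[i] for i in indices) == target

def search(numbers, target):
    """Brute-force search: may try up to 2**len(numbers) subsets."""
    tried = 0
    for k in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), k):
            tried += 1
            if check(indices, numbers, target):
                return indices, tried
    return None, tried

NUMBERS = [3, 34, 4, 12, 5, 2]
solution, tried = search(NUMBERS, 9)
```

Any smarter search (a GA, simulated annealing, a learned proposal model) only changes how candidates are generated; it still leans on `check` being cheap enough to call thousands or millions of times.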
One of my favorite little morsels of internet goodness.
It's interesting, since I saw another comment near yours that raised the question of the robustness of the lab-grown design. What I thought was kind of the most fascinating part of the damninteresting article was the revelation that the evolved programs were inseparable from the single physical FPGA used in the training. Since this RFIC training model employs a simulator, do you suppose that the quirks of the physical hardware on which the simulator runs are sufficiently isolated from the training such that a pair of designs would behave similarly when the simulator was run on distinct hardware? And I guess the even more obvious question is whether a design evolved on a simulator would have any hope of behaving as expected in physical hardware?
My hunch about the latter is no, although it still seems like an interesting study, and I often find myself thinking that really understanding what was going on with the FPGAs might be a prerequisite for really understanding how to master reinforcement learning.
Anyway I'm glad you posted this and if you have any other favorites related to this domain send them my way!
>I thought was kind of the most fascinating part of the damninteresting article was the revelation that the evolved programs were inseparable from the single physical FPGA used in the training.
100% agree with this!!! It gave me this weird feeling the first time i read it, like the onset of some alien intelligence. xD
I definitely think a simulator is the way to go, but I'm guessing tools like that could find problems and edge cases in the simulator that nobody thought to test for.
I'm just glad they still have the article up. I bet I've shared it 50 times over the past 20 years lol.
I feel a bit of unease when I read this title, not because of the threat of AI, but because the prevailing aphorism that "RF is black magic" is a slap in the face to the millions of physicists and RF engineers who DO understand every bit of this. It's a fun harmless anti-intellectual saw that I don't believe is harmless at all. We need more RF engineers and telling people it's all "black magic" and "wizardry" (and worst of all, saying "even RF engineers don't understand RF") makes it seem like it's not worth studying.
I think the opposite is true. It being advertised as difficult to understand is one of the reasons I personally decided to study RF Engineering. The prospect of learning something so challenging pulled me in. The Smith Chart helped.
When can one reasonably expect a $10, 10 GHz oscilloscope on a chip, with some pins for video out and user input in?
At some point the economy will realize there's more LLM inference than access to scientific and technological measurements; it's an economic waste not to connect as many scientific instruments as possible to inference that already exists.
If the "dark arts" (which never really were that dark, analog designers for higher frequencies used the same Maxwell equations as the analog designers for lower frequencies, even if the implications change with frequency) end up automated by AI, the high wages will disappear, and oscilloscope mfrs won't be able to charge as much.
That said we’ve had some success internally having Claude do parameter sweeps
I clicked on all the links. Pretty much all of those movies could still work with wired technology. Even the one called cellular, in which a woman is trapped in an attic with a broken landline phone and manages to connect wires and dial a random number.
Yes I'm nitpicking. I guess I'm glad we have Wi-Fi and all, but don't try to sell me on it as a crucial plot device
The problem isn’t the design: it’s manufacturing constraints.
This is nothing new or impressive.
I feel like technology is going to become alien at some point. We're all going to be using magical runes instead of chips.
I like this headline. In other words, AI will suck out every last bit that makes engineering fun.
I know, I know. The job is to make money for your employer not have fun. AI makes money faster so shut up and do your job.
But fuck, I took this career because I found joy in understanding things and making things that look and work well.
I kind of thought the real success is when the designer comes up with key things that are well beyond their training or any training that could have been done up until that time. Based on their years of experience living in an environment where training is table stakes but that's not the thing that's relied upon the most in the end.
With LLMs it seems like odds are that a concept which is statistically insignificant in the training set may surface in place of a truly novel solution, effectively displacing the real breakthroughs that actually go beyond trainable performance.
In a way that decision-makers can not tell the difference, and that could be the worst part.
The AI in this case didn't create a novel technology; it merely used the existing technology without basing the new design on a previous one. The whole "human couldn't come up with it" is because the possible design space is so large that there's no reason a human would start where the AI did.
The thing the AI did better than humans was brute forcing a solution faster. Still a very handy thing to have, but it isn't "creating" in the sense that it invented new materials or fabrication processes or anything novel.
No. That can be said about LLMs, but not about all forms of AI. The technique used is not an LLM.
Sadly, we've bastardized the term AI so much that, if it ever meant anything, it's meaningless now. The currently most-upvoted thread in this post discusses the topic.
> In our new approach, the architecture begins essentially from nothing and is progressively assembled through successive iterations. The system explores the design space by generating myriad candidate circuit combinations and mapping the resulting performance trade-offs as it navigates this landscape. Because the process is not biased by prior human design choices, it can produce completely novel circuit topologies that look markedly different from those created by human designers.
The key tho is can they solve problems not easily solved before with prior techniques. Further can they identify problems not readily presented. Then identify novel solutions. Etc. The answer is emphatically yes they can. These features don’t have to literally exist in their training data, but the supporting highly convoluted network of associations of all their training data does have to in some complex space allow for it to produce these answers. It’s not the same as they’re stochastic parrots at all.
Are they creative? No, because they don’t have awareness. My personal imprecise definition of creative requires both self and awareness as well as free will. There is no driving awareness in all AI architectures, it all derives from extrinsic impetus. Creativity is derived, IMO, from a layer of our minds that is not readily assessed or measured and is only indirectly expressed through language, art, and music. Hence it is not directly trainable and therefore a learning model can’t learn it by reinforcement. It can learn the proxies, but the proxies are not, as we all deeply know, the same as our experienced awareness. We are not our words, our art, our music. We try hard to bridge it, but it’s impossible and you and I know this to be true from experience. In fact we can not even examine our own awareness because it’s not directly observable or possible for us to directly reason about. This is core to a lot of philosophy, especially mid and far eastern philosophy of the mind, the self, the five aggregates of Buddhism, etc. Psychology points at it, and modern psychology avoids it because it’s practically difficult for outcome oriented treatments.
While I have no hope for a rigorous definition (I don't think it's possible), there are two very distinct kinds of creativity:
1. Result is sufficiently novel for the system itself, i.e. it has never seen it previously. This kind is too trivial to even talk about.
2. Result is novel for the side observer. This kind of creativity is meaningless because it depends on at least one unknown (side observer).