The statistics are 9th-order (over 3x3 blocks of pixels), but of such a simple form that they are hardly more expressive, in terms of the range of textures they can reproduce, than the well-known 2nd-order nearest-neighbour statistics. In the approximation where you only care about the average value of each pixel, I think it would collapse to 2nd-order. Texture synthesis with MRFs with local statistics is discretized (in space) Turing reaction-diffusion. I did my PhD on this topic.
Probably the most influential early paper on this kind of simple texture model, where you will see similar patterns, is:
Cross & Jain, 1983, PAMI, Markov Random Field Texture Models
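For concreteness, the kind of nearest-neighbour model Cross & Jain studied can be sampled with a few lines of Gibbs sampling. This is an Ising-style toy with parameters I picked arbitrarily, not their exact model:

```python
import math
import random

def gibbs_texture(h=32, w=32, beta=0.8, sweeps=50, seed=0):
    """Sample a binary texture from a 4-neighbour Ising-style MRF by Gibbs sweeps."""
    rng = random.Random(seed)
    # Random +/-1 initial field on a toroidal grid.
    x = [[rng.choice([-1, 1]) for _ in range(w)] for _ in range(h)]
    for _ in range(sweeps):
        for r in range(h):
            for c in range(w):
                # Sum of the four nearest neighbours (with wraparound).
                s = sum(x[(r + dr) % h][(c + dc) % w]
                        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
                p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * s))  # P(x_rc = +1 | rest)
                x[r][c] = 1 if rng.random() < p_up else -1
    return x
```

With positive coupling (`beta > 0`) this produces blobby, clustered textures; different potentials over the same small neighbourhoods give the family of patterns that paper catalogues.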
Are you still working on this topic or other things now?
And when you have a (two-layer) RBM, you can integrate out the hidden variables and get an MRF without them (as I first described), or vice versa. Which is more useful depends on the application.
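Concretely (standard RBM notation, not from the thread: visible units $v$, binary hiddens $h \in \{0,1\}^m$, weights $W$, biases $a, b$), the marginalisation is:

```latex
E(v,h) = -a^\top v - b^\top h - v^\top W h,
\qquad
p(v) \propto \sum_{h \in \{0,1\}^m} e^{-E(v,h)}
      = e^{a^\top v} \prod_{j=1}^{m} \left(1 + e^{\,b_j + v^\top W_{\cdot j}}\right)
```

so $p(v) \propto e^{-F(v)}$ with free energy $F(v) = -a^\top v - \sum_j \log\bigl(1 + e^{b_j + v^\top W_{\cdot j}}\bigr)$: an MRF over the visibles alone, whose softplus terms act as the higher-order potentials.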
Or if you're more interested in textures: RBMs are used for texture and image modelling (e.g. [1]), and higher-order statistics are far more interesting. In virtually all papers they take the form of responses of linear filters (like the 3x3 filter in the OP; for natural images, often wavelets). But you can use completely different statistics. I found that long-range local binary patterns (LBPs), used for texture recognition, are good for texture generation too.
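For anyone unfamiliar with LBPs, the basic radius-1 version compares each pixel to its 8-neighbour ring and packs the comparisons into a byte; the "long-range" variants just use the same ring at larger radii. A sketch (my own code, toroidal boundary; not the exact statistics I used):

```python
def lbp_codes(img, radius=1):
    """8-bit local binary pattern code per pixel, ring at the given radius."""
    h, w = len(img), len(img[0])
    # The 8 ring positions, clockwise from top-left.
    offs = [(-radius, -radius), (-radius, 0), (-radius, radius), (0, radius),
            (radius, radius), (radius, 0), (radius, -radius), (0, -radius)]
    codes = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            v = img[r][c]
            code = 0
            for k, (dr, dc) in enumerate(offs):
                # Bit k is set when the k-th ring pixel is at least as bright.
                if img[(r + dr) % h][(c + dc) % w] >= v:
                    code |= 1 << k
            codes[r][c] = code
    return codes
```

A histogram of these codes over an image is the texture statistic; matching such histograms is what turns a recognition feature into a generation constraint.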
I've switched away from textures, but energy-based models are harder to escape. A surprising intersection of old and new topics at [2].
[1] R Liao, S Kornblith, M Ren, D Fleet & G Hinton, 2022, "Gaussian-Bernoulli RBMs Without Tears" https://arxiv.org/pdf/2210.10318
[2] C Zhang, B Jia, Y Zhu & S-C Zhu, 2024, "Human-level few-shot concept induction through minimax entropy learning" https://www.science.org/doi/10.1126/sciadv.adg2488
Trying to trade space for time, I used a model that gives every cell the set of 3x3 neighborhoods, out of the 512 possible, that could have caused that cell's present state ("alibis"). It then visits each cell, comparing its alibis to those of neighboring cells and eliminating mutually impossible ones from either set. This step has to be repeated until no more alibis are shed in a pass.
When it finally stabilizes, the model is a solution kernel that can then field further manual guesses against it. If a cell's alibis all agree it was dead in the "before", there's no need to guess, but if they're not unanimous, what if we hazard a guess one way or the other for a bit? How does that ripple through the rest of the board? If any of the cells ran completely out of alibis given a certain guess, that guess was clearly not a proper solution, and it's time to back out and try a different one. If there's no solution at all, that's a Garden of Eden.
Ultimately I wanted to generate not just one solution, but all the solutions for a given board. I got stumped because I wasn't convinced I wasn't still working in 2**(n*m) time or worse trying guesses against the kernel.
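For what it's worth, here's a small Python reconstruction of that elimination pass (my own code and names, on a toroidal board for simplicity). A 3x3 patch is a 9-bit int, bit `3*i + j` for row `i`, column `j`, centre at bit 4; each cell starts with every patch whose Life update matches its current state, and adjacent cells repeatedly cross out patches that disagree on their overlap:

```python
def step_center(p):
    """Life rule applied to the centre of a 3x3 patch p (bits 0..8, row-major)."""
    center = (p >> 4) & 1
    n = bin(p).count("1") - center
    return 1 if n == 3 or (center and n == 2) else 0

OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

# For two patches whose centres differ by (dr, dc): the pairs of bit positions
# (one per patch) that refer to the same absolute cell.
KEYS = {(dr, dc): [(3 * i + j, 3 * (i - dr) + (j - dc))
                   for i in range(3) for j in range(3)
                   if 0 <= i - dr < 3 and 0 <= j - dc < 3]
        for dr, dc in OFFSETS}

# PROJ[off][side][p]: patch p's overlap bits packed into a comparable key.
PROJ = {off: [[sum(((p >> bits[side]) & 1) << k for k, bits in enumerate(pairs))
               for p in range(512)]
              for side in (0, 1)]
        for off, pairs in KEYS.items()}

def propagate(board):
    """Shrink each cell's alibi set until a full pass sheds nothing."""
    h, w = len(board), len(board[0])
    alibis = {(r, c): {p for p in range(512) if step_center(p) == board[r][c]}
              for r in range(h) for c in range(w)}
    changed = True
    while changed:
        changed = False
        for (r, c) in list(alibis):
            pats = alibis[(r, c)]
            for off in OFFSETS:
                tp, tq = PROJ[off]
                other = alibis[((r + off[0]) % h, (c + off[1]) % w)]
                seen = {tq[q] for q in other}
                # Drop any patch with no compatible partner in the neighbour.
                keep = {p for p in pats if tp[p] in seen}
                if keep != pats:
                    alibis[(r, c)] = pats = keep
                    changed = True
    return alibis

def patch_at(board, r, c):
    """The 3x3 patch of a (toroidal) board centred on (r, c), as a 9-bit int."""
    h, w = len(board), len(board[0])
    return sum(board[(r + i - 1) % h][(c + j - 1) % w] << (3 * i + j)
               for i in range(3) for j in range(3))
```

An empty alibi set anywhere means a Garden of Eden; cells whose surviving alibis disagree about a given bit are where the guessing (and the potential exponential blow-up) has to start.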
It's a really fascinating problem, so much so that I even made a PICO-8 game about it years ago! Even 6x6 grids are really tough!
https://jsbin.com/josuxarufu/1/edit?js,console,output
It tries to "evolve" toward the target image (right) by making random changes to the origin (left); the output after one step is in the center.
It's eventually supposed to be a genetic algorithm, but for now it just goes through the board pixel by pixel, and if a pixel doesn't match the target, it inverts one of that pixel's neighbors. If that's beneficial (checking loss in a 5x5 rect centered on the test pixel), we keep the change. I get the loss down to about 25%, then it plateaus.
Feel free to fiddle, fork, etc! (I stayed up pretty late so the code is bad! Suggestions welcome!)
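The jsbin is JavaScript; here's roughly the same loop in Python as I understand it (toroidal boundary, and all names are my choices):

```python
import random

def life_step(grid):
    """One Game of Life step on a toroidal grid of 0/1 ints."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            n = sum(grid[(r + dr) % h][(c + dc) % w]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
            out[r][c] = 1 if n == 3 or (grid[r][c] and n == 2) else 0
    return out

def local_loss(grid, target, r, c, radius=2):
    """Mismatches inside the 5x5 window centred on (r, c)."""
    h, w = len(grid), len(grid[0])
    return sum(grid[(r + i) % h][(c + j) % w] != target[(r + i) % h][(c + j) % w]
               for i in range(-radius, radius + 1)
               for j in range(-radius, radius + 1))

def greedy_pass(origin, target):
    """For each mismatched pixel, flip one random neighbour of it in the
    origin and keep the flip only if the local 5x5 loss improves."""
    h, w = len(origin), len(origin[0])
    stepped = life_step(origin)
    for r in range(h):
        for c in range(w):
            if stepped[r][c] == target[r][c]:
                continue
            dr, dc = random.choice([(i, j) for i in (-1, 0, 1)
                                    for j in (-1, 0, 1) if (i, j) != (0, 0)])
            nr, nc = (r + dr) % h, (c + dc) % w
            before = local_loss(stepped, target, r, c)
            origin[nr][nc] ^= 1            # tentative flip
            stepped2 = life_step(origin)   # wasteful: a flip only affects 3x3
            if local_loss(stepped2, target, r, c) < before:
                stepped = stepped2         # keep the improvement
            else:
                origin[nr][nc] ^= 1        # revert
    return origin
```

One thought on the plateau: because each flip is judged only inside its 5x5 window, a flip that helps locally can hurt cells outside the window, which may be part of why the loss stalls around 25%.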
> First of all, while I said “Predecessorifier” in the talk, “Ataviser” seems to be the accepted word, coming from “Atavism”, which the online Merriam-Webster dictionary defines as “recurrence of or reversion to a past style, manner, outlook, approach, or activity”.
"its like showing a solved rubiks cube and asking what the scramble was"
^ This analogy may be the best I've seen in a long time. I've caught myself doing this.
https://www.youtube.com/watch?v=PlzV4aJ7iMI (in French)
> I can’t help but wonder if there’s a reaction-diffusion-model-esque effect at work here as well
There are continuous approximations of the Game of Life that show this; for example, this implementation:
It would probably work even better if you tweak the loss function with some kind of averaging/blurring filter.
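There are also quick toy versions of this idea: replace the hard birth/survival thresholds with sigmoids and the update becomes continuous (and differentiable). This is my own construction, not SmoothLife or the model in the linked video:

```python
import math

def sigmoid(x, k=20.0):
    """Steep logistic; large k approaches the hard threshold."""
    return 1.0 / (1.0 + math.exp(-k * x))

def soft_life_step(grid):
    """Continuous relaxation of one Life step on a toroidal grid in [0, 1]."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            n = sum(grid[(r + dr) % h][(c + dc) % w]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
            a = grid[r][c]
            birth = sigmoid(n - 2.5) * sigmoid(3.5 - n)    # soft "n == 3"
            survive = sigmoid(n - 1.5) * sigmoid(3.5 - n)  # soft "2 <= n <= 3"
            out[r][c] = a * survive + (1 - a) * birth
    return out
```

On binary inputs with a steep sigmoid this agrees with ordinary Life to within a tiny tolerance, but gradients now flow through the step, which is what you need if you want to attack the loss with gradient methods.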
It's the trivial solution. It's significantly more efficient than using backprop. It's much simpler and leads to practically the same result.
This makes me wonder why you would use backprop to do this. It feels like a waste of time, and I'm not sure why this post generates so much attention on HN.
Also see https://nullprogram.com/blog/2020/11/24/
> I'm pretty sure that's no more efficient than guessing all pixels at once and trying to check if that worked or not.
It's iterative, so you don't have to run it for very long, and it will keep improving the loss function the longer it runs.
It's true that a "good" decision now might turn out to be "bad" later, but whether it's effective in improving a solution depends on the fraction of times that happens, which is almost certainly not 50%. Hill-climbing methods like this are used everywhere in optimisation when you want a decent solution quickly and don't require optimality.
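The basic loop is tiny. A generic sketch, with a toy objective of my own rather than the Life problem:

```python
import random

def hill_climb(loss, state, steps=1000, seed=0):
    """Greedy bit-flip hill climbing: keep any non-worsening random flip."""
    rng = random.Random(seed)
    best = loss(state)
    for _ in range(steps):
        i = rng.randrange(len(state))
        state[i] ^= 1               # propose flipping one bit
        cur = loss(state)
        if cur <= best:
            best = cur              # keep non-worsening moves
        else:
            state[i] ^= 1           # revert worsening moves
    return state, best

# Toy objective: Hamming distance to a hidden bit string. This one is
# separable, so every locally good flip stays good and the climb reaches
# zero loss; the interesting (and hard) cases are the ones where it doesn't.
target = [1, 0, 1, 1, 0, 0, 1, 0]
state, best = hill_climb(lambda s: sum(a != b for a, b in zip(s, target)),
                         [0] * len(target))
```

The fraction of "good now, bad later" moves is exactly what separates objectives like this from the reverse-Life objective.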
>somewhat analogous to predicting that a coin flip will land on 'heads' if it landed on heads at the last flip
I'm no statistician either, but this is not analogous at all.
I just mean that, given the known complexity and nature of the Game of Life, trying to guess things based on an objective function when going back in time seems no more efficient than trying to guess coin flips based on prior flips. There's probably some logic theorem (sorry, I don't know it) that describes why objective functions can't help solve the backwards Game of Life problem.
It's a complexity problem, like asking which butterfly "caused" a tornado or which initial clumping of rocks "caused" a black hole, because there are infinitely many solutions. How many initial GoL boards are there that result in a particular final state? On an unbounded grid, I'd say any final state has infinitely many possible starting states that can lead up to it.
Albeit, using a different stochastic technique
(Not saying the goal was working well and being fast to implement.)
In the section on "Automatic differentiation" he considers the question of differentiable computation graphs.