An AI board that pre-registers its bets – bet #1 just graded wrong (github.com)

9 points by dilushin 4 hours ago | 3 comments

panarky 3 hours ago
You'll get some hostility around here for all the slop-text, but the board of personas with anti-cheat public attestation seems like the beginning of a useful forecasting tool.
I tried building something similar to make 72-hour predictions about the US war on Iran, but found that the persona subagents were far too naive. They took official statements and media reports at face value and failed to read between the lines or apply principles of bounded distrust. They accepted spin and wartime propaganda and didn't give enough weight to underlying incentives. They didn't learn from their mistakes in earlier rounds, and they didn't downgrade their trust in sources whose statements were repeatedly proven false.
It must be possible to improve accuracy with memory, system prompts, progressively changing subagent weights based on historical performance, etc.
I found it helpful to allow the personas to talk to each other. A pure weight of 20% each for 5 personas that are blind to the arguments of the others didn't work as well as personas that modified their rationales after reading the output of the others.
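One way to picture the "changing subagent weights based on historical performance" idea is a minimal sketch like the one below. It is only an illustration, not what either project does: the persona names, the sample history, and the choice of inverse mean Brier score as the weighting rule are all assumptions.

```python
# Hypothetical sketch: weight personas by past accuracy instead of a flat 20% each.
# Persona names, history values, and the inverse-Brier rule are assumptions.

def brier(prob: float, outcome: int) -> float:
    """Squared error of a probability forecast against a 0/1 outcome."""
    return (prob - outcome) ** 2


def performance_weights(history: dict[str, list[tuple[float, int]]]) -> dict[str, float]:
    """Weight each persona by inverse mean Brier score, normalised to sum to 1.

    Assumes every persona has at least one resolved forecast.
    """
    eps = 1e-6  # avoids division by zero for a perfect record
    raw = {}
    for name, records in history.items():
        mean_brier = sum(brier(p, o) for p, o in records) / len(records)
        raw[name] = 1.0 / (mean_brier + eps)
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def aggregate(forecasts: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the personas' current probability forecasts."""
    return sum(weights[name] * p for name, p in forecasts.items())


# (forecast probability, resolved outcome) pairs from earlier rounds
history = {
    "hawk": [(0.9, 1), (0.8, 1), (0.7, 0)],
    "skeptic": [(0.6, 1), (0.4, 0), (0.5, 0)],
    "optimist": [(0.9, 0), (0.8, 0), (0.9, 1)],
}
weights = performance_weights(history)
combined = aggregate({"hawk": 0.7, "skeptic": 0.4, "optimist": 0.9}, weights)
```

Here the persona with the worst record ("optimist") ends up with the smallest weight, so its confident 0.9 pulls the combined forecast less than it would under equal weighting.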
After each prediction resolves, I would have each persona write a post-mortem analysis of what it got right and wrong. Maybe visibility into prior post-mortems, both its own and those of others on the board, could allow each persona to recognize historical cognitive biases and recalibrate for the next prediction.
Presumably the hosted version will have a leaderboard of some sort. Each board might not be able to cheat, but if users can cheaply create many sockpuppet boards, you'll see the Baltimore Stockbroker Scam emerge on the leaderboard. If the public attestation is to be meaningful, it must be difficult and expensive to create new boards.
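The arithmetic behind the sockpuppet concern can be sketched in a few lines. This assumes each board makes independent binary calls that are right with probability 0.5 (pure coin flips); the function names are hypothetical.

```python
# Hypothetical sketch of the Baltimore Stockbroker effect on a leaderboard.
# Assumes independent binary calls, each correct with probability p_correct.

def expected_perfect_boards(n_boards: int, n_calls: int, p_correct: float = 0.5) -> float:
    """Expected number of boards with a flawless record after n_calls."""
    return n_boards * p_correct ** n_calls


def prob_at_least_one_perfect(n_boards: int, n_calls: int, p_correct: float = 0.5) -> float:
    """Probability that at least one board is flawless purely by chance."""
    p_perfect = p_correct ** n_calls
    return 1.0 - (1.0 - p_perfect) ** n_boards


expected = expected_perfect_boards(1024, 10)   # 1.0
chance = prob_at_least_one_perfect(1024, 10)   # roughly 0.63
```

With 1,024 throwaway boards and 10 resolved bets, one board is expected to show a perfect 10-for-10 record with no skill at all, which is why the cost of creating a board matters.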
- dilushin an hour ago
  You are right, and sorry about that. I am new here and have already learned my lesson )
  I built this as an advisory board for myself and very quickly realized I could use it for predictions.
  Politics and war (sports and weather as well) have too much noise and more unknowns than knowns to the public, so they are very hard to predict. Fed numbers (and stocks), on the other hand, can be compared and verified.
  I was already thinking about post-mortems after each round (a self-improvement loop), thanks for sharpening the idea!
  Cheating is impossible, because the numbers are written into git before the real-world answer comes out, so they can't be faked.
  Regarding the "Baltimore Stockbroker Scam", you are right, and thanks for pointing it out! The current release is a local OSS library that uses your own API keys; no hosted leaderboard exists yet. In the future, though, I can prevent it with one real account per person and rate-limiting board creation. In addition, a minimum track record and transparency (records can't be deleted or hidden) will do the job.
  @panarky thanks for the insights! Any more grilling is welcome )
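The "written before the answer comes out" idea can be illustrated with a small hash commitment. This is a sketch under stated assumptions, not the project's actual mechanism: the record fields, the nonce, and the `commit`/`verify` helpers are hypothetical. It also assumes the digest is published somewhere with an independent timestamp, since git commit dates are set by the committer.

```python
# Hypothetical commit-and-reveal sketch for a pre-registered prediction.
# Field names and helper functions are illustrative assumptions.
import hashlib
import hmac
import json
import secrets


def commit(prediction: dict, nonce: str) -> str:
    """Return a SHA-256 digest over a canonical JSON encoding of the prediction."""
    payload = json.dumps(
        {"prediction": prediction, "nonce": nonce},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(prediction: dict, nonce: str, digest: str) -> bool:
    """Check a revealed prediction and nonce against the published digest."""
    return hmac.compare_digest(commit(prediction, nonce), digest)


# Before the event: publish only the digest.
prediction = {"question": "example rate decision", "answer": "hold", "confidence": 0.7}
nonce = secrets.token_hex(16)  # random salt so the answer cannot be guessed from the digest
digest = commit(prediction, nonce)

# After the event: reveal prediction and nonce; anyone can re-check the digest.
honest = verify(prediction, nonce, digest)
tampered = verify({**prediction, "answer": "cut"}, nonce, digest)
```

Changing the answer after the fact changes the digest, so an edited prediction no longer matches what was published.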