3 points by latentframe 8 hours ago | 2 comments
  • DoctorOetker 7 hours ago
    Do any inference providers allow conditional evaluation of prompts? I.e., a user could request: when the price per token (which fluctuates in real life) falls below a certain level, run this prompt.

    Some people need instantaneous replies; others would prefer deeper thinking for a more profound reply, even if it means waiting a day. They could queue a series of prompts for projects A, B, C, etc., then read the reply for project A and ask follow-ups A2, A3, ..., then process the replies for project B with follow-ups B2, B3, ..., and so on. By the time the day has passed they have gone through the different projects, and the next day brings the answers to the previous day's questions. Waiting can often be perfectly fine if it lowers the cost, and the wait can be amortized by working on different projects in parallel.
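    The client-side half of this idea can be sketched without any provider support: hold prompts in a priority queue keyed by each prompt's maximum acceptable price per token, and dispatch whichever ones the current price allows. This is a hypothetical sketch, not any real provider's API; the price values and prompt strings are made up for illustration.

    ```python
    import heapq

    class DeferredPromptQueue:
        """Holds prompts, each with a maximum price-per-token the user
        will pay; dispatches those whose threshold is at or above the
        current market price."""

        def __init__(self):
            self._heap = []  # entries: (-max_price, insertion_order, prompt)
            self._counter = 0

        def submit(self, prompt, max_price):
            # Negate the price so the most price-tolerant prompts
            # (highest thresholds) sit at the top of the min-heap.
            heapq.heappush(self._heap, (-max_price, self._counter, prompt))
            self._counter += 1

        def poll(self, current_price):
            """Pop and return every prompt willing to run at current_price."""
            ready = []
            while self._heap and -self._heap[0][0] >= current_price:
                _, _, prompt = heapq.heappop(self._heap)
                ready.append(prompt)
            return ready

    queue = DeferredPromptQueue()
    queue.submit("Deep analysis for project A", max_price=0.002)
    queue.submit("Quick question", max_price=0.010)

    # Price is high: only the price-tolerant prompt runs.
    print(queue.poll(current_price=0.005))  # ['Quick question']
    # Price drops overnight: the deferred prompt runs.
    print(queue.poll(current_price=0.001))  # ['Deep analysis for project A']
    ```

    A real version would replace `poll` with a scheduler watching the provider's pricing feed, but the queue discipline is the same: each prompt waits until the market comes down to its threshold.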

  • latentframe 8 hours ago
    A shift already seems to be happening: scaling AI used to be about better models and more compute, but now it's increasingly about power, cooling, and physical limits.