When calling the automation, you need to provide three main parameters:
1. Title (title): A brief descriptive name for the automation. This helps identify it at a glance. For example, "Check for recent news headlines".
2. Prompt (prompt): The detailed instruction or request you want the automation to follow. For example: "Search for the top 10 headlines from multiple sources, ensuring they are published within the last 48 hours, and provide a summary of any recent Russian military strikes in the Lviv Oblast."
3. Schedule (schedule): This uses the iCalendar (iCal) VEVENT format to specify when the automation should run. For example, if you want it to run every day at 8:30 AM, you might provide:
```
BEGIN:VEVENT
RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
END:VEVENT
```
Optionally, you can also include:
• DTSTART (start time): If you have a specific starting point, you can include it. For example:
```
BEGIN:VEVENT
DTSTART:20250115T083000
RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
END:VEVENT
```
In summary, the call typically includes:
• title (string): A short name.
• prompt (string): What you want the automation to do.
• schedule (string): The iCal VEVENT defining when it should run.
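Putting the three parameters together, a call might be shaped like this (a sketch in Python; the field names follow the summary above, but the exact API surface is an assumption):

```python
# Hypothetical payload for creating an automation; field names mirror the
# title/prompt/schedule parameters described above, not a documented API.
payload = {
    "title": "Check for recent news headlines",
    "prompt": (
        "Search for the top 10 headlines from multiple sources, "
        "ensuring they are published within the last 48 hours."
    ),
    # The schedule is a raw iCal VEVENT string: run daily at 08:30.
    "schedule": "\n".join([
        "BEGIN:VEVENT",
        "RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0",
        "END:VEVENT",
    ]),
}
```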
- Assumed UTC instead of EST. Corrected it and it still continued to bork
- Added random time deltas to the times I asked for (+2, -10 min).
- A couple of notifications didn't go off at all.
- The one that did go off didn't provide a push notification.
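For what it's worth, the iCal spec does let you pin a timezone explicitly rather than relying on the model's UTC/EST guess. A sketch (assuming the scheduler honors standard TZID parameters, which the behavior above suggests it may not):

```
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250115T083000
RRULE:FREQ=DAILY
END:VEVENT
```

With a zoned DTSTART, the RRULE recurs at 8:30 AM Eastern regardless of what the server's local clock is set to.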
---
On top of that, it was only usable without search mode. In search mode, it was totally confused and gave me a Forbes article.
Seems half baked to me.
Doing scheduled research behind the scenes or sending a push notification to my phone would be cool, but I'm surprised they thought this was OK for a public beta.
Anthropic is ahead in this because they keep their UIs simple, so the failure modes are also simple (bad connection).
OpenAI is just pushing half baked stuff to prod and moving on (GPTs, Canvas).
Find it hilarious and sad that o1-pro just times out thinking on very long or image-heavy chats. You need to reload the page multiple times after it fails to reply, and maybe the answer will appear (or not? Or in 5 minutes?). Kinda shows they're not testing enough and "not eating their own dog food", and it feels like the ChatGPT 3.5 UI before the redesign.
What's funny is that OpenAI's Canvas was their attempt to copy Anthropic's Artifacts! So it's not like Anthropic is stagnant and OpenAI is at least shipping, Anthropic is shipping and OpenAI can't even copy them right.
Right now, in fact, my understanding is OpenAI is using their current LLM's to write the next generation ones which will far surpass anything a developer can currently do. Obviously we'll need to keep management around to tell these things what to do, but the days of being a paid software engineer are numbered.
That’s the only way I get it to have a halfway decent brain after a web search. Something about that mode makes it more like a PR drone version of whatever I asked it to search, repeating things verbatim even when I ask for more specifics in follow-up.
The same company that touts their super hyper advanced AI tool that can do everyone's (except the C-level's, apparently) jobs to the world can't figure out how to make a functional cron job happen? And we're giving them a pass, despite the bajillions of dollars that M$ and VC is funneling their way?
Quite interesting they wouldn't just throw the "proven to be AGI cause it passes some IQ tests sometimes" tooling at it and be done with it.
But wouldn't a company like OpenAI use a tick-based system in this architecture? i.e. there's an event emitter that ticks every second (or maybe minute), and consumers that operate based on these events in realtime? Obviously things get complicated due to the time consumed by inference models, but if OpenAI knows the task upfront it could make an allowance for the inference time?
If the logic is event driven and deterministic, it's easy to test and debug, right?
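Right, and a tick-based design is straightforward to test precisely because the clock is just a parameter. A minimal sketch (names are mine, not anything OpenAI has described):

```python
import heapq

class TickScheduler:
    """Minimal tick-driven scheduler: an external emitter calls tick() once
    per second (or minute); any job whose due time has passed fires."""

    def __init__(self):
        self._jobs = []  # min-heap of (due_time, seq, callback)
        self._seq = 0    # tie-breaker so callbacks are never compared directly

    def schedule(self, due_time, callback):
        heapq.heappush(self._jobs, (due_time, self._seq, callback))
        self._seq += 1

    def tick(self, now):
        """Fire every job due at or before `now`; return their results."""
        fired = []
        while self._jobs and self._jobs[0][0] <= now:
            _, _, cb = heapq.heappop(self._jobs)
            fired.append(cb())
        return fired
```

Because `now` is passed in rather than read from a wall clock, the whole thing is deterministic under test; inference latency would be handled by scheduling the tick early by an estimated allowance, as the comment above suggests.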
This is also a bad case in terms of queueing theory. Looking at Kingman's formula, the arrival variance is very high (a ton of jobs will run at 00:00 and much fewer at 00:01), and the service time also has pretty high variance. That combo will either require high queue delay variance, low utilization (i.e. over-provisioning), or a sophisticated auto-scaler that aggressively starts and stops instances to anticipate the schedule. Most of the time it's ok to let jobs queue since most use cases don't care if a daily or weekly job is 5 minutes late.
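To make the trade-off concrete, here's Kingman's approximation as a tiny Python helper (the numbers are illustrative, not OpenAI's actual parameters):

```python
def kingman_wait(utilization, ca2, cs2, mean_service_time):
    """Kingman's G/G/1 heavy-traffic approximation for mean time in queue:

        E[W] ~ (rho / (1 - rho)) * ((Ca^2 + Cs^2) / 2) * tau

    where rho is utilization, Ca^2 and Cs^2 are the squared coefficients of
    variation of inter-arrival and service times, and tau is mean service time.
    """
    assert 0.0 < utilization < 1.0, "formula only applies below saturation"
    return (utilization / (1.0 - utilization)) * ((ca2 + cs2) / 2.0) * mean_service_time

# Everyone scheduling on the hour makes arrivals extremely bursty (high Ca^2):
bursty = kingman_wait(0.8, ca2=9.0, cs2=1.0, mean_service_time=30.0)  # ~600 s
smooth = kingman_wait(0.8, ca2=1.0, cs2=1.0, mean_service_time=30.0)  # ~120 s
```

At the same 80% utilization, the bursty arrival pattern multiplies the expected queue wait fivefold, which is why you either over-provision or let jobs run a few minutes late.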
Makes me wonder if they internally have "press releases / Q" as an internal metric to keep up the hype.
Some people can't even wrap their heads around it, taking hours and hours of discussion. Still my favourite problem though.
Developers after more research: "Oh... this is a political problem."
This is too much of a dev feature for apple to implement and there are probably third party apps that do this, but meh
Apple has not innovated in years, and a GPT Phone, where your lock screen is a FaceTime-call-like UI/UX with your AI Agent who does everything for you, would give Apple a run for its money! Pick up your phone & see your agent waiting to assist, & it could be skinned to look like a deceased loved one (mom still guiding you through life).
To get things done it would interface with other AI Agents of businesses, companies, your doctor, friends & family to schedule things, & be used as a knowledge base.
Maybe this is their step towards creating said agents?
I just… don’t want this. I don’t think anyone I know wants this.
I use ChatGPT now for almost everything, and when in the car I have full back-n-forth conversations to get things done (or use it as a knowledge base) there too. Recently I was discussing with it how to properly get rid of (junk) my old car in Pennsylvania. It provided all the steps and gave me local businesses. Though it didn't call them or interface with them to find their available times/costs, tell me such details, & have me instruct it to schedule my preferred choice. Which I wish it did, and it prompted thoughts about how it could do so, as the technology that gets adopted is mostly tech that simplifies our lives.
I think my concept above is similar to what was seen in the movie Her (Joaquin Phoenix & Scarlett Johansson starred), so it's not that crazy or odd. Throwing in skinning it to be whoever, like a deceased loved one, might be (probably is).
I switched over to the "GPT4o with scheduled tasks" model and there were no UI hints as to how I might use the feature. So I asked it "what you can you follow up later on and how?"
It replied "Could you clarify what specifically you’d like me to follow up on later?"
This is a truly awful way to launch a new product.
OpenAI doesn't, they just have a ton of funding and (up to recently) a good mass media story, and the best natural language responses.
The moat around Siri is much deeper, and I don't really see any evidence OpenAI has any special sauce that can't be reproduced by others.
My prediction is that OpenAI's reliance on AI doomerism to generate a regulatory moat falters as they become unable to produce step changes in new models, while Apple's efforts despite being halting and incomplete become ubiquitous thanks to market share and access to on device context.
I wouldn't (and don't) put my money in OpenAI anymore. I don't see a future for them beyond being the first mover in an "LLM as a service" space in which they have no moat. On top of that they've managed to absorb the worst of criticism as a sort of LLM lightning rod. Worst of all, it may turn out that off-device isn't even really necessary for most consumer applications in which case they'll start to have to make most of their money on corporate contracts.
Maybe something will change, but right now OpenAI is looking like a bubble company with no guarantee to its dominant position. Because it is what it is: simply the largest pooling of money to try to corner this market. What else do they have?
Most people use Gmail, Docs, Google Maps, and Google Calendar over Apple's alternatives. Gemini could really tie them together well.
Remember Google Sheets (already the Tonka Toys of spreadsheets) adding named tables to Sheets?
You can't use them in any of the AppsScript APIs. You have to fall back to manual searching for strings and index arithmetic.
Google Drive still barely supports anything like moving an entire folder to another folder.
They have failed at least a half dozen times now to deliver a functional chat/VOIP app after they already had one in Google Talk.
They regularly sunset products that actually have devoted and zealous user bases for indiscernible reasons.
Android is just chugging along doing nothing interesting and still carrying the same baggage it did before. It's a painful platform to develop for and the Jetpack Compose/Kotlin shift hasn't ameliorated much of that at all.
Their search offering is now worse than Bing, worse than Kagi, and worse than some of the LLM based tools that use their index. It's increasingly common that you can't even find a single link that you know an entire verbatim sentence from via Google search for inexplicable reasons. Exact keyword or phrase searches no longer work. You can't require a keyword in results.
I don't trust Google to deliver a single functional software product at this point, let alone a compelling integration of many different ones developed in different siloes.
About the only thing going for them is how many people still have Gmail accounts from that initial invite only and generous limits campaign... 20 years ago?
Google is not a healthy company. I don't invest in them anymore, and barring some major change I probably won't again. It's a dying blue chip which is a terrible position to have your money in.
P.S. oh, and Gemini is awful by comparison in both price and quality to competitors. It isn't saving them. It's just a "me too".
P.P.S. I'm personally just waiting for their next "game changing" announcement bound to fail to get in at the top floor on shorting what stock I have. It's one of those cases where finance has rose coloured glasses based on brand name that anyone who's used Google products for years would be thoroughly disabused of.
For example, I found myself asking Claude about places to see in a city I'm visiting while switching back and forth to gmaps. This would have been a much better experience integrated directly with gmaps' knowledge graph.
Then there are some UI hints.
"Remind me of your mom's birthday on [X] date"
Wow, really maximising that $10bn GPU investment!
More seriously, todo apps are about productivity, not just about becoming a huge bucket of tasks. I've always found that the productivity comes from getting context out of my head and scheduled for the right time. This release appears to be more about that big bag of tasks and less about productivity. I'm all for AI in products, I think it can be powerful, but I've not had a use-case for it in my todo app.
no.
"a todo app that you can interact with by writing natural language input?"
okay.
> nuance
really?!
As for nuance, I've seen an astounding amount of divergent context incorporated into LLM responses. Not always, but far more than I've ever been able to encode into a parsing script, which handles exactly nothing it wasn't explicitly programmed for.
Edit: I suppose they'll be here at some point: https://help.openai.com/en/articles/9624314-model-release-no...
These seem like extremely shitty release notes. I have no clue why anybody pays for this model.
Recently someone shared a link to one of their chat sessions here, and it reliably 404'd for me but not others.
It only scheduled the first thing, and that was after having to be specific by saying "7:30pm-11pm". I wanted to say "from now to 11pm" but it couldn't process "now".
https://x.com/karinanguyen_/status/1879270529066262733 https://x.com/OpenAI/status/1879267276291203329
We support just about every other job platform but I’d love to hear from potential users before I hack something together.
I got the best results by not enabling Search the Web when I was trying to create tasks. It confuses the model. But scheduled tasks can successfully search the web.
It's flaky, but looks promising!
Resource URL: https://cdn.oaistatic.com/assets/jbl0aowda306m4s1.js
Source Map URL: jbl0aowda306m4s1.js.map
Also I am getting `Unable to display this message due to an error.` a lot.
Me:
> Give me positive feedback every hour
ChatGPT:
> Provide positive feedback
> Next run Jan 15, 2025
> Got it! I’ll send you positive feedback every hour.
An hour later, I received the following email:
```
Your scheduled task couldn't be completed
ChatGPT tried to complete Provide positive feedback multiple times, but it encountered an error and wasn't able to send. It will try again the next time this task is scheduled.
Open chat If you have any questions, please contact through the help center.
All the best, ChatGPT
```
We already have many implementations where, at a cron interval, one can call the GPT APIs for stuff. And it's nice to monitor it and see how things are working, etc.
So I am curious what the use case is for embedding a schedule inside the ChatGPT infrastructure. Seems a little off from its true purpose?
I saw no mention of them in the help article, or the UI.
If I ask for a daily early-morning news summary, will it show up in the middle of the night or around lunch time? Will it get updated when I travel? Seems interesting if what you're looking for is a reminder that is not time-relevant, just a thing that should happen at some point with a time precision of about a day.
https://help.openai.com/en/articles/10291617-scheduled-tasks...
Many existing apps (like Todoist) have already had LLM integrations for a while now, and have more features like calendars and syncing.
Or do I completely not understand what this product is trying to be?
I even have an automated x account @alarmsglobal
Otherwise, you'll have a lot of systems dependent on these orchestrators creating hard-to-debug mistakes up and down the pipeline. With software, you can reach a state where it does what you tell it to without having to worry if some model adjustment or API change is going to break the output.
If they solve that, then yes. Otherwise, what I personally expect is a lot of businesses rushing into implementing "agents" only to backpedal later when they start to have negative material effects on bottom lines.
People want us to be at "Her" levels of AI, but we're at a far earlier stage. We can fake certain aspects of that (using TTS), but blindly trusting an AI to run everything is going to be a big mistake in the short-term. And in order for the inevitability of what you describe to take place, the predecessor(s) to that have to work in a way that doesn't scare people and businesses away.
The plowing of money and hype into the current forms of AI (not to mention the gaslighting about their ability) makes me think the real inevitability is a meltdown in the next 5-10 years which leads to AI-hesitancy on a mass scale.
The problem with your "close to the metal" assertion is that this has been parroted about every iteration of LLMs thus far. They've certainly gotten better (impressively so), but again, it doesn't matter. By their very nature (whether today or ten years from now), they're a big risk at the business level which is ultimately where the rubber has to hit the road.
At the moment people are so wooed by the confidence of current LLMs that they forget that there's all sorts of types of AI models. I think the key is going to be to have them work together, each doing the part they're good at.
This is where reasoning models come in. Train models on many logical statements then give them enough time to produce a chain of thoughts that’s indistinguishable from “understanding” and “thinking”.
I’m not sure why this leap is so hard for some people to make.
So obviously completely full of shit.
> If you can find a way to make the context window on the scale of the human brain, you may be able to mostly mitigate this.
Human brains have a much smaller context window than AIs do. We can't pay attention to the last 128,000 concepts that filtered past our sensory systems; our conscious considerations cover about seven things.
There's a lot of stuff that we don't yet understand well enough to reproduce with AI, but context length is the wrong criticism for these models.
You're right. What I'm getting at is the overall speed, efficiency, and accuracy of the storage, retrieval, and processing capability of the human brain.
I think it is, yes.
It was interviewed under that name on one of the UK's main news broadcasts almost immediately after it came out. Few hundred million users. Anecdotes about teachers whose students use it to cheat.
But who knows. I was surprising people about the existence of Wikipedia as late as 2004, and Google Translate's augmented reality mode some time around the start of the pandemic.
Then you have to recreate all of the services on top of the AWS.
Then you have to deal with regulations and certifications.
Then you have to convince decision makers to go against their own interests. “No one ever got fired for Amazon”.
Then you have to convince corporations to spend money to migrate.
As have tens of thousands of other people who could invert a btree on the whiteboard…
Maybe try some humility. You're not helping yourself with the bragging about frankly underwhelming and common (here) experience.