I had to make this same decision years ago: do I focus on Ruby or do I bring Sidekiq to other languages? What I realized is that I couldn't be an expert in every language: Sidekiq.js, Sidekiq.py, etc. I decided to go a different direction and built Faktory[0] instead, which flips the architecture and provides a central server that knows how to implement the queue lifecycle internally. The language-specific clients become much simpler and can be maintained by the open source community for each language, e.g. faktory-rs[1]. The drawback is that Faktory isn't focused on any one community, and it's hard for me to provide idiomatic examples in a given language.
It's a different direction, but by focusing on a single community you may have better outcomes; time will tell!
[0]: https://github.com/contribsys/faktory [1]: https://github.com/jonhoo/faktory-rs
Sidekiq is pretty bare bones compared to what Oban supports with workflows, crons, partitioning, dependent jobs, failure handling, and so forth.
https://github.com/sidekiq/sidekiq/blob/ba8b8fc8d81ac8f57a55...
Additionally, delayed_job came before resque.
This is such a key feature. Lots of people will tell you that you shouldn't use a relational database as a worker queue, but they inevitably miss out on how important transactions are for this - it's really useful to be able to say "queue this work if the transaction commits, don't queue it if it fails".
Brandur Leach wrote a fantastic piece on this a few years ago: https://brandur.org/job-drain - describing how, even if you have a separate queue system, you should still feed it by logging queue tasks to a temporary database table that can be updated as part of those transactions.
This looks like a good definition too: https://www.milanjovanovic.tech/blog/outbox-pattern-for-reli...
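To make that concrete, a minimal sketch of transactional enqueueing against a plain Postgres "jobs" table that workers poll, using psycopg 3 (the table, columns, and connection string are illustrative, not any particular library's API):

    import json
    import psycopg

    def place_order_and_enqueue_email(conn: psycopg.Connection, user_id: int) -> None:
        with conn.transaction():  # both inserts commit or roll back together
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO orders (user_id) VALUES (%s) RETURNING id",
                    (user_id,),
                )
                order_id = cur.fetchone()[0]
                # Workers never see this job row if the surrounding transaction aborts.
                cur.execute(
                    "INSERT INTO jobs (queue, payload) VALUES (%s, %s)",
                    ("mailer", json.dumps({"order_id": order_id})),
                )

    with psycopg.connect("postgresql://localhost/app") as conn:
        place_order_and_enqueue_email(conn, user_id=42)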
For $135/month on Oban Pro, they advertise:
All Open Source Features
Multi-Process Execution
Workflows
Global and Rate Limiting
Unique Jobs
Bulk Operations
Encrypted Source (30/90-day refresh)
1 Application
Dedicated Support
I'm going to toot my own horn here, because it's what I know, but take my 100% free Chancy for example - https://github.com/tktech/chancy. Out of the box the same workers can mix and match asyncio, processes, threads, and sub-interpreters. It supports workflows, rate limiting, unique jobs, bulk operations, transactional enqueuing, etc. Why not move these things to the OSS version to be competitive with existing options, and focus on dedicated support and more traditional "enterprise" features, which absolutely are worth $135/month (the Oban devs provide world-class support for issues)? There are many more options available in the Python ecosystem than in Elixir, so you're competing against Temporal, Trigger, Prefect, Dagster, Airflow, etc.

We may well move some of those things to the OSS version, depending on interest, usage, etc. It's much easier to make things free than the other way around. Some Pro-only features in Elixir have moved to OSS previously, and as a result of this project some additional functionality will also be moved.
Support-only options aren't going to cut it in our experience, but maybe that'll be different with Python.
> There are many more options available in the Python ecosystem than Elixir, so you're competing against Temporal, Trigger, Prefect, Dagster, Airflow, etc etc.
There's a lot more of everything available in the Python ecosystem =)
That's totally fair, and I can only speak from the sidelines. I haven't had a chance to review the architecture - would it possibly make sense to swap the free feature from async to the process pool, and make async a Pro feature? This would help with adoption from other OSS projects, if that's a goal, since the transition from Celery would then be moving from a process pool to a process pool (for most users). The vast, vast majority of Python libraries are not async-friendly and most still rely on the GIL. On the other hand, Celery has absolutely no asyncio support at all, which sets the Pro feature apart.
On the other hand, it's already released, and as you said, it's much harder to take a free feature and make it paid.
Thanks again for Oban - I used it for a project in Elixir and it was painless. Missing Oban was why I made Chancy in the first place.
That's great advice. Wish we'd been in contact before =)
There's a difference between QoL features and reliability functions; to me, at least, that means that I can't justify trying to adopt it in my OSS projects. It's too bad, too, because this looks otherwise fantastic.
This feels like core functionality is locked away, and the open source part is nothing more than shareware, or a demo/learning version.
Edit: I looked into it a bit more, and it seems we can launch multiple worker nodes, which doesn't seem as bad as what I originally thought
Single-threaded asyncio execution - concurrent but not truly parallel, so CPU-bound jobs block the event loop.
This makes it not even worth trying. Celery's interface kind of sucks, but I'm used to it already, and I can scale it vertically and horizontally for as long as I can afford the resources.
I also don't particularly like asyncio, and if I'm using a job queue I wouldn't expect to need it.
Edit: I looked into it a bit more, and it seems we can launch multiple worker nodes, which doesn't seem as bad as what I originally thought
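For context on the quoted limitation: a CPU-bound job really will stall a single-threaded asyncio worker, and the usual workaround is to push that work onto a process pool. A stock-asyncio sketch, nothing here is specific to this project:

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    def crunch(n: int) -> int:
        # CPU-bound work: awaited inline on the event loop, this would block everything.
        return sum(i * i for i in range(n))

    async def heartbeat() -> None:
        for _ in range(5):
            print("worker still responsive")
            await asyncio.sleep(0.5)

    async def main() -> None:
        loop = asyncio.get_running_loop()
        with ProcessPoolExecutor() as pool:
            # Offloading to another process keeps the loop free for other jobs.
            result, _ = await asyncio.gather(
                loop.run_in_executor(pool, crunch, 10_000_000),
                heartbeat(),
            )
        print(result)

    if __name__ == "__main__":
        asyncio.run(main())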
I'd never heard of Oban until now; the one we'd considered was Temporal, but that feels like so much more than we need. I like how light Oban is.
Does anyone have experience with both and is able to give a quick comparison?
Thanks!
Temporal - if you have strict workflow requirements, want _guarantees_ that things complete, and are willing to take on extra complexity to achieve that. If you're a bank or something, probably a great choice.
Oban - a DB-backed worker queue that processes tasks off-thread. It does not give you the guarantees that Temporal can, because it has not abstracted every push/pull into a first-class citizen. While it offers some similar features via workflows, you'll be hardening it yourself to reach multiple 9s of reliability (based on my experience with Celery + Sidekiq).
Based on my heavy experience with both, I'd be happy to have both available to me in a system I'm working on. At my current job we are forced to use Temporal for all background processing, which for small tasks is just a lot of boilerplate.
We don't hate Celery at all. It's just a bit harder to get it to do certain things, and it requires a bit more coding and understanding of Celery than we want to invest time and effort in.
Again, no hate towards Celery. It's not bad. We just want to see if there are better options out there.
1. DBOS built durable workflows and queues on top of Postgres (disclaimer: I'm a co-founder of DBOS), with some recent discussions here: https://news.ycombinator.com/item?id=44840693
2. Absurd explores a related design as well: https://news.ycombinator.com/item?id=45797228
Overall, it's encouraging to see more people converging on a database-centric approach to durable workflows instead of external orchestrators. There's still a lot of open design space around determinism, recovery semantics, and DX. I'm happy to learn from others experimenting here.
Having an entirely open source offering and selling support would be an absolute dream. Maybe we'll get there too.
The differences are in the implementation and DX: the programming abstraction, how easy recovery/debugging is, and how it behaves once you're running a production cluster.
One thing that bit us early was versioning. In practice, you always end up with different workers running different code versions (rolling deploys, hotfixes, etc.). We spent a lot of time there and now support both workflow versioning and patching, so old executions can replay deterministically while still letting you evolve the code.
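For anyone unfamiliar with the pattern, here's a hypothetical, self-contained sketch of the patching idea (not DBOS's or Oban's actual API): the first execution records which branch it took, and replays consult that record, so runs started on old code keep the old behavior.

    class WorkflowContext:
        def __init__(self, recorded=None):
            # Branch decisions persisted from the first execution; empty for a fresh run.
            self.recorded = dict(recorded or {})

        def patched(self, marker: str) -> bool:
            # Fresh runs take the new path and persist that choice; replays reuse it.
            if marker not in self.recorded:
                self.recorded[marker] = True
            return self.recorded[marker]

    def compute_tax(ctx: WorkflowContext, amount: float) -> float:
        if ctx.patched("use-new-tax-engine"):
            return amount * 0.21  # new code path
        return amount * 0.20      # old executions deterministically replay this branch

    fresh = WorkflowContext()
    print(compute_tax(fresh, 100.0))   # 21.0; the decision is recorded for future replays
    old = WorkflowContext({"use-new-tax-engine": False})
    print(compute_tax(old, 100.0))     # 20.0; behavior of runs started on old code is preserved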
Curious how Oban handles versioning today?
Personally I've migrated 3 apps _from_ DB-backed job queues _to_ Redis/other-backed systems with great success.
Regarding overall throughput, we've written about running one million jobs a minute [1] on a single queue, and there are numerous companies running hundreds of millions of jobs a day with Oban/Postgres.
[1]: https://oban.pro/articles/one-million-jobs-a-minute-with-oba...
For your first point - I would counter that a lot of data about my systems lives outside of the primary database. There is however an argument for adding a dependency, and for testing complexities. These are by and large solved problems at the scale I work with (not huge, not tiny).
I think both approaches work and I honestly just appreciate you guys holding Celery to task ;)
Besides, a DB has a higher likelihood of failing you once you reach certain throughputs.
So the non-paid version really can't be used for production unless you know for sure you'll have very short jobs?
(I've paid for it for years despite not needing any of the pro features)
What does it have over Celery?
You can also easily spawn as many processes running the CLI as you like to get multi-core parallelism - see the sketch below. It's just a smidge more overhead than the process pool backend in Pro.
Also, not an expert on Celery.
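In practice that can be as simple as launching several copies of the worker command; the CLI invocation below is a stand-in, not the project's documented interface:

    import subprocess

    # Assumed worker command; substitute whatever your queue's CLI actually is.
    WORKER_CMD = ["python", "-m", "myqueue.worker"]

    # One OS process per core sidesteps the GIL for CPU-bound jobs.
    workers = [subprocess.Popen(WORKER_CMD) for _ in range(4)]
    for w in workers:
        w.wait()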
I also use Celery when a user kicks off a process by clicking a button and then watches the progress bar in the GUI. One process might have 50 tasks, or one really long task.