153 points by dimamik 6 hours ago | 22 comments
  • mperham 5 hours ago
    I wrote Sidekiq, which Oban is based on. Congratulations to Shannon and Parker on shipping this!

    I had to make this same decision years ago: do I focus on Ruby or do I bring Sidekiq to other languages? What I realized is that I couldn't be an expert in every language, Sidekiq.js, Sidekiq.py, etc. I decided to go a different direction and built Faktory[0] instead, which flips the architecture and provides a central server which knows how to implement the queue lifecycle internally. The language-specific clients become much simpler and can be maintained by the open source community for each language, e.g. faktory-rs[1]. The drawback is that Faktory is not focused on any one community and it's hard for me to provide idiomatic examples in a given language.

    It's a different direction, but by focusing on a single community you may have better outcomes. Time will tell!

    [0]: https://github.com/contribsys/faktory
    [1]: https://github.com/jonhoo/faktory-rs

    • sorenone 4 hours ago
      Thanks Mike! You are an inspiration. Parker and I have different strengths both in life and language. We're committed to what this interop brings to both Python and Elixir.
    • ai_critic 4 hours ago
      "based on" is sorta a stretch here.

      Sidekiq is pretty bare bones compared to what Oban supports with workflows, crons, partitioning, dependent jobs, failure handling, and so forth.

      • mperham 4 hours ago
        By “based on” I don’t mean a shared codebase or features but rather Parker and I exchanged emails a decade ago to discuss business models and open source funding. He initially copied my Sidekiq OSS + Sidekiq Pro business model, with my blessing.
        • sorentwo 3 hours ago
          This is absolutely true (except we went OSS + Web initially, Pro came later). You were an inspiration, always helpful in discussion, and definitely paved the way for this business model.
        • ai_critic 3 hours ago
          Thank you for the clarification!
        • sorenone 3 hours ago
          You got the beer. We got the pen. ;)
    • semiquaver 5 hours ago
      Isn’t it more accurate to say that they are both based on Resque?
    • enraged_camel 5 hours ago
      Maybe you didn’t intend it this way, but your comment comes across as an attempt to co-opt the discussion to pitch your own thing. This is generally looked down upon here.
      • BowBun 5 hours ago
        Knowing Mike and his work over the years, that is not the case. He is a man of integrity who owns a cornerstone product in the Ruby world. He is specifically the type of person I want to hear from when folks release new software having to do with background jobs, since he has 15 years of experience building this exact thing.
      • mperham 3 hours ago
        It was an off-the-cuff comment and probably not worded ideally but the intent was to discuss how Oban is branching off into a new direction for their business based on language-specific products while I went a different direction with Faktory. Since I came to the exact same fork in the road in 2017, I thought it was relevant and an interesting topic on evolving software products.
  • simonw 5 hours ago
    > Oban allows you to insert and process jobs using only your database. You can insert the job to send a confirmation email in the same database transaction where you create the user. If one thing fails, everything is rolled back.

    This is such a key feature. Lots of people will tell you that you shouldn't use a relational database as a worker queue, but they inevitably miss out on how important transactions are for this - it's really useful to be able to say "queue this work if the transaction commits, don't queue it if it fails".

    Brandur Leach wrote a fantastic piece on this a few years ago: https://brandur.org/job-drain - describing how, even if you have a separate queue system, you should still feed it by logging queue tasks to a temporary database table that can be updated as part of those transactions.
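    As a rough illustration of why that matters, here's a sketch using Python's sqlite3 standing in for Postgres, with a hypothetical users/jobs schema (the table and column names are illustrative, not Oban's actual schema):

```python
import sqlite3

# Hypothetical schema: users and the job queue share one database,
# with SQLite standing in for Postgres so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, worker TEXT, args TEXT);
""")

def signup(conn, email):
    # Enqueue the confirmation email and create the user in ONE
    # transaction: if either insert fails, both are rolled back.
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO jobs (worker, args) VALUES (?, ?)",
            ("send_confirmation_email", email),
        )
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))

signup(conn, "a@example.com")

# A duplicate signup violates the UNIQUE constraint on email; the job
# enqueued earlier in the same transaction is rolled back with it, so
# no orphaned confirmation email is ever queued.
try:
    signup(conn, "a@example.com")
except sqlite3.IntegrityError:
    pass

users = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
jobs = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

    The same shape works against Postgres with any driver: the job insert is just another row in the application's transaction.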

  • TkTech 4 hours ago
    The Oban folks have done amazing, well-engineered work for years now - it's really the only option for Elixir. That said, I'm very confused at locking the process pool behind a pro subscription - this is basic functionality given CPython's architecture, not a nice-to-have.

    For $135/month on Oban Pro, they advertise:

        All Open Source Features
        Multi-Process Execution
        Workflows
        Global and Rate Limiting
        Unique Jobs
        Bulk Operations
        Encrypted Source (30/90-day refresh)
        1 Application
        Dedicated Support

    I'm going to toot my own horn here, because it's what I know, but take my 100% free Chancy for example - https://github.com/tktech/chancy. Out of the box the same workers can mix and match asyncio, processes, threads, and sub-interpreters. It supports workflows, rate limiting, unique jobs, bulk operations, transactional enqueuing, etc. Why not move these things to the OSS version to be competitive with existing options, and focus on dedicated support and more traditional "enterprise" features, which absolutely are worth $135/month (the Oban devs provide world-class support for issues)? There are many more options available in the Python ecosystem than Elixir, so you're competing against Temporal, Trigger, Prefect, Dagster, Airflow, etc.
    • sorentwo 4 hours ago
      > It supports workflows, rate limiting, unique jobs, bulk operations, transactional enqueuing, etc. Why not move these things to the OSS version to be competitive with existing options, and focus on dedicated support and more traditional "enterprise" features, which absolutely are worth $135/month (the Oban devs provide world-class support for issues)?

      We may well move some of those things to the OSS version, depending on interest, usage, etc. It's much easier to make things free than the other way around. Some Pro-only features in Elixir have moved to OSS previously, and as a result of this project some additional functionality will also be moved.

      Support-only options aren't going to cut it in our experience, but maybe that'll be different with Python.

      > There are many more options available in the Python ecosystem than Elixir, so you're competing against Temporal, Trigger, Prefect, Dagster, Airflow, etc etc.

      There's a lot more of everything available in the Python ecosystem =)

      • TkTech 3 hours ago
        > Support only options aren't going to cut it in our experience; but maybe that'll be different with Python.

        That's totally fair, and I can only speak from the sidelines. I haven't had a chance to review the architecture, but would it make sense to swap the two, making the process pool the free feature and async the Pro feature? This would help with adoption by other OSS projects, if that's a goal, since for most users the transition from Celery would then be moving from one process pool to another. The vast, vast majority of Python libraries are not async-friendly and most still rely on the GIL. On the other hand, Celery has absolutely no asyncio support at all, which sets the Pro feature apart.

        On the other hand, it's already released, and as you said, it's much harder to take a free feature and make it paid.

        Thanks again for Oban - I used it for a project in Elixir and it was painless. Missing Oban was why I made Chancy in the first place.

        • sorentwo 3 hours ago
          > The vast, vast majority of Python libraries are not async-friendly and most still rely on the GIL. On the other hand, Celery has absolutely no asyncio support at all, which sets the pro feature apart.

          That's great advice. Wish we'd been in contact before =)

  • airocker 8 minutes ago
    We had considered Oban when deciding whether or not to go with Kafka/Debezium. We sided with Kafka because it can do high-throughput ingestion and it is easier to maintain with Cursor in today's world. Postgres is meant for heavy querying, not heavy writes; you can fix that with a lot of care, but then it doesn't scale multi-master very well either. Kafka scales much better for heavy writes.
  • markbao 6 minutes ago
    Is Postgres fast enough for job processing these days? We process hundreds of millions of jobs now, and even years ago, when our volume was a fraction of that, we got a huge performance boost moving from Postgres + Que to Redis + Sidekiq. Has that changed in the intervening years?
  • offbyone 3 hours ago
    Ooof. I don't mind the OSS/pro feature gate for the most part, but I really don't love that "Pro version uses smarter heartbeats to track producer liveness."

    There's a difference between QoL features and reliability functions; to me, at least, that means that I can't justify trying to adopt it in my OSS projects. It's too bad, too, because this looks otherwise fantastic.

    • sorentwo 2 hours ago
      With a typical Redis or RabbitMQ backed durable queue you’re not guaranteed to get the job back at all after an unexpected shutdown. That quote is also a little incorrect—producer liveness is tracked the same way, it’s purely how “orphaned” jobs are rescued that is different.
      • offbyone 2 hours ago
        "jobs that are long-running might get rescued even if the producer is still alive" indicates otherwise. It suggests that jobs that are in progress may be double-scheduled. That's a feature that I think shouldn't be gated behind a monthly pro subscription; my unpaid OSS projects don't justify it.
        • dec0dedab0de 2 hours ago
          Agreed. I try to avoid using anything that has this freemium model of open source, but I let it slide for products that provide enterprise features at a cost.

          This feels like core functionality is locked away, and the open source part is nothing more than shareware, or a demo/learning version.

          Edit: I looked into it a bit more, and it seems we can launch multiple worker nodes, which doesn't seem as bad as I originally thought.

  • dec0dedab0de 2 hours ago
    > OSS Oban has a few limitations, which are automatically lifted in the Pro version:

    > Single-threaded asyncio execution - concurrent but not truly parallel, so CPU-bound jobs block the event loop.

    This makes it not even worth trying. Celery's interface kind of sucks, but I'm used to it already, and I can scale out infinitely, vertically and horizontally, for as long as I can afford the resources.

    I also don't particularly like asyncio, and if I'm using a job queue I wouldn't expect to need it.

    Edit: I looked into it a bit more, and it seems we can launch multiple worker nodes, which doesn't seem as bad as I originally thought.

  • hangonhn 5 hours ago
    This is something my company has been considering for a while. We've been using Celery and it's not great. It gets the job done, but it has its issues.

    I'd never heard of Oban until now, and the one we'd considered was Temporal, but that feels like so much more than what we need. I like how light Oban is.

    Does anyone have experience with both and is able to give a quick comparison?

    Thanks!

    • BowBun 5 hours ago
      Very, very different tools, though they cover similar areas.

      Temporal - if you have strict workflow requirements, want _guarantees_ that things complete, and are willing to take on extra complexity to achieve that. If you're a bank or something, probably a great choice.

      Oban - DB-backed worker queue, which processes tasks off-thread. It does not give you the guarantees that Temporal can, because it has not abstracted every push/pull into a first-class citizen. While it offers some similar features with workflows, to get to multiple 9s of reliability you will be hardening that yourself (based on my experience with Celery + Sidekiq).

      Based on my heavy experience with both, I'd be happy to have both available to me in a system I'm working on. At my current job we are forced to use Temporal for all background processing, which for small tasks is just a lot of boilerplate.

    • owaislone 5 hours ago
      I'm just coming back to web/API development in Python after 7-8 years working on distributed systems in Go. I just built a Django + Celery MVP with what I knew from 2017, but I see a lot of "hate" towards Celery online these days. What issues have you run into with Celery? Has it gotten less reliable? Harder to work with?
      • TkTech 4 hours ago
        Celery + RabbitMQ is hard to beat in the Python ecosystem for scaling. But the vast, vast majority of projects don't need anywhere near that kind of scale and instead just want basic features out of the box - unique tasks, rate limiting, asyncio, future scheduling that doesn't cause massive problems (they're scheduled in-memory on workers), etc. These things are incredibly annoying to implement on top of Celery.
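        For instance, "unique tasks" usually means bolting deduplication on yourself. A minimal sketch of the usual idea - a unique index over a fingerprint of (worker, args) - using a hypothetical jobs table with SQLite standing in for whatever store you use:

```python
import hashlib
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical jobs table: the UNIQUE index on the fingerprint column
# lets the database itself enforce "unique jobs" for us.
conn.execute("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        worker TEXT,
        args TEXT,
        fingerprint TEXT UNIQUE
    )
""")

def enqueue_unique(conn, worker, args):
    """Insert the job unless an identical one is already queued.
    Returns True if a new job was actually enqueued."""
    payload = json.dumps(args, sort_keys=True)
    fp = hashlib.sha256(f"{worker}:{payload}".encode()).hexdigest()
    before = conn.total_changes
    with conn:
        conn.execute(
            # ON CONFLICT ... DO NOTHING works in both SQLite and Postgres.
            "INSERT INTO jobs (worker, args, fingerprint) VALUES (?, ?, ?) "
            "ON CONFLICT(fingerprint) DO NOTHING",
            (worker, payload, fp),
        )
    return conn.total_changes > before

first = enqueue_unique(conn, "send_email", {"to": "a@example.com"})
dupe = enqueue_unique(conn, "send_email", {"to": "a@example.com"})
queued = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

        Real implementations add a uniqueness window and job states to the fingerprint, but the database-enforced constraint is the core of it.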
        • hangonhn 4 hours ago
          Yeah that list right there. That's exactly it.

          We don't hate Celery at all. It's just a bit harder to get it to do certain things, and requires a bit more coding and understanding of Celery than we want to invest time and effort in.

          Again, no hate towards Celery. It's not bad. We just want to see if there are better options out there.

      • alanwreath 4 hours ago
        I like Celery, but I started to try other things when I had projects doing work from languages in addition to Python. I also prefer the code to work without having to think about queues as much as possible. In my case that was Argo Workflows (not to be confused with Argo CD).
  • Arubis 5 hours ago
    While this is a Cool Thing To See, I do wish things would go the other way - that all the BI/ML/DS pipelines and workflows folks are building in Python would come to Elixir (and, as would follow, Oban). I get where the momentum is, but having something functional, fault-tolerant, and concurrent underpinning work that's naturally highly concurrent and error-prone feels like a _much_ more natural fit.
  • tnlogy 2 hours ago
    Looks like a nice API. We have used a similar pattern for years, but with SQLAlchemy and the same kind of SQL statement for getting the next available job. I think it's easier to handle worker queues with just PostgreSQL, rather than keeping some other queue system supported and updated for security fixes etc.
  • qianli_cs 4 hours ago
    Thanks for sharing, interesting project! One thing that stood out to me is that some fairly core features are gated behind a Pro tier. For context, there are prior projects in this space that implement similar ideas fully in OSS, especially around Postgres-backed durable execution:

    1. DBOS built durable workflows and queues on top of Postgres (disclaimer: I'm a co-founder of DBOS), with some recent discussions here: https://news.ycombinator.com/item?id=44840693

    2. Absurd explores a related design as well: https://news.ycombinator.com/item?id=45797228

    Overall, it's encouraging to see more people converging on a database-centric approach to durable workflows instead of external orchestrators. There's still a lot of open design space around determinism, recovery semantics, and DX. I'm happy to learn from others experimenting here.

    • sorentwo 4 hours ago
      There are other projects that implement the ideas in OSS, but that's the same in Elixir. Not that we necessarily invented DAGs/workflows, but our durable implementation on the Elixir side predates DBOS by several years. We've considered it an add-on to what Oban offers, rather than the entire product.

      Having an entirely open source offering and selling support would be an absolute dream. Maybe we'll get there too.

      • qianli_cs 3 hours ago
        That's fair, the idea itself isn't new. Workflows/durable execution have been around forever (same story in Elixir).

        The differences are in the implementation and DX: the programming abstraction, how easy recovery/debugging is, and how it behaves once you're running a production cluster.

        One thing that bit us early was versioning. In practice, you always end up with different workers running different code versions (rolling deploys, hotfixes, etc.). We spent a lot of time there and now support both workflow versioning and patching, so old executions can replay deterministically while still letting you evolve the code.

        Curious how Oban handles versioning today?

  • dfajgljsldkjag 5 hours ago
    I have fixed many broken systems that used Redis for small tasks. It is much better to put the jobs in the database we already have. This makes the code easier to manage and gives us fewer things to worry about. I hope more teams start doing this to save time.
    • BowBun 5 hours ago
      Traditional DBs are a poor fit for high-throughput job systems in my experience. The transactions alone around fetching/updating jobs are non-trivial and can dwarf regular data activity in your system, especially for monoliths, which Python and Ruby apps by and large still are.

      Personally I've migrated 3 apps _from_ DB-backed job queues _to_ Redis/other-backed systems with great success.

      • asa400 an hour ago
        How high of a throughput were you working with? I've used Oban at a few places that had pretty decent throughput and it was OK. Not disagreeing with your approach at all, just trying to get an idea of what kinds of workloads you were running to compare.
        • BowBun 36 minutes ago
          Millions of jobs a minute
      • sorentwo 3 hours ago
        Transactions around fetching/updating aren't trivial, that's true. However, the work that you're doing _is_ regular activity because it's part of your application logic. That's data about the state of your overall system and it is extremely helpful for it to stay with the app (not to mention how nice it makes testing).

        Regarding overall throughput, we've written about running one million jobs a minute [1] on a single queue, and there are numerous companies running hundreds of millions of jobs a day with Oban/Postgres.

        [1]: https://oban.pro/articles/one-million-jobs-a-minute-with-oba...

        • BowBun 32 minutes ago
          Appreciate the response, I'm learning some new things about the modern listening mechanisms for DBs which unlock more than I believed was possible.

          For your first point - I would counter that a lot of data about my systems lives outside of the primary database. There is, however, an argument about adding a dependency, and about testing complexities. These are by and large solved problems at the scale I work with (not huge, not tiny).

          I think both approaches work and I honestly just appreciate you guys holding Celery to task ;)

      • brightball 5 hours ago
        The way that Oban for Elixir and GoodJob for Ruby leverage PostgreSQL allows for very high throughput. It's not something that easily ports to other DBs.
        • BowBun 35 minutes ago
          Appreciate the added context here, this is indeed some special sauce that challenges my prior assumptions!
        • owaislone 4 hours ago
          Interesting. Any docs that explain what/how they do this?
          • TkTech 4 hours ago
            A combination of LISTEN/NOTIFY, for instantaneous reactivity that lets you get away with only periodic polling as a fallback, and FOR UPDATE ... SKIP LOCKED, which makes it efficient and safe for parallel workers to grab tasks without coordination. It's actually covered in the article near the bottom.
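            A sketch of the claim step, with a hypothetical jobs table. SQLite stands in here, so the Postgres-only FOR UPDATE SKIP LOCKED clause appears only in a comment; the runnable part shows the same select-then-guarded-update shape:

```python
import sqlite3

# In Postgres the claim is typically a single locking select, roughly:
#   SELECT id FROM jobs WHERE state = 'available'
#   ORDER BY id LIMIT 1
#   FOR UPDATE SKIP LOCKED  -- skip rows another worker holds locked
# followed by an UPDATE marking the row as executing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany("INSERT INTO jobs (state) VALUES (?)", [("available",)] * 3)

def claim_one(conn):
    """Atomically claim one available job; return its id, or None."""
    with conn:
        row = conn.execute(
            "SELECT id FROM jobs WHERE state = 'available' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The state guard makes the update a no-op if some other worker
        # claimed the job between our SELECT and our UPDATE.
        cur = conn.execute(
            "UPDATE jobs SET state = 'executing' "
            "WHERE id = ? AND state = 'available'",
            (row[0],),
        )
        return row[0] if cur.rowcount == 1 else None

# Three jobs are claimed; the fourth attempt finds the queue empty.
claimed = [claim_one(conn) for _ in range(4)]
```

            With SKIP LOCKED, concurrent workers never block on each other's in-progress claims, which is what makes the Postgres version scale.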
          • brightball 4 hours ago
            GoodJob is a strong attempt. I believe it's based around advisory locks though.

            https://github.com/bensheldon/good_job

    • pawelduda 5 hours ago
      In Rails at least, aside from being used for background processing, Redis gives you more goodies. You can store temporary state for tasks that require coordination between multiple nodes without race conditions, cache things to take some load off your DB, etc.

      Besides, the DB has a higher likelihood of failing you if you reach certain throughputs.

  • shepardrtc 3 hours ago
    > Inaccurate rescues - jobs that are long-running might get rescued even if the producer is still alive. Pro version uses smarter heartbeats to track producer liveness.

    So the non-paid version really can't be used in production unless you know for sure you'll have very short jobs?

    • sorentwo 2 hours ago
      You can have jobs that run as long as you like. The difference is purely in how quickly they are restored after a crash or a shutdown that doesn’t wait long enough.
  • sieep 2 hours ago
    Oban is incredible and this type of software will continue to grow in importance. Kudos!
  • waffletower an hour ago
    No offense to all of the effort referenced here, I understand that there are many computing contexts with different needs. However, I really need to ask: am I the only one who cringes at the notion of a transactional database being a job processing nexus? Deadlocks anyone? Really sounds like asking for serious trouble to me.
  • owaislone 5 hours ago
    I don't know how I feel about a free open source version and then a commercial version that locks features. Something inside me prevents me from even trying such software. Logically, I'd say I support the model, because open source needs to be sustainable and we need good quality developer tools and software, but when it comes to adoption I find myself reaching for purely open source projects. I think it has to do with features locked behind a paywall. I'd be far more open to trying products where the commercial version offered enterprise-level features like compliance reports, FIPS support, professional support, etc., but didn't lock features.
    • sanswork 5 hours ago
      For most of its history the main locked feature was just a premium web interface (there were a few more, but that was the main draw), which is included in free now, and I think the locked features are primarily around more specialised job ordering engines - things that, if you need free, you almost certainly don't need. Oban has been very good about deciding which features to lock away.

      (I've paid for it for years despite not needing any of the pro features)

  • tinyhouse 4 hours ago
    How is this different than Celery and the like?
  • nodesocket 2 hours ago
    Is there a web UI to view jobs, statuses, queue length, etc.?
  • deeviant 4 hours ago
    I can't imagine why you would want a job processing framework limited to a single thread, which makes this seem like a paid-version-only product.

    What does it have over Celery?

    • sorenone 3 hours ago
      The vast majority of tasks you use a job processing framework for are IO-bound side effects: sending emails, interacting with a database, making HTTP calls, etc. Those are hardly impacted by the fact that it's a single thread. It works really well embedded in a small service.
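      The IO-bound case is easy to see with plain asyncio; here's a self-contained sketch where a hypothetical email job is simulated with a sleep standing in for network latency:

```python
import asyncio
import time

# A simulated IO-bound job: think sending an email or calling an HTTP
# API, where the worker spends almost all its time waiting on the network.
async def send_email(recipient: str) -> str:
    await asyncio.sleep(0.1)  # stands in for network latency
    return f"sent:{recipient}"

async def run_batch(n: int):
    # One thread, one event loop: while any job awaits IO, the loop
    # runs the others, so the batch overlaps instead of serializing.
    start = time.monotonic()
    results = await asyncio.gather(
        *(send_email(f"user{i}@example.com") for i in range(n))
    )
    return results, time.monotonic() - start

results, elapsed = asyncio.run(run_batch(10))
# Ten 0.1s jobs overlap on a single thread, so the wall time is close
# to 0.1s rather than 1s; CPU-bound jobs get no such benefit.
```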

      You can also easily spawn as many processes running the CLI as you like to get multi-core parallelism. It's just a smidge more overhead than the process pool backend in Pro.

      Also, not an expert on Celery.

      • dec0dedab0de 2 hours ago
        I use Celery when I need to launch thousands of similar jobs in a batch across any number of available machines, each running multiple processes with multiple threads.

        I also use Celery when a user has kicked off a process by clicking a button and is watching the progress bar in the GUI. One process might have 50 tasks, or one really long task.

        Edit: I looked into it a bit more, and it seems we can launch multiple worker nodes, which doesn't seem as bad as I originally thought.

  • sergiotapia 3 hours ago
    Python dudes are in for a treat: Oban is one of the most beautiful, elegant parts of working with Elixir/Phoenix. The Oban team has saved me so much heartache and tears over the years.
  • cpursley 3 hours ago
    Oban is cool, but I really like the idea of pgflow.dev, which is based on the pgmq (Rust) Postgres extension doing the heavy lifting. That makes it language agnostic (all the important parts live in Postgres). I've started an Elixir adapter, which really is just a DSL and a poller; you could do the same in Python, etc.

    https://github.com/agoodway/pgflow
