199 points | by ko_pivot | 2 days ago | 25 comments
  • wqtz2 days ago
    Databricks acquired bit.io and subsequently shut it down quite fast. Afaik bit.io had a very small team, and the founder was a serial entrepreneur who was never going to stick around, and he didn't. I am not sure who from bit.io is still around at Databricks.

    If I am guessing right, Motherduck will likely be acquired by GCP, because most of the founding team was ex-BigQuery. Snowflake purchased Modin, and Polars is still too immature to be acquisition-ready. So what does that leave us with? There is also EDB, who is competing in the enterprise Postgres space.

    Folks I know in the industry are not very happy with Databricks. Databricks themselves were hinting that they would potentially be acquired by Azure, as Azure tries to compete in the data warehouse space. But everyone became an AI company, which left Databricks in an awkward spot. Their bdev team is not the best from my limited interactions with them (lots of Starbucks drinkers and "let me get back to you after a 3-month PTO"), so they do not know who should lead them to an AI pivot, or how. With cash to burn from overinvestment and the Snowflake/Databricks conferences coming up fast, they needed a big announcement, and this is that big announcement.

    Should have sobered up before writing this though. But who cares.

    • mritchie712a day ago
      The "datalake" is becoming a bit of a commodity. It's getting pretty easy to spin one up yourself[0] using completely open source components.

      Databricks and Microsoft (thru Fabric) are trying to build a complete data platform, i.e. ELT + datalake + BI

      My bet with Definite (https://www.definite.app/) has been that this is too hairy for a large company to do well, and that we can do it better.

      0 - https://www.definite.app/blog/cloud-iceberg-duckdb-aws

    • arccya day ago
      starbucks drinkers is certainly a new way to describe people, though i'm not sure what image that's supposed to evoke
      • ethbr1a day ago
        From context in parent, I'm reading as the sort of person who looks more competent than they are and skates from job to job quickly enough that no one notices.
        • joshuanapolia day ago
          Maybe they mean the kind of biz dev that uses small bribes (a free drink at Starbucks) to help get customers to take their call.
          • tomroda day ago
            Of all the images I imagined, it's not this one.

            BDev can be good or bad. Bad ones tend not to follow up, and Starbucks here represents poor decision-making skills (reinforced by going on PTO for three months and not following up on commitments).

      • bluecheese452a day ago
        Thought the same. I mean I don’t drink it because I can make my own far cheaper but I don’t look on with scorn at those who do. It says a lot more about the person making the judgment than those who drink the coffee.
    • > Folks I know in the industry are not very happy with databricks

      Yeah, big companies gobbling up everything does not lead to a healthy ecosystem. Congrats to the founders on the acquisition, but everyone else loses with moves like this.

      I'm still sour after their Redash purchase that instantly "killed" the open source version. The Tabular acquisition was also a bit controversial, since one of the founders is the PMC Chair for Iceberg, which "competes" directly with Databricks' own Delta Lake. The mere presence of these giants (mostly Databricks and Snowflake) makes the whole data ecosystem (both closed and open source) really hostile.

    • sys13a day ago
      Very unlikely that Databricks would be acquired by Azure. So much of their business is on AWS, and they are invested in by AWS/Azure/GCP.
    • AlexeyBelov 12 hours ago
      Starbucks drinkers? What do you mean?
    • >Should have sobered up before writing this though. But who cares.

      In vino veritas, and all that; we appreciate your honesty!

  • betteryeta day ago
    Neon is a great product because they are run by Postgres enthusiasts. They have decent customer-friendly pricing, real serverless HTTP endpoints, and they're always on the latest version of Postgres as soon as it is stable. From what I can tell, no other provider has this positioning, driven by dedication.

    I really hope they can maintain this dedication after acquisition, but Databricks will probably push them into enterprise and it will lose the spark. I wish Cloudflare bought them instead.

  • newfocogi2 days ago
    They offer serverless Postgres. Here's a link if anyone else needs it https://neon.tech/
    • gopalv2 days ago
      An OLTP solution fixes a lot of the headaches with the traditional extract-load-transform steps.

      Mostly, OLAP starts when the data lands in Kafka logs or on a disk of some sort.

      Then you schedule a task, or keep a task polling constantly, which is always prone to small failures and delays, or to big failures when the schema changes.

      The "data pipeline" team exists because the data doesn't move by itself from where it is first stored to where it is ready for deep analysis.

      If you can directly push 1-row updates transactionally to a system, and feed off the backend to write a more OLAP-friendly structure, then you can hook up something like a car rental service's operational logs to a system that can compute more complex things, like forecasting availability or applying discounts to give a customer a cheap upgrade.

      Neon looks a lot better in tech than YugabyteDB (which also speaks the Postgres protocol), and a lot nicer in protocol compatibility than something like FoundationDB.

      AlloyDB from Google feels somewhat similar, and Spanner has a Postgres interface too.

      The Postgres API is a great abstraction common point, even if the actual details of the implementations vary a lot.
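      A toy sketch of that push-then-feed-off pattern, using sqlite3 in-memory databases as stand-ins for the OLTP and OLAP sides (the schema and all names are invented for illustration; a real pipeline would stream changes rather than replay the full log):

```python
import sqlite3

# Two in-memory databases: one standing in for the OLTP side,
# one for the OLAP-friendly store. Both are hypothetical.
oltp = sqlite3.connect(":memory:")
olap = sqlite3.connect(":memory:")

oltp.execute("CREATE TABLE rentals (car_id TEXT, action TEXT, ts INTEGER)")
olap.execute("CREATE TABLE availability (car_id TEXT PRIMARY KEY, out_count INTEGER)")

def record_event(car_id, action, ts):
    # Transactional 1-row write on the OLTP side.
    with oltp:
        oltp.execute("INSERT INTO rentals VALUES (?, ?, ?)", (car_id, action, ts))

def sync_to_olap():
    # "Feed off the backend": fold the event log into a shape that is
    # cheap to query for availability forecasting. (Full replay keeps
    # the sketch short; a real consumer would track an offset.)
    for car_id, action in oltp.execute("SELECT car_id, action FROM rentals"):
        delta = 1 if action == "checkout" else -1
        olap.execute(
            "INSERT INTO availability VALUES (?, ?) "
            "ON CONFLICT(car_id) DO UPDATE SET out_count = out_count + ?",
            (car_id, delta, delta),
        )

record_event("car-1", "checkout", 100)
record_event("car-1", "return", 200)
record_event("car-2", "checkout", 300)
sync_to_olap()

print(dict(olap.execute("SELECT * FROM availability")))  # {'car-1': 0, 'car-2': 1}
```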

  • jmull2 days ago
    Wow, $1B.

    I've been bullish on neon for a while -- the idea hits exactly the right spot, IMO, and their execution looks good in my limited experience.

    But I mean that from a technical perspective. I never have any real idea about the business -- do they have an edge that makes people want to start paying them money and keep paying them money? Heck if I know.

    I guess that's going to be Databricks' problem now (maybe).

    • xyst2 days ago
      Actual revenue is irrelevant. This is a business decision to corner the market.
    • brapa day ago
      I'm sorry but what is "the idea"? Managed postgres?

      It seems like execution >>> idea in this case

      • joshstrangea day ago
        Neon goes further than just "managed postgres". I would say one of their big features is just how fast and easily you can spin up new dbs/clusters. It's completely possible (encouraged) to spin up 1 DB per tenant, and potentially spin up and tear down 1000s of databases.

        It opens up some interesting ideas/concepts when creating an isolated DB is just as easy as creating a new db table.
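        A sketch of what DB-per-tenant routing can look like when creating a database is that cheap (sqlite3 in-memory DBs standing in for per-tenant instances; all names invented):

```python
import sqlite3

class TenantRouter:
    # Hypothetical router: each tenant gets its own isolated database.
    def __init__(self):
        self._dbs = {}

    def db_for(self, tenant_id):
        # Creating an isolated DB on demand, about as cheap as a table.
        if tenant_id not in self._dbs:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE notes (body TEXT)")
            self._dbs[tenant_id] = conn
        return self._dbs[tenant_id]

    def drop(self, tenant_id):
        # Tear-down is just closing and forgetting the tenant's database.
        self._dbs.pop(tenant_id).close()

router = TenantRouter()
router.db_for("acme").execute("INSERT INTO notes VALUES ('hello')")
router.db_for("globex")  # a second, fully isolated tenant

# Data never crosses tenants: globex sees none of acme's rows.
print(router.db_for("globex").execute("SELECT COUNT(*) FROM notes").fetchone()[0])  # 0
router.drop("globex")
```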

      • jmulla day ago
        More specifically, the idea is "serverless" postgres.

        But as I mentioned, I mean from a tech standpoint... If you're interested, they've posted various things about how the tech works.

        > It seems like execution >>> idea in this case

        I don't know what >>> means here, so possibly I completely agree or perhaps completely disagree.

        • __sa day ago
          >>> means "way better than"
  • impulser_a day ago
    These serverless Postgres databases are all so overhyped. I have tried all of them, and they are all much slower than just deploying a managed database in the same datacenter as your application.

    I have an application deployed on Railway with a Postgres database, and the user's latency is consistently 150ms. The same application deployed on these serverless/edge providers is anywhere between 300-400ms, with random spikes to 800ms. Same application, same data, same query.

    Edge and serverless have to be the biggest scam in the cloud industry right now.

    They aren't faster, and they aren't cheaper. You could argue they are easier to scale, but that's not the case anymore since everyone provides autoscaling now.
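    The rough arithmetic behind numbers like these: page time grows linearly with round-trip latency times the number of sequential queries. A toy calculation (all numbers illustrative, not measurements):

```python
# Back-of-the-envelope: an app that feels fine at ~2ms to a
# same-datacenter DB degrades fast once each query crosses a WAN link.
def page_latency_ms(round_trip_ms, sequential_queries, compute_ms=20):
    # compute_ms is a made-up constant for app-side work per page.
    return compute_ms + round_trip_ms * sequential_queries

same_dc = page_latency_ms(round_trip_ms=2, sequential_queries=8)
cross_region = page_latency_ms(round_trip_ms=45, sequential_queries=8)
print(same_dc, cross_region)  # 36 380
```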

    • cpursleya day ago
      Whatever. I was able to set up Neon Postgres in 5 mins. It’s still crazy fast with my Fly services, has replication out of the box and backups. Much easier than AWS and from what I can tell, getting something going with Railway. And I don’t have to worry about operating it. My time is valuable.
      • mbreesea day ago
        All of that can be true. What I wonder is — if that all is true — how much of a moat is there around that? It seems like the secret sauce in that company isn’t some custom technology, it’s execution. Execution can be replicated by another competent team. Or is there some other secret sauce that I can’t see?
        • vladich 13 hours ago
          It's the team: they have a few Postgres committers and major contributors, and there are not that many of those people. But that's a bit precarious; the team may leave after the acquisition for many reasons.
        • Execution is some of the hardest secret sauce of all
          • mbreesea day ago
            I completely agree... in my comment, the word "competent" was doing a lot of heavy lifting.

            And it invites comparisons to the comments about Dropbox/rsync, etc...

            But, I personally think the Neon concept of branching databases with CoW storage is quite interesting. That, combined with cost-management with autoscaling does seem like at least a serviceable moat.

      • impulser_a day ago
        These are features of any managed database service.

        DigitalOcean, Railway, Render, and so on all offer the exact same features, except it's just pure Postgres and you can deploy them in the same data center as your application.

        • cpursleya day ago
          Neither Render nor DO offers logical replication, and they are missing some other features.
      • myflash13a day ago
        400ms added latency is really bad for user experience. Do a few queries and you're going to need to add caching. Now you're spending your precious developer time managing cache invalidation in lots of places instead of just setting up your database properly in the beginning.
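        A minimal sketch of the bookkeeping this comment warns about: a read-through cache where every write site must remember to invalidate (plain dicts stand in for the app's cache and database; all names invented):

```python
cache = {}

def get_user(user_id, fetch):
    # Read-through: only hit the slow round trip on a cache miss.
    if user_id not in cache:
        cache[user_id] = fetch(user_id)
    return cache[user_id]

def update_user(user_id, value, write):
    write(user_id, value)
    cache.pop(user_id, None)  # the invalidation you now must remember
                              # at every write site

db = {"u1": "Ada"}
print(get_user("u1", db.get))   # Ada   (miss, hits the "db")
db["u1"] = "Grace"              # a write that forgot to invalidate...
print(get_user("u1", db.get))   # Ada   (...so the cache is stale)
update_user("u1", "Grace", db.__setitem__)
print(get_user("u1", db.get))   # Grace
```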
        • cpursleya day ago
          • myflash13 14 hours ago
            I understand there are ways to deal with the problem of latency in serverless, but this is a problem I'd rather not deal with in the first place. The database IS the application, and I would not want to sacrifice speed of the database for anything. Serverless is totally not worth the trade-off for me: slightly more convenient deployments, for much higher latency to the database.

            I'm a solo dev that has been installing and running my own database server with backups for decades and have never had a problem with it. It's so simple, and I have no idea why people are so allergic to managing their own server. 99% of apps can run very snappily on a single server, and the simplicity is a breath of fresh air.

            • pdimitar 5 hours ago
              That's why I'm working hard on bringing tightly integrated SQLite support to the Elixir ecosystem (via a Rust FFI bridge): in my professional experience, not many applications need something as hardcore and amazing as PostgreSQL; at least 80% of all apps I ever witnessed would be just fine with an embedded database.

              I share similar experiences like yours and others in this thread, and to me all those operational concerns grow into unnecessary noise that distracts from the real problems that we are paid to solve.

        • tristan957a day ago
          Are you referring to cold start latencies?
          • myflash13 14 hours ago
            Not just cold start (another problem you have to worry about with serverless). There's the simple fact that network latency outside of the same datacenter is ALWAYS slow and randomly unpredictable, especially if you have to run multiple queries just to render a single page to your user. A database should always be over LAN in my opinion, if you need to access data over the internet, at that point it should be over an API/HTTP, not internal database access.
    • atombender 11 hours ago
      Isn't this an apples-to-oranges comparison?

      Neon's multi-region support isn't directly comparable to a single Postgres database in a single data center. You can set up Neon in a single data center, too, and I would expect the same performance in that case.

      Meanwhile, if you tried to scale your single-Postgres to a multi-region setup, you'd expect higher latencies relative to the location of your data.

    • mritchie712a day ago
      supabase lured me in with built-in oauth, real-time, and some nice client side features in their JS lib, but I do worry about the latency sometimes.

      It'd be a lot of work to run an apples to apples test with a Google Cloud Postgres db vs. Supabase and see what the difference is.

    • myflash13a day ago
      Even managed databases are a scam. You can easily get 10x cheaper pricing for the same workload, by, wait for it, installing Postgres yourself on a baremetal machine. Plus you get much better performance, no noisy neighbors, and ability to actually control and measure low level performance. I never got the hype for serverless. Why are people so allergic to setting up a server? It takes a few hours a year of investment, and the performance benefits are huge.
      • tristan957a day ago
        > Even managed databases are a scam

        Just because you don't derive value out of something doesn't mean it is a scam.

  • forgetfulness2 days ago
    What is the lowdown on Databricks? Their bread and butter was hosted Spark and notebooks. As tasks done in Spark over a data lake began to be delegated wholesale to columnar-store ELT, they tried to pivot to "lakehouses", and then I sort of lost track of them after I got out of Spark myself.

    Did Delta Lake ever catch on? Where are they going now?

    • richardw2 days ago
      Capture enterprise AI enthusiasm by providing a 1-stop shop for data and AI, optionally hosted on your own cloud tenant. Keep deploying functionality so clients never need another supplier. Partner with SAP, OpenAI, anyone who holds market share. Buy anyone that either helps growth or might help a competitor grow.

      Enterprise view: delegate AI environment to Databricks unless you’re a real player. Market is too chaotic, so rely on them to keep your innovation pipeline fed. Focus on building your own core data and AI within their environment. Nobody got fired for choosing Databricks.

      • jimbokun2 days ago
        Can someone translate this to non-CEO speak?
        • baggiponte2 days ago
          You basically pay Databricks a "fee" to choose the more appropriate and modern stack for you to build on, and to keep it up to date. Never used it, but it handles lots of the administrative bs (compliance, SLAs, idk) for you so you can just ship.
        • janderson2152 days ago
          [flagged]
      • forgetfulness2 days ago
        That does sound, as you allude, like IBM on its long downward spiral of gobbling up products to stay relevant and touting them as an integral solution, while in-house development stuck to keeping legacy products alive for their enterprise contracts. I wonder if they'll be foolish enough to start doing consulting around them, obliterating their economies of scale in the process; so far they are going with the "consulting partners" approach.

        Oh well. Databricks notebooks were hella cool back when companies were willing to spend lavishly on having engineers write cloud hosted Scala in the first place, and at premium prices to boot.

      • cactusfrog2 days ago
        A nice UI for a data lake house is underrated. I use AWS Athena at my work and it is just so bad for no good reason. For example, big columns of text are expanded outwards making reading the subsequent columns impossible.
        • senderistaa day ago
          Well UI has never exactly been Amazon's strong suit.
    • mritchie712a day ago
      Delta Lake is not catching on, but no worries, they bought Iceberg[0] (the competing standard).

      I'm joking, but only a bit. Iceberg is open source (Apache), but a lot of the core team and the creator worked at Tabular and Databricks bought them for $1B.

      0 - https://www.definite.app/blog/databricks-tabular-acquisition

    • rogermavis2 days ago
      It provides a central place to store and query data. A big org might have a few hundred databases for various purposes - Databricks lets data engineers set up pipelines to ETL that data into Databricks, and once the data is there it can be queried (using Spark, so there are some downsides - namely a more restrictive SQL variant - but also some advantages - better performance across very large datasets).

      Personally, I hated Databricks; it caused endless pain. Our org has less than 10TB of data, so it's overkill. Good ol' Postgres or SQL Server does just fine on tables of a few hundred GB, and BigQuery chomps through 1TB+ without breaking a sweat.

      Everything in Databricks - everything - is clunky and slow. Booting up clusters can take 15 minutes, whereas something like BigQuery is essentially on-demand and instant. Data ETL'd into Databricks usually differs slightly from its original source in subtle but annoying ways. The IDE (which looks like a Jupyter notebook, but is not) absolutely sucks (limited/unfamiliar keyboard shortcuts, flaky, can only be edited in the browser), and you're out of luck if you want to use your favorite IDE, vim, etc.

      Almost every Databricks feature makes huge concessions on the functionality you'd get if you just used that feature outside of Databricks. For example, Databricks has its own git-like functionality (which covers the 5% of git that gets used most, with no way to do the less common git operations).

      My personal take is that Databricks is fine for users who'd otherwise use their laptop's compute/memory - it gets them an environment where they can access much more, at about 10x the cost of what you'd pay for the underlying infra if you set it up yourself. Ironically, all the Databricks-specific cruft (config files, click-ops) required to get going will probably be difficult for that kind of user anyway, which negates its value.

      For more advanced users (i.e. those that know how to start an ec2 or anything more advanced), databricks will slow you down and be endlessly frustrating. It will basically 2-10x the time it takes to do anything, and sap the joy out of it. I almost quit my job of 12 years because the org moved to databricks. I got permission to use better, faster, cheaper, less clunky, open-source tooling, so I stayed.

      • bokenator2 days ago
        Which open source option did you end up going with? I'm in the same boat and would like to evaluate my options.
        • rogermavis2 days ago
          My stack atm is neovim, python/R, an EC2 and postgres (sometimes SQL Server). Some use of Arrow and DuckDB. For queries on less than a few hundred GB this stack does great. Fast, familiar, and the EC2 is running 24/7, so it's there when I need it, I can easily schedule overnight jobs, and no time is wasted waiting for it to boot.
          • creeksai2 days ago
            You mentioned earlier how long it would take to acquire a new cluster in Databricks, but you are comparing it to something that's always on. In a much larger environment, your setup is not really practical for a lot of people collaborating.

            Note that Databricks SQL Serverless these days can be provisioned in a few seconds.

            • rogermavis2 days ago
              > you are comparing it here to something that's always on

              That's the point. Our org was told Databricks would solve problems we just didn't have. Serverful has some wonderful advantages: simplicity, (ironically) cost (cheaper than something running just 3-4 hours a day but costing 10x), familiarity, reliability. Serverless also has advantages, but only if it runs smoothly, doesn't take an eternity to boot, isn't prohibitively expensive, and has little friction before using it - Databricks meets 0/4 of those criteria, with the additional downside of restrictive SQL due to the Spark backend, adding unnecessary refactoring/complexity to queries.

              > your setup is not really practical to have a lot of people collaborating

              Hard disagree. Our methods are simple and time-tested. We use git to share code (a 100x improvement on Databricks' version of git). We share data in a few ways; the most common are creating a table in a database or putting it in S3. It doesn't have to be a whole lot more complicated.

              • creeksai2 days ago
                I totally understand if Databricks doesn't fit your use cases.

                But you are making a disingenuous comparison here, because one can keep a "serverful" cluster up without shutting it down, and in that case you'd never need to wait for anything to boot. If you shut down your EC2 instance, it will also take time to boot. Alternatively, you can use their (relatively new) serverless offering that gets you compute resources in seconds.

                • rogermavis2 days ago
                  To ensure I'm not speaking incorrectly (as I was going from memory), I grep'ed my several years' of databricks notes. Oh boy.. the memories came flooding back!

                  We had 8 data engineers onboarding the org to Databricks; it took 2 solid years before they got to working on serverless (users had complained about the user-unfriendliness of "nodes", and managers about cost). But then, there were problems. A common pattern through my grep of Slack convos is "I'm having this esoteric error where X doesn't work on serverless Databricks, can you help".. a bunch of back and forth (sometimes over days) and screenshots, followed by "oh, unfortunately, serverless doesn't support X".

                  Another interesting note is someone compared serverless databricks to bigquery, and bigquery was 3x faster without the databricks-specific cruft (all bigquery needs is an authenticated user and a sql query).

                  Databricks isn't useless. It's just a swiss army knife that doesn't do anything well, except sales, and may improve the workflows for the least advanced data analysts/scientists at the expense of everyone else.

                  • This matches my experiences as well. Databricks is great if 1. your data is actually big (processing 10s/100s of terabytes daily), and 2. you don't care about money.
          • thr0wa day ago
            > Fast > ec2

            Are you doing this on EBS? Honest question.

      • walamaking2 days ago
        Dumb question - how is this different from Snowflake?
        • pm902 days ago
          they are competitors and are similar. Snowflake popularized the cloud data warehouse concept (after AWS fumbled it big with Redshift). Databricks is the hot new tool.
        • levanten2 days ago
          They are very similar; with various similar solutions at differing stages of maturity.
    • ajmaa day ago
      when you got out of Spark, what did you go to?
      • forgetfulnessa day ago
        BigQuery ELT, the org I went to was rather immature in their data practice, and I sold them on getting some proper orchestration (Dataform, their preference over DBT, and Airflow), and keeping the architecture coherent.

        I'd have rather stuck with Spark just because I prefer Scala or Python to SQL (and that comes with e.g. being far easier to unit test), but life happened and that ecosystem was getting disrupted anyway.

  • datadrivenangel2 days ago
    Databricks is trying hard to get into serverless, but it seems like they refuse to allow it to actually be cheaper, which defeats the purpose of serverless.
    • viccis2 days ago
      There are so many gotchas. I'm getting so tired of working around it, but my company is all in on serverless so the pain will continue. A lot of it is tied up with Unity Catalog shortcomings, but Serverless and UC are basically joined at the hip.

      A few just off the top of my head:

      * You can't .persist() DataFrames in serverless. Some of my work involves long pipelines that wind up with relatively small DFs at the end of them, but need to do several things with that DF. Nowhere near as easy as just caching it.

      * Handling object storage mounted to Unity Catalog can be a nightmare. If you want to support multiple types of Databricks platforms (AWS, Azure, Google, etc.), then you will have to deal with the fact that you can't mount one type's object storage with another. If you're on Azure Databricks, you can't access S3 via Unity Catalog.

      * There's no API to get metrics like how much memory or CPU was consumed for a given job. If you want to handle monitoring and alerting on it yourself, you're out of luck.

      * For some types of Serverless compute, startup times from cold can be 1 minute or more.

      They're getting better, but Databricks is an endless progression of unpleasant surprises and being told "oh no you can't do it that way", especially compared to Snowflake, whose business Databricks has been working to chew away at for a while. Their Variant type is a great example. It's so much more limited than Snowflake's that I'm still learning new and arbitrary ways in which it's incompatible with Snowflake's implementation.

    • programmertote2 days ago
      I had an interview with a senior data engineering candidate and we were talking about how expensive Databricks can get. :D I set up specific budget alerts in Azure just for Databricks resources in DEV and PROD environments.
    • thrance2 days ago
      I don't think being cheaper is the main value sell of serverless. When I hear "serverless" I think "ease of deployment and automatic scaling".
      • whstla day ago
        Serverless is incredibly cheap for endpoints that don't get called too often, and incredibly expensive for endpoints that do.

        I guess different people just have different experiences.

      • whateveracct2 days ago
        Right but ultimately that's a cost thing, right? Because you can solve those problems through other means and by hiring internally.

        Serverless is meant to obviate some of that. But it is less compelling when the vendor tries to gobble up that margin for themselves.

        • sitkack2 days ago
            You will all be forced to go serverless because new grads can't use the command line. Running a database is about the hardest thing you can do. If it is serverless, you don't need special skills; preventing employees from becoming valuable lowers costs across the board.
          • vhcr2 days ago
            Have you tried being less jaded? Running a database is NOT about the hardest thing you can do.
            • sitkacka day ago
              When running a service, databases are the hardest to run. K8S still doesn't handle them well (this is by design), so they are the first thing to get outsourced to a managed service.

              This is me being less jaded. Support those little wins!

    • avg_dev2 days ago
      hmm, what is a serverless Pg? I don't quite understand. I thought you needed a database server if you wanted to run Pg.
      • mohon2 days ago
        basically they separate compute and storage into different components, whereas traditional PG uses both compute and storage on the same server.

        because of this separation, the compute (e.g. SQL parsing, etc.) can be scaled independently, and the storage can do the same, for example using AWS S3

        so if your SQL query is CPU-heavy, then Neon can just add more "compute" nodes while the "storage" cluster remains the same

        to me, this is similar to the usual microservice setup where you have an API service and a DB. the difference is Neon is purposely running the DB on top of that structure
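        A toy model of that separation (plain Python with invented names, not Neon's actual design): one shared storage layer serving pages, and stateless compute nodes that can be added or removed independently of the data:

```python
class Storage:
    # Stands in for the shared storage layer (S3 / a pageserver).
    def __init__(self):
        self.pages = {}  # page_id -> list of rows

    def read(self, page_id):
        return self.pages.get(page_id, [])

    def write(self, page_id, rows):
        self.pages[page_id] = rows

class Compute:
    # Stateless: holds no data, so nodes can be added or dropped freely.
    def __init__(self, storage):
        self.storage = storage

    def count_where(self, page_id, pred):
        return sum(1 for row in self.storage.read(page_id) if pred(row))

shared = Storage()
shared.write("users", [{"age": 30}, {"age": 41}, {"age": 17}])

# "Scaling compute" is just instantiating more nodes over the same storage.
nodes = [Compute(shared) for _ in range(3)]
print([n.count_where("users", lambda r: r["age"] >= 18) for n in nodes])  # [2, 2, 2]
```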

        • fock2 days ago
          So how is this distributed Postgres still an ACID-compliant database? If you allow multiple nodes to query the same data, isn't this just Trino/an OLAP tool using Postgres syntax? Or did they rebuild Postgres and not upstream anything?
      • kwilletsa day ago
        It's only serverless in the way it commits transactions to cloud storage, making the server instance ephemeral; otherwise it has a server process with compute and in-memory buffer pool almost identical to pg, with the same overheads.
      • LtWorf2 days ago
        Marketing speech.
        • udev40962 days ago
          You shouldn't be getting downvoted. Serverless is nothing more than hype, meant to overcharge you instead of running things on a server you own.
          • vasco2 days ago
            That's a reductionist view of a technical aspect based on the way that aspect is sold. Serverless is VMs that launch and shut off extremely quickly - so much so that they open up new ways of using that compute.

            You can deploy serverless technologies in a self hosted setup and not get "overcharged". Is a system thread bullshit marketing over a system process?

  • thiagoeh2 days ago
    Looks like the acquihire of Bit.io in 2023 wasn't enough to be able to deliver their own OLTP offering

    https://blog.bit.io/whats-next-for-bit-io-joining-databricks... https://www.databricks.com/blog/welcoming-bit-io-databricks-...

    Or it's just a business decision to corner the market, as someone else said

    • timenova2 days ago
      Okay now I am concerned. We're using Neon. We can move easily at this point, but I'm sure they have huge customers storing many terabytes of data where this may be genuinely hard to do.

      I went to Archive.org and figured out that in 2023 they announced they were shutting down on May 30th: all databases were shut down on June 30th, only available for download after that, and deleted on July 30th.

      • joshstrangea day ago
        Same boat here. Not really looking to have to move but I'm incredibly thankful that I never integrated with Neon more than using Postgres. I don't depend on/need their API or other branching features.

        I hate that this is what I've become: I want to try some of the cool features "postgres++" providers offer, but I actively avoid most of them, fearing a potential future migration. I got burned using the Data API on Aurora Serverless and then leaving them and having to rewrite a bunch of code.

    • klabb32 days ago
      They aren’t exactly hiding it. I kept my eye on bit.io because they looked very promising. Next day, gone. Shut down immediately. Something is fucky with the investment pipeline because it’s not ”worth” that much on its own, it’s a market dominance play, bad for innovation..
    • mcmcmc2 days ago
      > Or it's just a business decision to corner the market, as someone else said

      Given how lax antitrust enforcement is, probably this

  • esadeka day ago
    I migrated to Neon from bit.io after Databricks acquired and sunset it. Really hope I won't have to migrate again.
  • clpm4j2 days ago
    I've been seriously considering neon for a new application. This definitely gives me pause... maybe plain ol' Postgres is going to be the winner for me again.
    • jedberg2 days ago
      Why would this give you pause? You just don't want the data to be where Databricks is?

      Either way, there are plenty of other serverless Postgres options out there, Supabase being one of the most popular.

      • MOARDONGZPLZ2 days ago
        Can't speak for anyone but myself, and my experience is anecdotal, having used Databricks: I consider them the Oracle of the modern era. Under no circumstances would I let them get their hooks into any company where I have the power to prevent it.
        • clpm4j2 days ago
          This is exactly how I feel. I do not want to be in the Databricks ecosystem.
        • thor242 days ago
          Why do you think so? The Databricks notebook product I have used at a couple of companies is pretty solid. I haven't done any Google research, but they are generally known as a very talent-dense place to work.
          • sitkack2 days ago
            You and the parent are not talking about the same things.
      • omneity2 days ago
        Supabase, while a great product, does not offer serverless Postgres.
        • jedberg2 days ago
          What would you say they offer then if not serverless Postgres?

          You set up a database, you connect to it, they take care of the rest. It even scales to $0 if you don't use it.

          Is that not serverless Postgres?

          • omneity2 days ago
            Serverless in the context of Postgres means to decouple storage and compute, so you could scale compute "infinitely" without setting up replica servers. This is what Neon offers, where you can just keep hitting their endpoints with your pg client and it should just take whatever load (in principle) and bill you per request.

            Supabase gives you a server that runs classic Postgres in a process. Scaling in this scenario means you increase your server's capacity, with a potential downtime while the upgrade is happening.

            You are confusing _managed_ Postgres for _serverless_.

            Others in the serverless Postgres space:

            - https://www.orioledb.com/ (pg extension)

            - https://www.thenile.dev/ (pg "distribution")

            - https://www.yugabyte.com/ (not emphasizing serverless but their architecture would allow for it)
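
            From the application side, this decoupling mostly shows up as one quirk: after scale-to-zero, the first connection may hit a cold start while compute resumes. A minimal retry sketch in Python; the `connect` callable and the fake driver below are hypothetical stand-ins for a real call like `psycopg.connect(dsn)`, and nothing here is Neon-specific:

```python
import time

def connect_with_retry(connect, attempts=3, base_delay=0.5):
    """Retry a zero-argument connection factory to absorb cold starts."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Demo with a stand-in "driver" that fails once, simulating a resume:
state = {"calls": 0}

def fake_connect():
    state["calls"] += 1
    if state["calls"] < 2:
        raise ConnectionError("compute suspended, resuming")
    return "connection"

print(connect_with_retry(fake_connect, base_delay=0.01))  # connection
```

            With a managed (non-serverless) instance you would size the server instead; with serverless, the client just tolerates the resume latency.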

          • edoceo2 days ago
            That's Postgres on their server.
            • jedberg2 days ago
              Yes, serverless doesn’t mean no servers.

              How is what Supabase offers different from what Neon offers from a user perspective?

              • anilgulecha2 days ago
                Exactly how EC2 is different from Lambda from a user's perspective.
      • greenavocado2 days ago
        > Why would this give you pause?

        After a funding round the value extraction from customers is just over the horizon

    • mdaniel2 days ago
      Lucky you, you still can as it's Apache 2 https://github.com/neondatabase/neon/blob/release-8516/LICEN...

      I haven't studied the CLA situation in order to know if a rug pull is on the table but Tofu and Valkey have shown that where there's a will there's a way

      • senderistaa day ago
        The whole point of a serverless platform is that it's hosted infrastructure. Open source doesn't mean it's feasible to run it yourself.
        • mdaniela day ago
          The whole point to you, but the whole point to me was having scale-to-zero because Aurora Serverless hurp-durp-ed on that. And I deeply enjoy the ability to fix bugs instead of contacting AWS Support with my hat in my hand asking to be put on some corporate backlog for 2073

          Thankfully, you can continue to pay Databricks whatever they ask for the privilege of them hosting it for you

      • ddorian432 days ago
      It's open source, but more like a code dump. IIRC there's no support for the open-source version.
    • vibhork2 days ago
      Try Supabase!
  • yalogin2 days ago
    A tangential question here, will Databricks ever go public? At this point it's a large company making billion dollar acquisitions.

    For someone looking to join the company, I cannot imagine IPO to be a motivation anymore.

    • manquer2 days ago
      At a later stage, a potential IPO is a benefit, not a deterrent. Recruiters and hiring managers will hint that an IPO may not be far off as an incentive to join. It minimizes risk; they do the same for a potential target's founders, like Neon's here.

      This is better than earlier-stage startups: there you get far better multiples, but it is also quite possible you are let go somewhere in the cycle without the money to exercise your options (for tax reasons), and there is a short exercise window on exit.

      For this reason, companies these days offer 5- or 10-year post-departure exercise windows as a more favorable offer.

      ——

      For founders, it gives a shorter window to an exit than going it alone, and for a revenue-light, tech-heavy startup like Neon (compared to Databricks) the valuation risk is reduced, because the stock they get in the acquisition is priced on real revenue and growth, not on early-stage product traction, which is all Neon would have today.

      They also get some cash component, usually enough to cover the core things most founders look at: buying a house in the few-million range, closing out mortgages, or investing in a few early-stage projects directly or through funds.

    • ww5202 days ago
      If they are making money, there is no pressure to raise money from IPO.
    • VirusNewbie2 days ago
      Why does it matter, if you get liquidity events 2-4x per year?
    • kyawzazaw2 days ago
      they can do employee liquidity event
      • yalogin2 days ago
        That is not the same as an IPO right?
        • manquer2 days ago
          No, basically it is a buyback of employee options and stock.

          Many companies raise money only to give liquidity to founders, employees, and some early investors, even if they don't need money for operations at all.

          While Databricks is large, there are much bigger companies that would have IPOed at smaller sizes in the past but are delaying today (and may never go public). Stripe and SpaceX are the biggest examples: both have healthy positive cash flows but don't see the value in going public. If you have no IPO plans, buying back shares and options is the only route to keeping early-stage employees happy.

        • hgontijo2 days ago
          The company offers to purchase employees' pre-IPO shares.
  • markus_zhang2 days ago
    I'm confused. I've seen users leaving Databricks left and right. Two companies I worked for previously got out of it due to cost.

    Do they still have a lot of $$$?

  • joshstrange2 days ago
    Well this isn't great news. I quite enjoy using Neon but I doubt it's going to continue to cater to people like me if it's bought by Databricks (from the little I know about them and from looking at their website).

    Thankfully, I just need "Postgres", I wasn't depending on any other features so I can migrate easily if things start going south.

  • AbstractH2420 hours ago
    Anyone notice a rapid ramp up in acquisitions?

    As though folks are looking for exits but IPO isn’t an option.

    Think we’re approaching a reckoning for lots of companies that raised circa 2021 at valuations that are no longer plausible, and for AI startups.

    Oh, and ones in the first group that tried to rebrand as the second…

  • beoberha2 days ago
    Congrats to the Neon team - they make an awesome product. That’s about all the good I can say here. I don’t blame them for selling out. It’s always felt like a “when” not an “if”. I would be surprised if you can make money selling cloud databases - especially when funded by VCs.
  • 9999000009992 days ago
    Supabase just raised 200 million.

    What’s with all these Postgres hosting services being worth so much now?

    Someone at AWS probably thought about this (easy-to-provision serverless Postgres) and they just didn’t build it.

    I’m still looking for something that can generate types and spit it out in a solid sdk.

    It’s amazing this isn’t a solved problem. A long, long time ago, I was a part of a team trying to sort this out. I’m tempted to hit up my old CEO and ask him what he thinks.

    The company is long gone…

    If anything we tried to do way too much with a fraction of the funding.

    In a hypothetical almost movie like situation I wouldn’t hesitate to rejoin my old colleagues.

    The issue then, as it is today, is that applications need backends. But building backends is boring, tedious, and difficult.

    Maybe a NoSql DB that “understands” the Postgres API?

    • investa2 days ago
      Building backends is easy. It is sort of weird: in 2003 no one would bat an eyelid at building an entire app and chucking it on a server. I guess front-end complexity has made that a specialism, so with all that dev energy drained there's no time for the backend. The backend is substantially easier, though!

      These high-value startups timed it well to capture vibe coding (previously known as building an MVP), front-end culture, and the sheer volume of internet use and developers.

      • 9999000009992 days ago
        It’s harder than signing up for Firebase.

        You have to understand a separate set of concerns. Spin something up on ec2, hook it into a db, configure https , figure out why it went down, etc.

        You’re right though, once I build a complex front end I want someone else to do the backend.

        • jimbokun2 days ago
          You need all that stuff when you need to scale. For an MVP you can get away with very little.
    • cpursleya day ago
      Supabase is not just hosted Postgres; it’s a full(ish) backend stack built on open source components, comparable to something like Firebase. But being Postgres, it encourages sane data modeling (and gives you an escape hatch). Their type generation and SDK are quite good, too. It’s one of my favorite services and powers two projects of mine, soon to be 3.
      • 999900000999a day ago
        I've tried Supabase.

        Their choice of Deno for edge functions is... Well, unique.

        For my current project I have to do a lot of quirky logic, and I kept hitting a brick wall with Supabase.

        I also didn't enjoy the self hosting journey. Not exactly easy.

        • cpursleya day ago
          Haven't used their edge functions yet. What's the issue with Deno (I'm not familiar with it)?

          For the other stuff, what do you find quirky?

          • 9999000009996 hours ago
            Firebase lets you write functions in normal Node.js and Python.

            Supabase only supports Deno. The quirkiness is in my own server-side logic. TBF, I've tried to build this project at least 4 times and I might need to take a step back.

      • cpursleya day ago
        *encourages sane modeling. I can’t type today.
    • zmja day ago
      > Someone at AWS probably thought about this, easy to provision serverless Postgres, and they just didn’t build it.

      AWS is working on this as well: https://aws.amazon.com/blogs/database/introducing-amazon-aur...

      • senderistaa day ago
        DSQL is genuinely serverless (much more so than "Aurora Serverless"), but it's a very long way from vanilla Postgres. Think of it more like a SQL version of DynamoDB.
    • _bohm2 days ago
      "Easy to provision" is mostly a strategic feature for acquiring new users/customers. The more difficult parts of building a database platform are reliability and performance, and it can take a long time to establish a reputation for having these qualities. There's a reason why most large enterprises stick to the hyperscalers for their mission-critical workloads.
      • investa2 days ago
        That reason also includes SOC2, FedRAMP, data at rest jurisdiction, availability zones etc. And if large enough you can negotiate the standard pricing.
        • _bohm2 days ago
          For sure. And oftentimes these less sexy features or certifications are much more cumbersome to implement/acquire than the flashy stuff these startups lead with
    • jimbokun2 days ago
      > Maybe a NoSql DB that “understands” the Postgres API?

      I believe there are several of these already, like Cockroach DB.

    • zamderax2 days ago
      Supabase is particularly valuable for its users. Or right now “vibecoders”
  • crowcroft2 days ago
    If I'm guessing, this is either:

    1. An acquihire (if you're a Neon customer, this would probably be a bad outcome for you).

    2. A growth play. Neon will be positioned as an 'application layer' product, offered cheap to bring SaaS startups into the ecosystem. As those startups grow and need more services, sell them everything else.

    • aurareturn2 days ago
      Who pays $1b for an acquihire?
      • crowcroft5 hours ago
        Character AI is the only one I can think of. Although point taken, there must be more going on than a pure acquihire.
  • chachra2 days ago
    Hope they don't increase the price!!
    • kelnos2 days ago
      I'd be more worried that they'd shut it down...
  • taw12852 days ago
    I am fairly new to all this data pipeline services (Databricks, Snowflakes etc).

    Say right now I have an e-commerce site with 20K MAU. All metrics go to Amplitude, and we can use that to see DAU, retention, and purchase volume. At what point in my startup lifecycle do we need to enlist these services?

    • speakfreely2 days ago
      A non-trivial portion of my consulting work over the past 10 years has been working on data pipelines at various big corporations that move absurdly small amounts of data around using big data tools like spark. I would not worry about purchasing services from Databricks, but I would definitely try to poach their sales people if you can.
      • lizard2 days ago
        Just curious, what would you consider, "absurdly small amounts of data around using big data tools like spark" and what do you recommend instead?

        I recently worked on some data pipelines with Databricks notebooks ala Azure Fabric. I'm currently using ~30% of our capacity and starting to get pushback to run things less frequently to reduce the load.

        I'm not convinced I actually need Fabric here, but the value for me has been that it's the first time the company has been able to provision a platform that can handle the data at all. I have a small portion of it flowing into a database as well, which has drawn constant complaints about volume.

        At this point I can't tell if we just have unrealistic expectations about the costs of having this data that everyone wants, or if our data engineers are just completely out of touch with the current state of the industry, so Fabric is just the cost we have to pay to keep up.

        • speakfreelya day ago
          One financial services company has hundreds of Glue jobs that are using pyspark to read and write less than 4GB of data per run. These jobs run every day.
      • emmelaich2 days ago
        I'm aware of a govt agency with a few hundred gb of data using Mongo, Databricks and were being pushed towards Snowflake as well. Boggles the mind.
      • spratzt2 days ago
        I used to do similar work. Back in the day I used 25 TB as the cut off point for single node design. It’s certainly larger now.
      • jimbokun2 days ago
        Which is also a reason to not use Databricks, as they will cost your company money by selling gullible users things they don’t need.
  • ashvardanian2 days ago
    Of all the billion-scale investment and acquisition news of the last 24 hours, this is the only one that makes sense, especially after the record-breaking $15B round that Databricks closed last year.
  • senderistaa day ago
    AWS just breathed a huge sigh of relief at the neutralization of Aurora's most dangerous competitor.
  • briandeara day ago
    Neon is awesome. I hope Databricks doesn’t brick it.
  • anshumankmr2 days ago
    Great. As someone using Neon, how might this impact me? Price bumps?
    • joshstrangea day ago
      I'd be most concerned with Neon being shut down. That's what Databricks did to bit.io (another serverless Postgres provider they bought).

      I'm really not looking forward to a migration.

  • User232 days ago
    Meanwhile here I am wondering why everyone isn’t using SQLite.
    • jimbokun2 days ago
      If you can serve all your traffic by a single instance running Sqlite in same process as your application, have at it.

      If you need to serve your data across a network to many clients, managing that with SQLite is much trickier.

    • HWR_142 days ago
      I thought SQLite's use case was for a single-user local database.
      • 0x6c6f6c2 days ago
        More like "single process application's database".

        There are interesting use cases for DB-per-user which can be server or client side, or litestream's continuous backup/sync that can extend it beyond this use case a bit too.

        You _can_ use SQLite as your service's sole database, if you vertically scale it up and the load isn't too much. It'll handle a reasonable amount of traffic. Once you hit that ceiling though, you'll have to rethink your architecture, and undergo some kind of migration.

        The common argument for SQLite is deferring complexity of hosting until you've actually reached the type of load you have to use a more complex stack for.
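
        The deferred-complexity argument is easy to try: Python's stdlib sqlite3 is the whole "stack". A minimal sketch; the schema and the WAL pragma are illustrative, not from any particular project:

```python
import sqlite3

# ":memory:" keeps the demo self-contained; a real service would point at a
# file path so the database survives restarts (and tools like Litestream
# can replicate it).
conn = sqlite3.connect(":memory:")

# A no-op for in-memory DBs; on a file-backed DB, WAL mode lets readers
# run concurrently alongside the single writer.
conn.execute("PRAGMA journal_mode=WAL")

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
conn.commit()

rows = conn.execute("SELECT name FROM users ORDER BY id").fetchall()
print(rows)  # [('ada',)]
```

        The migration the parent mentions only becomes necessary once a single machine can no longer serve the write load.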

  • outside12342 days ago
    Ok, can we just. How is Databricks an AI unicorn exactly?
    • ivape2 days ago
      Enterprises have lots of data. They store it somewhere, and there are multiple vendors that provide such "credible" infrastructure for this type of storage. Think of it like, your dad says he's willing to get a dog, but only trusts these-five-animal-shelters and nothing else. That doesn't mean that's correct (that those are the only places to get a dog), it just means that's what he trusts. Databricks is most likely a unicorn because they have successfully sold the idea that they are one of those trusted vendors, like Snowflake.

      The truth of the 2010s up until now is that every startup was a massive sales con job. The wealth of this industry is not truly built on incredible tech, but on the audacity of salesmanship. It's a billion-dollar con job. That's one of the reasons I take every ridiculous startup that launches quite seriously, because you have no idea just how audacious their sales people are. They can sell anything.

      Your question is very fundamental, and the answer is just as raw and fundamental too. I would love it if some of these sales people actually reform and write tell-alls about how they conned so many large companies in their years of working. This content has got to be out there somewhere.

      • woooooo2 days ago
        So, I'm not sure if this is less cynical or more cynical, but.. have you ever talked to the decision-makers who buy something like databricks?

        They can't build it themselves, and it's highly dubious that they'd be able to hire and supervise someone to build it. Databricks may be selling "nothing special", but it's needed, and the buyers can't build it themselves.

        • tibbara day ago
          The thing is, it's actually a very difficult engineering/research/infra problem to run complicated queries on enormous data lakes. All the obvious ways to do it are prohibitively slow and expensive. Every bit of performance you can squeeze out of this, you unlock the ability for people to work with their data more easily. So there is huge value in having some centralized companies sink lots of R&D into trying to solve these problems well.
        • th0ma52 days ago
          Is that how Databricks sees their customers? Yikes
          • fock2 days ago
            I can tell you the company I work at (4000 people, legacy banking IT) has 4 people running our Datalake. We likely have more people buying/"evaluating" Databricks currently (from overhearing calls in open-plan offices), so I guess they have a point. A very sad point...
      • s1artibartfast2 days ago
        My mental model is that there are a few big money-printing industries, and the major players in them will pay just about anything for a slight advantage. It's not really about additive revenue; it's about protecting market share.