If I'm guessing right, MotherDuck will likely be acquired by GCP, since most of the founding team is ex-BigQuery. Snowflake purchased Modin, and Polars is still too immature to be acquisition-ready. So what does that leave us with? There's also EDB, competing in the enterprise Postgres space.
Folks I know in the industry are not very happy with Databricks. Databricks themselves were hinting that they would potentially be acquired by Azure as Azure tries to compete in the data warehouse space. But then everyone became an AI company, which left Databricks in an awkward spot. Their bizdev team is not the best from my limited interactions with them (lots of Starbucks drinkers and "let me get back to you after a 3-month PTO"), so they don't know who should lead them to an AI pivot, or how. With cash to burn from overinvestment and the Snowflake/Databricks confs coming up fast, they needed a big announcement, and this is that big announcement.
Should have sobered up before writing this though. But who cares.
Databricks and Microsoft (through Fabric) are trying to build a complete data platform, i.e. ELT + data lake + BI.
My bet with Definite (https://www.definite.app/) has been this is too hairy for a large company to do well and we can do it better.
BDev can be good or bad. Bad ones tend not to follow up, and Starbucks here stands in for poor decision-making skills (reinforced by going on PTO for three months and not following up on commitments).
Yeah, big companies gobbling up everything does not lead to a healthy ecosystem. Congrats to the founders on the acquisition, but everyone else loses with moves like this.
I'm still sour after their Redash purchase that instantly "killed" the open source version. The Tabular acquisition was also a bit controversial, since one of the founders is the PMC Chair for Iceberg, which "competes" directly with Databricks' own Delta Lake. The mere presence of these giants (mostly Databricks and Snowflake) makes the whole data ecosystem (both closed and open source) really hostile.
In vino veritas, and all that; we appreciate your honesty!
I really hope they can maintain this dedication after acquisition, but Databricks will probably push them into enterprise and it will lose the spark. I wish Cloudflare bought them instead.
Most OLAP work starts when the data lands in Kafka logs or on a disk of some sort.
Then you schedule a task, or keep a task polling constantly, which is prone to small failures and delays, or to big failures when the schema changes.
The "data pipeline" team exists because the data doesn't move by itself from where it is first stored to where it is ready for deep analysis.
If you can directly push 1-row updates transactionally into a system and feed off the backend to write a more OLAP-friendly structure, then you can hook up something like a car rental service's operational logs to a system that can compute more complex things, like forecasting availability or applying discounts to give a customer a cheap upgrade.
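A toy sketch of that pattern, using sqlite3 from the standard library just to keep it self-contained (the table names and the availability rollup are made up for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE rental_events (car_id INT, city TEXT, delta INT)")

def record_event(car_id: int, city: str, delta: int) -> None:
    # The operational side: transactional 1-row writes.
    with con:
        con.execute(
            "INSERT INTO rental_events VALUES (?, ?, ?)",
            (car_id, city, delta),
        )

def refresh_availability() -> dict:
    # The polling job: fold the event log into an OLAP-friendly rollup.
    rows = con.execute(
        "SELECT city, SUM(delta) FROM rental_events GROUP BY city"
    ).fetchall()
    return dict(rows)

record_event(1, "berlin", -1)   # car 1 rented out
record_event(2, "berlin", +1)   # car 2 returned
print(refresh_availability())   # {'berlin': 0}
```

The pain the pipeline team absorbs is everything between `record_event` and `refresh_availability` when they live in different systems with different schemas.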
Neon looks a lot better than Yugabyte in tech (which also speaks the Postgres protocol) and a lot nicer in protocol compatibility than something like FoundationDB.
AlloyDB from Google feels somewhat similar, and Spanner has a Postgres interface too.
The postgres API is a great abstraction common point, even if the actual details of the implementations vary a lot.
I've been bullish on neon for a while -- the idea hits exactly the right spot, IMO, and their execution looks good in my limited experience.
But I mean that from a technical perspective. I never have any real idea about the business -- do they have an edge that makes people want to start paying them money and keep paying them money? Heck if I know.
I guess that's going to be Databricks' problem now (maybe).
It seems like execution >>> idea in this case
It opens up some interesting ideas/concepts when creating an isolated DB is just as easy as creating a new db table.
But as I mentioned, I mean from a tech standpoint... If you're interested, they've posted various things about how the tech works.
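For a feel of what branch-per-anything looks like in practice, here's a hedged sketch of creating a branch through Neon's HTTP API; the v2 endpoint shape and request body are from memory of their docs and may have drifted, and the project ID and token are placeholders:

```python
import requests

API = "https://console.neon.tech/api/v2"
TOKEN = "neon_api_key_here"     # placeholder API key
PROJECT = "my-project-123"      # placeholder project ID

# Create a copy-on-write branch of the project's default branch.
resp = requests.post(
    f"{API}/projects/{PROJECT}/branches",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "branch": {"name": "test-run-42"},
        # Ask for a read-write compute endpoint on the new branch.
        "endpoints": [{"type": "read_write"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["branch"]["id"])
```

One HTTP call per isolated database is what makes "DB per test run" or "DB per preview deploy" plausible.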
> It seems like execution >>> idea in this case
I don't know what >>> means here, so possibly I completely agree, or perhaps completely disagree.
I have an application deployed on Railway with a Postgres database, and the user's latency is consistently 150ms. The same application deployed on these serverless/edge providers is anywhere between 300-400ms, with random spikes to 800ms. The same application, same data, and same query.
Edge and serverless have to be the biggest scam in the cloud industry right now.
They aren't faster, and they aren't cheaper. You could argue they're easier to scale, but that's not the case anymore since everyone provides autoscaling now.
And it begs comparisons to comments about Dropbox/rsync, etc...
But, I personally think the Neon concept of branching databases with CoW storage is quite interesting. That, combined with cost-management with autoscaling does seem like at least a serviceable moat.
DigitalOcean, Railway, Render, and so on all offer the exact same feature except it's just pure Postgres and you can deploy them in the same data center as your application.
https://neon.tech/blog/how-to-minimise-the-impact-of-databas...
I'm a solo dev who has been installing and running my own database server with backups for decades and I have never had a problem with it. It's so simple, and I have no idea why people are so allergic to managing their own server. 99% of apps can run very snappily on a single server, and the simplicity is a breath of fresh air.
I share experiences similar to yours and others' in this thread, and to me all those operational concerns grow into unnecessary noise that distracts from the real problems we are paid to solve.
Neon's multi-region support isn't directly comparable to a single Postgres database in a single data center. You can set up Neon in a single data center, too, and I would expect the same performance in that case.
Meanwhile, if you tried to scale your single-Postgres to a multi-region setup, you'd expect higher latencies relative to the location of your data.
It'd be a lot of work to run an apples to apples test with a Google Cloud Postgres db vs. Supabase and see what the difference is.
Just because you don't derive value out of something doesn't mean it is a scam.
Did Delta Lake ever catch on? Where are they going now?
Enterprise view: delegate AI environment to Databricks unless you’re a real player. Market is too chaotic, so rely on them to keep your innovation pipeline fed. Focus on building your own core data and AI within their environment. Nobody got fired for choosing Databricks.
Oh well. Databricks notebooks were hella cool back when companies were willing to spend lavishly on having engineers write cloud hosted Scala in the first place, and at premium prices to boot.
I'm joking, but only a bit. Iceberg is open source (Apache), but a lot of the core team and the creator worked at Tabular and Databricks bought them for $1B.
0 - https://www.definite.app/blog/databricks-tabular-acquisition
Personally, I hated Databricks; it caused endless pain. Our org has less than 10TB of data, so it's overkill. Good ol' Postgres or SQL Server does just fine on tables of a few hundred GB, and BigQuery chomps through 1TB+ without breaking a sweat.
Everything in Databricks - everything - is clunky and slow. Booting up clusters can take 15 minutes, whereas something like BigQuery is essentially on-demand and instant. Data ETL'd into Databricks usually differs slightly from its original source in subtle but annoying ways. Your IDE (which looks like a Jupyter notebook, but is not) absolutely sucks (limited/unfamiliar keyboard shortcuts, flaky, can only be edited in the browser), and you're out of luck if you want to use your favorite IDE, vim, etc.
Almost every Databricks feature makes huge concessions on the functionality you'd get if you just used that feature outside of Databricks. For example, Databricks has its own git-like functionality (it covers the 5% of git that gets used most, with no way to do the less common git operations).
My personal take is that Databricks is fine for users who'd otherwise use their laptop's compute/memory - it gets them an environment where they can access much more, at about 10x the cost of what you'd pay for the underlying infra if you just set it up yourself. Ironically, all the Databricks-specific cruft (config files, click-ops) required to get going will probably be difficult for that kind of user anyway, which negates the value.
For more advanced users (i.e. those who know how to start an EC2 instance or anything more advanced), Databricks will slow you down and be endlessly frustrating. It will basically 2-10x the time it takes to do anything, and sap the joy out of it. I almost quit my job of 12 years because the org moved to Databricks. I got permission to use better, faster, cheaper, less clunky, open-source tooling, so I stayed.
Note that Databricks SQL Serverless these days can be provisioned in a few seconds.
That's the point. Our org was told Databricks would solve problems we just didn't have. Serverful has some wonderful advantages: simplicity, (ironically) cost - cheaper than something running just 3-4 hours a day but at 10x the price - familiarity, reliability. Serverless also has advantages, but only if it runs smoothly, doesn't take an eternity to boot, isn't prohibitively expensive, and has little friction before using it - Databricks meets 0/4 of those criteria, with the additional downside of restrictive SQL due to the Spark backend, adding unnecessary refactoring/complexity to queries.
> your setup is not really practical to have a lot of people collaborating
Hard disagree. Our methods are simple and time-tested. We use git to share code (a 100x improvement on Databricks' version of git). We share data in a few ways, most commonly by creating a table in a database or in S3. It doesn't have to be a whole lot more complicated.
But you are doing a disingenuous comparison here because one can keep a "serverful" cluster up without shutting it down, and in that case, you'd never need to wait for anything to boot up. If you shut down your EC2 instances, it will also take time to boot up. Alternatively, you can use the (relatively new) serverless offering from them that gets you compute resources in seconds.
We had 8 data engineers onboarding the org to Databricks, and it took 2 solid years before they got around to serverless (and only because users complained about the user-unfriendliness of 'nodes', and managers about cost). But then there were problems. A common pattern in my grep of Slack convos is "I'm having this esoteric error where X doesn't work on serverless Databricks, can you help"... a bunch of back and forth (sometimes over days) and screenshots, followed by "oh, unfortunately, serverless doesn't support X".
Another interesting note: someone compared serverless Databricks to BigQuery, and BigQuery was 3x faster without the Databricks-specific cruft (all BigQuery needs is an authenticated user and a SQL query).
Databricks isn't useless. It's just a Swiss Army knife that doesn't do anything well, except sales, and it may improve the workflows of the least advanced data analysts/scientists at the expense of everyone else.
I'd have rather stuck with Spark just because I prefer Scala or Python to SQL (and that comes with e.g. being far easier to unit test), but life happened and that ecosystem was getting disrupted anyway.
A few just off the top of my head:
* You can't .persist() DataFrames in serverless. Some of my work involves long pipelines that wind up with relatively small DFs at the end, but need to do several things with that DF. Nowhere near as easy as just caching it (see the workaround sketch after this list).
* Handling object storage mounted to Unity Catalog can be a nightmare. If you want to support multiple types of Databricks platforms (AWS, Azure, Google, etc.), then you have to deal with the fact that you can't mount one type's object storage from another. If you're on Azure Databricks, you can't access S3 via Unity Catalog.
* There's no API to get metrics like how much memory or CPU was consumed for a given job. If you want to handle monitoring and alerting on it yourself, you're out of luck.
* For some types of serverless compute, startup times from cold can be 1 minute or more.
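For the .persist() gap, a common workaround is to materialize the small result once and re-read it. A sketch, assuming a Databricks-provided Spark session; the `sales.orders` source and the temp table name are made up:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()  # provided for you on Databricks

# Long pipeline that ends in a small DataFrame.
small_df = (
    spark.table("sales.orders")   # hypothetical source table
    .groupBy("region")
    .count()
)

# No .persist() on serverless, so write the small result out once...
small_df.write.mode("overwrite").saveAsTable("tmp_region_counts")

# ...and fan out from the materialized copy instead of recomputing the pipeline.
cached = spark.table("tmp_region_counts")
cached.filter(col("count") > 1000).show()
cached.orderBy("count").show()
```

It works, but you're now managing table lifecycle and cleanup for something a one-line cache call used to handle.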
They're getting better, but Databricks is an endless progression of unpleasant surprises and being told "oh no you can't do it that way", especially compared to Snowflake, whose business Databricks has been working to chew away at for a while. Their Variant type is a great example. It's so much more limited than Snowflake's that I'm still learning new and arbitrary ways in which it's incompatible with Snowflake's implementation.
I guess different people just have different experiences.
Serverless is meant to obviate some of that. But it is less compelling when the vendor tries to gobble up that margin for themselves.
This is me being less jaded. Support those little wins!
Because of this separation, the compute (e.g. SQL parsing, execution) can be scaled independently, and the storage can do the same - for example, by using AWS S3.
So if your SQL query is CPU-heavy, Neon can just add more "compute" nodes while the "storage" cluster remains the same.
To me, this is similar to the usual microservice setup where you have an API service and a DB; the difference is that Neon purposely runs the DB itself on top of that structure.
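A toy model of the split (this is not Neon's actual design - their real storage tier involves pageservers and safekeepers - just an illustration of why compute scales independently):

```python
class SharedStorage:
    """Stands in for the storage tier (e.g. pages on S3)."""
    def __init__(self):
        self.pages = {}  # page_id -> bytes

    def read(self, page_id):
        return self.pages.get(page_id, b"")

    def write(self, page_id, data):
        self.pages[page_id] = data


class ComputeNode:
    """Stateless query executor; any number can share one storage tier."""
    def __init__(self, storage: SharedStorage):
        self.storage = storage

    def run_query(self, page_id):
        # Parsing/execution happens here; the data lives elsewhere.
        return len(self.storage.read(page_id))


storage = SharedStorage()
storage.write("t1/p0", b"row data")

# CPU-heavy workload? Add compute nodes; storage stays as-is.
nodes = [ComputeNode(storage) for _ in range(4)]
print([n.run_query("t1/p0") for n in nodes])  # [8, 8, 8, 8]
```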
You can deploy serverless technologies in a self hosted setup and not get "overcharged". Is a system thread bullshit marketing over a system process?
https://blog.bit.io/whats-next-for-bit-io-joining-databricks...
https://www.databricks.com/blog/welcoming-bit-io-databricks-...
Or it's just a business decision to corner the market, as someone else said
I went to Archive.org and figured out that in 2023, they announced they were shutting down on May 30th; all databases shut down on June 30th, were available only for download after that, and were deleted on July 30th.
I hate that this is what I've become, I want to try some of the cool features "postgres++" providers offer but I actively avoid most features fearing the potential future migration. I got burned using the Data API on Aurora Serverless and then leaving them and having to rewrite a bunch of code.
Given how lax antitrust enforcement is, probably this
Either way, there are plenty of other serverless Postgres options out there, Supabase being one of the most popular.
You set up a database, you connect to it, they take care of the rest. It even scales to $0 if you don't use it.
Is that not serverless Postgres?
Supabase gives you a server that runs classic Postgres in a process. Scaling in this scenario means you increase your server's capacity, with a potential downtime while the upgrade is happening.
You are confusing _managed_ Postgres for _serverless_.
Others in the serverless Postgres space:
- https://www.orioledb.com/ (pg extension)
- https://www.thenile.dev/ (pg "distribution")
- https://www.yugabyte.com/ (not emphasizing serverless but their architecture would allow for it)
After a funding round the value extraction from customers is just over the horizon
I haven't studied the CLA situation in order to know if a rug pull is on the table but Tofu and Valkey have shown that where there's a will there's a way
Thankfully, you can continue to pay Databricks whatever they ask for the privilege of them hosting it for you
[1] https://aws.amazon.com/blogs/database/introducing-scaling-to...
For someone looking to join the company, I cannot imagine IPO to be a motivation anymore.
This is better than earlier-stage startups: while there you get far better multiples, it is also quite possible that you are let go somewhere in the cycle without the money to exercise your options for tax reasons, and there is a short exercise window on exit.
For this reason, companies these days offer 5-10 year post-departure exercise windows as a more favorable deal.
——
For founders, it gives them a shorter window to an exit than going it alone, and in a revenue-light, tech-heavy startup like Neon (compared to Databricks), the valuation risk is reduced, because the stock they get in the acquisition is priced on real revenue and growth, not on early-stage product traction like Neon's today.
They also get some cash component, which is usually enough for the core things most founders look at: buying a house in the few-million range, closing out mortgages, or investing in a few early-stage projects directly or through funds.
Many companies raise money only to give liquidity to founders/employees and some early investors, even if they don't need the money for operations at all.
While Databricks is large, there are much bigger companies that would have IPO'd at smaller sizes in the past but are delaying today (and may never do it). Stripe and SpaceX are the biggest examples: both have healthy positive cash flows but don't see the value in going public. Buying back shares and options is the only route to keeping early-stage employees happy if you don't have IPO plans.
Do they still have a lot of $$$?
Thankfully, I just need "Postgres", I wasn't depending on any other features so I can migrate easily if things start going south.
It reads as though folks are looking for exits but IPO isn't an option.
Think we're approaching a reckoning both for lots of companies that raised circa 2021 at valuations that are no longer plausible, and for AI startups.
Oh, and ones in the first group that tried to rebrand as the second…
What’s with all these Postgres hosting services being worth so much now?
Someone at AWS probably thought about this (easy-to-provision serverless Postgres) and just didn't build it.
I’m still looking for something that can generate types and spit it out in a solid sdk.
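As a sketch of what I mean (assuming a Postgres database and a deliberately trivial type mapping; the DSN and table name are placeholders), something that walks information_schema and emits TypeScript types:

```python
import psycopg2

# Tiny, incomplete Postgres -> TypeScript type map, for illustration only.
PG_TO_TS = {
    "integer": "number", "bigint": "number", "numeric": "number",
    "text": "string", "character varying": "string",
    "boolean": "boolean", "timestamp with time zone": "string",
}

def emit_types(dsn: str, table: str) -> str:
    """Read a table's columns and build a matching TypeScript interface."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            """SELECT column_name, data_type, is_nullable
               FROM information_schema.columns
               WHERE table_name = %s ORDER BY ordinal_position""",
            (table,),
        )
        fields = [
            f"  {name}{'?' if nullable == 'YES' else ''}: "
            f"{PG_TO_TS.get(dtype, 'unknown')};"
            for name, dtype, nullable in cur.fetchall()
        ]
    return "interface %s {\n%s\n}" % (table.title(), "\n".join(fields))

print(emit_types("postgresql://localhost/mydb", "users"))  # placeholders
```

The hard 90% that nobody ships well is everything past this toy: views, enums, joins, nullability through queries, and keeping it all in sync.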
It's amazing this isn't a solved problem. A long, long time ago, I was a part of a team trying to sort this out. I'm tempted to hit up my old CEO and ask him what he thinks.
The company is long gone…
If anything we tried to do way too much with a fraction of the funding.
In a hypothetical almost movie like situation I wouldn’t hesitate to rejoin my old colleagues.
The issue then, as it is today, is that applications need backends. But building backends is boring, tedious, and difficult.
Maybe a NoSQL DB that "understands" the Postgres API?
These high-value startups timed it well to capture vibe coding (previously known as building an MVP), front-end culture, and the sheer volume of internet use and developers.
You have to understand a separate set of concerns: spin something up on EC2, hook it into a DB, configure HTTPS, figure out why it went down, etc.
You’re right though, once I build a complex front end I want someone else to do the backend.
Their choice of Deno for edge functions is... Well, unique.
For my current project I have to do a lot of quirky logic, and I kept hitting a brick wall with Supabase.
I also didn't enjoy the self hosting journey. Not exactly easy.
For the other stuff, what do you find quirky?
Supabase only supports Deno. The quirkiness is my own server side logic. Tbf, I've tried to build this project at least 4 times and I might need to take a step back.
AWS is working on this as well: https://aws.amazon.com/blogs/database/introducing-amazon-aur...
I believe there are several of these already, like CockroachDB.
1. An acquihire (if you're a Neon customer, this would probably be a bad outcome for you).
2. A growth play. Neon will be positioned as an 'application layer' product offered cheap to bring SaaS startups into the ecosystem. As those startups grow and need more services, sell them everything else.
Say right now I have an e-commerce site with 20K MAU. All metrics are going to Amplitude, and we can use that to see DAU, retention, and purchase volume. At what point in my startup's lifecycle do we need to enlist these services?
I recently worked on some data pipelines with Databricks-style notebooks, a la Azure Fabric. I'm currently using ~30% of our capacity and starting to get pushback to run things less frequently to reduce the load.
I'm not convinced I actually need Fabric here, but the value for me has been that it's the first time the company has been able to provision a platform that can handle the data at all. I have a small portion of it feeding into a database as well, which has drawn constant complaints about volume.
At this point I can't tell if we just have unrealistic expectations about the costs of having this data that everyone wants, or if our data engineers are just completely out of touch with the current state of the industry, so Fabric is just the cost we have to pay to keep up.
I'm really not looking forward to a migration.
If you need to serve your data across a network to many clients, managing that with SQLite is much trickier.
There are interesting use cases for DB-per-user which can be server or client side, or litestream's continuous backup/sync that can extend it beyond this use case a bit too.
You _can_ use SQLite as your service's sole database, if you vertically scale it up and the load isn't too much. It'll handle a reasonable amount of traffic. Once you hit that ceiling though, you'll have to rethink your architecture, and undergo some kind of migration.
The common argument for SQLite is deferring complexity of hosting until you've actually reached the type of load you have to use a more complex stack for.
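A minimal sketch of that single-node setup, standard library only; the pragmas are the usual knobs people reach for when SQLite backs a service:

```python
import sqlite3

def open_db(path: str = "app.db") -> sqlite3.Connection:
    con = sqlite3.connect(path, timeout=5.0)
    # WAL mode lets readers proceed while one writer works.
    con.execute("PRAGMA journal_mode=WAL")
    # Wait briefly on lock contention instead of failing fast.
    con.execute("PRAGMA busy_timeout=5000")
    return con

con = open_db()
con.execute(
    "CREATE TABLE IF NOT EXISTS visits (ts TEXT DEFAULT CURRENT_TIMESTAMP)"
)
with con:  # one writer at a time; scale vertically until that hurts
    con.execute("INSERT INTO visits DEFAULT VALUES")
print(con.execute("SELECT COUNT(*) FROM visits").fetchone()[0])
```

The migration cost kicks in exactly when one box and one writer stop being enough, which is the ceiling the comment above describes.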
The truth of the 2010s up until now is that every startup was a massive sales con job. The wealth of this industry is not truly built on incredible tech, but on the audacity of salesmanship. It's a billion-dollar con job. That's one of the reasons I take every ridiculous startup that launches quite seriously, because you have no idea just how audacious their sales people are. They can sell anything.
Your question is very fundamental, and the answer is just as raw and fundamental too. I would love it if some of these sales people actually reform and write tell-alls about how they conned so many large companies in their years of working. This content has got to be out there somewhere.
They can't build it themselves, and it's highly dubious that they'd be able to hire and supervise someone to build it. Databricks may be selling "nothing special", but it's needed, and the buyers can't build it themselves.