367 points by PaulHoule 6 days ago | 24 comments
  • rstuart41336 days ago
    Just a single data point but... I am looking at replacing a custom database with an SQL database. The application will only ever run on one server, so Sqlite3 was a contender. The database is very read-mostly, which makes it a strong contender.

    The custom database is extraordinarily fast at some things, so it's a complex decision. I needed benchmarks.

    I chose something that mirrors how data is accessed in the custom database. The existing database does have several tables, but it's hard to use so most accesses are single table, indexed by rowid, with maybe 100,000,000 rows. So I benchmarked a single table, indexed on primary key only, 100,000,000 small rows. Since it was easy because I could use the same SQL I benchmarked both Sqlite3 and Postgresql.

    I expected Sqlite3 to beat Postgresql on single row access / updates, and Postgresql to get the upper hand on multi row access / updates. I was surprised to see Sqlite3 was about twice as fast as Postgresql on everything, including inserting the 100,000,000 records. It was always single writer. Had I tested multiple writers, I expect I would have seen Postgresql really flex its muscles.

    In case you're wondering, the custom database is 100 to 1,000 times faster than Sqlite3 at accessing a single record. Getting that performance required some big tradeoffs, which make it so hard to code for that it creates reliability problems.

    • wewewedxfgdf5 days ago
      Databases are configured for different priorities out of the box.

      For example, when you insert a row, that results in a write to the WAL. That's a data safety measure SQLite is unlikely to be taking.

      You can configure Postgres to perform better with different tradeoffs - here are some things to investigate (a sketch of applying a few of them follows the list):

        - synchronous_commit=off 
        - wal_level=minimal 
        - wal_writer_delay=200ms 
        - checkpoint_completion_target=0.9 
        - autovacuum=off 
        - bgwriter_delay=10000ms 
        - shared_buffers=512MB 
        - work_mem=64MB 
        - UNLOGGED tables
        - run from RAM Disk (/dev/shm)
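      A minimal sketch of how a few of these could be set, for a benchmark only; most of them trade durability for speed, and wal_level=minimal also needs max_wal_senders=0 plus a restart:

      ```
      -- Durability-for-speed settings; fine for a benchmark, dangerous for real data.
      ALTER SYSTEM SET synchronous_commit = off;            -- picked up on reload
      ALTER SYSTEM SET checkpoint_completion_target = 0.9;  -- picked up on reload
      ALTER SYSTEM SET wal_level = minimal;                 -- requires max_wal_senders = 0 and a restart
      ALTER SYSTEM SET max_wal_senders = 0;
      SELECT pg_reload_conf();

      -- Per-table option: skip WAL entirely (the table is truncated after a crash).
      CREATE UNLOGGED TABLE bench (
          id      bigint PRIMARY KEY,
          payload text
      );
      ```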
      • toxik5 days ago
        SQLite has had a WAL (write-ahead log) since 2010. Admittedly, it is not the default. [1]

        [1] https://www.sqlite.org/wal.html
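        For reference, enabling it is a one-line, per-database setting (sketch; the mode persists in the database file once set):

        ```
        PRAGMA journal_mode = WAL;   -- returns 'wal' on success and sticks with the database file
        PRAGMA synchronous = NORMAL; -- common companion: fsync at checkpoints rather than on every commit
        ```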

        • Sammi5 days ago
          * Not the default in the original c library ... except it is the default in language wrappers like better-sqlite3 for nodejs.

          It's probably unlikely that you're consuming the upstream sqlite c lib directly with its intentionally antiquated defaults. You're much more likely to be using a provider that has tweaked the defaults to more modern standards.

          • jamal-kumar5 days ago
            I found the old wrappers to be really messed up and buggy compared to the new nodejs builtin, which works charmingly well.

            Do you know any details about those tweaks you mention? It would be very nice to know what edge cases I gotta watch out for when using any of these.

            • Sammi3 days ago
              Been running better-sqlite3 in production for a long time and it's been fast and reliable for me.

              Looking at their github stats and open issues and prs, it looks like the project is healthy.

            • cute_boi5 days ago
              > new nodejs builtin

              Do you mean this node js 23 sqlite api[1]?

              [1] https://nodejs.org/api/sqlite.html

      • MrLeap5 days ago
        In Postgres, is there a way to create a table that stores in a ramdisk, but allows you to trigger writing to disk yourself?

        I have a usecase where our application has events where it would be a good time to trigger disk IO. The changes that occur between those app level checkpoints can totally be ephemeral.

        • sa465 days ago
          Maybe something like (untested):

          - Create an unlogged table named events_active.

          - At the app_checkpoint time, start a transaction:

            1. Rename events_active to events_pending_write.
          
            2. Create a new unlogged table, events_active, to handle writes for the next app_checkpoint.
          
          - In a new transaction:

            1. Set events_pending_write to LOGGED. This writes the entire table into the WAL. [1]
          
            2. Rename events_pending_write to events_part_20250304.
          
            3. Attach events_part_20250304 to a partitioned events table.
          
          For stronger consistency, combine the two transactions, but you'll block incoming writes until the transaction completes.

          Unlogged tables don't guarantee Postgres uses RAM for the table. If you need a stronger guarantee, mount a directory on a ramdisk and create a table space to hold the events_active table. Then, to promote to a logged table, use a create-table-as statement to copy the table into a new partition.

          [1]: https://dba.stackexchange.com/a/195829
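          A rough SQL sketch of the steps above (untested, as the parent says; column definitions and partition bounds are made up, and constraints/indexes are omitted):

          ```
          -- Hot, WAL-free table that receives writes between app checkpoints.
          CREATE UNLOGGED TABLE events_active (
              event_id   bigint NOT NULL,
              payload    jsonb,
              created_at timestamptz NOT NULL
          );

          -- At app_checkpoint time: swap in a fresh active table.
          BEGIN;
          ALTER TABLE events_active RENAME TO events_pending_write;
          CREATE UNLOGGED TABLE events_active (
              event_id   bigint NOT NULL,
              payload    jsonb,
              created_at timestamptz NOT NULL
          );
          COMMIT;

          -- Later: persist the pending table and attach it to a partitioned events table.
          BEGIN;
          ALTER TABLE events_pending_write SET LOGGED;  -- writes the whole table into the WAL
          ALTER TABLE events_pending_write RENAME TO events_part_20250304;
          ALTER TABLE events ATTACH PARTITION events_part_20250304
              FOR VALUES FROM ('2025-03-04') TO ('2025-03-05');  -- bounds depend on your partition key
          COMMIT;
          ```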

          • MrLeap5 days ago
            Thank you, gives me something to play with.
    • remram5 days ago
      Is that SQLite with default settings? There are a few tunables for SQLite like mmap_size and page_size which help in most situations: https://sqlite.org/mmap.html (as well as journal_mode=wal)
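      For anyone curious, those look roughly like this (the values are only illustrative):

      ```
      PRAGMA journal_mode = WAL;     -- persists in the database file
      PRAGMA mmap_size = 268435456;  -- 256 MB of memory-mapped I/O; set per connection
      PRAGMA page_size = 8192;       -- only takes effect on a new database or after VACUUM
      PRAGMA cache_size = -65536;    -- negative = KiB, so roughly 64 MB of page cache
      ```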
      • rstuart41335 days ago
        Postgresql was a default install. The only sqlite3 customisation was: "PRAGMA main.journal_mode = WAL". In particular I suspect people are asking whether I used exclusive mode or unsafe writes for Sqlite3. The answer is no.

        Apart from that, everything was done in "reasonable sized transactions", with the same size being used for both Postgresql and Sqlite3. Necessarily so, because I ran the same SQL statements over both. For single row fetches or updates the transaction was a single row. When I was testing how fast I could update a batch of 10 rows, all 10 were done in a single transaction. (Aside: the transaction overhead far outweighed the cost of one statement. Updating 100 rows only took twice the time of updating 1.)

        When creating the database of 100,000,000 rows, they were inserted in transactions of 10,000 rows. I was emulating how our system inserts records into this 100,000,000 row table. It uses an almost ascending primary key, which makes life easier for the underlying b-tree. I've found emulating that sort of thing is important for realistic benchmarks. That's because on this particular table the primary key is (current-unix-time, 32-bit-random-int). current-unix-time was monotonically increasing, but in any given second some 100 randomly ordered 32-bit-random-int's were used. From memory Sqlite3 inserted 100k records per second, Postgresql some number less than that. But don't pay much attention to the absolute numbers, as it was on spinning rust while the real system was doing other things (again because the real system had the same constraints). On my MacBook with NVMe it was several times faster in absolute terms, but Sqlite3 and Postgresql maintained their relative positions.
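        A sketch of the kind of table and batched insert pattern described above (names and values are illustrative, not the actual schema):

        ```
        CREATE TABLE events (
            ts      INTEGER NOT NULL,  -- current unix time, almost ascending
            rnd     INTEGER NOT NULL,  -- 32-bit random int
            payload BLOB,
            PRIMARY KEY (ts, rnd)
        );

        -- Inserts batched 10,000 to a transaction so the commit cost is amortised.
        BEGIN;
        INSERT INTO events (ts, rnd, payload) VALUES (1741100000, 123456789, x'00');
        -- ... ~9,999 more rows ...
        COMMIT;
        ```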

    • Quiark5 days ago
      How much of it was because you didn't go through TCP/IP with sqlite?
    • RestartKernel5 days ago
      Are you able to elaborate on your custom database? Not that I'm in the position to offer any advice, but you got me curious about it.
    • zhwu5 days ago
      This recent blog actually looks into the case with multiple writers and the distribution for the time for a writer to take the lock: https://blog.skypilot.co/abusing-sqlite-to-handle-concurrenc...
    • simlevesque6 days ago
      Did you try with duckdb ?
      • rstuart41336 days ago
        Not yet. It's on my list. Mostly out of curiosity, because it isn't an OLAP application. While it is read mostly there are single row updates, just not enough to make multiple writers a frequent event.
    • Salgat5 days ago
      Is something like redis with aof enabled an option? Assuming the rows aren't holding too much data.
  • bob10296 days ago
    > Benefits of SQLite-Per-Partition

    The obvious caveat here is any situation where you need global tables. I've never worked on a product where 100% of the schema could be partitioned like this. I've done a ton of SQLite-per-<entity type>, but there's always been a singleton SQLite database above this tracking the metadata.

    Looking up a user session via token is not an analytics task. Neither is checking if an email address is already registered. These are very pedestrian queries that demand global visibility, ideally with hard transactional guarantees.

    • kgeist6 days ago
      >I've never worked on a product where 100% of the schema could be partitioned like this

      SaaS for B2B works well with this: you partition by organization. Each organization has their own DB. Most organizations are typically small (no more than 100 users), and it greatly simplifies things. However, it's still problematic with large organizations (200k-300k users is where it starts to fall apart).

      • skrebbel6 days ago
        But then any sort of management information, eg "find out how many active customers we have" or "which customer has the most activity" becomes a query over all the DBs. Similarly, admin tools for customer support etc easily become very hard to build. I'd love to know if anybody has a solution for this because otherwise DB-per-tenant sounds very nice indeed.
        • kgeist6 days ago
          >But then any sort of management information, eg "find out how many active customers we have" or "which customer has the most activity" becomes a query over all the DBs

          Create a separate DWH storage where all necessary data from all databases is pulled and processed/aggregated in the background (via change data capture) and saved in a format that is more suitable for analytical queries. The downside is that it requires additional preparation and some upfront design.

          >Similarly, admin tools for customer support etc easily become very hard to build.

          Our CRM understands the concept of organizations/tenants, so when a customer files a ticket, we know their organization, and from there, for a tech support person, accessing "coolcompany.mysaas.com" is no different from accessing "mysaas.com". In my opinion, it makes tech support easier, because you have far less data/fewer logs to analyze/sift through (only a specific organization's DB) when investigating a problem.

        • superq5 days ago
          OLAP questions are usually out-of-band and preferably by a tool designed for it (like Clickhouse). Scanning all DB's is something that can be done in the background for most of these use cases.
        • osigurdson5 days ago
          I don't think we currently have any solutions that allow you to get the answer to any question with optimal performance. You could instead push the needed metrics to a stream during your normal processing and aggregate it. This of course duplicates the data, as the same information is already available in the tenant databases, but it is a very reasonable trade-off.
      • NathanFlurry6 days ago
        Yep.

        Nile (https://www.thenile.dev/) is trying to address this use case with fully isolated PG databases. Though, I don't know how they handle scaling/sharding.

    • hinkley6 days ago
      Global tables with anything but a tiny rate of write operations are going to bring Amdahl's Law in to ruin several years of your life while you fight against it.

      I had a service that was adding 30 ms to TTFB because most of the other decisions we made during page render were predicated on this response. I would later find about a quarter of this time was in bookkeeping code that got out of hand, but that's a story for another day.

      The biggest problem was that this data requires a capital-P Process to change, so it took about 20 minutes to change it and fifteen seconds to roll it back. There had been plans to have the service do more but in the end it looked more like a remote request for a feature toggle. We were already broadcasting feature toggle changes through Consul, which is great for this kind of data. So I did more of the same and got to decom a cluster.

      Moral of the story is, it matters what kind of global data you require to build your system, and that's the sort of fact you should keep in mind while designing it. You can pick a design that will scale or one that absolutely won't, because you've given the users features that make the entire system 10x more expensive per request.

    • jitl6 days ago
      Put your global tables in Postgres/CockroachDB/Vitess/etc. You don't need to force everything to be a SQLite table. Use the right tool for the job. SQLite can be the right tool, and it can also be the wrong one.
      • superq5 days ago
        Agreed, or even just start with sqlite for your globals and then scale to those later.
    • apitman6 days ago
      Agree with your overall point, but for user session specifically, once a user has entered an identifier can't you use their specific DB for sessions? You also have things like JWTs/PASETO that can carry the DB mapping in the token.
    • NathanFlurry6 days ago
      > The obvious caveat here is any situation where you need global tables

      A lot of people still end up storing data that's not frequently updated in a traditional OLTP database like Postgres.

      However:

      I think it always helps to think about these problems as "how would you do it in Cassandra/DynamoDB?"

      In the case of Cassandra/DynamoDB, the relevant data (e.g. user ID, channel ID, etc) is always in the partitioning key.

      For Durable Objects, you can do the same thing by building a key that's something like:

      ```
      // for simple keys:
      env.USER_DO.idFromName(userId);

      // or for composite keys:
      env.DIRECT_MESSAGE_CHANNEL_DO.idFromName(`${userAId}:${userBId}`); // assumes user A and B are sorted
      ```

      I've spoken with a lot of companies using _only_ this architecture for Durable Objects and it's working well.

    • manmal6 days ago
      Is it even realistic to depend on transactional guarantees, with hundreds of services hammering the DB(s) more or less concurrently? Don’t they need to coordinate themselves outside of transactions?
      • bob10296 days ago
        > Is it even realistic to depend on transactional guarantees, with hundreds of services hammering the DB(s) more or less concurrently?

        If you have hundreds of services hammering the datastore concurrently, then the ability to offer transactional guarantees seems absolutely mandatory for me.

        > Don’t they need to coordinate themselves outside of transactions?

        I think we're back at the beginning of our journey here.

        • manmal6 days ago
          Do you mean via distributed transactions? I'm clueless in that area, and don't understand how hundreds of self-sufficient services would partake in transactions that span different sets of those services, without entangling them badly.
          • refulgentis6 days ago
            I find this topic difficult to reason about because I'm not intimately familiar with DBs at scale.

            That being said: my understanding is we're always going to have something that needs to maintain its own state that's global, and you're naming that problem as well.

            For example, let's say we partition users based on the first letter of their email addresses.

            This works great for most user-specific queries (e.g., fetching a user profile).

            But what happens when someone registers a new account?

            At that point, we must ensure the email is globally unique.

            A purely partitioned approach won't help here—we'll need some kind of global database or service maintaining a single source of truth for email uniqueness checks.

            (then it gets complicated, because of the simple level at which I can understand and communicate about it. Why not just partition based on the first letter of an email? Well, yes, then we just have to deal with emails changing. Maybe a better example is session tokens, because they don't come with an email. But we could require that, or do some bespoke thing...there's security concerns there but they seem overrated...but to your point, you end up adding a ton of complexity just so you can fit a square peg in a round hole)

            • humodz6 days ago
              Do you mind elaborating why a db partitioned like that is not enough for your registration example? If the partitioning is based on the email address, then you know where the new user's email has to be if exists, you don't need to query all partitions.

              For example, following your partitioning logic, if the user registers as john.smith@example.com, we'd need to query only partition j.

              • refulgentis6 days ago
                You're right, the email address example isn't clearcut -- it's not an issue at all at registration. From there, you could never allow an email change. Or you could just add a layer for coordination, ex. we can imagine some global index that's only used for email changes and then somehow coordinates the partition change.

                My broad understanding is that you can always "patch" or "work around" any single objection to partitioning or sharding—like using extra coordination services, adding more layers, or creating special-case code.

                But each of these patches adds complexity, reduces flexibility, and constrains your ability to cleanly refactor or adapt later. Sure, partitioning email addresses might neatly solve registration checks initially, but then email changes require extra complexity (such as maintaining global indices and coordinating between partitions).

                In other words, the real issue isn't that partitioning fails in a single obvious way—it usually doesn’t—but rather that global state always emerges somewhere, inevitably. You can try to bury this inevitability with clever workarounds and layers, but eventually you find yourself buried under a mountain of complexity.

                At some point, the question becomes: are we building complexity to solve genuine problems, or just to preserve the appearance that we're fully partitioned?

                (My visceral objection to it is, coming from client-side dev virtually my entire career: if you don't need global state, why do you have the server at all? Just give me a .sqlite for my account, and store it for me on S3 for retrieval at will. And if you do need global state... odds are you or a nearby experienced engineer has Seen Some Shit, i.e. the horror that arises in a codebase worked on over years, doubling down on a seemingly small, innocuous, initial decision, and knows it'll never just be one neat design decision or patch)

                • gabeio5 days ago
                  > but then email changes require extra complexity

                  Check the other partition for the user name. Create the new user with the same pointer (uuid, etc) to the user’s sqlite file, delete the old user in the other partition. Simple user name change. Not really that complex to be honest. (After thinking this through I’m probably going to suggest we change to sqlite at work…)

                  > if you don't need global state, why do you have the server at all?

                  2 reasons I can think of right off of the top of my head are:

                  - validation (preventing bad actors, or just bad input)

                  - calls to external services

              • juliuskiesian6 days ago
                What if the users are partitioned by ID instead of email? You would have to iterate through all the partitions.
                • TylerE6 days ago
                  Not much of a partition if it's on what is essentially an opaque unique key.
            • manmal6 days ago
              FWIW, I‘ve seen consensus here on HN in another thread on SQLite-on-server, that there must indeed be a central DB for metadata (user profiles, billing etc), and all the rest is then partitioned.
      • NathanFlurry6 days ago
        > Is it even realistic to depend on transactional guarantees, with hundreds of services hammering the DB(s) more or less concurrently?

        If a single request frequently touches multiple partitions, your use cases may not work well.

        It's the same deal as Cassandra & DynamoDB: use cases like chat threads or social feeds fit really well because there's a clear ownership hierarchy. e.g. message belongs to a single thread partition, or a social post belongs to a feed partition.

    • slt20216 days ago
      Would redis/an in-memory DB be better suited for this task? I don't think a regular rdbms will meet this requirement at scale.
      • bob10296 days ago
        A regular RDBMS is insufficient for managing user accounts and sessions?

        What kind of scenario do you have in mind?

    • stepanhruda6 days ago
      You could have another sqlite with this global information related to users / sessions / passwords etc
  • lbutler6 days ago
    I’m building a local-first web app, and SQLite works well for my case since a single project can be contained in one database file, just like users are used to with existing desktop applications.

    What I’d really like is an easy way to sync the SQLite database state to a cloud service. Most existing options expect you to query against a remotely hosted database and charge per read/write.

    Since the database will have around 100,000 rows and you're typically working with all the data at once, streaming parts of it doesn’t make sense for my use case.

    The closest I’ve found is Turso, which has offline writes in private beta, and SQLite Cloud, which lists local-first and offline sync as "coming soon."

    The simplest approach might be letting users push to S3 storage with versioning. Ideally, it would also support point-in-time restores, tracking incremental updates alongside full snapshots.

    Even better, I’d manage minimal server-side infrastructure and just pull the SQLite database from a service that handles syncing and management.

    • rogerbinns6 days ago
      SQLite has a session extension that can record changes on a local database into a changeset and you can replay those changes on another SQLite instance. Note that it replays what the changes were, not the queries that resulted in the changes. When applying changes you provide a conflict handler. (You can also invert changesets making a handy undo/redo feature.)

      You can save conflicts to another changeset. There is also a rebaser to help deal with multiple way syncing.

      https://www.sqlite.org/sessionintro.html - overview

      https://www.sqlite.org/session/sqlite3changeset_apply.html - conflict information

      https://www.sqlite.org/session/rebaser.html - rebaser

      • chii5 days ago
        there's also a CRDT version of this, which allows two databases to be synced to each other in real time (aka, updates to one will eventually make it to the other, and both databases would eventually contain the same data).

        It's https://vlcn.io/docs/cr-sqlite/intro , and i find it amazing that this is doable in sqlite. It is perfect for small scale collaboration imho, but it also works to sync across local client and remote server (for a single db per user scenario).

        • hitekker5 days ago
          Interesting link, it'd be great if their solution meets expectations.

          Right now, the proof-of-concept they've provided seems simplistic. Their progress seems to have shifted from cr-sqlite into "Zero" instead. I'm guessing it has something to do with CRDTs being quite app-specific and hard to generalize.

          I would want to see this library used in production first before hyping it

          • chii5 days ago
            in a sense it is quite specific. In a different sense, this is as generic a CRDT as you can get - it's CRDT on table(s). There's no merging of rows iirc (unless you write a custom merge, which is supported but probably needs some tweaking and could lead to poor results?).
        • stronglikedan5 days ago
          > in real time (aka, updates to one will eventually make it to the other

          The term you're looking for is "eventual consistency".

      • roncesvalles5 days ago
        This is just clobbering one of the divergent copies with per-field granularity.
    • 0cf8612b2e1e6 days ago
      Maybe I am misunderstanding which part you want in the cloud, but that sounds like litestream. Lets you transparently back up a live SQLite database to a remote destination.

      https://litestream.io/

      • gwking6 days ago
        I depend on litestream for production backups and as the months wear on without any releases I am getting more nervous. To be clear, I don’t feel entitled to anything with an open source project like this, but bug reports and fixes seem to be accumulating. I have flirted with the idea of building from main.

        I’ve also flirted with the idea of forking litestream and stripping it down dramatically. The reason why is that I don’t like the idea of the production server being in charge of rotation and deletion. It seems like the thing getting backed up shouldn’t have the privilege of deleting backups in case it gets compromised. I might even go so far as to propose that the “even liter stream” process merely writes to a different local volume and then some other process does the uploading but I haven’t gotten beyond the daydream stage.

        • normie30005 days ago
          Having run litestream in prod for 2+ years, I share all of these concerns.

          > It seems like the thing getting backed up shouldn’t have the privilege of deleting backups in case it gets compromised.

          For backups, I added a nightly cron job which exports my SQLite db to a write-only S3 bucket.

          • superq5 days ago
            If that will fit your RPO, why not only do that? Saves a lot of complexity (and risk).
            • normie30005 days ago
              It doesn't fit my RPO.

              What's the additional risk?

              • superq2 days ago
                Complexity == risk.

                > It seems like the thing getting backed up shouldn’t have the privilege of deleting backups in case it gets compromised.

                (agreed)

                > For backups, I added a nightly cron job which exports my SQLite db to a write-only S3 bucket.

                Why not only do this and use an s3 sync instead? You can safely backup SQLite databases while they're being written to, so no need to export (dump) them; just copy the files themselves.

                This might mean that your entire backup/restore strategy is just to copy some files. If so, that's ideal.

                (Of course, s3 sync does require reading as well as writing, so perhaps just increase your cron job to run more often so it fits within your RPO)

        • chubot6 days ago
          What kind of bugs have you experienced or are you worried about? Backup software shouldn’t need to be frequently updated
          • dspillett5 days ago
            I think the implication isn't that there are bugs they are immediately concerned about, but that other issues not being addressed might mean that should they run into a bug that does cause problems there may not be a timely solution, if any.
        • edoceo5 days ago
          And combine with the session features that @rogerbinns mentioned. Feels resilient.
      • adamtaylor_136 days ago
        Yeah, I was about to suggest litestream. Isn't it local-first-with-cloud-backups?
    • conradev5 days ago
      SQLite has the ability to do exactly this, minus the versioning: https://sqlite.org/cloudsqlite/doc/trunk/www/index.wiki

      Implementing snapshot backups on top of that would be as simple as a VACUUM and S3 PUT
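      A sketch of that snapshot step (the path and bucket are hypothetical; VACUUM INTO writes a consistent, defragmented copy of the live database to a new file):

      ```
      VACUUM INTO '/tmp/app-snapshot-20250304.db';
      -- then upload the file, e.g. aws s3 cp /tmp/app-snapshot-20250304.db s3://my-bucket/snapshots/
      ```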

      For point-in-time restores, mvsqlite is a cool solution: https://github.com/losfair/mvsqlite

    • Daril6 days ago
      Have your tried CR-SQLite ? https://vlcn.io/docs/cr-sqlite/intro

      It implements CRDT as SQLite extension.

    • roncesvalles6 days ago
      Offline-first databases are a hard problem because there isn't just one copy of the database on the user's side, there are N copies - every browser tab or device on which the user can open the local database and make an edit. It's basically an AP multi-master database (= the same row can be edited at different nodes at the same time), and you likely cannot achieve good results without a database that natively supports multi-master operations.
      • 9dev6 days ago
        That’s not necessarily true; if you use the Origin Private Filesystem along with a Web Worker that acts as a local database server and works off a single SQLite database, you at least have a single DB file per device. From there on, your problem becomes state reconciliation on the server, which CRDTs should help solve.

        Not an easy problem for sure, but the web platform is surprisingly capable these days.

        • roncesvalles4 days ago
          CRDTs are so-so and likely cause issues with maintaining relational DBs' transactional consistency. There's a reason none of the NewSQL databases (to my knowledge) are multi-leader.
        • 5 days ago
          undefined
    • larkost6 days ago
      I too think that CRDT databases are probably something you should explore. You generally have the whole database locally, and changes get synced pretty easily (but you have to live within the rules of your CRDT).

      The one I thought of (mostly because I worked there before they went under/bought by MongoDB) is RealmDB: https://en.wikipedia.org/wiki/Realm_(database)

      I have long since lost touch with the state of it, but at the time the syncing to their server was fast and had worked with a long list of environments/languages.

      The one thing I will caution: their model was that you almost had to have a database-per-customer. You could have a second one that contained common information, but they had no concept of only syncing part of a database based on some logic. So many customer implementations had the clients syncing multiple databases, and then a back-end client that would aggregate the data from all of those databases into one for backend processes. Extra complexity that I always thought was a real killer.

    • jimbokun6 days ago
      Isn't the simplest way to "sync" to just replace the remote database file with the local database file? One of the nice things about each database being encapsulated as a single file.
      • timewizard6 days ago
        Enabling WAL mode means you don't have a single file anymore.
        • catgirlinspace6 days ago
          You could do a checkpoint first though I believe? And if the database is only being updated on your local client I don’t think WAL mode would have much benefit since it’s probably not getting many concurrent writes.
          • timewizard5 days ago
            The WAL has a minimum size. In this context I assumed you would not be using SQLite to serve requests serially.
        • pstuart6 days ago
          More work than grabbing a single file but still easy enough to use: https://www.sqlite.org/backup.html
    • Cthulhu_5 days ago
      > What I’d really like is an easy way to sync the SQLite database state to a cloud service.

      Don't do this, but an application I used to work on (to replace it) copied the sqlite file to a folder then used rsync to sync it with a backup node. Apparently it worked and was good enough for that use case (inefficient php backend application with at most a dozen concurrent users).

      100,000 rows is only a few megabytes at most, right? Should be fine.

      • superq5 days ago
        > Don't do this

        What's wrong with that? Of course it will work fine; SQLite, with or without WAL, has a ton of protections against corruption from writes-in-progress, which is what makes hot backups work.

    • galaxyLogic6 days ago
      How about: Have 1 + N separate SQLite database-files.

      Each user would have their own database-file which contains only information about that user. Then 1 shared database-file which contains info needed for all users.

      Users would update their own data, which is a small database file which can be easily uploaded. They would not need to update the shared data.

      Not knowing your app I don't know what the shared data would contain, presumably something. Perhaps the shared data-file would be updated on the server based on what individual user-data the users upload.
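      For reads that span both files, SQLite's ATTACH lets one connection query the per-user and shared databases together (sketch; file and table names are made up):

      ```
      -- connection is opened on the per-user file, e.g. user-1234.db
      ATTACH DATABASE 'shared.db' AS shared;

      SELECT o.item_id, s.title
      FROM owned_items AS o                      -- table in the per-user database
      JOIN shared.items AS s ON s.id = o.item_id;
      ```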

      • anovick6 days ago
        In this multi-tenant arrangement, you run into synchronization problems.

        Developers should expect users to connect to the service using multiple devices (clients).

        AFAIK bare SQLite doesn't offer synchronization mechanisms between multiple SQLite instances.

        I believe Turso offers some solution of this kind, but not sure if that's open source or not.

        • galaxyLogic6 days ago
          > expect users to connect to the service using multiple devices (clients).

          But probably using only one device at a time by a single user?

          My thought, and it is just a thought, here is that instead of trying to provide a GENERAL solution for all kinds of data-update patterns, it is often possible to think in terms of what my current application specifically needs. It is easier to come up with such a solution with SQLite per app because SQLite is so "lite".

          I can't speak for the "general solution" except to say that many times you don't need an all-encompassing general solution, just a solution for your current app.

          • normie30005 days ago
            > But probably using only one device at a time by a single user?

            It depends on your expectations of concurrent use. Computer + tablet + phone means many users may use different devices within seconds of each other. If you want to support offline-first usage, concurrent updates from different clients for the same user becomes more likely.

    • ammo16626 days ago
      A simple, manual backup would be fine I think. You can just put an "upload" or "backup to cloud" button to let the user push a full version with a timestamp to S3.

      Synchronization may introduce a lot more problems, especially when you want to automatically sync the database to some other place. You will need to deal with sync errors, inconsistency, version conflicts, rollbacks...

      If your users could accept that, a simple full version backup is the best solution.

    • pbronez5 days ago
      Dolt would do that for you. It has push/pull semantics like git. As a bonus you can use its version control features to implement sophisticated undo/redo features.

      https://dolthub.com/blog/2022-09-23-dolt-rollback-options/

    • osigurdson5 days ago
      I've wanted to use SQLite a few times for the simplicity. I always end up using Postgres though because I don't understand how multiple services / replicas can make use of it. If another piece of infrastructure is needed to support it (even nfs), that seemingly counters any simplicity gains.
    • ozim5 days ago
      Why not local storage or an in-browser DB? If it is a local web app there is no need for a backend.
    • redwood5 days ago
      Have you seen PowerSync?
    • isaachinman6 days ago
      Forget some roll-your-own stuff. Have a look at Replicache, Zero, and Evolu.
  • runako5 days ago
    Notable that Ruby on Rails recently shipped its 8.0 release, which had as one of its pillars extending SQLite support to replace cache and job queue components, and to make it production-ready for common types of Web apps.

    https://rubyonrails.org/2024/11/7/rails-8-no-paas-required

  • akavi6 days ago
    Hmm, this article is a little confusing. I'm not familiar with Vitess or Citus, but am familiar with "manually" sharded Postgres/Mysql, and I'm not sure I understand whether there are any "interaction effects" between the decision to shard or not and the decision between MySQL/Postgres and Sqlite.

    Like, the article's three sections are:

    1. The challenges of sharding

    2. The benefits of these new sharded Sqlite solutions over conventional Sqlite

    3. A list conflating the benefits of SQL databases generally with the benefits of Sqlite

    None of which answer the question of "Why should I use sharded Sqlite instead of, say, sharded Postgres, for hyperscale?".

    • NathanFlurry6 days ago
      Author here.

      Agreed — I think adding some comparisons to other database partitioning strategies would be helpful.

      My 2 cents, specifically about manually partitioning Postgres/MySQL (rather than using something like Citus or Vitess):

      SQLite-on-the-server works similarly to Cassandra/DynamoDB in how it partitions data. The number of partitions is decoupled from the number of databases you're running, since data is automatically rebalanced for you. If you're curious, Dagster has a good post on data rebalancing: https://dagster.io/glossary/data-rebalancing.

      With manual partitioning, compared to automatic partitioning, you end up writing a lot of extra complex logic for:

      - Determining which database each piece of data lives on (as opposed to using partitioning keys which do that automatically)

      - Manually rebalancing data, which is often difficult and error-prone

      - Adding partitions manually as the system grows

      - (Anecdotally) Higher operational costs, since matching node count to workload is tricky

      Manual partitioning can work fine for companies like Notion, where teams are already invested in Postgres and its tooling. But overall, I think it introduces more long-term problems than using a more naturally partitioned system.

      To be clear: OLTP databases are great — you don’t always need to reach for Cassandra, DynamoDB, or SQLite-on-the-server depending on your workload. But I do think SQLite-on-the-server offers a really compelling blend of the developer experience of Postgres with the scalability of Cassandra.

      • itake5 days ago
        > - Determining which database each piece of data lives on (as opposed to using partitioning keys which do that automatically)

        Most sharding databases use consistent hashing.

        > - Manually rebalancing data, which is often difficult and error-prone

        not really. When you set up the database, you choose a highly divisible number of shards and then use consistent hashing to spread the data across them. Each server hosts N shards, where N changes as your data (and server count) grows.

        > - Adding partitions manually as the system grows

        Not really. Just choose a reasonably high number of shards and divide them across your servers.

        > - (Anecdotally) Higher operational costs, since matching node count to workload is tricky

        This could be true, but also, there is overhead to managing a ton of SQLite databases too. I think there are tradeoffs here.

    • koakuma-chan6 days ago
      If you think this is a good fit for your case, you should embed SQLite in your application and shard your application. An embedded SQLite is faster and uses less memory than a PostgreSQL running as a separate process and possibly on a different machine.
  • renegat0x06 days ago
    Might be a little bit off topic. I created a web page, with data. I didn't want to host a VPS or be charged for traffic. I also do not want to play with Cloudflare and self-hosting.

    My solution? The app is a webpage, which reads SQLite. If the user wants to use the app, the database is downloaded, unpacked, and used on the user's device.

    Links:

    - https://github.com/rumca-js/Internet-Places-Database - search.html provides a preview for my database file (but code supports also reading zip file).

    - https://rumca-js.github.io/search?file=top&page=1&search=neo... - uses JSON files stored in a zip file. Will be replaced soon with zip + sqlite file

    - https://rumca-js.github.io/search?file=music&view_display_ty... - example showing my fav music. As above uses JSON files, in a zip file

    • billywhizz6 days ago
      this is nice. i like the idea which has been tried in a few places of running sqlite in the browser directly/locally. the only thing that is really missing to make this work at a bigger scale for read-heavy databases is a very cheap or free static hosting service which does range requests, allows you control of CORS and doesn't have the file size limitations of gist or github pages. maybe this exists already? S3 would do i guess?

      you can do kinda magic things like this and build websites that connect to multiple different databases around the web and... well, i'll leave the rest up to your imagination.

      go here: https://just.billywhizz.io/sqlite/squeel/

      hit CTRL/CMD + Q on your keyboard.

      paste in this sql

      ```
      attach database 'https://raw.githubusercontent.com/just-js/just.billywhizz.io...' as chinook;

      select * from albums;
      ```

      and hit CTRL/CMD + g to run the queries.

    • IshKebab6 days ago
      I mean if you only have a few thousand records you barely need a database at all.
      • x-complexity6 days ago
        > I mean if you only have a few thousand records you barely need a database at all.

        Keyword being "barely".

        There are organization benefits if you can structure your data into a DB, instead of having each page redundantly hold the same header & metadata info.

      • renegat0x05 days ago
        Previously I have been using JSON. However there are multiple structures with relations between them so... this seems to be a database.

        Extracting data from it also becomes really really easy with selects. Otherwise I would have to implement, or reuse some algorithms to filter JSON data, etc.

  • koakuma-chan6 days ago
    TDLib (Telegram Database library)[0] uses SQLite.

    In Telegram Bot API, each TDLib instance handles more than 24,000 active bots simultaneously.[1]

    [0]: https://github.com/tdlib/td

    [1]: https://core.telegram.org/tdlib

  • apitman6 days ago
    This tidbit caught my eye:

    > Apple runs the world's largest known number of Cassandra/ScyllaDB instances with roughly 300,000 nodes as of 2022

    The Big Tech approach to web services is like everyone piling on a cruise ship to commute to work. Certainly brings some conveniences, but the complexity is insane and overall it feels absurd.

    The relatively new DB-per-tenant type approaches described in this article are more like renting/leasing a car, or maybe riding the bus to work. Definitely a step in the right direction.

    The future will be a mix of these solutions, and selfhosting/indiehosting. Indiehosting is like owning a car. Once the software is as reliable as cars are, many more people will do it.

    • NathanFlurry6 days ago
      Author here, love this take.

      I've chatted with a few medium-sized companies looking at Durable Objects for this reason. DB-per-tenant removes much of the need for another dedicated team to provision & maintain infrastructure for the services. It's almost like what microservices were trying to be but fell woefully short of achieving.

      It's disappointing (but understandable) that "serverless" received a bad rap. It's never going to fully replace traditional infrastructure, but it does solve a lot of problems.

  • PaulHoule6 days ago
    I rebuilt my home server a few weeks ago and the thing I was most anxious about getting working was the Ubiquiti controller which controls maybe 5 access points and managed hubs in my house.

    The damn thing depends on mongodb and, worse, an old version of mongodb. I found a script that installed it all, but it seems a system like the Ubiquiti controller really ought to use sqlite for a "just works" experience. For a while I thought maybe mongo was necessary if you had a really big installation with 10,000+ access points, but if this article is right, sqlite ought to be good for the biggest installations out there.

    • briHass5 days ago
      Been there, have the scars from googling mongo error log messages to prove it.

      I've been pleased at work to see some on-prem server apps switching to SQLite recently. Most recent that comes to mind is EFT Server (Windows SFTP/FTPS server), which switched from SQL Server Express to SQLite. SSE always makes me groan: huge install, pain to manage, etc. SQLite is just so easy to copy files around and make backups. I'm sure it's probably faster also.

      • qingcharles3 days ago
        I've been switching all the projects I had running on SSE to SQLite. What a difference in ease of maintenance!
    • z3ratul1630715 days ago
      Using mongodb for the controller was a very poor decision. I went through similar issues migrating my controller :(
  • simonw6 days ago
    "Better At Hyper-Scale Than Micro-Scale"

    That's a slightly misleading headline, because it can be interpreted as implying SQLite is BAD at micro-scale. That's not an argument the rest of the piece makes - it's more about how SQLite is great at large scale and people who focus on the small-scale aspects might miss that.

  • vetrom5 days ago
    In my experience, in a multiuser environment SQLite falls over pretty hard, due to a lack of anything like MVCC.

    I'm aware that there a bunch of sqlite-compatible implementations and extensions that add MVCC. How close are those to actually being SQLite? Are they just a totally different table storage and transaction engine sitting behind SQLite's SQL parser/compiler and VM?

  • shipp026 days ago
    If anyone has heard Joe Armstrong's talk about how communication is limited by latency and data can only travel so fast: I think having smaller partitions locally is an optimal point.

    If you want global consistency then you'll have to spend some time at runtime to achieve it: have complicated protocols, fast networking, synchronized clocks.

    Does this look like actor model (from Erlang) if you squint a bit?

    • NathanFlurry5 days ago
      Author here! I agree it's very similar to the actor model, but I kept the article's scope small, so I didn’t cover that.

      In fact – Durable Objects talks a bit about its parallels with the actor model here: https://developers.cloudflare.com/durable-objects/what-are-d...

      You might also appreciate this talk on building a loosely related architecture using Erlang, though it doesn't implement an actor-per-database pattern – https://www.youtube.com/watch?v=huGVdGLBJEo

    • chasemp5 days ago
      I was thinking something very similar. Once you've accepted any need at all for global state, the next move is to reorient toward minimizing it, with horizontally scalable, point-local state plus a small targeting dataset and a tiered caching system.
  • talkingtab6 days ago
    I have a fair amount of PostgreSQL experience at this point. Enough so the advantages of using SQL and a database are very clear to me. But PostgreSQL is a bulldozer and sometimes I want the tactical benefits of an SQL db in situations where installation of PostgreSQL is not warranted. Throw away, small scale, experimental, etc.

    SQLite is doing very well for my use cases. Especially using the experimental Node.js version - all I have been using. So I would highly recommend using it when you get to the point of moving data around.

    On the other hand there is one big thing. And this may be only the Node.js version. But several times when I have been doing development work my file based database has become corrupted. I violated some constraint, the sql was poorly formed, etc. Always my fault. I simply remove the file that SQLite is using and create a new one or use a backup copy.

    The thing is that I have never seen a PostgreSQL DB become corrupt. No matter what horrendous code I wrote. Never. I have not really appreciated how completely bullet proof it has been for me.

    Just my 2 cents worth.

    • MyOutfitIsVague6 days ago
      > But several times when I have been doing development work my file based database has become corrupted. I violated some constraint, the sql was poorly formed, etc. Always my fault. I simply remove the file that SQLite is using and create a new one or use a backup copy.

      That really shouldn't be able to happen unless you're using the API wrong or have weakened safety with pragmas or database configurations that trade off resilience for performance. Even with crashes, attempts to violate constraints, poorly formed SQL, etc, the file should never become corrupt. If it does, that's a bug with SQLite, not you. Corruption bugs are exceptionally rare in SQLite these days.
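      If a file does look broken, SQLite can check it directly, and the durability-related settings worth auditing are pragmas like these (sketch):

      ```
      PRAGMA integrity_check;  -- full scan; returns 'ok' or a list of problems
      PRAGMA quick_check;      -- faster variant that skips some cross-checks

      -- Settings that trade resilience for speed and can lead to corruption on crash/power loss:
      -- PRAGMA synchronous = OFF;
      -- PRAGMA journal_mode = OFF;   -- or MEMORY
      ```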

      • talkingtab5 days ago
        Yes, I understand I was "using the API wrong". That is true. But the point I was trying to make was that I have used the PostgreSQL API "wrong" or "worse" and my data has never been corrupted. With SQLite, several different times I opened a database and got an error when dropping a table. If I simply restarted with a clean copy of the file, the code worked perfectly { fix bug, replace copy, restart }. And to be clear this was not data - like the row had a 3 not 2 - this was that running "drop table" on an existing table failed. Internal stuff in the file got broken. It happens when a process crashes.

        My point is NOT that users should not adopt SQLite, just the opposite. I have found that using SQL helps me build things much more easily. And better. It helps you think about your code more effectively.

        My recent adoption of SQLite and using it in smaller, lighter situations has worked extremely well and I will keep on using it. But... my limited experience and comparison of the two products suggests that as you would expect given the Lite, there are some differences that have real life consequences.

      • SomeUserName4325 days ago
        > Corruption bugs are exceptionally rare in SQLite these days.

        I develop a third party software and I receive reports of corrupted SQLite databases roughly every 2 months.

        However, without any reproducibility, it might very well be due to hardware faults etc.

        Still beats having end-users install something like postgres.

    • djtidau6 days ago
      Have you given pglite a try?
  • byoung26 days ago
    Looks interesting. How would you approach a write replica and multiple read replicas with this setup? My holy grail would be allowing users to get data from a read replica, but also writing to the read replica, which then forwards those writes to the primary write DB.
    • NathanFlurry6 days ago
      I think the other comments have the application-level approaches covered.

      However, I suspect the infrastructure will provide this natively as it matures:

      - Cloudflare will probably eventually add read replicas for Durable Objects. They're already rolling it out for D1 (their other SQLite database offering). [1]

      - Turso has their own story for read replicas. [2]

      [1] https://blog.cloudflare.com/building-d1-a-global-database/#s... [2] https://docs.turso.tech/features/embedded-replicas/introduct...

    • hinkley6 days ago
      One of the classic solutions to this was to put the data behind a cluster and route by verb. All the GET requests go to the replicas and everything else goes to the writeable instance.
      • datadrivenangel6 days ago
        Suffers from dirty reads: Write to the primary, get confirmation of write-success, reload page, user sees 'stale' data from the read-replica and thinks you lost their data.
        • yurishimo6 days ago
          A temporary header that forwards the next X requests to the main write database is a possible solution. Requires a bit more oomph to handle the traffic and probably isn’t good for a write-heavy app, but for a generally read-heavy CRUD app, it’s a decent compromise.
          • hinkley6 days ago
            I've also seen a writeup of someone who used clocks. The write is transaction 1234, so the client has to read from a replica that is caught up to 1234 in the WAL, or get bounced to another that does.

            And of course there's the Phoenix LiveView solution where the write contains the new data, gets injected back into the page that contained the edit link. That doesn't quite cover the save-reload loop that some users with well-earned trust issues follow.

    • slt20216 days ago
      there is no need for a read replica, because every micro sqlite partition has a very low workload (single digit queries per second)
  • NathanFlurry6 days ago
    Author here, happy to answer questions!
    • robertlagrant6 days ago
      I love the idea of Durable Objects, and have thoughts about it in the health space (although having a single provider of them is probably a bad thing) but cross-cutting questions such as this seem to be quite difficult to achieve.

      From the chat logs example in your article: how do you cope with a requirement such as "I as a user want to see all my latest thread activity in one place, across all my chat rooms?"

      • jmull6 days ago
        That's an easy case: write to two places (one for the chat and one for the user's activity).

        In this case I think you can let them become inconsistent in the face of, e.g., write errors.

        • NathanFlurry6 days ago
          This.

          > In this case I think you can let them become inconsistent in the face of, e.g., write errors.

          For devs using CF Durable Objects, people frequently use CF Queues or CF Workflows to ensure that everything is eventually consistent without significant overhead.

          It's a similar pattern to what large cos already do at scale with keeping data up to date across multiple partitions with Cassandra/DynamoDB.

      • korkybuchek6 days ago
        > From the chat logs example in your article: how do you cope with a requirement such as "I as a user want to see all my latest thread activity in one place, across all my chat rooms?"

        create a copy, for example

    • grounder6 days ago
      In your chat channel example, you have a table for messages, and a table for participants. How do you join the participants in this chat channel database with whichever database the participants are actually defined in, so the application would be able to show participant details (name, avatar, etc.)?
      • NathanFlurry6 days ago
        The two tables are intended to be part of the same "chat" partition (ie SQLite database). You can join them with a native SQLite query. Seems I should make this more clear.

        Cheers
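        For example, inside one chat partition a plain join works (table and column names follow the article's example, so treat them as illustrative):

        ```
        SELECT m.id, m.body, m.sent_at, p.user_id
        FROM messages AS m
        JOIN participants AS p ON p.user_id = m.sender_id
        ORDER BY m.sent_at DESC
        LIMIT 50;
        ```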

        • grounder5 days ago
          I understand the two tables in your chat example, I think. I'm wondering how you get the rest of the user profile data (name, for example). Is that table in a totally different SQLite database? If so, can you join on that, or do you need to query it separately? Thanks!
          • NathanFlurry5 days ago
            Yep. Most designs I've seen put user profiles under a different table which requires a separate query.
    • abdullin6 days ago
      Interesting read, thank you!

      Do you use any special tools to manage all these separate databases, track performance and debug problems?

      • NathanFlurry6 days ago
        The folks over at StarbaseDB (https://starbasedb.com/) are working on building tools for sharded SQLite.

        From the companies I've talked to, most developers using this architecture are building quick scripts to do this in-house. Both Turso and Durable Objects SQLite already have a surprising amount of usage that people don't talk about much publicly yet, so I suspect some of this tooling will start to be published in the next year.

    • neilpa6 days ago
      Randomly noticed your post is dated in the future - December 16, 2025
  • DeathArrow5 days ago
    How do you solve load balancing, availability, replication, and backups with SQLite?

    If running under an orchestration framework like Kubernetes, each container in a pod will have its own SQLite database. How can we assure the data is consistent across pods?

    • nicky05 days ago
      I think of SQLite on the server as something that you would use when running in a traditional server (VPS or dedicated host) setup rather than an orchestrated, container-based one.
    • codethief5 days ago
      I remember there being an HN post a month or two ago about a solution to this exact problem and people using it seemed very happy with it.
  • EGreg6 days ago
    Lately I've seen posts about DuckDB, which looks really cool, but SQLite seems to be compilable to WASM so it can be used in some kind of container. How do the two compare?
    • necubi6 days ago
      DuckDB is an OLAP (analytical) query engine, sqlite is an OLTP (transactional) database. Modern OLAP engines store and represent data in columnar formats, which makes them very fast at queries that touch many rows (particularly if only a few columns are needed). Queries like "sum all sales for the past month by store."

      But they're slow at (or incapable of) handling inserts, updates, and deletes, because the columnar formats are typically immutable. They're also relatively slow at operations that need to look at all of the data for a particular row.

      OLTP databases are much better for use cases where you're frequently inserting, updating, and accessing individual rows, such as the database backing a web application.

      A common pattern is to use an OLTP database (like postgres) to back your application, then replicate the data to an OLAP store like Clickhouse or a data lake to run analytical queries that would overwhelm postgres.
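
      Concretely, the two query shapes look something like this (schema invented for illustration):

        // OLAP shape: touches every row but only two columns; a columnar engine
        // like DuckDB or ClickHouse only has to scan those two columns.
        const analyticalQuery = `
          SELECT store_id, SUM(amount) AS total_sales
          FROM sales
          WHERE sold_at >= ?   -- e.g. the start of the past month
          GROUP BY store_id
        `;

        // OLTP shape: touches one row but every column; a row store like
        // SQLite or Postgres answers it with a single index lookup.
        const pointQuery = `SELECT * FROM sales WHERE id = ?`;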

    • NathanFlurry6 days ago
      DuckDB crushes SQLite on heavy analytical workloads: 915x faster, according to ClickBench. (Link below since it's looong.)

      DuckDB also has a WASM target: https://duckdb.org/docs/stable/clients/wasm/overview.html

      I don't know enough about DuckDB to understand the tradeoffs it made compared to SQLite to achieve this performance.

      https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQi...

      • IshKebab6 days ago
        As I understand it, DuckDB stores columns separately (column-major), whereas SQLite stores rows separately (row-major). DuckDB is like a struct of arrays and SQLite is like an array of structs.

        So which is faster depends on your access pattern. There are dumb, stupid, terrible names for the "access all of one row" (OLTP) and "access all of one column" (OLAP) access patterns.
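
        A toy TypeScript illustration of the two layouts (the data is made up):

          // Array of structs (row-major, like SQLite): each record's fields sit together.
          const rows = [
            { id: 1, user: "alice", sentAt: 1700000000 },
            { id: 2, user: "bob", sentAt: 1700000060 },
          ];

          // Struct of arrays (column-major, like DuckDB): each column sits together.
          const cols = {
            id: [1, 2],
            user: ["alice", "bob"],
            sentAt: [1700000000, 1700000060],
          };

          // OLTP-style access ("give me all of row 2") is natural on the row layout...
          const secondRow = rows.find(r => r.id === 2);

          // ...while OLAP-style access ("average sentAt across all rows") only touches
          // one column in the columnar layout, so far less data gets scanned.
          const avgSentAt = cols.sentAt.reduce((a, b) => a + b, 0) / cols.sentAt.length;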

      • crazygringo6 days ago
        "By 915x" doesn't seem remotely plausible.

        Maybe there's some edge case they've found where SQLite is badly optimized and DuckDB is totally optimized, but that's absolutely not the general case.

        Databases are primarily limited by disk/IO speed. Yes, there are plenty of optimizations, but they result in 10% improvements, not 915x.

        • ianburrell6 days ago
          DuckDB is an in-memory columnar OLAP database. It is going to be much faster at analytics queries than a disk-based OLTP database. It is optimized for fast queries but can't write or handle large data.
          • crazygringo6 days ago
            Oh, got it, thanks. So it's a totally different product, not an alternative. Yes, that kind of speedup can be explained by using memory instead of disk -- like I said, it's disk/IO speed. Thanks!
            • GrayShade5 days ago
              Not necessarily. If your table is very wide but you're only reading one column, you'll do massively less I/O with a columnar or hybrid structure. And that's even before other tricks like storing the min/max values of each column in the pages (so you can skip pages for range queries) or SIMD.
      • EGreg6 days ago
        Why still use SQLite then?

        But how does WASM DuckDB store files in IndexedDB? Any info on that?

        • gwking6 days ago
          I believe the locking models are different, making DuckDB less suitable for concurrent reads/writes, but you will have to look up the specifics. As always, for a server environment SQLite should be set to WAL mode, and comparisons should be made against that rather than the much older, less concurrent default.

          As I recall, DuckDB's concurrency model did not sound viable for a web server, but I may be behind the times or outright wrong.
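
          For reference, switching WAL on from a Node.js wrapper is a one-liner. A sketch with better-sqlite3; the pragma values are common starting points, not universal advice:

            import Database from "better-sqlite3";

            const db = new Database("app.db");
            db.pragma("journal_mode = WAL");   // readers no longer block the single writer
            db.pragma("synchronous = NORMAL"); // usual WAL pairing; trades a little durability for speed
            db.pragma("busy_timeout = 5000");  // wait up to 5s for the write lock instead of erroring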

  • mickeyben6 days ago
    > No built-in cross-database querying, making complex analytics difficult without a dedicated data lake

    I've looked at Turso before and this is exactly what came to mind. I do see some use cases where it could work, but for most real-world applications, this is an absolute red flag that shouldn't be overlooked.

    • NathanFlurry6 days ago
      This seems to be by far the biggest hesitation I've heard. There absolutely needs to be a good story here for both (a) ad-hoc cross-partition queries and (b) automatically building a data lake without having to know what ETL stands for.

      However, this isn't so different from Cassandra/DynamoDB, which have a similar problem: you _can_ query cross-partition, but it's strongly discouraged and will strain any reasonably sized cluster.

  • 0xbadcafebee6 days ago
    Still in the peak of inflated expectations, I see. Give it 2-3 more years to get to the trough of disillusionment.
    • NathanFlurry6 days ago
      Hopefully, it matures into a healthy open-source ecosystem that doesn’t rely on proprietary databases.

      More companies than people realize are already building and scaling with DO SQLite or Turso internally. Almost every company I've talked to that chooses Postgres hits scaling issues around Series A; the ones on this architecture aren't.

      • 0xbadcafebee5 days ago
        It's nice to have open source things. But not every open source thing is the right thing to use. You have to use systems design skills to pick a solution that is correct for your use case.

        "SQLite but distributed" is like NoSQL during its hype cycle: a good choice for one or two use cases, a bad choice for most every other use case, yet people are choosing it for every use case, because it's trendy and has lots of marketing hype. It appears to work at first, so people rejoice and make bold claims about it solving every problem in the universe. It's only later that the poor design choice bites them in the butt and they go looking for the next trendy thing.

  • c4pt0r6 days ago
    I'm curious whether it matters that it's SQLite, since it's already a serverless product.
    • NathanFlurry6 days ago
      If you care only about serverless, databases like PlanetScale, CockroachDB Cloud, and DynamoDB work well.

      The biggest strength of using SQLite here is that it provides the benefits of a familiar SQL environment with the scaling benefits of Cassandra/DynamoDB.

  • eduction6 days ago
    This is really bad.

    DBs other than sqlite need their own dedicated servers, it says.

    A good way to store online chats is db-per-chat, it says.

    Sqlite scales great with [fusillade of very specific proprietary cloud services], it says.

    Sqlite is great but companies like this seem determined to ruin it. MongoDB all over again.

    • NathanFlurry6 days ago
      Author here.

      To clarify — is your concern that the only scaling options I listed are proprietary services?

      If so, I completely agree. This article was inspired by a tool we're building internally, based on the same architecture. We knew this was the right approach, but we refuse to rely on proprietary databases, so we built our own in-house.

      We’re planning to open-source it soon.

  • jacktheturtle6 days ago
    SQLite is a very good technology.
  • throwaway9843936 days ago
    [dead]
  • curtisszmania5 days ago
    [dead]