196 points by slaily 8 days ago | 9 comments
  • slaily 8 days ago
    If you’re building Python async apps (FastAPI, background jobs, etc.) with SQLite, you’ll eventually hit two issues:

    - Opening/closing connections is fast, but not free—overhead adds up under load

    - SQLite writes are globally locked

    aiosqlitepool is a tiny library that adds connection pooling for any asyncio SQLite driver (like aiosqlite):

    - It avoids repeated database connection setup (syscalls, memory allocation) and teardown (syscalls, deallocation) by reusing long-lived connections

    - Long-lived connections keep SQLite's in-memory page cache "hot." This serves frequently requested data directly from memory, speeding up repetitive queries and reducing I/O operations

    - Allows your application to process significantly more database queries per second under heavy load
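
    To give a feel for the pattern, here's a rough hand-rolled sketch of the same idea on top of aiosqlite (illustrative only; aiosqlitepool's own API may differ, see the repo):

        import asyncio
        from contextlib import asynccontextmanager

        import aiosqlite

        class TinyPool:
            """Open N connections once, hand them out, and put them back instead of closing them."""

            def __init__(self, path, size=4):
                self._path = path
                self._size = size
                self._conns = asyncio.Queue()

            async def open(self):
                for _ in range(self._size):
                    conn = await aiosqlite.connect(self._path)
                    await conn.execute("PRAGMA journal_mode=WAL")  # readers stop blocking the writer
                    await self._conns.put(conn)

            @asynccontextmanager
            async def acquire(self):
                conn = await self._conns.get()   # wait for a free, already-warm connection
                try:
                    yield conn
                finally:
                    await self._conns.put(conn)  # return it to the pool instead of closing it

            async def close(self):
                while not self._conns.empty():
                    await (await self._conns.get()).close()

    Inside request handlers you then do `async with pool.acquire() as conn: ...`, and each connection's page cache stays warm between requests.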

    Enjoy!

    • gwbas1c 4 days ago
      Important word:

      > Python

      Your repo and the readme.md don't say "python." The title of this post doesn't say "python."

      It took me a while to realize that this is for python, as opposed to a general-purpose cache for, say, libsqlite.

      • sjsdaiuasgdia 4 days ago
        Let's see...

        There are tags showing which Python versions are supported.

        The root dir of the repo contains a 'pyproject.toml' file.

        The readme contains installation instructions for pip, poetry, and uv, all of which are Python package managers.

        The readme contains example code, all of which is in Python.

        The readme references asyncio, a Python module that is included in the standard library for Python 3.

        The 'Languages' widget on the page shows 99.2% of the repo's code is in Python.

        Every file not in the root dir has a .py extension.

        Yeah, I can see why it was so hard to figure out...

        • tracker1 4 days ago
          I'm mostly with you... it would still be nice if the title reflected the language limitation/feature.
      • kstrauser 4 days ago
        The tag at the top of the readme, under the title, shows which Python versions it supports. If it never mentioned Python at all, that would be the tip-off.
    • slashdev 5 days ago
      How does this help with the second issue, the write locks?
      • ncruces 5 days ago
        No idea if it applies, but one way would be to direct all writes (including any transaction that may eventually write) to a single connection.

        Then writers queue up, while readers are unimpeded.
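
        To sketch what that can look like with asyncio (aiosqlite assumed, names made up): one task owns the only write connection and drains a queue, so writers line up inside the application while readers keep using their own connections.

            import asyncio

            import aiosqlite

            async def writer_loop(db_path, jobs: asyncio.Queue):
                conn = await aiosqlite.connect(db_path)   # the one and only write connection
                try:
                    while True:
                        sql, params, done = await jobs.get()
                        try:
                            await conn.execute(sql, params)
                            await conn.commit()
                            done.set_result(None)
                        except Exception as exc:          # surface the failure to the caller
                            done.set_exception(exc)
                finally:
                    await conn.close()

            async def submit_write(jobs: asyncio.Queue, sql, params=()):
                done = asyncio.get_running_loop().create_future()
                await jobs.put((sql, params, done))
                await done                                # resolves once the writer has applied it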

        • dathinab 5 days ago
          if you enable WAL mode with sqlite then readers are not blocked by the writer, so only writers queue up, without needing any special-case handling to achieve it

          (in general you _really_ should use WAL mode if using sqlite concurrently; you should also read the documentation about WAL mode, though)
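
          For reference, the usual setup is something like this (stdlib sqlite3 shown; the same pragmas apply through aiosqlite):

              import sqlite3

              conn = sqlite3.connect("app.db")
              conn.execute("PRAGMA journal_mode=WAL")    # readers and the writer stop blocking each other
              conn.execute("PRAGMA synchronous=NORMAL")  # common pairing with WAL
              conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5s for the write lock instead of erroring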

          • ncruces 5 days ago
            Writers won't queue up, rather they'll storm the place, taking turns at asking “can I go now” and sleeping for (tens, hundreds of) milliseconds at a time.

            This only gets “worse” as computers get faster: imagine how many write transactions a serial writer could complete (WAL mode and normal synchronous mode) while all your writers are sleeping after the previous one left, because they didn't line up?

            And, if you have a single limited pool, your readers will now be stuck waiting for an available connection too (because they're all taken by sleeping writers).

            It's much fairer and more efficient for writers to line up with blocking application locks.

          • rich_sasha 5 days ago
            I was running into some horrendous issues with WAL, where the WAL file would grow boundlessly, eventually leading to very slow reads and writes.

            It's fixable by periodically forcing the WAL to be truncated, but it took me a lot of time and pain to figure it out.
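
            (For reference, forcing that truncation looks roughly like this with the stdlib sqlite3 module; run it periodically or after big write bursts:)

                import sqlite3

                conn = sqlite3.connect("app.db")
                # TRUNCATE checkpoints as much of the WAL as it can and then resets the
                # -wal file to zero bytes; returns (busy, wal_pages, pages_checkpointed).
                busy, wal_pages, checkpointed = conn.execute(
                    "PRAGMA wal_checkpoint(TRUNCATE)"
                ).fetchone()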

            • dathinab 4 days ago
              That is why I said to read the WAL doc page in a different answer ;)

              They do point out the risks here: https://sqlite.org/wal.html#avoiding_excessively_large_wal_f...

              sqlite's design makes a lot of SQL concurrency synchronization edge cases much simpler, as you can rely on the single-writer-at-a-time limitation. And it has some great hidden features for using it as client application state storage. But there are use cases it's just not very good at, and moving from sqlite to other DBs can be tricky (if you ever relied on the exclusive write transaction, or on the way cells are blobs which can mix data types, even if it was by accident)

              • rich_sasha 4 days ago
                I did read it. For whatever reason, automatic checkpoints basically would stop from time to time, and the WAL file would start growing like crazy.

                In the end I wrote an external process that forced a checkpoint a few times a day, which worked. I came across other exasperated people in various dark corners of the Internet with the same symptoms.

            • normie3000 4 days ago
              Interesting, were there any warning signs beyond general query slowdown?
              • rich_sasha 4 days ago
                No warning signs and very little about it on the Internet. Just performance slows to a grind. Also hard to replicate.

                If I had a blog, I'd be writing about it.

                • normie3000 2 days ago
                  And how big was the WAL file getting compared to normal? As someone running SQLite in prod it would be comforting at least to have some heuristics to help detect this situation!
              • bawolff 4 days ago
                I think this is mentioned in the docs https://www.sqlite.org/wal.html
          • le-mark 4 days ago
            WAL doesn’t cure concurrency issues for SQLite. WAL plus a single-writer, multiple-reader threading model is required. It’s blazing fast though.
    • mostlysimilar 5 days ago
      Around what amount of load would you say the overhead of opening/closing becomes a problem?
      • jitl 4 days ago
        It depends hugely on how you decide to manage the connection objects. If you have a single-thread / single-core server that only ever opens a single connection, then connection open overhead is never a problem, even under infinite load.

        The two main issues with opening a connection are:

        1. There is a fixed cost of O(database schema) time spent building per-connection state. Ideally SQLite could use a “zygote” connection that can refresh itself and then get cloned to create a new one, instead of doing this work from scratch every time.

        2. There is O(number of connections) time spent looking at a list of file descriptors in global state under a global lock. This one is REALLY BAD if you have >10,000 connections so it was a major motivator for us to do connection pooling at Notion. Ideally SQLite could use a hash table instead of a O(n) linear search for this, or disable it entirely.

        Both of these issues are reasons I’m excited about Turso’s SQLite rewrite in Rust - it’s so easy to fix both of these issues in Rust (like a good hash table is 2 LoC to adopt in Rust) whereas in the original C it’s much more involved to safely and correctly fix the issue in a fork.

        Furthermore, it would be great to share more of the cache between connections as a kind of “L2 cache”; again tractable and safe to build in Rust but complicated to build in a fork of the original C.

        Notion uses a SQLite-backed server for our “Database” product concept that I helped write, and we ran into a lot of these kinds of issues scaling reads. We implemented connection pooling over the better-sqlite3 Node module to mitigate these issues. We also use Turso’s existing SQLite C fork “libsql” for some connections, since it offers a true async option backed by a thread pool under the hood in the Node driver, which helps in cases where you have a bottleneck serializing or deserializing data from “node” layout to “SQLite C” layout, or many concurrent writes to different DBs from a single NodeJS process.

    • bootsmann 4 days ago
      Is there a significant advantage of the sqlite in-memory page cache over the page cache that's already included with the operating system?
      • jitl 4 days ago
        Yes: SQLite needs to inspect the schema when it opens a new connection object and does some O(number of conns) lookups in global state during this process. It’s best to avoid re-doing this work.
    • manmal 5 days ago
      Doesn’t SQLite have its own in-memory cache? Is this about having more control re cache size?
      • dathinab 5 days ago
        yes, per "open connection", hence why not closing+reopening connections all the time helps the cache ;)
  • d1l 5 days ago
    This is strange on so many levels.

    SQLite does not even do network I/O.

    How does sharing a connection (and transaction scope) in an asyncio environment even work? Won’t you still need a connection per asyncio context?

    Does sqlite_open really take long compared to the inevitable contention for the write lock you’ll see when you have many concurrent contexts?

    Does sqlite_open even register in comparison with the overhead of the python interpreter?

    What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

    • simonw 5 days ago
      If you're talking to a 100KB SQLite database file this kind of thing is likely unnecessary, just opening and closing a connection for each query is probably fine.

      If you're querying a multi-GB SQLite database there are things like per-connection caches that may benefit from a connection pool.

      > What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

      Basically yes - aiosqlite works by opening each connection in a dedicated thread and then sending async queries to it and waiting for a response that gets sent to a Future.

      https://github.com/omnilib/aiosqlite/blob/895fd9183b43cecce8...
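
      A stripped-down sketch of that pattern, just to make it concrete (aiosqlite itself is more elaborate than this):

          import asyncio
          import queue
          import sqlite3
          import threading

          class ThreadedConnection:
              """One worker thread owns the sqlite3 connection; coroutines await Futures."""

              def __init__(self, path):
                  self._jobs = queue.Queue()
                  threading.Thread(target=self._worker, args=(path,), daemon=True).start()

              def _worker(self, path):
                  conn = sqlite3.connect(path)  # created and used only in this thread
                  while True:
                      fn, fut, loop = self._jobs.get()
                      try:
                          result = fn(conn)
                          loop.call_soon_threadsafe(fut.set_result, result)
                      except Exception as exc:
                          loop.call_soon_threadsafe(fut.set_exception, exc)

              async def run(self, fn):
                  loop = asyncio.get_running_loop()
                  fut = loop.create_future()
                  self._jobs.put((fn, fut, loop))
                  return await fut

          # e.g. rows = await db.run(lambda c: c.execute("SELECT 1").fetchall())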

      • d1l 5 days ago
        That's even crazier - so you're using asyncio because you have a ton of slow network-bound stuff - but for your database access you are running every sqlite connection in its own thread and just managing those threads via the asyncio event loop?
        • reactordev 5 days ago
          Thread pooling for databases, whether network based, or disk based, is common. A lot of times it will be baked into your client, so the fact that you think it’s crazy means you’ve only dealt with clients that did this for you.

          For really large data sets, you can query and wait a few minutes before getting a result. Do you really want to await that?

        • paulddraper 5 days ago
          This is a common paradigm for blocking APIs (e.g. the sqlite driver)
        • quietbritishjim 5 days ago
          What is crazy about that?
          • lttlrck 5 days ago
            Of course I don't know what the parent is thinking, but my thought is: why can't it be entirely event loop driven? What are the threads adding here?

            (I don't know anything about that project and this isn't meant as a criticism of its design or a challenge - cos I'd probably lose :-) )

            • eurleif 5 days ago
              SQLite doesn't have a separate server process; it does all of the work for queries in your process. So it's intrinsically CPU-heavy, and it needs threads to avoid blocking the event loop.

              One way to look at it is that with a client-server database and an async client library, you have a thread pool in the database server process to do the heavy lifting, and async clients talk to it via TCP. With SQLite, you have that "server" thread pool in the same process instead, and async "clients" talk to it via in-process communication.

            • mayli 5 days ago
              Because the sqlite lib that Python ships isn't async, and sqlite itself usually doesn't provide an async API.
            • maxbond 5 days ago
              Python's asyncio is single threaded. If you didn't send them into a different thread, the entire event loop would block, and it would degenerate to a fully synchronous single threaded program with additional overhead.
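
              A quick illustration: pushing the blocking sqlite3 call into a worker thread keeps the loop free (the table name here is made up):

                  import asyncio
                  import sqlite3

                  def slow_query(path):
                      conn = sqlite3.connect(path)
                      try:
                          # blocks this worker thread, not the event loop
                          return conn.execute("SELECT count(*) FROM big_table").fetchone()
                      finally:
                          conn.close()

                  async def handler():
                      row = await asyncio.to_thread(slow_query, "app.db")  # other tasks keep running
                      return row
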
      • crazygringo 5 days ago
        > If you're querying a multi-GB SQLite database

        In which case SQLite is probably the wrong tool for the job, and you should be using Postgres or MySQL, which are actually designed from the ground up for lots of concurrent connections.

        SQLite is amazing. I love SQLite. But I love it for single-user single-machine scenarios. Not multi-user. Not over a network.

        • simonw 5 days ago
          Multi-GB is tiny these days.

          I didn't say anything about concurrent access. SQLite with WAL mode is fine for that these days for dozens of concurrent readers/writers (OK only one writer gets to write at a time, but if your writes queue for 1-2ms who cares?) - if you're dealing with hundreds or thousands over a network then yeah, use a server-based database engine.

          • da_chicken 5 days ago
            Multi GB is tiny, but that doesn't make SQLite magically better at large queries of multi GB databases. That's why DuckDB has been getting more popular.
            • benjiro 5 days ago
              Sqlite != DuckDB... two totally different DB types. One is a row-based database, the other is column-based. They target different workloads, and both can handle extremely heavy ones.
              • da_chicken 4 days ago
                Yes, that's the point I'm making. If SQLite didn't ever struggle with databases in the GB ranges, then there wouldn't be much call to replace it with DuckDB. The fact that there's significant value in an OLAP RDBMS suggests that SQLite is falling short.
                • benjiro 19 hours ago
                  The problem is not SQLite struggling with databases in the GB range. It handles that with ease. OLAP requires a different database structure, namely column storage (preferably with compaction / compression / other such algorithms).

                  That is DuckDB's selling point. You want data analysis, you go DuckDB. You want OLTP, you go SQLite. Or combine both if you need both.

                  Even Postgres struggles with OLAP workloads, which is why we have solutions like TimescaleDB, a Postgres plugin. That, ironically, uses Postgres rows but then packs column-oriented information into row fields.

                  That does not mean that Postgres is flawed for working with big data. Same with SQLite... Different data has different needs, and that has nothing to do with database size.

          • brulard 5 days ago
            I always had trouble getting multiple processes write access to the sqlite file. For example, if I have a node.js backend working with that file and I try to access the file with a different tool (adminer for example), it fails (file in use or something like that). Should it work? I don't know if I'm doing something wrong, but this is my experience across multiple projects.
            • dathinab 5 days ago
              There are multiple aspects to it:

              - sqlite is a bit like an RW-locked database: either any number of readers, xor exactly one writer and no readers

              - but with WAL mode enabled, readers and writers (mostly) don't block each other, i.e. you can have any number of readers and up to one writer (so normally you want WAL mode if there is any concurrent access)

              - if a transaction (including one implied by a single command without "begin", or e.g. an upgrade from a read to a write transaction) is blocked for too long by a different process's write transaction, SQLITE_BUSY might be returned

              - in addition, file locks might be used by SQL bindings or similar to prevent multi-application access; normally you wouldn't expect that, but given that sqlite had an OPEN_EXCLUSIVE option in the past (which should be ignored by halfway modern implementations) I wouldn't be surprised to find it

              - your file system might also prevent concurrent access to sqlite db files; this is a super obscure niche case, but I have seen it once (in a shared server, network filesystem(??) context, probably because sqlite really doesn't like network filesystems, which often have unreliable implementations of some of the primitives sqlite needs for proper synchronization)

              As other comments pointed out, enabling WAL mode will (probably) fix your issues

            • Groxx 5 days ago
              They can't write concurrently, but generally speaking yes, they can: https://sqlite.org/faq.html#q5

              Your throughput will be much worse than a single process, but it's possible, and sometimes convenient. Maybe something in your stack is trying to hold open a writable connection in both processes?

            • cyanydeez 5 days ago
              PRAGMA journal_mode = WAL;
            • simonw 5 days ago
              That is because the default SQLite journal mode is a rollback journal; for concurrent reads and writes you need to switch it to WAL.
              • brulard 4 days ago
                I use WAL basically everywhere. I thought that would fix my problem some time ago, but it didn't
                • simonw 4 days ago
                  Are you seeing SQLITE_BUSY errors?

                  Those are a nasty trap. The solution is non-obvious: you have to use BEGIN IMMEDIATE on any transaction that performs at least one write: https://simonwillison.net/tags/sqlite-busy/
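
                  In stdlib-sqlite3 terms the pattern looks roughly like this (autocommit mode so the explicit BEGIN is honored; the table is just an example):

                      import sqlite3

                      conn = sqlite3.connect("app.db", isolation_level=None)  # autocommit; manage transactions by hand
                      conn.execute("PRAGMA busy_timeout=5000")

                      def add_event(payload):
                          conn.execute("BEGIN IMMEDIATE")  # take the write lock up front, so busy_timeout applies here
                          try:
                              conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
                              conn.execute("COMMIT")
                          except Exception:
                              conn.execute("ROLLBACK")
                              raise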

                  • asa400 4 days ago
                    This is correct, and one of the things I tell anybody who is considering using SQLite to watch out for. The busy timeout and deferred write transactions interact in a really non-intuitive way, and you have to use BEGIN IMMEDIATE for any transaction that performs any writes at all; otherwise SQLite gives up and throws an error without waiting if another transaction is writing when your transaction attempts to upgrade from a read to a write.
                  • brulard 4 days ago
                    Thanks for the direction. I thought SQLite was limited in how multiple processes can access the db files, but now I see the problem is on my end. Btw. I'm a fan of your AI/LLM articles, thanks for your awesome work.
          • Asmod4n 4 days ago
            An average human being can produce around 650MB of text during a whole working lifetime when doing nothing but writing text 4 hours per weekday without any interruptions.

            Saying multi-gigabyte databases for single-user usage are the norm feels insane to me.

            • simonw 4 days ago
              Have you seen the size of the database of email you've received?
        • Kranar 5 days ago
          SQLite is a great database for organizing data in desktop applications, including both productivity software and even video games. It's certainly not at all unreasonable for those use cases to have files that are in the low GB and I would much rather use SQLite to process that data instead of bundling MySQL or Postgres into my application.
        • naasking 5 days ago
          > In which case SQLite is probably the wrong tool for the job

          Why? If all it's missing is an async connection pool to make it a good tool for more jobs, what's the problem with just creating one?

          • nomel 5 days ago
            It's a bit of reinventing the wheel, since solving all the problems that come with network access is precisely why those databases exist, and is what they've already done.

            asyncpg is a nice python library for postgres.

            I think postgres releasing a nice linkable, "serverless" library would be pretty amazing, to make the need for abusing sqlite like this (I do it too) go away.

            • jitl 4 days ago
              Postgres has really not solved problems that come with being a networked server and will collapse under concurrent connections far before you start to feel it with SQLite. 5000 concurrent connections will already start to deadlock your Postgres server; each new connection in Postgres is a new Postgres process and the state for the connection needs to be written to various internal tracking tables. It has a huge amount of overhead; connection pooling in PG is required and often the total system has a rather low fixed limit compared to idk, writing 200 lines of python code or whatever and getting orders of magnitude more connections out of a single machine.
              • anarazel 4 days ago
                A connection definitely has overhead in PG, but "5000 concurrent connections will already start to deadlock your Postgres server" is bogus. People quite routinely run with more connections.

                Check the throughput graphs from this blog post from 2020 (for improvements I made to connection scalability):

                https://techcommunity.microsoft.com/blog/adforpostgresql/imp...

                That's for read-mostly work. If you do write very intensely, you're going to see more contention earlier. But that's way way worse with sqlite, due to its single writer model.

                EDIT: Corrected year.

                • jitl 4 days ago
                  Yeah, I think I'm conflating our fear of >5000 connections for our Postgres workload (read-write that is quite write heavy) with our SQLite workload, which is 99.9% read.

                  The way our SQLite workload works is that we have a pool of hundreds of read connections per DB file, and a single writer thread per DB file that keeps the DB up to date via CDC from Postgres; basically using SQLite as a secondary index "scale out" over data primarily written to Postgres. Because we're piping Postgres replication slot -> SQLite, we don't suffer any writer concurrency and throughput is fine to keep up with the change rate so far. Our biggest bottleneck is reading the replication slot on the Postgres side into Kafka with Debezium.

            • simonw 5 days ago
              https://pglite.dev/ is a version of that, in 3MB of WASM.
              • actionfromafar 4 days ago
                That's wild. Not sure if I love it or hate it, but I'm impressed.
        • jitl 4 days ago
          Postgres will shit itself without a connection pooling proxy server like PGBouncer if you try even like 5000 concurrent connections because Postgres spawned a UNIX process per inbound connection. There’s much more overhead per connection in Postgres than SQLite!
          • drzaiusx11 4 days ago
            Likewise MySQL will shit itself with just a couple hundred connections unless you have a massive instance size. We use AWS' RDS proxy in front for a similar solution. I've spent way too many hours tuning pool sizes, resolving connection pinning issues...
    • Retr0id 4 days ago
      My preferred python wrapper for sqlite is apsw. The maintainer gives a good answer here for why not to use an async interface in most cases: https://github.com/rogerbinns/apsw/discussions/456#discussio...

      It really depends on what your workload looks like, but I think synchronous will win most of the time.

    • charleslmunger 5 days ago
      A connection pool is absolutely a best practice. One of the biggest benefits is managing a cache of prepared statements, the page cache, etc. Maybe you have temp tables or temp triggers too.

      Even better is to have separate pools for the writer connection and readers in WAL mode. Then you can cache write relevant statements only once. I am skeptical about a dedicated thread per call because that seems like it would add a bunch of latency.

    • pjmlp 4 days ago
      For some strange reason, some people feel like using SQLite all over the place, even when a proper RDBMS would be the right answer.
      • 9rx 4 days ago
        It is not that strange when you consider the history. You see, as we started to move away from generated HTML into rich browser applications, we started to need minimal direct DBMS features to serve the rich application. At first, a few functions were exposed as "REST APIs". But soon enough those few features turned into full-on DBMSes, resulting in a DBMS in front of a DBMS. But then people, rightfully, started asking: "Why are we putting a DBMS in front of a DBMS?"

        The trouble is that nobody took a step back and asked: "Can we simply use the backing DBMS?" Instead, they trudged forward with "Let's get rid of the backing DBMS and embed the database engine into our own DBMS!" And since SQLite is a convenient database engine...

      • fidotron 4 days ago
        I recently encountered a shared SQLite db being used for inter-process pub/sub of real-time data... in a safety-critical system.

        Wrong on so many levels it's frightening.

        • aynyc 4 days ago
          Is it? It was designed for the damage control system on naval combat vessels. I have no idea what it does on a naval vessel, but I imagine there is a certain level of safety required.
        • asa400 4 days ago
          How was SQLite used in that scenario? What was the architecture?
  • wmanley 4 days ago
    Regarding shared caching: use `PRAGMA mmap_size` to enable mmap for reading your database. This way SQLite won't add another layer of page caching on top, saving RAM and making things faster. SQLite only uses mmap for reads and will continue to write to the database with pwrite().

    You must set it to a value higher than the size of your DB. I use:

        PRAGMA mmap_size = 1099511627776;
    
    (1TB)
    • rogerbinns 4 days ago
      Unless you compile SQLite yourself, you'll find the maximum mmap size is 2GB, i.e. even with your pragma above, only the first 2GB of the database are memory mapped. It is defined by the SQLITE_MAX_MMAP_SIZE compile-time constant. You can use pragma compile_options to see what the value is.

      https://sqlite.org/compile.html#max_mmap_size

      Ubuntu system pragma compile_options:

          MAX_MMAP_SIZE=0x7fff0000
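
      You can also check your build from Python, e.g.:

          import sqlite3

          opts = [row[0] for row in sqlite3.connect(":memory:").execute("PRAGMA compile_options")]
          print([o for o in opts if "MMAP" in o])  # e.g. ['MAX_MMAP_SIZE=0x7fff0000']
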
      • otterley 4 days ago
        That seems like a holdover from 32-bit days. I wonder why this is still the default.
        • rogerbinns 3 days ago
          SQLite has 32-bit limits. For example, the largest string or blob it can store is 2GB. That could only be addressed by an incompatible file format change. Many APIs also use int in places, again making limits 32 bits, although there are also a smattering of 64-bit APIs.

          Changing this default requires knowing it is a 64 bit platform when the C preprocessor runs, and would surprise anyone who was ok with the 2GB value.

          There are two downsides of mmap - I/O errors can't be caught and handled by SQLite code, and buggy stray writes by other code in the process could corrupt the database.

          It is best practice to directly include the SQLite amalgamation in your own projects, which allows you to control version updates and configuration.

          • wmanley 3 days ago
            > There are two downsides of mmap - I/O errors can't be caught and handled by SQLite code,

            True. https://www.sqlite.org/mmap.html lists 3 other issues as well.

            > and buggy stray writes by other code in the process could corrupt the database.

            Not true: "SQLite uses a read-only memory map to prevent stray pointers in the application from overwriting and corrupting the database file."

          • otterley 3 days ago
            All great points. Thank you!
  • bawolff 5 days ago
    > The primary challenge with SQLite in a concurrent environment (like an asyncio web application) is not connection time, but write contention. SQLite uses a database-level lock for writes. When multiple asynchronous tasks try to write to the database simultaneously through their own separate connections, they will collide. This contention leads to a cascade of SQLITE_BUSY or SQLITE_LOCKED errors.

    I really don't get it. How would this help?

    The benchmarks don't mention which journal mode SQLite is configured with, which is very suspicious, as that makes a huge difference under concurrent load.

    • pornel 4 days ago
      Sharing one SQLite connection across the process would necessarily serialize all writes from the process. It won't do anything for contention with external processes, but the writes within the process wouldn't be concurrent any more.

      Basically, it adds its own write lock outside of SQLite, because the pool can implement the lock in a less annoying way.

      • bawolff 4 days ago
        I don't understand, all writes to a single sqlite DB are going to be serialized no matter what you do.

        > Basically, it adds its own write lock outside of SQLite, because the pool can implement the lock in a less annoying way.

        Less annoying how? What is the difference?

        • pornel 4 days ago
          SQLite's lock is blocking, with a timeout that aborts the transaction. An async runtime can have a non-blocking lock that allows other tasks to proceed in the meantime, and is able to wait indefinitely without breaking transactions.
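
          The idea in a few lines (assuming an aiosqlite connection opened in autocommit mode): contending writers suspend on an asyncio.Lock instead of spinning in SQLite's busy handler, and readers are unaffected.

              import asyncio

              write_lock = asyncio.Lock()

              async def write(conn, sql, params=()):
                  async with write_lock:                     # waiting tasks yield to the event loop
                      await conn.execute("BEGIN IMMEDIATE")  # take SQLite's write lock right away
                      try:
                          await conn.execute(sql, params)
                          await conn.execute("COMMIT")
                      except Exception:
                          await conn.execute("ROLLBACK")
                          raise
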
          • bawolff 4 days ago
            What's the benefit of this over just doing PRAGMA busy_timeout = 0; to make it non-blocking?

            After all, as far as I understand, the busy timeout is only going to occur at the beginning of a write transaction, so it's not like you have to redo a bunch of queries.

    • slaily 4 days ago
      When your program does heavy concurrent writing and opens/closes connections for each write, most of them will fail with SQLITE_BUSY or SQLITE_LOCKED errors.

      This situation can be managed with a small pool (5 connections or fewer) to prevent spawning too many connections. This will reduce racing between them and allow write operations to succeed.

  • mayli 5 days ago
    FYI, I once had a few long-lived connections with WAL, and the WAL file just exploded. Turns out sqlite won't truncate the WAL if there are open connections.
  • bob1029 5 days ago
    I've been thinking about trying pre-serialization of SQLite commands to enable single-writer against a singleton SQLiteConnection using something like Channel<T> or other high performance MPSC abstraction. Most SQLite providers have an internal mutex that handles serialization, but if we can avoid all contention on this mutex things might go faster. Opening and closing SQLite connections is expensive. If we can re-use the same instance things go a lot faster.
  • anacrolix 4 days ago
    When you have multiple sqlite connections, any write will flush the caches of other connections. So more connections is not always better.
  • ddorian43 4 days ago
    No synchronous API support?