Databases/protocols that differentiate between connection, session, and call scopes can reliably perform resets without having to tear down the entire connection.
Retrying is safe with no client-side transaction idempotency tokens needed, as the error handling will roll back only the changes made by the failed call, not the entire transaction or session.
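A rough analogue in plain SQL, for engines without call-scoped resets, is statement-level rollback via savepoints. A sketch (the accounts table and names are made up):

    BEGIN;
    SAVEPOINT before_call;
    UPDATE accounts SET balance = balance - 100 WHERE id = 42;  -- the "call"
    -- on failure: ROLLBACK TO SAVEPOINT before_call; then retry or continue,
    -- without losing the rest of the transaction
    COMMIT;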
[1]: https://tanelpoder.com/2008/04/10/sqlnet-breakreset-to-clien...
Edit: added the comment about retry and idempotency.
In SSMS you can press Alt + Break to send an out-of-band (OOB) attention message over the active TDS connection, which will cancel the running command.
https://learn.microsoft.com/en-us/openspecs/windows_protocol...
I believe DB2 supports a similar technique.
There are several commands, depending on what you want to reset. If you're only interested in mutable state changeable with SET, you can use RESET ALL instead. This resets configuration settings such as the time zone, statement timeout, and so on.
Postgres has different fine-grained ways to reset other aspects of a session, such as closing cursors (CLOSE ALL), unlistening from notifications (UNLISTEN), deallocating prepared plans (DEALLOCATE), and so on. Finally, there's DISCARD ALL, which resets everything.
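A quick sketch of that menu in one place (all standard Postgres commands):

    RESET ALL;       -- revert settings changed with SET (timezone, timeouts, ...)
    CLOSE ALL;       -- close all open cursors
    UNLISTEN *;      -- stop listening on all notification channels
    DEALLOCATE ALL;  -- drop all prepared statements
    DISCARD ALL;     -- all of the above, plus temp tables, cached plans, etc.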
I agree about error handling. Most clients just disconnect on a connection error.
One gotcha with Postgres is that running queries do not get cancelled when the client disconnects. This has always been a weird one that's quite dangerous, as it can lead to dog-piling: if a client starts a slow query, then times out or closes its connection, the query lives on; then another identical request starts, the same thing happens, and soon you have 100 queries running that are all doing the same thing and getting slower and slower. You have to be very careful with retrying.
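If you do end up with orphaned queries, you can at least find and cancel them by hand; a rough sketch (the 5-minute threshold is arbitrary):

    -- list queries that have been running for more than 5 minutes
    SELECT pid, now() - query_start AS runtime, query
    FROM pg_stat_activity
    WHERE state = 'active'
      AND now() - query_start > interval '5 minutes';

    -- cancel one of them by pid (the pid here is just an example)
    SELECT pg_cancel_backend(12345);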
You can configure that these days, at least if the server is running on some common platforms:
https://www.postgresql.org/docs/current/runtime-config-conne...
I don't know why Postgres can't write keepalive messages while executing a query; speculating here, but it's possible that the architecture is synchronous and doesn't support doing anything with the connection while the query is executing. It's an old process-per-connection model where one backend process is started for each connection.
I could have sworn I had read about a new configuration option introduced in Postgres 14, but I can't find anything about it.
[1] https://www.postgresql.org/message-id/e09785e00907280622v9b1...
Someone linked this
https://www.postgresql.org/docs/current/runtime-config-conne...
> client_connection_check_interval (integer)
> Sets the time interval between optional checks that the client is still connected, while running queries. The check is performed by polling the socket, and allows long running queries to be aborted sooner if the kernel reports that the connection is closed.
> This option relies on kernel events exposed by Linux, macOS, illumos and the BSD family of operating systems, and is not currently available on other systems.
> If the value is specified without units, it is taken as milliseconds. The default value is 0, which disables connection checks. Without connection checks, the server will detect the loss of the connection only at the next interaction with the socket, when it waits for, receives or sends data.
> For the kernel itself to detect lost TCP connections reliably and within a known timeframe in all scenarios including network failure, it may also be necessary to adjust the TCP keepalive settings of the operating system, or the tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count settings of PostgreSQL.
So it appears the server can now indeed detect a lost client mid-query, but only if you configure it (and, per the docs, you may also need to tune TCP keepalives for it to work reliably under network failure).
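Presumably something along these lines (the values are illustrative, not recommendations):

    ALTER SYSTEM SET client_connection_check_interval = '10s';
    -- per the docs, TCP keepalives may also need tuning for network failures:
    ALTER SYSTEM SET tcp_keepalives_idle = 60;
    ALTER SYSTEM SET tcp_keepalives_interval = 10;
    ALTER SYSTEM SET tcp_keepalives_count = 6;
    SELECT pg_reload_conf();  -- apply without a restart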
In case of connection issues, you can check whether the work was actually committed or rolled back. Often the error was propagated to the user, who got an error message, refreshed, checked, retried, and everything was okay in the end. That's probably why the DB engines did not bother micromanaging individual SQL statements.
This is the impedance mismatch. The _connection_ exists at a layer beneath the database/application layer. You're not configuring the _connection_, you're configuring the _session_! And the relationship between connections and sessions is one-to-many, not one-to-one.
Adding an idempotency key is a way to allow the API provider to detect and de-dupe requests. It requires work and collaboration on the part of the API client and the API provider. The API client must generate a unique idempotency key for each logically different operation it attempts to execute, and it must retry requests for operations that appear to have failed using that fixed idempotency key. If the operation is not inherently idempotent, the API provider needs to implement some way of de-duping repeated requests - perhaps storing requests in a durable cache for N hours/days keyed by the idempotency key.
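On the provider side, the de-duping can be as simple as a unique key; a minimal sketch, with a made-up payments schema:

    CREATE TABLE payment_requests (
        idempotency_key uuid PRIMARY KEY,
        created_at      timestamptz NOT NULL DEFAULT now()
    );

    BEGIN;
    -- a duplicate key aborts the transaction, so the debit below runs at most once
    INSERT INTO payment_requests (idempotency_key)
    VALUES (:idempotency_key);  -- client-supplied key, passed as a bind parameter
    -- ... perform the actual debit here, in the same transaction ...
    COMMIT;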
One industry where these "once-and-only-once" execution guarantees are useful is banking. Customers expect their payments to be executed exactly once, so their accounts are debited once and not zero times or 2+ times.
There's an interesting infoq podcast from 2018 with Jason Maude of Starling Bank [2] talking about how they design their bank systems for "once-and-only-once" execution. From memory, Jason decomposes the responsibilities as: API clients are responsible for ensuring operations execute at least once. API providers are responsible for ensuring operations execute at most once.
[1] see e.g. google's API design guidelines for request identification: https://google.aip.dev/155
[2] https://www.infoq.com/podcasts/cloud-based-banking-startup-j...
I'm currently building a bank, and I was asking this with double payments in mind.
To work correctly, application code must be aware of transactions, and be prepared to roll back state and safely retry a transaction on failure.
ORMs like ActiveRecord struggle to do this because they attempt to present a transactional database as a low-friction, in-memory, mutable, non-transactional data model that does not support rollback.
If you lose a connection to the database prior to successfully committing your transaction, this is just another form of transaction failure.
You need to discard any pending application state changes, open a new connection and transaction, requery data required to determine the actual current state, and then if necessary, retry your transaction.
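The shape of that, sketched in plain SQL (the accounts table is made up; detecting the failure and looping is the application's job):

    BEGIN;
    -- requery the actual current state inside the new transaction
    SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
    -- re-derive the change from what the database says, not from stale app state
    UPDATE accounts SET balance = balance - 100 WHERE id = 42;
    COMMIT;
    -- if the connection dropped, or COMMIT failed with e.g. SQLSTATE 40001,
    -- discard local state, reconnect, and run the whole block again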
ActiveRecord can’t do that, hence the ill-advised desire to be able to automatically retry statements on failure — which abandons database-mediated transaction isolation, and incorrectly shifts responsibility for database state preservation/management to the client.
If SQL was trivially idempotent, we wouldn’t need transactions or different levels of transaction isolation in the first place.
If you find yourself so frequently hitting socket timeouts attempting to execute database queries that you need to modify your ORM to work around this problem automatically, your problems likewise almost certainly begin at the schema layer. I'm not going to blame an ORM maintainer for having to deal with that concern in a way that avoids irritating library consumers, and I suppose I can see where it does no good arguing that one's own clients (said consumers) are using the database wrong in order to need such functionality.
I'd have thought to see perf considered to some extent beyond blaming the engine, but I suppose there is a kind of honesty in simply omitting any discussion of causes for these apparently so common "network errors," in cases where they are so common and thus must be so transparently retried that the entire existence of this behavior could be totally overlooked by the application developer nominally responsible for the database interaction.
(I'm curious why a library maintainer needs to assert responsibility over session variables at all times, but I suppose it makes sense from a perspective that one's users are not especially trustworthy, as I suppose most ORM maintainers and especially ActiveRecord's must conclude. I guess it's fair to say that connections needing to manipulate the session that granularly should exist outside the ORM entirely, but then we're back to the whole tiresome architectural question that's vexed the field for now nearly two decades since Neward's "Vietnam" paper: https://web.archive.org/web/20060703024451/http://blogs.tedn....)
When pooling connections, you need to reset the connection when putting it back in the pool. For example, if you grab a connection from the pool and do:
    SET SESSION statement_timeout TO '1s';
    SET timezone TO 'Europe/London';
…and then release the connection, you don't want the next piece of code that claims the connection to inherit these settings. It has nothing to do with the ORM (although the ORM itself may want to set some baseline settings based on config), and everything to do with sharing connections between different parts of a single application.
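In other words, on release the pool (or a release hook) should issue something like:

    DISCARD ALL;  -- or RESET ALL, if undoing SET is all you need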
Why should the engine burden itself with an undo stack (even if only one frame deep) for every client, on behalf of a minority so incoherently implemented that it can't keep straight what it's doing across the span of two (application and ORM library) codebases? A metaphor already exists in the interface for getting a fresh connection: you do this by creating a fresh connection. If you want some other way to accomplish the same task, it's fair to require justification for the effort of providing that second option, and no such justification is presented in the article under discussion.
I don't like ORMs, I think it's long obvious. But to be excruciatingly clear, I grant them the same courtesy as any other library in that they absolutely may declare things explicitly out of scope. Seeing that not done, in a case where the consumer is pretty explicitly screwing around with internals of a database connection belonging to the library, is what's surprising, even in the context of ORMs as a category, where maximalism seems so constantly the order of the day.
Postgres has no native awareness of connection pooling, and so you need to issue certain reset commands to ensure that the connection is "clean" for the next piece of code that claims it. It should be possible to do things like "SET SESSION statement_timeout" without worrying about who the next user of the connection is.
What confuses me is that the Active Record maintainer should argue for modifications to several database engines, so this poor behavior - namely, treating pooled connections as if distinct - can be safe. Or, better said, I can see several justifications both technical and social for such an argument, none of which compels me.
It's a philosophical difference, I suspect. Users who remove a cover labeled "no user-serviceable parts inside" and end up sorry they did so, in my view, learn a valuable lesson about what such covers and labels are for. Others take a kinder or less grandmotherly view. Fair enough; it's a big enough world for everyone, last I checked.
(That said, it is worth noting that the only way any ORM could actually protect itself from this misuse would be by reimplementing enough of each of its supported databases' command parsers to recognize and reject session state mutations. Obviously no one would be mad enough to do so, but from a perspective of what we could call software design or architecture or craftsmanship or even taste, this kind of maximalism in necessary implication could and in my view should be taken to indict the entire paradigm: 'Why worry about the lesion? It's only pre-cancerous.')
I have no particular opinions about ActiveRecord, which I don't use, but in my opinion all poolers ought to do this. The pooler should hand out clean connections so that you can treat them as distinct.
It's about separation of concerns: A pooled connection should behave like a distinct connection, otherwise every single connection has to worry about getting an "unclean" connection.
Sure, Postgres offers SET LOCAL, which reverts the mutated state on commit/rollback, but that requires a transaction, which is not always desirable. For example, let's say you are doing many UPDATEs in batches. You can do this:
    SET SESSION ...;
    UPDATE ...;
    UPDATE ...;
    -- etc.
As opposed to:

    BEGIN; SET LOCAL ...; UPDATE ...; COMMIT;
    BEGIN; SET LOCAL ...; UPDATE ...; COMMIT;
    BEGIN; SET LOCAL ...; UPDATE ...; COMMIT;
    -- etc.
This saves three roundtrips per batch, which can significantly increase throughput. Of course, you can save roundtrips by bundling several statements into a single statement string separated by semicolons, but that's painful to work with and doesn't work with prepared statements or SELECTs.

And sure, you could call RESET ALL yourself before releasing a connection, but if the pool always does this for you, you've solved the problem everywhere and don't need to think about it ever again. (In the apps I work on, we install a global "after release" hook to automatically do RESET ALL.)
Parsing statements isn't needed for this.
You keep mentioning ORMs, but again, it has nothing to do with them.
I also mention it in this context because an ORM multiplies the problem. Everything you describe is indeed very easy in Postgres, but what about the other engines Active Record (or Sequelize, SQLAlchemy, etc) has to support? Now we face a choice between violating our engine-independent abstraction, and declining to offer a useful capability even where it would be no effort - a dilemma entirely imposed, in my view, by the same philosophical maximalism I've been decrying, and one impossible of satisfactory solution: I have long experience of ORMs preferring both solutions, and both are frankly lousy.
We also don't agree on pooled connection semantics, in that I'm okay with expecting callers to know what is and isn't safe to do with that abstraction, but that's a separate issue and not very interesting to me; I'm a grownup, as long as we're using a decent RDBMS and the same convention everywhere, it's fine. Really, I'm just here to grind my axe about ORMs, in hopes someone in their twenties will discover there is a rich and contentious history here, and there are worse problems to have than finding SQL's somewhat archaic syntax and uncommon rigor a little intimidating at first.
Some SQL engines are more sensitive to this problem than others (depending on whether they cache plans in the first place): SQL Server famously utilizes parameter sniffing for performance, which has the positive implication of skipping work, and the negative of skipping work you might need to do.
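For what it's worth, one common escape hatch in T-SQL, when a sniffed plan goes bad for certain parameter values, is to force a fresh plan per execution (table and parameter are made up):

    SELECT *
    FROM Orders
    WHERE CustomerId = @CustomerId
    OPTION (RECOMPILE);  -- compile for this value instead of reusing the cached plan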
My experience has long been that in almost every case where a "database performance" problem occurs while an ORM is in use, presumptively refactoring to eliminate the ORM almost immediately reveals the buried n+1 or other trivially pathological pattern which, if the ORM's implementor knows to try to avoid it, the ORM's interface typically is not sufficiently expressive to disambiguate in any case. (Fair do's: In the uncommon case where that interface is so expressive, one does usually find the implementation accounts for the case.)
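For the unfamiliar, the n+1 shape looks like this (hypothetical posts/comments schema):

    -- what the ORM often emits: one query for the parents...
    SELECT * FROM posts;
    -- ...then one more per parent row
    SELECT * FROM comments WHERE post_id = 1;
    SELECT * FROM comments WHERE post_id = 2;
    -- etc.

    -- versus the single round trip it could have been:
    SELECT p.*, c.*
    FROM posts p
    LEFT JOIN comments c ON c.post_id = p.id;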
Hence my earlier reference to the "Vietnam" paper, which dissects in extensive (if unfortunately mostly fruitless) detail the sunk cost fallacy at the heart of this problem.
If you do need some session-specific variables, use a connection pooler (you should anyway) that handles that. ProxySQL, for one, tracks the per-session variables that are set, so when a client requests a connection, it hands it one already set to its requested specification (if it’s available).
> The reason this is important is that whenever the write or the read fails, in the overwhelming majority of cases, you don’t know whether the query was executed on the server or not.
Yes, you definitely can know this in almost all cases. Did you receive a result? It succeeded. Did you receive a warning? It succeeded, but maybe read the warning. Did you receive an error? It did not succeed. Parse the error message to determine what went wrong, and if a retry makes sense. Did you receive nothing? Now, and only now, do you not know the state.
It blows my mind the number of devs who haven’t read the API specification for their RDBMS. There is a ton of useful information you can glean that’s often ignored.
> These statements also have a cost on the server side. Each statement requires some amount of memory in the database server. So you have to be careful not to create an unbounded amount of them, which for an ORM isn’t easy to enforce.
Which is why you can limit them. MySQL has a default of 16382; not sure about Postgres.
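In MySQL's case:

    SHOW VARIABLES LIKE 'max_prepared_stmt_count';  -- defaults to 16382
    SET GLOBAL max_prepared_stmt_count = 32768;     -- raise or lower the cap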
Also, to the complaint about them being session-scoped, this is yet another reason to use a connection pooler. They can often share them across connections. I know PgBouncer can.
Read. Application. Manuals.