Interesting Bits of Postgres Grammar (steve.dignam.xyz)

72 points by sbdchd 4 days ago | 4 comments

mdaniel 3 days ago
> But if there’s a comment in between, it’s a syntax error:
Man, wtf. It seems that just about every language has its own opinion about what the parser should do with whitespace and comments. My suspicion is that SQL actually cares about pragma comments, but since I don't have CHF 221 for a PDF of the standard, I don't know for sure.
PaulHoule 4 days ago
What I want is a PEG grammar generator that lets you set operator precedence with either numbers or partial orderings.
- o11c 4 days ago
  You really don't want PEG, even if you think you do. Maybe especially if you think you do. PEG gives up on both performance and correctness in case of ambiguity; LL and LR are the main families that can be trusted (though not LL(*) unless it's actually LL(1) after converting token trees). If you're just parsing expressions, however, you don't need anything nearly that complicated.
  Operator-precedence parsers can handle partial orderings just fine if you think about it - just toposort them, then explicitly list the set of acceptable children on each side (which must be less than the current precedence, or possibly equal on the left side, or on the right if you can reassociate, which requires remembering parens) rather than just subtracting 1 like most implementations do. In many cases it suffices to just specify a single number (plus a single bit to allow more of the same level) instead of a set, e.g. if you're just fixing the ambiguous level of the bitwise or comparison operators between languages.
  Note also that Bison has an XML output mode which is really useful since LR machine runtimes are trivial to write yourself; the conversion from a grammar to tables is the hard part. Unfortunately there is no similar story for lexers.
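A rough sketch of the "toposort, then list the acceptable children on each side" idea described above, in Python. This is not o11c's code: the operator set, the `DIRECTLY_TIGHTER` table, and the `parse` helper are all hypothetical, and `==` and `&` are deliberately left unordered so that mixing them without parentheses is rejected.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical partial order: each operator maps to the operators that bind
# directly tighter than it. "==" and "&" are not ordered relative to each other.
DIRECTLY_TIGHTER = {
    "*": set(),
    "+": {"*"},
    "==": {"+"},
    "&": {"+"},
}


def transitive_closure(direct):
    """Toposort the operators, then collect every operator that binds tighter."""
    tighter = {}
    # static_order() yields tighter-binding operators before looser ones
    # and raises CycleError if the "ordering" contains a cycle.
    for op in TopologicalSorter(direct).static_order():
        tighter[op] = set(direct[op])
        for child in direct[op]:
            tighter[op] |= tighter[child]
    return tighter


TIGHTER = transitive_closure(DIRECTLY_TIGHTER)


def parse(text):
    """Parse into nested (op, left, right) tuples; all operators left-associative."""
    # Toy tokenizer: characters it does not recognise are silently skipped.
    tokens = re.findall(r"==|\w+|[()+*&]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def atom():
        nonlocal pos
        tok = peek()
        if tok is None or tok == ")" or tok in TIGHTER:
            raise SyntaxError(f"expected operand, got {tok!r}")
        pos += 1
        if tok == "(":
            node = expr(None)  # anything is allowed inside parentheses
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        return tok

    def expr(allowed):
        # `allowed` is the set of operators this context accepts (None = all).
        nonlocal pos
        left, left_op = atom(), None
        while (op := peek()) in TIGHTER:
            if allowed is not None and op not in allowed:
                break  # let an enclosing context decide what to do with it
            # Left child: strictly tighter operators, or the same operator.
            if left_op is not None and left_op not in TIGHTER[op] | {op}:
                raise SyntaxError(f"{left_op!r} and {op!r} are unordered; add parentheses")
            pos += 1
            # Right child: strictly tighter operators only.
            left, left_op = (op, left, expr(TIGHTER[op])), op
        return left

    tree = expr(None)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this table, `parse("a + b * c")` nests the multiplication under the addition, while `parse("a & b == c")` raises `SyntaxError` until parentheses are added. Passing a set rather than a number to the recursive call is the only change from the usual "precedence minus one" scheme.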
  - PaulHoule 3 days ago
    I've got a long list of grievances with the parser status quo. Frankly I think if we had better parser generators we could put a stake in the heart of the idea that Lisp is a better language for metaprogramming than more mainstream languages created post-Syntactic Structures.
    I think it really should be easy to:
    (1) generate an unparser at the same time you generate a parser (you can metaprogram something, write it into a file and check it into git)
    (2) patch a parser by adding a few productions (you should be able to add an unless statement to javac and have about 50% of it be the POM file)
    (3) stick a grammar into another grammar (embed SQL in any language)
    (4) work with concrete syntax trees (goes with (1), back in the 1990s there were CASE tools that would let you edit a GUI with a visual editor and make a patch you could check into version control like a patch by a professional programmer)
    (5) generate your AST/CST objects from the same source as the parser (The Bison/Yacc streaming API was OK for C in the 1970s)
    A few factors militate against this. One of them is performance. System programmers have been traumatized by C++ compile times and don't want to give up a microsecond. Another one is that anybody who knows how to make a parser generator knows how to use today's crummy parser generators and doesn't have empathy with the large number of programmers who might be doing more advanced things if it were easier.
    The PEG community at least admits there is a problem with the status quo, but performance issues tend to make the PEG revolution less than revolutionary. They PEGilated Python and we really didn't get anything out of it.
o11c 4 days ago
Postgres's identifier-quoting is almost what standard SQL requires, except that it folds unquoted identifiers in the wrong direction, to lowercase rather than uppercase (only relevant if you're introspecting or mixing quoted with unquoted identifiers).
Many (most?) other SQL implementations violate the standard horribly.
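To make the folding difference concrete, here is a minimal Python sketch. The `fold_identifier` function and its `dialect` parameter are hypothetical, written only to model the rule: unquoted identifiers are folded to lowercase in Postgres and to uppercase in standard SQL, while quoted identifiers keep their case.

```python
def fold_identifier(ident, dialect="postgres"):
    """Return the name an identifier resolves to (hypothetical helper)."""
    # Quoted identifiers keep their exact case; a doubled quote is an escape.
    if len(ident) >= 2 and ident[0] == '"' and ident[-1] == '"':
        return ident[1:-1].replace('""', '"')
    # Unquoted identifiers are case-folded; the direction is the difference.
    return ident.lower() if dialect == "postgres" else ident.upper()


# Mixing quoted and unquoted spellings is where the direction shows up:
# in standard SQL, FOO and "FOO" name the same object; in Postgres they do not.
same_in_standard = fold_identifier("FOO", "standard") == fold_identifier('"FOO"', "standard")
same_in_postgres = fold_identifier("FOO", "postgres") == fold_identifier('"FOO"', "postgres")
```

Under this model `same_in_standard` is true and `same_in_postgres` is false, which is why the difference only bites when quoted and unquoted names are mixed or when catalog names are read back.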
- flysand7 3 days ago
  I'm curious how they violate it.
lovich 4 days ago
Was not aware you can execute lambda calculus in SQL. Neat article.
- cryptonector 4 days ago
  SQL is Turing complete.