Every other language has figured this one out: just support trailing commas. JSON5 supports comments and trailing commas.
https://devblogs.microsoft.com/oldnewthing/20240209-00/?p=10... https://json5.org/
> The first version of CONL used # as a comment token, but I quickly ran into issues. URLs contain #, so my next version...
Every other language has figured this one out as well. Wrap strings in quotation marks.
> That led to a data-model where each value is one of scalar|list|map (Compared to JSON’s null|bool|number|string|object|array, this felt good).
I'm not sure what a "scalar" is in CONL (is it always a string?), but a config file format having fewer types than JSON does not feel good to me. Even JSON's hand-wavy "number" type is problematic (whether "1" is an integer, a float, or some other type is implementation-defined). TOML got it right by distinguishing integers from floats.
If you're older than 40, you remember that there did exist an aeon, long, long ago, when people did not use data object serialization formats as config files. When config files were written not to be easy to parse, but to make it easier for human beings to configure software. When nobody believed there was one single good way to do everything. When software was written not to aid computers, but to aid humans.
Config files have always been a variant of key-value or section-key-value, except that we used to have ad hoc (and probably buggy, inconsistent, incomplete or all three) rules for quoting; array items separated by a mix of spaces, commas or something else; comments (semicolon, percent, sharp) different for each program. Case sensitivity was also a crap shoot, sometimes different between keys and values.
These days TOML (which more or less just works) just works. I have mixed feelings about YAML but certainly I would not swap it with endless variants of sendmail's m4 madness.
Again with the TOML vs YAML? Y'all can't come up with anything but another version of the same old thing? You don't need to do the same thing everyone else does with a tiny twist. Think outside the box. Expand your mind!
Sure, Sendmail/M4 was a pain in the ass. Postfix was more along the lines of key-value. But Exim had its own rule format, and Qmail took simplicity to the extreme by creating a different file for all the different options.
How about an X11 config file? Nginx? Puppet? Bind? Fstab? Vimrc? Rsyslog? Netrc? Cups? Pppd? Iptables? Apt? Cron? Sysctl? SSH? Just name a program on Linux that wasn't created in the last 15 years and it will have a different config file format, tailored to the users and use cases of that application. And none of them are JSON, YAML, or TOML.
You don't have to make yours completely unique, but you also don't have to go "oh well, there's only 3 formats to choose from, I guess I will have to settle for one of those". DO YOUR OWN THING! It's your program! Don't be a slave to convention!
X11, ssh, Cups are 100% section-key-value or key-value and could be served by TOML easily.
Some of these are just programming languages (vimrc, udev, nginx) or shell scripts (iptables) in disguise. By all means keep those.
Some are tables (cron, fstab, apt, netrc, syslogd are the ones I recognize). I suppose that's a third category but in the end they're also section-key-value (see systemd timer and mount units) and the bespoke format for the user is just one possible tradeoff between readability and conciseness. A lot of the formats you mentioned do have quoting issues that would go away with a standardized configuration format.
But what's the user experience like? Those generic formats are not designed for a great user experience, they're designed to be generic. So they end up being at best mildly irritating, and at worst wildly frustrating.
Here's an SSH config:
Include ~/.ssh/my-org/*
Host *.co.uk
ProxyCommand ssh bastion@my-uk-server.co.uk nc %h %p 2> /dev/null
Host newServer
HostName newServer.url
User adminuser
Port 2222
IdentityFile ~/.ssh/id_rsa.key
Host anotherServer.tld
HostName anotherServer.url
User mary
Port 2222
Now write that as TOML:
[global]
include-files = ~/.ssh/my-org/*
[host.match-co-uk]
host-match = *.co.uk
proxy-command = ssh bastion@my-uk-server.co.uk nc %h %p 2> /dev/null
[host.match-new-server]
hostname-match = newServer
hostname = newServer.url
user = adminuser
port = 2222
identityfile = ~/.ssh/id_rsa.key
[host.match-another-server]
host-match = anotherServer.tld
hostname = anotherServer.url
user = mary
port = 2222
The SSH config can be read easily, written easily, and understood easily, and the functionality and format are tied together so you can do more complex things more easily. On top of that, the SSH file can be rearranged to load includes before or after other lines, to change how they match. The TOML one not only takes longer to write, but it lacks the functionality the SSH config has to declare a new block, define its internal name, and specify a config glob, all with the same string. And you can't change how or when includes are loaded or what they overload without adding some kind of "priority" key-value, and then having to read each entry, do some math, change all the values to load different things at different places. (and looking back, I actually screwed up the TOML config, because it was so confusing!)
Don't choose a generic solution if it's going to give the user a pain in the ass. If you don't care about the user, then you're part of the enshittification of technology.
This isn't true. TOML allows table names with periods and asterisks. It also supports top-level keys. This is what the TOML may look like for your SSH config:
include = "~/.ssh/my-org/*"
# We write `"*.co.uk"` rather than `host."*.co.uk"`.
# `Host` in ssh_config(5) introduces sections,
# and TOML has table headers for sections.
# A more complete design would have something for `Match`.
["*.co.uk"]
proxy-command = "ssh bastion@my-uk-server.co.uk nc %h %p 2> /dev/null"
[newServer]
hostname = "newServer.url"
user = "adminuser"
port = 2222
identity-file = "~/.ssh/id_rsa.key"
["anotherServer.tld"]
hostname = "anotherServer.url"
user = "mary"
port = 2222
(Personally, I am not a fan of indenting TOML.)
> And you can't change how or when includes are loaded or what they overload without adding some kind of "priority" key-value, and then having to read each entry, do some math, change all the values to load different things at different places. (and looking back, I actually screwed up the TOML config, because it was so confusing!)
Right, your TOML is invalid. TOML requires you to quote strings. You might have it mixed up with another INI-derived format.
It could be because TOML is inherently more confusing than ssh_config(5), but I doubt it is actually more confusing to a newcomer. For example, a newcomer might think that indentation in SSH config is semantic when it isn't. I know because I once made this mistake. A newcomer must also remember that `Host` and `Match` are special and introduce new sections despite looking like other declarations. What is more likely is that you learned SSH config by studying the man page or reading a book like SSH Mastery and forgot the effort it took, and now you have "the curse of knowledge" about it (https://en.wikipedia.org/wiki/Curse_of_knowledge), but you haven't studied the TOML spec the same way.
A numeric "priority" key would indeed make for a pretty miserable user experience. Don't implement one if you can help it. There are different, better ways to express how an include should only affect certain keys.
The way I would do it in a TOML-based SSH config file is probably with dotted keys for tables. (I wouldn't necessarily choose TOML for this task, but TOML is your example.) For instance:
[my-org]
include = "~/.ssh/my-org/*"
[my-org."*.example.com"]
# Host inherits from `my-org`.
["*.example.net"]
# Host doesn't inherit from `my-org`.
> Don't choose a generic solution if it's going to give the user a pain in the ass.
I agree, with a caveat. The caveat is that you probably overestimate how unique your configuration needs are and underestimate the value of picking something standard. Using a standard format means you tap into its network effects. The user gets "free" syntax highlighting and completion in their editor, automatic formatting, and tools like https://github.com/kislyuk/yq to query and modify config files. This is also an important part of user experience. If the format is custom, you will have to implement syntax highlighting for Vim, Emacs, VS Code, etc., and the user will have to install it.
I think the right approach is to sort your concerns into the essential and the inessential, then choose a format based on the essential concerns. (Do you need includes to only apply to subsequent lines, or could you apply them by name or by nesting?) When you choose, prioritize standard formats. And if your needs are complex enough, consider embedding an interpreter like Lua or Starlark for configuration, or have the user write code in non-embedded Python or another language to generate JSON config that your software reads.
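As a small illustration of the last option, here is a sketch of a user-side Python script that generates JSON for a program to read. The `host` helper, the key names, and the default port are hypothetical, with values borrowed from the SSH example earlier in the thread; a real program would define its own schema.

```python
import json


def host(hostname, user, port=2222, **extra):
    """Build one host entry; shared defaults live in one place."""
    entry = {"hostname": hostname, "user": user, "port": port}
    entry.update(extra)
    return entry


config = {
    "newServer": host(
        "newServer.url", "adminuser", identity_file="~/.ssh/id_rsa.key"
    ),
    "anotherServer.tld": host("anotherServer.url", "mary"),
}

# The consuming program only ever sees plain JSON.
text = json.dumps(config, indent=2)
print(text)
```

The user gets functions, loops, and defaults from a real language, while the software itself only needs a JSON parser.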
The second "just works" should have been "is almost always enough".
Yeah. I've got no idea what your parent comment is talking about.
> When config files were written not to be easy to parse, but to make it easier for human beings to configure software.
*eyes rolling*. All I can remember is the hundreds of hours I've spent trying to figure out how to configure something in Apache httpd, BIND, iptables, and god forbid, Sendmail!!
Config files were written not to be easy to <anything>. There was no rhyme or reason. Every project had their own bespoke config. All from the whims and fancies of the devs of the project.
Good thing that was all in the past and I had no job and no responsibilities. If software today made configuration like they did 40 years ago, I'd just give up!
JSON superset, optional quotes for keys, sensible string handling, comments, automatic env variable handling, variable references.
It's not perfect (every sufficiently powerful configuration language has quirks), but I love it.
I've also been playing around with a configuration format, for similar reasons, although my approach is to make it easy (enough) to read/parse for both humans and machines.
HN post: https://news.ycombinator.com/item?id=42516608
Any feedback is welcome, but keep in mind it is just a toy project with only one user in mind (me); no plans to conquer the world or solve the config format problems for all :)
- don’t mind the peculiarities of formats used for config
- create a format where semicolons denote comments (just… doesn’t look right)
In JS (well, it's why we have JSON), it is //. In YAML, it is #.
Moreover, the semicolon is a natural character used in comments (unlike // or #). It interferes with our human parsing.
I don't get the second point: why is that a problem if a semicolon appears in a comment? From what I understand, comments run until the end of the line, so a second semicolon after the first does nothing.
1. If # is in the first column of the line.
2. If # is followed by a space.
3. The # only starts a comment when it’s outside of quotes.
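One way to read the three rules together, sketched in Python. This is an assumption about how they combine, not a description of CONL or any existing parser: a `#` starts a comment only when it is outside quotes and is either in the first column or followed by a space. The function name `strip_comment` is hypothetical, and escape sequences inside quotes are not handled.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing comment from one line under the combined rules."""
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "#" and (i == 0 or line[i + 1 : i + 2] == " "):
            return line[:i].rstrip()
    return line.rstrip()


for sample in (
    "# full line comment",
    "url = https://example.com/#anchor",
    "port = 22 # ssh",
    'title = "a # b"',
):
    print(repr(strip_comment(sample)))
```

Under this reading a URL fragment such as `#anchor` is kept, because the `#` is neither in the first column nor followed by a space.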
Lots of people post ideas to Hacker News. Lots of good ideas, lots of bad ideas, and lots in-between. It's very much against the hacker spirit to be such a dismissive lazy jerk about it.