Here is an example I made in a few minutes:
ports:
- 80
- 8000
- 10000
- 12000
-
  - 14000
Guess how it parses? answer: {"ports":[80,8000,10000,12000,[14000]]}
This is roughly equivalent to saying that a linter can transform the AST of the language into a canonical representation, and the syntax will be rejected unless it matches the canonical representation (modulo things like comments or whitespace-for-clarity).
To see misleading-indentation warnings in action, you can try the following snippet in the playground (https://kson.org/playground/), and you will indeed get a warning:
ports:
- 80
- 8000
- 10000
- 12000
-
  - 14000
Next to that, note that KSON already has an autoformatter, which also helps prevent misleading indentation in files. When I think about it, any language should come with a strict, non-configurable built-in formatter anyway.
If your supposedly human-writable config file format is unusable without external tools, there is something wrong with it.
Would that be on the language, or the IDEs that support it? Seems out of scope to the language itself, but maybe I'm misunderstanding.
This is because the only implementation is written in Kotlin. There are Python and Rust packages, but they both just link against the Kotlin version.
How do you build the Kotlin version? Well, let's look at the Rust package's build.rs:
https://github.com/kson-org/kson/blob/main/lib-rust/kson-sys...
It defaults to simply downloading a precompiled library from GitHub, without any hash verification.
You can instead pass an environment variable to build libkson from source. However, this will run the ./gradlew script in the repo root, which… downloads a giant OpenJDK binary from GitHub and executes it. Later in the build process it does the same for pixi and GraalVM.
The build scripts also only support a small list of platforms (Windows/Linux/macOS on x86_64/arm64), and don't seem to handle cross-compilation.
The compiled library is 2MB for me, which is actually a lot less than I was expecting, so props for that. But that's fairly heavy by Rust standards.
Edit: point taken about verifying checksums, just created an issue for it (https://github.com/kson-org/kson/issues/222)
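For reference, the missing verification step is small. A sketch in Python (the real build.rs is Rust; this just illustrates the check of a downloaded artifact against a pinned digest):

```python
# Illustrative sketch: compare a download's SHA-256 against a digest
# pinned in the build script before using the artifact.
import hashlib

def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True iff `data` hashes to the pinned digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex

# A pinned digest would be computed ahead of time and committed.
payload = b"precompiled library bytes"
pinned = hashlib.sha256(payload).hexdigest()

assert verify_sha256(payload, pinned)
assert not verify_sha256(b"tampered bytes", pinned)
```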
Both to bootstrap making a parser in a new language, and also as a kind of living spec document.
All in all, I'm confident that KSON can become ubiquitous despite the limitations of the current implementation (provided it catches on, of course).
Yes, there are bad consequences that can happen. No, you don't dodge having problems by picking a different data format. You just pick different problems. And take away tools from the users to deal with them.
Definitely more control than guessing the right JSON, or breaking that YAML file. Plus, you get completion, introspection, and help while editing the config because you're in a code-writing environment. Bonus for having search and text manipulation tools under your fingertips instead of clicking checkboxes or tabbing through forms.
The idea of configuring something but not actually having any sort of assurances that what you're configuring is correct is maddening. Building software with nothing but hopes and dreams.
Comments give devs enough space to explain new option flags and objects to anyone familiar enough to be using the software.
For customer facing configurations we build a UI.
> "backwardsCompatible": "with JSON",
But in that same example, they have a comment like this:
> // comments
Wouldn't that make it not compatible with JSON?
From what I understand, it's "backwards-compatible" with JSON because valid JSON is also valid JSON5.
But it's not "forwards-compatible" precisely because of comments etc.
If people tended to interpret English correctly and not be susceptible to misinterpreting written statements, that would be nice. Reality is a bugger!
Forwards-compatible means the old thing can handle the new things. Here JSON is not forwards-compatible with JSON5.
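A concrete way to see the asymmetry, using Python's stdlib json parser as the "old thing" (a strict JSON parser):

```python
# Every JSON document is valid JSON5, but a plain JSON parser
# rejects JSON5 additions such as comments.
import json

plain = '{"backwardsCompatible": "with JSON"}'
with_comment = '// comments\n{"backwardsCompatible": "with JSON"}'

# Backwards: valid JSON is accepted (by JSON and JSON5 parsers alike).
assert json.loads(plain)["backwardsCompatible"] == "with JSON"

# Forwards: the old parser cannot handle the new syntax.
try:
    json.loads(with_comment)
    forwards_compatible = True
except json.JSONDecodeError:
    forwards_compatible = False

assert not forwards_compatible
```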
When the configuration grows complex, and you feel the need to abstract and generate things, switch to a configuration language like Cue or RCL, and render to json. The application doesn't need to force a format onto the user!
We did a prototype at work to try different configuration languages for our main IaC repository, and Cue was the one I got furthest with, but we ended up just using Python to configure things. Python is not that bad for this: the syntax is light, you get types, IDE/language server support, a full language. One downside is that it’s difficult to inspect a single piece of configuration, you run the entry point and it generates everything.
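For illustration, a minimal sketch of that kind of setup (names are made up, not from the actual repo): plain dataclasses as the config schema, with one entry point that renders everything to JSON.

```python
# Hypothetical sketch of Python-as-config: typed dataclasses give you
# type checking and IDE completion; one entry point renders to JSON.
import json
from dataclasses import dataclass, asdict

@dataclass
class Service:
    name: str
    port: int
    replicas: int = 2

SERVICES = [
    Service("api", port=8080),
    Service("worker", port=9090, replicas=1),
]

def render() -> str:
    # The downside mentioned above: this generates *everything*, so
    # inspecting a single piece means running the whole entry point.
    return json.dumps([asdict(s) for s in SERVICES], indent=2)

output = render()
```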
As for RCL, I use it almost daily as a jq replacement with easier to remember syntax. I also use it in some repositories to generate GitHub Actions workflows, and to keep the version numbers in sync across Cargo.toml files in a repository. I’m very pleased with it, but of course I am biased :-)
At work we generate both k8s manifests as well as application config in YAML from a Cue source. Cue allows both deduplication, being as DRY as one can hope to be, as well as validation (like validating a value is a URL, or greater than 1, whatever).
The best part is that we have unit tests that deserialize the application config, so entire classes of problems just disappear. The generated files are committed in VCS, and spell out the entire state verbatim - no hopeless Helm junk full of mystery interpolation whose values are unknown until it’s too late. No. The entire thing becomes part of the PR workflow. A hook in CI validates that the generated files correspond to the Cue source (run make target, check if git repo has changes afterwards).
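The "unit tests that deserialize the application config" idea can be sketched like this (a hypothetical schema, and JSON rather than YAML to stay stdlib-only; the real source of truth is Go structs rendered via Cue):

```python
# Hypothetical sketch: a strict test-side loader, so unknown or
# missing keys fail in CI rather than in production.
import json
from dataclasses import dataclass, fields

@dataclass
class AppConfig:
    listen_addr: str
    max_conns: int

def load_config(raw: str) -> AppConfig:
    data = json.loads(raw)
    allowed = {f.name for f in fields(AppConfig)}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return AppConfig(**data)  # a missing key raises TypeError here

# The kind of assertion a config unit test would make:
cfg = load_config('{"listen_addr": "0.0.0.0:8080", "max_conns": 128}')
assert cfg.max_conns > 1
```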
The source of truth is a set of native Go structs. These native Go types can be imported into Cue and used there. That means the config is always up to date with the source of truth. It also means refactoring becomes relatively easy: you rename the thing on the Go side and adjust the Cue side. Very hard to mess up, and most of it is automated via tooling.
The application takes almost its entire config from the file, and not from CLI arguments or env vars (shudder…). That means most things are covered by this scheme.
One downside is that the Cue tooling is rough around the edges and error messages can be useless. Other than that, I fully intend never to build applications any other way.
Curiously, LLMs have made it a lot easier. One step away from an English adapter that routes through an LLM to generate the config.
Configuration is difficult, the tooling is rarely the problem (at least in my experience).
As an example, a friend of mine introduced TOML to a reasonably big open source project a while ago. Recently, he mentioned there were some unexpected downsides to TOML. I've asked him to chime in here, because I think he's more qualified to reply (instead of a best-effort attempt from my side to faithfully reproduce what he told me).
Another fun thing about configuration: it's a great indicator of poor software design. If the configuration can be very complicated, in one single format, in one big file, look at the source code and you'll find a big ball of mud. Or even worse, it's lots of very different software that all shares one big config file. Both are bad for humans and bad for computers.
I use a lot of OpenBSD, and one of the things I really like about it is that they care about the user interface (note 1) and take the time to build custom parsers (note 2).
Compare pf to iptables. Strictly speaking, iptables is probably the more featureful filter language, but the devs said "we will just use getopt to parse it" and called it a day, leading to what I find one of the ugliest, most painful languages to use. pf is an absolute joy in comparison. I would pick pf for that reason alone.
note 1. Not in the sense of chasing graphical configurators, an activity I find wastes vast amounts of developer time and leads to neutered interfaces, but in the sense of taking the time to lay out a syntax that is pleasant to read and edit, and then writing good documentation for it.
note 2. If I am honest, semi-custom parsers: the standard operating procedure appears to be starting with a parse.y from a project that is close and molding it into the form needed, which lends itself to a nice sort of consistency in the base system.
- The key-value pair kind. Maybe some section markers (INI, …). Easy to sed.
- The command kind, where the file contains the same commands that can be used elsewhere (vim, mg, sway). More suited to TUIs and bigger applications.
With these two, include statements are nice.
I like the language embedding feature in KSON - we would use that. Have you thought about having functions and variables? That is something you get in Pkl and Dhall which are useful.
This sounds like the kind of question for Daniel himself to chime in, since he has the best overview of the language's design and vision. He's not super active on HN, but I'll give him a heads up! Otherwise feel free to join our Zulip (https://kson-org.zulipchat.com) and we can chat over there.
Configuring TypeScript applications with the `defineConfig` pattern, which takes asynchronous callbacks and lets you compute settings in code, is very useful. And it's fully typed and programmable.
It's particularly useful because it also lets you trivially keep a single configuration file where you only set what differs between environments: if env === "local", use this db dependency, or turn Sentry on/off, etc.
Zig is another language that shows that configuration and building should just be code in the language itself.
This depends on the trust-model for who is doing the configuration, especially if you're trying to host for multiple tenants/customers. Truly sandboxing arbitrary code is much harder than reading XML/JSON/etc.
Note that this implies a Turing incomplete language. Which makes sense - our goal is to make dangerous programs unrepresentable, so naturally we can't implement every algorithm in our restricted language.
The data model is closer to XML than JSON, though, so unfortunately it's not a drop-in replacement for YAML.
Small sample:
package {
  name my-pkg
  version "1.2.3"
  dependencies {
    // Nodes can have standalone values as well as key/value pairs.
    lodash "^3.2.1" optional=#true alias=underscore
  }
}
I think the post is hurt by the desire to sort of… “have a theory” or take a new stance. The configuration file is obviously not a user interface, it is data. It is data that is typically edited with a text editor. The text editor is the UI. The post doesn’t really justify the idea of calling the configuration file, rather than the program used to edit it, the UI. Instead it focuses on a better standard for the data.
The advancement of standards that make the data easier to handle inside the text editor is great, though! Maybe the slightly confusing (I dare say confused) blog title will help spread the idea of kson, regardless.
Edit: another idea, one that is so obvious that nobody would write a blog post about it, is that configuring your program is part of the UX.
But we allow it for files that make production changes, usually without any unit tests!
I'd prefer something syntaxed like a programming language but without turing completeness.
It's not telling, it's impossible. Neither JSON nor YAML is Turing complete. That said, the JS in JSON stands for JavaScript, from whose syntax it is derived, so at least one major language uses it in its syntax.
And don’t forget Ansible playbooks!
price_in_€: 42
"price_in_€": 42
Or were you already aware of that and lamenting that KSON requires quoting in this case, compared to YAML, which doesn't?

I'll stick to JSON. When JSON isn't enough it usually means the schema needs an update.
As for Apple Pkl, I think we share the goal of robustness and "toolability", but pkl seems way more complex. I personally think it's more useful to keep things simple, to avoid advancing in the configuration complexity clock (for context, see https://mikehadlow.blogspot.com/2012/05/configuration-comple...).
I agree that it's not perfect, but worse is better, and familiar is a massive win over making your users look up a new file format or set their editor up for it. If you truly hate YAML, that's fine; there are plenty of other familiar formats: INI, TOML, JSON.
Why would you be doing that?
Like, are you _literally_ scripting in the value of an item or are you just equating that they are similar?
Literal being:
get_number_of_lines:
  command: >
    #!/bin/bash
    wc -l
Invariably, people will write inline scripts instead of actual scripts in their own files. There are also SDKs for most of these operations that would let you do it in code, but they are not always 100% the same as the CLI; some options are different, etc.
But I'd strongly encourage everybody to think about whether that deep configurability is really needed.
That’s why 90% of each iOS update is just another menu or a reorganization of menus and why there are 3 different ways to access the same info/menus on iOS.
1: I pulled that out of my butt, there's no factual data to it.