Like rg, it's one of those "rewrite it in rust" projects that turned out to actually be quite well thought through.
The reason “rewrite in Rust” gets a lot of hate is that the result is often a poorer replica: it supports only a subset of the features and is sometimes so littered with unsafe blocks that it’s barely any safer than its C counterpart.
Ripgrep and bat are the exceptions in that they’re modernisations in every sense of the term. They’re safer, have more modern features, and better defaults. Even if you don’t give a crap about memory safety, there’s a reason to use ripgrep and bat.
Personally I really don’t see the benefit in rewriting stuff unless you’re bringing other modernisations to the table too. But I suspect for some people, it’s more of an exercise to learn Rust than it is an ambition to displace a particular coreutil.
However what I truly don't understand is using a different license. For something so fundamental, please just let that be the same.
But I think there are a lot more great examples. I have used pandas and I like polars. I have used latex but I like typst. People are creating generally valuable tools that bring something new to the table. More competition and diversity is rarely a bad thing.
Or BSD/ISC for FreeBSD, OpenBSD, macOS etc coreutils? All of which also have subtly different implementations from each other.
Or maybe you are talking about other UNIXs like CDDL for OpenSolaris?
Or perhaps you meant a proprietary license like Solaris, AIX, HP-UX, Tru64 UNIX and so on?
Or maybe we just agree that there isn’t a standard license for coreutils and developers should be free to choose to license their own code however they wish?
The issue is there's a massive leap between GPLv3 and MIT, even something like GPLv2 or anything else is better than MIT or public domain-tier licenses.
Is this really obvious? Did GNU coreutils do this for the project it was attempting to replicate/supplant?
> even something like GPLv2 or anything else is better than MIT or public domain-tier licenses
That's an opinion, not a fact.
Except on average, how often do GPL projects get forked and modified without the changes getting released to the public versus with MIT projects? Which one benefits end users more?
Technically with GPL, you only need to provide the source if requested. You don't specifically need to publish those changes ahead of time. And as it happens, some businesses don't even share the source when requested.
Also interesting your silence about my other point. Tell me, did GNU coreutils copy the license of its ancestor?
But there isn’t a standard license for coreutils, as I’ve demonstrated.
And worse to your point is that literally only one implementation of coreutils is GPLv3. So by your logic “rewrite in rust” projects shouldn’t be GPL-licensed.
> The issue is there's a massive leap between GPLv3 and MIT, even something like GPLv2 or anything else is better than MIT or public domain-tier licenses.
Actually MIT is closer to what the term “public domain” means than GPLv3 is.
But either way, you’re arguing preference as fact. And your preference here is basically just a license flame war. I thought the community had evolved past this pettiness.
There is for the specific coreutils they're attempting to replicate the behavior of though. They're directly targeting the GNU coreutils.
> Actually MIT is closer to what the term “public domain” means than GPLv3 is.
Yeah, that's the problem with it.
> But either way, you’re arguing preference as fact. And your preference here is basically just a license flame war. I thought the community had evolved past this pettiness.
Is it really pettiness if one license allows for Elasticsearch situations and the other keeps the software and its derivatives free for people to use? Go and try to argue that the Linux kernel should be relicensed as MIT, surely the license doesn't matter at all and had no impact whatsoever on how things got to the point they are now. It's just pettiness, right?
In the case of uutils coreutils specifically, sure. But that's not universally true for every RiR (Rewrite in Rust) project.
> Is it really pettiness if one license allows for Elasticsearch situations and the other keeps the software and its derivatives free for people to use?
GPLv3 wouldn't prevent ElasticSearch situations. They had to create a new license to solve that.
The problem with ElasticSearch was that AWS were making money running ElasticSearch without financially contributing to ElasticSearch. There's nothing in GPL that prevents that. If there were, then nobody would be running Linux servers ;)
> Go and try to argue that the Linux kernel should be relicensed as MIT
Why would I argue that when there are plenty of kernels that are BSD licensed? If I cared about software licenses, then I'd use one of them instead.
> surely the license doesn't matter at all and had no impact whatsoever on how things got to the point they are now. It's just pettiness, right?
It's petty because you're complaining that developers should not be free to choose the software license they want for their own software projects because of an ideological complaint you have based around a misunderstanding of GPL.
In practice, even if they’d chosen the GPL for their own code, they’d be including dependencies that weren’t GPL’d, so unless they were committing to doing everything from scratch (including the Rust standard library!) some parts of the codebase would be non-GPL’d.
that doesn't matter. the point of the GPL is to protect the application. that still happens even if libraries used are not GPL. the LGPL would not exist if that were an issue, so using a different more restrictive license for applications, and a less restrictive one for libraries is done intentionally.
This is exactly the issue most of us have with the rust ecosystem and these 'rewrite in rust' projects, though.
By licensing everything with the absolute bottom-rung restrictions, you've just made it even easier for corpos to have free pickings of any given tool on the internet to incorporate into their own tools, and to have never-ending Amazon and Elasticsearch situations.
Obviously the community wouldn't even be here to begin with if it wasn't for Linux going with a GPLv2 license. Going forward, with everything becoming more MIT/BSD licensed, I wonder how the community/ecosystem will fare.
I suspect, should there come a time in the future where we realize that this may have been a critical error, it'll be far too late to correct it.
But I get it, at least it doesn't have no license.
I also don't think it's careless because "Go with community norms" is a considered way to choose something.
Finally, this isn't really about "careless" exactly, but if I were the authors of this project, I would deliberately choose MIT/Apache2.0 over the GPL, and so like, I dunno, suggesting that they're not being responsible because they didn't pick the GPL isn't a framing I'd agree with.
I've already given plenty of examples of BSD-licensed coreutils. If an evil corporation wanted to steal coreutils, they wouldn't need to take the Rust implementation. And as a bonus, if they took FreeBSD/OpenBSD/whatever, they'd get a project that's far more mature too.
It is, after all, exactly what Apple did with Darwin.
> Obviously the community wouldn't even be here to begin with if it wasn't for Linux going with a GPLv2 license.
That's survivor bias and doesn't fall in line with my experiences using Linux and BSD in the 90s.
BSD originally had a bigger community than Linux for quite a while. What accelerated Linux wasn't the license; it was the hacker culture.
BSD systems were tightly controlled ecosystems, whereas Linux was a free-for-all because the kernel was managed by a different developer to the guy who managed GNU. So everything about the GNU/Linux ecosystem was disparate projects slapped together. This encouraged others to slap their own parts to GNU/Linux. This is why fsck needed to exist: the file system was slapped together so Linux needed a way to fix file corruptions. It's why there's different package managers and why the concept of a "distribution" exists in the first place.
It's what made Linux approachable and it meant development on Linux happened at a much faster pace than on BSD.
Then all of those hackers got jobs. Became managers. And recommended Linux because it's what they learned "UNIX" on.
Linux was basically the original "move fast and break things". If it had been licensed MIT then nothing would have changed.
> I suspect, should there come a time in the future where we realize that this may have been a critical error, it'll be far too late to correct it.
The GPL vs BSD argument is probably older than you've been alive. It's probably older than a considerable number of HNers have been alive. And it's been proven time and time again that it's an ideological debate that has no practical truth. Hence why people stopped arguing it.
So? That's allowed, they should do whatever they want with that code. I've preferred most MIT/Apache stuff over GNU alternatives even for personal use.
>Obviously the community wouldn't even be here to begin with if it wasn't for Linux going with a GPLv2 license. Going forward, with everything becoming more MIT/BSD licensed, I wonder how the community/ecosystem will fare.
There are tons of projects with MIT/BSD/Apache and decades of community participation. Including *BSD and Apache server themselves...
Linux being GPL didn't stop it from becoming driven by big companies paying all the core developers...
That's exactly what's needed in many cases tho: someone to drop most of the legacy features.
This is why I said rewrites should be introducing something new, rather than just removing something old.
Plus if you really want to remove old feature flags, you can just do that in the original projects. You don't need to rewrite to remove features.
When I read of rewrites that are a subset of the original, and particularly when they advertise themselves as "opinionated", what I see is a project that the author started in order to learn x, y or z. And that's a fine goal in itself. But I just wish people were more honest and said "I wrote this to learn Rust".
None of these criticisms apply to bat, though. This tool is adding something new.
Reasons, yes. But not very good ones and blown way out of proportion.
(The issue is further exacerbated, in my opinion, by the prevailing notion that test-driven development is superior to — or at least generally more than adequate for — anything and everything that could be desired. Years ago there was a tense Twitter exchange between Bob Martin [of "Clean Code" note] and Shriram Krishnamurthi [a prominent programming languages researcher and professor at Brown University] on this topic, Martin seemingly unwilling to move past a TDD-oriented worldview at that time.)
I'd say that rewriting anything in any language (even in the same language) would remove large amounts of cruft, and add long-missing neat things that are easier to add when you build from scratch, and with a good understanding which the original authors lacked. Often it also can afford using a better architecture, see rg vs grep: grep has many brilliant technical solutions, but making it multithreaded would be a major rewrite anyway.
Edit, with cURL it's OK… 200.
That makes no sense.
bat: A cat(1) Clone with Wings - https://news.ycombinator.com/item?id=33382307 - Oct 2022 (2 comments)
bat, a cat(1) clone with syntax highlighting, Git integration written in Rust - https://news.ycombinator.com/item?id=24850244 - Oct 2020 (6 comments)
Bat: A cat(1) clone with wings - https://news.ycombinator.com/item?id=17887819 - Aug 2018 (12 comments)
Bat: A cat(1) clone with wings - https://news.ycombinator.com/item?id=17849535 - Aug 2018 (1 comment)
Bat: cat(1) clone with syntax highlighting and Git integration - https://news.ycombinator.com/item?id=16968755 - May 2018 (1 comment)
I just downloaded it on MacOS and when I ran it the first time it took a really long time on a one-line JSON which disappointed me but then any subsequent run on anything was fast. I'd completely forgotten about MacOS doing that thing on first run.
Uses the pager for large files automatically etc. Very nice.
If you like that, you can also make less behave like cat if the file fits in one screen. You just need to add F to the LESS environment variable. It's very convenient because so many things depend on less.
https://www.man7.org/linux/man-pages/man1/less.1.html#:~:tex...
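For anyone who wants to try this, a minimal sketch for a shell startup file. It assumes a POSIX-style shell and that LESS is either unset or already holds dash-prefixed options; the -R is my own addition so colour escapes pass through, not something the suggestion above requires.

```shell
# Append -F (quit if the whole input fits on one screen) and -R (pass
# colour escapes through) to LESS, keeping anything already set there.
# Assumption: existing LESS content, if any, uses dash-prefixed options.
LESS="${LESS:+$LESS }-F -R"
export LESS
```

Older builds of less may also need -X, otherwise short output can vanish along with the alternate screen when less exits.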
works fine for me.
I guess if all you did was read the headline of the post you could assume your alias does all the same things as bat.
> That shows non-printable characters like bat does?
cat does actually support that via the flags with -v (you can also use -t and -e to view tab and line endings too)
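A quick way to see those flags in action, using a line that contains a tab and a Windows-style line ending. This is only a sketch against GNU or BSD cat; the flags are not in POSIX, so other implementations may differ.

```shell
# A tab, a carriage return and a newline, made visible:
#   -v  shows control characters in caret notation (CR becomes ^M)
#   -t  additionally shows tabs as ^I
#   -e  additionally marks each line end with $
printf 'a\tb\r\n' | cat -vet
# prints: a^Ib^M$
```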
> That allows you to concatenate and page multiple files at once like bat does?
cat is literally called “cat” because its intended purpose is concatenation.
It’s not a pager, though the GP’s example did pipe to less anyway.
> That supports the --line-range option like bat does?
‘tail’ and ‘head’ would be muscle memory to a lot of people and not that different in terms of number of keystrokes.
But I do take your point that it’s nice to have that built into your pager.
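For reference, the head/tail combination I mean looks roughly like this. `line_range` is a made-up helper name, not an existing tool, and it only handles stdin or a single file (head prints its own headers when given several).

```shell
# line_range FIRST LAST [FILE] is a hypothetical helper: print lines
# FIRST..LAST of FILE (or of stdin when FILE is omitted).
line_range() {
  # head keeps the first LAST lines, tail -n +FIRST drops the ones before FIRST
  head -n "$2" ${3:+"$3"} | tail -n "+$1"
}

# Example: lines 3 to 5 of a ten-line input
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | line_range 3 5
```

That is about the same amount of typing as bat's --line-range once the function exists, but it is one more thing to carry around in a dotfile.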
> You can pipe the output of tail -f through your alias?
I couldn’t see why not. tail -f isn’t doing anything weird with the fd.
———
I’m not arguing against using bat though. I have it aliased to cat on my own machines, so I clearly do use it and would recommend it. But I do think some people might be surprised how far you can get with coreutils if bat wasn’t available.
cat's behaviour and bat's behaviour are different, though.
>cat a.txt b.txt
It was a dark and stormy night.
Once upon a time.
>bat a.txt b.txt
───────┬──────────────────────────────────────────────────────
│ File: a.txt
───────┼──────────────────────────────────────────────────────
1 │ It was a dark and stormy night.
───────┴──────────────────────────────────────────────────────
───────┬──────────────────────────────────────────────────────
│ File: b.txt
───────┼──────────────────────────────────────────────────────
1 │ Once upon a time.
───────┴──────────────────────────────────────────────────────
This difference becomes more useful once we have a more meaningful example:
>cat *.py
(thousands of lines of output)
>bat -r :5 -H 2 --style full *.py
───────┬──────────────────────────────────────────────────────
│ File: __init__.py <EMPTY>
│ Size: 0 B
───────┴──────────────────────────────────────────────────────
───────┬──────────────────────────────────────────────────────
│ File: editor.py
│ Size: 2.4 KB
───────┼──────────────────────────────────────────────────────
1 │ import collections
2 │ import contextlib
3 │ import glob
4 │ import io
5 │ import os.path
───────┴──────────────────────────────────────────────────────
It's hard to imagine many people have the muscle memory for the combination of cat, head, and whatever else you need to add headers with the filename and file size, call out empty files, highlight the second line, show line numbers, do syntax formatting, and wrap to the terminal width (head doesn't do this).
In fact, you're at odds with bat's README:
> you can still use bat to concatenate files. Whenever bat detects a non-interactive terminal (i.e. when you pipe into another process or into a file), bat will act as a drop-in replacement for cat
> It's hard to imagine many people have the muscle memory for the combination of cat, head, and whatever else you need to add headers with the filename and file size, call out empty files, highlight the second line, show line numbers, do syntax formatting
Honestly, it's harder to imagine many people with need for most combinations of these features. I can see general audience who would happily use one feature at a time, and if someone is constantly doing obscure one-off file analysis, chances are bat is just never enough, they're going to write long pipelines with awk/perl or use vim macros anyway, so there are no time savings nor convenience from using bat. (Is it really that much more convenient to read syntax-highlighted heads with line numbers? And I can barely remember the last time when `head` that also shows file sizes could've been much more handy than `du * ; head *`.)
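For what it's worth, the `du * ; head *` idea stretches a bit further with a loop. A rough sketch using only coreutils; `peek` is a made-up name and the header layout is my own, not bat's. There is no syntax highlighting and no line emphasis here.

```shell
# peek FILE... is a hypothetical helper: print a header with the name and
# size of each file, followed by its first five lines with line numbers.
peek() {
  for f in "$@"; do
    # wc -c on stdin gives the byte count without the filename;
    # tr strips the padding that some wc implementations add
    size=$(wc -c < "$f" | tr -d ' ')
    printf '== %s (%s bytes) ==\n' "$f" "$size"
    head -n 5 "$f" | cat -n
  done
}
```

Usage would be something like `peek *.py`. It covers the headers and line numbers, which is roughly where the coreutils-only approach stops being convenient.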
Also, good luck using all that bat muscle memory in docker containers or old-school fleet of remote servers.
> and wrap to the terminal width (head doesn't do this)
Terminals already wrap long lines just fine, they don't need help from anything. They can also re-wrap lines when window gets resized.
(edit: expanded quote, markup fix)
I also use the fzf previewer with --range-limited pretty frequently.
Your solution would be ok with an alias as well, so thanks. Might try it just so I don't need yet another program lying around.
alias bat='f(){ cat "$1" | highlight --force -O xterm256 | less -SRNI; }; f'
You can make an alias.
> For most installs. Or distros. Probably.
Comparing https://repology.org/project/bat-cat/packages vs https://repology.org/project/highlight/packages appears to show them with approximately equal availability. (Unless you're trying some other point, in which case I don't follow.)
Whether this proves or disproves anything in re syntax highlighting utilities I leave as an exercise.
cat .bashrc | highlight --force -O xterm256 | less -SRNI
everything is green, except line numbers are black (which comes from less), but
bat .bashrc
shows actual syntax highlighting.
so apparently, no, it doesn't work.
to be fair, this is how it works:
cat .bashrc | highlight --syntax shellscript -O ansi | less -R
to avoid getting caught by the useless use of cat police, this does too:
highlight --syntax shellscript -O ansi .bashrc | less -R
however, i have to tell it which syntax to use.
but to its credit, highlight even has support for pike, which bat doesn't (yet) (fixed that for myself, at least)
so overall, bat wins.
so really bat and highlight are equal, and it's not just a useless use of cat, but using cat here actually breaks the syntax detection. and it does so in bat too, obviously.
so this means highlight almost wins because it has pike support already, whereas for bat i had to add it, except that it turns out that if highlight can't detect the syntax it produces nothing, and you need --force to fix that, and if it is given multiple files as arguments it writes the output to files too, which is practically never what i want so i need to fix that with --stdout.
bat it is.
This will live in my .bashrc for a long time:
cat() {
  if [[ -t 1 ]]; then
    command cat "$@" | highlight --force -O xterm256
  else
    # plain cat to pipe into other things
    command cat "$@"
  fi
}
i'd change the third line so you can actually get syntax highlighting:
command highlight --stdout --force -O xterm256 "$@"
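Putting the two together, a sketch of the whole function with that change applied. The `command -v` guard is my own addition so it degrades to plain cat on machines where highlight isn't installed, and I've used `[ -t 1 ]` rather than `[[ -t 1 ]]` so it also works outside bash. Note that on a terminal any cat flags get handed to highlight, which may interpret them differently.

```shell
cat() {
  # stdout is a terminal and highlight is available: let highlight read the
  # files itself so it can detect the syntax from the file names
  if [ -t 1 ] && command -v highlight >/dev/null 2>&1; then
    command highlight --stdout --force -O xterm256 "$@"
  else
    # plain cat when piping into other things (or when highlight is missing)
    command cat "$@"
  fi
}
```

When the output goes into a pipe or a file, this behaves exactly like the real cat, which is the same trick bat's README describes for non-interactive use.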
highlight --force -O xterm256 < whatever | less -SRNI
* https://freshports.org/textproc/highlight/
* https://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/textproc/hi...