Hacker News

Show HN: Hck – a fast and flexible cut-like tool

by totalperspectiv on 7/10/2021, 3:46:53 PM with 10 comments

by lillesvin on 7/10/2021, 5:23:35 PM
I wrote something similar (but never really finished it), called 'gut', in Go a few years back. Funny thing is that I literally never use it. I thought splitting on regexes and that stuff would be super useful, but it turns out that I just use Perl one-liners instead. And Perl is available on something like 99.99% of all *nix machines, which my own 'cut'-substitute isn't.
Still a good exercise for me to write it, and I assume for OP too.
by rashil2000 on 7/10/2021, 5:18:19 PM
Love seeing these modern alternatives to coreutils! Ripgrep, fd, hyperfine, bat, exa, bottom, gdu, wc, sd, hexyl...
Yet to find a GNU 'tr' alternative though
by kitd on 7/10/2021, 5:56:34 PM
Nice work!
I don't know whether anyone here has used Rexx. The 'parse' instruction in Rexx was incredibly powerful, breaking up text by field/position/delimiter and assigning to variables all in one line.
I've often wondered if there was a command-line equivalent. Awk is great but you have to 'program' the parsing spec, rather than declare it.
by bilalhusain on 7/10/2021, 5:51:38 PM
It is interesting to note how it compares to "choose" (also in Rust) in the benchmarks.
single character
```
    hck           1.494 ± 0.026s
    hck (no-mmap) 1.735 ± 0.004s
    choose        4.597 ± 0.016s
```
multi character
```
    hck           2.127 ± 0.004s
    hck (no-mmap) 2.467 ± 0.012s
    choose        3.266 ± 0.011s
```
The single pass optimization trick[1] seems to be helping a lot in the single character case.
Of course, doing away with a pass is supposed to give 2x, and I am wondering whether the regex constraint led to this "side-effect".
[1] fast mode - https://github.com/sstadick/hck/blob/master/src/lib/core.rs#... https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
by asicsp on 7/11/2021, 3:09:41 AM
I saw `hck` on Twitter recently and was impressed to see support for compressed files. From the current todo list, I really hope complement gets implemented.
I see negative indexing is currently "unlikely". I'm writing a similar tool [0], but with bash+awk. I solved negative index support with a `-n` option, which changes the range syntax to use `:` instead of the `-` character.
My biggest trouble came with the literal field separator [1], because FS can only be specified as a string in awk, and backslash is a metacharacter for both strings and regexps.
[0] https://github.com/learnbyexample/regexp-cut
[1] https://learnbyexample.github.io/escaping-madness-awk-litera...
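To make the negative-index idea above concrete, here is a minimal Python sketch of resolving a `:`-separated range where negative numbers count from the end. Only the `:` range syntax comes from the comment; the function name `select_fields`, the 1-based indexing, and the handling of out-of-range values are assumptions for illustration, not the behavior of either tool.

```python
def select_fields(fields, spec):
    """Select fields by a spec such as "2", "-1", "2:4", "-3:-1", ":2" or "-2:".

    Assumptions (hypothetical, for illustration): indices are 1-based,
    negative indices count from the end, and ':' separates range bounds
    so that '-' stays free to mean "negative".
    """
    n = len(fields)

    def resolve(token, default):
        # An empty bound means "open-ended": fall back to the default.
        if token == "":
            return default
        i = int(token)
        if i == 0:
            raise ValueError("field indices are 1-based; 0 is not valid")
        # Map -1 to n, -2 to n - 1, and so on.
        return i if i > 0 else n + i + 1

    if ":" in spec:
        lo, hi = spec.split(":", 1)
        start, end = resolve(lo, 1), resolve(hi, n)
    else:
        start = end = resolve(spec, 1)

    # Nothing to return if the range lies entirely outside the line.
    if end < 1 or start > n:
        return []
    start = max(start, 1)
    end = min(end, n)
    return fields[start - 1:end]


if __name__ == "__main__":
    row = "a b c d e".split()
    print(select_fields(row, "-1"))
    print(select_fields(row, "-3:-1"))
```

Using a separate range character avoids the ambiguity of a spec like `-3-1`, which is presumably why the `-n` option switches syntax.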
by visarga on 7/10/2021, 8:28:02 PM
<offtopic> I have implemented a `_split` command to split a line by a separator and `_stat` command that does basically `sort | uniq -c | sort -nr` counting elements and sorting by frequency. Really useful operations for me.
When my one-liners become 2-3 lines long I need to switch to a regular script, but I also log all my shell commands going back years and have something a bit better than `history | grep word` to search them.</>
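For readers unfamiliar with the `sort | uniq -c | sort -nr` idiom mentioned above, a rough Python equivalent might look like the sketch below. The function name `stat` is hypothetical (it is not the commenter's `_stat`), and ties are broken alphabetically here, which may differ from what the shell pipeline prints.

```python
from collections import Counter


def stat(lines):
    """Count identical lines and return (line, count) pairs, most frequent first.

    Hypothetical helper for illustration; ties are ordered alphabetically,
    which is an assumption and not guaranteed to match `sort -nr`.
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = ["a\n", "b\n", "a\n", "c\n", "b\n", "a\n"]
    for value, count in stat(sample):
        print(f"{count:7d} {value}")
```

Passing `sys.stdin` as `lines` would let it sit at the end of a pipeline in the same way.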
by rendall on 7/11/2021, 3:40:39 AM
The README and description should not assume the reader knows what `cut` is or what it's used for. Maybe reference it and then ELI5
by technological on 7/11/2021, 1:00:18 AM
Nice one, OP. It's mostly due to my lack of knowledge of Rust, but the code is not as easy to read as Golang. Does anyone feel the same? (Btw, this has nothing to do with how OP wrote it, but rather the language itself.)
by queuebert on 7/10/2021, 5:59:46 PM
Yay, no more piping multiple cuts when you have multiple delimiters.
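As a sketch of why a regex delimiter removes the need for chained `cut` calls, the Python snippet below splits one line on several delimiters in a single step and then picks fields by 1-based index. This illustrates the general idea only; `cut_fields` and its parameters are hypothetical names, and nothing here reflects how hck is implemented.

```python
import re


def cut_fields(line, delimiter_pattern, indices):
    """Split `line` on a regex and return the requested 1-based fields.

    Hypothetical helper for illustration. Out-of-range indices are skipped
    silently, which is an assumption rather than any tool's documented behavior.
    The pattern should avoid capturing groups, because re.split would then
    include the captured delimiters in its result.
    """
    fields = re.split(delimiter_pattern, line)
    return [fields[i - 1] for i in indices if 1 <= i <= len(fields)]


if __name__ == "__main__":
    # Comma, semicolon, and runs of whitespace all act as delimiters.
    print(cut_fields("a,b;c  d", r"[,;]|\s+", [1, 3, 4]))
```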
by toastal on 7/10/2021, 5:01:56 PM
Heck