Hacker News

A library of words: Discovering Roget's Thesaurus (2023)

by NaOH on 5/15/2025, 5:46:19 PM with 6 comments

by ttctciyf on 5/17/2025, 10:16:54 PM
The author notes:
> since around 1962, publishers have abandoned the side-by-side layout of opposing categories which Roget insisted on as a visual representation of the opposing ideas
illustrated by the original's side-by-side entries for 615 Good and 616 Evil, seeing this as an unfortunate
> example of one of the many ways book design is actually getting less sophisticated over time.
It appears Project Gutenberg also sees value in preserving the two columns, at least in its HTML edition, as can be seen in its rendition of the same passages: https://www.gutenberg.org/cache/epub/10681/pg10681-images.ht.... (Link is to a 10M HTML file).
(Though it seems things have moved on, since Evil is now #619.)
Surely there must be more programmatic electronic editions, though, given the highly tractable organisation of the book?
by MichaelMoser123 on 5/18/2025, 3:20:27 AM
I once had a Python side project that parses the 1911 edition of Roget's Thesaurus into memory and provides some queries.
https://github.com/MoserMichael/roget-thesaurus-parser
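To illustrate why the book's organisation is so tractable, here is a minimal sketch of an in-memory model with numbered categories, each optionally paired with its opposite. It is not the API of the project linked above. The `Category` and `Thesaurus` names are hypothetical, and the word lists are made-up sample data rather than text quoted from any edition. Only the 615 Good / 616 Evil pairing is taken from this thread.

```python
from dataclasses import dataclass, field
from itertools import zip_longest
from typing import Optional


@dataclass
class Category:
    number: int
    name: str
    words: list = field(default_factory=list)
    opposite: Optional[int] = None  # number of the facing category, if any


class Thesaurus:
    def __init__(self, categories):
        # Index categories by number, and words by the categories listing them.
        self.by_number = {c.number: c for c in categories}
        self.by_word = {}
        for c in categories:
            for w in c.words:
                self.by_word.setdefault(w.lower(), []).append(c.number)

    def categories_for(self, word):
        """Return every category that lists the given word."""
        return [self.by_number[n] for n in self.by_word.get(word.lower(), [])]

    def opposite_of(self, number):
        """Return the category paired with `number`, or None."""
        opp = self.by_number[number].opposite
        return self.by_number.get(opp) if opp is not None else None

    def side_by_side(self, number, width=20):
        """Render a category next to its opposite as two text columns."""
        left = self.by_number[number]
        right = self.opposite_of(number)
        if right is None:
            return [f"{left.number} {left.name}"] + list(left.words)
        rows = [(f"{left.number} {left.name}", f"{right.number} {right.name}")]
        rows += zip_longest(left.words, right.words, fillvalue="")
        return [f"{l:<{width}}{r}".rstrip() for l, r in rows]


# Illustrative sample data only (assumed, not quoted from any edition).
SAMPLE = [
    Category(615, "Good", ["benefit", "advantage", "boon"], opposite=616),
    Category(616, "Evil", ["harm", "ill", "mischief"], opposite=615),
]

thesaurus = Thesaurus(SAMPLE)
print("\n".join(thesaurus.side_by_side(615)))
```

A real parser would fill in `opposite` from the source text; here the pairing is entered by hand, which is the part an electronic edition can easily keep even if the printed layout drops it.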
by folex on 5/18/2025, 2:08:53 AM
Where does the stereotype 'thesaurus = synonyms + antonyms' come from?
I'm not a native English speaker, and I've never come across that idea anywhere except, I'd guess, the TV show Friends.
I've used thesauruses since my childhood for exactly the task of looking up meanings, explanations, perhaps some etymology baked in.
For English, I always use WordNet; it is quite good and works offline on Android.
For my basic level of Chinese, Outliers dictionaries are so far the best I have found, but that's mainly due to my heavy reliance on the etymology provided there.
Well, I guess I got carried away a bit. Back to my question: where does thesaurus = synonyms + antonyms come from?
by 5- on 5/17/2025, 10:11:09 PM
The Cambridge Dictionary thesaurus has a similar organisation, and I always thought it was a unique quirk (further promulgated by the mobile version calling it "smart thesaurus").
https://dictionary.cambridge.org/thesaurus/articles/differen...
by Michelangelo11 on 5/18/2025, 11:55:06 AM
Interesting. So, a kind of precursor to LLMs, or if you like, a pre-electronic, pen-and-paper latent space.