I see the value of this, but I find the wisdom of it to be highly questionable for anything but the highest-level TLDs.
For example, it enumerates the domains of many US state school districts:
k12.pr.us
// k12.ri.us Removed at request of Kim Cournoyer <netsupport@staff.ri.net>
k12.sc.us
// k12.sd.us Bug 934131 - Removed at request of James Booze <James.Booze@k12.sd.us>
k12.tn.us
k12.tx.us
k12.ut.us
k12.vi.us
k12.vt.us
k12.va.us
k12.wa.us
k12.wi.us
// k12.wv.us Bug 947705 - Removed at request of Verne Britton <verne@wvnet.edu>
k12.wy.us
These seem like awfully specific subdomains to be hardcoded into general-purpose software and entirely reasonable ones to want to set a cookie on or otherwise treat as not-TLDs. The list itself includes evidence of this in the form of exclusions due to bug reports and even makes this point specifically in the case of Hawaii:
// k12.hi.us Bug 614565 - Hawaii has a state-wide DOE login
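To make the excerpt concrete, here is a rough sketch of how a consumer might read lines like these: anything starting with `//` is a comment (so the commented-out districts are simply not rules), and everything else is a suffix. The names `SAMPLE`, `parse_rules` and `is_listed_suffix` are made up for illustration, and the sketch ignores the wildcard and exception rules that the real list also uses.

```python
# Minimal sketch: parse PSL-style lines like the excerpt above.
# Assumption: one rule per line, "//" starts a comment line.
# Wildcard ("*.") and exception ("!") rules are NOT handled here.

SAMPLE = """\
// k12.hi.us Bug 614565 - Hawaii has a state-wide DOE login
k12.sc.us
k12.tn.us
// k12.wv.us Bug 947705 - Removed at request of Verne Britton <verne@wvnet.edu>
k12.wy.us
"""


def parse_rules(text):
    """Return the set of suffix rules, skipping blanks and comments."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        # Only the text up to the first whitespace is the rule itself.
        rules.add(line.split()[0].lower())
    return rules


def is_listed_suffix(name, rules):
    """True if `name` is itself one of the parsed suffix rules."""
    return name.lower().rstrip(".") in rules


rules = parse_rules(SAMPLE)
print(is_listed_suffix("k12.sc.us", rules))
print(is_listed_suffix("k12.hi.us", rules))
```

With this sample, `k12.sc.us` is treated as a suffix, while the commented-out `k12.hi.us` is not, which is exactly the effect of the removals discussed above.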
It’s regrettable that browser vendors, even generally responsible ones like Mozilla, feel an incentive to do this.
FWIW, this is the same list Facebook told[0] businesses “not” (wink, wink) to add their domain to after Apple announced all the tracking restrictions.
The public suffix list is an abomination --- a useful, pragmatic, largely successful abomination, but an abomination nevertheless. The PSL centralizes and makes static a database that should be dynamic and distributed. It's a throwback to the bad old pre-DNS internet where everyone would copy around /etc/hosts files and rely on ad hoc human updating to keep host->address mapping up to date.
The information in the public suffix list belongs in DNS.
Some related past threads:
Public Suffix List Problems - https://news.ycombinator.com/item?id=20889474 - Sept 2019 (15 comments)
The Public Suffix List - https://news.ycombinator.com/item?id=12311530 - Aug 2016 (40 comments)
Public suffix list - https://news.ycombinator.com/item?id=9634824 - May 2015 (1 comment)
Public Suffix List - https://news.ycombinator.com/item?id=850115 - Sept 2009 (3 comments)
The IETF WG DBOUND tried to find a better solution to this problem and did not reach any consensus. fwiw.
https://datatracker.ietf.org/wg/dbound/about/
The current way most of this is handled is via a list published at publicsuffix.org (commonly known as the "Public Suffix List" or "PSL"), and the general goal is to accommodate anything people are using that for today. However, there are broadly speaking two use patterns. The first is a "top ancestor organization" case. In this case, the goal is to find a single superordinate name in the DNS tree that can properly make assertions about the policies and procedures of subordinate names. The second is to determine, given two different names, whether they are governed by the same administrative authority. The goal of the DBOUND working group is to develop a unified solution, if possible, for determining organizational domain boundaries. However, the working group may discover that the use cases require different solutions. Should that happen, the working group will develop those different solutions, using as many common pieces as it can.
Hey, it works fine as long as you don’t think too much.
Couldn't this be done in DNS? The same way zone delegations appear in there, a way to encode what's a public suffix?
For example (I'm bad at DNS)
_suffix.gitlab.io TXT "type=public,cookies=restrict,cross-origin=forbid"
would tell everyone that remram44.gitlab.io is under the gitlab.io public suffix, and how to deal with cookies etc?
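A rough sketch of what a client-side check for that idea could look like. To be clear, the `_suffix` label and the `type=...,cookies=...` value format are just the hypothetical record from this comment, not an existing standard, and `FAKE_TXT` is an in-memory stand-in for real DNS TXT queries so the example runs without a resolver.

```python
# Sketch of the hypothetical "_suffix" TXT lookup described above.
# Assumptions: the record name/format is invented in this thread, and
# FAKE_TXT stands in for an actual DNS TXT query.

FAKE_TXT = {
    "_suffix.gitlab.io": "type=public,cookies=restrict,cross-origin=forbid",
}


def parse_policy(txt):
    """Turn 'k1=v1,k2=v2' into a dict."""
    policy = {}
    for pair in txt.split(","):
        key, _, value = pair.partition("=")
        policy[key.strip()] = value.strip()
    return policy


def find_public_suffix(hostname, lookup_txt=FAKE_TXT.get):
    """Walk from the full name up towards the root and return the first
    ancestor that publishes a type=public record, plus its policy."""
    labels = hostname.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        txt = lookup_txt("_suffix." + candidate)
        if txt is None:
            continue
        policy = parse_policy(txt)
        if policy.get("type") == "public":
            return candidate, policy
    return None, {}


suffix, policy = find_public_suffix("remram44.gitlab.io")
print(suffix, policy)
```

One cost this makes visible: a client may need one query per label to find the boundary, which is part of what a static list avoids.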
Something I’ve always wondered: why is `co.uk` a TLD? What’s the story behind that?
Getting a domain listed is pretty hard.
Getting vendors to update their PSL in less ubiquitous products is near impossible. For instance, 1Password hasn't shipped a new version in years.
Small plug for a random python tool I maintain that uses this.
Parsing domains is a pain in the ass. It can be impossible to know what is part of the TLD, what is a subdomain, etc. without a canonical list and parser.
Here's a sansio domain / tld splitter: https://github.com/theelous3/sansio-tld-parser
Use case: you want to block all edu domains, but suffixes like wa.edu.au exist, so you have to parse them out.
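A minimal sketch of that use case, not using the linked tool's API (I'm not assuming anything about it here). `SAMPLE_SUFFIXES` is a tiny hand-picked set standing in for the full list, and `split_domain` / `is_edu` are illustrative names. The idea is to take the longest matching suffix and then check whether `edu` is one of its labels, rather than checking the last label only.

```python
# Sketch: block "edu" domains using a longest-suffix match.
# Assumption: SAMPLE_SUFFIXES is a tiny stand-in for the real list.

SAMPLE_SUFFIXES = {"edu", "au", "edu.au", "wa.edu.au", "com", "com.au"}


def split_domain(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return (prefix, suffix) using the longest suffix found in `suffixes`.
    If nothing matches, the suffix is the empty string."""
    name = hostname.lower().rstrip(".")
    labels = name.split(".")
    # i = 0 tries the whole name first, so the longest match wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return ".".join(labels[:i]), candidate
    return name, ""


def is_edu(hostname, suffixes=SAMPLE_SUFFIXES):
    """True if 'edu' is a label of the matched suffix."""
    _, suffix = split_domain(hostname, suffixes)
    return "edu" in suffix.split(".")


for host in ("www.mit.edu", "school.wa.edu.au", "edu.example.com"):
    print(host, split_domain(host), is_edu(host))
```

Note that `edu.example.com` is not blocked: `edu` is part of the name there, not part of the suffix, which is the distinction a naive substring or last-label check gets wrong.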
Before you begin to make use of the PSL, consider some of its problems: https://github.com/sleevi/psl-problems
FWIW, the link above successfully convinced me and a coworker not to use the PSL.