Hacker News

Adblock via /etc/hosts

by lpsz on 2/12/2016, 2:44:42 AM with 39 comments

by sugarfactory on 2/12/2016, 8:20:24 AM
I had been doing this until some time ago to block ads and to prevent Google from collecting my web browsing history via Google Analytics. During that time I witnessed a strange phenomenon. Every time I added "127.0.0.1 www.google-analytics.com" to C:\Windows\System32\Drivers\etc\hosts, I saw the line removed from the file some hours later. Although I had added tens of lines, I only saw the Google Analytics line removed. IIRC I finally decided to figure out what caused the removal. I used Filemon to watch file changes, but the line got removed again while I was watching the file and nothing appeared in the log. I suspected Ring-0 processes were secretly running and causing the removal, but I knew nothing about the Windows kernel so I gave up there. To this day I wonder what the cause was.
by sergiotapia on 2/12/2016, 4:00:44 AM
See, the thing is: if a website breaks because of the hosts file, you can't easily re-enable ads for just the one website you need.
A situation we can all imagine ourselves in: You need to check Google Analytics for your website/company site. You can't, because it's blocked at the hosts-file level.
What solution would there be for this use case?
by stblack on 2/12/2016, 1:43:53 PM
Hi Folks, this is my repo, thanks for all the comments.
I'm always looking for ways to improve things so I'm open to all suggestions.
EDIT: A couple of clarifications.
1) This isn't just for adblock. Your hosts file is useful for thwarting all sorts of malware. If a bot or trojan phones home with a domain, a vigilant hosts file will block it. And if a bot or trojan phones home with an IP, then the hosts file can't help you but, then again, an IP can be physically located fairly quickly.
2) The key to a good hosts file is keeping it current. This hosts file amalgamates several well-curated sources. So your hosts file is only as good as your ability to keep it current. This repo helps with this.
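To make the amalgamation idea concrete, here is a minimal Python sketch of merging several hosts-format lists into one de-duplicated file. It is not the repo's actual code: the function names, the set of names to skip, and the default sink address are all assumptions made for illustration.

```python
# Hypothetical helpers for illustration; not taken from the repo.
SINK_IPS = {"0.0.0.0", "127.0.0.1"}
# Names that should never end up in the block list.
KEEP = {"localhost", "localhost.localdomain", "broadcasthost", "local", "0.0.0.0"}


def parse_hosts(text):
    """Yield blocked hostnames from the text of one hosts-format list."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2 or fields[0] not in SINK_IPS:
            continue
        for name in fields[1:]:
            name = name.lower()
            if name not in KEEP:
                yield name


def merge_hosts(sources, sink="0.0.0.0"):
    """Merge several hosts-format texts into one sorted, de-duplicated list."""
    names = set()
    for text in sources:
        names.update(parse_hosts(text))
    return "\n".join(f"{sink} {name}" for name in sorted(names)) + "\n"
```

Each source is passed in as text, so downloading the lists and writing the result to the real hosts file are left to the caller.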
by baptistem on 2/12/2016, 8:06:38 AM
I've been running this for years.
Using 127.0.0.1, I have an httpd responding to every request with a 200. This avoids some anti-ad-block checks (such as "watch this ad before your video").
You can also configure your server to reply with a cat GIF. But who would like to see such an Internet?
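For anyone who wants to try the "answer everything with a 200" idea, here is a minimal sketch using only Python's standard library. It is not the commenter's setup: the class name, the helper name, and the default port 8080 are assumptions, and it only handles plain HTTP.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysOK(BaseHTTPRequestHandler):
    """Answer every request with an empty 200 response."""

    def _ok(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Route the common methods to the same handler.
    do_GET = do_HEAD = do_POST = do_OPTIONS = _ok

    def log_message(self, fmt, *args):
        pass  # stay quiet; blocked hosts can generate a lot of requests


def make_server(host="127.0.0.1", port=8080):
    """Build the server (port 8080 is an assumption for this sketch)."""
    return HTTPServer((host, port), AlwaysOK)
```

Calling `make_server().serve_forever()` starts it. Requests redirected through the hosts file arrive on port 80, which usually needs elevated privileges to bind, and HTTPS requests will still fail because this sketch has no certificate for the blocked names.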
by bloaf on 2/12/2016, 4:46:13 AM
Just gonna plug hostsman, which has been doing this on windows since forever:
http://www.abelhadigital.com/hostsman
Lets you choose which lists to use, and automatically updates those lists. Also makes it easy to temporarily disable your rules if you need something that's blocked. Has a button for flushing the DNS cache.
by stephendicato on 2/12/2016, 4:35:21 PM
(Full disclosure: I run a service that blocks and intercepts malware communication using DNS! https://strongarm.io)
Blocking via your hosts file has some great benefits; it works regardless of network and is relatively easy to update. Unfortunately, it doesn't scale easily to many systems or give you any insight into whether or not you are trying to connect to blocked domains.
Blocking via DNS is a good alternative and is suggested multiple times in this thread. You can easily protect a whole network by setting your recursive resolvers and it works across any system.
If you are interested in this and don't want to operate and maintain your own DNS (as well as pulling down various domain lists) check out https://strongarm.io. We manage DNS, aggregating lists of bad domains, and (most uniquely) will alert you if you try and talk to a blocked domain.
It's free for personal use. We are a growing startup and love feedback from HN. Feel free to contact me directly as well! stephen[at]strongarm.io
by vbezhenar on 2/12/2016, 7:49:04 AM
I'm using my own VPN server and I set up an unbound DNS server there. It's the only way for my old iPhone and iPad to browse the internet without ads. And it's really fast. I use https://pgl.yoyo.org/adservers/ for the ad servers list and a little awk script to convert it to unbound format.
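The conversion step is small enough to sketch. The commenter used awk; what follows is a Python stand-in, not their script, and it assumes the list arrives in hosts format. The function name and the default sink address are made up for illustration.

```python
def hosts_to_unbound(hosts_text, sink="127.0.0.1"):
    """Convert hosts-format lines into unbound local-zone/local-data lines."""
    out = []
    seen = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 2:
            continue
        for name in fields[1:]:
            name = name.lower().rstrip(".")
            if name in ("localhost", "") or name in seen:
                continue
            seen.add(name)
            # A "redirect" zone answers for the name and its subdomains.
            out.append(f'local-zone: "{name}" redirect')
            out.append(f'local-data: "{name} A {sink}"')
    return "\n".join(out) + "\n"
```

The generated lines belong in the `server:` section of unbound.conf, typically through an included file so the list can be regenerated without touching the main config.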
by laumars on 2/12/2016, 7:35:38 AM
> Using 0.0.0.0 [instead of 127.0.0.1] is faster because you don't have to wait for a timeout.
I'm not going to argue that localhost is better than 0, but that specific argument they've raised is incorrect. You don't have to wait for a timeout on localhost either. It will either fail instantly due to no listening processes on that IP and port, or it will connect to whatever process you have open on that address (eg a local instance of a http daemon).
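The "fails instantly" claim is easy to check. Here is a minimal sketch that connects to a local port with no listener and times the attempt. The function name and the 5-second safety timeout are assumptions, and exact timings vary by operating system.

```python
import socket
import time


def connect_attempt(host="127.0.0.1"):
    """Connect to a local port nobody listens on; return (error, seconds)."""
    # Ask the OS for a free port, then release it so nothing is listening there.
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind((host, 0))
    port = probe.getsockname()[1]
    probe.close()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5.0)  # only matters if the attempt were to hang
    start = time.monotonic()
    try:
        client.connect((host, port))
        error = None
    except OSError as exc:
        error = exc
    finally:
        client.close()
    return error, time.monotonic() - start
```

If the argument above holds, the returned error is a connection refusal and the elapsed time is far below the 5-second timeout, which never comes into play.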
by jedisct1 on 2/12/2016, 1:32:08 PM
Or use the DNSCrypt proxy. It has a module to filter DNS responses based on their name (full name or using expressions such as sex), or on the IP addresses they resolve to. Instead of returning 127.0.0.1, which can make you vulnerable to rebinding attacks, it returns responses with the standard "REFUSED" response code. https://simplednscrypt.org https://dnscrypt.org
by buro9 on 2/12/2016, 3:42:31 PM
I've been doing this on a Streisand created VPN server : https://github.com/jlund/streisand and use most of the same lists (though I've had to remove a few things from "Someone that cares" and I've also had to add a few things - I target apps too so I nuke some specific to mobile apps that are likely not in those lists).
The reasons:
* Block adverts in native mobile apps
* Block adverts in mobile web browsing
* Create a single connection for the mobile (reduce exposure to latency of new connections to different servers)
* VPN connection keep-alive means I seldom reconnect
* Side effect of mitigating risk of my telco screwing with my traffic or excessively logging metadata
It works really, really well.
I'm sure someone will say "battery!" but the cost of mobile adverts on batteries far outweighs the cost of connecting to a VPN.
This is effectively adblock for mobile that works for all apps and websites.
by RKearney on 2/12/2016, 4:01:26 AM
Some of these map the domains to 127.0.0.1 which is wrong. It should be 0.0.0.0.
On second thought, you shouldn't be using the hosts file for this at all.
by SixSigma on 2/12/2016, 8:52:44 AM
While this is ok as an idea, I prefer Privoxy [1] to get my ad blocking outside of the browser. It has the benefit that I can turn it on and off (I use a proxy switcher). It also means that I can have other devices use it, via LAN, SSH tunnel, or whatever.
[1] http://www.privoxy.org/
by Stamy on 2/12/2016, 8:13:48 AM
For anyone using OpenWRT, this is a great script for blocking hosts: https://gist.github.com/teffalump/7227752. I guess it could be modified to use this source.
by riobard on 2/12/2016, 8:13:55 AM
Could someone please explain why advertisers do not re-use functional domain names to defeat domain-based filtering? I always find it fascinating that they still use such obvious ad-only (sub)domains to host assets.
by finishingmove on 2/13/2016, 12:34:39 AM
I've recently switched from Windows to Linux, and I feel kind of "naked", because I don't know what is the Linux equivalent of Windows Firewall + NOD32 + Common Sense. I've got ufw and AppArmor installed so far. Is using such a huge hosts file a common practice? Also, what about Flash? I don't want to install it, but some websites insist on it still. What would you advise, folks? I'd appreciate any suggestions.
by jcoffland on 2/12/2016, 6:20:29 AM
This already works quite well on my Android phone using AdAway.
Edit: AdAway uses an /etc/hosts file.
by aapje on 2/12/2016, 9:31:58 AM
Wouldn't it be more efficient to add blocking on router level? Making all of your devices (at home) ad-free.
Anyone experienced with doing this?
by sosuke on 2/12/2016, 4:52:26 AM
I loved using Gas Mask for multiple host file management on OSX. Not so much for Ads but a great app I had trouble discovering.
by jakeogh on 2/12/2016, 4:12:10 AM
dnsgate[1] should use this as its default source. Fixing.
[1] https://github.com/jakeogh/dnsgate
by jdoss on 2/12/2016, 5:59:34 AM
I wanted a service for my laptop with custom blacklisting/whitelisting, blocking stats and a webserver to serve a blank HTML page for any domains in DNS list so I made:
https://github.com/jdoss/dockerhole
It was inspired by https://pi-hole.net/ and I am glad to see there are others making similar things to block Ads.
by stirner on 2/13/2016, 12:02:39 AM
I made a little C program that converts AdBlock Plus filter lists to hosts file entries: https://github.com/wwalexander/hostsblock
There's a bit of an impedance mismatch since filter lists support some fairly advanced pattern matching while hosts file entries are obviously limited to specific domains, but it gets most domains.
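To illustrate that mismatch, here is a minimal Python sketch (not the C program linked above) that keeps only the filter rules a hosts file can express, namely plain `||domain^` rules. The regex and the function name are assumptions. Skipping every rule that carries options, paths, or wildcards is a deliberately conservative choice.

```python
import re

# Matches plain domain-anchor rules such as "||ads.example.com^".
# Rules with paths, wildcards, or $options cannot be expressed in a hosts file.
DOMAIN_RULE = re.compile(r"^\|\|([a-z0-9][a-z0-9.-]*\.[a-z]{2,})\^$")


def abp_to_hosts(filter_text, sink="0.0.0.0"):
    """Return hosts lines for the filter rules that name a whole domain."""
    hosts = []
    for rule in filter_text.splitlines():
        match = DOMAIN_RULE.match(rule.strip().lower())
        if match:
            hosts.append(f"{sink} {match.group(1)}")
    return hosts
```

Comments (`!`), exception rules (`@@`), and element-hiding rules (`##`) fall through the regex and are dropped, which is where the "gets most domains" caveat comes from.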
by jsingleton on 2/12/2016, 11:11:04 AM
This technique works very well for blocking ads in Skype.
You can also block the BBC Breaking News banner this way by adding polling.bbc.co.uk. Or if you want to play a prank use 192.30.252.153 as the IP. GitHub pages don't check if you own the domain.
https://unop.uk/dev/breaking-the-news-blocking-the-bbc-news-...
by Borating on 2/12/2016, 1:58:54 PM
For GNU/Linux also check hostsblock [1]. It's available on aur.
There is also a pi-hole clone, notrack [2].
[1] https://gaenserich.github.io/hostsblock/ | [2] https://github.com/quidsup/notrack
by derFunk on 2/12/2016, 8:22:15 AM
I'm using http://pi-hole.net running on a Raspberry Pi. I use it as my home dns, it runs dnsmasq and points a list of a million ad hostnames to its own IP, answering every request with a blank HTML page.
by dbalan on 2/12/2016, 12:41:47 PM
Relevant: adblock with DNS Server https://hub.docker.com/r/kolyunya/afdns/
This is for people who cannot edit /etc/hosts, but can change DNS server.
by dbg31415 on 2/12/2016, 2:07:58 PM
http://www.nytimes.com/interactive/2015/10/01/business/cost-...
Ads slow us down.
by cdnsteve on 2/12/2016, 1:12:48 PM
Has anyone tried OpenDNS? It seems that a filtered DNS service is the way to go here.
by kamaal on 2/12/2016, 5:58:31 PM
This needs to be a larger effort: one hosts file, updated hourly or at some other regular interval, which we just configure a cron job to fetch and then forget about.
Would love to see some project like that.
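The local half of such a setup could be quite small. Here is a minimal sketch of the step a cron job might run after downloading a list: it replaces only a marked section of the hosts file and leaves hand-written entries alone. The marker strings and the function name are invented for this sketch.

```python
BEGIN = "# BEGIN managed blocklist"  # hypothetical marker text
END = "# END managed blocklist"


def splice_blocklist(current, blocklist):
    """Return hosts text with the managed section replaced by `blocklist`."""
    kept, inside = [], False
    for line in current.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)  # keep everything outside the markers
    body = blocklist.strip("\n")
    return "\n".join(kept + [BEGIN, body, END]) + "\n"
```

Running it twice with the same list gives the same file, so a scheduled job can repeat it safely. Fetching the list and writing the result back (which needs root) are left to the caller.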
by kup0 on 2/12/2016, 3:13:00 AM
Is this reasonably small enough not to cause performance issues? I notice it mentioned trying to keep the size more reasonable.
Ad-blocking via hosts files can often lead to a noticeable performance hit.
by IgorPartola on 2/12/2016, 1:56:13 PM
Does anyone have a convenient way to convert this to a bind9 config format? I would rather run this for the whole LAN than just one computer at a time.
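One possible approach, sketched in Python, is to turn the hosts list into a Response Policy Zone file for BIND 9. This is an untested-in-production sketch: the function name and the SOA/NS values are placeholders, and it relies on the RPZ convention that a `CNAME .` record means "answer NXDOMAIN".

```python
def hosts_to_rpz(hosts_text, serial=1):
    """Build an RPZ zone file that answers NXDOMAIN for every listed host."""
    lines = [
        "$TTL 300",
        # Placeholder SOA and NS records; adjust to the local setup.
        f"@ IN SOA localhost. root.localhost. ({serial} 3600 600 86400 300)",
        "@ IN NS localhost.",
    ]
    seen = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        for name in fields[1:]:
            name = name.lower().rstrip(".")
            if name and name != "localhost" and name not in seen:
                seen.add(name)
                # Owner names are relative to the policy zone's origin.
                lines.append(f"{name} IN CNAME .")
    return "\n".join(lines) + "\n"
```

The resulting file would be loaded as a master zone and referenced from a `response-policy` statement in named.conf. The zone name is up to you, and the serial needs to increase on every regeneration so that slaves pick up the change.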
by LoSboccacc on 2/12/2016, 10:05:25 AM
I used to, but Windows tends to hang when the hosts file gets overly long, so I had to abandon it when the advertisers grew too many.
Has Windows 10 gotten better with that?
by simoncion on 2/12/2016, 5:38:39 AM
Another option is to stand up a DNS server that knows how to do something like BIND 9's Response Policy Zone [0][1].
Although figuring out how to propagate RPZ changes to them isn't exactly straightforward (more on this below), if you're using BIND, you can set up views that match certain clients and provide one mix of RPZs to one set, and another to another set.
On updating RPZs in a view (warning: BIND 9-specific instructions follow) :
So, BIND has this nifty option for a zone called "in-view". This lets you say "The data for this particular zone lives in this other view, so when requests come in for this zone, in this view, use the data in this other view.". It might sound complicated, but it's really just a pointer to a pre-existing zone definition. This lets you define your master zones in one big "zone definition" view, and have client-specific views refer back to those definitions.
However, you can't use in-view with RPZs. Why? Who knows? [2] But, what you can do is this:
* Create one unique RNDC key per view
* Add an allow-notify and match-clients entry in each view with that view's key
* In the appropriate views, add a slave zone definition for each relevant RPZ, with localhost as the master, and whatever is your usual domain xfer key as the key [3]
* Back up in your "zone definition" view, add to your also-notify list for each master RPZ definition an entry for localhost and each view key. [4] Having an ACL just for these RPZ slaves cleans up the RPZ definitions.
Now you have dynamically updatable host blocking that can be deployed on a per-host basis, if you like. It's initially a bit more work than managing a local hosts file, but you can easily apply host blocking lists to any set of machines on your LAN, and you can programmatically update the RPZ lists with tools like nsupdate.
[0] http://jpmens.net/2011/04/26/how-to-configure-your-bind-reso...
[1] http://www.zytrax.com/books/dns/ch7/rpz.html
[2] RPZs are handled just like regular zones in every other way except for this one. It's a bit frustrating.
[3] This is actually less burdensome than it sounds, as you can write these slave zone definitions once and include the files containing the definitions in whatever view needs them.
[4] That is, if you had three views, your also-notify list would have something like the following new entries: 127.0.0.1 key "view1-key"; 127.0.0.1 key "view2-key"; 127.0.0.1 key "view3-key"; You can have entries for just the views that use a given RPZ, but it doesn't hurt to have one ACL that notifies all views when any RPZ data changes.
by bitsoda on 2/12/2016, 1:34:42 PM
Don't the empty <div>s just get left behind taking up space when using this method?
by pette on 2/12/2016, 10:28:37 AM
It took me about a minute to find 12 false entries just by looking at which lines end in .de
by DyslexicAtheist on 2/12/2016, 4:12:13 PM
very cool. It would be interesting to see this built into a (maybe raspberryPI) router and have a more central point/policy for configuration maybe together with caching (dnsmasq).
by pauly on 2/12/2016, 10:04:17 AM
been doing this for a long time with various facebook related domains - upsets people if they borrow my laptop though.
by Xophmeister on 2/12/2016, 10:24:01 AM
Whilst I'm sure this is no longer the case, I used to do this back in the day, but it was soooo slow!