Commit 002190b

Current adhosts script
0 parents  commit 002190b

7 files changed, +120 -0 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
ad.domains
ad.hosts

LICENSE.txt

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
This program is free software. It comes without any warranty, to
the extent permitted by applicable law. You can redistribute it
and/or modify it under the terms of the Do What The Fuck You Want
To Public License, Version 2, as published by Sam Hocevar and
reproduced below.

DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004

Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>

Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.

DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. You just DO WHAT THE FUCK YOU WANT TO.

README.md

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
# adhosts - Generate ad-blocking hosts file from multiple sources

This minimal tool builds a composite hosts file, containing entries mapping
domain names to IP addresses, from multiple sources, such as ad-blocking,
anti-tracking and other lists.

The resulting file can be used with e.g. `dnsmasq` to provide ad-blocking via DNS
for an entire network, or it can be installed on single hosts.

This script condenses much of the functionality of e.g. Pi-hole into a form that
is easy to integrate into existing systems, without introducing lots of new
software. Thus, this tool is mostly geared towards system administrators and
people already familiar with their infrastructure.

## Setup

The `config` file contains all configurable parameters, which are:

* `SRC_HOSTS`: A file containing one URL per line, each pointing to a hosts-style (ip-blank-domain) blocklist.
* `SRC_DOMAIN`: A file containing one URL per line, each pointing to a domain blocklist.
* `OUT_DOMAINS`: The output file containing the final list of blocked domains.
* `OUT_HOSTS`: The output hosts-style blocklist.
* `ADSERVER`: The IP address of the host to redirect ads to. This can be used to approximately count the number of
  blocked ads (see below).

27+
28+
## Blocking on the local host
29+
30+
To apply the generated blocklist on a single computer, copy it to `/etc/hosts`.
31+
Take care that there may be pre-existing entries in there, which may be destroyed
32+
by simply overwriting the existing file. A reasonable solution may be including the
33+
existing entries via a `file://` URL.
34+
35+
## Blocking for the whole network
36+
37+
To block ads for an entire network, you will need to configure that networks resolver
38+
to prefer entries from an external hosts file. Most resolvers will also read the local
39+
`/etc/hosts` and prefer that to external responses, to copying the generated hosts-style
40+
file to `/etc/hosts` on the resolver should work.
41+
42+
For `dnsmasq`, the following configuration option will read an additional hosts-style file
43+
into the resolver:
44+
45+
```
46+
addn-hosts=/path/to/ad.hosts
47+
```
48+
49+
### Counting blocked ads
50+
51+
When redirecting the blocked ads to a host on the local network via the `ADSERVER` configuration
52+
variable, setting up a web server such as `lighttpd` and having it count the hits will provide
53+
an approximation of the number of ads blocked.
54+
55+
Note that most ad networks (sensibly) use HTTPS now, so the count may be off until you
56+
provide HTTPS on the diversion server (which may be complicated by having to also serve a valid
57+
certificate for the requested domains).
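
One way to approximate the counting is to tally the web server's access log. The sketch below assumes a log with one request per line whose second whitespace-separated field is the requested host name; the log format, the log path and the function name `count_hits_by_host` are assumptions, so check them against your server's configuration.

```shell
#!/bin/bash

# Hypothetical helper (not part of adhosts): count requests per host name.
#   $1: access log file, one request per line, with the requested host
#       name in the second whitespace-separated field (assumed format)
# Prints "<count> <host>" lines, most requested host first.
count_hits_by_host() {
    awk '{ hits[$2]++ } END { for (h in hits) print hits[h], h }' "$1" | sort -rn
}

# Example (log path is an assumption):
#   count_hits_by_host /var/log/lighttpd/access.log | head
#   wc -l < /var/log/lighttpd/access.log    # total number of hits
```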

adhosts

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
#!/bin/bash

# Load SRC_HOSTS, SRC_DOMAIN, OUT_DOMAINS, OUT_HOSTS and ADSERVER.
# The explicit ./ keeps bash from searching PATH for a file named "config".
source ./config

# Clear temporary data
: > "$OUT_DOMAINS.temp"

# Fetch host-format lists and keep only the second column (the domain).
# awk splits on any run of blanks, so tab- or multi-space-separated entries work too.
while IFS= read -r list; do
    if [ -z "$list" ]; then
        continue
    fi
    printf "Fetching host list %s\n" "$list"
    curl "$list" | grep -v "^#" | awk 'NF >= 2 { print $2 }' >> "$OUT_DOMAINS.temp"
done < "$SRC_HOSTS"

# Fetch domain-format lists
while IFS= read -r list; do
    if [ -z "$list" ]; then
        continue
    fi
    printf "Fetching domain list %s\n" "$list"
    curl "$list" | grep -v "^#" >> "$OUT_DOMAINS.temp"
done < "$SRC_DOMAIN"

# Strip carriage returns, surrounding blanks and empty lines before sorting,
# so that duplicates differing only in whitespace are actually removed
tr -d '\r' < "$OUT_DOMAINS.temp" | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//' -e '/^$/d' | sort -u > "$OUT_DOMAINS"
rm "$OUT_DOMAINS.temp"

# Prepend new ad server
sed -e "s/^/${ADSERVER} /" "$OUT_DOMAINS" > "$OUT_HOSTS"
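
The three processing steps of the script (extract the domain column, clean up and deduplicate, prepend the redirect address) can be tried on sample data without any network access. The following is a minimal sketch of those steps as functions reading from standard input; the function names are hypothetical and the sample domains are made up.

```shell
#!/bin/bash

# Extract the domain column from hosts-style input on stdin,
# skipping comment lines and lines without a second field.
hosts_to_domains() {
    grep -v '^#' | awk 'NF >= 2 { print $2 }'
}

# Strip carriage returns, surrounding blanks and empty lines,
# then sort and drop duplicates.
normalize_domains() {
    tr -d '\r' | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//' -e '/^$/d' | sort -u
}

# Prefix each domain on stdin with the redirect address given as $1.
domains_to_hosts() {
    sed -e "s/^/$1 /"
}

# Example:
#   printf '# comment\n0.0.0.0 b.example\n0.0.0.0\ta.example\r\n0.0.0.0 b.example\n' \
#       | hosts_to_domains | normalize_domains | domains_to_hosts 127.0.0.1
```

With the sample input in the comment, the pipeline prints `127.0.0.1 a.example` followed by `127.0.0.1 b.example`: the comment line is dropped, the carriage return is stripped, and the duplicate entry is removed.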

adlist-sources.domain

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt

adlist-sources.host

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
http://sysctl.org/cameleon/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://hosts-file.net/ad_servers.txt

config

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
SRC_HOSTS="adlist-sources.host"
SRC_DOMAIN="adlist-sources.domain"
OUT_DOMAINS="ad.domains"
ADSERVER="127.0.0.1"
OUT_HOSTS="ad.hosts"
