Squid is a fully-featured HTTP/1.0 proxy that offers a rich access control, authorization and logging environment for developing web proxy and content-serving applications.
Let's start with the location of the cache server in the network: according to the documentation, the most suitable place is in the DMZ; this should keep the cache server secure while still able to peer with other, outside, caches (such as the ISP's).
The documentation also recommends setting a DNS name for the cache server (such as "cache.mydomain.tld" or "proxy.mydomain.tld") as soon as possible: a simple DNS entry can save many hours further down the line. Configuring client machines to access the cache server by IP address is asking for a long, painful transition down the road.
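As a small illustration of why the name matters, clients can be pointed at the proxy by name rather than by address. This is only a sketch: it assumes the example name proxy.mydomain.tld from above, Squid's default port 3128, and a client program that honours the conventional http_proxy environment variable (not every program does).

```shell
# Hypothetical client-side settings: the host name is the example used
# in the text, the port is Squid's default.
PROXY_HOST="proxy.mydomain.tld"
PROXY_PORT=3128

# Many command-line clients read the proxy URL from this variable.
http_proxy="http://${PROXY_HOST}:${PROXY_PORT}/"
export http_proxy

echo "$http_proxy"
```

If the cache server later moves to another machine, only the DNS record changes and settings like this one stay valid.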
Squid installation is as simple as it can be; you only have to add the Squid package. Available flavors are "ldap" (allowing for LDAP authentication) and "snmp" (including SNMP support).
# export PKG_PATH=/path/to/your/favourite/OpenBSD/mirror
# pkg_add squid-x.x.STABLExx-snmp.tgz
squid-x.x.STABLExx-snmp: complete
--- squid-x.x.STABLExx-snmp -------------------
NOTES ON OpenBSD POST-INSTALLATION OF SQUID x.x
The local (OpenBSD) differences are:
configuration files are in /etc/squid
sample configuration files are in /usr/local/share/examples/squid
error message files are in /usr/local/share/squid/errors
sample error message files are in /usr/local/share/examples/squid/errors
icons are in /usr/local/share/squid/icons
sample icons are in /usr/local/share/examples/squid/icons
the cache is in /var/squid/cache
logs are stored in /var/squid/logs
the ugid squid runs as is _squid:_squid
Please remember to initialize the cache by running "squid -z" before
trying to run Squid for the first time.
You can also edit /etc/rc.local so that Squid is started automatically:
if [ -x /usr/local/sbin/squid ]; then
        echo -n ' squid'; /usr/local/sbin/squid
fi
#
Squid configuration relies on several dozen parameters and can quickly turn into a very tricky task. The best approach is therefore probably to start with a very basic configuration and then tweak the options one by one to meet your specific needs, making sure at each step that everything keeps working as expected.
Actually, only a few parameters need to be set to get Squid up and running (theoretically, you could even run Squid with an empty configuration file): default values are assumed for all the options you don't explicitly set. At least one setting must be changed, though: the default configuration file denies access to all browsers, which may sound a bit... too strict!
Our first configuration will be very simple: we will place our proxy server in the DMZ (172.16.240.0/24) and allow only requests from the LAN (172.16.0.0/24). No ISP parent proxy is taken into account.
The main Squid configuration file is /etc/squid/squid.conf. Let's have a look at it.
The http_port option sets the port(s) that Squid will listen on for incoming HTTP requests. There are three forms: port alone (e.g. "http_port 3128"), hostname with port (e.g. "http_port proxy.kernel-panic.it:3128"), and IP address with port (e.g. "http_port 172.16.240.151:3128"); you can specify multiple socket addresses, each on a separate line. If your Squid machine is multi-homed and directly accessible from the internet, it is strongly recommended that you force Squid to bind the socket to the internal address. This way, Squid will only be visible from the internal network and won't proxy the whole world! Squid's default HTTP port is 3128, but many administrators prefer using a port which is easier to remember, such as 8080.
http_port 3128
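To spot listeners that are not bound to a specific address, the configuration can be scanned for the port-only form. The helper below is a hypothetical sketch (the function name is made up for this example) that reads a squid.conf-style file on standard input; it is run here on an inline sample rather than on the real file.

```shell
# Print every http_port value that has no "hostname:" or "address:"
# part, i.e. the port-only form, which listens on all local addresses.
port_only_listeners() {
    awk '$1 == "http_port" && $2 !~ /:/ { print $2 }'
}

# Example run on an inline sample configuration.
port_only_listeners <<'EOF'
http_port 3128
http_port 172.16.240.151:8080
EOF
```

On a multi-homed machine, any port printed by this check is a candidate for the "address:port" form.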
The cache_dir parameter allows you to specify the path, size and depth of the directories where the cache swap files will be stored. Squid allows you to have multiple cache_dir tags in your config file.
cache_dir ufs /var/squid/cache 100 16 256
The above line sets the cache directory pathname to /var/squid/cache, with a size of 100MB and 16 first-level subdirectories, each containing 256 second-level subdirectories. The cache directory must exist and be writable by the Squid process and its size can't exceed 80% of the whole disk. For further details, please refer to the documentation.
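The arithmetic behind those numbers can be checked with a few lines of shell. This is a sketch only: the function name is hypothetical, and it handles just the "ufs" form of cache_dir shown above.

```shell
# Summarise a "cache_dir ufs <path> <MB> <L1> <L2>" line read from
# standard input: the size and the number of directories it implies.
cache_dir_summary() {
    awk '$1 == "cache_dir" && $2 == "ufs" {
        printf "%s: %d MB, %d first-level and %d second-level directories\n",
               $3, $4, $5, $5 * $6
    }'
}

echo "cache_dir ufs /var/squid/cache 100 16 256" | cache_dir_summary
```

With the values used here, 16 first-level directories of 256 subdirectories each give 4096 second-level directories in total.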
The cache_mgr parameter contains the e-mail address of the Squid administrator, which will appear at the end of the error pages; e.g.:
cache_mgr webmaster@kernel-panic.it
The cache_effective_user and cache_effective_group options allow you to set the UID and GID Squid will drop its privileges to once it has bound to the incoming network port. The package installation has already created the _squid user and group.
cache_effective_user _squid
cache_effective_group _squid
The ftp_user option sets the e-mail address that Squid will use as the password for anonymous FTP login. It's a good practice to use an existing address:
ftp_user webmaster@kernel-panic.it
The following options set the paths to the log files; the format of the access log file, which logs every request received by the cache, can be specified by using a logformat directive (please refer to the documentation for a detailed list of the available format codes):
# Define the access log format
logformat squid %ts.%03tu %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
# Log client request activities ('squid' is the name of the log format to use)
access_log /var/squid/logs/access.log squid
# Log information about the cache's behavior
cache_log /var/squid/logs/cache.log
# Log the activities of the storage manager
cache_store_log /var/squid/logs/store.log
And now we come to one of the trickiest parts of the configuration: Access Control Lists. The simplest way to restrict access is to only accept requests from the internal network. Such basic access control can be enough in small networks, especially if you don't wish to use features like username/password authentication or URL filtering.
ACLs are usually split into two parts: acl lines, starting with the acl keyword and defining classes, and ACL operators, allowing or denying requests based on classes. ACL operators are checked from top to bottom and the first match wins. Below is a very basic ruleset:
# Classes
acl all src all                     # Any IP address
acl localhost src 127.0.0.0/8       # Localhost
acl lan src 172.16.0.0/24           # LAN where authorized clients reside
acl manager proto cache_object      # Cache object protocol
acl to_localhost dst 127.0.0.0/8    # Requests to localhost
acl SSL_ports port 443              # https port
acl Safe_ports port 80 21 443       # http, ftp, https ports
acl CONNECT method CONNECT          # SSL CONNECT method
# Only allow cachemgr access from localhost
http_access allow manager localhost
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports
# Deny CONNECT to other than SSL ports
http_access deny CONNECT !SSL_ports
# Prevent access to local web applications from remote users
http_access deny to_localhost
# Allow access from the local network
http_access allow lan
# Default deny (this must be the last rule)
http_access deny all
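Because the rules are evaluated in order, it is easy to append a new http_access line after the default deny by mistake. A quick sanity check, sketched below with a hypothetical helper name, prints the last http_access rule of a squid.conf-style file read from standard input so that it can be compared with "deny all".

```shell
# Print the action and ACL names of the last http_access line,
# ignoring trailing comments and all other directives.
last_http_access() {
    awk '$1 == "http_access" {
             sub(/[ \t]*#.*$/, "")   # drop a trailing comment, if any
             $1 = ""                 # drop the directive name
             sub(/^ +/, "")
             last = $0
         }
         END { print last }'
}

# Example run on an inline sample; prints "deny all".
printf 'http_access allow lan\nhttp_access deny all\n' | last_http_access
```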
Now our cache server is almost ready for a first run, with just one last step to go: we first need to create the cache swap directories where Squid will store cached pages. The "squid -z" command will create all the required directories, according to the cache_dir parameter in squid.conf (see above), as the user and group specified by the cache_effective_user and cache_effective_group parameters.
# /usr/local/sbin/squid -z
2009/10/30 18:04:35| Creating Swap Directories
#
We are now ready to start Squid. Starting it in debug mode (-d 1 flag) and in foreground (-N flag) will make it easier to see if everything is working fine.
# /usr/local/sbin/squid -d 1 -N
2009/10/30 18:05:19| Starting Squid Cache version 2.7.STABLE6 for i386-unknown-openbsd4.6...
[ ... ]
2009/10/30 18:05:19| Accepting proxy HTTP connections at 0.0.0.0, port 3128, FD 10.
2009/10/30 18:05:19| Accepting ICP messages at 0.0.0.0, port 3130, FD 11.
2009/10/30 18:05:19| Accepting SNMP messages on port 3401, FD 12.
2009/10/30 18:05:19| WCCP Disabled.
2009/10/30 18:05:19| Ready to serve requests.
2009/10/30 18:05:22| Done scanning /var/squid/cache (0 entries)
2009/10/30 18:05:22| Finished rebuilding storage from disk.
2009/10/30 18:05:22| 0 Entries scanned
2009/10/30 18:05:22| 0 Invalid entries.
2009/10/30 18:05:22| 0 With invalid flags.
2009/10/30 18:05:22| 0 Objects loaded.
2009/10/30 18:05:22| 0 Objects expired.
2009/10/30 18:05:22| 0 Objects cancelled.
2009/10/30 18:05:22| 0 Duplicate URLs purged.
2009/10/30 18:05:22| 0 Swapfile clashes avoided.
2009/10/30 18:05:22| Took 2.9 seconds ( 0.0 objects/sec).
2009/10/30 18:05:22| Beginning Validation Procedure
2009/10/30 18:05:22| Completed Validation Procedure
2009/10/30 18:05:22| Validated 0 Entries
2009/10/30 18:05:22| store_swap_size = 0k
2009/10/30 18:05:22| storeLateRelease: released 0 objects
Once you get the "Ready to serve requests" message, you should be able to use the cache server. Once it is up and running, Squid reads the cache store: the first time you should see all zeros, as above, because the cache store is empty.
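When Squid is not running in the foreground, the same readiness check can be scripted. The sketch below uses a hypothetical helper that reads log text on standard input; it assumes that the startup messages also end up in the cache_log file configured earlier, which would then be the input on a live system.

```shell
# Succeed if the log text on standard input contains Squid's
# "Ready to serve requests" message, fail otherwise.
squid_is_ready() {
    grep -q 'Ready to serve requests'
}

# Example run on two lines copied from the startup output; on a live
# system the input would be the cache log file instead (assumption).
if squid_is_ready <<'EOF'
2009/10/30 18:05:19| WCCP Disabled.
2009/10/30 18:05:19| Ready to serve requests.
EOF
then
    echo "proxy is ready"
else
    echo "proxy is not ready yet"
fi
```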
Now, to make sure everything is working fine, we will configure our browser to use our fresh new proxy and we will try to access our favourite web site. In the /var/squid/logs/access.log file, you should see something like:
1242419601.435 6735 172.16.0.13 TCP_MISS/200 11810 GET http://www.kernel-panic.it/ - DIRECT/62.149.140.23 text/html
1242419849.536 14 172.16.0.13 TCP_HIT/200 11820 GET http://www.kernel-panic.it/ - NONE/- text/html
[...]
For a detailed description of each field in the access.log file, please refer to the documentation. Anyway, TCP_MISS means that the requested page wasn't stored in the cache (either it was not present or it had expired); TCP_HIT, instead, means that the page was served from the cache. The second field is the time (in milliseconds) that Squid took to service the request: as you can see, it is much shorter when the page is cached. The page size is the fifth field: cached pages may be a little larger because of the extra headers added by Squid.
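The same comparison can be made over a whole log file. The following sketch assumes the log format defined earlier (service time in the second field, result code in the fourth) and uses a hypothetical helper name; it groups requests by result code and prints the request count and the average service time for each.

```shell
# Read access.log lines on standard input and print, per result code,
# the number of requests and the average service time in milliseconds.
access_log_stats() {
    awk '{
             split($4, code, "/")    # e.g. TCP_HIT/200 -> TCP_HIT
             n[code[1]]++
             ms[code[1]] += $2
         }
         END {
             for (c in n)
                 printf "%s requests=%d avg_ms=%d\n", c, n[c], ms[c] / n[c]
         }' | sort
}

# Example run on the two sample lines shown above.
access_log_stats <<'EOF'
1242419601.435 6735 172.16.0.13 TCP_MISS/200 11810 GET http://www.kernel-panic.it/ - DIRECT/62.149.140.23 text/html
1242419849.536 14 172.16.0.13 TCP_HIT/200 11820 GET http://www.kernel-panic.it/ - NONE/- text/html
EOF
```

On a live system the function would be fed the real file, for instance with "access_log_stats < /var/squid/logs/access.log".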
If everything is working fine, we can stop Squid:
# /usr/local/sbin/squid -k shutdown
and configure the system to start it on boot by adding the following lines to /etc/rc.local:
if [ -x /usr/local/sbin/squid ]; then
        echo -n ' squid'
        /usr/local/sbin/squid
fi
You may also wish to start Squid through the RunCache script, which automatically restarts it on failure and logs both to the /var/squid/squid.out file and to syslog. Just remember to background it with an "&", or it will hang the system at boot time.
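A backgrounded start could look like the sketch below. The install path of RunCache is an assumption (it is not given in the text) and should be adjusted to wherever the package put the script; the function name is likewise hypothetical.

```shell
# Start a supervisor script in the background so that the boot
# sequence is not blocked. The path is passed as an argument because
# the location of RunCache is an assumption here.
start_supervised() {
    runcache="$1"
    if [ -x "$runcache" ]; then
        printf ' squid'
        "$runcache" &    # the trailing "&" keeps the boot from hanging
    fi
}

# Hypothetical location; does nothing if the script is not there.
start_supervised /usr/local/bin/RunCache
```

The same lines can go into the boot script in place of the direct call to the squid binary.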