Wishlist: Compression support... please?

Posted: Thu Sep 27, 2007 4:50 am
by jamieplucinski
Okay, I know ABP supports gzipped lists, but my host, as great as he is, can't let me gzip the list on the fly because of the CPU limits on his account.

So I was wondering if it would be possible for ABP to support subscriptions pointing at gzipped/zipped files? Linking to a static compressed file would have all the benefits of compression without the CPU overhead.

I know it'd be a lot of work and would probably drive you insane to add, but it'd be great for those of us with hosts that enforce a CPU quota.

Thanks :)

Posted: Thu Sep 27, 2007 9:30 am
by Wladimir Palant
Does your host allow you to add a custom header? Then you could use statically gzipped files and only add a header to indicate that the browser has to decompress them (Dr.Evil's list uses this solution). The problem with other solutions is that Firefox cannot decompress ZIP files on the fly; one would need temporary files. And the GZIP decompressor is not exposed to extensions, as far as I can see.

Posted: Thu Sep 27, 2007 10:33 am
by jamieplucinski
Wladimir Palant wrote: Does your host allow you to add a custom header? Then you could use statically gzipped files and only add a header to indicate that the browser has to decompress them (Dr.Evil's list uses this solution). The problem with other solutions is that Firefox cannot decompress ZIP files on the fly; one would need temporary files. And the GZIP decompressor is not exposed to extensions, as far as I can see.
I can add custom headers through a PHP script, but nothing globally, sadly. I'm going to try experimenting with GZip headers and some PHP to try and get it to serve some statically gzipped goodness.

Posted: Thu Sep 27, 2007 10:34 am
by fanboy
your list getting a bit big there jamie? ;)

Posted: Thu Sep 27, 2007 11:13 am
by jamieplucinski
fanboy wrote: your list getting a bit big there jamie? ;)
Well I've always seen that some of the other lists were smaller (file size wise) and always figured that my list was huge! But then common sense kicked in and I realised that everyone was using GZipping... but me! lol

Compressed, my list comes in at around the same size as everyone else's... which will save me some bandwidth until I can get around to optimizing the list, which should be soon!

Posted: Thu Sep 27, 2007 11:48 am
by jamieplucinski
Okay for anyone else wanting to do this, here goes:

First download your current filter list and save it as subscription_latest.txt

Then, using gzip (google gzip for Windows if you don't have Linux), run the following from a command prompt:

Code:

type subscription_latest.txt | gzip -9 -f > subscription_latest.gz
Run it from the directory where you saved subscription_latest.txt, then upload subscription_latest.gz to your server.

Then make a new PHP file and paste the following into it:

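Code:

<?php
	// Encodings the client says it accepts (empty if the header is missing)
	$accept = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';

	// Pick the token the client asked for; x-gzip is checked first because it contains "gzip"
	if( strpos($accept, 'x-gzip') !== false ){
		$encoding = 'x-gzip';
	}elseif( strpos($accept, 'gzip') !== false ){
		$encoding = 'gzip';
	}else{
		$encoding = false;
	}

	if( $encoding !== false && !headers_sent() ){
		// Caches must keep compressed and uncompressed responses apart
		header('Vary: Accept-Encoding');
		header('Content-Encoding: ' . $encoding);
		// Stream the pre-compressed file instead of loading it all into memory
		readfile('subscription_latest.gz');
	}else{
		// Path to your normal (uncompressed) list
		include_once('subscription.php');
	}
?>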
Make sure subscription.php (or whatever your normal list name is) and subscription_latest.gz are in the same directory as the php file. And you're done :) You can test if your new PHP file is serving gzipped content properly by inputting the URL here.
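
The same check can also be scripted. This is only a rough sketch, assuming PHP 7.1 or later (needed for the context argument of get_headers()). The helper names declares_gzip() and serves_gzip() are made up for illustration, and the URL in the usage line is a placeholder.

```php
<?php
// Hypothetical helper: does a list of raw response header lines declare gzip encoding?
function declares_gzip(array $headerLines): bool
{
    foreach ($headerLines as $line) {
        // Accept both "gzip" and the older "x-gzip" token, case-insensitively
        if (preg_match('/^Content-Encoding:\s*(?:x-)?gzip\s*$/i', $line)) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: request only the headers of $url while advertising
// gzip support, the way a browser would.
function serves_gzip(string $url): bool
{
    $context = stream_context_create([
        'http' => [
            'method' => 'HEAD',
            'header' => "Accept-Encoding: gzip\r\n",
        ],
    ]);
    $headers = get_headers($url, false, $context);
    return $headers !== false && declares_gzip($headers);
}
```

Calling serves_gzip('http://example.com/subscription.php') should return true if the script answers with a Content-Encoding header of gzip or x-gzip.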

GZip compression without CPU load... :D Now to move this live and start optimizing the list.

Posted: Thu Sep 27, 2007 12:09 pm
by Wladimir Palant
Last I heard, serving files through PHP was extremely bad performance-wise. So it is much better if you have Apache with mod_headers and can use .htaccess.
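
For hosts that do honour .htaccess, the header approach might look like the fragment below. It is a minimal sketch that nobody posted in the thread. It assumes mod_headers is loaded, that AllowOverride permits FileInfo, and that the pre-compressed file is named subscription_latest.gz as in the PHP example above.

```apache
<Files "subscription_latest.gz">
    # Tell the browser the body is gzip-compressed and must be decompressed
    Header set Content-Encoding gzip
    # Serve the decompressed content as plain text rather than as a .gz download
    ForceType text/plain
</Files>
```

With this in place the subscription URL would point straight at the .gz file, so no PHP runs per request.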

Posted: Thu Sep 27, 2007 12:16 pm
by jamieplucinski
Wladimir Palant wrote: Last I heard, serving files through PHP was extremely bad performance-wise. So it is much better if you have Apache with mod_headers and can use .htaccess.
Again, it comes down to the host. It's a performance hit, yes, but nowhere near as bad as gzipping on demand, and it caters to the fact that some hosts (including mine) seem to ignore .htaccess files completely :(

I'd much rather take the performance hit from this than 3 times the bandwidth use. If this has the undesired effect of more CPU usage then I'll revert, but I guess I'll find out when/if my host starts getting mad at CPU usage ;)

Edit: PHP code updated to stomp out a bug.

Posted: Thu Sep 27, 2007 6:53 pm
by Stupid Head
If you want to save more bandwidth, caching is very effective.

Code:

<?php

// Variables
$text_file = 'subscription_file.txt';
$gmt_mtime = gmdate('D, d M Y H:i:s', filemtime($text_file)).' GMT';
$expires = gmdate('D, d M Y H:i:s', strtotime('+5 days')).' GMT';

// Code
// Answer conditional requests with 304 when the client's copy is still current
if(isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) && $_SERVER['HTTP_IF_MODIFIED_SINCE'] == $gmt_mtime){
        header('HTTP/1.0 304 Not Modified');
        header('Expires: '.$expires);
        exit();
}
// Compress the output on the fly if the client accepts it
ob_start("ob_gzhandler");
header('Last-Modified: '.$gmt_mtime);
header('Expires: '.$expires);
header('Content-Type: text/plain; charset=UTF-8');
// readfile() sends the list as-is; include() would execute any PHP tags in it
readfile($text_file);

?>
I'm more worried about bandwidth than CPU time because my storage and bandwidth are metered. My host is running Apache 1.3 right now, so I can't use mod_deflate. As soon as they upgrade, I'm going to stop using the poor man's compression.

Posted: Thu Sep 27, 2007 7:53 pm
by chewey
Jamie: Why not put the compressed list on your server and let the server deliver it as gzipped?

My list is compressed exactly once per revision: On my machine before being transferred to the server.

Presto: No performance hit at all. :-)

Posted: Thu Sep 27, 2007 11:32 pm
by jamieplucinski
chewey wrote: Jamie: Why not put the compressed list on your server and let the server deliver it as gzipped?

My list is compressed exactly once per revision: On my machine before being transferred to the server.

Presto: No performance hit at all. :-)
That's exactly what I did with that code ;) I gzip it locally (highest compression possible) and then upload it. I have a nice automated script to download the un-gzipped list, compress it, and then upload it via FTP; works like a charm. Then the PHP script just loads the .gz file and outputs it with just a single header inserted above it. Similar to the mod_rewrite mods that people do, just through the power of PHP :)
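
The compress step of such a script could look roughly like this. It is a sketch only, assuming PHP with the zlib extension. The name compress_list() is hypothetical and this is not the script mentioned above; the download and FTP upload steps are left out.

```php
<?php
// Hypothetical helper: gzip a filter list once, ahead of time, so the
// server never has to compress it on demand. Returns the compressed size in bytes.
function compress_list(string $source, string $target): int
{
    $plain = file_get_contents($source);
    if ($plain === false) {
        throw new RuntimeException("Cannot read $source");
    }

    // Level 9 is the highest compression level zlib offers
    $packed = gzencode($plain, 9);
    if ($packed === false || file_put_contents($target, $packed) === false) {
        throw new RuntimeException("Cannot write $target");
    }

    return strlen($packed);
}
```

For example, compress_list('subscription_latest.txt', 'subscription_latest.gz') would produce the file that then gets uploaded.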