List optimisation script


Post by RealEnder »

Hello all,
For some time now I have been maintaining the Bulgarian Adblock list at http://stanev.org/abp/
I've created a simple way for users to submit their filter additions or requests. One useful tool for keeping the list optimised is the redundancy checker at http://adblockplus.org/en/redundancy_check
Nevertheless, as time passes, sites get redesigned, change their CMS or just disappear. Keeping all of their old filters in the list will slow down the browsing experience.
I've put together a simple Python script to help deal with that. You can download it here: http://stanev.org/abp/abp_list_check.py
Of course, two caveats apply: a good filter is generalized and not tied to only one site; and many webmasters and developers employ various techniques to modify response headers, which can fool the script.
The script takes the list file as an argument and, where possible, normalizes each entry to a standard URL. It then sends a request to that URL and prints the entry if a 404 is received. The file is opened in read-only mode, so nothing in it is changed.
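For anyone who just wants the idea without downloading the file, here is a rough sketch of the steps described above. It is not the contents of abp_list_check.py. The function names (filter_to_url, http_status), the normalization rules and the use of a HEAD request are my own assumptions for illustration, and it is written for Python 3.

```python
import http.client
import sys
import urllib.error
import urllib.request
from typing import Optional


def filter_to_url(line: str) -> Optional[str]:
    """Try to turn one filter line into a plain URL; return None if not possible.

    The rules below are illustrative assumptions, not the original script's logic.
    """
    rule = line.strip()
    # Skip blanks, comments, the list header and element-hiding rules.
    if not rule or rule[0] in "![" or "##" in rule or "#@#" in rule:
        return None
    # Exception rules start with "@@"; the pattern after it is handled the same way.
    if rule.startswith("@@"):
        rule = rule[2:]
    # Drop filter options such as "$script,third-party".
    rule = rule.split("$", 1)[0]
    # Regular-expression filters and wildcards cannot be mapped to a single URL.
    if (rule.startswith("/") and rule.endswith("/")) or "*" in rule:
        return None
    # Remove anchors ("||", "|") and a trailing separator placeholder "^".
    rule = rule.lstrip("|").rstrip("|^")
    if "^" in rule:
        return None
    if rule.startswith(("http://", "https://")):
        return rule
    # Without a scheme, require something that looks like a host name up front.
    host = rule.split("/", 1)[0]
    if "." not in host or host.startswith("."):
        return None
    return "http://" + rule


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the request failed."""
    # HEAD is an assumption made to avoid downloading bodies; some servers
    # answer HEAD differently from GET.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive here; the code is what we are after.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, timeout, malformed URL and similar problems.
        return None


def main(path: str) -> None:
    # The list is opened read-only and never modified.
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            url = filter_to_url(line)
            if url is not None and http_status(url) == 404:
                print(f"404: {line.strip()} -> {url}")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

Saved under a name of your choice, it would be run with the list file as its only argument. Entries that cannot be reduced to a single URL are silently skipped, so a missing line in the output does not mean the filter is still useful.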
My Python knowledge is very poor, so comments, patches and improvements are more than welcome.
Cheers!