This goes mainly to Wladimir, but I'd like everybody to contribute their thoughts about this:
First of all, great work! I just read your article on http://adblockplus.org/blog/investigati ... algorithms
and here is my idea:
I'd like a way to prevent filters like "ad" from blocking "head" etc.
At the moment, the only way to do that seems to be using several filters like this:
Code: Select all
.ad.
/ad.
/ad/*
/ad-
.ad-
.ad_
/ad_
What I have in mind is something equivalent to this regular expression:
Code: Select all
/([^\w]|_)ad([^\w]|_)/
I know that this will slow down the algorithm, but I think it's worth it.
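To make the effect of that expression concrete, here is a small Python sketch that runs it against a few URLs. It is only an illustration of the pattern itself, not of how Adblock Plus evaluates filters; the function name `is_blocked` and the example.com URLs are made up for the demo.

```python
import re

# The pattern from the post: "ad" with a non-word character or an
# underscore on both sides.
BOUNDED_AD = re.compile(r"([^\w]|_)ad([^\w]|_)")


def is_blocked(url: str) -> bool:
    """Return True if the URL contains "ad" as a delimited word (hypothetical helper)."""
    return BOUNDED_AD.search(url) is not None


if __name__ == "__main__":
    # Illustrative URLs only, not taken from any real filter list.
    for url in (
        "http://example.com/ad/banner.gif",
        "http://example.com/img.ad-1.png",
        "http://example.com/ad_top.gif",
        "http://example.com/head.js",
        "http://example.com/download/file",
    ):
        print(url, is_blocked(url))
```

One thing to keep in mind: because the pattern requires a character on both sides, a URL that ends in "/ad" with nothing after it would not match.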
Another method could be to split each URL into its components, e.g.
Code: Select all
http://ad.server.com/?showad.php&ad=test
would be split into:
Code: Select all
http
ad
server
com
showad
php
ad
test
Still, it could be very useful.
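The splitting idea could be sketched roughly like this in Python. This is just my assumption of how it might work: every character that is not a letter or digit (including the underscore) is treated as a separator, and a filter only matches if it equals a whole component. The names `url_components` and `matches_component` are made up for the example.

```python
import re
from typing import List


def url_components(url: str) -> List[str]:
    """Split a URL on every run of non-alphanumeric characters (assumed rule)."""
    return [part for part in re.split(r"[^A-Za-z0-9]+", url) if part]


def matches_component(url: str, keyword: str) -> bool:
    """Return True only if the keyword is a whole component of the URL."""
    return keyword in url_components(url)


if __name__ == "__main__":
    url = "http://ad.server.com/?showad.php&ad=test"
    print(url_components(url))
    print(matches_component(url, "ad"))
    print(matches_component("http://example.com/head.js", "ad"))
```

With this approach "showad" and "head" stay intact as components, so a plain "ad" filter would not hit them.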
What do you think?