missed ad

Wladimir Palant

Post by Wladimir Palant »

It will be - because any number of simple filters won't increase the processing time.

And http:// at the beginning of the regexp doesn't do anything for efficiency. The whole thing about filters starting with http:// being more efficient is that they are transformed into a regexp like /^http://.../ - note the anchor at the beginning.
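To illustrate what the anchor changes, here is a minimal Python sketch. It is not ABP's code, and the host names and URLs are made up. An unanchored pattern may match anywhere in the address, while an anchored one can only match at position 0.

```python
import re

# Hypothetical patterns and addresses, for illustration only.
unanchored = re.compile(r"http://adserver\.example/")
anchored = re.compile(r"^http://adserver\.example/")

direct = "http://adserver.example/banner.gif"
embedded = "http://site.example/redirect?to=http://adserver.example/banner.gif"

# Both patterns match when the address starts with the text.
assert unanchored.search(direct) and anchored.search(direct)

# Only the unanchored pattern finds the text in the middle of an address;
# the anchored one can match at the start of the string and nowhere else.
assert unanchored.search(embedded) is not None
assert anchored.search(embedded) is None
```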
IceDogg
Posts: 909
Joined: Fri Jun 09, 2006 11:22 pm

Post by IceDogg »

So these many filters would only be optimized or fast if they have http:// at the beginning? I mean with the new speed filtering you're adding soon.
Wladimir Palant

Post by Wladimir Palant »

Ehm... No... These are two completely different things: filters starting with http:// are processed faster starting with ABP 0.6 because it only looks for a match at the beginning of the address. ABP 0.6.2 will make this optimization useless and I'll probably remove it.
IceDogg

Post by IceDogg »

OK thanks, sorry, I feel like I just stepped in at the end of something. But thanks for not bashing me for it.
pirlouy
Posts: 332
Joined: Sat Jun 10, 2006 2:33 pm
Location: France

Post by pirlouy »

I didn't want to insist on http:// but on [^/], i.e. the search stops at the first "/".

But if you're sure, I will apply...
Wladimir Palant

Post by Wladimir Palant »

^ has a different meaning in a character class than it has in the regexp itself.
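A short Python sketch of the two meanings (the URL is made up): outside a character class ^ anchors the match to the start of the string, inside one it negates the class.

```python
import re

url = "http://ads.example.com/banner/image.gif"

# Outside a character class, ^ anchors the match to the start of the string.
assert re.search(r"^http://", url) is not None
assert re.search(r"^ads", url) is None  # "ads" is present, but not at the start

# Inside a character class, ^ negates the class: [^/]+ means
# "one or more characters that are not a slash".
host = re.search(r"^http://([^/]+)", url).group(1)
print(host)  # ads.example.com
```

So [^/]+ limits how far that one group can extend, but it does not anchor the pattern anywhere.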
pirlouy

Post by pirlouy »

ok. Thanks for the ^, but... sorry you will find me boring...
In my regexp, the search stops after the first /, that's why I'm not sure that your changes will do something in my case.

In fact, to be honest, since I tried your test page and saw 3.2 ms before optimization and 50 after, I'm a bit afraid. :-D

And the numion test ( http://www.numion.com/stopwatch/ ) shows me that there is no difference when I enable or disable AB+.

But I will wait for the ..2 version and will adapt and optimize again...
Wladimir Palant

Post by Wladimir Palant »

And you are really testing this regexp?
/^http://([^/]+\.)?(a(1\.yimg|ddfreestats|llosponsor)|b(aventures|luestreak)|c(asalemedia|omclick)|daooda|e(stat|uroclick)|fa(lkag|stclick)|gestionpub...)[./:]/
I get 0.9 ms vs. 1.3 ms for it - with and without the anchor at the beginning.
pirlouy

Post by pirlouy »

I tested this regexp:

/^http://([^/]+\.)?(a(1\.yimg|ddfreestats|llosponsor)|b(aventures|luestreak)|c(asalemedia|omclick)|daooda|e(stat|uroclick)|fa(lkag|stclick)|gestionpub|intellitxt|kontera|lostfrog|netavenir|overture|pimpslord|smartadserver|t(racker|radedoubler)|valueclick|weborama)[./:]/

With "Optimize pattern matching" disabled I get 0.64 ms (50 runs), whereas with "Optimize pattern matching" enabled I get ~55 ms.
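For anyone who wants to try a comparison like this outside the browser, here is a rough Python sketch that times the regexp above with and without the leading anchor. The test URLs are made up, Python's regex engine is not the one Firefox uses, and this says nothing about ABP's "Optimize pattern matching" option, so the numbers will not match the test page. It only shows the method.

```python
import re
import timeit

# The regexp from the post above, without the leading ^.
BODY = (
    r"http://([^/]+\.)?(a(1\.yimg|ddfreestats|llosponsor)|b(aventures|luestreak)"
    r"|c(asalemedia|omclick)|daooda|e(stat|uroclick)|fa(lkag|stclick)|gestionpub"
    r"|intellitxt|kontera|lostfrog|netavenir|overture|pimpslord|smartadserver"
    r"|t(racker|radedoubler)|valueclick|weborama)[./:]"
)
anchored = re.compile("^" + BODY)
unanchored = re.compile(BODY)

# Hypothetical addresses: one that should be blocked, one that should not.
urls = [
    "http://ad.smartadserver.com/call/pubj/1",
    "http://www.example.com/images/logo.gif",
]

def run(pattern):
    """Return the match result for every test URL."""
    return [pattern.search(u) is not None for u in urls]

results_anchored = run(anchored)
results_unanchored = run(unanchored)

# Time 200 passes over the URL list for each variant.
t_anchored = timeit.timeit(lambda: run(anchored), number=200)
t_unanchored = timeit.timeit(lambda: run(unanchored), number=200)
print(f"anchored: {t_anchored * 1000:.2f} ms, unanchored: {t_unanchored * 1000:.2f} ms")
```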
Wladimir Palant

Post by Wladimir Palant »

Ok, I'm tired of explaining. Just wait for ABP 0.6.2.
Paulfox

Post by Paulfox »

pirlouy: I PM'd you as well. Here's how I see it:
a1.yimg may get you false positives - Yahoo is infernal for that "a1|i1|" thing. At one time I used 4 different LONG filters to handle Yahoo, now I stick with the default:
/yimg\.com(.*/adv/|/a[^u])(?!vision)/

Also, I believe your filter is misspelled in two places; it should be:
allsponsor, not allosponsor
questionpub, not gestionpub
Anyway, if I HAD to have one term I would optimize those as:

a1.yimg
addfreestats
allsponsor
baventures
bluestreak
casalemedia
comclick
daooda
estat
euroclick
falkag
fastclick
questionpub

becoming:

/(casalemedi|daood)a|(com|euro|fast)click|a(1\.yimg|ddfreestats|llsponsor)|b(aventures|luestreak)|estat|falkag|questionpub/
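A factored pattern like this is easy to get wrong by hand, so it is worth checking it against the term list. The Python sketch below does that. Note that the a(...) group used here includes ddfreestats so that all thirteen listed terms are covered.

```python
import re

# The thirteen terms from the list above.
terms = [
    "a1.yimg", "addfreestats", "allsponsor", "baventures", "bluestreak",
    "casalemedia", "comclick", "daooda", "estat", "euroclick",
    "falkag", "fastclick", "questionpub",
]

# The combined pattern, with ddfreestats in the a(...) group.
combined = re.compile(
    r"(casalemedi|daood)a|(com|euro|fast)click|a(1\.yimg|ddfreestats|llsponsor)"
    r"|b(aventures|luestreak)|estat|falkag|questionpub"
)

# Any term the combined pattern fails to reproduce exactly ends up here.
missing = [t for t in terms if combined.fullmatch(t) is None]
print(missing)  # []
```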

Personally I would break it into 3 since they're totally unrelated and some are .com and some are .net (not that you like using those suffixes, I know).

/[\W](cible|com|double|euro|fast|inter|precision|value)click[\W]/
Takes care of all the "clicks" on the planet that I know of.
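A small Python sketch of what the [\W] on either side does. The addresses are made up. The term has to be preceded and followed by a non-word character, which keeps it from matching inside a longer word.

```python
import re

clicks = re.compile(
    r"[\W](cible|com|double|euro|fast|inter|precision|value)click[\W]"
)

# "." before and "." after the term: matched.
assert clicks.search("http://ad.doubleclick.net/adj/x")
assert clicks.search("http://www.fastclick.com/w/get.media")

# "l" before "comclick" is a word character: not matched.
assert not clicks.search("http://welcomclick.example/page")

# "mouse" is not one of the listed prefixes: not matched.
assert not clicks.search("http://site.example/js/mouseclick.js")

# [\W] has to consume a character, so a term at the very end
# of the address is not matched either.
assert not clicks.search("http://site.example/?ref=valueclick")
```

The last case is a side effect worth knowing about: [\W] is not a zero-width boundary, so there must be an actual character on both sides of the term.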

/a(1\.yimg|ddfreestats|llsponsor)/
takes care of the "a" s.
What's left:

baventures
bluestreak
casalemedia
daooda
estat
falkag
questionpub

becomes:
/(casalemedi|daood)a|b(aventures|luestreak)|estat|falkag|questionpub/

Even safer would be add the suffixes and use the 3 this way:
/[\W](cible|com|double|euro|fast|inter|precision|value)click[\W]/
/(a(1\.yimg|ddfreestats|llsponsor)|b(aventures|luestreak)|casalemedia|questionpub)\.com/
/daooda|estat|falkag/


Why add suffixes? To eliminate possible false positives. Bear in mind that "estat" blocks a lot more than "estat.nastyad.botheringpirlouy.net". This kind of "terming" in a filter would be a nightmare, IMHO. I'd find a suffix for that bad boy or otherwise qualify it - or dump it. Then you're good to go! Cheers/p
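The false-positive risk can be shown with a short Python sketch. The addresses are made up, and estat.com is only an assumed domain used to show what a qualified term could look like.

```python
import re

bare = re.compile(r"estat")
# Hypothetical qualification: require the suffix and non-word
# characters on both sides.
qualified = re.compile(r"[\W]estat\.com[\W]")

innocent = "http://www.realestate.example/listing/42"
tracker = "http://prof.estat.com/js/tag.js"

# The bare term hits both addresses, including the unrelated one.
assert bare.search(innocent)
assert bare.search(tracker)

# The qualified term only hits the address it was written for.
assert not qualified.search(innocent)
assert qualified.search(tracker)
```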