Pacfiles and regex

Bethrezen

Post by Bethrezen »

Hi all,

I use a PAC file for content filtering because it works in all browsers, and I'm trying to clean up my filter list using some basic regex, but it's not working. Could someone have a look at this filter and tell me why it won't work? I can't figure out what I'm doing wrong.

Code:

|| shExpMatch(url, "/ad(brite|click|farm|revolver|server|tech|vert|vertising|banner)/")
Fox
Posts: 300
Joined: Sat Jun 10, 2006 3:05 pm
Location: Finland

Post by Fox »

I'm not sure I understood your message.
But why do you want to use regexp filters? They are hard to understand, and if one has a problem, you must fix, delete or disable the whole regexp filter.
According to this: http://adblockplus.org/en/deregifier

Code:

/ad(brite|click|farm|revolver|server|tech|vert|vertising|banner)/
is the same as these simple filters:

Code:

adbanner
adbrite
adclick
adfarm
adrevolver
adserver
adtech
advert
advertising
And I think it's better to put a dot or slash before every filter; otherwise they can match inside something like load..., download... and so on.
So these are better:
.adbrite
/adbrite
It's also better when simple filters have at least eight characters of unbroken text; read this: http://adblockplus.org/en/faq_internal#filters
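
To illustrate the point about a leading dot or slash, here is a small sketch in plain JavaScript (not Adblock Plus filter syntax). Both URLs are made up for the example, and the regular expressions are only a stand-in for how a substring filter behaves.

```javascript
// Hypothetical URLs, for illustration only.
var harmless = "http://example.com/uploadserver/file.zip";
var advert = "http://example.com/adserver/banner.gif";

// A bare filter matches anywhere, even in the middle of another word.
var bare = /adserver/;
// Requiring a dot or slash in front avoids that kind of false positive.
var prefixed = /[.\/]adserver/;

var bareHitsHarmless = bare.test(harmless);         // true: "uploadserver" contains "adserver"
var prefixedHitsHarmless = prefixed.test(harmless); // false
var prefixedHitsAdvert = prefixed.test(advert);     // true
```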
Stupid Head
Posts: 214
Joined: Sat Aug 26, 2006 8:11 pm
Location: USA

Post by Stupid Head »

It doesn't work because shExpMatch requires a shell expression, not a regular expression. I think you're limited to * and ? for that. You'll have to create a RegExp object if you want to use a regular expression. Also, you can use JavaScript features like arrays to build your PAC file.
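
A minimal sketch of the RegExp object approach, assuming the browser's PAC engine supports standard JavaScript regular expressions. The names adPattern and isAd are made up for this example; the alternatives are the ones from the filter in the question.

```javascript
// A regular expression object built from the filter in the question.
var adPattern = /ad(brite|click|farm|revolver|server|tech|vert|vertising|banner)/;

// Hypothetical helper: true if the URL contains one of the ad keywords.
function isAd(url) {
    // RegExp.test looks for a match anywhere in the string,
    // so no wildcards are needed around the pattern.
    return adPattern.test(url);
}
```

In the block list this could then be written as `|| isAd(url)` in place of the shExpMatch line.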
What, me worry?
Bethrezen

Post by Bethrezen »

Hi,

Stupid Head wrote:It doesn't work because shExpMatch requires a shell expression, not a regular expression.
If that's the case, then why does this one work?

(Note: this should all be one line, but I've broken it up a bit to stop the board running off the screen.)

Code:

|| shExpMatch(url, "/\bads\b|2o7|a1\.yimg|ad(brite|click|farm|revolver|server|
tech|vert)|at(dmt|wola)|banner|bizrate|blogads|bluestreak|burstnet|casalemedia|coremetrics|
(double|fast)click|falkag|(feedster|right)media|googlesyndication|hitbox|httpads|imiclk|intellitxt|
js\.overture|kanoodle|kontera|mediaplex|nextag|pointroll|qksrv|speedera|statcounter|tribalfusion|webtrends/")
Mine is just a shortened version of this one, and this one works but mine doesn't. The question is why? What's so different?

Fox wrote:I'm not sure I understood your message.
But why do you want to use regexp filters? They are hard to understand, and if one has a problem, you must fix, delete or disable the whole regexp filter.
According to this: http://adblockplus.org/en/deregifier

Code:

/ad(brite|click|farm|revolver|server|tech|vert|vertising|banner)/
is the same as these simple filters:

Code:

adbanner
adbrite
adclick
adfarm
adrevolver
adserver
adtech
advert
advertising
You have already answered your own question.

The reason I'm trying to use regular expressions is so I can avoid having individual filters for everything, because that makes for a very long filter list, and that is causing slowdowns.

I'd rather just group similar filters together. It's quicker and more efficient, because as soon as the filter gets a hit it stops checking and goes on to the next filter, whereas with individual filters it must check every filter against the content that is loading. So as you can see, with very big filter lists this is inefficient.

Fox wrote:So these are better:
.adbrite
/adbrite
It's also better when simple filters have at least eight characters of unbroken text; read this: http://adblockplus.org/en/faq_internal#filters
As a rule of thumb I would agree that simple is better, but in this case simple isn't good enough: the filter list is just too long and is slowing down browsing unnecessarily.
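
On the point about stopping at the first hit: in JavaScript a chain of `||` conditions also stops evaluating as soon as one of them is true, so individual filters joined with `||` are not all checked when an earlier one matches. The sketch below counts the checks to show this. The helper contains and the URLs are hypothetical, and it says nothing about which style is faster overall.

```javascript
var checks = 0;

// Hypothetical helper: counts how many filters are actually evaluated.
function contains(url, text) {
    checks++;
    return url.indexOf(text) !== -1;
}

function isBlocked(url) {
    return false
        || contains(url, "adbrite")
        || contains(url, "adclick")
        || contains(url, "adfarm")
        || contains(url, "adserver");
}

checks = 0;
var hit = isBlocked("http://example.com/adclick/1.gif");
var checksOnHit = checks;   // 2: evaluation stops at "adclick"

checks = 0;
var miss = isBlocked("http://example.com/index.html");
var checksOnMiss = checks;  // 4: a URL that matches nothing is checked against every filter
```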
Fox
Posts: 300
Joined: Sat Jun 10, 2006 3:05 pm
Location: Finland

Post by Fox »

I somehow thought that you were using, or were going to start using, Adblock Plus.
(And Adblock Plus works faster with simple filters.)
Bethrezen

Post by Bethrezen »

Yet more weirdness. Having played around with this some more, I modified my first filter a little, from this

Code:

|| shExpMatch(url, "/ad(brite|click|farm|revolver|server|tech|vert)/")
to

Code:

|| shExpMatch(url, "/intellitxt|ad(brite|click|farm|revolver|server|tech|vert)|kontera/")
and the bit in the middle

Code:

ad(brite|click|farm|revolver|server|tech|vert)
worked as expected, BUT something still isn't right, because the filters kontera and intellitxt don't work.

What's going on? What am I missing? Could this be something to do with the code for the PAC file being wrong?

Here is the PAC file I'm using, minus all the filters:

Code:

function FindProxyForURL(url, host)
{
    // Allowed List
    if (0
        || {{Insert Filters Here}}
    )
        return "DIRECT";

    // Block List
    if (0
        || {{Insert Filters Here}}
    )
        return "PROXY 0.0.0.0:80";
    else
        return "DIRECT";
}
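
For comparison, here is a sketch of the same structure with the filters kept in arrays, as suggested earlier in the thread. It assumes the PAC engine supports JavaScript RegExp objects. The names allowList, blockList and matchesAny, and the example.com entry, are made up for illustration and are not part of the original file.

```javascript
// Hypothetical lists; replace the entries with your own patterns.
var allowList = [
    /^https?:\/\/([^\/]*\.)?example\.com\//
];

var blockList = [
    /ad(banner|brite|click|farm|revolver|server|tech|vert)/,
    /intellitxt|kontera/
];

// Returns true if any pattern in the list matches the URL.
function matchesAny(url, list) {
    for (var i = 0; i < list.length; i++) {
        if (list[i].test(url)) {
            return true;
        }
    }
    return false;
}

function FindProxyForURL(url, host) {
    // Allowed list is checked first, as in the original file.
    if (matchesAny(url, allowList)) {
        return "DIRECT";
    }
    // Blocked requests are sent to a proxy address that does not exist.
    if (matchesAny(url, blockList)) {
        return "PROXY 0.0.0.0:80";
    }
    return "DIRECT";
}
```

Adding or removing a filter then only means editing one of the arrays.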
Adblock Plus Fan
Posts: 1255
Joined: Sat Feb 24, 2007 11:08 am

Post by Adblock Plus Fan »

Bethrezen wrote:and that is causing slowdowns

I'd rather just group similar filters together its quicker and more efficient
You should have a read of this thread: http://adblockplus.org/forum/viewtopic.php?t=1222
Wladimir Palant wrote:even if you add a hundred simple filters the processing will still be faster than if you add a single regexp.
@Bethrezen, or maybe you know something about ABP that Wladimir doesn't? :P
Wladimir Palant

Post by Wladimir Palant »

@Adblock Plus Fan: This thread is *not* about Adblock Plus.
Adblock Plus Fan
Posts: 1255
Joined: Sat Feb 24, 2007 11:08 am

Post by Adblock Plus Fan »

Oops... somehow I missed that. Sorry Bethrezen. :P
Bethrezen

Post by Bethrezen »

Hi,

Don't worry about it. I know this is the forum for Adblock Plus, but I figured it was a fitting place to ask since it's still about ad blocking.

So, any ideas, anyone?
Bethrezen

Post by Bethrezen »

Hi all,

Having had more time to play around with this, I finally cracked it.

All I needed to do was add a couple of wildcard characters.

So

Code:

|| shExpMatch(url, "/ad(brite|click|farm|revolver|server|tech|vert)/")
becomes

Code:

|| shExpMatch(url, "*/ad(banner|brite|click|farm|revolver|server|tech|vert|vertising)/*")
What this filter says is: match

Code:

*/adbanner/*
*/adbrite/*
*/adclick/*
*/adfarm/*
*/adrevolver/*
*/adserver/*
*/adtech/*
*/advert/*
*/advertising/*
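
One possible explanation for why the wildcards matter, offered as an assumption rather than something confirmed in this thread: some browsers appear to implement shExpMatch by turning the shell expression into a regular expression that must match the whole URL. The function shExpMatchSketch below is a hypothetical stand-in written for this example, not the browser's real code. Under that assumption, a pattern without `*` at both ends can never match a full URL, while the inner alternatives of a longer `a|b|c` pattern are not tied to the start or end and so still match.

```javascript
// Hypothetical stand-in for shExpMatch, NOT the browser's real code.
// Assumption: "." is escaped, "*" becomes ".*", "?" becomes ".",
// and the result has to match the whole URL from start to end.
function shExpMatchSketch(url, pattern) {
    var source = pattern
        .replace(/\./g, "\\.")
        .replace(/\*/g, ".*")
        .replace(/\?/g, ".");
    return new RegExp("^" + source + "$").test(url);
}

var adUrl = "http://example.com/adserver/banner.gif";
var konteraUrl = "http://example.com/kontera.js";

// Without wildcards the whole URL would have to be exactly "/adserver/" or "/adtech/".
var withoutWildcards = shExpMatchSketch(adUrl, "/ad(server|tech)/");    // false
var withWildcards = shExpMatchSketch(adUrl, "*/ad(server|tech)/*");     // true

// In "/intellitxt|ad(...)|kontera/" only the first alternative is tied to the
// start and only the last one to the end; the middle one can match anywhere.
var middleWorks = shExpMatchSketch(adUrl, "/intellitxt|ad(server|tech)|kontera/");      // true
var edgeFails = shExpMatchSketch(konteraUrl, "/intellitxt|ad(server|tech)|kontera/");   // false
```

Since grouping with `( | )` is not part of plain shell expression syntax, writing each alternative as its own wildcard pattern, as in the list above, is probably the safer choice if the file has to work in every browser.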