So, those regexps...

itr
Posts: 3
Joined: Mon Jun 23, 2008 1:04 pm

Post by itr »

Hi, original Adblock user here, finally switching to Plus.
In the old days, regexps were the way to go; now I'm under the impression that they're not?
I have a few MONSTER filters that I've been using in Plus, and I'm noticing no significant slowdown compared to Adblock.
But I'd like to know precisely whether these filters would benefit from being split up, and how much time that would save.

I'm reluctant to... these filters are like my babies. I've been crafting them for years to my liking, and they're very effective. But here are a few:

/(120|125|160|170|180|201|234|250|300|350|390|468|500|600|728|768)x(40|60|88|90|100|120|167|175|176|190|200|250|300|400|410|600)/

/(page|paid|place|tower|web|php|leftnav|brink|context|xmradio|put|promo|pokergrub|mptv|iiw|right|crystal|details|zoffsitetop|iframe|hover|integrity|refreshAll|random|travelot\.|cube|grey|blogsmith|sql|online|php|(real)?text|forum|house|popin|display|directle|primary|side|blog|http|new|flash|layer|yotoshi|mspace|online|mngi|postnews|_bar|sponsor|bullet|secure|azoogle|simple|rsi|internal|cast|section|banner|rolling|google|nk|Dynamic|hosting|value|popup|parser|sponsoredResults|quigo|serve|top_|process|Linked|top|bottom|open|contentdetail)(-|_)?ads?/

/ad_?(cycle|dfreestats|farm|i?frame|vert|images?|sdk|sag|test|trix|zones?|remote|brite|ultfriendfinder|-?serv(e|er|ing)?|view|(c|k)lick|info|type|verts|campaign|engage|fetch|(h|l)eader|js|layer|mar?ker|revenue|vimg|words|reporting|link|content|popup|butler|-partner|sen(s|c)e|v_vert|sys\.|common|net\.biz|Count=|srv\.|tags|tracking|x\.js|vSearch\.htm|start|juggler|-logics|infinity|data|targ|.?text|x.dll|-[a-z]\.php|bureau|counter|features|placement|revolver|solution|sence|channel|vlinks|banner|img|middle|info|req|init|here|expansion|down|label|key|links|tology|tower|panel|desktop|count|out\.php|mcncserve-|origin|refresh|manager|knowledge|log\.php|display|renaline(\.cz|sk\.sk)|sales|area|rotate|tag|code\.ws|Wiz|landpro|-forward|ultadworld\.com|icio\.com|code\.js|agencypro|-flow|interax|\.cgi\?|TopBanner|gardener|fusion|medium|sales|Clutter|square|\.asp|web2|unitid|newsday|RunID|url|stats\.|ffadult|group|interax)/

/banner(source|space|_?ads?|\.exclude\.|type|_promotion|copy|add|swap|-?rotator|s_(right|top)|.*?id=|boxes|\.cgi|links|_?codes?|_?i?frame|[0-9]*x[0-9]*\.js|_anim|_[0-9]\.php|_control|connect|_?exchange)/
Wladimir Palant

Post by Wladimir Palant »

Usually I would recommend http://adblockplus.org/en/deregifier, but in this case it will create lots of small and lots of unnecessary filters, which will actually make performance worse. You really don't want to block strings like "nkad"; they are too likely to cause false positives. So rather than converting these filters automatically, you will have to replace them with simple filters that refer to longer strings (strings that are less likely to appear in legitimate content). Note that the performance optimizations in Adblock Plus only work if you use simple filters that have at least 8 characters in a row not broken up by wildcards.
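
A rough way to check whether a filter is long enough for that optimization, sketched in Python. The helper names and the treatment of `*` as the only wildcard are assumptions made for illustration and are not Adblock Plus code; only the 8-character threshold comes from the note above.

```python
MIN_RUN = 8  # minimum unbroken literal run, as stated in the post above


def is_regexp_filter(filter_text: str) -> bool:
    # Assumption: a filter wrapped in slashes is a regular expression.
    return (
        len(filter_text) >= 2
        and filter_text.startswith("/")
        and filter_text.endswith("/")
    )


def longest_literal_run(filter_text: str) -> int:
    # Assumption: "*" is the only wildcard that breaks up a literal run.
    return max(len(part) for part in filter_text.split("*"))


def can_use_shortcut(filter_text: str) -> bool:
    # Regexp filters never qualify; simple filters need a long enough run.
    if is_regexp_filter(filter_text):
        return False
    return longest_literal_run(filter_text) >= MIN_RUN


for candidate in ["nkad", "ad*serv", "banner_ads", "/(page|paid)(-|_)?ads?/"]:
    print(candidate, can_use_shortcut(candidate))
```

Of these four, only "banner_ads" has a long enough literal run; "nkad" and "ad*serv" are too short, and the last one is a regexp.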

In general, Adblock Plus needs 162 ms on my computer to process two (not particularly small but not extraordinarily large either) web pages. The same web pages need only half of that time with EasyList, which has hardly any filters that are too short. That is a difference of about 40 ms per page: not much, but it can be noticed.
itr

Post by itr »

So, for example, would EasyList, which has 85 filters beginning with ad*, load a page with a few matches on it more quickly than my huge regexp ad* filter would?
Wladimir Palant

Post by Wladimir Palant »

Yes, definitely. As long as Adblock Plus can create a shortcut for the filter (which needs at least 8 characters), hundreds of those filters can be processed faster than a single regexp.
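
To illustrate why a shortcut can make many simple filters cheap, here is a minimal Python sketch of one possible scheme: each filter is indexed under an 8-character piece of its text, and a URL is checked by looking up each of its 8-character windows in that index. This is an assumed design for illustration only, not the actual Adblock Plus implementation; `build_index`, `filter_matches`, and `find_match` are hypothetical names.

```python
SHORTCUT_LEN = 8  # shortcut length mentioned in the post above


def build_index(filters):
    # Map an 8-character literal piece of each filter to the filters using it.
    index = {}
    for text in filters:
        for part in text.split("*"):
            if len(part) >= SHORTCUT_LEN:
                key = part[:SHORTCUT_LEN].lower()
                index.setdefault(key, []).append(text)
                break  # one shortcut per filter is enough
        # Filters without a long enough run are left out of this index;
        # they would have to be checked one by one on a slower path.
    return index


def filter_matches(filter_text, url):
    # Simple wildcard match: the literal parts must appear in order.
    lowered = url.lower()
    pos = 0
    for part in filter_text.lower().split("*"):
        found = lowered.find(part, pos)
        if found < 0:
            return False
        pos = found + len(part)
    return True


def find_match(index, url):
    # Slide an 8-character window over the URL; each window costs one
    # dictionary lookup no matter how many filters are in the index.
    lowered = url.lower()
    for i in range(len(lowered) - SHORTCUT_LEN + 1):
        for candidate in index.get(lowered[i:i + SHORTCUT_LEN], ()):
            if filter_matches(candidate, url):
                return candidate
    return None


index = build_index(["/adframe.", "banner_ads", "ad*serv"])
print(find_match(index, "http://example.com/banner_ads/top.gif"))
print(find_match(index, "http://example.com/img/logo.png"))
```

In this sketch the cost of a check depends on the length of the URL rather than on the number of indexed filters, which is the general idea behind preferring many long simple filters over one large regexp.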