Reducing FilterSet by Effectiveness Rating

Paulfox

Post by Paulfox »

My current filterset is 46 lines long. It relies on lots of compilations and on "like characters/attributes" to combine regexes into an easy to see/sort/alter/append format. It works great.

However, by looking at which filters block what, I'm finding that only 23 or so are actually being "used" in day-to-day "surfing" of the sites I regularly visit.

What would the advice of forum members (and mcm/G) be on this? Reduce my filterset by half and then add blocks as needed in future usage, or keep the 46 lines to "never see an ad" (which I don't!)?

Since 0.5.1.1 en_us is so lean and mean - I'm thinking I'd do the same to my filterset - after all, FX has to "look through" those 46 lines on each site visit - surely 23 vs 46 would help loading time, although I don't know to what degree in "real life" speed gain.

Thanks in advance; I'll bookmark this thread and check responses over the next few days. I'm leaning toward stripping the "23 non/seldom used" and adding filters according to my surfing habits as time passes - would appreciate your opinions on this.

Also handy would be a couple of "test sites" if any of you have them - I remember seeing a good one on here a couple of weeks ago but couldn't find it in a search. Something with lots of "nasties" to test what filters I might need to add to the "leaner 23." Cheers and Happy Holiday! /p

Addendum: Found that test site on Google:
http://p2.forumforfree.com/good-site-to ... kplus.html
mcm
Posts: 359
Joined: Sat Jun 10, 2006 2:36 am

Post by mcm »

It is a good idea, however I may have a filter like "/\Wads?(vert|click)?\W/", and it could be that it matches "advert" or "adclick" more often than it does "ads". The better approach would be to also output to the log file the URL that the filter matched, and even which part of the URL matched. That way you could also trim the heavily used filters down to only the parts that are doing the work.
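To illustrate the kind of log described above, here is a minimal sketch in Python rather than in the extension itself. It assumes the filter behaves the same as a Python regex, and the URLs are made up for the example; `log_matches` is a hypothetical helper, not anything Adblock Plus provides.

```python
import re
from collections import Counter

# The filter quoted in the post, written as a Python regex (assumed equivalent).
FILTER = re.compile(r"\Wads?(vert|click)?\W")

def log_matches(urls):
    """Return (url, matched_text) pairs plus a tally of which keyword matched."""
    log = []
    tally = Counter()
    for url in urls:
        m = FILTER.search(url)
        if m is None:
            continue  # this URL would not be blocked by the filter
        matched = m.group(0)
        log.append((url, matched))
        # Drop the leading and trailing non-word characters to get the keyword.
        tally[matched[1:-1]] += 1
    return log, tally

# Hypothetical URLs, for illustration only.
SAMPLE_URLS = [
    "http://example.com/ads/banner.gif",
    "http://example.com/advert/top.gif",
    "http://example.com/adclick/?id=1",
    "http://example.com/advert/side.gif",
    "http://example.com/download/file.zip",
]

log, tally = log_matches(SAMPLE_URLS)
for url, part in log:
    print(part, "<-", url)
print(tally.most_common())
```

With a tally like this you can see which alternatives in the filter are earning their keep and which could be dropped.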

The problem with all this is that, unlike whitelisting, having a blocking log will impact performance a bit, which was the main reason I never implemented it for Adblock Plus 0.5.
ecjs
Posts: 170
Joined: Sun Jun 11, 2006 7:39 pm

Post by ecjs »

Paulfox

Post by Paulfox »

I agree with mcm that another "ongoing" background process is unneeded. I imported my filterset(s) and options into a spreadsheet, and color coded them as I randomly hit 10-20-50-100-200 sites. Filters that often gave good results were colored green, moderate ones yellow, and the "dogs" red. A "dog" was one that came up 0 - 1 times over a couple of weeks.
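For anyone who would rather script the color coding than do it by hand, a rough sketch follows. Only the "dog" rule (0 - 1 hits) comes from the description above; the cut-off between yellow and green (`good_at`), the filter strings and the hit counts are all assumptions made up for the example.

```python
def rate_filter(hits, good_at=10):
    """Color-code a filter by how often it matched.

    0 - 1 hits is a "dog" (red), as described in the post. The good_at
    cut-off between yellow and green is an assumed value.
    """
    if hits <= 1:
        return "red"
    if hits >= good_at:
        return "green"
    return "yellow"

# Hypothetical filters and hit counts gathered over a couple of weeks.
hit_counts = {
    r"/\Wads?(vert|click)?\W/": 57,
    r"/banner\d*\./": 6,
    r"/popunder/": 1,
    r"/sponsor_old/": 0,
}

ratings = {f: rate_filter(n) for f, n in hit_counts.items()}
# Everything that is not a "dog" stays in the leaner list.
keep = [f for f, color in ratings.items() if color != "red"]

for f, color in ratings.items():
    print(color.ljust(6), f)
```

Before deleting a red filter outright, it is worth checking whether part of it can be folded into a green one, which is how the list was whittled down here.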

Using that I've found elements of dog filters, incorporated them into good ones that "were similar," and whittled the list down from 46 to 33, including 4 "personal." My standalone list is 29 lines plus 4 "specials," and it whizzes along in comparison.