Blocking objects based on file size.

LAN_deRf_HA
Posts: 2
Joined: Mon Sep 29, 2008 2:52 pm

Post by LAN_deRf_HA »

I couldn't find anything on this, but I probably just didn't search well enough... Basically I want to know if I can block an image that's repeated all over a page with different file names and dynamically generated URLs. I figure the only way to do it, short of image recognition software, is to block it based on its exact file size. Can this be done with ABP?
Hubird
Posts: 2850
Joined: Thu Oct 26, 2006 2:59 pm
Location: Australia

Post by Hubird »

No, you can't block items based on file size, but if you post a link to the site someone might be able to come up with something.
LAN_deRf_HA
Posts: 2
Joined: Mon Sep 29, 2008 2:52 pm

Post by LAN_deRf_HA »

It's a rather specific case; I'm not sure you'll find ads doing anything like this. The site is photobucket, via http://members.home.nl/bas.de.reuver/files/fusker.html. You use it to scan for photos in private buckets based on file naming schemes you've observed in photos the person has posted publicly. I've been running into very challenging naming schemes lately (the first 3 digits increase conventionally, but a randomized 5-digit number follows) that require searches in the hundreds of thousands. That has become impractical now that photobucket serves a small generic image whenever the file you requested never existed. Scrolling through just 10 thousand of these dud images looking for one picture took me 10 minutes; any faster and my eyes wouldn't catch it flashing by. Seeing only the real images and nothing else would be vastly more efficient.
mrbene
Posts: 173
Joined: Tue Apr 10, 2007 10:09 pm
Location: Seattle, WA, USA

Post by mrbene »

Nice hack!

('hack' indicating 'creative use of tools to bypass existing boundaries'). You can't know the size of a file from within the web browser before it's downloaded - though if you were to write your own tool you could probably inspect the HTTP headers and make a best guess based on the 'Content-Length' header. I can't remember the full parameters of cURL, but I don't think it exposes the ability to abort based on HTTP headers.
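To illustrate the 'Content-Length' idea, here is a rough Python sketch rather than anything ABP or cURL does for you. The function names and the placeholder size are made up for the example, and it assumes the server actually sends a 'Content-Length' header, which it may not (for instance with chunked transfers).

```python
from typing import Mapping, Optional
import urllib.request


def content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Content-Length as an int, or None if missing or malformed."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "content-length":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def looks_like_placeholder(headers: Mapping[str, str], placeholder_size: int) -> bool:
    """Best guess only: True when the advertised size equals the known dud size."""
    return content_length(headers) == placeholder_size


def head_headers(url: str) -> Mapping[str, str]:
    """Fetch only the response headers with a HEAD request (no body download)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

If the header is missing, looks_like_placeholder() returns False, so the file is treated as a possible real image and still has to be checked another way.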

What you can do is bypass the web browser completely and use cURL instead of the site you've linked, then inspect the file sizes on disk - since you're downloading them anyway in the web browser, you're simply removing the CPU cost of rendering all of 'em.
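The "inspect the file sizes on disk" step could look something like this in Python. It is only a sketch: split_by_size, the folder name, and the byte count are invented for the example, and it assumes every dud image has exactly the same size, which you would need to confirm by checking one by hand.

```python
from pathlib import Path
from typing import List, Tuple


def split_by_size(folder: str, placeholder_size: int) -> Tuple[List[Path], List[Path]]:
    """Split the files in `folder` into (keep, duds) by exact byte size."""
    keep: List[Path] = []
    duds: List[Path] = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        if path.stat().st_size == placeholder_size:
            duds.append(path)  # same size as the generic "not found" image
        else:
            keep.append(path)
    return keep, duds


if __name__ == "__main__":
    # Hypothetical values: a local "downloads" folder and a 2,345-byte dud image.
    folder = Path("downloads")
    if folder.is_dir():
        keep, duds = split_by_size(str(folder), 2345)
        print(f"{len(keep)} files to look at, {len(duds)} placeholders skipped")
```

An exact size match can occasionally throw away a real file that happens to be the same number of bytes, so it is safer to list the duds than to delete them outright.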