Adblocking and Counter-Blocking: A Slice of the Arms Race

Rishab Nithyanand

Stony Brook University

Sheharbano Khattak

University of Cambridge

Mobin Javed

University of California, Berkeley

Narseo Vallina-Rodriguez

International Computer Science Institute

Marjan Falahrastegar

Queen Mary University of London

Julia E. Powles

University of Cambridge

Emiliano De Cristofaro

University College London

Hamed Haddadi

Queen Mary University of London

Steven J. Murdoch

University College London

Abstract

Adblocking tools like Adblock Plus continue to rise in

popularity, potentially threatening the dynamics of ad-

vertising revenue streams. In response, a number of

publishers have ramped up efforts to develop and de-

ploy mechanisms for detecting and/or counter-blocking

adblockers (which we refer to as anti-adblockers), ef-

fectively escalating the online advertising arms race. In

this paper, we develop a scalable approach for identi-

fying third-party services shared across multiple web-

sites and use it to provide a ﬁrst characterization of anti-

adblocking across the Alexa Top-5K websites. We map

websites that perform anti-adblocking as well as the en-

tities that provide anti-adblocking scripts. We study the

modus operandi of these scripts and their impact on pop-

ular adblockers. We ﬁnd that at least 6.7% of websites in

the Alexa Top-5K use anti-adblocking scripts, acquired

from 12 distinct entities – some of which have a direct

interest in nourishing the online advertising industry.

1 Introduction

Today’s web ecosystem is largely driven by online adver-

tising. However, recent years have seen a large number

of users turn to adblocking and tracker-blocking tools

for the purposes of improving their web-browsing expe-

rience, maintaining privacy, and more recently to pro-

tect themselves against malware [21, 25]. With a recent

study estimating the number of active adblock users to be

198M and revenue losses due to adblockers at $22B [22],

the threat posed by adblockers to the online advertising

revenue model has moved from mildly concerning to ex-

istential. In response, publishers have started to actively

detect users of adblockers, and subsequently block them

or otherwise coerce them to disable the adblocker – in

the rest of the paper, we refer to these practices as anti-adblocking. (Footnote: While adblocking differs from tracker-blocking, to ease presentation, we refer to tools that provide any of these properties as adblockers.)
Most recently, this practice gained wide attention with the endorsement of the Internet Advertising Bureau (IAB) when, in March 2016, it released a

primer on how to deal with users of adblockers, as well

as a semi open-source script

for detecting the use of ad-

blockers [12]. The tension between key stakeholders in

this ecosystem – publishers, users, and a plethora of in-

termediate beneﬁciaries – forms part of what has been

dubbed the adblocking arms race [26].

Motivation. While incidents of anti-adblocking [3, 6,

24, 26], and the legality of such practices [4, 19, 28],

have received increasing attention, current understanding

thereof is limited to a few forums [3] and user-generated

reports [6]. As a result, we lack quantiﬁable insights into

key questions such as: how prevalent nowadays are such

practices on the Web? Are certain categories of web-

sites more likely to employ anti-adblockers? Who are

the main suppliers of anti-adblocking services? What

mechanisms do these employ to detect the presence of

adblockers? Is it possible for adblockers to counter-block

anti-adblockers? What are common responses after pos-

itive detection of adblockers and their impact on end-

users? In this work, we address these questions by pre-

senting the ﬁrst characterization of anti-adblocking.

Roadmap. We start with characterizing anti-adblocking

on the Web by identifying anti-adblocking scripts across

Alexa Top-5K sites. To this end, we develop a scal-

able technique to identify popular third-party services

that are shared across multiple websites, and utilize it

to ﬂag anti-adblocking scripts. We then map out the

entities that serve anti-adblocking scripts and the web-

sites that use these scripts. We ﬁnd that at least 6.7%

of Alexa Top-5K websites conduct some form of anti-

adblocking by downloading 14 scripts from 12 unique

domains most of which belong to ad services, while

one speciﬁcally offers anti-adblocking services. Most

of the anti-adblocking websites represent popular categories such as news, blogs, and entertainment. (Footnote on the IAB script: The script was only made available to members of the IAB.)
We manually visit a sample website for each anti-adblocking script and

ﬁnd that the arms race has already entered the next round:

at least one of three popular browser extensions (Ad-

Block Plus, Ghostery, Privacy Badger) can counter-block

half of the anti-adblocking scripts. We conclude with a

discussion of the anti-adblocking arms race in terms of

ethics and legality, also enumerating existing proposals

that aim to achieve a sustainable and unintrusive online

advertising model.

2 Related Work

Raﬁque et al. [23] measure anti-adblocking as an in-

cidental aspect of a broader study of malicious and

deceptive advertisements, malware and scams on free

live-streaming services. They ﬁnd that anti-adblocking

scripts were used by 16.3% of the 1,000 domains they

crawled, which is more than twice what we find in
the Alexa Top-5K (6.7%), although not surprising given
those services' heavy use of deceptive ads.

Our paper also complements work quantifying and

characterizing non-transparent third-party web services,

as well as revealing users’ differential treatment. For

example, Ikram et al. [13] proposed a machine learn-

ing approach to characterize JavaScripts used for online

tracking and those used for providing website function-

ality. Their work allows privacy-enhancing tools to more

selectively block JavaScripts without breaking website

functionality. Acar et al. [1] and Liu et al. [16] mea-

sure the prevalence of tracking across large datasets of

websites, while Mayer [17] studies the effectiveness of

some adblocking and anti-tracking tools against those

sites. Khattak et al. [14] assess discrimination against

Tor users at the network and application layer. Various

studies investigate price discrimination [11, 20] and its

methods [7] employed by online marketplaces, and there

are other studies on ﬁlter bubbles – the effect where high

web personalization leads to users being locked in infor-

mation silos [10, 29].

All of these studies illuminate the nature and scale

of opaque practices on the web, informing our under-

standing of complex and multidimensional ecosystems.

Our work complements previous studies by presenting a

novel technique to identify shared objects across multi-

ple websites at scale, and utilizing this approach to pro-

vide a ﬁrst look at how the Web employs anti-adblocking

techniques.

3 Methodology

This section presents our method for identifying third-

party services that are shared between multiple websites.

We describe the technique in the context of identifying

shared anti-adblocking JavaScripts (JS). The premise of

our approach is that by discovering similar objects (in

our case, JavaScripts) that are loaded by multiple web-

sites, we can infer the presence of a common third-party

JS, its functionality and its source.

Crawler overview. We rely on a Selenium-based web

crawler to generate the set of JavaScripts to analyze.

We load each website in our dataset with four browser

modes – vanilla Firefox (with no extensions), Firefox

with AdBlock Plus, Firefox with Ghostery, and Firefox

with Privacy Badger. For each page load, we capture

screenshots, HTML source code, and responses to all

requests generated by the browser. We extract all the

text between <script> and </script> tags from the

HTML and label them as embedded JS. Similarly, we

detect all JS objects in the collected responses and label

them as downloaded JS. In total, the Top-5K Alexa web-

sites generate over 200K individual JS ﬁles when loaded

with the vanilla Firefox browser.
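As a minimal sketch of the embedded-JS extraction step described above, the following uses Python's standard `html.parser` module. The class name `EmbeddedScriptExtractor` and the helper `extract_embedded_js` are hypothetical names chosen for illustration; the crawler's actual implementation is not described at this level of detail.

```python
from html.parser import HTMLParser


class EmbeddedScriptExtractor(HTMLParser):
    """Collect the text found between <script> and </script> tags."""

    def __init__(self):
        super().__init__()
        self._buffer = None   # list of text chunks while inside a <script>
        self.embedded = []    # one string per non-empty embedded script

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            body = "".join(self._buffer).strip()
            if body:          # <script src=...></script> has no body
                self.embedded.append(body)
            self._buffer = None


def extract_embedded_js(html):
    """Return the list of embedded JS bodies found in an HTML string."""
    parser = EmbeddedScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.embedded
```

Script tags that only reference an external file contribute nothing here; in the paper's terminology those are downloaded JS and are taken from the captured responses instead.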

Identifying JS objects with common sources. We for-

mulate our problem of ﬁnding groups of similar JS as a

maximal clique ﬁnding problem [5]. We consider each

JS ﬁle loaded by a website to be a node in a graph. If two

nodes are within some margin of similarity of each other

(we deﬁne our similarity metric below), we say there is

an edge between them. We extract classes of JS that have

a common source by identifying all maximal cliques in

this graph. By intentionally focusing on ﬁnding similar

JS (rather than identical JS) we allow for the grouping

of objects that differ only slightly because they contain

website-speciﬁc identiﬁers, features and properties.
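The clique-finding step can be illustrated with a short sketch of the Bron-Kerbosch algorithm [5] with pivoting. This is an illustrative version under the assumption that the similarity graph fits in memory as an adjacency dictionary; the function name `maximal_cliques` and the toy graph are hypothetical, not the authors' code.

```python
def maximal_cliques(adjacency):
    """Yield every maximal clique of an undirected graph.

    adjacency maps each node to the set of its neighbours
    (an edge means two scripts are similar enough).
    """
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield sorted(clique)      # nothing can extend it: maximal
            return
        # Pivot on the node with most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(candidates | excluded,
                    key=lambda n: len(adjacency[n] & candidates))
        for node in list(candidates - adjacency[pivot]):
            yield from expand(clique | {node},
                              candidates & adjacency[node],
                              excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adjacency), set())


# Toy graph: scripts a, b, c are mutually similar; d is similar only to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = sorted(maximal_cliques(graph))
```

On the toy graph, the two maximal cliques are `a, b, c` and `c, d`, showing that one script may belong to more than one clique.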

Choice of similarity metric and threshold. In order

to add an edge between two nodes in the graph (i.e., to

indicate that two JS ﬁles in two different websites are

similar), we need to deﬁne a metric for similarity, and a

suitable threshold for this metric. To measure the sim-

ilarity of two JS ﬁles, we use Term Frequency–Inverse

Document Frequency (TF-IDF) to generate a vector of

keyword weights for each JS ﬁle after ﬁltering out JS re-

served words, such as function and var. We then

use the cosine similarity metric to measure the similarity

of the two keyword weight vectors. Similar approaches

using both TF-IDF and cosine similarity have been used

by the information retrieval community for topic identi-

ﬁcation and similarity checking of source-code [15, 30].

We note that this method is particularly well suited to

our task compared to other string matching approaches

because it is:

• White-space insensitive: Many websites perform

script miniﬁcation using different libraries, yielding

different indentation and white-spacing practices.

Our approach is unaffected by these complications.

[Figure 1: Effect of the similarity threshold parameter on the True Positive Rate (TPR) and the number of maximal cliques. x-axis: similarity threshold (0.4 to 1.0); y-axes: true positive rate (%) and number of cliques returned.]

• Position insensitive: In scripts that have several

functionalities (e.g., tracking and ad-block detec-

tion), the position of each speciﬁc function is irrel-

evant to the similarity score.

• Reasonably resistant to noise: Small changes (e.g.,

website speciﬁc identiﬁers) have little impact on the

ﬁnal similarity score.
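The similarity computation can be sketched in plain Python as follows. The text does not state which TF-IDF variant or tokenizer was used, so the smoothed IDF formula, the identifier regex, and the short `RESERVED` list below are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Illustrative subset of JavaScript reserved words (assumption).
RESERVED = {"function", "var", "return", "if", "else", "for", "while", "new", "this"}


def tokenize(js):
    """Split JS source into identifier-like keywords, dropping reserved words."""
    return [t for t in re.findall(r"[A-Za-z_$][A-Za-z0-9_$]*", js)
            if t not in RESERVED]


def tfidf_vectors(scripts):
    """Return one sparse {keyword: weight} vector per script."""
    docs = [Counter(tokenize(s)) for s in scripts]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (assumption): common terms keep a small non-zero weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    return [{t: c * idf[t] for t, c in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


minified = "function check(){var bait=document.getElementById('ads');return bait.offsetHeight;}"
spaced = "function check() {\n  var bait = document.getElementById('ads');\n  return bait.offsetHeight;\n}"
other = "function track(){var img=new Image();img.src='https://t.example/p';}"
v_min, v_spaced, v_other = tfidf_vectors([minified, spaced, other])
```

Because the vectors are bags of keywords, the minified and the indented variant map to the same vector, while the unrelated script shares no keywords with them.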

In order to determine a similarity score threshold, we

perform a series of experiments on a small dataset of

4.4K JS ﬁles extracted from the Alexa Top-100 web-

sites. In each experiment, we set a similarity thresh-

old between 0.40 and 1.00 and compute the cliques in

each of the corresponding graphs. We then manually in-

spect the cliques extracted at each threshold to identify

the fraction of cliques containing JS with identical func-

tionality and sources. Using this approach, we ﬁnd that at

a similarity threshold of 0.80, 17/20 cliques returned by

our program contain scripts with identical functionality

and sources, i.e., achieving True Positive Rate (TPR) of

0.85. In Figure 1, we illustrate the change in TPR along

with the number of cliques returned as the threshold in-

creases. Although thresholds above 0.90 yield TPR=1.0,

the number of cliques returned drops signiﬁcantly, which

will result in lower True Negative Rates (TNR). There-

fore, following a conservative stance, we use a threshold

of 0.80 for the remainder of our experiments.

Improving scalability. Our approach involves comput-

ing the cosine similarity between each pair of keyword

weight vectors, thus requiring $O(n^2)$ vector multiplications for n JS files. Given the large number of JS files

used by websites (e.g., the Alexa Top-5K sites contained

over 200K JavaScripts), this may not scale with large

datasets. Therefore, we use a set of heuristically devel-

oped ﬁlters to eliminate comparisons between scripts that

are unlikely to ever be part of the same clique:

• Word-count ﬁlter: We avoid comparing scripts with

signiﬁcant word-count difference. Speciﬁcally, if a

pair of scripts has a word-count ratio higher than

1.50, we assume that they are unlikely to be a part

of the same clique and set their similarity to 0.

                   Cliques   Websites
Downloaded           1,373      3,619
Embedded               509      2,070
Trackers               456      2,741
Anti-Adblockers         22        335

Table 1: The number of total cliques (out of 1,882 found) and those related to tracking and anti-adblocking, along with the number of websites that incorporate these scripts (totalling 4,017 websites, computed over 200K downloaded and embedded scripts).

• Embedded vs. downloaded script ﬁlter: JavaScript

is either embedded in the source HTML for page-

speciﬁc functionality, or downloaded separately

from external sources to provide site-wide functionality. We do not treat the two types as comparable, so we set the similarity of any embedded/downloaded pair to 0.

• Source ﬁlter: If two JavaScripts are fetched from

the exact same URL, we mark them as identical.

• JS domain ﬁlter: JavaScript can communicate with

external sources indicated by embedded URLs. We

assume that for any pair of scripts, if one communi-

cates with external sources and the other does not,

their functionality is different and set their similar-

ity score to 0.
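The four filters above can be read as a cheap pre-check that runs before the cosine similarity is computed. The sketch below is one possible arrangement; the `Script` record, its field names, and the order in which the filters are applied are assumptions for illustration, since the text does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Script:
    url: str                 # fetch URL ("" for embedded JS)
    embedded: bool           # True if taken from <script> tags in the HTML
    word_count: int
    contacts_external: bool  # True if the script contains external URLs


def prefilter(a, b, max_ratio=1.5):
    """Return 1.0 or 0.0 when a filter decides the pair,
    or None when the cosine similarity still has to be computed."""
    # Embedded vs. downloaded filter.
    if a.embedded != b.embedded:
        return 0.0
    # Source filter: same URL means identical script.
    if not a.embedded and a.url and a.url == b.url:
        return 1.0
    # Word-count filter: ratio higher than 1.50 means different cliques.
    longer = max(a.word_count, b.word_count)
    shorter = min(a.word_count, b.word_count)
    if shorter == 0 or longer / shorter > max_ratio:
        return 0.0
    # JS domain filter: only one of the two talks to external sources.
    if a.contacts_external != b.contacts_external:
        return 0.0
    return None
```

Only pairs for which `prefilter` returns `None` need the comparatively expensive vector comparison.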

Source and functionality identiﬁcation. Once maximal

cliques of similar scripts are identiﬁed, the content and

meta-data of each script in a clique is used to generate

and log: (i) the FQDN (Fully Qualiﬁed Domain Name)

of the script’s source, (ii) FQDNs of external resources

utilized by the script, and (iii) keywords associated with

the script. In Section 4, we use these three features, in

addition to content of the script, to classify cliques by

functionality.
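A rough sketch of how the FQDN features (i) and (ii) might be derived is shown below, using `urllib.parse.urlsplit` from the Python standard library. The regular expression `URL_PATTERN` and the helper names are hypothetical; a pattern this simple can also match text such as `//comment` inside code comments, so it should be read as an approximation rather than the method actually used.

```python
import re
from urllib.parse import urlsplit

# Matches absolute and protocol-relative URLs inside JS source (assumption).
URL_PATTERN = re.compile(r"""(?:https?:)?//[^\s'"<>)]+""")


def fqdn(url):
    """Return the lower-cased host name of a URL, or "" if there is none."""
    return (urlsplit(url).hostname or "").lower()


def external_fqdns(js_source):
    """Return the sorted set of host names referenced by URLs in a script."""
    hosts = {fqdn(match) for match in URL_PATTERN.findall(js_source)}
    hosts.discard("")
    return sorted(hosts)


sample = "var s='//ads.example.net/bait.js'; fetch(\"https://track.example.org/p?x=1\");"
hosts = external_fqdns(sample)
```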

Method limitations. We acknowledge that our method

has a few limitations. First, our similarity metric will

fail to identify obfuscated JS code. Second, given that

we do not compare downloaded with embedded JS code,

we may fail to identify small cliques in which a re-

duced number of sites integrate an anti-adblocking JS

in a different way than is normal. Finally, our method

may fail to identify similarities between composed JS–

i.e., scripts that consist of multiple individual ﬁles down-

loaded as a single object. As a result, our method only

provides a lower-bound approximation of the usage of

anti-adblocking across websites. We plan on addressing

these limitations in future work.

4 Dataset and Results

We apply our clique detection methodology to the JS

objects fetched by our crawler using the vanilla Fire-

fox browser. We restrict our analysis to cliques of size

greater than 5 – i.e., JavaScripts shared by more than

5 sites in our dataset – as we are interested in identi-

fying scripts that are shared across many websites. We

acknowledge that this approach might fail to ﬂag anti-

adblocking scripts utilized by individual or a small num-

ber of websites, and those used by a few websites in the

Alexa Top-5K but popular among websites ranked outside the
Top-5K. As shown in Table 1, we find 1,373 cliques that are

shared among 3,619 websites in the downloaded ﬁles,

with an average of 232 websites per clique (σ =365.6)

and the largest clique having 1,320 websites (which we

ﬁnd, via manual inspection, is a JS related to jQuery).

Among the embedded scripts, 509 cliques are shared by

2,070 websites (µ = 41.2, σ = 48.9, max = 261).

We manually analyze all the 1,882 cliques (corre-

sponding to 4,017 unique websites) identiﬁed for both

downloaded and embedded scripts, and tag them as

trackers (if they upload information such as IP addresses

and cookies to tracking companies), anti-adblockers (if

they check for the presence of adblockers), or oth-

ers. Manual analysis is performed by identifying exter-

nal libraries and function speciﬁc keywords used in the

scripts. We note that manual analysis of JS is a tedious

process that does not scale to a larger number of scripts,

therefore we leave as part of future work to investigate

ways to automate JS tagging.

We uncover 22 cliques used for anti-adblocking em-

ployed by 335 websites – about 6.7% of Alexa Top-5K

websites. We observe that Alexa Top-1K have 60 anti-

adblocking websites, and the number increases by about

70 websites for every additional 1K considered, reaching

335 anti-adblocking websites in Top-5K. While study-

ing anti-adblockers, we also identify 456 tracking cliques

employed by about 54% of Alexa Top-5K, validating

previous studies on the pervasiveness of tracking over the

Web [8].

Anti-adblocking by website categories. In Table 2, we

report the categories of the 335 anti-adblocking web-

sites, using McAfee’s URL categorization service [18].

We ﬁnd that anti-adblocking is common among a di-

verse mix of publishers, and prevalent among publish-

ers of “General News” (19.5%), “Blogs/Wiki” (9.3%),

and “Entertainment” (8.5%) categories, which represent

more than one third of all websites. Note that these

categories are also among the most popular ones across

all Top-5K Alexa domains, although to a lesser extent

– respectively, 9.4%, 6.29%, and 5.4%. In contrast, other

popular categories among Top-5K domains (e.g., “Inter-

net services”, “Online Shopping”, “Business”, which ac-

count for 20% of the Top-5K) are much less prevalent in

anti-adblocking websites.

Website response to detection of adblockers. In order

to assess how anti-adblocking websites behave once they

%       Category                 %       Category
19.5%   General News             2.5%    Pornography
9.3%    Blogs/Wiki               2.5%    Forum/Bulletin Boards
8.5%    Entertainment            2.2%    Technical/Business Forums
4.3%    Internet Services        2.2%    Potential Illegal Software
3.7%    Sports                   2.0%    Online Shopping
3.7%    Games                    1.7%    Portal Sites
3.2%    Travel                   1.7%    Humor/Comics
3.2%    Education/Reference      1.2%    Social Networking
2.7%    Business                 1.2%    Provocative Attire
2.5%    Software/Hardware        1.2%    Marketing/Merchandising

Table 2: Distribution of anti-adblocking websites by category according to McAfee’s URL categorization.

identify adblockers, we look at all the screenshots taken

by our crawler, respectively, when using the vanilla Fire-

fox browser with no extensions and the Firefox browser

with AdBlock Plus enabled (which we assume is more

likely to be detected due to its popularity [21]).

We note cases where there is an explicit (e.g., a warning
to disable the adblocker) or a discreet (e.g., a blank page with
AdBlock Plus, but normal appearance without) response

to adblocking. For these websites, we also view screen-

shots when accessed by the Firefox browser with each of

the following extensions: Ghostery, Privacy Badger, and

NoScript.

We find only 6 explicit and no discreet responses

to adblocking. Of the explicit responses, 3 are dis-

played by porn websites hosted by the same company

– MindGeek – and employ the same anti-adblocking

script downloaded from DoublePimp. The warning is

displayed for both AdBlock Plus and Ghostery. The re-

maining 3 also employ the same script, but display differ-

ent messages (only for AdBlock Plus) with the same gen-

eral theme, i.e., nudging the user to disable the adblocker

and/or support the website via subscription or donation.

Some websites display an adblocker warning to users after
they engage in some form of activity, such as clicking

on links or scrolling. To capture such responses, we re-

peat the above exercise for screenshots taken after mim-

icking user activity – speciﬁcally, clicking on a random

link on the page, scrolling down to the bottom of the

newly loaded page, waiting three seconds, then scrolling
back up to the top of the page, and waiting five seconds. While

the modiﬁed methodology validates our previous obser-

vations, we do not discover any new responses.

In an attempt to automate the analysis of websites’ responses to adblockers, we also tried image comparison tools, such as perceptual hashing. However, this generates a high number of false positives due

to dynamic content on many sites as well as false nega-

tives since anti-adblocking warnings and messages gen-

erate a relatively small visual difference.
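To illustrate why screenshot hashing struggles here, the sketch below implements an average hash, one simple form of perceptual hashing, over small grayscale grids. The text does not say which hashing scheme was tried, so the choice of average hash, the 4x4 grids, and the pixel values are assumptions made purely for illustration.

```python
def average_hash(pixels):
    """Hash a (already downscaled) grayscale image given as a 2D list:
    one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


# Screenshot without adblocker (dark left half, bright right half).
base = [[10, 10, 200, 200] for _ in range(4)]

# Same page with a small warning overlay: one pixel slightly brighter.
with_warning = [row[:] for row in base]
with_warning[0][0] = 60

# Same page with rotated dynamic content: halves swapped.
dynamic = [[200, 200, 10, 10] for _ in range(4)]

missed = hamming(average_hash(base), average_hash(with_warning))
spurious = hamming(average_hash(base), average_hash(dynamic))
```

In this toy setting the warning overlay leaves the hash unchanged (a false negative), while unrelated dynamic content flips every bit (a false positive).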

How anti-adblockers work. Next, we manually in-

spect the 22 anti-adblocking scripts (14 downloaded and

Domain                  Description       #Sites   ABP   Gh   PB
pagefair.com            Anti-adblocking       20    ✓    ✗    ✓
googleadservices.com    Ads                   61    ✗    ✗    ✗
googlesyndication.com   Ads                   13    ✗    ✗    ✗
taboola.com             Ads                   36    ✗    ✓    ✓
outbrain.com            Ads                   10    ✗    ✓    ✓
ensighten.com           Ads                    6    ✗    ✓    ✗
hotjar.com              Analytics              9    ✗    ✗    ✗
doublepimp.com          Pornography            8    ✗    ✓    ✗
tacdn.com               Travel                 8    ✗    ✗    ✗
cloudflare.com          CDN                   50    ✗    ✗    ✓
cloudfront.net          CDN                    6    ✗    ✗    ✗
ytimg.com               Content/Ads          108    ✗    ✗    ✗

Table 3: Domains from which anti-adblocking scripts are downloaded and #websites employing them. The table’s right side reports whether AdBlock Plus (ABP), Ghostery (Gh), and Privacy Badger (PB) counter-block anti-adblocking scripts from these domains (✓ = counter-blocked, ✗ = not counter-blocked).

8 embedded) aiming to understand how anti-adblocking

scripts detect adblockers. We note that of these only the

14 downloaded scripts are actually useful as the 8 embed-

ded scripts simply redirect to the downloaded scripts. We

ﬁnd that anti-adblockers operate on a simple premise: if

a bait object (i.e., an object that is expected to be blocked

by ad-blockers – e.g., a JS or DIV element named ads)

on the publisher’s website is missing when the page

loads, the script concludes that the user has an adblocker

installed.

Speciﬁcally, the anti-adblocker detects adblockers by

one of the following approaches: (1) The anti-adblocker

injects a bait advertisement container element (e.g.,

DIV), and then compares the values of properties rep-

resenting dimensions (height and width) and/or vi-

sual status (display) of the container element with the

expected values when properly loaded. (2) The anti-

adblocker loads a bait script that modiﬁes the value of a

variable, and then checks the value of this variable in the

main anti-adblocking script to verify that the bait script

was properly loaded. If the bait object is determined

to be absent, the anti-adblocking script concludes that

an adblocker is present. To track whether the user has

turned off the adblocker after being prompted to do so,

the anti-adblocker periodically runs the ad-block check

and stores the last recorded status in the user’s browser

using a cookie or local storage.
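The two detection approaches can be summarised by the decision logic below. Real anti-adblockers run as JavaScript in the page; this Python sketch only models the checks, and the `ElementState` record, the `baitLoaded` flag name, and the expected minimum dimensions are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementState:
    """Observed properties of the bait container after page load."""
    height: int
    width: int
    display: str


def bait_element_blocked(observed: Optional[ElementState],
                         expected_height: int = 1,
                         expected_width: int = 1) -> bool:
    """Approach (1): compare the bait element's dimensions and visual
    status with the values expected when it loads properly."""
    if observed is None:          # element was removed from the page
        return True
    return (observed.display == "none"
            or observed.height < expected_height
            or observed.width < expected_width)


def bait_script_blocked(page_globals: dict, flag: str = "baitLoaded") -> bool:
    """Approach (2): the bait script would have set `flag` to True;
    if the flag is missing or unchanged, the bait script never ran."""
    return page_globals.get(flag) is not True
```

A script using either approach would then store the result (for example in a cookie or local storage) and re-run the check periodically, as described above.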

Anti-adblocker suppliers. We analyze the source code

of the 14 anti-adblocking scripts and the domains from

which these are downloaded aiming to infer the suppliers

of these scripts. The remaining 8 embedded scripts redi-

rect to anti-adblocking scripts served by Cloudflare

and Taboola. Our analysis is summarized in Table 3.

We also include a description of these domains – based

on the information available on their ofﬁcial websites,

Google search, and McAfee URL categorization ser-

vice [18] – as well as the number of websites in our

dataset that employ the anti-adblocker.

At the top we ﬁnd Pagefair, a company specialized

in anti-adblocking services, followed by a number of do-

mains related to Google, Taboola, Outbrain and

Ensighten. Overall the anti-adblockers downloaded

from these 5 domains are employed by 48% of all the

335 websites employing anti-adblockers. We note that

these domains are direct beneﬁciaries of anti-adblocking

as these inherently thrive on the prevalence of online ad-

vertisements. Though not directly related to online ad-

vertisement, the ability to detect adblockers is a useful

capability for the analytics company HotJar.

We also ﬁnd two cases where the anti-adblocking

script is shared by entities in the same domain or busi-

ness: TripAdvisor (tacdn.com) distributes the script

to its 8 websites with different country code top-level

domains. Adult websites, all of which are hosted by

MindGeek, turn to DoublePimp for anti-adblocking.

Two anti-adblocking scripts are pulled from popular

Content Delivery Networks (CDNs), but we could not

determine their original supplier. Finally, ytimg (a con-

tent server associated with YouTube) serves a script that

has the ability to detect if ads were properly loaded, how-

ever, it is not clear how it uses this information.

Adblocker response to being blocked. There is anec-

dotal evidence that the adblocking arms race has en-

tered the next level: some adblockers can detect anti-

adblockers and counter-block them [27]. To test for

this behaviour, we visit a sample website for each anti-

adblocking script via AdBlock Plus, Ghostery and Pri-

vacy Badger over Chrome web browser. We repeat the

experiment three times and monitor all HTTP requests

generated when loading the website using Chrome’s Developer Tools. We infer that an adblocker counter-blocks an anti-adblocking script if the request to fetch that script is never initiated. As reported in Table 3, half of the

12 anti-adblocking suppliers are blocked by at least

one adblocker. Ghostery and Privacy Badger detect 4

anti-adblockers each, while AdBlock Plus detects only

1. Anti-adblocking scripts served by Taboola and

Outbrain are blocked by both Ghostery and Privacy

Badger, PageFair scripts by both AdBlock Plus and

Privacy Badger, while Doublepimp, Ensighten and

Cloudflare scripts by at most one of the three ad-

blockers. We note that the anti-adblocking suppliers

that are never detected are related to content distribution,

Google ad services, analytics, or site-wide scripts.
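The inference rule used in this experiment can be sketched as a check over the list of request URLs recorded during a page load. The function names are hypothetical, and requiring the script to be absent in all three repetitions is an assumption, since the text does not state how the repeated runs were combined.

```python
from urllib.parse import urlsplit


def counter_blocked(script_domain, requested_urls):
    """True if no request to script_domain (or one of its subdomains)
    was initiated during the page load."""
    for url in requested_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host == script_domain or host.endswith("." + script_domain):
            return False
    return True


def counter_blocked_in_all_runs(script_domain, runs):
    """Assumption: count a script as counter-blocked only if it is
    absent from every repetition of the experiment."""
    return all(counter_blocked(script_domain, urls) for urls in runs)


# Hypothetical request logs for three repetitions of one page load.
runs = [
    ["https://news.example/", "https://static.news.example/app.js"],
    ["https://news.example/", "https://static.news.example/app.js"],
    ["https://news.example/"],
]
```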

5 Discussion

The adblocking arms race involves a plethora of players:

between publishers and consumers, a jostling array of in-

termediaries compete to deliver ads, mostly supported by

business models that involve taking a cut of the resultant

advertising revenue. At the heart of this rich ecosystem

lie important questions regarding the legality and ethics

of adblocking and anti-adblocking.

The legality of adblocking is potentially contestable

under laws about anti-competitive business conduct, and
such arguments have been tested in court, with adblockers winning

most [4], but not all of the cases [19]. On the other hand,

anti-adblocking in the EU might in turn breach Article

5(3) of the Privacy and Electronic Communications Di-

rective 2002/58/EC, as it involves interrogating an end-

user’s terminal equipment without consent [28].

Many consider adblocking to be an ethical choice for

consumers and publishers to consider from both an in-

dividual and societal perspective. In reality, however,

both sides have resorted to radical measures to achieve

their goals. The Web has empowered publishers and ad-

vertisers to track, proﬁle and target users in a way that

is unprecedented in the physical realm [8]. In addition,

publishers are inadvertently and increasingly serving up

malicious ads [25]. This has resulted in the rise of ad-

blocking, which in turn has led publishers to employ anti-

adblocking. The core issue is to get the balance right

between ads and information: publishers turn to anti-

adblocking to force consumers to reconsider the default

blocking of ads for earnest ad-supported publishers but

defaults are difﬁcult to shift at scale. Nevertheless, those

publishers will fail if they do not redress in a fundamen-

tal way the reasons that brought consumers to adblockers

in the ﬁrst place. There exist proposals to provide a com-

promise, such as privacy-friendly advertising [9] as well

as mechanisms to give users more control over ads and

trackers they are exposed to [2, 31]. Our work extends

these efforts by providing quantiﬁed insights into anti-

adblocking, to inform policy that can improve upon the

current blocking/counter-blocking deadlock.

6 Conclusion

This paper presented a measurement-based analysis

aimed at providing a first look at the arms race between

adblocking and anti-adblocking. We found that at least

6.7% of Alexa Top-5K websites, mostly in popular cat-

egories like news, blogs, and entertainment, engage in

some form of anti-adblocking. The arms race has already

entered the next level, as at least one of three popular

browser extensions – AdBlock Plus, Ghostery, Privacy

Badger – can evade half of the anti-adblocking scripts

in our dataset. In future work, we plan to extend our

measurements beyond the Alexa Top-5K websites, and

experiment with crowdsourced and/or automated mech-

anisms to tag JavaScript by functionality and to assess

publisher response to detection of adblockers.

Acknowledgements. The authors would like to thank

the anonymous reviewers for constructive feedback on

preparation of the ﬁnal version of this paper. Rishab

Nithyanand was supported by an Open Technology Fund

Emerging Technology Senior Fellowship. Sheharbano

Khattak and Steven J. Murdoch were supported by The

Royal Society [grant number UF110392]; Engineering

and Physical Sciences Research Council [grant number

EP/L003406/1]. Emiliano De Cristofaro was supported

by a Xerox University Affairs Committee award and EU

grant H2020-MSCA-RISE “ENCASE”.

Source code and data release. The source code of our

JS clique extraction approach can be found at https://bitbucket.org/rishabn/ad-study-code. Data created during this research is available from the University of Cambridge data archive at http://dx.doi.org/10.17863/CAM.703.

References

[1] G. Acar, C. Eubank, S. Englehardt, M. Juarez,

A. Narayanan, and C. Diaz. The web never forgets: Per-

sistent tracking mechanisms in the wild. In CCS, 2014.

[2] J. P. Achara, J. Parra-Arnau, and C. Castelluccia. My-

TrackingChoices: Pacifying the Ad-Block War by En-

forcing User Privacy Preferences. In WEIS, 2016.

[3] Adblock Plus. Filters for Adblock Plus forum. https://adblockplus.org/forum/viewforum.php?f=2&sid=83e35818f92df8cf921623f2ff27ce70.

[4] Adblock Plus. Five and oh look, another lawsuit upholds users’ rights online. https://adblockplus.org/blog/five-and-oh-look-another-lawsuit-upholds-users-rights-online.

[5] C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM, 16(9), 1973.

[6] Campaign Against The Illegal Detection/Circumvention Of Adblocking Tools. Alexander Hanff. https://adblocking.think-privacy.com.

[7] L. Chen, A. Mislove, and C. Wilson. Peeking beneath the

hood of Uber. In IMC, 2015.

[8] M. Falahrastegar, H. Haddadi, S. Uhlig, and R. Mortier.

Tracking Personal Identiﬁers Across the Web. In PAM,

2016.

[9] S. Guha, B. Cheng, and P. Francis. Privad: Practical privacy in online advertising. In NSDI, 2011.

[10] A. Hannak, P. Sapiezynski, A. Molavi Kakhki, B. Krish-

namurthy, D. Lazer, A. Mislove, and C. Wilson. Measur-

ing personalization of web search. In WWW, 2013.

[11] A. Hannak, G. Soeller, D. Lazer, A. Mislove, and C. Wil-

son. Measuring price discrimination and steering on e-

commerce web sites. In IMC, 2014.

[12] IAB Tech Lab. Publisher ad blocking primer. Technical

report, 2016.

[13] M. Ikram, H. J. Asghar, M. A. Kaafar, B. Krishnamurthy,

and A. Mahanti. Towards Seamless Tracking-Free Web:

Improved Detection of Trackers via One-class Learning.

arXiv preprint 1603.06289, 2016.

[14] S. Khattak, D. Fiﬁeld, S. Afroz, M. Javed, S. Sundare-

san, V. Paxson, S. J. Murdoch, and D. McCoy. Do You

See What I See?: Differential Treatment of Anonymous

Users. In NDSS, 2016.

[15] A. Kuhn, S. Ducasse, and T. Gîrba. Semantic clustering: Identifying topics in source code. Information and Software Technology, 49(3), 2007.

[16] B. Liu, A. Sheth, U. Weinsberg, J. Chandrashekar, and

R. Govindan. Adreveal: Improving transparency into on-

line targeted advertising. In HotNets, 2013.

[17] J. Mayer. Tracking the Trackers: Self-help Tools. http://cyberlaw.stanford.edu/blog/2011/09/tracking-trackers-self-help-tools, 2011.

[18] McAfee. http://www.trustedsource.org.

[19] MEEDIA. Doppelte Attacke: Springers zwei Fronten-Strategie im Kampf gegen Ad-Blocker. http://meedia.de/2015/12/15/doppelte-attacke-springers-zwei-fronten-strategie-im-kampf-gegen-ad-blocker/.

[20] J. Mikians, L. Gyarmati, V. Erramilli, and N. Laoutaris.

Crowd-assisted search for price discrimination in e-

commerce: ﬁrst results. In CoNEXT, 2013.

[21] Mozilla. Firefox: Most Popular Extensions. https://addons.mozilla.org/en-us/firefox/extensions/?sort=users.

[22] PageFair. The 2015 Ad Blocking Report. https://blog.pagefair.com/2015/ad-blocking-report/.

[23] M. Z. Raﬁque, T. Van Goethem, W. Joosen, C. Huygens,

and N. Nikiforakis. It’s Free for a Reason: Exploring the

Ecosystem of Free Live Streaming Services. In NDSS,

2016.

[24] Schneier on Security. The Ads vs. Ad Blockers Arms Race. https://www.schneier.com/blog/archives/2016/02/the_ads_vs_ad_b.html, 2016.

[25] The Guardian. Major sites including New York Times and BBC hit by ‘ransomware’ malvertising. https://www.theguardian.com/technology/2016/mar/16/major-sites-new-york-times-bbc-ransomware-malvertising, 2016.

[26] The New York Times. The Ad Blocking Wars. http://nyti.ms/1Qs20YB, 2016.

[27] The Next Web. This adblocker-blocker helps you get around sites that ban you for hiding ads. http://thenextweb.com/apps/2016/02/11/around-and-around-we-go/#gref, 2016.

[28] The Register. Ad-blocker blocking websites face legal peril at hands of privacy bods. http://www.theregister.co.uk/2016/04/23/anti_ad_blockers_face_legal_challenges/.

[29] X. Xing, W. Meng, D. Doozan, N. Feamster, W. Lee, and

A. C. Snoeren. Exposing inconsistent web search results

with bobble. In PAM, 2014.

[30] T. Yamamoto, M. Matsushita, T. Kamiya, and K. Inoue.

Measuring similarity of large software systems based on

source code correspondence. In Product Focused Soft-

ware Process Improvement. Springer, 2005.

[31] Z. Yu, S. Macbeth, K. Modi, and J. M. Pujol. Tracking

the Trackers. In WWW, 2016.