Paper

The paper "FPDetective: Dusting the Web for Fingerprinters" (PDF) describes the first comprehensive effort to measure the prevalence of device fingerprinting on the Internet. It will be presented at the 20th ACM Conference on Computer and Communications Security that takes place in Berlin in November.


Reference: G. Acar, M. Juarez, N. Nikiforakis, C. Diaz, S. Gürses, F. Piessens and B. Preneel. FPDetective: Dusting the Web for Fingerprinters. In Proceedings of CCS 2013, Nov. 2013.

FPDetective Framework

Visit FPDetective at GitHub for source and releases

FPDetective is designed as a flexible, general-purpose framework that can be used to conduct large-scale web privacy studies. The framework is developed using Python, C++ (for the browser modifications), JavaScript, and MySQL.

Components

Crawler: The crawler features two instrumented browsers, PhantomJS and Chromium. CasperJS and Selenium were used to drive the browsers to websites and navigate through the pages. To build instrumented versions of the browsers, we modified parts of the WebKit source code, which was the rendering engine used by both Chromium and PhantomJS.


Parser: The parser extracts relevant data from the logs generated by the crawler and stores them in the database. It also tags a site with a label if a known fingerprinting script is found among the HTTP requests made during the visit.
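
The tagging step can be illustrated with a minimal Python sketch. It is not the parser shipped with FPDetective: the function name tag_visit, the label, and the domain and path in KNOWN_FP_SCRIPTS are hypothetical placeholders, and the real list of known scripts is not reproduced here.

```python
from urllib.parse import urlparse

# Hypothetical placeholder entry: label -> (serving domain, script path suffix).
# This is NOT the list of fingerprinting scripts used in the study.
KNOWN_FP_SCRIPTS = {
    "example-fingerprinter": ("fp.example.com", "/fp.js"),
}


def tag_visit(request_urls, known=KNOWN_FP_SCRIPTS):
    """Return the sorted labels of known scripts requested during one visit."""
    labels = set()
    for url in request_urls:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        for label, (domain, path_suffix) in known.items():
            # Match the domain itself or any of its subdomains.
            host_matches = host == domain or host.endswith("." + domain)
            if host_matches and parsed.path.endswith(path_suffix):
                labels.add(label)
    return sorted(labels)
```

Matching on the parsed hostname rather than on a raw substring avoids tagging a site because an unrelated URL happens to contain the provider's domain somewhere in its text.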


Intercepting Proxy: To obtain Flash files for static analysis, we redirected traffic through mitmproxy, an SSL-capable intercepting proxy. We used the mitmdump module to log all the HTTP traffic passing through the proxy, and the libmproxy library to parse and extract Flash files based on content sniffing.
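
Content sniffing for Flash files can be sketched as a check on the first bytes of a response body. The sketch below is an assumption about how such a check might look, not the code used in the framework; it deliberately avoids the mitmproxy API and only inspects raw bytes. The function name sniff_swf is hypothetical. SWF files start with one of the signatures FWS (uncompressed), CWS (zlib-compressed) or ZWS (LZMA-compressed), followed by a one-byte version and a four-byte little-endian file length.

```python
import struct

# Three-byte signatures that open an SWF file.
SWF_SIGNATURES = (b"FWS", b"CWS", b"ZWS")


def sniff_swf(body):
    """Return (signature, version, declared_length) if body looks like SWF, else None."""
    if len(body) < 8 or body[:3] not in SWF_SIGNATURES:
        return None
    version = body[3]  # single byte following the signature
    (declared_length,) = struct.unpack("<I", body[4:8])  # little-endian uint32
    return body[:3].decode("ascii"), version, declared_length
```

Sniffing the body instead of trusting the Content-Type header or the file extension also catches Flash objects that are served with a misleading type.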


Decompiler: We used the JPEXS Free Flash Decompiler to decompile Flash files and obtain the ActionScript source code. The source code is then searched for fingerprinting-related function calls (e.g. enumerateFonts and getFontList) to obtain a binary occurrence vector. See Appendix B of the paper for the full set of methods and properties searched for in the decompiled source code.
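
A binary occurrence vector of this kind can be computed with a few lines of Python. This is a minimal sketch, not the framework's own code: the function name occurrence_vector is hypothetical, and FP_API_NAMES holds only the two names mentioned above rather than the full list from Appendix B.

```python
import re

# Illustrative subset only; the full list is in Appendix B of the paper.
FP_API_NAMES = ["enumerateFonts", "getFontList"]


def occurrence_vector(source, names=FP_API_NAMES):
    """Return one 0/1 entry per name: 1 if it occurs as a whole word in source."""
    return [
        1 if re.search(r"\b" + re.escape(name) + r"\b", source) else 0
        for name in names
    ]
```

For example, occurrence_vector("var f = Font.enumerateFonts(true);") yields [1, 0]. The word-boundary match keeps longer identifiers that merely contain one of the names from being counted.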


Central Database: We ran crawls using several machines, but used a central database to store, combine, and analyze the results of different crawls with minimal effort. The stored data include the set of JavaScript function calls, the list of HTTP requests and responses, and the list of loaded or requested fonts. For the Flash experiments, we also stored a binary vector that represents the occurrence of ActionScript API calls that might be related to fingerprinting.
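
To illustrate how results from several crawl machines can be combined in one place, here is a small sketch. It rests on loud assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names, column names, and helper functions are hypothetical, not the schema used by FPDetective. Only font loads are modelled.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Create the (hypothetical) tables and return a connection."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS visits (
            visit_id INTEGER PRIMARY KEY,
            crawl_id TEXT NOT NULL,
            site_url TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS font_loads (
            visit_id INTEGER NOT NULL REFERENCES visits(visit_id),
            font_name TEXT NOT NULL
        );
    """)
    return conn


def record_visit(conn, crawl_id, site_url, fonts):
    """Store one visit and the fonts it loaded or requested."""
    cur = conn.execute(
        "INSERT INTO visits (crawl_id, site_url) VALUES (?, ?)",
        (crawl_id, site_url),
    )
    visit_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO font_loads (visit_id, font_name) VALUES (?, ?)",
        [(visit_id, font) for font in fonts],
    )
    conn.commit()
    return visit_id


def fonts_per_site(conn):
    """Count distinct fonts per site across all crawls."""
    rows = conn.execute("""
        SELECT v.site_url, COUNT(DISTINCT f.font_name)
        FROM visits v JOIN font_loads f ON f.visit_id = v.visit_id
        GROUP BY v.site_url
        ORDER BY v.site_url
    """)
    return dict(rows.fetchall())
```

Because every visit row carries its crawl identifier, results written by different machines can be queried together or filtered per crawl without a separate merge step.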


Source code:

Get the source code at GitHub


Performance:

Using the Dromaeo JavaScript performance test suite, we compared the performance of our modified Chromium browser against the Chromium browser available from the Ubuntu repositories. The difference between their aggregate performance was about 4%. You can check the results online: http://dromaeo.com/?id=197597,197598

Results

Here we present a summary of the results; please consult the paper for details.


JavaScript-Based Font Probing

Table: Prevalence of Fingerprinting with JavaScript Based Font Probing on Top 1M Alexa sites


With FPDetective we found 404 sites in the Alexa top million pages that fingerprint visitors on their homepages using JavaScript-based font probing. These scripts are served by 13 different fingerprinting providers, of which only one had been identified in prior research.

Flash-Based Font Enumeration

Table: Flash Fingerprinting objects with font enumeration, found on Top 10K Alexa websites


Flash-based fingerprinting was present on the homepages of 145 of the top 10,000 sites, indicating that Flash-based fingerprinting is more prevalent than JavaScript-based font probing. This is possibly because of its extended capabilities for font enumeration and proxy detection, and its widespread browser support. Note that the table only includes the Flash fingerprinters that use font enumeration (95 of them).


Countermeasures: Tor Browser

We found out that the local fonts loaded by @font-face CSS rules are exempted from the Tor Browser's font-per-document cap, and that it is possible to load an unlimited number of system fonts using the local() value of the @font-face rule's src descriptor.

Visit http://jsfiddle.net/C4t7w/13/ for demo and explanation.

Countermeasures: FireGloves

FireGloves is a proof-of-concept browser extension for Mozilla Firefox that was created for research purposes. To confuse fingerprinting scripts, FireGloves returns randomized values when queried for certain attributes, limits the number of fonts that a single browser tab can load, and reports false dimension values for the offsetWidth and offsetHeight properties of HTML elements to evade JavaScript-based font detection.

Visit http://jsfiddle.net/3DaUG/27/ for a demo of using getBoundingClientRect instead of offsetWidth and offsetHeight.

Do-Not-Track

We set the Do-Not-Track header to 1 in the PhantomJS browser and visited the websites identified as performing JavaScript-based fingerprinting in our previous experiments. For all of these pages, we obtained the same results, showing a complete disregard for Do-Not-Track.

Press Coverage

About Us

This study was performed by KU Leuven researchers from the iMinds security department: Gunes Acar (COSIC), Marc Juarez (COSIC & IIIA-CSIC), Nick Nikiforakis (DistriNet), Claudia Diaz (COSIC), Seda Gürses (COSIC & NYU), Frank Piessens (DistriNet), Bart Preneel (COSIC).

Contact

You can contact us via the following address: gunes.acar[AT]esat.kuleuven.be (PGP key)