Methodologies to derive subjective risk for web tracking
- Contributors: Vito Gambina, Marco Mellia, Martino Trevisan, Luca Vassio
- Year: 2022
This work aims to increase users’ awareness while browsing the internet by introducing them to today’s tracking ecosystem, and to derive their perceived risk in order to assign websites a subjective risk indicator score. In the digital era, almost every household has a device, such as a smartphone, PC, or tablet, that can connect to the internet and is used to visit dozens of websites every day. During their daily online activity, people unknowingly encounter dozens of web trackers, which today represent the most widespread threat to privacy: they allow the slow and constant accumulation of different kinds of online data in order to build user profiles and to customize targeted ads, among other uses. For this reason, focusing on online privacy and data security is increasingly important, and providing users with an indicator of website risk as they browse may be a first step toward improving their online experience. In this thesis, a simple survey was developed in which some tracking features of the most important websites were presented to ordinary web users in order to derive their perceived risk. The survey results were analyzed with a machine learning algorithm suited to this purpose: linear regression, one of the most basic machine learning tools for prediction. It was used to estimate the relationship between the objective tracking data and the risk score that users indicated for each website, with the final aim of building a model able to predict this score for other websites as well. The resulting linear regression model performs very well, reaching a very good level of accuracy, and shows that machine learning algorithms can be considered for this kind of problem.
The results of this thesis help users become more aware of how to control their data and offer a new point of view for future studies of the web tracking ecosystem.
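The regression step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two features (number of third-party trackers and number of cookies per site), the toy survey scores, and the function names are all hypothetical, and the data is synthetic, generated from a known linear rule so that the fit can be checked by eye. The sketch fits ordinary least squares through the normal equations using only the Python standard library.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(
    features: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Return [intercept, w1, ..., wk] minimizing the squared error."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept term
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_risk(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Predict a risk score for one website from its tracking features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Synthetic, hypothetical data: (third-party trackers, cookies) per website.
site_features = [(0, 0), (2, 5), (4, 1), (6, 10), (8, 3)]
# Toy "perceived risk" scores built as 1 + 0.5 * trackers + 0.2 * cookies.
survey_scores = [1.0, 3.0, 3.2, 6.0, 5.6]

weights = fit_linear_regression(site_features, survey_scores)
unseen_site_score = predict_risk(weights, (10, 4))
```

Because the toy scores follow an exact linear rule, the fitted weights recover that rule and the model extrapolates cleanly to the unseen site. Real survey responses would be noisy, so a held-out evaluation would be needed before trusting any predicted score.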
- Repository link: https://webthesis.biblio.polito.it/22778/1/tesi.pdf
- Download: PDF file