Statistics derived from website visit logs are strongly distorted by the large share of bots among the users. One example is the click-through rate, that is, the number of clicks on listed offers relative to the total number of visits to the website. Price comparison sites therefore need to segment the users of their website into homogeneous groups so that precise and reliable business metrics can be calculated for each group.
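The distortion can be illustrated with a minimal Python sketch. The visit records, the field names `segment` and `clicks`, and the numbers are hypothetical and chosen only for illustration; they are not taken from the project's data.

```python
from collections import defaultdict


def click_through_rate(visits):
    """Clicks on listed offers divided by the number of visits."""
    if not visits:
        return 0.0
    return sum(v["clicks"] for v in visits) / len(visits)


def ctr_by_segment(visits):
    """Compute the click-through rate separately for each user segment."""
    groups = defaultdict(list)
    for v in visits:
        groups[v["segment"]].append(v)
    return {seg: click_through_rate(vs) for seg, vs in groups.items()}


# Hypothetical log: 8 bot visits without clicks, 2 human visits with 1 click each.
visits = (
    [{"segment": "bot", "clicks": 0} for _ in range(8)]
    + [{"segment": "human", "clicks": 1} for _ in range(2)]
)

overall = click_through_rate(visits)   # mixes bots and humans
per_segment = ctr_by_segment(visits)   # one rate per homogeneous group
```

In this toy log the overall rate is far lower than the rate within the human group, because the bot visits inflate the denominator without contributing clicks.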

In this master's thesis, the navigation behavior (traces) of users is to be determined from server-side web log data. Different features are then extracted from the (heuristically) constructed traces so that (un- and semi-supervised) machine learning methods can divide the user behavior into homogeneous groups. Finally, the quality of this segmentation is examined.
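One possible way to construct traces and extract features is sketched below in Python. This is an illustration under stated assumptions, not the method used in the thesis: requests are grouped by a hypothetical client key, a trace is split after 30 minutes of inactivity (a commonly used heuristic), and the record layout and feature names are invented for the example. The resulting feature dictionaries could then serve as input to a clustering method, which is not shown here.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed inactivity threshold


def build_traces(records, timeout=SESSION_TIMEOUT):
    """Group (client_key, unix_timestamp, url) records into traces.

    A new trace starts whenever the gap between two consecutive
    requests of the same client exceeds `timeout`.
    """
    by_client = defaultdict(list)
    for client, ts, url in records:
        by_client[client].append((ts, url))

    traces = []
    for client, hits in by_client.items():
        hits.sort()  # order by timestamp
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                traces.append((client, current))
                current = []
            current.append(hit)
        traces.append((client, current))
    return traces


def trace_features(trace):
    """Derive simple numeric features from one trace."""
    _, hits = trace
    times = [ts for ts, _ in hits]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "n_requests": len(hits),
        "duration_s": times[-1] - times[0],
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else 0.0,
        "n_distinct_urls": len({url for _, url in hits}),
    }


# Hypothetical log records: (client_key, unix_timestamp, url)
records = [
    ("client_a", 0, "/search"),
    ("client_a", 60, "/offer/1"),
    ("client_a", 4000, "/search"),   # more than 30 minutes later: new trace
    ("client_b", 10, "/offer/1"),
    ("client_b", 11, "/offer/2"),
    ("client_b", 12, "/offer/3"),    # rapid, regular requests
]

traces = build_traces(records)
features = [trace_features(t) for t in traces]
```

Features such as the mean gap between requests tend to separate rapid, regular request patterns from slower browsing, which is the kind of signal a segmentation could exploit.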

Data Innovation Community

Industry 4.0

Project partners – a project of solute gmbh
Harun Sentürk (KIT)

Contact persons

Harun Sentürk (KIT)
Ulrich Wünstel (solute gmbh)

Project duration

October 2017 – March 2018