At this year's World Wide Web conference, my colleagues and I published and presented the following paper:
Philipp Singer, Denis Helic, Andreas Hotho and Markus Strohmaier, HypTrails: A Bayesian Approach for Comparing Hypotheses About Human Trails on the Web, 24th International World Wide Web Conference, Florence, Italy, 2015 (Best Paper Award)
[PDF] [arXiv] [Slides] [Tutorial] [Code]
When users interact with the Web today, they leave sequential digital trails on a massive scale. Examples of such human trails include Web navigation, sequences of online restaurant reviews, or online music playlists. Understanding the factors that drive the production of these trails can be useful for, e.g., improving underlying network structures, predicting user clicks or enhancing recommendations. In this work, we present a general approach called HypTrails for comparing a set of hypotheses about human trails on the Web, where hypotheses represent beliefs about transitions between states. Our approach utilizes Markov chain models with Bayesian inference. The main idea is to incorporate hypotheses as informative Dirichlet priors and to leverage the sensitivity of Bayes factors to the prior for comparing hypotheses with each other. For eliciting Dirichlet priors from hypotheses, we present an adaptation of the so-called (trial) roulette method. We demonstrate the general mechanics and applicability of HypTrails by performing experiments with (i) synthetic trails for which we control the mechanisms that have produced them and (ii) empirical trails stemming from different domains including website navigation, business reviews and online music played. Our work expands the repertoire of methods available for studying human trails on the Web.
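To make the main idea a bit more concrete, here is a minimal sketch of how two hypotheses could be compared on toy trails. It is not the reference implementation: the function names (`transition_counts`, `elicit_prior`, `log_evidence`) are hypothetical, and `elicit_prior` is a simplified stand-in for the trial roulette method that spreads `k` pseudo-counts proportionally to the hypothesis on top of a uniform pseudo-count of 1, rather than distributing integer "chips". The evidence formula is the standard closed form for a first-order Markov chain with an independent Dirichlet prior on each row.

```python
from math import lgamma

def transition_counts(trails, n_states):
    """Count observed transitions between consecutive states in each trail."""
    counts = [[0] * n_states for _ in range(n_states)]
    for trail in trails:
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1
    return counts

def elicit_prior(hypothesis, k):
    """Simplified elicitation (an assumption, not the exact trial roulette):
    uniform pseudo-count 1 plus k pseudo-counts split by row-normalized belief."""
    prior = []
    for row in hypothesis:
        total = sum(row)
        if total > 0:
            prior.append([1.0 + k * belief / total for belief in row])
        else:
            # No belief expressed for this state: fall back to a uniform prior.
            prior.append([1.0] * len(row))
    return prior

def log_evidence(counts, alpha):
    """Log marginal likelihood of the counts under row-wise Dirichlet priors."""
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        total += lgamma(sum(a_row)) - sum(lgamma(a) for a in a_row)
        total += sum(lgamma(n + a) for n, a in zip(n_row, a_row))
        total -= lgamma(sum(n_row) + sum(a_row))
    return total

# Toy trails over three states that always move to the "next" state.
trails = [[0, 1, 2, 0, 1, 2, 0], [1, 2, 0, 1, 2]]
counts = transition_counts(trails, n_states=3)

# Two hypotheses expressed as belief matrices over transitions.
hyp_next = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # "users move to the next state"
hyp_uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # "users move at random"

k = 10  # concentration: how strongly we believe in each hypothesis
log_ev_next = log_evidence(counts, elicit_prior(hyp_next, k))
log_ev_uniform = log_evidence(counts, elicit_prior(hyp_uniform, k))

# A positive log Bayes factor favors the "next" hypothesis over the uniform one.
log_bayes_factor = log_ev_next - log_ev_uniform
print("log Bayes factor (next vs. uniform):", log_bayes_factor)
```

In practice one would repeat the comparison for several values of `k`, since the ranking of hypotheses is read off from how the evidence behaves as the belief in each hypothesis grows.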
To better communicate the main concepts and ideas of this approach, I have prepared a tutorial in the form of an IPython notebook. The tutorial also goes into detail about Bayesian inference, Markov chain models and Dirichlet distributions, all of which HypTrails builds on. It uses the Python implementation of HypTrails.
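As a small taste of the Bayesian machinery involved, the following sketch shows the conjugate update for the outgoing transitions of a single state: a Dirichlet prior combined with observed transition counts gives a Dirichlet posterior whose parameters are simply the sums of the two. The helper names are hypothetical and are not taken from the tutorial or the HypTrails code.

```python
def dirichlet_posterior(prior, counts):
    """Posterior Dirichlet parameters: prior pseudo-counts plus observed counts."""
    return [a + n for a, n in zip(prior, counts)]

def dirichlet_mean(alpha):
    """Expected transition probabilities under a Dirichlet with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]

prior = [1.0, 1.0, 1.0]   # uniform prior over three target states
observed = [0, 3, 0]      # three observed transitions, all to the second state

posterior = dirichlet_posterior(prior, observed)
posterior_mean = dirichlet_mean(posterior)
print(posterior, posterior_mean)
```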
I hope that this tutorial is helpful to people who are interested in the approach and its Python implementation. I am also happy to hear any feedback and suggestions for improvement.