As I am reaching the final stage of my PhD studies, I have been reflecting on which papers have influenced my work the most and which have amazed me the most. I therefore want to devote this short blog post to the five papers that were most influential for me. Some of them might be valuable to a general audience, while others are specific to particular problem settings of my thesis. I also want to express my gratitude to the authors of all the other articles that I have read and cited throughout my PhD studies.
 Clauset, A., Shalizi, C. R., and Newman, M. E. J.,
Power-law distributions in empirical data,
SIAM Review, 51(4), 661–703, 2009
Even though I do not use any power law fitting in my final cumulative thesis, I have used the methods proposed by Clauset et al. in various experiments throughout the course of my PhD studies. It has amazed me how well the authors communicate a lot of technical detail in an intuitive way. For instance, on page 3 the article presents a general recipe for analyzing power law distributed data that, even without knowledge of the technical details, already gives the reader a way to understand the problem and the process. Apart from describing how to analyze power law distributions, this article has changed the way I approach problem settings. For instance, the literature has frequently argued (quite shockingly) that data follow a power law distribution simply because a straight line can be fitted to the log-log plot of the data. While a straight line on a log-log plot is indeed one common property of power law distributions, this simple analysis leaves out a great deal. As Clauset et al. point out, such results cannot be trusted, and one should resort to more statistically sound techniques such as the ones proposed in their paper. For example, a further question to be answered is whether other candidate distributions fit the data better than the power law does. In a nutshell, this paper has taught me that the most simplistic approach or answer is often not enough, and that one has to think more deeply about what a given analysis actually means. For my thesis, this refers, for example, to the fact that the likelihood ratio of nested models alone is not a sufficient basis for judging whether one model fits better than another, because the more complex model can never have a lower maximum likelihood than the simpler one it contains. I have written about power law distributions in two blog posts (Power law fitting in Python and Bayesian power law fitting) where the concepts described in this article are relevant.
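To make the contrast with the straight-line approach concrete, here is a minimal sketch of the maximum likelihood estimator for the exponent of a continuous power law, as given by Clauset et al. It assumes that the lower cutoff `xmin` is already known, and it leaves out the other steps of their recipe (estimating `xmin`, the goodness-of-fit test, and the comparison against alternative distributions). The function names and the synthetic data are my own illustration, not code from the paper.

```python
import math
import random


def fit_power_law_alpha(samples, xmin):
    """Maximum likelihood estimate of alpha for p(x) ~ x^(-alpha), x >= xmin.

    Returns the estimate and its approximate standard error.
    Assumes continuous data and a known lower cutoff xmin.
    """
    tail = [x for x in samples if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need observations strictly above xmin")
    n = len(tail)
    alpha = 1.0 + n / log_sum
    sigma = (alpha - 1.0) / math.sqrt(n)  # leading-order standard error
    return alpha, sigma


def sample_power_law(n, alpha, xmin, rng):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    data = sample_power_law(10000, alpha=2.5, xmin=1.0, rng=rng)
    alpha_hat, sigma = fit_power_law_alpha(data, xmin=1.0)
    print(f"estimated alpha = {alpha_hat:.3f} +/- {sigma:.3f}")
```

Even a good estimate of the exponent says nothing about whether a power law is a plausible model in the first place, which is exactly why the remaining steps of the recipe matter.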
 Chierichetti, F., Kumar, R., Raghavan, P., and Sarlos, T.,
Are web users really Markovian?,
21st International Conference on World Wide Web, 2012
In 2012, I attended my first WWW conference in Lyon, and it has been highly influential for my studies and my way of doing research. I attended the presentation of this paper and was instantly interested in the question of whether human navigational behavior on the Web is Markovian or whether memory effects play a more significant role than some related work assumes. As our research group in Graz had been studying human navigational trails for some time, I came home from the conference with the ambition to study this problem further. After studying the paper and related work, I decided to devote some of my PhD time to the issue of detecting memory effects (Markov chain orders) in human trail data. As a result, I have developed a framework that implements a set of statistical inference methods for detecting the appropriate Markov chain order given human trail data. The corresponding work and framework are now a fundamental aspect of my thesis, all motivated and stimulated by this article.
 Kass, R. E., and Raftery, A. E.,
Bayes factors,
Journal of the American Statistical Association, 90(430), 773–795, 1995
The concept of Bayesian model selection and the corresponding Bayes factors has been crucial for several aspects of my thesis. This seminal work has therefore been highly valuable for my understanding of Bayesian model selection with Bayes factors, including their advantages and disadvantages. Kass and Raftery manage to present their ideas and concepts in an intuitive way, coupled with several (partly advanced) examples of how to use Bayes factors.
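As a small illustration of the idea, the following sketch computes a Bayes factor for a toy problem that is not taken from the paper: is a coin fair (theta fixed at 0.5), or is its bias unknown with a Beta(a, b) prior? The Bayes factor is the ratio of the two marginal likelihoods (evidences), which have a closed form here. The function names and the example counts are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_evidence_fair(heads, tails):
    """Log marginal likelihood of one observed sequence under theta = 0.5."""
    return (heads + tails) * math.log(0.5)


def log_evidence_beta(heads, tails, a=1.0, b=1.0):
    """Log marginal likelihood with theta integrated out under a Beta(a, b) prior."""
    return log_beta(heads + a, tails + b) - log_beta(a, b)


def log_bayes_factor(heads, tails, a=1.0, b=1.0):
    """Log Bayes factor of the unknown-bias model over the fair-coin model."""
    return log_evidence_beta(heads, tails, a, b) - log_evidence_fair(heads, tails)


if __name__ == "__main__":
    for heads, tails in [(50, 50), (60, 40), (90, 10)]:
        log_bf = log_bayes_factor(heads, tails)
        print(f"{heads} heads, {tails} tails: 2 ln B = {2.0 * log_bf:.2f}")
```

A positive value favors the unknown-bias model and a negative value favors the fair coin. Because the evidence averages the likelihood over the prior, the more flexible model is automatically penalized when the data do not require the extra flexibility, which is one of the advantages over a plain likelihood comparison.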
 Gabriel, K. R., and Neumann, J.,
A Markov chain model for daily rainfall occurrence at Tel Aviv,
Quarterly Journal of the Royal Meteorological Society, 88(375), 90–95, 1962
I love reading older papers. This one was published more than 50 years ago. Sometimes I feel that older papers are written more clearly than modern ones (even though some might disagree). This article studies the appropriate Markov chain order for rainfall data in Tel Aviv, which is similar to what I have been doing throughout my thesis in a different context. The work by Gabriel and Neumann was based on a much smaller and less complex dataset, though, as most of the calculations had to be done by hand. This really demonstrates the advancements in computational capabilities we have made over the last decades. The paper has been valuable to me in several ways. First, it presents the concept of calculating log likelihoods and likelihood ratio tests for Markov chain models in an intuitive way. Second, I have used the data (presented as transition tables) and the corresponding results to test the technical soundness of my methods. I have also used the data and results in one of my blog posts, where I explain how to determine the appropriate Markov chain order.
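The kind of calculation described above can be sketched in a few lines. The example below compares an order 0 model (each day independent) with an order 1 Markov chain (each day depends on the previous one) for a two-state sequence, using a likelihood ratio test. It runs on synthetic data, not on the Tel Aviv transition tables; the state names, the simulated chain and its "stay" probability are assumptions made for illustration. For two states the test statistic is compared against a chi-square distribution with one degree of freedom, whose tail probability can be written with `math.erfc`.

```python
import math
import random
from collections import Counter


def log_likelihood_order0(seq):
    """Maximized log likelihood when every observation is independent."""
    counts = Counter(seq)
    n = len(seq)
    return sum(c * math.log(c / n) for c in counts.values())


def log_likelihood_order1(seq):
    """Maximized log likelihood of a first-order chain, given the first state."""
    transitions = Counter(zip(seq, seq[1:]))
    row_totals = Counter(seq[:-1])
    return sum(
        c * math.log(c / row_totals[prev]) for (prev, _), c in transitions.items()
    )


def likelihood_ratio_test(seq):
    """Test order 0 against order 1 for a two-state sequence.

    Both models are scored on seq[1:] so that the likelihoods are comparable.
    Returns the test statistic and the chi-square p-value (1 degree of freedom).
    """
    ll0 = log_likelihood_order0(seq[1:])
    ll1 = log_likelihood_order1(seq)
    statistic = max(2.0 * (ll1 - ll0), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def simulate_sticky_chain(n, stay_prob, rng):
    """Synthetic two-state chain that keeps its state with probability stay_prob."""
    state = rng.choice(["dry", "wet"])
    seq = [state]
    for _ in range(n - 1):
        if rng.random() >= stay_prob:
            state = "wet" if state == "dry" else "dry"
        seq.append(state)
    return seq


if __name__ == "__main__":
    rng = random.Random(7)
    days = simulate_sticky_chain(2000, stay_prob=0.75, rng=rng)
    statistic, p_value = likelihood_ratio_test(days)
    print(f"statistic = {statistic:.1f}, p-value = {p_value:.3g}")
```

A small p-value means that the order 0 model is rejected in favor of order 1. The same scheme extends to higher orders by conditioning on longer histories, with the degrees of freedom growing accordingly.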
 West, R., and Leskovec, J.,
Human wayfinding in information networks,
21st International Conference on World Wide Web, 2012
This article presents experiments on human navigational data derived from the Wikipedia game Wikispeedia. In my own studies I have analyzed similar datasets (i.e., Wikigame and Wikispeedia) on several occasions, so this paper has been highly relevant related work. Apart from giving me several ideas for my studies, this paper is a great example of how to present scientific work in an intuitive way.
Honorable mention:

C. Davidson-Pilon,
Probabilistic Programming & Bayesian Methods for Hackers,
GitHub, 2014
Apart from the list presented above, I want to mention this "book" about probabilistic programming. It is a freely available book hosted on GitHub that gives insights into Bayesian statistics in Python using the PyMC library. Its content has been really helpful to me, as I have worked with Bayesian statistics on several occasions during my PhD studies. Apart from the high quality content, the book is novel in several ways. First, it is produced collaboratively: even though C. Davidson-Pilon is the main author, everyone can contribute to the project via GitHub. Furthermore, the book is written as IPython notebooks, which allows you to interactively view, run and test the Python code presented in it.