Introduction
Whenever a new paper based on some type of scraped data is released, most of my peers in the social science community are baffled by how researchers can do this. In fact, many social scientists can’t even think of research questions that could be addressed with this type of data, simply because they don’t know it is possible. As the old saying goes, when you have a hammer, every problem looks like a nail.

According to the CIS barometer, political corruption is the second biggest concern in Spain, behind only unemployment, and has held this position since 2013, with Spanish news reporting on open trials and new investigations on a regular basis. The European Commission estimates that corruption alone costs the EU economy 120 billion euros per year, just a little less than the annual budget of the European Union.

Big datasets found in statistical practice often have a rich structure. Most traditional methods, including their modern counterparts, fail to efficiently use the information contained in them. Here we propose and discuss an alternative modelling strategy based on herds of simple models.
Big Data: How big datasets came to be
Data has not always been big. Classical datasets, such as Anderson’s famous iris dataset, were often small. Many of the best-known statistical methods also deal with the problems posed by data scarcity rather than data abundance.

You test your code. We know you do. How else can you be sure that your changes don’t break the program? But after you commit, you discard those pesky scripts and throw the code away. Don’t you think it’s a bit of a waste to dump all that effort, which took a decent chunk of your day to conjure? Well, here you are, so let’s see another way. A better way.

Introduction
In this session I will focus on Bayesian inference using the integrated nested Laplace approximation (INLA) method. As described in Rue et al. (2009), INLA can be used to estimate the posterior marginal distributions of Bayesian hierarchical models. This method is implemented in the INLA package available for the R programming language. Because INLA can fit a wide range of models, we will focus on spatial models for the analysis of lattice data.

Learning to code can be quite hard. Apart from the difficulties of learning a new language, following a book can be quite boring. From my point of view, one of the best ways to become a good programmer is to choose small, fun experiments designed to practice specific programming techniques. This is what I usually do on my blog Fronkonstin. In this tutorial, we will learn to combine C++ with R to create efficient loops.

Do you want to know how to make elegant and simple reproducible presentations? In this talk, we are going to explain how to produce presentations in different output formats using one of the easiest and most comprehensive statistical software environments, R. It is now possible to create Beamer, PowerPoint, or HTML presentations that include R code, \(\LaTeX\) equations, graphics, or interactive content.
After the tutorial, you will be able to create R presentations on your own with R Markdown in RStudio.

Network analysis offers a perspective on the data that broadens and enriches any investigation. We often deal with data in which the elements are related, but we have them in a tabulated format that is difficult to import into network analysis tools.
Relationship data require a definition of nodes and connections. The two parts have different structures and cannot be stored in a single table; at least two are needed.
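The two-table idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the author/paper rows, the `build_network_tables` helper, and the column names `id`, `label`, `source`, `target` and `weight` are all hypothetical, and it is written in plain Python rather than with any particular network analysis tool. It turns one flat table into a node table and an edge table, linking two people whenever they share a group.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tabulated data: one row per (person, group) pair,
# e.g. authors and the papers they signed.
rows = [
    ("Ana", "Paper A"),
    ("Luis", "Paper A"),
    ("Ana", "Paper B"),
    ("Marta", "Paper B"),
    ("Luis", "Paper C"),
]


def build_network_tables(rows):
    """Split flat (person, group) rows into a node table and an edge table.

    Two people are connected when they appear in the same group; the edge
    weight counts how many groups they share.
    """
    # Collect the members of each group.
    members = defaultdict(set)
    for person, group in rows:
        members[group].add(person)

    # Node table: one row per distinct person, with a numeric id.
    labels = sorted({person for person, _ in rows})
    node_id = {label: i for i, label in enumerate(labels, start=1)}
    nodes = [{"id": node_id[label], "label": label} for label in labels]

    # Edge table: one row per pair of people sharing at least one group.
    weights = defaultdict(int)
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            weights[(node_id[a], node_id[b])] += 1
    edges = [
        {"source": source, "target": target, "weight": weight}
        for (source, target), weight in sorted(weights.items())
    ]
    return nodes, edges


nodes, edges = build_network_tables(rows)
```

The node table carries attributes of the elements, while the edge table only refers to them by id, which is why the two cannot be merged into one table without repeating or losing information.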

The R language is peculiar in many ways, and its approach to object-oriented (OO) programming is just one of them. Indeed, base R supports not one, but three different OO systems: S3, S4 and RC classes. And yet, probably none of them would qualify as a fully-fledged OO system before the astonished gaze of an expert in languages such as Python, C++ or Java. In this tutorial, we will review the S3 system, the simplest yet most elegant of them.

Stan is a probabilistic programming language for specifying statistical models. Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan can be called from R using the rstan package, and from Python using the pystan package.

Licensed under CC BY-NC-SA 4.0.