R

Displaying time series with R

Introduction A time series is a sequence of observations registered at consecutive time instants. The visualization of time series is intended to reveal changes of one or more quantitative variables through time, and to display the relationships between the variables and their evolution through time. The standard time series graph displays the time along the horizontal axis. On the other hand, time can be conceived as a grouping or conditioning variable.

An introduction to web scraping: locating Spanish schools

Introduction Whenever a new paper is released using some type of scraped data, most of my peers in the social science community get baffled at how researchers can do this. In fact, many social scientists can’t even think of research questions that can be addressed with this type of data simply because they don’t know it’s even possible. As the old saying goes, when you have a hammer, every problem looks like a nail.

Graph Theory 101 with corruption cases in Spain

According to CIS’ barometer, political corruption is the second biggest concern in Spain, only behind unemployment, and has been in this position since 2013, as we see Spanish news talking about open trials and new investigations on a regular basis. The European Commission estimates that corruption alone costs the EU economy 120 billion of euros per year, just a little less than the annual budget of the European Union.

Herds of statistical models

Big datasets found in statistical practice often have a rich structure. Most traditional methods, including their modern counterparts, fail to efficiently use the information contained in them. Here we propose and discuss an alternative modelling strategy based on herds of simple models. Big Data: How big datasets came to be Data has not always been big. Classical datasets such as the famous Anderson’s iris dataset, were often small. Many of the best known statistical methods do also deal with the problems posed by data scarcity rather than data abundance.

Automated testing with 'testthat' in practice

You test your code. We know you do. How else are you sure that your changes don’t break the program? But after you commit, you discard those pesky scripts and throw away code. Don’t you think it’s a bit of a waste to dump all that effort that took you quite a decent chunk of your day to conjure? Well, here you are, so let’s see another way. A better way.

Spatial Data Analysis with INLA

Introduction In this session I will focus on Bayesian inference using the integrated nested Laplace approximation (INLA) method. As described in Rue et al. (2009), INLA can be used to estimate the posterior marginal distribution of Bayesian hierarchical models. This method is implemented in the INLA package available for the R programming language. Given that the types of models that INLA can fit are quite wide, we will focus on spatial models for the analysis of lattice data.

Strange Attractors: an R experiment about maths, recursivity and creative coding

Learning to code can be quite hard. Apart from the difficulties of learning a new language, following a book can be quite boring. From my point of view, one of the bests ways to become a good programmer is choosing small and funny experiments oriented to train specific techniques of programming. This is what I usually do in my blog Fronkonstin. In this tutorial, we will learn to combine C++ with R to create efficient loops.

Mastering R presentations

Do you want to know how to make elegant and simple reproducible presentations? In this talk, we are going to explain how to do presentations in different output formats using one of the easiest and most exhaustive statistical software, R. Now, it is possible create Beamer, PowerPoint, or HTML presentations, including R code, \(\LaTeX\) equations, graphics, or interactive content. After the tutorial, you will be able to create R presentations on your own with R Markdown in RStudio.

Getting from flat data a world of relationships to visualise with Gephi

Network analysis offers a perspective of the data that broadens and enriches any investigation. Many times we deal with data in which the elements are related, but we have them in a tabulated format that is difficult to import into network analysis tools. Relationship data require a definition of nodes and connections. Both parts have different structures and it is not possible to structure them in a single table, at least two would be needed.

Simple yet elegant Object-Oriented programming in R with S3

The R language is peculiar in many ways, and its approach to object-oriented (OO) programming is just one of them. Indeed, base R supports not one, but three different OO systems: S3, S4 and RC classes. And yet, probably none of them would qualify as a fully-fledged OO system before the astonished gaze of an expert in languages such as Python, C++ or Java. In this tutorial, we will review the S3 system, the simplest yet most elegant of them.