Introduction Whenever a new paper is released using some type of scraped data, most of my peers in the social science community get baffled at how researchers can do this. In fact, many social scientists can’t even think of research questions that can be addressed with this type of data simply because they don’t know it’s even possible. As the old saying goes, when you have a hammer, every problem looks like a nail.
Big datasets found in statistical practice often have a rich structure. Most traditional methods, including their modern counterparts, fail to efficiently use the information contained in them. Here we propose and discuss an alternative modelling strategy based on herds of simple models.
Big Data: How big datasets came to be Data has not always been big. Classical datasets such as the famous Anderson’s iris dataset, were often small. Many of the best known statistical methods do also deal with the problems posed by data scarcity rather than data abundance.
As the title reads, in this heterogeneous session we will see three topics of different interest. The first is a collection of three simple and useful one-function R packages that I use regularly in my coding workflow. The second collects some approaches to handling and performing linear regression with big data. The third brings in the freaky component: it presents tools to display graphical information in plain ASCII, from bivariate contours to messages from Yoda!