Multiple regression is one of the most widely used methods in statistical modelling. Despite its many benefits, however, it is often used without checking the underlying assumptions, which can produce misleading or even completely wrong results. Applying diagnostics to detect strong violations of the assumptions is therefore important. In the […]
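As a rough illustration of what such diagnostics can look like, here is a minimal sketch in base R. The model and the built-in mtcars dataset are assumptions chosen only for illustration; they are not taken from the exercise set, and the 4/n cut-off for Cook's distance is a common rule of thumb rather than a fixed standard.

```r
# Illustrative model only: mtcars and these predictors are assumptions,
# not the data or model used in the exercises.
fit <- lm(mpg ~ wt + hp, data = mtcars)

res  <- rstandard(fit)        # standardized residuals
lev  <- hatvalues(fit)        # leverage of each observation
cook <- cooks.distance(fit)   # influence of each observation on the fit

# Shapiro-Wilk test on the raw residuals as a quick normality check
shapiro.test(residuals(fit))

# Flag observations whose Cook's distance exceeds the 4/n rule of thumb
influential <- names(cook)[cook > 4 / nrow(mtcars)]
influential

# Optional: the standard diagnostic plots for a fitted lm object
# par(mfrow = c(2, 2)); plot(fit)
```

The numeric checks give a quick first pass; the diagnostic plots are usually the better way to judge how serious a violation is.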

# Exercises (intermediate)

## Multiple Regression (Part 1)

In the exercises below we cover some material on multiple regression in R. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. We will be using the dataset state.x77, which […]
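For readers who have not fitted a multiple regression on state.x77 before, the following is a minimal sketch of one way to start. The choice of response and predictors is an assumption made for illustration and is not necessarily what the exercises ask for.

```r
# state.x77 ships with base R as a matrix, so convert it to a data frame first
states <- as.data.frame(state.x77)

# Column names such as "Life Exp" and "HS Grad" contain spaces;
# make.names() turns them into Life.Exp and HS.Grad
names(states) <- make.names(names(states))

# Illustrative model (an assumption): life expectancy explained by
# murder rate, high-school graduation rate, and days of frost
fit <- lm(Life.Exp ~ Murder + HS.Grad + Frost, data = states)

summary(fit)   # coefficient table, R-squared, F-statistic
```

Converting to a data frame up front keeps the formula interface simple, since lm() expects its data argument to be a data frame or similar.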

## Intermediate Tree 2

This is a continuation of the intermediate decision tree exercise. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Use the predict() command to make predictions on […]

## Working with Shapefiles in R Exercises

R has many powerful libraries to handle spatial data, and the things that R can do with maps can only grow. This exercise tries to demonstrate a few basic functionalities of R while dealing with shapefiles. A shapefile is a simple, nontopological format for storing the geometric location and attribute information of geographic features. Geographic […]

## Intermediate Tree 1

If you followed through the Basic Decision Tree exercise, this should be useful for you. This is like a continuation, but we add so much more. We are working with bigger and badder datasets. We will also be using techniques we learned from model evaluation and work with ROC, accuracy and other metrics. Answers […]

## Descriptive Analytics-Part 5: Data Visualisation (Spatial data)

Descriptive Analytics is the examination of data or content, usually manually performed, to answer the question “What happened?”. In order to be able to solve this set of exercises you should have solved part 0, part 1, part 2, part 3, and part 4 of this series, but you should also run this script which […]

## Model Evaluation 2

We are committed to bringing you 100% authentic exercise sets. We even try to include datasets that are as different as possible to give you an understanding of different problems. No more classifying the Titanic dataset. R has tons of datasets in its library. This is to encourage you to try as many datasets as possible. We will […]

## Descriptive Analytics-Part 5: Data Visualisation (Categorical variables)

Descriptive Analytics is the examination of data or content, usually manually performed, to answer the question “What happened?”. In order to be able to solve this set of exercises you should have solved part 0, part 1, part 2, part 3, and part 4 of this series, but you should also run this script which […]

## R-SQL Exercises

How do you write Structured Query Language (SQL) code in R? Well, there are many packages on CRAN that relate to databases. In the exercises below we cover some of the important data manipulation operations using SQL in R. We will use ‘sqldf’, an R package for running SQL statements on data frames. Answers […]

## Functions exercises vol. 2

[For this exercise, first write down your answer, without using R. Then, check your answer using R.] Answers to the exercises are available here. Exercise 1: Create a function that, given a data frame and a vector, will add the vector (if the vector length matches the number of rows of the data frame) […]