The goal of quickinspectR is to make it easier (and quicker) for beginner R programmers to graphically inspect their data. The package is centered around an expanding collection of simple inspect functions that help users visualize their data.
Currently, this package supports:
- inspect_normality: graphically inspect the distribution of numeric variables in your data.
- inspect_balance: graphically inspect the class balance (or lack thereof) in your data.
- inspect_missing: graphically inspect missingness in your data.
Installation
You can install the development version of quickinspectR from GitHub with:
# install.packages("devtools")
devtools::install_github("andrewfullerton/quickinspectR")
Examples
By design, all quickinspectR functions require only one argument: a data frame or a tibble (data). Unless otherwise specified, functions will display all the relevant variables contained in the data frame and default to easy-to-read plot styling.
To get started, load the package.
library(quickinspectR)
If you want to see how the numeric variables in your data are distributed (e.g. check if they are skewed), you can use inspect_normality.
inspect_normality(data = iris)
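As a rough numeric complement to the plot, the sketch below uses only base R (it is not part of quickinspectR) to compare the mean and median of each numeric column in iris. A large gap between the two can hint at skew, though it is only a heuristic. The helper name mean_median_gap is made up for this illustration.

```r
# Base R only; mean_median_gap is a hypothetical helper, not a quickinspectR function.
mean_median_gap <- function(data) {
  # Keep only the numeric columns of the data frame
  numeric_cols <- data[vapply(data, is.numeric, logical(1))]
  # For each numeric column, return mean minus median
  # (a clearly positive value often goes with right skew, a negative one with left skew)
  vapply(
    numeric_cols,
    function(x) mean(x, na.rm = TRUE) - median(x, na.rm = TRUE),
    numeric(1)
  )
}

gaps <- mean_median_gap(iris)
print(round(gaps, 2))
```

Columns whose gap is far from zero are good candidates for a closer look in the inspect_normality plot.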
If you want to check for class imbalance (i.e. an unequal distribution of classes within categorical variables), you can use inspect_balance. Tip: if you’re only interested in a few key variables in your data, you can use the vars argument to manually specify which variables will be displayed.
inspect_balance(data = palmerpenguins::penguins, vars = c("species", "island"))
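If you also want the numbers behind a class-balance plot, a minimal base R sketch (independent of quickinspectR) is to tabulate a categorical column. The example uses iris rather than penguins so that it runs without extra packages; the variable names class_counts and class_shares are chosen here for illustration.

```r
# Base R only; uses iris so no extra packages are needed.
class_counts <- table(iris$Species)       # number of rows in each class
class_shares <- prop.table(class_counts)  # proportion of rows in each class

print(class_counts)
print(round(class_shares, 3))
```

Roughly equal shares indicate a balanced variable; one class dominating the others indicates imbalance.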
If you want to quickly see whether you’re missing any data (and, more importantly, where you’re missing it), you can use inspect_missing.
inspect_missing(data = palmerpenguins::penguins)
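For a quick numeric cross-check of a missingness plot, the following base R sketch (not a quickinspectR function) counts missing values per column. It uses airquality from the datasets package, so it runs without palmerpenguins installed; missing_counts is just an illustrative name.

```r
# Base R only: count NA values in each column of airquality.
missing_counts <- colSums(is.na(airquality))

# Show only the columns that actually contain missing values
print(missing_counts[missing_counts > 0])
```

In airquality, only the Ozone and Solar.R columns contain missing values, so those are the columns you would expect to stand out in the plot.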
Even though all inspect functions will run with data as their sole argument, there may be times when you want to stylize your plots a bit more. To make this more accessible, each inspect function contains several additional (but completely optional) arguments to enable basic plot customization. Here’s an example using inspect_missing:
inspect_missing(data = palmerpenguins::penguins,
                na_colour = "purple", # Change colour used to represent missing values
                fill_colour = "darkgreen", # Change colour used to represent non-missing values
                title = "Missing values in penguins dataset") # Add title
While advanced plot customization is not the goal of quickinspectR, all of its functions are built on top of ggplot2 and support more advanced customization via additional arguments. To learn more, see the “Advanced usage” section of this vignette.
Statement of data usage: This package and its documentation make use of the iris and airquality datasets from the datasets package as well as the penguins dataset from the palmerpenguins package.