Tidymodels Walkthrough

PUBLISHED ON JAN 26, 2020

The point of this post is to explore some of the packages and capabilities of the {tidymodels} universe, including:

  • rsample
  • recipes
  • dials
  • parsnip
  • tune
  • yardstick

Together, these packages can help streamline the modeling process. To be clear, the goal here is not to build the best model(s) possible for the given data, so I’m skipping over a bunch of the exploration I would normally do before modeling to get right into the modeling steps.

To illustrate this framework, I’m going to use the House Price dataset from Kaggle, and our goal is going to be to build models that predict the sale price of a house given a bunch of other data we have (lot size, number of bedrooms, etc.).

First, I’ll read in the data and load the packages. Note that you’ll want to install {tune} from GitHub (see the commented-out line in the code below) because it’s not on CRAN yet.

knitr::opts_chunk$set(echo = TRUE, warning = FALSE)

set.seed(0408)

#to install the tune package, run:
#devtools::install_github("tidymodels/tune")

library(tidyverse)
## -- Attaching packages ------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ---------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(tidymodels)
## -- Attaching packages ------------------------------------------------------------------------------------------------------------------ tidymodels 0.0.3 --
## v broom     0.5.3     v recipes   0.1.9
## v dials     0.0.4     v rsample   0.0.5
## v infer     0.5.1     v yardstick 0.0.5
## v parsnip   0.0.5
## -- Conflicts --------------------------------------------------------------------------------------------------------------------- tidymodels_conflicts() --
## x scales::discard()   masks purrr::discard()
## x dplyr::filter()     masks stats::filter()
## x recipes::fixed()    masks stringr::fixed()
## x dplyr::lag()        masks stats::lag()
## x dials::margin()     masks ggplot2::margin()
## x yardstick::spec()   masks readr::spec()
## x recipes::step()     masks stats::step()
## x recipes::yj_trans() masks scales::yj_trans()
library(tune)

train <- read_csv("~/Data/Kaggle/House Prices/Data/train.csv") %>%
  clean_names()
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   Id = col_double(),
##   MSSubClass = col_double(),
##   LotFrontage = col_double(),
##   LotArea = col_double(),
##   OverallQual = col_double(),
##   OverallCond = col_double(),
##   YearBuilt = col_double(),
##   YearRemodAdd = col_double(),
##   MasVnrArea = col_double(),
##   BsmtFinSF1 = col_double(),
##   BsmtFinSF2 = col_double(),
##   BsmtUnfSF = col_double(),
##   TotalBsmtSF = col_double(),
##   `1stFlrSF` = col_double(),
##   `2ndFlrSF` = col_double(),
##   LowQualFinSF = col_double(),
##   GrLivArea = col_double(),
##   BsmtFullBath = col_double(),
##   BsmtHalfBath = col_double(),
##   FullBath = col_double()
##   # ... with 18 more columns
## )
## See spec(...) for full column specifications.
test <- read_csv("~/Data/Kaggle/House Prices/Data/test.csv") %>%
  clean_names()
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   Id = col_double(),
##   MSSubClass = col_double(),
##   LotFrontage = col_double(),
##   LotArea = col_double(),
##   OverallQual = col_double(),
##   OverallCond = col_double(),
##   YearBuilt = col_double(),
##   YearRemodAdd = col_double(),
##   MasVnrArea = col_double(),
##   BsmtFinSF1 = col_double(),
##   BsmtFinSF2 = col_double(),
##   BsmtUnfSF = col_double(),
##   TotalBsmtSF = col_double(),
##   `1stFlrSF` = col_double(),
##   `2ndFlrSF` = col_double(),
##   LowQualFinSF = col_double(),
##   GrLivArea = col_double(),
##   BsmtFullBath = col_double(),
##   BsmtHalfBath = col_double(),
##   FullBath = col_double()
##   # ... with 17 more columns
## )
## See spec(...) for full column specifications.
#plot the percentage of missing values in each column of a dataframe
#note that miss_tab is assigned with <<- so the summary table sticks around
#in the global environment after plotting
miss_plot <- function(data, color1 = "steelblue1", color2 = "steelblue4", bound = 0) {
  miss_tab <<- tibble(
    column = names(data),
    perc_miss = map_dbl(data, function(x) sum(is.na(x))/length(x))
  ) %>%
    filter(perc_miss > bound) #only keep columns with missingness above the bound
  
  ggplot(miss_tab, aes(x = column, y = perc_miss)) +
    geom_bar(stat = "identity", aes(fill = ..y..)) +
    scale_y_continuous(labels = scales::percent) +
    theme(axis.text.x = element_text(angle = 60, hjust = 1)) + 
    scale_fill_gradient(low = color1, high = color2, name = "Percent Missing") +
    labs(
      title = "Missingness by Variable",
      y = "Percent Missing",
      x = "Variables"
    )
}

Like I mentioned before, I’m going to skip over most of the EDA I’d normally do to really understand the data I’m working with. But I’ll still take a quick gander at the missingness.

glimpse(train)
## Observations: 1,460
## Variables: 81
## $ id              <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ ms_sub_class    <dbl> 60, 20, 60, 70, 60, 50, 20, 60, 50, 190, 20, 60, 20...
## $ ms_zoning       <chr> "RL", "RL", "RL", "RL", "RL", "RL", "RL", "RL", "RM...
## $ lot_frontage    <dbl> 65, 80, 68, 60, 84, 85, 75, NA, 51, 50, 70, 85, NA,...
## $ lot_area        <dbl> 8450, 9600, 11250, 9550, 14260, 14115, 10084, 10382...
## $ street          <chr> "Pave", "Pave", "Pave", "Pave", "Pave", "Pave", "Pa...
## $ alley           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ lot_shape       <chr> "Reg", "Reg", "IR1", "IR1", "IR1", "IR1", "Reg", "I...
## $ land_contour    <chr> "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "L...
## $ utilities       <chr> "AllPub", "AllPub", "AllPub", "AllPub", "AllPub", "...
## $ lot_config      <chr> "Inside", "FR2", "Inside", "Corner", "FR2", "Inside...
## $ land_slope      <chr> "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "G...
## $ neighborhood    <chr> "CollgCr", "Veenker", "CollgCr", "Crawfor", "NoRidg...
## $ condition1      <chr> "Norm", "Feedr", "Norm", "Norm", "Norm", "Norm", "N...
## $ condition2      <chr> "Norm", "Norm", "Norm", "Norm", "Norm", "Norm", "No...
## $ bldg_type       <chr> "1Fam", "1Fam", "1Fam", "1Fam", "1Fam", "1Fam", "1F...
## $ house_style     <chr> "2Story", "1Story", "2Story", "2Story", "2Story", "...
## $ overall_qual    <dbl> 7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5, 9, 5, 7, 6, 7, 6, ...
## $ overall_cond    <dbl> 5, 8, 5, 5, 5, 5, 5, 6, 5, 6, 5, 5, 6, 5, 5, 8, 7, ...
## $ year_built      <dbl> 2003, 1976, 2001, 1915, 2000, 1993, 2004, 1973, 193...
## $ year_remod_add  <dbl> 2003, 1976, 2002, 1970, 2000, 1995, 2005, 1973, 195...
## $ roof_style      <chr> "Gable", "Gable", "Gable", "Gable", "Gable", "Gable...
## $ roof_matl       <chr> "CompShg", "CompShg", "CompShg", "CompShg", "CompSh...
## $ exterior1st     <chr> "VinylSd", "MetalSd", "VinylSd", "Wd Sdng", "VinylS...
## $ exterior2nd     <chr> "VinylSd", "MetalSd", "VinylSd", "Wd Shng", "VinylS...
## $ mas_vnr_type    <chr> "BrkFace", "None", "BrkFace", "None", "BrkFace", "N...
## $ mas_vnr_area    <dbl> 196, 0, 162, 0, 350, 0, 186, 240, 0, 0, 0, 286, 0, ...
## $ exter_qual      <chr> "Gd", "TA", "Gd", "TA", "Gd", "TA", "Gd", "TA", "TA...
## $ exter_cond      <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA...
## $ foundation      <chr> "PConc", "CBlock", "PConc", "BrkTil", "PConc", "Woo...
## $ bsmt_qual       <chr> "Gd", "Gd", "Gd", "TA", "Gd", "Gd", "Ex", "Gd", "TA...
## $ bsmt_cond       <chr> "TA", "TA", "TA", "Gd", "TA", "TA", "TA", "TA", "TA...
## $ bsmt_exposure   <chr> "No", "Gd", "Mn", "No", "Av", "No", "Av", "Mn", "No...
## $ bsmt_fin_type1  <chr> "GLQ", "ALQ", "GLQ", "ALQ", "GLQ", "GLQ", "GLQ", "A...
## $ bsmt_fin_sf1    <dbl> 706, 978, 486, 216, 655, 732, 1369, 859, 0, 851, 90...
## $ bsmt_fin_type2  <chr> "Unf", "Unf", "Unf", "Unf", "Unf", "Unf", "Unf", "B...
## $ bsmt_fin_sf2    <dbl> 0, 0, 0, 0, 0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ bsmt_unf_sf     <dbl> 150, 284, 434, 540, 490, 64, 317, 216, 952, 140, 13...
## $ total_bsmt_sf   <dbl> 856, 1262, 920, 756, 1145, 796, 1686, 1107, 952, 99...
## $ heating         <chr> "GasA", "GasA", "GasA", "GasA", "GasA", "GasA", "Ga...
## $ heating_qc      <chr> "Ex", "Ex", "Ex", "Gd", "Ex", "Ex", "Ex", "Ex", "Gd...
## $ central_air     <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "...
## $ electrical      <chr> "SBrkr", "SBrkr", "SBrkr", "SBrkr", "SBrkr", "SBrkr...
## $ x1st_flr_sf     <dbl> 856, 1262, 920, 961, 1145, 796, 1694, 1107, 1022, 1...
## $ x2nd_flr_sf     <dbl> 854, 0, 866, 756, 1053, 566, 0, 983, 752, 0, 0, 114...
## $ low_qual_fin_sf <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ gr_liv_area     <dbl> 1710, 1262, 1786, 1717, 2198, 1362, 1694, 2090, 177...
## $ bsmt_full_bath  <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, ...
## $ bsmt_half_bath  <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ full_bath       <dbl> 2, 2, 2, 1, 2, 1, 2, 2, 2, 1, 1, 3, 1, 2, 1, 1, 1, ...
## $ half_bath       <dbl> 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, ...
## $ bedroom_abv_gr  <dbl> 3, 3, 3, 3, 4, 1, 3, 3, 2, 2, 3, 4, 2, 3, 2, 2, 2, ...
## $ kitchen_abv_gr  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, ...
## $ kitchen_qual    <chr> "Gd", "TA", "Gd", "Gd", "Gd", "TA", "Gd", "TA", "TA...
## $ tot_rms_abv_grd <dbl> 8, 6, 6, 7, 9, 5, 7, 7, 8, 5, 5, 11, 4, 7, 5, 5, 5,...
## $ functional      <chr> "Typ", "Typ", "Typ", "Typ", "Typ", "Typ", "Typ", "T...
## $ fireplaces      <dbl> 0, 1, 1, 1, 1, 0, 1, 2, 2, 2, 0, 2, 0, 1, 1, 0, 1, ...
## $ fireplace_qu    <chr> NA, "TA", "TA", "Gd", "TA", NA, "Gd", "TA", "TA", "...
## $ garage_type     <chr> "Attchd", "Attchd", "Attchd", "Detchd", "Attchd", "...
## $ garage_yr_blt   <dbl> 2003, 1976, 2001, 1998, 2000, 1993, 2004, 1973, 193...
## $ garage_finish   <chr> "RFn", "RFn", "RFn", "Unf", "RFn", "Unf", "RFn", "R...
## $ garage_cars     <dbl> 2, 2, 2, 3, 3, 2, 2, 2, 2, 1, 1, 3, 1, 3, 1, 2, 2, ...
## $ garage_area     <dbl> 548, 460, 608, 642, 836, 480, 636, 484, 468, 205, 3...
## $ garage_qual     <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "Fa...
## $ garage_cond     <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA...
## $ paved_drive     <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "...
## $ wood_deck_sf    <dbl> 0, 298, 0, 0, 192, 40, 255, 235, 90, 0, 0, 147, 140...
## $ open_porch_sf   <dbl> 61, 0, 42, 35, 84, 30, 57, 204, 0, 4, 0, 21, 0, 33,...
## $ enclosed_porch  <dbl> 0, 0, 0, 272, 0, 0, 0, 228, 205, 0, 0, 0, 0, 0, 176...
## $ x3ssn_porch     <dbl> 0, 0, 0, 0, 0, 320, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ screen_porch    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 176, 0, 0, 0, 0...
## $ pool_area       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ pool_qc         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ fence           <chr> NA, NA, NA, NA, NA, "MnPrv", NA, NA, NA, NA, NA, NA...
## $ misc_feature    <chr> NA, NA, NA, NA, NA, "Shed", NA, "Shed", NA, NA, NA,...
## $ misc_val        <dbl> 0, 0, 0, 0, 0, 700, 0, 350, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ mo_sold         <dbl> 2, 5, 9, 2, 12, 10, 8, 11, 4, 1, 2, 7, 9, 8, 5, 7, ...
## $ yr_sold         <dbl> 2008, 2007, 2008, 2006, 2008, 2009, 2007, 2009, 200...
## $ sale_type       <chr> "WD", "WD", "WD", "WD", "WD", "WD", "WD", "WD", "WD...
## $ sale_condition  <chr> "Normal", "Normal", "Normal", "Abnorml", "Normal", ...
## $ sale_price      <dbl> 208500, 181500, 223500, 140000, 250000, 143000, 307...
#let's also take a glimpse at missing data real quick
miss_plot(train)

#this will just replace NA with "NA" in our data
train <- train %>%
  mutate_if(is.character, ~replace_na(., "NA"))

test <- test %>%
  mutate_if(is.character, ~replace_na(., "NA"))

Setting up preprocessing using {recipes}

We’ll implement a number of preprocessing steps here. These are somewhat generic because we didn’t do an EDA of our data. Again, the point of this is to get a feel for a tidymodels workflow rather than to build a really great model for this data. In these steps, we will:

  1. Convert all strings to factors
  2. Pool infrequent factors into an “other” category
  3. Remove near-zero variance predictors
  4. Impute missing values using k nearest neighbors
  5. Dummy out all factors
  6. Log transform our outcome (which is skewed)
  7. Mean center all numeric predictors (which will be all of them at this point)
  8. Scale all numeric predictors to unit variance

Note that after specifying all of our steps, we use the prep() function to estimate (train) those steps on our data and return a prepped recipe. To get the preprocessed training data back out as a tibble, we use the juice() function; to apply the same trained steps to new data (like our test set), we use bake().

preprocess_recipe <- train %>%
  select(-id) %>%
  recipe(sale_price ~ .) %>%
  step_string2factor(all_nominal()) %>% #this converts all of our strings to factors
  step_other(all_nominal(), threshold = .05) %>% #this will pool infrequent factors into an "other" category
  step_nzv(all_predictors()) %>% #this will remove zero or near-zero variance predictors
  step_knnimpute(all_predictors(), neighbors = 5) %>% #this will impute values for predictors using KNN
  step_dummy(all_nominal()) %>% #this will dummy out all factor variables
  step_log(all_outcomes()) %>% #log transforming the outcome because it's skewed
  step_center(all_numeric(), -all_outcomes()) %>% #this will mean-center all of our numeric data
  step_scale(all_numeric(), -all_outcomes()) %>% #this will scale numeric data to unit variance
  prep()

preprocess_recipe
## Data Recipe
## 
## Inputs:
## 
##       role #variables
##    outcome          1
##  predictor         79
## 
## Training data contained 1460 data points and 339 incomplete rows. 
## 
## Operations:
## 
## Factor variables from ms_zoning, street, alley, ... [trained]
## Collapsing factor levels for ms_zoning, street, alley, ... [trained]
## Sparse, unbalanced variable filter removed street, utilities, ... [trained]
## K-nearest neighbor imputation for ms_zoning, lot_frontage, lot_area, ... [trained]
## Dummy variables from ms_zoning, alley, lot_shape, land_contour, ... [trained]
## Log transformation on sale_price [trained]
## Centering for ms_sub_class, lot_frontage, ... [trained]
## Scaling for ms_sub_class, lot_frontage, ... [trained]
train_prep <- juice(preprocess_recipe)

test_prep <- preprocess_recipe %>%
  bake(test)
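
As a quick sanity check (just a sketch, assuming the recipe ran as shown), we can confirm that the processed training and test sets line up on their columns:

dim(train_prep)
dim(test_prep)

#the only difference should be the outcome, which isn't in the test data
setdiff(names(train_prep), names(test_prep))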

Cross Validation using {rsample}

Next, we’ll want to create folds we can use for cross validation in our modeling. We can create however many folds we want, but we’ll create 5 here using our preprocessed training data.

train_cv <- train_prep %>%
  vfold_cv(v = 5)
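
Each row of train_cv holds one split. If you want to peek inside a fold, {rsample}’s analysis() and assessment() functions pull out the two pieces of a split (a quick sketch):

#grab the first fold; analysis() returns the ~4/5 used for fitting,
#assessment() returns the ~1/5 held out
first_fold <- train_cv$splits[[1]]
nrow(analysis(first_fold))
nrow(assessment(first_fold))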

So, that one is pretty straightforward.

Model Specifications using {parsnip}

Next, we’ll specify the models we’re going to run on this data. For illustration, we’ll run an elastic net model and a random forest model. We’ll start with the elastic net.

glmnet_mod <- linear_reg(
  mode = "regression",
  penalty = tune(),
  mixture = tune()
) %>%
  set_engine("glmnet")

And next the random forest model.

rf_mod <- rand_forest(
  mode = "regression",
  trees = 500,
  mtry = tune(),
  min_n = tune()
) %>%
  set_engine("ranger")

Specifying model parameters using {dials}

We set some of the parameters in the previous models to have values of tune(). This allows us to vary those parameter values to find the combinations that create the best model. To implement this, we’ll need to set up a grid of hyperparameter values to test, and we can do this using the {dials} package.

As a reminder, an elastic net model takes two tuning parameters: the penalty (the lambda value), which describes the total amount of penalty to apply to coefficients, and the mixture (the alpha value), which describes the proportion of the penalty that is L1.

glmnet_hypers <- parameters(penalty(), mixture()) %>%
  grid_max_entropy(size = 20)
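
grid_max_entropy() spreads the 20 candidate points out to cover the parameter space. If you’d rather test a regular grid, {dials} also provides grid_regular() (a sketch of the alternative, not run here):

#a regular grid over the same two parameters:
#5 penalty values x 4 mixture values = 20 candidates
glmnet_hypers_regular <- parameters(penalty(), mixture()) %>%
  grid_regular(levels = c(5, 4))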

For our random forest model, we also have two tuning parameters: mtry, which represents the number of predictors to be randomly sampled at each tree split, and min_n, which represents the smallest number of observations required to split a node further.

rf_hypers <- parameters(mtry(c(5, floor(ncol(train_prep)/3))), min_n()) %>% #we could specify a different mtry range, but this seems reasonable
  grid_random(size = 20)
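
Note that we had to supply a range for mtry() by hand because its upper bound depends on the data. {dials} can also fill that bound in for us with finalize() (a sketch of the alternative):

#finalize() sets mtry()'s unknown upper bound from the number of predictors
mtry_final <- finalize(mtry(), train_prep %>% select(-sale_price))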

Setting up a tuning grid with {tune}

Ok, now we have our data preprocessed, our models specified, our splits set, and our hyperparameter grids defined. We can now tune each model type using the tune_grid() function.

Let’s start with the elastic net model.

glmnet_tuned_results <- tune_grid(
  formula = sale_price ~ .,
  model = glmnet_mod,
  resamples = train_cv,
  grid = glmnet_hypers,
  metrics = metric_set(mae), #we'll just use mae here, although there are other metrics we can choose for regression
  control = control_grid()
)
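
Before picking a winner, it can be useful to look at all of the resampled results; tune’s collect_metrics() stacks the cross-validated summaries into a single tibble:

#one row per candidate hyperparameter combination,
#with MAE averaged across the 5 folds (best first)
glmnet_tuned_results %>%
  collect_metrics() %>%
  arrange(mean)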

We can then use the show_best() function to see which hyperparameters gave us the best model. From there, we can choose the simplest model that’s within one standard error of the numerically best one. For an elastic net, “simplest” means the model with the most penalty (within one SE).

glmnet_tuned_results %>%
  show_best(maximize = FALSE) #we need to remember to set maximize to false here because we want to minimize MAE.
## # A tibble: 5 x 7
##       penalty mixture .metric .estimator   mean     n std_err
##         <dbl>   <dbl> <chr>   <chr>       <dbl> <int>   <dbl>
## 1 0.00370      0.700  mae     standard   0.0934     5 0.00274
## 2 0.00212      0.996  mae     standard   0.0935     5 0.00282
## 3 0.00363      0.0547 mae     standard   0.0955     5 0.00308
## 4 0.000955     0.329  mae     standard   0.0957     5 0.00312
## 5 0.000000344  0.0461 mae     standard   0.0966     5 0.00326
best_glmnet_params <- glmnet_tuned_results %>%
  select_by_one_std_err(metric = "mae", maximize = FALSE, (penalty)) %>%
  select(penalty, mixture)

Now we can finalize our elastic net model.

glmnet_final_mod <- glmnet_mod %>%
  finalize_model(parameters = best_glmnet_params) %>%
  fit(sale_price ~ ., data = train_prep)

And then let’s move on to the random forest model. First we’ll tune it.

rf_tuned_results <- tune_grid(
  formula = sale_price ~ .,
  model = rf_mod,
  resamples = train_cv,
  grid = rf_hypers,
  metrics = metric_set(mae),
  control = control_grid()
)
## i Creating pre-processing data to finalize unknown parameter: mtry

And then we can find the best model. We’ll just use the select_best() function here rather than select_by_one_std_err().

rf_tuned_results %>%
  show_best(maximize = FALSE)
## # A tibble: 5 x 7
##    mtry min_n .metric .estimator   mean     n std_err
##   <int> <int> <chr>   <chr>       <dbl> <int>   <dbl>
## 1    35     8 mae     standard   0.0922     5 0.00249
## 2    36     3 mae     standard   0.0924     5 0.00228
## 3    35    13 mae     standard   0.0927     5 0.00263
## 4    25    11 mae     standard   0.0928     5 0.00247
## 5    34    15 mae     standard   0.0934     5 0.00240
best_rf_params <- rf_tuned_results %>%
  select_best(maximize = FALSE)

Now that we have these, we can finalize our random forest model.

rf_final_mod <- rf_mod %>%
  finalize_model(best_rf_params) %>%
  fit(sale_price ~ ., data = train_prep)

One other point to make: looking at the results (the mean column in the tibble produced by show_best()), we can see that the elastic net model and the random forest model perform somewhat similarly. If they really are that close in accuracy, we’d probably just want to use the elastic net model, since it’s easier to interpret. But for this post, we’ll still make predictions using both.
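
To put that comparison side by side, we can stack the top cross-validated result from each model (a quick sketch; the columns match the show_best() output above):

#the single best CV estimate from each model
bind_rows(
  glmnet_tuned_results %>% show_best(maximize = FALSE) %>% slice(1) %>% mutate(model = "glmnet"),
  rf_tuned_results %>% show_best(maximize = FALSE) %>% slice(1) %>% mutate(model = "ranger")
) %>%
  select(model, .metric, mean, std_err)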

Predicting on new data

We have a test set that we haven’t touched yet. Now we can make predictions on it using our final elastic net model and our final random forest model.

glmnet_price <- exp(predict(glmnet_final_mod, new_data = test_prep)) %>%
  as_vector()

glmnet_preds <- tibble(
  Id = test$id,
  SalePrice = glmnet_price
)

rf_price <- exp(predict(rf_final_mod, new_data = test_prep)) %>%
  as_vector()

rf_preds <- tibble(
  Id = test$id,
  SalePrice = rf_price
)

And now we’re done! Since this data is part of a Kaggle competition, we could submit our predictions by writing them to a csv and uploading it.
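
That last step would look something like this (the file names are just placeholders):

#write the predictions in Kaggle's expected Id/SalePrice format
write_csv(glmnet_preds, "glmnet_submission.csv")
write_csv(rf_preds, "rf_submission.csv")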

I’m sure there’s a lot more we can do with these packages, and I’m excited to explore them more in the future.