library(tidyverse)
library(tidymodels)
In this notebook we will tune a decision tree, as in guided practice 2.
The dataset is the same one used in TP1 and is available at:
https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data
We can load it with the following command:
train <- read_csv("https://raw.githubusercontent.com/data-datum/datasets/main/train_house.csv")
## Parsed with column specification:
## cols(
## .default = col_character(),
## Id = col_double(),
## MSSubClass = col_double(),
## LotFrontage = col_double(),
## LotArea = col_double(),
## OverallQual = col_double(),
## OverallCond = col_double(),
## YearBuilt = col_double(),
## YearRemodAdd = col_double(),
## MasVnrArea = col_double(),
## BsmtFinSF1 = col_double(),
## BsmtFinSF2 = col_double(),
## BsmtUnfSF = col_double(),
## TotalBsmtSF = col_double(),
## `1stFlrSF` = col_double(),
## `2ndFlrSF` = col_double(),
## LowQualFinSF = col_double(),
## GrLivArea = col_double(),
## BsmtFullBath = col_double(),
## BsmtHalfBath = col_double(),
## FullBath = col_double()
## # ... with 18 more columns
## )
## See spec(...) for full column specifications.
The variable to predict is SalePrice.
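
Since SalePrice is a numeric variable, the tree is fitted in regression mode. As a rough sketch of what a tunable decision tree can look like in tidymodels, the block below marks two hyperparameters with `tune()` and builds a regular grid of candidate values. The `rpart` engine, the choice of `cost_complexity` and `tree_depth`, the value `levels = 5`, and the names `tree_spec` and `tree_grid` are assumptions made for illustration, not necessarily the settings used in this notebook.

```r
library(tidymodels)

# Decision tree specification: tune() marks the hyperparameters
# whose values will be chosen later by resampling.
# Assumption: the rpart engine and these two hyperparameters.
tree_spec <- decision_tree(
  cost_complexity = tune(),
  tree_depth = tune()
) %>%
  set_engine("rpart") %>%
  set_mode("regression")  # SalePrice is numeric

# Regular grid: 5 candidate values per hyperparameter (assumed),
# crossed into 5 x 5 = 25 combinations.
tree_grid <- grid_regular(
  cost_complexity(),
  tree_depth(),
  levels = 5
)
```

The specification and the grid do not touch the data yet; they would typically be combined with a formula such as `SalePrice ~ .` and a set of resamples when the tuning is actually run.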