mercedestrenz.train

Module Contents

Functions

train_mercedes_price_prediction_model(data, model_version)

Trains a model to predict the price of a Mercedes-Benz given the year,

make_model(model_type)

Makes a model for Mercedes price prediction

get_random_search_param_grid(model_type)

Gets a random search parameter grid for the Mercedes price prediction model

make_column_transformer(numeric_features, ...)

Makes a column transformer for the Mercedes price prediction model

export_mercedes_price_model(model_pipeline[, version, ...])

Exports the sklearn model pipeline for Mercedes price prediction

mercedestrenz.train.train_mercedes_price_prediction_model(data: pandas.DataFrame, model_version: str, model_type: str = 'gradient_boosting', n_iter: int = 25, cv_results={}, save_model: bool = False, overwrite_version: bool = False)[source]

Trains a model to predict the price of a Mercedes-Benz given the year,

Parameters:
  • data (pd.DataFrame) – The raw used Mercedes data. Must contain the columns model, year, condition, odometer_mi, paint_color, and price_USD.

  • model_version (str) – The version of the model to train and subsequently save.

  • model_type (str, optional) – The type of model to use to train on the data, by default “gradient_boosting”

  • n_iter (int, optional) – Number of randomized search iterations to run during tuning, by default 25

  • cv_results (dict, optional) – An existing dictionary of results to which the new cross-validation results are appended, by default {}

  • save_model (bool, optional) – Whether to save a version of the model, by default False

  • overwrite_version (bool, optional) – Whether to overwrite a saved model that already exists under the same version name, by default False

Returns:

The best performing model and the results of the cross validation.

Return type:

Tuple[model, cv_results]

Raises:

ValueError – If the data does not contain the required columns.
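
The column requirement can be checked up front, before any training starts. The following is a minimal sketch, assuming the required names are exactly those listed for the data parameter; check_required_columns is a hypothetical helper and not part of mercedestrenz.

```python
# Column names taken from the description of the `data` parameter.
REQUIRED_COLUMNS = (
    "model",
    "year",
    "condition",
    "odometer_mi",
    "paint_color",
    "price_USD",
)


def check_required_columns(columns):
    """Raise ValueError if any required column is missing (hypothetical helper)."""
    present = set(columns)
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"data is missing required columns: {missing}")
```

With a DataFrame, the call would be check_required_columns(data.columns). Extra columns are ignored by this sketch.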

Examples

>>> from mercedestrenz.train import train_mercedes_price_prediction_model
>>> model, results = train_mercedes_price_prediction_model(data, "v2", save_model=False)

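
The signature declares cv_results={} as a default. In Python, a default dict is created once and shared by every call that relies on it, so a function that appends to it can carry results over from one call to the next. Whether train_mercedes_price_prediction_model actually mutates its default is not stated here. The sketch below uses a hypothetical stand-in, record_result, with placeholder names and scores, only to show the general behaviour.

```python
def record_result(name, score, cv_results={}):
    """Hypothetical stand-in: store a score in the results dict and return it."""
    cv_results[name] = score
    return cv_results


# Relying on the default: both calls write into the same dict object.
first = record_result("run_1", 1.0)   # placeholder score
second = record_result("run_2", 2.0)  # placeholder score
shared = first is second              # the default dict is reused

# Passing an explicit dict keeps each call's results separate.
run_a = record_result("run_1", 1.0, cv_results={})
run_b = record_result("run_2", 2.0, cv_results={})
```

Passing a fresh dict explicitly, as in the last two calls, avoids depending on how the default is handled.
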
mercedestrenz.train.make_model(model_type: str)[source]

Makes a model for Mercedes price prediction

Parameters:

model_type (str) – The type of model to use

Returns:

A model for Mercedes price prediction

Return type:

Model

mercedestrenz.train.get_random_search_param_grid(model_type: str)[source]

Gets a random search parameter grid for the Mercedes price prediction model

Parameters:

model_type (str) – The type of model to use

Returns:

A random search parameter grid for the Mercedes price prediction model

Return type:

dict
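
The returned dict is presumably consumed by a randomized hyperparameter search, with n_iter controlling how many candidates are drawn. The following is a minimal standard-library sketch of that idea. The parameter names and values are illustrative assumptions, not the grid the library returns, and sample_candidates is a hypothetical helper.

```python
import random

# Illustrative grid only: these keys and values are assumptions.
param_grid = {
    "n_estimators": [100, 200, 400],
    "max_depth": [2, 3, 5],
    "learning_rate": [0.01, 0.05, 0.1],
}


def sample_candidates(grid, n_iter, seed=0):
    """Draw n_iter random parameter combinations, picking one value per key."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_iter)
    ]


candidates = sample_candidates(param_grid, n_iter=25)
```

This sketch samples with replacement, so the same combination can appear more than once. A library implementation may deduplicate candidates or accept distributions instead of lists.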

mercedestrenz.train.make_column_transformer(numeric_features, ordinal_features, categorical_features)[source]

Makes a column transformer for the Mercedes price prediction model

Parameters:
  • numeric_features (list) – List of numeric features to include in the model

  • ordinal_features (list) – List of ordinal features to include in the model

  • categorical_features (list) – List of categorical features to include in the model

Returns:

A column transformer for the Mercedes price prediction model

Return type:

ColumnTransformer

mercedestrenz.train.export_mercedes_price_model(model_pipeline, version='v1', overwrite=False)[source]

Exports the sklearn model pipeline for Mercedes price prediction

Parameters:
  • model_pipeline (Pipeline) – sklearn pipeline with the model and preprocessing steps

  • version (str, optional) – The tag to save the model version under, by default “v1”