Example usage

To use mercedestrenz in a project:

import mercedestrenz
print(mercedestrenz.__version__)
0.3.1

Below is a basic example of how to use each of the four functions included in this package.

# Load all required package functions
from mercedestrenz.data import load_sample_mercedes_listings, listing_search
from mercedestrenz.train import train_mercedes_price_prediction_model
from mercedestrenz.predict import predict_mercedes_price
from mercedestrenz.visualizations import plot_mercedes_price

1. load_sample_mercedes_listings() - a function for loading sample data

The package contains a static dataset of Craigslist used-car listings that were previously web scraped. The dataset includes several key attributes of each used car, such as price, model, condition, odometer reading, VIN, region and transmission.

# Load the sample mercedes listings data into a dataframe
data = load_sample_mercedes_listings()
data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8553 entries, 0 to 8552
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   price_USD      8553 non-null   int64  
 1   condition      8553 non-null   object 
 2   paint_color    6820 non-null   object 
 3   model          8553 non-null   object 
 4   odometer_mi    8403 non-null   float64
 5   year           8553 non-null   int64  
 6   num_cylinders  4753 non-null   object 
 7   fuel           8547 non-null   object 
 8   transmission   8522 non-null   object 
 9   drive          5042 non-null   object 
 10  size           3013 non-null   object 
 11  type           7564 non-null   object 
 12  state          8553 non-null   object 
 13  VIN            5545 non-null   object 
 14  title_status   8458 non-null   object 
 15  description    8553 non-null   object 
dtypes: float64(1), int64(2), object(13)
memory usage: 1.1+ MB

2. listing_search() - a function for searching listings that match the expected budget range and sorting them by a desired feature

Use this function to search for the listings that match the user’s expected price range.

  • The results can be filtered by the optional model argument. By default all models are shown, but the user can narrow the search to only the models of interest.

  • The results are sorted by ascending price and by the feature named in the sort_feature parameter. By default sort_feature is odometer_mi, so lower-mileage listings come first, but the user can choose another numeric attribute.
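
The filtering and sorting described above can be sketched with plain pandas. This is a minimal illustration, not the package's implementation: budget_filter_sketch and the toy rows are hypothetical, and the sort order (chosen feature first, ties broken by ascending price) is an assumption inferred from the sample output.

```python
import pandas as pd

# Hypothetical toy listings; the values are invented for illustration only.
toy = pd.DataFrame(
    {
        "price_USD": [1500, 4999, 11900, 25000, 8000],
        "model": ["c-class", "c-class", "m-class", "e-class", "c-class"],
        "odometer_mi": [210000.0, 0.0, 0.0, 30000.0, 95000.0],
    }
)


def budget_filter_sketch(df, budget, model="any", sort_feature="odometer_mi", ascending=True):
    """Hypothetical stand-in for the filtering idea, not the package's code."""
    low, high = budget
    # Keep rows whose price lies inside the inclusive budget range.
    mask = df["price_USD"].between(low, high)
    # Optionally narrow to a single model; "any" keeps every model.
    if model != "any":
        mask &= df["model"] == model
    # Assumed order: sort by the chosen feature, then break ties by ascending price.
    return df[mask].sort_values(
        by=[sort_feature, "price_USD"], ascending=[ascending, True]
    )


result = budget_filter_sketch(toy, budget=[2000, 20000])
```

Here the rows priced at 1500 and 25000 fall outside the budget and are dropped, and the two zero-mileage listings come first.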

# Return the top listings that are within a budget range specified by the user. 
# Returns a pandas dataframe of results.
listings = listing_search(data, budget=[2000, 20000], model="any", sort_feature="odometer_mi", ascending=True)
listings.head().iloc[:, 0:5]
      price_USD    model  odometer_mi  condition  size
2493       4999  c-class          0.0  excellent   NaN
2494       4999  c-class          0.0  excellent   NaN
686       11900  m-class          0.0  excellent   NaN
690       11900  m-class          0.0  excellent   NaN
699       11900  m-class          0.0  excellent   NaN

3. plot_mercedes_price() - a function for visualizing the current market's price distribution

This function draws a density plot of prices for a specific Mercedes-Benz model and shows where a given vehicle's price falls within the distribution of prices for that model in the market.

# Plot a price distribution of specific mercedes models, 
# and see where an input price falls in the distribution.
plot_mercedes_price(model='c-class', price=30000, market_df=data)
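
As a numeric companion to the plot, the position of a price within a distribution can be summarised as a percentile. The sketch below uses only the standard library; price_percentile and the sample prices are hypothetical and are not part of mercedestrenz.

```python
from bisect import bisect_right


def price_percentile(prices, price):
    """Share of listed prices at or below `price`, in percent."""
    ordered = sorted(prices)
    # bisect_right counts how many sorted prices are <= price.
    return 100.0 * bisect_right(ordered, price) / len(ordered)


# Hypothetical c-class prices, invented for illustration only.
sample_prices = [4999, 8000, 12500, 18000, 22000, 27500, 31000, 45000]
pct = price_percentile(sample_prices, 30000)
```

With these invented prices, six of the eight listings cost 30000 or less, so the input price sits at the 75th percentile.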

4. predict_mercedes_price() - a function for predicting a reasonable price in USD

This function predicts the price in USD of a Mercedes-Benz given the year, model, condition, paint color and odometer reading.

It uses a pre-trained model built into the package to predict the price of the Mercedes. The model was trained on data from 1990 to 2022.

# Predict the price (in USD) of a Mercedes-Benz given the year, model, condition, paint color, and odometer reading.
predict_mercedes_price("e-class", 2015, 55_000, "fair", "silver")
8735.85