Example usage
To use mercedestrenz in a project:
import mercedestrenz
print(mercedestrenz.__version__)
0.3.1
Below is a basic example of how to use each of the functions included in this package.
# Load all required package functions
from mercedestrenz.data import load_sample_mercedes_listings, listing_search
from mercedestrenz.train import train_mercedes_price_prediction_model
from mercedestrenz.predict import predict_mercedes_price
from mercedestrenz.visualizations import plot_mercedes_price
1. load_sample_mercedes_listings() - a function for loading sample data
The package contains a static dataset of previously web-scraped Craigslist used-car listings. The dataset includes several key attributes for each used car, such as price, model, condition, odometer reading, VIN, region and transmission.
# Load the sample mercedes listings data into a dataframe
data = load_sample_mercedes_listings()
data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8553 entries, 0 to 8552
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   price_USD      8553 non-null   int64
 1   condition      8553 non-null   object
 2   paint_color    6820 non-null   object
 3   model          8553 non-null   object
 4   odometer_mi    8403 non-null   float64
 5   year           8553 non-null   int64
 6   num_cylinders  4753 non-null   object
 7   fuel           8547 non-null   object
 8   transmission   8522 non-null   object
 9   drive          5042 non-null   object
 10  size           3013 non-null   object
 11  type           7564 non-null   object
 12  state          8553 non-null   object
 13  VIN            5545 non-null   object
 14  title_status   8458 non-null   object
 15  description    8553 non-null   object
dtypes: float64(1), int64(2), object(13)
memory usage: 1.1+ MB
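The non-null counts above show that several columns are only partly populated. As a rough illustration, the sketch below turns a few of those counts (copied from the `info()` output) into missing-value percentages. The dictionary and variable names are made up for this example. With the DataFrame loaded, pandas' `data.isna().mean()` should give the equivalent per-column fractions directly.

```python
# Total number of rows reported by data.info() above.
total_rows = 8553

# Non-null counts copied from the info() output for the sparsest columns.
non_null = {
    "paint_color": 6820,
    "num_cylinders": 4753,
    "drive": 5042,
    "size": 3013,
    "VIN": 5545,
}

# Percentage of rows with a missing value in each column.
missing_pct = {
    col: round(100 * (total_rows - count) / total_rows, 1)
    for col, count in non_null.items()
}

# Print the columns from most to least incomplete.
for col, pct in sorted(missing_pct.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:<14}{pct:>5}% missing")
```

Columns with a large share of missing values, such as size, may need imputation or exclusion before they are used for filtering or modelling.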
2. listing_search() - a function for searching for listings that match the expected budget range, sorted by desired features
Use this function to search for listings that fall within the user’s expected price range.
The results can be filtered by an optional input, model. By default, all models are shown, but the user can change this to narrow the search to only the models of interest.
The results are also sorted in ascending order by price and by the feature given in the sort_feature parameter. By default, sort_feature is the mileage (lowest first), but the user can choose another numeric attribute.
# Return the top listings that are within a budget range specified by the user.
# Returns a pandas dataframe of results.
listings = listing_search(data, budget=[2000, 20000], model="any", sort_feature="odometer_mi", ascending=True)
listings.head().iloc[:, 0:5]
| | price_USD | model | odometer_mi | condition | size |
|---|---|---|---|---|---|
| 2493 | 4999 | c-class | 0.0 | excellent | NaN |
| 2494 | 4999 | c-class | 0.0 | excellent | NaN |
| 686 | 11900 | m-class | 0.0 | excellent | NaN |
| 690 | 11900 | m-class | 0.0 | excellent | NaN |
| 699 | 11900 | m-class | 0.0 | excellent | NaN |
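To make the filter-and-sort behaviour concrete, here is a minimal pandas sketch of the same idea. It is not the package's implementation: `search_listings_sketch` and the toy data are hypothetical, and the ordering (sort feature first, price second) is an assumption chosen because it is consistent with the sample output above.

```python
import pandas as pd


def search_listings_sketch(df, budget, model="any",
                           sort_feature="odometer_mi", ascending=True):
    """Hypothetical stand-in for listing_search, for illustration only."""
    low, high = budget
    # Keep rows whose price lies inside the budget range (inclusive).
    matches = df[df["price_USD"].between(low, high)]
    # Optionally narrow the results to a single model.
    if model != "any":
        matches = matches[matches["model"] == model]
    # Assumed ordering: the sort feature first, then price as a tie-breaker.
    return matches.sort_values([sort_feature, "price_USD"], ascending=ascending)


# Toy listings (made-up values) using column names from the sample dataset.
toy = pd.DataFrame({
    "price_USD": [11900, 4999, 25000, 1500, 8000],
    "model": ["m-class", "c-class", "s-class", "c-class", "c-class"],
    "odometer_mi": [0.0, 0.0, 12000.0, 90000.0, 45000.0],
})

result = search_listings_sketch(toy, budget=[2000, 20000])
print(result)
```

The listings priced at 25000 and 1500 fall outside the budget and are dropped. The remaining rows are ordered by mileage, with price breaking ties between the two zero-mileage listings.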
3. plot_mercedes_price() - a function for visualizing the current market’s price distribution
This function draws a density plot of prices for a specific Mercedes-Benz model and marks where a given vehicle’s price falls within that model’s price distribution in the current market.
# Plot a price distribution of specific mercedes models,
# and see where an input price falls in the distribution.
plot_mercedes_price(model='c-class', price=30000, market_df=data)
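A numeric companion to the plot can be useful when a single figure is easier to compare than a curve. The sketch below computes the share of listings priced at or below a given price. `price_percentile` and the sample prices are hypothetical and not part of mercedestrenz.

```python
def price_percentile(prices, price):
    """Return the percentage of listings priced at or below `price`."""
    if not prices:
        raise ValueError("prices must not be empty")
    at_or_below = sum(1 for p in prices if p <= price)
    return 100 * at_or_below / len(prices)


# Made-up prices standing in for one model's listings.
sample_prices = [4999, 8000, 12500, 18000, 27000, 31000, 42000, 55000]

# Where does a 30000 USD asking price sit among these listings?
print(price_percentile(sample_prices, 30000))
```

With the sample data loaded, the prices for one model could be passed in as a list, for example `data.loc[data["model"] == "c-class", "price_USD"].tolist()`.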