
Altima: A KBB-like Reference Pricing System


Page 1: Altima: A KBB-like Reference Pricing System

A KBB-like Reference Pricing System ---- Using Machine Learning

Team: Altima Hao Zhu Yingqi Yang

Page 2: Altima: A KBB-like Reference Pricing System

Product

A KBB-like Reference Pricing System

The end product could be integrated with online rental listings (apartments, rooms, houses, etc.) to give people looking for rental housing a reference point for rent negotiation.

Rental Details    Asking Rent    Reference Price    % Above Reference
……                1000           800                25%

Page 3: Altima: A KBB-like Reference Pricing System

Business Model

• Build our own service by scraping online rental listings and applying this system

• Cooperate with online rental listing providers such as Craigslist and provide this system as a value-added service

• Promote this system to similar web services, such as eBay auctions, to predict closing prices

Page 4: Altima: A KBB-like Reference Pricing System

Approach

Data Set

Source: Seattle apartment/house rent prices, downloaded from GitHub

Size

• Total of 2313 entries from Nov. 2014

• Training/Validation: 75/25

Attributes

• Response (Price)

• 14 predictors (number of bedrooms, room size, listing title, ……)
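The 75/25 training/validation split can be sketched as follows. This is a minimal illustration, not the team's actual workflow: the function name `train_validation_split`, the fixed seed, and the use of integers as stand-ins for listing records are all assumptions.

```python
import random


def train_validation_split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and split it into (train, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for the 2313 listing records (placeholder data).
listings = list(range(2313))
train, validation = train_validation_split(listings)
print(len(train), len(validation))
```

Shuffling before cutting avoids a split that follows the order in which listings were collected.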

Page 5: Altima: A KBB-like Reference Pricing System

Approach

Data Exploration

Outlier:

Price < 600 or Price > 3100
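The outlier rule above can be written as a simple filter. This is a sketch only: the slide defines the thresholds but does not say how outliers were handled, so dropping them is an assumption, and the sample prices are made up.

```python
LOW, HIGH = 600, 3100  # thresholds from the slide


def is_outlier(price, low=LOW, high=HIGH):
    """Return True when a rent price falls outside [low, high]."""
    return price < low or price > high


# Made-up prices for illustration only.
prices = [450, 600, 1800, 3100, 3500]
kept = [p for p in prices if not is_outlier(p)]
print(kept)  # [600, 1800, 3100]
```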

Rent Price Distribution

Rent Price Histogram

Page 6: Altima: A KBB-like Reference Pricing System

Approach

Text Mining on Listing Title Variable

Page 7: Altima: A KBB-like Reference Pricing System

Approach

Model Selection

(1) K Nearest Neighbors – Regression Model (KNN)

An algorithm that stores all available cases and predicts the numerical target based on a similarity measure (e.g., distance functions).

Numeric variables: Euclidean distance

Categorical variables: Hamming distance
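The two distance measures can be sketched in a few lines. The function names and sample inputs are illustrative, not taken from the slides.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance: number of positions where the categories differ."""
    return sum(x != y for x, y in zip(a, b))


print(euclidean((0.0, 0.0), (3.0, 4.0)))  # 5.0
print(hamming(("98115", "apt"), ("98117", "apt")))  # 1
```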

Page 8: Altima: A KBB-like Reference Pricing System

Approach

Model Selection

(1) K Nearest Neighbors – Regression Model (KNN)

   Size  Beds  Zip code  Price
1  1710  4     98115     2500
2  2200  2     98199     2895
3  1420  2     98117     2150

Step 1: Standardize Data Set

   Size   Beds   98104  98115  98117  Price
1  0.564  0.212  0      1      0      2500
2  0.731  0.091  1      0      0      2895
3  0.465  0.091  0      0      1      2150
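A sketch of what the standardization step might look like. The slide does not name the scaling method; min-max scaling is assumed here because the standardized values lie between 0 and 1, and the bounds used below (0 and 3000) are hypothetical, so the scaled output is not expected to reproduce the slide's numbers. The one-hot encoding of zip codes matches the column layout in the table.

```python
def min_max_scale(value, lo, hi):
    """Scale value into [0, 1] given the column's minimum and maximum."""
    return (value - lo) / (hi - lo)


def one_hot(value, categories):
    """Encode a categorical value as one 0/1 indicator per category."""
    return [1 if value == c else 0 for c in categories]


zip_columns = ["98104", "98115", "98117"]
print(one_hot("98115", zip_columns))  # [0, 1, 0]

# Hypothetical bounds; the dataset's real minimum and maximum are not given.
print(min_max_scale(1710, lo=0, hi=3000))
```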

Page 9: Altima: A KBB-like Reference Pricing System

Approach

Model Selection

(1) K Nearest Neighbors – Regression Model (KNN)

Step 2: Give Reasonable Weights

Variable  Size  Bath  Bed  Zip Code  ……
Weight    5     4     3    2         ……
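Applying the weights amounts to multiplying each standardized value by its variable's weight. The sketch below does this for the Size and Beds values of row 1 from Step 1; the dictionary layout and names are assumptions made for illustration.

```python
# Weights from the slide for the two numeric variables used here.
weights = {"Size": 5, "Beds": 3}

# Standardized Size and Beds of row 1 from Step 1.
row = {"Size": 0.564, "Beds": 0.212}

weighted = {name: round(value * weights[name], 3) for name, value in row.items()}
print(weighted)  # {'Size': 2.82, 'Beds': 0.636}
```

These agree with the weighted values used in the distance step (2.819 and 0.636) up to rounding of the standardized inputs.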

Page 10: Altima: A KBB-like Reference Pricing System

Approach

Model Selection

(1) K Nearest Neighbors – Regression Model (KNN)

Step 3: Calculate Distance

Forecast: Size 2.053, Beds 0.273, Zip code 98104, Price ?

   Size   Beds   98104  98115  98117  Price  Distance
1  2.819  0.636  0      1      0      2500   0.8
2  3.654  0.273  1      0      0      2895   1.6
3  2.325  0.273  0      0      1      2150   0.3

K = 1: Price = 2150

K = 2: Price = (2150 + 2500) / 2
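The distance and averaging steps can be reproduced with a short sketch. Assumptions: zip codes are read off the one-hot columns, the function names are illustrative, and `zip_penalty` is a hypothetical parameter. With the penalty left at 0 (Size and Beds only) the distances round to the values in the table; setting it to 1 for a zip-code mismatch changes the magnitudes but not the ranking of the neighbors.

```python
import math

# Weighted training rows: (size, beds, zip code, price).
TRAIN = [
    (2.819, 0.636, "98115", 2500),
    (3.654, 0.273, "98104", 2895),
    (2.325, 0.273, "98117", 2150),
]
QUERY = (2.053, 0.273, "98104")  # listing whose price is being forecast


def distance(row, query, zip_penalty=0.0):
    """Euclidean distance on size and beds, plus an optional penalty
    (added under the square root) when the zip codes differ."""
    squared = (row[0] - query[0]) ** 2 + (row[1] - query[1]) ** 2
    if row[2] != query[2]:
        squared += zip_penalty
    return math.sqrt(squared)


def knn_predict(train, query, k, zip_penalty=0.0):
    """Average the prices of the k training rows nearest to the query."""
    nearest = sorted(train, key=lambda r: distance(r, query, zip_penalty))[:k]
    return sum(r[3] for r in nearest) / k


print([round(distance(r, QUERY), 1) for r in TRAIN])  # [0.8, 1.6, 0.3]
print(knn_predict(TRAIN, QUERY, k=1))  # 2150.0
print(knn_predict(TRAIN, QUERY, k=2))  # 2325.0
```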

Page 11: Altima: A KBB-like Reference Pricing System

Approach

Model Selection

(1) K Nearest Neighbors – Regression Model (KNN)

(2) Other Models

• Decision Tree Model

• Forest Model

• Spline Model

• Support Vector Machine Model (SVM)

Page 12: Altima: A KBB-like Reference Pricing System

Approach

Model Comparison

Model Name            MAPE     RMSE
KNN Regression Model  0.17963  20.53814
Decision Tree Model   0.15522  334.49524
Forest Model          0.12895  287.84426
Spline Model          0.16774  408.67882
* SVM Model           0.15726  336.83526

* Could not be implemented in Alteryx Designer; developed in R instead
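For reference, the two error measures in the comparison can be computed as below. This is a sketch using the standard definitions, with MAPE expressed as a fraction to match the scale of the reported values; the sample prices are made up and the function names are illustrative.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up validation prices and predictions for illustration only.
actual = [1000, 2000]
predicted = [900, 2200]
print(round(mape(actual, predicted), 3))  # 0.1
print(round(rmse(actual, predicted), 2))  # 158.11
```

MAPE is scale-free, while RMSE is in rent dollars and penalizes large misses more heavily.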

Result: Ensemble Model

Page 13: Altima: A KBB-like Reference Pricing System

Demo

Baths  Beds  Size  Zip Code  Price  Reference Price  % Above Reference
1      1     828   98121     2,055  2,038            0.01
1      2     900   98117     1,800  1,700            0.06
1      1     583   98121     2,395  1,395            0.72
1      1     577   98121     1,398  1,595            -0.12
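The last column follows from the asking price and the reference price. A minimal sketch, with an illustrative function name, using the four demo rows:

```python
def pct_above_reference(asking, reference):
    """Fraction by which the asking rent exceeds the reference price."""
    return (asking - reference) / reference


# (asking price, reference price) pairs from the demo table.
demo = [(2055, 2038), (1800, 1700), (2395, 1395), (1398, 1595)]
print([round(pct_above_reference(a, r), 2) for a, r in demo])
# [0.01, 0.06, 0.72, -0.12]
```

A negative value means the listing is priced below its reference.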

Page 14: Altima: A KBB-like Reference Pricing System

Model Improvement

• Build the model on a larger dataset to make it more robust

• Add attributes such as availability of a pool, security guard, etc.

• Include the contents of the listings in the text mining

• Distinguish between houses and apartments

• Add a time component to the model to handle trend and seasonality in rent prices

• Do more research on the variables to get better weights for the KNN Regression Model

Page 15: Altima: A KBB-like Reference Pricing System

Q & A

Page 16: Altima: A KBB-like Reference Pricing System

Appendix

K Nearest Neighbors – Regression Model (KNN)

Step 3: Calculate Distance

Forecast: Size 2.053, Beds 0.273, Zip code 98104, Price ?

   Size   Beds   98104  98115  98117  Price  Distance
1  2.819  0.636  0      1      0      2500   0.8
2  3.654  0.273  1      0      0      2895   1.6
3  2.325  0.273  0      0      1      2150   0.3

$D_1 = \sqrt{(2.053 - 2.819)^2 + (0.273 - 0.636)^2 + 1}$

$D_2 = \sqrt{(2.053 - 3.654)^2 + (0.273 - 0.273)^2 + 0}$

K = 1: F_Price = 2150

K = 2: F_Price = (2150 + 2500) / 2

Page 17: Altima: A KBB-like Reference Pricing System

References

http://www.ncbi.nlm.nih.gov/pubmed/16723004

http://www.cs.upc.edu/~bejar/apren/docum/trans/03d-algind-knn-eng.pdf