CUSTOMER SEGMENTATION USING
THE PARTICLE SWARM OPTIMIZATION
AND K-MEANS METHODS
FINAL PROJECT
Submitted in partial fulfillment of the requirements for completing the
Bachelor's Degree (Strata Satu, S1) in the Information Technology Study Program
I DEWA AYU AGUNG YUNITA PRIMANDARI
NIM: 1204505063
DEPARTMENT OF INFORMATION TECHNOLOGY
FACULTY OF ENGINEERING
UNIVERSITAS UDAYANA
2016
DECLARATION
I hereby declare that this final project contains no work that has previously
been submitted for an academic degree at any other higher-education institution
and that, to the best of my knowledge, it contains no work or opinion previously
written or published by another person, except where cited in writing in this
manuscript and listed in the references.
Denpasar, June 2016
I Dewa Ayu Agung Yunita Primandari
APPROVAL SHEET FOR THE FINAL PROJECT
FINAL PROJECT EXAMINATION REPORT
PREFACE
The author offers praise and gratitude to Ida Sang Hyang Widhi Wasa/God
Almighty, because by His grace (Asung Kerta Wara Nugraha) the author was able
to complete the final project entitled "Customer Segmentation Using the
Particle Swarm Optimization and K-Means Methods". The author received a great
deal of guidance from many parties and extends thanks to:
1. Prof. Ir. Ngakan Putu Gede Suardana, M.T., Ph.D., Dean of the Faculty of
Engineering, Universitas Udayana.
2. Dr. Eng. I Putu Agung Bayupati, ST., MT, Head of the Department of
Information Technology, Universitas Udayana.
3. Prof. Dr. I Ketut Gede Darma Putra, S.Kom., M.T., first supervisor, who
provided extensive guidance and input during the preparation of this final
project.
4. Mr. I Made Sukarsa, S.T., M.T., second supervisor, who provided extensive
direction and guidance during the preparation of the final project.
5. Mr. Putu Wira Buana, S.Kom., M.T., academic advisor, who provided guidance
throughout the author's studies at the Department of Information Technology,
Faculty of Engineering, Universitas Udayana.
6. Ms. Ayu Supartini, who helped with the research process. The author's
parents and family, who gave their support. The friends who were always
willing to share knowledge and solutions: Tio, Maha, and Cinho. Fellow
students throughout the years of study: Utami, Moren, Indah, and Tiwi.
Companions at HMTI: Resa, Abi, and Try, as well as the 2012 Information
Technology cohort, who supported the preparation of this final project.
Denpasar, June 2016
The Author
ABSTRACT (INDONESIAN VERSION)
Customer segmentation is one application of data mining. It divides customers
into several classes to help a company understand the characteristics of its
customers. This final project analyzes 33,441 rows of company transaction data
that were transformed into Recency, Frequency, and Monetary (RFM) form. The
clustering method used for customer segmentation is a combination of the
Particle Swarm Optimization (PSO) and K-Means methods. The combination takes
the strengths of each method and removes their weaknesses. K-Means is very
sensitive to the choice of initial cluster centers because they are chosen at
random. This weakness is addressed by optimizing the cluster centers with PSO
so that K-Means can cluster better. The study uses several numbers of
clusters. Based on the cluster validation methods used, the Davies-Bouldin
Index (DBI) and the Silhouette Index, the best clustering is the one with a
DBI value of 0.5 and a Silhouette Index value of 0.83.
Keywords: Customer Segmentation, Data Mining, Clustering, Particle Swarm
Optimization, K-Means, RFM, Davies-Bouldin Index, Silhouette Index.
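The RFM transformation and the Min-Max normalization named in this front matter can be illustrated with a minimal Python sketch. It is not the stored procedure or function from the thesis. The row layout (customer id, purchase date, amount), the definition of Recency as days since the last purchase, the function names `rfm_table` and `min_max_normalize`, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, reference_date):
    """Aggregate (customer_id, purchase_date, amount) rows into RFM tuples.

    Assumption: Recency is the number of days between the customer's last
    purchase and reference_date; Frequency counts rows; Monetary sums amounts.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((reference_date - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


def min_max_normalize(table):
    """Rescale each RFM column to [0, 1]; a constant column maps to 0.0."""
    ids = list(table)
    columns = list(zip(*(table[cid] for cid in ids)))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return {cid: tuple(col[i] for col in scaled_columns) for i, cid in enumerate(ids)}


# Hypothetical sample rows, not data from PT. X.
rows = [
    ("A", date(2016, 1, 10), 100.0),
    ("A", date(2016, 3, 1), 50.0),
    ("B", date(2016, 2, 1), 300.0),
]
rfm = rfm_table(rows, reference_date=date(2016, 3, 31))
normalized = min_max_normalize(rfm)
```

Normalizing the three columns to a common range keeps the Monetary values, which are typically much larger than Recency or Frequency, from dominating the distance calculations used later in clustering.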
ABSTRACT
Customer segmentation is an application of clustering in the data mining
process. It divides customers into classes to help a company understand each
customer. This final project analyzed 33,441 rows of transaction data and
transformed them into Recency, Frequency, and Monetary (RFM) form to identify
potential customers. The clustering method used was a combination of the
Particle Swarm Optimization (PSO) and K-Means algorithms. The combination
aimed to take advantage of the strengths of both algorithms and remove their
weaknesses. K-Means is very sensitive to the initialization of the cluster
centers because they are chosen randomly. PSO was used to optimize the cluster
centers and help K-Means cluster better. The clustering experiments used
several numbers of clusters. According to the Davies-Bouldin Index (DBI), the
best number of clusters for these experiments is two. The two-cluster solution
has a DBI value of 0.5 and a Silhouette Index value of 0.83.
Keywords: Customer Segmentation, Data Mining, Clustering, Particle Swarm
Optimization, K-Means, RFM, Davies-Bouldin Index, Silhouette Index.
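The combination described in the abstract, PSO searching for good cluster centers that then seed K-Means, can be sketched as follows. This is a rough illustration only, not the thesis implementation. The fitness function (sum of squared distances to the nearest center), the global-best update rule, the parameter values `w`, `c1`, `c2`, `n_particles`, and the function names are all assumptions; only the default of 20 iterations echoes the setting listed in the table titles.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def pso_centroids(points, k, n_particles=10, iterations=20,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids with global-best PSO (fitness = SSE, assumed)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle holds k candidate centroids, initialised from data points.
    positions = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in pos] for pos in positions]
    pbest_fit = [sse(points, pos) for pos in positions]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    velocities[i][j][d] = (
                        w * velocities[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - positions[i][j][d])
                        + c2 * r2 * (gbest[j][d] - positions[i][j][d])
                    )
                    positions[i][j][d] += velocities[i][j][d]
            fit = sse(points, positions[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in positions[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in positions[i]], fit
    return gbest


def kmeans(points, centroids, max_iter=100):
    """Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical, already-normalized points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
start = pso_centroids(points, k=2)
labels, centers = kmeans(points, start)
```

Because K-Means only refines the centers it is given, the PSO stage serves to replace the purely random starting centers with ones that already have a low clustering error.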
TABLE OF CONTENTS
TITLE PAGE
DECLARATION
APPROVAL SHEET FOR THE FINAL PROJECT
FINAL PROJECT EXAMINATION REPORT
PREFACE
ABSTRACT (INDONESIAN VERSION)
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF PROGRAM CODE
CHAPTER I INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Research Objectives
1.4 Research Benefits
1.5 Scope and Limitations
1.6 Organization of the Thesis
CHAPTER II LITERATURE REVIEW
2.1 State of the Art
2.2 Data Mining
2.3 The Role of Data Mining in the CRM Framework
2.4 RFM Model
2.5 Data Normalization
2.6 Clustering Methods
2.7 K-Means Method
2.8 Particle Swarm Optimization Method
2.8.1 Personal Best and Global Best in Particle Swarm Optimization
2.8.3 Example Particle Swarm Optimization Calculation
2.9 Cluster Validation
2.9.1 Cluster Validation with the Davies-Bouldin Index
2.9.2 Cluster Validation with the Silhouette Index
2.10 Company Profile of PT. X
CHAPTER III RESEARCH METHODOLOGY
3.1 Research Location and Time
3.2 Research Flow
3.3 Data Sources
3.4 Hardware and Software Requirements
3.5 System Design
3.5.1 System Overview
3.5.2 Data Selection
3.5.4 Data Transformation
3.5.5 Data Normalization
3.5.6 Cluster Center Optimization Using PSO
3.5.7 Clustering Using K-Means
3.5.8 Data Modeling
3.5.9 Cluster Validation with the Davies-Bouldin Index
3.5.10 Cluster Validation with the Silhouette Index
3.6 Database Design
3.7 Interface
CHAPTER IV DISCUSSION AND ANALYSIS OF RESULTS
4.1 Application Implementation Tools
4.1.1 Software Requirements for Application Implementation
4.1.2 Hardware Requirements for Application Implementation
4.2 Application Design Results
4.2.1 Menu Display
4.2.2 Data Selection Process
4.2.3 Data Transformation Process
4.2.4 Data Normalization Process
4.2.5 Cluster Center Optimization Process with the Particle Swarm Optimization Method
4.2.6 Clustering Process Using the K-Means Method
4.2.7 Cluster Validation Process
4.3 Application Testing
4.3.1 Clustering Test Results with 2 Clusters
4.3.2 Clustering Test Results with 3 Clusters
4.3.3 Clustering Test Results with 4 Clusters
4.3.4 Clustering Test Results with 5 Clusters
4.3.5 Clustering Test Results with 6 Clusters
4.4 Cluster Analysis
4.5 Comparison of Clustering Results of the K-Means Method and the PSO+K-Means Method
4.6 Comparison of Customer Classes per Month
CHAPTER V CLOSING
5.1 Conclusions
5.2 Suggestions
REFERENCES
LIST OF FIGURES
Figure 2.1 Fishbone Diagram
Figure 2.2 KDD Process
Figure 2.3 CRISP-DM Scheme
Figure 2.4 Star Topology
Figure 3.1 System Overview
Figure 3.2 Table Relationships of the PT. X Transaction Data
Figure 3.3 Scheme for Forming RFM Values
Figure 3.4 Scheme for Determining tb_standar Values
Figure 3.5 Scheme for Determining tb_rfm Values
Figure 3.6 Screenshot of tb_rfm Data
Figure 3.7 Screenshot of Normalized tb_rfm Data
Figure 3.8 Flowchart of the PSO Algorithm
Figure 3.9 Results of the First Iteration of the PSO Method
Figure 3.10 Results of the Second Iteration of the PSO Method
Figure 3.11 Flowchart of the K-Means Algorithm
Figure 3.12 Results of the First Iteration of the K-Means Method
Figure 3.13 Results of the Second Iteration of the K-Means Method
Figure 3.14 Flowchart of the Davies-Bouldin Index Algorithm
Figure 3.15 Flowchart of the Silhouette Algorithm
Figure 3.16 Design of the Auxiliary Database
Figure 3.17 Home Form
Figure 3.18 Monthly Segmentation Transformation Form
Figure 3.19 Data Transformation Form
Figure 3.20 Cluster Center Selection Form
Figure 3.21 Clustering Form
Figure 3.22 Cluster Validation Form
Figure 3.23 Segmentation Results Form
Figure 3.24 Per-Customer Segmentation Results Form
Figure 3.25 About Form
Figure 4.1 Menu Display
Figure 4.2 Data Selection Process
Figure 4.3 Scheme for Forming tb_standar
Figure 4.4 Data Transformation Scheme
Figure 4.5 Data Normalization Process
Figure 4.6 Cluster Center Optimization Process
Figure 4.7 Clustering Process with the K-Means Method
Figure 4.8 Cluster Validation Process
Figure 4.9 Cluster Center Selection Process with 2 Clusters
Figure 4.10 Optimal Cluster Centers with 2 Clusters
Figure 4.11 K-Means Clustering Results with 2 Clusters
Figure 4.12 Segmentation Results for 2 Clusters
Figure 4.13 Cluster Center Selection Process with 3 Clusters
Figure 4.14 Optimal Cluster Centers with 3 Clusters
Figure 4.15 K-Means Clustering Results with 3 Clusters
Figure 4.16 Segmentation Results for 3 Clusters
Figure 4.17 Cluster Center Selection Process with 4 Clusters
Figure 4.18 Optimal Cluster Centers with 4 Clusters
Figure 4.19 K-Means Clustering Results with 4 Clusters
Figure 4.20 Segmentation Results for 4 Clusters
Figure 4.21 Cluster Center Selection Process with 5 Clusters
Figure 4.22 Optimal Cluster Centers with 5 Clusters
Figure 4.23 K-Means Clustering Results with 5 Clusters
Figure 4.24 Chart of K-Means Clustering with 5 Clusters
Figure 4.25 Cluster Center Selection Process with 6 Clusters
Figure 4.26 Optimal Cluster Centers with 6 Clusters
Figure 4.29 K-Means Clustering Results with 6 Clusters
Figure 4.28 Chart of K-Means Clustering with 6 Clusters
Figure 4.29 Cluster Validation Results
Figure 4.30 Comparison of Cluster Results of the K-Means Method and the PSO+K-Means Method
Figure 4.30 Customer Segmentation Chart (Customer Code = 12100)
Figure 4.31 Customer Segmentation Chart (Customer Code = 11005)
Figure 4.32 Customer Segmentation Chart (Customer Code = 12493)
Figure 4.33 Customer Segmentation Chart (Customer Code = 11794)
LIST OF TABLES
Table 2.1 List of State of the Art
Table 2.2 Customer Classification
Table 2.3 Source Data
Table 2.4 Cluster Centers at Iteration 1
Table 2.5 Distance Calculation Results
Table 3.1 Customer Table
Table 3.2 Example of the tb_pelanggan Table
Table 3.3 Product Table
Table 3.4 Example of the tb_produk Table
Table 3.5 Sales Table
Table 3.6 Example of the Sales Table
Table 3.7 Transaction Table
Table 3.8 Example of the Transaction Table
Table 3.9 Transaction Detail Table
Table 3.10 Example of the Transaction Detail Table
Table 3.11 Attribute Selection Data According to the RFM Model
Table 3.12 Example Data in the Transaction Table
Table 3.13 Example Data in tb_standar
Table 3.14 Example Data in tb_rfm
Table 3.15 Example Data in tb_rfm
Table 3.16 Example Data for the Recency Attribute
Table 3.17 Example Data for the Frequency Attribute
Table 3.18 Example Data for the Monetary Attribute
Table 3.19 Results of tb_rfm Data Normalization
Table 3.21 tb_rfm Data Normalization
Table 3.21 Example of Normalized Data in tb_rfm
Table 3.22 Distance Calculation for the First Iteration
Table 3.23 Distance Calculation for the First Iteration
Table 3.24 RFM Value Domains
Table 3.25 Description of Linguistic Variables and Customer Labels
Table 3.26 tb_rfm Data
Table 3.27 RFM Value Domains
Table 3.28 Cluster Classes
Table 3.29 tb_rfm Data Normalization
Table 3.30 Example Attribute Data for Cluster 1
Table 3.31 Example Attribute Data for Cluster 2
Table 3.32 tb_rfm Data Normalization
Table 3.33 Silhouette Index Calculation Results
Table 3.34 Structure of tb_standar
Table 3.35 Example Data of tb_standar
Table 3.36 Structure of tb_rfm
Table 3.37 Data Structure of tb_rfm
Table 3.38 Structure of tb_hasil
Table 3.39 Data Structure of tb_hasil
Table 3.40 Structure of tb_range_r
Table 3.41 Data Structure of tb_hasil_r
Table 3.42 Structure of tb_range_f
Table 3.43 Data Structure of tb_hasil_f
Table 3.44 Structure of tb_range_m
Table 3.45 Data Structure of tb_hasil_monetary
Table 3.46 Structure of tb_kelas
Table 3.47 Data Structure of tb_kelas
Table 3.48 Structure of tb_avg
Table 3.49 Data Structure of tb_avg
Table 3.50 Structure of tb_cluster
Table 3.51 Data Structure of tb_cluster
Table 3.52 Structure of tb_hasil_normal
Table 3.53 Data Structure of tb_normal
Table 3.54 Structure of tb_optimasi
Table 3.57 Data Structure of tb_optimasi
Table 4.1 Optimization Results with Parameters Cluster = 2 and Iterations = 20
Table 4.2 Segmentation Results for 2 Clusters
Table 4.3 Segmentation Results for 2 Clusters
Table 4.4 Optimization Results with Parameters Cluster = 3 and Iterations = 20
Table 4.5 Segmentation Results for 3 Clusters
Table 4.6 Segmentation Results for 3 Clusters
Table 4.7 Optimization Results with Parameters Cluster = 4 and Iterations = 20
Table 4.8 Segmentation Results for 4 Clusters
Table 4.9 Segmentation Results for 4 Clusters
Table 4.10 Optimization Results with Parameters Cluster = 5 and Iterations = 20
Table 4.11 Segmentation Results for 5 Clusters
Table 4.12 Segmentation Results for 5 Clusters
Table 4.13 Optimization Results with Parameters Cluster = 6 and Iterations = 20
Table 4.14 Segmentation Results for 6 Clusters
Table 4.15 Segmentation Results for 6 Clusters
Table 4.16 Segmentation Results for 6 Clusters
Table 4.17 Execution Time of the Methods
Table 4.16 Comparison of Customer Classes per Month
LIST OF PROGRAM CODE
Program Code 4.1 Stored Procedure for Inputting RFM Values
Program Code 4.2 Data Normalization Function Using the Min-Max Method
Program Code 4.3 Particle Swarm Optimization Method Function
Program Code 4.4 K-Means Method Function
Program Code 4.5 DBI Method Function
Program Code 4.6 Silhouette Index Method Function