FlexRecs: Expressing and Combining Flexible Recommendations

IDS Lab. Seminar

Winter 2010

Minsuk Kahng

[email protected]

Jan. 8th, 2010

G. Koutrika, B. Bercovitz, H. Garcia-Molina

SIGMOD 2009

Center for E-Business Technology, Seoul National University, Seoul, Korea

Stanford InfoLab

Intelligent Database Systems Lab.

강 민 석

Copyright 2010 by CEBT

Abstract

FlexRecs

Existing recommendation systems have a number of limitations.

The recommendation algorithm is hard-wired into the system.

The authors propose a recommendation framework, FlexRecs, that

decouples the definition of a recommendation process from its execution

lets designers declaratively define a recommendation process as a high-level workflow

composes workflows from traditional relational operators and new recommendation operators

Prototype: a flexible recommendation engine

Realizes the proposed framework, FlexRecs


Contents

Introduction

Related Work

Recommendation Framework

System Architecture

Experiments

Conclusions


Introduction

Recommendation Systems

Provide advice on movies, products, travel, and many other topics

Have become very popular in deployed systems

Google News, Amazon, MovieLens

Many recommendation approaches have been proposed.


Motivation

CourseRank

Stanford InfoLab has developed CourseRank.

A social tool that helps students make informed choices about classes

Official course information, course message boards, grade distributions, course reviews, schedules, four-year plans, recommendations, and more


Motivation

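Challenges

The need for flexibility and expressivity

The initial version offered no choices: recommendation results were provided only as a list of n items.

There was no way to see more recommendations related to one of the n recommended items.

Restricting the item type, recommending based on friends' histories, or recommending based on students with similar grades was not possible.

The need for experimentation and higher productivity

Integrating several recommendation techniques: depending on the environment, sometimes method X works better and sometimes method Y.

Two recommendation methods may need to be combined with appropriate weights.

Implementing several recommendation techniques is time-consuming and counter-productive.

The result is not easily expandable or manageable.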


Introduction

Limitations of Recommendation Systems

Hard-Wired

NOT expressed declaratively: the algorithm is typically embedded in the system code.

This makes it hard to modify the algorithms or to experiment with different approaches.

No Flexibility

Recommendation results are fixed. End users are given few choices.

Users may expect diverse recommendations in different contexts.

Users are unable to request recommendations under user-defined constraints.

Limited World Model

Recommendation typically deals with two types of entities: users and items.

Providing recommendations over richer data representations is not straightforward.


Introduction

FlexRecs, the Proposed Framework

Flexible Recommendations

can be easily defined, customized, and processed over structured data

Decouples the definition of a recommendation process from its execution

Recommendation processes are defined declaratively as high-level workflows

Enables generating any recommendation with the same engine

Recommendations are expressed as a high-level workflow that

contains traditional relational operators

plus new recommendation operators

and can handle data in relational form

Designers can create multiple, customizable workflows

Prototype: a flexible recommendation engine that realizes the framework

Executes a workflow over a conventional DBMS



Related Work

Limitations of Recommendation Systems

Algorithms are hard-wired in the system code.

Designing, implementing, and experimenting with new methods can be time-consuming.

Systems generate only a predefined and fixed set of recommendations.

There have been several attempts to fix problems of the existing methods (content-based, CF), such as over-reliance on past history and the cold-start problem in CF.

But different recommendations may be required under different circumstances by different users.

Limited World

In many real applications, much richer data resides in the DB.

Different types of entities may co-exist in a single DB.

Current systems are not very expressive.

Some extensions

Incorporate multi-criteria ratings into recommendations

The RQL language

Allows users to formulate recommendations in a flexible manner

But it is not very expressive, because recommendations are formulated on a pre-specified multidimensional cube of ratings


Contents

Introduction

Related Work

Recommendation Framework

Data Model

Operators

Recommendation Workflow

System Architecture

Experiments

Conclusions


Data Model

Data Model

Data reside in structured form, and particularly in relational form.

The focus is on databases that follow the relational model.

Base Relation

A database comprises a set of relations.

A relation has a set of attributes.

An attribute instantiated to a single value is called a base attribute.

A relation with only base attributes is called a base relation.


Data Model

Extended Relation

The authors introduce the concept of an extended relation.

Now, an attribute value can be a relation.
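The difference between a base relation and an extended relation can be sketched with plain Python data. This is only an illustration: the relation names, attribute names, and values are hypothetical, and the `is_base_relation` helper is not part of FlexRecs.

```python
# Hypothetical data: relation and attribute names are illustrative only.

# A base relation: every attribute holds a single value.
students = [
    {"StudID": 1234, "Name": "Alice"},
    {"StudID": 444, "Name": "Bob"},
]

# An extended relation: the "Ratings" attribute is itself a relation
# (a list of tuples), nested one level deep.
students_ext = [
    {"StudID": 1234, "Name": "Alice",
     "Ratings": [{"CourseID": "C22", "Rating": 5},
                 {"CourseID": "C30", "Rating": 3}]},
    {"StudID": 444, "Name": "Bob",
     "Ratings": [{"CourseID": "C22", "Rating": 4}]},
]


def is_base_relation(relation):
    """True if no attribute value of any tuple is itself a relation."""
    return all(not isinstance(v, list) for row in relation for v in row.values())
```

Here `is_base_relation(students)` holds, while `students_ext` fails the check because of its nested `Ratings` attribute.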


Data Model

Extended Relation

Examples

can be thought of as “views”

Generalized?

The model and language could be generalized to arbitrary nesting, but such generality is not needed for practical scenarios.

Materialized or not?

This issue is orthogonal to the definition; extended relations may not be stored in the DB.


Operators

Base Operators

can operate on base and extended relations

Operators

Select: selects the tuples of a relation for which the condition holds

the condition refers only to base attributes

the result is a base or an extended relation, depending on the type of the input relation

Project: projects the relation onto a smaller set of its attributes

A is a list of base, embedded, or extended attributes

Join: combines tuples of two relations that meet some condition

the condition refers only to base attributes

About Nested Relational Algebra

Such generality is not necessary for practical recommendations.


Operators

The Extend Operator

Information that conceptually refers to one entity is often found in several relations.

The extend operator creates extended attributes in the tuples of a relation.

Example

The ratings made by each student are viewed as a single “unit of information” per student.
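A minimal sketch of what an extend step could look like over in-memory tuples is given here. The function name `extend`, its parameters, and the sample data are assumptions made for illustration; the slides do not show the operator's actual signature.

```python
from collections import defaultdict


def extend(relation, other, key, new_attr):
    """Attach to each tuple of `relation` the tuples of `other` sharing `key`.

    The matching tuples (minus the key) become one nested attribute `new_attr`.
    """
    groups = defaultdict(list)
    for row in other:
        groups[row[key]].append({k: v for k, v in row.items() if k != key})
    # Build new tuples so the input relation is left unchanged.
    return [{**row, new_attr: groups.get(row[key], [])} for row in relation]


# Hypothetical sample data.
students = [{"StudID": 1234, "Name": "Alice"}, {"StudID": 444, "Name": "Bob"}]
ratings = [
    {"StudID": 1234, "CourseID": "C22", "Rating": 5},
    {"StudID": 1234, "CourseID": "C30", "Rating": 3},
    {"StudID": 444, "CourseID": "C22", "Rating": 4},
]
extended = extend(students, ratings, key="StudID", new_attr="Ratings")
```

Each student tuple in `extended` now carries all of that student's ratings as one nested value, which is the "unit of information" a comparison function can work on.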


The Recommend Operator

Comparison function

Recommendations are based on comparisons.

e.g., courses are rated by comparing their topics to a student’s interests.

e.g., user-user similarity in CF

The system has a library of comparison functions for recommendation tasks.

Comparison Function

An attribute can basically be given as the argument P.


The Recommend Operator

Comparison function

Examples

Comparisons of string values – Jaccard similarity

Comparisons of numerical values – Simple Distance

Using conditional probabilities

Comparisons of extended values

Comparisons of single values to extended values
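Two of the comparison kinds listed above can be sketched as follows. The exact formulas used by FlexRecs are not given in these slides, so both functions, their names, and the `max_diff` parameter are assumptions chosen only to show the idea of comparing numerical values and extended values.

```python
def numeric_similarity(a, b, max_diff):
    """1.0 when a == b, falling linearly to 0.0 at a difference of max_diff."""
    return max(0.0, 1.0 - abs(a - b) / max_diff)


def ratings_similarity(ratings_a, ratings_b, max_diff=4):
    """Compare two extended values (lists of {CourseID, Rating} tuples).

    Averages the numeric similarity over courses rated by both students;
    returns 0.0 when they share no rated course.
    """
    a = {r["CourseID"]: r["Rating"] for r in ratings_a}
    b = {r["CourseID"]: r["Rating"] for r in ratings_b}
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    return sum(numeric_similarity(a[c], b[c], max_diff) for c in common) / len(common)
```

With ratings on a 1 to 5 scale, `max_diff=4` makes identical ratings score 1.0 and opposite extremes score 0.0.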


The Recommend Operator

Aggregation

Comparison function

Comparison functions compare one tuple to another tuple.

It is often desirable to compare one tuple to a set of tuples.

All partial values are then combined into a final one (e.g., max, avg).

Example

Weighted average of the partial comparison values


The Recommend Operator

Recommend Operator

The score of each tuple is produced by comparing it to other tuples.

Each tuple ri of Ri is compared with every tuple of Rj using the comparison function cf,

and the results are combined by the aggregation function a to give the value v.

In this way, each candidate tuple ri of Ri receives a score.

Example

Recommend courses to Alice
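The scoring step described above can be sketched as a small higher-order function. This is a sketch under assumptions, not the engine's implementation: `recommend`, `topic_overlap`, the `Topics` attribute, and the course data are all hypothetical names introduced for illustration.

```python
def recommend(candidates, reference, cf, agg):
    """Score each candidate tuple against all reference tuples.

    cf(r_i, r_j) compares two tuples; agg combines the partial values.
    Returns (candidate, score) pairs sorted by descending score.
    """
    scored = []
    for r_i in candidates:
        partial = [cf(r_i, r_j) for r_j in reference]
        scored.append((r_i, agg(partial) if partial else 0.0))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def topic_overlap(course_a, course_b):
    """Hypothetical comparison function: overlap of two topic sets."""
    a, b = course_a["Topics"], course_b["Topics"]
    return len(a & b) / len(a | b) if a | b else 0.0


# Courses Alice has taken (reference) and courses she could take (candidates).
taken = [
    {"CourseID": "C1", "Topics": {"literature", "writing"}},
    {"CourseID": "C2", "Topics": {"writing", "poetry"}},
]
candidates = [
    {"CourseID": "C11", "Topics": {"graphics"}},
    {"CourseID": "C10", "Topics": {"writing", "fiction"}},
]
ranked = recommend(candidates, taken, cf=topic_overlap, agg=max)
```

Passing `cf` and `agg` as arguments mirrors the idea that the same operator covers many recommendation strategies; swapping `max` for an average changes the strategy without touching the operator.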


The Blend Operator

Blend

Combines recommendations generated by two different processing paths

e.g., recommendations based on courses taken by friends + recommendations based on courses required for graduation

Blending methods

Occurrence-based blending

Normalized blending

Weighted average blending
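Of the three blending methods, weighted average blending is the easiest to sketch. The slides do not define it precisely, so the version below is an assumption: scores are held in plain dictionaries, and an item missing from one input is treated as having score 0.0 there.

```python
def blend_weighted(recs_a, recs_b, w_a=1.0, w_b=1.0):
    """Blend two {item: score} dicts by weighted average.

    An item missing from one input contributes a score of 0.0 for that input
    (an assumption of this sketch).
    """
    items = recs_a.keys() | recs_b.keys()
    blended = {
        item: (w_a * recs_a.get(item, 0.0) + w_b * recs_b.get(item, 0.0)) / (w_a + w_b)
        for item in items
    }
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores from two processing paths.
from_friends = {"C23": 0.8, "C25": 0.4}
from_requirements = {"C25": 1.0, "C30": 0.6}
blended = blend_weighted(from_friends, from_requirements)
```

With equal weights, a course recommended by both paths (C25) ends up ahead of courses recommended by only one.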


Recommendation Workflows

Recommend and Blend Operators

capture the essence of most recommendation approaches

can be composed and combined with select, project, and join to describe recommendations

Recommendation Workflow

Examples

The following slides walk through several examples.

The requests come from a hypothetical student (user), Alice.

A few obvious steps are omitted, e.g., excluding items Alice has already consumed.


Recommendation Workflows

Recommendation Workflow Examples

Example 1: Related Courses

Alice is currently viewing the course “Programming: Part One” (C22).

Recommend courses offered in 2008 that are similar to this course.

The comparison function is Jaccard similarity on the course title (Title).


CourseID   Title                              Score
C23        Programming: Part Two              2/4 = 0.5
C25        Advanced Programming Methodology   1/5 = 0.2
C30        Computer Graphics                  0/5 = 0
…          …                                  …
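The scores in this table can be reproduced with a word-level Jaccard similarity on titles. The sketch below assumes titles are lowercased and split into alphanumeric words; the `Year` attribute and the 2007 course C40 are hypothetical additions used only to show the select step.

```python
import re


def jaccard(title_a, title_b):
    """Jaccard similarity between the word sets of two titles."""
    a = set(re.findall(r"[a-z0-9]+", title_a.lower()))
    b = set(re.findall(r"[a-z0-9]+", title_b.lower()))
    return len(a & b) / len(a | b) if a | b else 0.0


courses = [
    {"CourseID": "C23", "Title": "Programming: Part Two", "Year": 2008},
    {"CourseID": "C25", "Title": "Advanced Programming Methodology", "Year": 2008},
    {"CourseID": "C30", "Title": "Computer Graphics", "Year": 2008},
    {"CourseID": "C40", "Title": "Programming: Part Zero", "Year": 2007},  # hypothetical
]
current_title = "Programming: Part One"  # course C22, which Alice is viewing

# Select: keep only courses offered in 2008.
offered_2008 = [c for c in courses if c["Year"] == 2008]
# Recommend: score each remaining course against the current course title.
scores = {c["CourseID"]: jaccard(c["Title"], current_title) for c in offered_2008}
```

For C23 the titles share two words (programming, part) out of four distinct words, giving 2/4 = 0.5, which matches the first row of the table.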


Recommendation Workflows

Recommendation Workflow Examples

Example 2: Content-based Recommendation

Alice (StudID=1234) has already taken courses related to literature and writing.

She wants recommendations for courses to take this year (2008) that are similar to the courses she has taken so far.


Recommendation Workflows

Recommendation Workflow Examples

Example 3: Nearest-neighbor collaborative filtering

Find students whose tastes are similar to those of the student with SuID=444, and recommend based on their histories.

Each course's score is derived by giving more weight to the ratings of more similar students: a course is rated by taking the weighted average of the ratings provided by these students.

Comparisons of single values to extended values
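The two steps of this example (find similar students, then take a similarity-weighted average of their ratings) can be sketched as below. The similarity formula, the neighborhood size `k`, and all rating data are assumptions for illustration; only the student id 444 comes from the slide.

```python
def similarity(ratings_a, ratings_b, max_diff=4):
    """Similarity of two {course: rating} dicts over co-rated courses (assumed formula)."""
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0
    mean_diff = sum(abs(ratings_a[c] - ratings_b[c]) for c in common) / len(common)
    return 1.0 - mean_diff / max_diff


def cf_recommend(target_id, ratings, k=2):
    """Score unseen courses by the similarity-weighted average of neighbor ratings."""
    target = ratings[target_id]
    sims = {s: similarity(target, r) for s, r in ratings.items() if s != target_id}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    totals, weights = {}, {}
    for s in neighbors:
        for course, rating in ratings[s].items():
            # Skip courses the target already rated and neighbors with no similarity.
            if course in target or sims[s] <= 0:
                continue
            totals[course] = totals.get(course, 0.0) + sims[s] * rating
            weights[course] = weights.get(course, 0.0) + sims[s]
    return {c: totals[c] / weights[c] for c in totals}


# Hypothetical ratings keyed by student id.
ratings = {
    444: {"C22": 5, "C30": 2},
    1: {"C22": 5, "C30": 2, "C23": 4},
    2: {"C22": 3, "C30": 2, "C23": 2, "C25": 5},
    3: {"C22": 1, "C30": 5, "C25": 1},
}
scores = cf_recommend(444, ratings)
```

Student 3 disagrees with student 444 on both shared courses and falls outside the two nearest neighbors, so that student's low rating for C25 does not affect the result.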


Recommendation Workflows

Recommendation Workflow Examples

Example 5: Blending

Blend the content-based results from Ex. 2 with the CF results from Ex. 3, weighted at a ratio of 0.7 : 1.


Recommendation Workflows

Recommendation Workflow Examples

Ex. Many recommend and blend operators

Recommend by considering both students who took courses with similar content and students with a similar GPA.

Ex. Classification

Decide whether Alice is an honor student by judging how similar she is to honor students.

Ex. Recommending a major

Items other than courses (e.g., majors) can also be recommended.

Ex. Item-to-item movie recommendation

Item-based CF


Contents

Introduction

Related Work

Recommendation Framework

System Architecture

Architecture

Recommendation Plan Generator

Experiments

Conclusions


System Architecture

Architecture

Workflow Manager

Allows the designer to define recommendation workflows

Hides the details

Workflow Parser

Constructs an expression tree

Recommendation Plan Generator

Generates a recommendation execution plan

A plan is a sequence of SQL statements and function calls

Recommendation Generator

Executes a plan and returns the recommendations

Sends SQL to the DB engine


Recommendation Plan Generator

Recommendation Plan Generation

Builds a recommendation plan by traversing an expression tree

Query 1: similar users (creates a temporary in-memory table)

Query 2 & 3: one recommendation (Example 3)

Query 4: blend
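The generated SQL itself is not reproduced in these slides. As a rough sketch of what "a plan is a sequence of SQL statements" could mean for the collaborative filtering part, the following runs two hand-written statements in order. Everything here is an assumption: Python's `sqlite3` stands in for the Java/MySQL prototype, and the table names, column names, similarity formula, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ratings (SuID INTEGER, CourseID TEXT, Rating REAL);
INSERT INTO Ratings VALUES
  (444,'C22',5),(444,'C30',2),
  (1,'C22',5),(1,'C30',2),(1,'C23',4),
  (2,'C22',3),(2,'C30',2),(2,'C23',2),(2,'C25',5);
""")

plan = [
    # Step 1: similar users, kept in a temporary table.
    """CREATE TEMP TABLE SimilarUsers AS
       SELECT o.SuID AS SuID,
              1.0 - AVG(ABS(t.Rating - o.Rating)) / 4.0 AS Sim
       FROM Ratings t JOIN Ratings o ON t.CourseID = o.CourseID
       WHERE t.SuID = 444 AND o.SuID <> 444
       GROUP BY o.SuID""",
    # Step 2: similarity-weighted average rating of courses 444 has not rated.
    """CREATE TEMP TABLE CFScores AS
       SELECT r.CourseID AS CourseID,
              SUM(s.Sim * r.Rating) / SUM(s.Sim) AS Score
       FROM Ratings r JOIN SimilarUsers s ON r.SuID = s.SuID
       WHERE r.CourseID NOT IN (SELECT CourseID FROM Ratings WHERE SuID = 444)
       GROUP BY r.CourseID""",
]
for statement in plan:
    conn.execute(statement)

rows = conn.execute(
    "SELECT CourseID, Score FROM CFScores ORDER BY Score DESC"
).fetchall()
```

Each step leaves its result in a temporary table that the next step reads, so a later blend step could join two such score tables in the same way.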



Experiments

Objective

Examine the feasibility and performance of flexible recommendations

Study different workflows with different characteristics

Uses real data

Written in Java on top of MySQL

Workflows

Collaborative filtering

Major recommendation

Related courses

Friends-of-friends

more complex than the content-based and CF ones


Experiments

Workflow: Collaborative Filtering

Measured as the average time per user when recommending by computing each user's similarity to every other user

Gen time is the time to generate the SQL; it is small compared with the execution time.

Time grows linearly with the number of users.

Results are similar regardless of which comparison function is used.

Summary

It is easy to create multiple workflows and execute them transparently over the same flexible recommendation system, which combines extensibility with reasonable performance.



Conclusions

Contributions

Decouple the definition of a recommendation process from its execution

Introduce an extend operator that generates a virtual nested relation

Define recommend and blend operators that capture the essence of recommendation workflows

Provide several examples that show how common recommendations can be expressed

Describe a prototype flexible recommendation engine that realizes the proposed framework

New operators can be compiled into standard SQL for execution.

Present experimental results that show the potential of FlexRecs

Future Work

FlexRecs makes it possible to study the optimization of multiple recommendation workflows.

The authors are currently working on scaling over very large inputs.

Automatically balance complexity and effectiveness and identify the best option

It would be interesting to define flexible recommendations for XML or ontologies.

Design appropriate user interfaces that enable users to express flexible recommendations


Discussion

Flexible

make “flexible”

Synergy

Decouple the Definition of Recommendation

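Generalized through the recommend operator

Use of Nested Relations

The nested relational model is used in this way

In practice, GROUP BY would do the same job

SQL

Uses a conventional DBMS

Recommendations can already be implemented with SQL today, so is a performance evaluation necessary?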


Thank you~