Introduction to Information Technologies (資訊科技導論), Part IV
Topic 1: Advanced Web Techniques
Instructor: I-Hsien Ting


RECOMMENDATION


RECOMMENDATION APPROACHES

Most recommendation systems are developed for personalization.
Different approaches to recommendation:
  Simple approach
  Content-based approach
  Collaborative filtering
  Hybrid methods


SIMPLE APPROACH

Rating
  e.g. John Doe gave the movie “Harry Potter” a rating of 7 (out of 10)
  The not-yet-rated problem?

User’s profile
  Formed from the preferences identified by users


CONTENT-BASED METHODS

The content-based approach to recommendation has its roots in information retrieval and information filtering.
Let Content(s) be an item profile
  Keywords for text-based systems
    The importance of keyword k_i in document d_j is determined with some weighting measure w_ij
  Features for graphical images, audio streams, and video streams


TF & IDF

How to measure the importance of keywords

TF (term-frequency)

IDF (inverse document frequency)

Weight
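
The formulas for this slide are not present in the text, so the sketch below assumes one widely used definition: term frequency normalized by the most frequent term in the document, IDF as log(N / n_i), and the weight as their product. The function name `tf_idf` and the sample documents are illustrative, not taken from the lecture.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed definitions (not confirmed by the slide):
      TF(i, j) = f(i, j) / max_z f(z, j)   frequency relative to the most frequent term
      IDF(i)   = log(N / n_i)              N documents, n_i of them contain term i
      w(i, j)  = TF(i, j) * IDF(i)
    """
    n_docs = len(docs)
    counts = [Counter(doc.lower().split()) for doc in docs]
    # n_i: number of documents in which each term occurs at least once
    doc_freq = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        max_f = max(c.values()) if c else 1
        weights.append({
            term: (f / max_f) * math.log(n_docs / doc_freq[term])
            for term, f in c.items()
        })
    return weights

# Hypothetical item descriptions
docs = [
    "harry potter magic school magic",
    "greek restaurant in town",
    "harry potter movie",
]
weights = tf_idf(docs)
```

Under these assumptions a term that is frequent in one document but rare across the collection ("magic") gets a higher weight than a term shared by several documents ("harry").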


SOME PROBLEMS OF CONTENT-BASED RECOMMENDATION

Limited content analysis
  Limited to text-based documents
  Automatic feature extraction methods are much harder to apply to multimedia data.

Overspecialization
  e.g. A person with no experience with Greek cuisine would never receive a recommendation for even the greatest Greek restaurant in town.
  Items should not be recommended if they are too similar to something the user has already seen.

New user problem
  Few ratings


COLLABORATIVE RECOMMENDATION

Recommendation based on “similar” users
Two general classes

Memory-based (heuristic-based)
  Heuristics that make rating predictions based on the entire collection of items previously rated by the users
  How similarity is measured is the key to generating recommendations

Model-based
  The collection of ratings is used to learn a model, which is then used to make rating predictions.
  Probabilistic approach
  Linear regression
  Bayesian model
  Data mining approaches: k-means clustering, ...
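
As a rough illustration of the memory-based class, the sketch below predicts a rating as a similarity-weighted average of the ratings other users gave the same item. The function name, the sample ratings, and the similarity values are all hypothetical; the similarities are assumed to have been computed beforehand by some measure.

```python
def predict_rating(target, item, ratings, similarity):
    """Predict target's rating for item from the ratings of other users.

    ratings:    {user: {item: rating}}
    similarity: {other_user: similarity to target}, assumed precomputed
    """
    num = 0.0
    den = 0.0
    for user, sim in similarity.items():
        # Skip the target user and anyone who has not rated the item
        if user == target or item not in ratings.get(user, {}):
            continue
        num += sim * ratings[user][item]
        den += abs(sim)
    # If no neighbor rated the item, no prediction can be made
    return num / den if den else None

ratings = {
    "john": {"Harry Potter": 7},
    "mary": {"Harry Potter": 8, "Alien": 4},
    "alex": {"Harry Potter": 6, "Alien": 9},
}
sim_to_john = {"mary": 0.9, "alex": 0.1}  # hypothetical similarity values
predicted = predict_rating("john", "Alien", ratings, sim_to_john)
unknown = predict_rating("john", "Amelie", ratings, sim_to_john)
```

Because Mary is far more similar to John than Alex is, the prediction stays close to Mary's rating. An item nobody has rated yields `None`, which is the new item problem in miniature.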


SIMILARITY

Pearson Correlation Coefficient

Cosine-based Approach

Mean-squared difference
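
The three measures named above can be sketched as follows, computed over the items both users have rated. The formulas behind the slide are not in the text, so these are the textbook definitions, and the user data is hypothetical.

```python
import math

def co_rated(a, b):
    """Items rated by both users, in a fixed order."""
    return sorted(set(a) & set(b))

def pearson(a, b):
    """Pearson correlation of two users' ratings on co-rated items."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    norm = (math.sqrt(sum((a[i] - mean_a) ** 2 for i in items))
            * math.sqrt(sum((b[i] - mean_b) ** 2 for i in items)))
    return cov / norm if norm else 0.0

def cosine(a, b):
    """Cosine of the angle between the two rating vectors."""
    items = co_rated(a, b)
    dot = sum(a[i] * b[i] for i in items)
    norm = (math.sqrt(sum(a[i] ** 2 for i in items))
            * math.sqrt(sum(b[i] ** 2 for i in items)))
    return dot / norm if norm else 0.0

def mean_squared_difference(a, b):
    """A dissimilarity: 0 means identical ratings on the co-rated items."""
    items = co_rated(a, b)
    if not items:
        return None
    return sum((a[i] - b[i]) ** 2 for i in items) / len(items)

# Hypothetical ratings on a 1-10 scale
john = {"Harry Potter": 7, "Alien": 3, "Amelie": 8}
mary = {"Harry Potter": 8, "Alien": 4, "Amelie": 9}
alex = {"Harry Potter": 2, "Alien": 9}
```

Mary rates everything one point higher than John, so their Pearson correlation is 1 even though their mean-squared difference is not 0. Pearson ignores such a constant offset, while the other two measures do not.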


SOME PROBLEMS OF COLLABORATIVE RECOMMENDATION

New user problem
New item problem
Sparsity
  It is not enough to generate recommendations from rating information alone
  Demographic segments
    Gender, age, area code, education, and employment information


HYBRID RECOMMENDATION METHODS

For more accurate recommendations
Different ways
  Implementing collaborative and content-based methods separately and combining their predictions
    DailyLearner system
  Incorporating some content-based characteristics into a collaborative approach
    Fab and “collaboration via content”
  Incorporating some collaborative characteristics into a content-based approach
  Constructing a general unifying model that incorporates both content-based and collaborative characteristics
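
The first of these ways, running both methods separately and combining their predictions, could be sketched as a simple linear blend. This is only one possible combination scheme, not the one used by any system named above; the function name and the `weight` parameter are hypothetical.

```python
def combine_predictions(collab, content, weight=0.5):
    """Linear blend of two rating predictions.

    weight is the (hypothetical) share given to the collaborative prediction.
    If one method cannot produce a prediction (None), fall back to the other.
    """
    if collab is None:
        return content
    if content is None:
        return collab
    return weight * collab + (1 - weight) * content

blended = combine_predictions(collab=8.0, content=6.0, weight=0.75)
fallback = combine_predictions(collab=None, content=6.0)
```

The fallback branch hints at why hybrids help: when collaborative filtering has no ratings to work with (a new item, for example), the content-based prediction can still be used.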


RECOMMENDATION SYSTEMS: NEWS DUDE

Billsus and Pazzani, “A hybrid user model for news story classification,” Conf. on User Modeling, 1999.
A content-based approach for filtering news.
  A short-term interest profile that records recently read news.
  A long-term interest profile described as a probability model.
  An article first goes through the short-term interest profile, followed by the long-term one.
Experimental results show that the hybrid approach performs better than either model alone.


RECOMMENDATION SYSTEMS: FIREFLY

Shardanand and Maes, “Social information filtering: Algorithms for automating ‘word of mouth’,” CHI95.
A collaborative approach for filtering music.
An early version was called Ringo.


RECOMMENDATION SYSTEMS: WEBWATCHER

Joachims, Freitag, Mitchell, “WebWatcher: A tour guide for the World Wide Web,” Conf. on AI, 1997.
Combines content-based and collaborative approaches to weigh hyperlinks in a given page.
The core is a content-based prediction.
Users have to specify their browsing goal at the beginning.
The content of a hyperlink includes
  Web page text.
  Users’ descriptive keywords.
The results have been shown to be as good as those of human experts.


RECOMMENDATION SYSTEMS: CLIXSMART

Perkowitz and Etzioni, “Adaptive Web sites: An AI challenge,” IJCAI97.
A combination of content-based and collaborative recommendation for a personalized TV guide.
Serving more than 20,000 users in Ireland and Great Britain.
Each program is characterized by name, channel, airtime, genre, country of origin, cast, studio, director, writer, etc.
Launched in 1999, it has attracted more than 20,000 registered users.
In questionnaires, users express a high degree of satisfaction.
Precision measurements show that collaborative filtering performs better than content-based filtering, which in turn performs better than random selection.


SEMANTIC WEB


SEMANTIC WEB

A self-explanatory Web
Web pages are designed to be read by people, not by machines
Example: searching for a low-priced flight ticket
Example: making a reservation at a library


SEMANTIC WEB & NON-SEMANTIC WEB

Non-semantic web
  <item>cat</item>

Semantic web
  <animal Kingdom="Animalia" Phylum="Chordata" Class="Mammalia" Order="Carnivora" Family="Felidae" Genus="Felis">Cat</animal>


METADATA IN HTML

<meta name="keywords" content="computing, computer studies, computer">
<meta name="description" content="Cheap widgets for sale">
<meta name="author" content="Hack's Hardware">
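
To show how a program might read this metadata, here is a minimal sketch using Python's standard `html.parser` module. The class name `MetaCollector` and the surrounding `<head>` wrapper are assumptions made for the example.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.meta[attrs["name"]] = attrs["content"]

# The three tags from the slide, wrapped in a <head> for the example
page = """
<head>
<meta name="keywords" content="computing, computer studies, computer">
<meta name="description" content="Cheap widgets for sale">
<meta name="author" content="Hack's Hardware">
</head>
"""
collector = MetaCollector()
collector.feed(page)
keywords = [k.strip() for k in collector.meta["keywords"].split(",")]
```

The program can extract the strings, but nothing in the markup says what "computing" or "widgets" mean or how they relate. That gap is what RDF and OWL aim to fill.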


RDF: AN EXTENSION OF METADATA

RDF: Resource Description Framework
RDF can be written in XML
A simple RDF example

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:terms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:x-states:New%20York">
    <terms:alternative>NY</terms:alternative>
  </rdf:Description>
</rdf:RDF>
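
An RDF document states facts as (subject, predicate, object) triples. The sketch below pulls the single triple out of the example above using only Python's standard XML parser. It handles just this simple pattern (an `rdf:Description` with literal-valued child elements), not RDF/XML in general, and the function name `extract_triples` is illustrative; a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

rdf_xml = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:terms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:x-states:New%20York">
    <terms:alternative>NY</terms:alternative>
  </rdf:Description>
</rdf:RDF>
"""

def extract_triples(xml_text):
    """Yield (subject, predicate, object) for simple literal-valued properties."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree stores tags as "{namespace}local"; join them into one URI
            predicate = prop.tag.replace("{", "").replace("}", "")
            yield subject, predicate, (prop.text or "").strip()

triples = list(extract_triples(rdf_xml))
```

Read aloud, the resulting triple says: the resource `urn:x-states:New%20York` has the alternative name "NY".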


OWL

OWL: Web Ontology Language
  A family of knowledge representation languages for authoring ontologies
  OWL Web Ontology Language Overview
  http://www.w3.org/TR/owl-features/

<owl:Class rdf:ID="Burgundy">
  ...
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasSugar" />
      <owl:hasValue rdf:resource="#Dry" />
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>


SPARQL

An RDF query language

PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}
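
To make the pattern matching in this query concrete, the sketch below evaluates the same four triple patterns by hand over a tiny in-memory list of triples. The data is hypothetical and the prefixes are dropped for brevity. It is not a SPARQL engine, only an illustration of what the WHERE clause asks for.

```python
# Hypothetical data in the shape the query expects: (subject, predicate, object)
triples = [
    ("city1", "cityname", "Nairobi"),
    ("city1", "isCapitalOf", "country1"),
    ("country1", "countryname", "Kenya"),
    ("country1", "isInContinent", "Africa"),
    ("city2", "cityname", "Paris"),
    ("city2", "isCapitalOf", "country2"),
    ("country2", "countryname", "France"),
    ("country2", "isInContinent", "Europe"),
]

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def african_capitals():
    results = []
    subjects = sorted({s for s, _, _ in triples})
    for x in subjects:                                # candidate bindings for ?x
        for capital in objects(x, "cityname"):        # ?x abc:cityname ?capital
            for y in objects(x, "isCapitalOf"):       # ?x abc:isCapitalOf ?y
                if "Africa" not in objects(y, "isInContinent"):
                    continue                          # ?y abc:isInContinent abc:Africa
                for country in objects(y, "countryname"):  # ?y abc:countryname ?country
                    results.append((capital, country))
    return results

rows = african_capitals()
```

Each nested loop binds one variable of the query. Only bindings that satisfy every pattern at once end up in the result, which is why Paris is filtered out.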


LINKING OPEN DATA DIAGRAM

Creating openly accessible, interlinked RDF data on the Web.

http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData


SEMANTIC WEB EXAMPLES

BibServ: http://www.bibserv.org/

ESP Game: http://www.espgame.org/