Considering a Faceted Search-based Model Marti Hearst UCB SIMS hearst@sims.berkeley.edu NAS CSTB DNS Meeting on Internet Navigation and the Domain Name System: Technical Alternatives and Policy Implications July 12, 2001


Considering a Faceted Search-based Model

Marti Hearst
UCB SIMS

hearst@sims.berkeley.edu

NAS CSTB DNS Meeting on Internet Navigation and the Domain Name System:
Technical Alternatives and Policy Implications

July 12, 2001


Outline

  • The Klensin proposal
      • Synopsis
      • Issues
      • Recommendations
  • UIs and faceted search


A Proposal

A Search-based access model for the DNS
IETF Internet-Draft by John Klensin
http://www.ietf.org/internet-drafts/draft-klensin-dns-search-00.txt

  • A multi-layer approach to naming
  • Faceted descriptions are used to facilitate both flexible naming and inexact search

This talk: What does research tell us about the search issues?


Klensin’s proposal

[Diagram: the layers of the proposal, with the labels “Search” and “Lookup”]

  • Free-text Search (unregulated)
  • Faceted System (detailed, unregulated)
  • Faceted Classification System (simple, regulated)
  • DNS (unchanged)


Layer 2

IndustryCategory:Restaurant

Geolocation:Miami

Language:Spanish

NetworkLocation

Name:Jose’s Pizza


Layer 2: Faceted System (simple, regulated)

  • Inputs: search values for one or more facets
  • Outputs: appropriate DNS names and all tuples with matched facets
  • Allow for partial (fuzzy) match

Example:

  • Jose’s Pizza, Miami
  • Alberto’s Pizza, Miami
  • Jose’s Bistro, Miami
  • Jose’s Pizza, Saratoga
  • Joe’s Pizza, Miami
  • …
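As a rough illustration of the layer 2 behavior on this slide, the Python sketch below scores each record by how closely its facet values match the query and returns the best candidates first. The record layout, the `.example` DNS names, the `fuzzy_search` function, and the 0.5 threshold are all assumptions made for illustration. The proposal does not specify a matching algorithm, and `difflib` string similarity merely stands in for whatever fuzzy matcher a real service might use.

```python
from difflib import SequenceMatcher

# Hypothetical layer 2 records: facet values plus the DNS name they map to.
# The DNS names are invented for this sketch.
RECORDS = [
    {"Name": "Jose's Pizza", "Geolocation": "Miami", "dns": "josespizza-miami.example"},
    {"Name": "Alberto's Pizza", "Geolocation": "Miami", "dns": "albertos.example"},
    {"Name": "Jose's Bistro", "Geolocation": "Miami", "dns": "josesbistro.example"},
    {"Name": "Jose's Pizza", "Geolocation": "Saratoga", "dns": "josespizza-saratoga.example"},
    {"Name": "Joe's Pizza", "Geolocation": "Miami", "dns": "joespizza.example"},
]


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(query, records=RECORDS, threshold=0.5):
    """Return (score, record) pairs, best match first.

    `query` maps facet names to search values. The score is the mean
    similarity over the queried facets; records below `threshold` are dropped.
    """
    if not query:
        return []
    hits = []
    for rec in records:
        scores = [similarity(value, rec.get(facet, "")) for facet, value in query.items()]
        score = sum(scores) / len(scores)
        if score >= threshold:
            hits.append((score, rec))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


if __name__ == "__main__":
    for score, rec in fuzzy_search({"Name": "Jose's Pizza", "Geolocation": "Miami"}):
        print(f"{score:.2f}  {rec['Name']}, {rec['Geolocation']} -> {rec['dns']}")
```

The exact match ranks first, but near misses such as Joe’s Pizza in Miami are still returned, which is why some form of confirmation by the user is needed.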


Layer 2 Selling Points

Allows sharing of name space among different (commercial) entities

Allows specification according to meaningful attributes


Layer 2 DNS Issues

  • How to guarantee uniqueness?
  • How to determine appropriate descriptors?
  • How to use in a hyperlink?
  • Requires a user interface for confirmation of correct choice


Layer 2 Descriptor Issues

  • Emphasis on geolocation may be problematic
  • May be too sparse
      • SFMOMA
      • SFMOMA exhibits
      • SFMOMA exhibit on digital art called 101010


Layer 3: Faceted System (detailed, unregulated)

  • Not centrally coordinated (provided by commercial services)
  • More detailed facets
      • Allow for inheritance
      • Context-sensitive (e.g., restaurant has menu attribute, auto repair has services, etc.)
  • Inputs: service-dependent
  • Outputs: layer 2 names
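One way to picture inheritance and context-sensitivity among layer 3 facets is a small schema tree in which each category adds its own attributes to those of its parent. The Python sketch below is only an illustration: the `FacetSchema` class, the category names, and the choice of base attributes are assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FacetSchema:
    """A hypothetical category whose attributes extend those of its parent."""
    name: str
    attributes: set
    parent: Optional["FacetSchema"] = None

    def all_attributes(self):
        """Own attributes plus everything inherited from ancestors."""
        inherited = self.parent.all_attributes() if self.parent else set()
        return inherited | self.attributes


# Shared attributes live on the parent; context-specific ones on the children.
business = FacetSchema("Business", {"Name", "Geolocation", "Language"})
restaurant = FacetSchema("Restaurant", {"menu"}, parent=business)
auto_repair = FacetSchema("AutoRepair", {"services"}, parent=business)

if __name__ == "__main__":
    print(sorted(restaurant.all_attributes()))
    print(sorted(auto_repair.all_attributes()))
```

A restaurant therefore carries a menu attribute in addition to the shared ones, while an auto repair shop carries services instead.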


Layer 4: Free-text Search (unregulated)

Use standard search to find sites that discuss topics that relate to the query (as web search works today)


Relation to Web Search

  • Web search is perceived to work better today than two years ago. Why?
      • Finds appropriate starting points
      • Also known as source selection
      • Search for “toyota” no longer returns “Tony’s Toyota pages” as the top-ranked hit
  • Before the web, source selection was a separate operation from free text search
      • Also, queries tended to be longer
  • Web search engines could do this exclusively – but they do other things as well.


Recommendations on Klensin Proposal

  • A promising, intriguing approach
  • One tweak:
      • Combine layers 2 and 3
      • Have a partly regulated portion, and an open portion
      • This, however, is susceptible to spamming
  • Not clear if this should be regulated


General Pitfalls of Controlled Vocabularies

  • Difficult to get agreement on the set of labels
  • Difficult to assign labels consistently
      • Granularity
      • Salience / Emphasis
      • Context
      • Connotations
  • New labels always appearing; old ones shift in meaning
  • Lay people won’t know the system


How to do it wrong: Force into a Hierarchy

Let’s try to find UCB


What is the problem?

  • Two deeply hierarchical facets
      • Region
      • Education
  • Forced in convoluted ways into one hierarchy with irregular cross links


Two Approaches

Statistical approaches map words into metadata terms

Create flexible user interfaces that progressively reveal appropriate subparts of the system (How to do so is a topic of our research.)


The Practice

Using descriptors “under the hood”

  • The limited empirical work indicates
      • Combining free text + descriptors works best
  • Some e-commerce sites do this for finding products
  • Can sometimes match queries to standard information needs
      • “buy” + “palm”
      • “review” + “crouching tiger”
      • “berkeley” + “gap”
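A minimal sketch of how a query might be matched to a standard information need: cue words are turned into descriptors and the remaining words are kept as free text. The cue tables, the descriptor values, and the `interpret` function are hypothetical; a real service would need far larger vocabularies and probably statistical methods rather than a fixed lookup.

```python
# Hypothetical cue-word tables mapping query words to descriptor values.
TASK_CUES = {"buy": "purchase", "review": "read-reviews"}
PLACE_CUES = {"berkeley": "Berkeley, CA"}


def interpret(query):
    """Split a query into descriptors (facet -> value) and leftover free text."""
    descriptors = {}
    free_text = []
    for word in query.lower().split():
        if word in TASK_CUES:
            descriptors["Task"] = TASK_CUES[word]
        elif word in PLACE_CUES:
            descriptors["GeoRegion"] = PLACE_CUES[word]
        else:
            free_text.append(word)
    return descriptors, " ".join(free_text)


if __name__ == "__main__":
    for q in ["buy palm", "review crouching tiger", "berkeley gap"]:
        print(q, "->", interpret(q))
```

The descriptors can then restrict the search "under the hood" while the free text is matched in the usual way.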


walmart.com: Uses metadata “under the hood”


The Promise

Using descriptors in the User Interface

  • Use faceted metadata for navigation
  • Query Previews
  • Tailored Search Forms
  • Tightly Combine Navigation & Search


Facets

  • Orthogonal sets of descriptors
  • Gets complicated when they are hierarchical
  • Example: recipes


Metadata Facets

  • Time/Date
  • Topic
  • Task
  • GeoRegion

Advantage: Great for Mixing and Matching


Faceted Recipe Metadata

Recipe

  • Prepare
  • Cuisine
  • Ingredient
  • Dish
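To make the idea of orthogonal, hierarchical facets concrete, the sketch below stores each facet value as a path through that facet's hierarchy and lets a query constrain any combination of facets by path prefix. The recipes, the hierarchy levels, and the `select` function are invented for illustration.

```python
# Each facet value is a path (tuple) through that facet's own hierarchy.
# All data here is made up for the sketch.
RECIPES = [
    {"title": "Grilled salmon tacos",
     "Ingredient": [("Fish", "Salmon")], "Cuisine": [("Mexican",)],
     "Prepare": [("Grill",)], "Dish": [("Main", "Taco")]},
    {"title": "Baked cod",
     "Ingredient": [("Fish", "Cod")], "Cuisine": [("European", "Italian")],
     "Prepare": [("Bake",)], "Dish": [("Main",)]},
    {"title": "Chicken mole",
     "Ingredient": [("Poultry", "Chicken")], "Cuisine": [("Mexican",)],
     "Prepare": [("Simmer",)], "Dish": [("Main",)]},
]


def under(path, prefix):
    """True if `path` lies at or below `prefix` in a facet hierarchy."""
    return path[:len(prefix)] == prefix


def select(recipes, **constraints):
    """Keep recipes that satisfy every facet constraint.

    Each constraint maps a facet name to a path prefix; a recipe matches a
    facet if at least one of its values falls under that prefix.
    """
    return [
        r for r in recipes
        if all(any(under(p, prefix) for p in r.get(facet, []))
               for facet, prefix in constraints.items())
    ]


if __name__ == "__main__":
    print([r["title"] for r in select(RECIPES, Ingredient=("Fish",))])
    print([r["title"] for r in select(RECIPES, Ingredient=("Fish",), Cuisine=("Mexican",))])
```

Because the facets are independent of one another, constraints can be combined in any order, and a constraint can sit at any level of its hierarchy.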


Sunset.com: Not the right way


Dynamic Previews

  • Avoid empty result sets
  • Show the possible next steps
  • A way to seamlessly integrate
      • Related topics
      • User preferences (personalization)
      • Context-sensitivity


Metadata Usage in Epicurious

  • Can choose category types in any order
  • But categories never more than one level deep
  • And can never use more than one instance of a category
      • Even though items may be assigned more than one of each category type
  • Items (recipes) are dead-ends
      • Don’t link to “more like this”
  • Not fully integrated with search


Epicurious Metadata Usage
Problem: lacks integration with search


This is fixed in marthastewart.com


Advanced search more specific than sunset.com; also allows for disjunction; thus less likely to get null results
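The effect of disjunction on null results can be sketched in a few lines of Python. The items, facet names, and `facet_search` function are invented for illustration and say nothing about how any of the sites mentioned are implemented. The point is only that accepting several values within a facet (OR), while still requiring every facet to match (AND), makes an empty result less likely.

```python
# Made-up items with set-valued facets.
ITEMS = [
    {"title": "Tomato bruschetta", "Cuisine": {"Italian"}, "Ingredient": {"Tomato", "Basil"}},
    {"title": "Pad thai", "Cuisine": {"Thai"}, "Ingredient": {"Noodle", "Peanut"}},
    {"title": "Greek salad", "Cuisine": {"Greek"}, "Ingredient": {"Tomato", "Feta"}},
]


def facet_search(items, constraints):
    """Match items against facet constraints.

    `constraints` maps a facet name to a set of acceptable values.
    Values within one facet are OR-ed; different facets are AND-ed.
    """
    return [
        item for item in items
        if all(item.get(facet, set()) & wanted for facet, wanted in constraints.items())
    ]


if __name__ == "__main__":
    narrow = {"Cuisine": {"French"}, "Ingredient": {"Tomato"}}
    broad = {"Cuisine": {"French", "Italian", "Greek"}, "Ingredient": {"Tomato"}}
    print([i["title"] for i in facet_search(ITEMS, narrow)])  # single value per facet
    print([i["title"] for i in facet_search(ITEMS, broad)])   # disjunction within Cuisine
```

With a single cuisine the query returns nothing, whereas the disjunctive version finds the two tomato dishes.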


UIs for faceted metadata

  • Use dynamic previews
  • Allow user to select metadata in any order
  • At each step, show different types of relevant metadata, based on prior steps and personal history; include # of documents
  • Previews restricted to only those metadata types that might be helpful
  • Tightly integrate with keyword search
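A dynamic preview can be approximated by counting, for the current result set, how many documents carry each value of the facets not yet selected. The sketch below assumes single-valued facets and a toy collection; the `preview` function and the data are hypothetical, and personalization from the user's history is left out.

```python
from collections import Counter

# Toy collection with single-valued facets (invented for this sketch).
DOCS = [
    {"title": "a", "Topic": "Art", "GeoRegion": "San Francisco"},
    {"title": "b", "Topic": "Art", "GeoRegion": "Berkeley"},
    {"title": "c", "Topic": "Food", "GeoRegion": "San Francisco"},
    {"title": "d", "Topic": "Art", "GeoRegion": "San Francisco"},
]


def preview(docs, selected):
    """Return (number of current hits, per-facet value counts for next steps).

    `selected` maps facet names to the values chosen so far, in any order.
    Only values that occur in the current result set are counted, so every
    next step offered to the user leads to a non-empty result.
    """
    current = [d for d in docs if all(d.get(f) == v for f, v in selected.items())]
    counts = {}
    for doc in current:
        for facet, value in doc.items():
            if facet == "title" or facet in selected:
                continue
            counts.setdefault(facet, Counter())[value] += 1
    return len(current), counts


if __name__ == "__main__":
    print(preview(DOCS, {}))
    print(preview(DOCS, {"Topic": "Food"}))
```

After selecting Topic = Food, the preview offers only San Francisco as a next step, so the user is never led to an empty result set.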


The Flamenco Research Project

Systematically determine what works for integrating metadata into search interfaces

Develop recommendations that reflect both the task structure and the richness of the information structure

http://bailando.sims.berkeley.edu/flamenco.html


Summary

  • Agreement on metadata descriptor assignment is difficult to achieve
      • Descriptors need to be constantly updated
      • Layer 2 is probably not rich enough
  • Assigning specifiers is quite different from searching for specified items
  • Fuzzy search can help, but
      • Requires a UI for confirmation of correct choices
      • This will end up looking like a search service
      • Can make search more meaningful and task-based


Summary

  • Web search engines can do source selection, but
      • Sometimes users do want source selection
      • But search hits based on the content of pages are often closer to what users want
  • We need to be certain not to confuse source selection with content search