Distributed information retrieval Aggregated search Summary
Information Retrieval: Federated Search
Ilya Markov, [email protected]
University of Amsterdam
Ilya Markov [email protected] Information Retrieval 1
Course overview
Offline: Data Acquisition, Data Processing, Data Storage
Online: Evaluation, Ranking, Query Processing
Advanced: Aggregated Search, Click Models, Present and Future of IR
Outline
1 Distributed information retrieval
2 Aggregated search
3 Summary
Outline
1 Distributed information retrieval: resource description, resource selection, score normalization and results merging
2 Aggregated search
3 Summary
Taxonomy of IR architectures
Broker-based DIR
[Figure: a broker forwards the query Q to several resources and merges the returned results R]
1 Resource description
2 Resource selection
3 Score normalization
4 Results merging
Resource Description
Full content
Index of the full content
Metadata
Statistics: tf, df, average document length, . . .
Combination
Other. . .
DIR cooperation
Cooperative environments: a resource provides full access to its documents and indices and responds to queries
Uncooperative environments: a resource does not provide any access to its documents and indices; it only responds to queries
Query-based sampling
[Figure: sampling queries Q are sent to the resource and the retrieved documents are added to its description]
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97–130, 2001.
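The sampling loop above can be sketched as follows. This is a minimal illustration, not Callan and Connell's exact algorithm: `send_query` is a hypothetical callable standing in for the resource's search interface, the initial `candidate_terms` play the role of a reference dictionary, and terms harvested from retrieved documents grow the query pool.

```python
import random

def query_based_sample(send_query, candidate_terms, n_docs=300, top_k=4, seed=42):
    """Build a resource description by query-based sampling (a sketch).

    send_query(term) is a hypothetical callable returning (doc_id, text)
    pairs that the resource retrieves for a one-term query."""
    rng = random.Random(seed)
    description = {}                   # doc_id -> text of sampled documents
    term_pool = list(candidate_terms)  # initial terms, e.g. a reference dictionary
    while len(description) < n_docs and term_pool:
        # pick and consume a random term from the pool
        term = term_pool.pop(rng.randrange(len(term_pool)))
        for doc_id, text in send_query(term)[:top_k]:
            if doc_id not in description:
                description[doc_id] = text
                # grow the pool from sampled documents (learned description)
                term_pool.extend(set(text.lower().split()))
    return description
```

Stopping here is simply "enough documents sampled"; the slide's question of when to stop is a research issue in its own right.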
QBS issues
[Figure: as sampling proceeds, it is unclear how well the description approximates the resource]
How to select queries for sampling?
When to stop sampling?
Selecting sampling queries: where from?
Other Resource Description (ORD): selects terms from a reference dictionary
Learned Resource Description (LRD): selects terms from the retrieved documents based on term statistics
Selecting sampling queries: how?
Random selection
Document Frequency (df )
Collection Frequency (ctf )
Average Term Frequency (ctf /df )
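Given the documents sampled so far, the three statistics above can be computed and used to pick the next one-term query. This is a sketch; whitespace tokenization is a simplifying assumption.

```python
from collections import Counter

def term_statistics(sampled_docs):
    """Compute df, ctf and average term frequency (ctf/df) over a sample."""
    df, ctf = Counter(), Counter()
    for text in sampled_docs:
        counts = Counter(text.lower().split())
        for term, tf in counts.items():
            df[term] += 1    # number of sampled documents containing the term
            ctf[term] += tf  # total occurrences in the sample
    avg_tf = {t: ctf[t] / df[t] for t in df}
    return df, ctf, avg_tf

def next_query_term(sampled_docs, criterion="df"):
    """Select the next sampling query by one of the slide's criteria."""
    df, ctf, avg_tf = term_statistics(sampled_docs)
    stats = {"df": df, "ctf": ctf, "avg_tf": avg_tf}[criterion]
    return max(stats, key=stats.get)
```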
Resource selection
1 Large-document model
2 Small-document model
3 Classification-based selection
Large-document model
[Figure: resources are scored and ranked for the query like documents]
1 Resources as documents
2 Retrieval on documents
3 Documents are resources
Ilya Markov [email protected] Information Retrieval 17
Distributed information retrieval Aggregated search Summary
Large-document model examples
CORI: BM25 scoring function
LM-KL: Kullback-Leibler divergence between the query and resource language models
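A large-document scorer in the LM-KL spirit can be sketched as follows: all sampled documents of a resource are concatenated into one "large document", and ranking by negative KL divergence from the query model reduces to the cross-entropy below. Dirichlet smoothing with parameter `mu` is an assumption for illustration, not necessarily the exact variant used in the literature.

```python
import math
from collections import Counter

def lm_kl_score(query, resource_docs, collection_docs, mu=2000):
    """Large-document model sketch: score a resource by treating its
    sampled documents as one big document under a smoothed language model."""
    resource = Counter(" ".join(resource_docs).lower().split())
    collection = Counter(" ".join(collection_docs).lower().split())
    r_len, c_len = sum(resource.values()), sum(collection.values())
    score = 0.0
    for term, q_tf in Counter(query.lower().split()).items():
        p_c = collection[term] / c_len if c_len else 0.0
        # Dirichlet-smoothed resource language model
        p_r = (resource[term] + mu * p_c) / (r_len + mu)
        if p_r > 0:
            # minimizing KL(query LM || resource LM) == maximizing this sum
            score += q_tf * math.log(p_r)
    return score
```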
Small-document model
1 Rank sample documents for a given user’s query
2 Consider the top n documents
3 Calculate a score for a resource R based on its documents in the top n
[Figure: centralized ranking of sampled documents 1 … k with the top n highlighted]
L. Si and J. Callan. Relevant document distribution estimation method for resource selection. In Proceedings of SIGIR, pages 298–305, 2003.
Small-document model examples
[Figure: top n of the merged sample ranking with documents colored by their source resource]
Counts: score(R_red | q) = 3 · |R_red| / |S_red|
Retrieval scores: score(R_green | q) = (score(d_2 | q) + score(d_7 | q)) · |R_green| / |S_green|
Ranks (n − rank(d | q) + 1): score(R_yellow | q) = (6 + 3 + 1) · |R_yellow| / |S_yellow| = 10 · |R_yellow| / |S_yellow|
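The three variants share the same skeleton. A sketch of the count-based (ReDDE-style) one, where |R| and |S| are the resource and sample sizes:

```python
from collections import Counter

def small_document_scores(ranked_sample, resource_sizes, sample_sizes, n=10):
    """Small-document model sketch: count each resource's documents in the
    top n of the centralized sample ranking and scale by |R| / |S|.

    ranked_sample: list of resource ids, one per sampled document,
                   ordered by retrieval score for the query."""
    counts = Counter(ranked_sample[:n])
    return {r: counts[r] * resource_sizes[r] / sample_sizes[r]
            for r in resource_sizes}
```

Replacing `counts[r]` with summed retrieval scores or reversed ranks of the resource's top-n documents gives the other two variants.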
Classification-based selection
Classify each resource as “shown” or “not shown”
Use multiple features, including large-document (LD) and small-document (SD) scores
Score normalization and results merging
Score normalization: unsupervised, semi-supervised, supervised
Results merging: linear / non-linear; unsupervised / semi-supervised
Linear score normalization
MinMax: s_norm = (s − min) / (max − min)
Z-score: s_norm = (s − μ) / σ
Sum: s′ = s − min,  s_norm = s′ / Σ_i s′_i
I. Markov, A. Arampatzis and F. Crestani. Unsupervised linear score normalization revisited. In Proceedings of SIGIR, pages 1161–1162, 2012.
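The three normalizations above, in plain Python (each sketch assumes at least two distinct scores, so the denominators are non-zero):

```python
def min_max(scores):
    """MinMax: map scores linearly onto [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def z_score(scores):
    """Z-score: zero mean, unit (population) standard deviation."""
    mu = sum(scores) / len(scores)
    sigma = (sum((s - mu) ** 2 for s in scores) / len(scores)) ** 0.5
    return [(s - mu) / sigma for s in scores]

def sum_norm(scores):
    """Sum: shift by the minimum, then normalize so the scores sum to 1."""
    shifted = [s - min(scores) for s in scores]
    total = sum(shifted)
    return [s / total for s in shifted]
```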
Results merging (CORI)
[Figure: four resources return result lists whose raw scores are incomparable; each list is MinMax-normalized and scaled by its resource selection score (0.95, 0.56, 0.29, 0.15) before merging]
s_norm = s_MinMax · (1 + λ · w_resource)
J. Callan, Z. Lu and W. B. Croft. Searching distributed collections with inference networks. In Proceedings of SIGIR, pages 21–28, 1995.
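The merging step above can be sketched as follows. Normalizing the resource selection scores to [0, 1] to obtain w_resource, and the default λ = 0.4, are illustrative assumptions, not necessarily the exact CORI recipe.

```python
def cori_merge(result_lists, resource_scores, lam=0.4):
    """CORI-style merging sketch: MinMax-normalize each resource's document
    scores, boost them by the normalized resource selection score, and
    merge everything into one ranking.

    result_lists: {resource: [(doc_id, raw_score), ...]}"""
    r_lo, r_hi = min(resource_scores.values()), max(resource_scores.values())
    merged = []
    for resource, results in result_lists.items():
        scores = [s for _, s in results]
        lo, hi = min(scores), max(scores)
        # normalized resource weight in [0, 1]
        w = (resource_scores[resource] - r_lo) / (r_hi - r_lo) if r_hi > r_lo else 1.0
        for doc_id, s in results:
            s_mm = (s - lo) / (hi - lo) if hi > lo else 1.0
            merged.append((doc_id, s_mm * (1 + lam * w)))
    return sorted(merged, key=lambda x: x[1], reverse=True)
```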
Distributed information retrieval summary
1 Resource description
Query-based sampling
2 Resource selection
Large-document model
Small-document model
Classification-based selection
3 Score normalization and results merging
Linear / non-linear
Supervised / semi-supervised / unsupervised
Outline
1 Distributed information retrieval
2 Aggregated search: vertical representation, vertical selection, results presentation, search behavior, evaluation, present and future
3 Summary
Aggregated search
1 Vertical Representation: represent each vertical as a set of high-level features
2 Vertical Selection: given a query, decide for each vertical whether it has to be presented or not
3 Results Presentation: blend the results and position them appropriately on the page
Vertical representation
Can be required for efficiency reasons
Vertical sampling: query-based sampling is used to retrieve documents from each vertical; sample queries are taken from the verticals' query logs
Wikipedia sampling: verticals are assigned to Wikipedia categories; documents are then sampled from these categories
J. Arguello, F. Diaz, J. Callan, and J.-F. Crespo. Sources of evidence for vertical selection. In Proceedings of SIGIR, pages 315–322, 2009.
Vertical selection
Classification-based selection
Features
Query string features
Query-log features
Corpus features (LD, SD)
J. Arguello, F. Diaz, J. Callan, and J.-F. Crespo. Sources of evidence for vertical selection. In Proceedings of SIGIR, pages 315–322, 2009.
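Classification-based vertical selection can be sketched as one binary classifier per vertical over such features. The logistic form, the feature names, and the weights below are illustrative assumptions, not the classifier of Arguello et al.

```python
import math

def vertical_scores(features, weights, threshold=0.5):
    """Per-vertical binary classification sketch.

    features: {feature_name: value} for the query (hypothetical names);
    weights:  {vertical: {feature_name: weight, 'bias': b}}, assumed
              pre-trained on labeled (query, vertical) pairs."""
    selected = {}
    for vertical, w in weights.items():
        z = w.get("bias", 0.0) + sum(w.get(f, 0.0) * v for f, v in features.items())
        p = 1.0 / (1.0 + math.exp(-z))  # probability the vertical is shown
        selected[vertical] = p
    return {v: p for v, p in selected.items() if p >= threshold}
```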
Vertical Selection Results
Features               Prediction precision   diff %
all                    0.583
No query-logs          0.583                  0.03%
No triggers            0.583                  -0.03%
No query difficulty    0.582†                 -0.10%
No geo-information     0.577†                 -1.01%
No SD selection        0.568†                 -2.60%
Table: Multiple-feature predictors with one feature left out, showing the contribution of that feature
J. Arguello, F. Diaz, J. Callan, and J.-F. Crespo. Sources of evidence for vertical selection. In Proceedings of SIGIR, pages 315–322, 2009.
Results presentation
Learning to rank
Pre-retrieval features
Query: named-entity type
Vertical: category
Query-vertical: click-through rate
Query-vertical: vertical intent
Post-retrieval features
Hit counts
Text similarity
[Figure 3: a screenshot of the SERP showing results from verticals at ToP, MoP and BoP]
where x_1 and x_2 are the feature vectors for the queries q_1 and q_2, respectively. To find this function ρ_V, we generate a set of pairs ⟨x, l⟩ that we call examples. Then we try to find a function ρ_V that best fits the data. For this, we use a learning algorithm, which is described in Section 5.
In an example ⟨x, l⟩, l is referred to as a label. In some sense, l is the value we want the function ρ_V to predict for input x. We can provide contradicting examples too. In this case, the learning algorithm tries to find a ρ_V that best fits these examples under some metric. We may also associate a weight w with each example ⟨x, l⟩ that indicates the relative importance of the example. So our input to the learning algorithm is a list of tuples ⟨x, l, w⟩. There are two common methods for generating the examples for training:
• Sample a set of queries for which the vertical triggers. Then show the contents of the web results and the vertical to judges and ask them to assign a label, say 0 or 1 for relevant or not relevant, or a number in some range like 0, 1, 2, 3 indicating the degree of the relevance of the vertical for the query.
• Sample a set of queries for which the vertical triggers from the user logs, and based on the user behavior, assign a label, say 0 or 1, to indicate if the vertical may have satisfied the user's information need.
In this paper, we concentrate on the latter approach, using the former only for comparison in our offline analysis. The details of our label generation process follow in Section 5.3.
4.2 Calibration and Placement
Once we arrive at a ranking function ρ_V, our next task is to match coverages at ToP, MoP and BoP based on our agreements with vertical owners. This is a straightforward process. Given a sample Q of queries for which the vertical triggers, we compute the value provided by the ranking function ρ_V for these queries, sort them, and pick two thresholds θ_{V,ToP} and θ_{V,MoP} such that
|{q ∈ Q : ρ_V(Φ(q, W_q, v_q)) ≥ θ_{V,ToP}}| / |Q| = c_ToP
and
|{q ∈ Q : θ_{V,ToP} > ρ_V(Φ(q, W_q, v_q)) ≥ θ_{V,MoP}}| / |Q| = c_MoP.
This gives us a function µ_V : ℝ → S, where
µ_V(x) = ToP if x ≥ θ_{V,ToP}; MoP if θ_{V,ToP} > x ≥ θ_{V,MoP}; BoP otherwise.
Finally, the function f_V that we wanted to compute can be written as
f_V(q, W_q, v_q) = µ_V(ρ_V(Φ_V(q, W_q, v_q))).
For convenience, with a slight abuse of notation, we will drop the vertical V from the subscript (since we rank each vertical individually) and we may use ρ(q) to denote ρ_V(Φ_V(q, W_q, v_q)).
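The calibration step described above, picking two thresholds so that given fractions of triggered queries land at ToP and MoP, can be sketched as follows. Function names are illustrative, and ties and degenerate coverages are ignored.

```python
def calibrate_thresholds(scores, c_top, c_mop):
    """Pick thresholds so the top c_top fraction of triggered queries goes
    to ToP and the next c_mop fraction to MoP (agreed coverages)."""
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    theta_top = ranked[max(int(n * c_top) - 1, 0)]
    theta_mop = ranked[max(int(n * (c_top + c_mop)) - 1, 0)]
    return theta_top, theta_mop

def placement(score, theta_top, theta_mop):
    """The slotting function: ToP / MoP / BoP by threshold."""
    if score >= theta_top:
        return "ToP"
    if score >= theta_mop:
        return "MoP"
    return "BoP"
```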
5. STATISTICAL MODELING
In this section, we first describe Gradient Boosted Decision Trees (GBDT), which is the machine learning algorithm we used to learn online user preference and predict human relevance grades. Then we describe how we represent the data using features such that the web component of the feature vector becomes a reference frame or anchor, and finally we describe how we generate training labels from online user click logs.
5.1 Gradient Boosted Decision Trees
Gradient boosted decision trees (GBDT) are non-parametric regression models [8]. GBDTs and their stochastic variant [9] compute a function approximation by performing a numerical optimization in the function space instead of the parameter space. We provide an overview of the GBDT algorithm.
A basic regression tree f(x), x ∈ ℝ^K, partitions the space of explanatory variable values into disjoint regions R_j, j = 1, 2, ..., J, associated with the terminal nodes of the tree. Each region is assigned a value γ_j such that f(x) = γ_j if x ∈ R_j. Thus the complete tree is represented as:
T(x; Θ) = Σ_{j=1}^{J} γ_j I(x ∈ R_j),
where Θ = {R_j, γ_j}_1^J and I is the indicator function. Let the given training data be denoted by (x_i, y_i, w_i), i = 1, ..., N. That is, x_i are the observed feature vectors, y_i the labels, and w_i the weight associated with the training pair (x_i, y_i). For a loss function L(y_i, γ_j), parameters are estimated by minimizing the total loss:
Θ̂ = argmin_Θ Σ_{j=1}^{J} Σ_{x_i ∈ R_j} w_i · L(y_i, γ_j).
In our experiments, we perform regression using the squared error as loss function.
A boosted tree is an aggregate of such trees, each of which is computed in a sequence of stages. That is,
f_M(x) = Σ_{m=1}^{M} T(x; Θ_m),
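The regression-tree and boosting equations can be illustrated with a minimal pure-Python sketch: one-dimensional inputs, depth-1 trees (stumps), and weighted squared-error loss. Fitting each stage to the residuals with a learning rate is the standard least-squares form of gradient boosting; the stump learner and the constants here are simplifying assumptions, not the production GBDT.

```python
def fit_stump(xs, ys, ws):
    """Weighted least-squares regression stump: find the split minimizing
    sum_i w_i * (y_i - gamma_region(i))^2 over candidate thresholds."""
    best = None
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    for k in range(1, len(xs)):
        thr = (xs[order[k - 1]] + xs[order[k]]) / 2
        left = [i for i in range(len(xs)) if xs[i] <= thr]
        right = [i for i in range(len(xs)) if xs[i] > thr]
        if not left or not right:
            continue
        def gamma(idx):  # weighted mean minimizes weighted squared loss
            return sum(ws[i] * ys[i] for i in idx) / sum(ws[i] for i in idx)
        gl, gr = gamma(left), gamma(right)
        loss = sum(ws[i] * (ys[i] - (gl if xs[i] <= thr else gr)) ** 2
                   for i in range(len(xs)))
        if best is None or loss < best[0]:
            best = (loss, thr, gl, gr)
    _, thr, gl, gr = best
    return lambda x: gl if x <= thr else gr

def boost(xs, ys, ws, n_stages=20, lr=0.3):
    """f_M(x) = sum_m T(x; Theta_m): fit each stage to the residuals."""
    trees, residuals = [], list(ys)
    for _ in range(n_stages):
        tree = fit_stump(xs, residuals, ws)
        trees.append(tree)
        residuals = [r - lr * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: sum(lr * t(x) for t in trees)
```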
A. K. Ponnuswami, K. Pattabiraman, Q. Wu, R. Gilad-Bachrach, and T. Kanungo. On composition of a federated web search result page: using online users to provide pairwise preference for heterogeneous verticals. In Proceedings of WSDM, pages 715–724, 2011.
Search behavior: eye fixation
None of the changes were detectable by the subjects. When asked after their query sessions, none of the subjects suspected any manipulation. This server is also used to log all click-through behavior and all SERPs the subjects visit.
All subjects' eye movements are recorded using an SMI RED250 eye-tracker, which uses infrared to reconstruct the subjects' eye positions. BeGaze Experimental Center is used for the simultaneous acquisition and analysis of the subjects' eye movements. With this tracking device, the following indicators of ocular behavior are recorded: fixations, saccades, pupil dilation, and scan paths [12]. Among these behaviors, we focus on eye fixation, the most relevant metric for evaluating information processing in online search. In this paper, an eye fixation is defined as a spatially stable gaze lasting for approximately 200–300 milliseconds, during which visual attention is directed to a specific area of the visual display.
4.2 Do Users Examine Verticals First?
To analyze which result users pay attention to first, we collect the subjects' eye fixations on the screen during the first two seconds. Figure 8 shows two samples from the eye-tracking data, showing users' viewing areas on SERPs with different kinds of verticals or with no vertical results. From this figure we can see that users pay most attention to the first result when there is no vertical on the SERP (which should be regarded as a sign of position bias). However, when there is a multimedia vertical result at the third position, it attracts a lot of the users' direct attention.
We set 250 milliseconds as the threshold of a fixation action and labeled each document's boundary manually. We can then record each subject's eye-examining sequence over the document results for each SERP. This statistical result shows the users' sequential examining behavior. We compared the users' first-examining behavior for each vertical class using the same form as the users' first-click distribution analyzed in the previous section. Figure 9 shows the subjects' first-examining distribution at each document position for each vertical class, compared with the no-vertical situation. From Figure 9(b) we can see that when a multimedia vertical is placed on the first screen (ranks 1, 3 and 5), it actually attracts more attention than ordinary results. This conclusion is consistent with CT 2 in Section 3.5. For application verticals, Figure 9(c) shows that the application vertical attracts slightly more attention than ordinary results. Meanwhile, Figure 9(a) shows that the text vertical doesn't attract much user attention.
4.3 Behavior after Examining Verticals First
To validate the findings in Section 3 about how users behave after clicking a vertical result first, we look into the examining sequences of subjects when they examine vertical results first.
Two typical examining patterns are extracted from the eye-tracking data and shown in Figure 10. We can see from this figure that users examine the results above verticals either sequentially (top-down) or starting from the ones next to the verticals (bottom-up). Further statistics in Table 2 show that most of the subjects examine back to the top results after examining a vertical first. This finding accords with our assumption in Section 3.3 that users resume a top-down search session after being interrupted by a vertical result.
[Figure 8. Heat maps of the subjects' eye fixation areas in the first 2 seconds on (a) a SERP with no vertical and (b) a SERP with a multimedia vertical placed at the 3rd position]
[Figure 9. First-examining distribution of SERPs with (a) text, (b) multimedia and (c) application verticals when the vertical result is placed at ranks 1, 3, 5 and 10, compared with SERPs with no verticals. Brighter color means a higher first-examining rate at a document position. Document positions 6 to 10 are not shown because almost no subjects examined those results first]
Wang et al. Incorporating vertical results into search click models. In Proceedings of SIGIR, pages 503–512,2013.
Ilya Markov [email protected] Information Retrieval 39
Distributed information retrieval Aggregated search Summary
Search behavior: examination order
[Figure 10. Typical eye-tracking cases of examining the previous documents (a) bottom-up and (b) top-down after first examining verticals. The examining sequence runs from green circles through yellow circles (dashed border) to red circles (solid border)]
Table 2 shows the proportion of each examining pattern when users first examine verticals. We can see that after examining a vertical result first, most subjects (89% of users who examine the 3rd result first and 100% of users who examine the 5th result first) scan back to the previous results. This shows that users may be attracted by the vertical's presentation and change their examination sequence; meanwhile, the results at the top of the ranking list are always valued and not omitted by users.
Table 2. Proportion of different examining behaviors after a user first examines a vertical result
Vertical rank   #Subjects¹   Next   Previous   Back to top
3               9            0.11   0.22       0.67
5               2            0.00   0.00       1.00
4.4 Eye-tracking Analysis Findings
In summary, the main influence of vertical results on users' examining behavior comes down to two aspects:
ET 1. Multimedia and application vertical results are examined more frequently than ordinary web results.
ET 2. After examining a vertical result first, most users scan back to examine the previous results before the vertical, either bottom-up or top-down.
¹ 11 out of all 22 subjects examine a vertical result first when the vertical is placed at the 3rd or 5th position. We don't consider sessions in which verticals are placed at the 1st position, because there would be no "previous" or "back to top" patterns.
5. VERTICAL-AWARE CLICK MODEL
We first state some definitions and notation that will be used in the following part. A search session within the same query is called a query session. A web search user initializes a query session s by submitting a query q to the search engine. The SERP can be represented sequentially as d_1, ..., d_M (M = 10 if we only consider the first search result page), where d_i is the document at position i from the top of the page.
Examination, click and document relevance are treated as probabilistic events. In particular, for a given query session, we use binary random variables E_i, C_i and A_i to represent the examination, click and document attractiveness events of the document at position i. The corresponding examination, click and attractiveness probabilities for position i are denoted by P(E_i = 1), P(C_i = 1) and P(A_i = 1), respectively.
5.1 Preliminaries
We first introduce two important hypotheses: the examination hypothesis and the cascade hypothesis, which are the foundations of most existing click models.
The examination hypothesis [13] can be summarized as follows:
P(C_i = 1 | E_i = 0) = 0   (1)
P(C_i = 1 | E_i = 1) = r_{d_i}   (2)
where r_{d_i} is defined as the document relevance, which is the conditional probability of a click event after examination. Given E_i, the click C_i is conditionally independent of previous examine/click events.
The cascade hypothesis in [8] assumes that users always begin the examination at the first document. The examination is strictly linear from top to bottom of the search result page, so a document is examined only if all previous documents are examined:
P(E_1 = 1) = 1   (3)
P(E_{i+1} = 1 | E_i = 0) = 0   (4)
Given E_i, the examination E_{i+1} is conditionally independent of all examine/click events above position i, but may depend on the click C_i.
The user browsing model (UBM) [3] is based on the examination hypothesis, but it doesn't follow the cascade hypothesis. Instead, it assumes that the examination probability depends on the document's own position i and the previous clicked position l_i:
P(E_i = 1 | C_1, ..., C_{i-1}) = γ_{i, l_i}   (5)
Given the click at l_i, E_i is conditionally independent of all previous examination events. If there is no click before position i, l_i is set to 0. The probability of a query session under UBM is:
P(C_1, ..., C_M) = ∏_{i=1}^{M} (γ_{i, l_i} · r_{d_i})^{c_i} · (1 − γ_{i, l_i} · r_{d_i})^{1 − c_i}   (6)
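The UBM session probability described above can be computed as follows. This is a sketch with the examination parameters γ supplied as a dict keyed by (position, last clicked position); in practice they are estimated from click logs.

```python
def ubm_session_probability(clicks, relevance, gamma):
    """UBM sketch: P(C_i = 1) = gamma[(i, l_i)] * r_i, where l_i is the
    position of the last click before i (0 if there was none).

    clicks:    list of observed 0/1 clicks per position (top to bottom);
    relevance: document relevance r per position;
    gamma:     {(position, last_click_position): examination probability}."""
    prob, last_click = 1.0, 0
    for i, (c, r) in enumerate(zip(clicks, relevance), start=1):
        p_click = gamma[(i, last_click)] * r
        prob *= p_click if c else (1.0 - p_click)
        if c:
            last_click = i  # the examination of later positions conditions on this
    return prob
```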
5.2 Modeling Biases
We can see from Sections 3 and 4 that users treat verticals differently from ordinary results. We therefore want to develop an effective click model for federated search containing both vertical and ordinary results. Note that we only consider the situation where a single vertical appears on the SERP. Therefore, if there are two or more verticals on the SERP, we only keep the first vertical and simply regard the others as ordinary results.
5.2.1 Attraction Bias
According to the conclusions of ET 1 and CT 3 described in the previous sections, certain vertical results (e.g. multimedia verticals) attract users' attention directly and cause users to examine, and thus click, them first. So we formalize this assumption as:
Assumption 1 (Attraction Bias): If there is a vertical placed on the SERP, there is a probability that users examine it first.
509
(a)
(b)
Figure 10. Typical eye-tracking cases of examining previous document (a) bottom up (b) top down after first examining
verticals. The examining sequence is from green circles, yellow circles (dashed border) to red circles (solid border)
Table 2 shows the proportion of each examining patterns when user first examines verticals. We can see that after examining vertical result first, most subjects (89% users who examine 3rd result first and 100% users who examine 5th result first) scan back to the previous results. This shows that users may be attracted by the vertical’s presentation and change their examination sequence; meanwhile results on top of the ranking list are always valued and not omitted by users.
Table 2. Proportion of different examining behaviors after user first examines a vertical result
Vertical Rank #Subject1 Next Previous Back to Top 3 9 0.11 0.22 0.67
5 2 0.00 0.00 1.00
4.4 Eye-tracking Analysis Findings In summary, we can conclude the main influence of vertical results on users’ examining behavior into two aspects: ET 1. Multimedia and application vertical results are examined
more frequently compared with ordinary web results. ET 2. After examining a vertical result first, most users will scan
back to examine the previous results before the vertical either bottom-up or top-down.
1 11 subjects out of all 22 subjects examine vertical result first
when vertical is placed at the 3rd or 5th position. We don’t consider sessions in which verticals are placed at the 1st position because there would be no “previous” or “back to top” patterns.
5. VERTICAL-AWARE CLICK MODEL We first state some definitions and notations that will be used
in the following part. A search session within the same query is called a query session. A web search user initializes a query session s by submitting a query q to the search engine. The SERP can be represented as sequentially (M = 10 if we only consider the first search result page), where is document at position from the top of the page.
Examination, click and document relevance are treated as probabilistic events. In particular, for a given query session, we use binary random variables , and to represent the examination, click and document attractiveness events of the document at position . The corresponding, examination and click probabilities for position are denoted by , and , respectively.
5.1 Preliminaries

We first introduce two important hypotheses, the examination hypothesis and the cascade hypothesis, which are the foundations of most existing click models.

The examination hypothesis [13] can be summarized as follows:

P(C_i = 1 | E_i = 0) = 0    (1)
P(C_i = 1 | E_i = 1) = r_i    (2)

where r_i is defined as the document relevance: the conditional probability of a click after examination. Given E_i, C_i is conditionally independent of previous examination/click events.

The cascade hypothesis in [8] assumes that users always begin examination at the first document. Examination is strictly linear from the top to the bottom of the search result page, so a document is examined only if all previous documents are examined:

P(E_1 = 1) = 1    (3)
P(E_{i+1} = 1 | E_i = 0) = 0    (4)

Given E_i, E_{i+1} is conditionally independent of all examination/click events above position i, but may depend on the click C_i.
The user browsing model (UBM) [3] is based on the examination hypothesis, but it does not follow the cascade hypothesis. Instead, it assumes that the examination probability depends on the position i itself and on the position l_i of the last click before i:

P(E_i = 1 | C_1 = c_1, …, C_{i−1} = c_{i−1}) = γ_{i, i − l_i}    (5)

Given the last clicked position l_i, E_i is conditionally independent of all previous examination events. If there is no click before position i, l_i is set to 0. The probability of a query session under UBM is:

P(C_1, …, C_M) = ∏_{i=1}^{M} (γ_{i, i − l_i} · r_i)^{c_i} · (1 − γ_{i, i − l_i} · r_i)^{1 − c_i}    (6)
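Under these hypotheses the session probability factorizes position by position, which a short sketch can make concrete. This is an illustrative Python sketch, not the paper's implementation; the dict-based `gamma` lookup table and the argument names are assumptions.

```python
import math

def ubm_log_likelihood(clicks, relevance, gamma):
    """Log-likelihood of one query session under UBM (Eq. 6).

    clicks    -- 0/1 click indicators c_1..c_M, top to bottom
    relevance -- r_i = P(C_i = 1 | E_i = 1) for each position
    gamma     -- gamma[(l, d)] = examination probability for last-click
                 position l (0 if no click yet) and distance d = i - l
    """
    log_p, last_click = 0.0, 0
    for i, (c, r) in enumerate(zip(clicks, relevance), start=1):
        # examination hypothesis: P(C_i = 1) = gamma * r_i
        p_click = gamma[(last_click, i - last_click)] * r
        log_p += math.log(p_click if c else 1.0 - p_click)
        if c:
            last_click = i
    return log_p
```

In practice γ and r_i would be estimated by EM over many sessions; here both are assumed given.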
5.2 Modeling Biases

We can see from Sections 3 and 4 that users treat verticals differently from ordinary results. We therefore want to develop an effective click model for federated search pages containing both vertical and ordinary results. Note that we only consider the situation where a single vertical appears on the SERP; if two or more verticals appear, we keep only the first and simply regard the others as ordinary results.

5.2.1 Attraction Bias

According to conclusions ET 1 and CT 3 from the previous sections, certain vertical results (e.g., multimedia verticals) attract users' attention directly and cause users to examine, and thus click, them first. We formalize this assumption as:

Assumption 1 (Attraction Bias): If a vertical is placed on the SERP, there is a probability that users examine it first.
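Assumption 1 can be sketched as a simple sampling rule for which position a user examines first. This is an illustrative sketch only; the function and parameter names are not the paper's notation.

```python
import random

def first_examined_position(vertical_pos, attract_prob, rng=random):
    """Sketch of Assumption 1 (attraction bias).

    With probability attract_prob the user jumps straight to the vertical
    and examines it first; otherwise examination starts at position 1.
    """
    if vertical_pos is not None and rng.random() < attract_prob:
        return vertical_pos
    return 1
```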
Wang et al. Incorporating Vertical Results into Search Click Models. In Proceedings of SIGIR, pages 503–512, 2013.
Results coherence
Top results in web search are more diversified than top results in vertical search

Images vertical has a stronger spillover effect than news, shopping, and video verticals

Top-positioned vertical causes a stronger spillover effect than right-positioned vertical
J. Arguello and R. Capra. The Effects of Aggregated Search Coherence on Search Behavior. ACM Transactions on Information Systems, 35(1), 2:1–2:30, 2016.
Outline
2 Aggregated search
Vertical representation
Vertical selection
Results presentation
Search behavior
Evaluation
Present and future
Evaluation
TREC FedWeb 2013–2014
Evaluation framework by Zhou et al. (SIGIR’12)
Util(page) = [∑_{i=1}^B Exam(block_i) · Gain(block_i)] / [∑_{i=1}^B Exam(block_i) · Effort(block_i)]
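The utility formula above is a ratio of expected gain to expected effort over the page's result blocks, which is straightforward to compute once each block's quantities are known. A minimal sketch, assuming blocks are represented as (examination probability, gain, effort) triples (the representation is an assumption, not the framework's API):

```python
def page_utility(blocks):
    """Whole-page utility: expected gain divided by expected effort.

    blocks -- list of (exam_prob, gain, effort) triples, one per block,
              matching the Exam/Gain/Effort terms of the formula above.
    """
    expected_gain = sum(e * g for e, g, _ in blocks)
    expected_effort = sum(e * f for e, _, f in blocks)
    return expected_gain / expected_effort
```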
Outline
2 Aggregated search
Vertical representation
Vertical selection
Results presentation
Search behavior
Evaluation
Present and future
Direct answers and cards
Picture taken from https://www.rezstream.com/blog/predictive-search-through-google-now
Direct answers and cards
Proactive vs. reactive
Observations
Top-heavy click distribution
Usage is affected by temporal and local dynamics
Re-ranking proactive cards
Observations → features
Clicks and viewport duration → relevance
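Turning clicks and viewport duration into relevance signals can be sketched as a simple labeling rule. This is a hypothetical rule in the spirit of the approach, not the paper's actual labeling scheme; the grades and the dwell threshold are illustrative assumptions.

```python
def card_pseudo_label(clicked, viewport_seconds, min_dwell=5.0):
    """Infer a graded pseudo-relevance label for a proactive card.

    A click is strong positive evidence; a long viewport dwell without
    a click is weak positive; a short dwell counts as a skip.
    """
    if clicked:
        return 2  # clicked: strongly relevant
    if viewport_seconds >= min_dwell:
        return 1  # dwelled on without a click: weakly relevant
    return 0      # skipped
```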
From Queries to Cards: Re-ranking Proactive Card Recommendations Based on Reactive Search History

Milad Shokouhi
Microsoft, Cambridge, United Kingdom

Qi Guo
Microsoft, Bellevue, WA
ABSTRACT
The growing accessibility of mobile devices has substantially reformed the way users access information. While reactive search by query remains as common as before, recent years have witnessed the emergence of various proactive systems such as Google Now and Microsoft Cortana. In these systems, relevant content is presented to users based on their context, without a query. Interestingly, despite the increasing popularity of such services, very little is known about how users interact with them.

In this paper, we present the first study on user interactions with information cards. We demonstrate that the usage patterns of these cards vary depending on time and location. We also show that while overall different topics are clicked by users on proactive and reactive platforms, the topics of the documents clicked by the same user tend to be consistent across platforms. Furthermore, we propose a supervised framework for re-ranking proactive cards based on the user's context and past history. To train our models, we use viewport duration and clicks to infer pseudo-relevance labels for the cards. Our results suggest that the quality of card ranking can be significantly improved, particularly when the user's reactive search history is matched against the proactive data about the cards.

Categories and Subject Descriptors
Information Systems [Information Retrieval]: Users and interactive retrieval—Personalization

Keywords
Proactive ranking; zero query; information cards
1. INTRODUCTION
Mobile devices account for a significant fraction of online search and browsing traffic. The number of mobile queries has grown fivefold in the past three years [6]. In fact, the number of mobile queries has recently exceeded the number of those submitted from desktop devices in the United States and many other countries [1].
Figure 1: Examples of interfaces with proactive "cards" presented respectively in Apple Siri, Google Now and Microsoft Cortana (top). Examples of reactive search queries submitted to Google on mobile and Bing on desktop (bottom) for the query paris shooting.
Consequently, information retrieval systems are constantly evolving into more contextual and mobile-friendly services.

Among recent enhancements, the introduction of zero-query systems [8] is arguably one of the most significant breakthroughs. In 2012, at a workshop, a group of 45 leading experts in the field acknowledged zero-query search to be one of the six most interesting research directions in information retrieval (IR).

"Future information retrieval systems must anticipate user needs and respond with information appropriate to the current context without the user having to enter a query." [8]

The major players in the search industry have embraced the new trends by releasing systems such as Google Now [4], Apple Siri [5] and Microsoft Cortana [2]. In all these systems, in addition to the typical reactive search by a query, proactive information cards are presented to users based on their context. Proactive systems are …
M. Shokouhi and Q. Guo. From Queries to Cards: Re-ranking Proactive Card Recommendations Based on Reactive Search History. In Proceedings of SIGIR, pages 695–704, 2015.
Whole-page relevance
Whole-page relevance
1 Page content x
2 Page presentation p
3 SERP (x,p)
4 User response y
5 User satisfaction s = g(y)
Find the presentation p for a given page content x, such that when the SERP (x, p) is presented to the user, her satisfaction score s is maximized.
Wang et al. Beyond Ranking: Optimizing Whole-Page Presentation. In Proceedings of WSDM, pages 103–112, 2016.
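The optimization statement above can be sketched as a search over candidate presentations scored by a satisfaction predictor. This is an illustrative simplification, not the paper's method: `predict_satisfaction` stands in for a learned model of s = g(y), and exhaustively scoring a small candidate set replaces whatever optimization the system actually uses.

```python
def best_presentation(x, candidates, predict_satisfaction):
    """Choose the presentation p maximizing predicted satisfaction for content x.

    candidates           -- iterable of candidate presentations p
    predict_satisfaction -- callable (x, p) -> predicted satisfaction score
    """
    return max(candidates, key=lambda p: predict_satisfaction(x, p))
```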
Whole-page relevance
1 Page content x
query features
search result features
corpus-level and global result features

2 Page presentation p
binary indicator: "shown"/"not shown"
categorical features (e.g., multimedia type)
numerical features (e.g., brightness and contrast)

3 SERP (x, p)

4 User response y
clicks
collected through a randomization experiment

5 User satisfaction s = g(y)
click-skip metric: g(y) = ∑_{i=1}^k y_i, where y_i ∈ {−1, 1}
Wang et al. Beyond Ranking: Optimizing Whole-Page Presentation. In Proceedings of WSDM, pages 103–112, 2016.
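The click-skip metric above reduces to a signed count over the top k results. A minimal sketch, assuming the response is given as booleans (clicked vs. skipped):

```python
def click_skip(clicked_flags):
    """Click-skip satisfaction metric: y_i = +1 for a click, -1 for a skip."""
    return sum(1 if c else -1 for c in clicked_flags)
```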
Constructing Wikipedia-like pages from search results
Aggregated search summary
1 Vertical representation using sampling
2 Vertical selection as classification
3 Results presentation as learning to rank
4 Biases in search behavior
5 Evaluation

6 Present and future
Direct answers and cards
Whole-page relevance
Constructing Wikipedia-like pages from search results
Outline
1 Distributed information retrieval
2 Aggregated search
3 Summary
Federated search summary
Distributed information retrieval
Resource description
Resource selection
Score normalization and results merging
Aggregated search
Vertical representation
Vertical selection
Results presentation
Materials
Milad Shokouhi and Luo Si. Federated Search. Foundations and Trends in Information Retrieval, 2011.

Kopliku et al. Aggregated Search: A New Information Retrieval Paradigm. ACM Computing Surveys, 2014.
Advanced topics in IR
Data Acquisition
Data Processing
Data Storage
EvaluationRankingQuery Processing
Aggregated Search
Click Models
Present and Future of IR
Offline
Online
Advanced