Event reports Tomáš Kliegr • EDBT 2008 • KDD 2008 • ECML/PKDD 2008 • SSMS 2008


Event reports
Tomáš Kliegr

• EDBT 2008

• KDD 2008

• ECML/PKDD 2008

• SSMS 2008


EDBT 2008

International Conference on Extending Database Technology

Nantes, France


Contribution: PhD Workshop

• Higher acceptance rate than in previous years (about 46%)

• Dimensionality Reduction of Semantically Enriched Clickstreams

• Excellent feedback: 5 extensive reviews

• Very broad range of topics

• Post-proceedings: an extended version of the conference paper was published in the ACM DL


Selected workshop papers

• Improving the Accuracy of Entity Identification through Refinement
– The goal of entity identification is to correctly identify all the instances of the same entity so as to eliminate the inconsistency of data sources during data integration.

• Full-text indexing and Information Retrieval in P2P Systems
– Distributed IR

• Reasoning about Taxonomies and Articulations
– This work formalizes taxonomies and relationships between them as formulas in logic. This formalization concretizes notions such as consistency and inconsistency of taxonomies and articulations (inter-taxonomic relations) between them, enables the derivation of new articulations based on a given set of taxonomies and articulations, and provides a framework for testing assumptions about under-specified taxonomies.


Other Czech participants

• A Cost-based Join Selection for XML Twig Content-based Queries

• Radim Baca, Michal Kratky


Conference focus

• P2P

• XML

• Streaming

• Caching

• Query Processing

• Data Fusion


Industrial section

• Data Challenges at Yahoo!
– Ricardo Baeza-Yates and Raghu Ramakrishnan

• Automatic Content Targeting on Mobile Phones


KDD 2008

14th ACM SIGKDD International Conference, Las Vegas


Contribution: MDM/KDD Workshop

• Combining Image Captions and Visual Analysis for Image Concept Classification
– Kliegr, Svátek, Nemrava, Chandramouli, Izquierdo

• As a point of interest, Pavel Praks published at the same workshop in the past:
– Multimedia Data Mining Workshop (Pavel Praks ’05): Iris Recognition Using the SVD-Free Latent Semantic Indexing


Interesting workshop papers

• Annotating images and image objects using a hierarchical Dirichlet process model
– We apply this model for predicting labels of objects in images containing multiple objects. During training, the model has access to an un-segmented image and its caption, but not the labels for each object in the image. The trained model is used to predict the label for each region of interest in a segmented image.

• Mining the Web for Visual Concepts
– Relevance feedback on image + text data retrieved from the web


Interesting conference papers

• Building Semantic Kernels for Text Classification using Wikipedia
– In this paper, we overcome the shortcomings of the bag-of-words (BOW) approach by embedding background knowledge derived from Wikipedia into a semantic kernel, which is then used to enrich the representation of documents.
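To illustrate the general idea of a semantic kernel, here is a minimal sketch in Python. It is not the construction from the paper: the vocabulary, the concepts, the values in `PROXIMITY` and all function names are hypothetical. It only shows the generic form in which document vectors are projected through a term-to-concept matrix before the dot product, so that documents sharing no words can still be similar.

```python
from typing import List

# Hypothetical toy vocabulary and concepts; the values below are made up
# for illustration and are not taken from the paper.
VOCAB = ["car", "automobile", "banana"]
CONCEPTS = ["Vehicle", "Fruit"]

# PROXIMITY[i][j] says how strongly term i is associated with concept j.
PROXIMITY = [
    [1.0, 0.0],  # car
    [1.0, 0.0],  # automobile
    [0.0, 1.0],  # banana
]


def bow_vector(text: str) -> List[float]:
    """Count how often each vocabulary term occurs in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in VOCAB]


def to_concepts(bow: List[float]) -> List[float]:
    """Project a BOW vector into concept space (bow times PROXIMITY)."""
    return [
        sum(bow[i] * PROXIMITY[i][j] for i in range(len(VOCAB)))
        for j in range(len(CONCEPTS))
    ]


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def bow_kernel(a: str, b: str) -> float:
    """Linear kernel on raw term counts: only exact word overlap counts."""
    return dot(bow_vector(a), bow_vector(b))


def semantic_kernel(a: str, b: str) -> float:
    """Linear kernel after projecting both documents into concept space."""
    return dot(to_concepts(bow_vector(a)), to_concepts(bow_vector(b)))
```

With this toy data, "car" and "automobile" have zero similarity under `bow_kernel` but a positive similarity under `semantic_kernel`, because both terms map to the same concept.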


• Entity Categorization Over Large Document Collections
– In this paper, we significantly improve the accuracy of entity categorization by (i) considering an entity’s context across multiple documents containing it, and (ii) exploiting existing large lists of related entities (e.g., lists of actors, directors, books).


ArnetMiner: Extraction and Mining of Academic Social Networks

• The system focuses on: 1) extracting researcher profiles automatically from the Web; 2) integrating the publication data into the network from existing digital libraries; 3) modeling the entire academic network; and 4) providing search services for the academic network.

• So far, 448,470 researcher profiles have been extracted using a unified tagging approach. We integrate publications from online Web databases and propose a probabilistic framework to deal with the name ambiguity problem.

• Furthermore, we propose a unified modeling approach to simultaneously model topical aspects of papers, authors, and publication venues. Search services such as expertise search and people association search have been provided based on the modeling results.


Heterogeneous Data Fusion for Alzheimer’s Disease Study

• In this paper, we propose to integrate heterogeneous data for Alzheimer’s disease (AD) prediction based on a kernel method. We further extend the kernel framework for selecting features (biomarkers) from heterogeneous data sources.

• Experimental results show that the integration of multiple data sources leads to a considerable improvement in the prediction accuracy.


Febrl – An Open Source Data Cleaning, Deduplication and Record Linkage System with a Graphical User Interface

• Febrl (Freely Extensible Biomedical Record Linkage)

• It contains many recently developed techniques for data cleaning, deduplication and record linkage, and encapsulates them into a graphical user interface (GUI).

• https://sourceforge.net/projects/febrl/


Using tagFlake for Condensing Navigable Tag Hierarchies from Tag Clouds

• Luigi Di Caro (University of Torino)

• K. Selçuk Candan (Arizona State University)

• Maria Luisa Sapino (University of Torino)


Pictor: An Interactive System for Importing Data from a Website

• A demonstration of an interactive wrapper induction system, called Pictor, which is able to minimize labeling cost, yet extract data with high accuracy from a website.

• The demonstration introduces two proposed technologies: record-level wrappers and a wrapper-assisted labeling strategy. These approaches allow Pictor to exploit previously generated wrappers in order to predict similar labels in a partially labeled webpage or a completely new webpage.


Trends

• Text mining

• Advertising on the web
– The 2nd International Workshop on Data Mining and Audience Intelligence for Advertising (ADKDD 2008)

• Medical data mining
– Workshop on Mining Medical Data and KDD Cup 2008


Further highlights

• The 2nd SNA-KDD Workshop on Social Network Mining and Analysis (SNA-KDD 2008)

• Workshop on Mining Medical Data and KDD Cup 2008

• The 2nd International Workshop on Mining Multiple Information Sources


ECML/PKDD 2008

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases

Antwerp, Belgium


Selected invited talks

• The Role of Hierarchies in Exploratory Data Mining
– In a broad range of data mining tasks, the fundamental challenge is to efficiently explore a very large space of alternatives. The difficulty is two-fold: first, the size of the space raises computational challenges, and second, it can introduce data sparsity issues even in the presence of very large datasets. In this talk, we’ll consider how the use of hierarchies (e.g., taxonomies, or the OLAP multidimensional model) can help mitigate the problem.
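One simple way a hierarchy can reduce sparsity is roll-up: counts that are too thin at the leaf level are aggregated to the parent level. The sketch below is only an illustration of that idea, not a method from the talk; the taxonomy in `PARENT`, the item names and the function `roll_up` are hypothetical.

```python
from collections import Counter
from typing import Dict

# Hypothetical taxonomy: each item points to its parent category.
PARENT: Dict[str, str] = {
    "espresso": "coffee",
    "latte": "coffee",
    "green tea": "tea",
    "coffee": "beverage",
    "tea": "beverage",
}


def roll_up(counts: Dict[str, int], parent: Dict[str, str]) -> Dict[str, int]:
    """Aggregate counts one level up; items without a parent stay as they are."""
    rolled: Counter = Counter()
    for item, n in counts.items():
        rolled[parent.get(item, item)] += n
    return dict(rolled)


# Sparse leaf-level observations become denser after each roll-up.
leaf_counts = {"espresso": 1, "latte": 2, "green tea": 1}
category_counts = roll_up(leaf_counts, PARENT)    # coffee and tea totals
top_counts = roll_up(category_counts, PARENT)     # single beverage total
```

Each roll-up trades detail for support: fewer cells, each backed by more observations.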


Learning Language from Its Perceptual Context

• Raymond J. Mooney

• The training data consists of textual human commentaries on Robocup simulation games. A set of possible alternative meanings for each comment is automatically constructed from game event traces. Our previously developed systems for learning to parse and generate natural language (KRISP and WASP) were augmented to learn from this data and then commentate novel games. The system is evaluated based on its ability to parse sentences into correct meanings and generate accurate descriptions of game events.


Watch, Listen & Learn: Co-training on Captioned Images and Videos

• We leverage the text that often accompanies visual data to learn robust models of scenes and actions from partially labeled collections. Our approach uses co-training.


Co-training

• A semi-supervised learning algorithm that requires two distinct “views” of the training data

• It first learns a separate classifier for each view using any labeled examples

• The most confident predictions of each classifier on the unlabeled data are then used to iteratively construct additional labeled training data.
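The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the paper: each view is a single number, the per-view classifier is a 1-D nearest-centroid model, confidence is the distance margin between the two closest centroids, and the names `fit_centroids`, `predict` and `co_train` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Example = Tuple[float, float]  # (feature in view A, feature in view B)


def fit_centroids(values: Sequence[float], labels: Sequence[int]) -> Dict[int, float]:
    """Train a 1-D nearest-centroid model: the mean feature value per class."""
    centroids: Dict[int, float] = {}
    for cls in set(labels):
        members = [v for v, y in zip(values, labels) if y == cls]
        centroids[cls] = sum(members) / len(members)
    return centroids


def predict(centroids: Dict[int, float], value: float) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the margin to the runner-up class."""
    ranked = sorted(centroids, key=lambda c: abs(value - centroids[c]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    margin = abs(value - centroids[ranked[1]]) - abs(value - centroids[best])
    return best, margin


def co_train(
    labeled: Sequence[Example],
    labels: Sequence[int],
    unlabeled: Sequence[Example],
    rounds: int = 5,
    per_round: int = 1,
) -> Tuple[List[Dict[int, float]], List[Example], List[int]]:
    """Alternate between the two views, moving confident predictions into the labeled set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            # Train this view's classifier on everything labeled so far.
            model = fit_centroids([x[view] for x in labeled], labels)
            # Rank unlabeled examples by this view's confidence, highest first.
            ranked = sorted(pool, key=lambda x: predict(model, x[view])[1], reverse=True)
            for x in ranked[:per_round]:
                labeled.append(x)
                labels.append(predict(model, x[view])[0])
                pool.remove(x)
    # Final classifiers, one per view, trained on the enlarged labeled set.
    models = [fit_centroids([x[v] for x in labeled], labels) for v in (0, 1)]
    return models, labeled, labels


# Toy data: two labeled seeds and four unlabeled examples.
seeds = [(0.0, 0.0), (10.0, 10.0)]
seed_labels = [0, 1]
pool = [(1.0, 4.0), (9.0, 6.0), (4.0, 1.0), (6.0, 9.0)]
models, all_examples, all_labels = co_train(seeds, seed_labels, pool)
```

In the toy data, some examples are easy in view A and others in view B, so each classifier ends up supplying labels that the other view would have been unsure about.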


SSMS 2008

3rd Summer School on Multimedia Semantics, Chania, Crete


Overview presentations

• http://www.mesh-ip.eu/ssms08.aspx?Page=ssms08

• The presentations can be downloaded


Wrap up

• ECML 09
– Bled, Slovenia
– 7 Sep 2009 - 11 Sep 2009

• KDD 09
– Paris, France
– Jun 28-Jul 1, 2009

• EDBT/ICDT 2009
– Saint-Petersburg, Russia
– March 23-26