TMRA 2009: Modeling Names, 2009-11-13, Xuân Baldauf <[email protected]>

Modeling Names

DESCRIPTION

This paper argues that the hierarchy between topic name items and variant items in the TMDM resembles the hierarchy between names and particular renderings of names in the real world. For this resemblance to be a better match, however, topic name items should lose the requirement to always have a value property.

Page 1: Modeling Names

Modeling Names

Page 2: Modeling Names

Variants

TopicName:
  type: Topic
  value: String
  scope: Set<Topic>
  variants: Set<Variant>

Variant:
  value: String
  datatype: IRI
  scope: Set<Topic>
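The two item types on this slide can be written down as plain data structures. The following is a minimal Python sketch, not the TMDM's normative definition: topics are represented by bare strings, the name type "name" and the scope topic "Turkish" are hypothetical placeholders, and the class and field names simply mirror the slide.

```python
from dataclasses import dataclass, field

# Datatype IRI for plain strings (XML Schema).
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Variant:
    value: str
    datatype: str                     # an IRI, kept as a plain string here
    scope: frozenset = frozenset()    # set of topics (strings in this sketch)


@dataclass
class TopicName:
    type: str                         # the name type topic (placeholder string)
    value: str                        # the base ("default") value
    scope: frozenset = frozenset()
    variants: set = field(default_factory=set)


# A topic name with a base value and one variant scoped to "Turkish".
name = TopicName(type="name", value="Istanbul")
name.variants.add(Variant("İstanbul", XSD_STRING, frozenset({"Turkish"})))
```

Note that the base value "Istanbul" has to be chosen by the author, which is exactly the choice the following slides question.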

Page 3: Modeling Names

kill Variants?

TopicName:
  type: Topic
  value: String
  scope: Set<Topic>
  variants: Set<Variant>

Variant:
  value: String
  datatype: IRI
  scope: Set<Topic>

☠

Page 4: Modeling Names

What are variants?

„A variant name is an alternative form of a topic name that may be more suitable in a certain context than the corresponding base name.“

Well, then we can actually drop variants and replace them with topic names.

Page 5: Modeling Names

What are variants?

„A variant name is an alternative form of a topic name that may be more suitable in a certain context than the corresponding base name.“ [TMDM]

When dropping variants, we lose the correspondence. Oh, there is correspondence.

If there is correspondence, then:

Each variant's value overrides its topic name's value (in a certain context).
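The override rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an algorithm taken from the TMDM: a context is modeled as a set of topic strings, a variant is taken to apply when all topics in its scope are part of the context, and `value_in_context` is a hypothetical helper name.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    value: str
    scope: frozenset


@dataclass
class TopicName:
    value: str
    variants: list = field(default_factory=list)


def value_in_context(name, context):
    """Return the value of the first variant whose scope is covered by the
    context; fall back to the topic name's own value otherwise."""
    for variant in name.variants:
        if variant.scope <= context:   # assumption: subset test means "applies"
            return variant.value
    return name.value


name = TopicName(
    "Constantinople",
    [Variant("Konstantinopolis", frozenset({"Turkish"}))],
)
turkish = value_in_context(name, frozenset({"Turkish"}))
english = value_in_context(name, frozenset({"English"}))
```

In the Turkish context the variant wins; in any other context the author's base value is shown, which is where the question of a sensible default arises.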

Page 6: Modeling Names

Default values of topic names

Consider a topic name which contains some variants.

As an author:

How to determine the default value? Choose any of the variants' values? Which?

Throw the dice?

Ask a sun^W^Wan oracle?

Take the variant you are most familiar with?

Your default value is most likely culture-dependent.

Page 7: Modeling Names

No default values of topic names!

Cultural bias in default values of topic names?

Cultural bias in Topic Maps.

Should be avoided.

Proposed solution:

Drop default values of topic names.

Page 8: Modeling Names

What makes names different?

Consider the city at 41°N 29°E

Page 9: Modeling Names

Different or not different?

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

Are these names different?

Maybe...

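[Figure: letter-by-letter alignment of „Konstantinopolis“ and „Constantinople“, which share the letters „onstantinop“ and differ in the initial letter (K/C) and in the ending.]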

Page 10: Modeling Names

Different or not different?

„Konstantiniyye“

„قسطنطينيه“

Are these names different?

Looks like they are different.

Page 11: Modeling Names

Different or not different?

„Konstantiniyye“ (Ottoman Turkish)

„قسطنطينيه“ (Ottoman Turkish)

Are these names different?

Well...

Page 12: Modeling Names

Different or not different?

„Konstantiniyye“ (Ottoman Turkish)

„قسطنطينيه“ (Ottoman Turkish)

Both names encode the same sound.

Are these names different?

Uh!

Page 13: Modeling Names

More candidates

„Istanbul“ (English)

„İstanbul“ (Turkish)

Are these names different?

Page 14: Modeling Names

More candidates

„Istanbul“ (English)

„İstanbul“ (Turkish)

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

„Konstantiniyye“ (Ottoman Turkish, Latin script)

„قسطنطينيه“ (Ottoman Turkish, Arabic script)

Are these names different?

Page 15: Modeling Names

Groups of names

„Istanbul“ (English)

„İstanbul“ (Turkish)

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

„Konstantiniyye“ (Ottoman Turkish, Latin script)

„قسطنطينيه“ (Ottoman Turkish, Arabic script)

Apparently, there is some „natural grouping“.

Page 16: Modeling Names

Grouping properties

Names within each group are „somehow“ similar.

For each scope, there is only one name per group.

Page 17: Modeling Names

Group <-> members
Name <-> variants

It looks like there is a structural match between the observed patterns and the TMDM.

Page 18: Modeling Names

Let's check the match

group      member               scope of member
(Name #1)  “Istanbul”           English
           “İstanbul”           Turkish
(Name #2)  “Constantinople”     English
           “Konstantinopolis”   Turkish
           “Konstantiniyye”     Ottoman Turkish (Latin-based script)
           “قسطنطينيه”          Ottoman Turkish (Arabic-based script)
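The grouping property stated earlier (for each scope, there is only one name per group) can be checked mechanically against this table. The sketch below copies the table rows into Python tuples; the helper name `scope_clashes` is hypothetical, and scopes are compared as plain strings, which is a simplification of scopes being sets of topics.

```python
from collections import Counter

# (group, member, scope of member), copied from the table above.
NAMES = [
    ("Name #1", "Istanbul", "English"),
    ("Name #1", "İstanbul", "Turkish"),
    ("Name #2", "Constantinople", "English"),
    ("Name #2", "Konstantinopolis", "Turkish"),
    ("Name #2", "Konstantiniyye", "Ottoman Turkish (Latin-based script)"),
    ("Name #2", "قسطنطينيه", "Ottoman Turkish (Arabic-based script)"),
]


def scope_clashes(rows):
    """Return the (group, scope) pairs that occur more than once."""
    counts = Counter((group, scope) for group, _member, scope in rows)
    return [pair for pair, n in counts.items() if n > 1]


clashes = scope_clashes(NAMES)
```

An empty result means every group has at most one member per scope, which is the shape a topic name with its variants would need.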

Page 19: Modeling Names

Blueprint for topic names and variants

Except:

There is no value for the topic name.

There are only values for the variants.

Thus:

Abandon default values for topic names!

Page 20: Modeling Names

How to fix (the TMDM)?

Well, we know:

TMDM is not going to change any time soon.

But maybe later.

3 possible solutions:

Making the topic name's value property optional.

Removing the topic name's value property.

Removing the topic name.

Page 21: Modeling Names

Making the topic name item's value property optional

Plain implementation of the requirement.

Softly requires apps to employ value selection algorithms.

Allows for bad Topic Maps design (e.g. choosing default value anyway).

Perfectly compatible with existing Topic Maps.

May be too weak to actually drive change.

Page 22: Modeling Names

Removing the topic name item's value property

TopicName:
  type: Topic
  value: String
  scope: Set<Topic>
  variants: Set<Variant>

Variant:
  value: String
  datatype: IRI
  scope: Set<Topic>

Page 23: Modeling Names

Removing the topic name item's value property

For each old topic name, create an additional new variant.

Need to remove scope-restriction on variants as well.

Now apps are forced to employ value selection algorithms.
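The migration described on this slide can be sketched as follows. This is a hedged illustration, not a procedure defined by the TMDM: the class names `OldTopicName` and `NewTopicName` and the function `migrate` are hypothetical, and scopes are modeled as frozensets of topic strings. The scope restriction in question is presumably the TMDM constraint that a variant's scope must be a true superset of its topic name's scope; the promoted variant below carries exactly the name's scope, so it would violate that constraint unless the constraint is dropped.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    value: str
    scope: frozenset


@dataclass
class OldTopicName:
    value: str
    scope: frozenset
    variants: list = field(default_factory=list)


@dataclass
class NewTopicName:               # no value property any more
    scope: frozenset
    variants: list = field(default_factory=list)


def migrate(old):
    """Turn the old base value into one more variant that carries the
    topic name's own scope, then keep the existing variants unchanged."""
    promoted = Variant(old.value, old.scope)
    return NewTopicName(old.scope, [promoted] + list(old.variants))


old = OldTopicName(
    "Istanbul",
    frozenset(),
    [Variant("İstanbul", frozenset({"Turkish"}))],
)
new = migrate(old)
```

After migration no value is privileged structurally, so an application has to pick among `new.variants` itself.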

Page 24: Modeling Names

Removing the topic name item

What?!?

Page 25: Modeling Names

Removing the topic name item

TopicName:
  type: Topic
  value: String
  scope: Set<Topic>
  variants: Set<Variant>

Variant:
  value: String
  datatype: IRI
  scope: Set<Topic>

Page 26: Modeling Names

Replacing the topic name item

TopicName:
  type: Topic
  value: String
  scope: Set<Topic>
  variants: Set<Variant>

Variant:
  value: String
  datatype: IRI
  scope: Set<Topic>

NameRendering:
  type: Topic
  value: String
  datatype: IRI
  scope: Set<Topic>

Page 27: Modeling Names

Replacing the topic name item

NameRendering is binary compatible with Occurrence

Looks like Characteristic

More opportunity to simplify the TMDM

Still compatible with the current TMDM

Model grouping of names using TMDM, not within TMDM

using „name rendering group“

Disadvantage: complex query if only one rendering per group should be retrieved.
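To make the disadvantage concrete, here is a minimal Python sketch of retrieving one rendering per group. Several things are assumptions for illustration only: the group is stored as a plain `group` field rather than being modeled with TMDM constructs as the slide proposes, the helper `one_per_group` and the group ids are hypothetical, and the user's preference is given as an ordered list of scope topics.

```python
from dataclasses import dataclass

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class NameRendering:
    type: str
    value: str
    datatype: str
    scope: frozenset
    group: str   # hypothetical id of the "name rendering group"


def one_per_group(renderings, preferred_scopes):
    """Pick one rendering per group, preferring renderings whose scope
    contains a topic that comes early in preferred_scopes."""
    def rank(rendering):
        hits = [i for i, topic in enumerate(preferred_scopes)
                if topic in rendering.scope]
        return min(hits, default=len(preferred_scopes))

    best = {}
    for rendering in renderings:
        current = best.get(rendering.group)
        if current is None or rank(rendering) < rank(current):
            best[rendering.group] = rendering
    return best


renderings = [
    NameRendering("name", "Istanbul", XSD_STRING, frozenset({"English"}), "g1"),
    NameRendering("name", "İstanbul", XSD_STRING, frozenset({"Turkish"}), "g1"),
    NameRendering("name", "Constantinople", XSD_STRING, frozenset({"English"}), "g2"),
]
picked = one_per_group(renderings, ["Turkish", "English"])
```

Even in this simplified form the query needs a grouping step plus a ranking step, whereas a topic name with variants delivers the grouping for free.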

Page 28: Modeling Names

Outlook

How, actually, should a value selection algorithm work?

User-culture-dependent, not author-culture-dependent

How to model names for analysis?

Patterns for speech recognition

What about sortnames?

Page 29: Modeling Names

Finish

спасибо (Russian)

დიდი მადლობა (Georgian)

شكرا (Arabic)

謝謝 (Mandarin)

ありがとう (Japanese)

רב תודות (Hebrew)

ᖁᔭᓇᐃᓐᓂ (Inuktitut)

ki'esai (Lojban)