Modeling Names

This paper argues that the hierarchy between topic name items and variant items in the TMDM resembles the hierarchy between names and particular renderings of names in the real world. For this resemblance to be a better match, topic name items should lose the requirement to always have a value property.

TMRA 2009: Modeling Names, 2009-11-13

Xuân Baldauf <xuan--names--2009--tmra.de@academia.baldauf.org>

Variants

TopicName:

type: Topic

value: String

scope: Set<Topic>

variants: Set<Variant>

Variant:

value: String

datatype: IRI

scope: Set<Topic>
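
The two item types above can be written down as a minimal Python sketch. This is an illustration only, not a TMDM implementation: topics are reduced to plain strings, and the example name and the `XSD_STRING` constant are assumptions chosen for the demonstration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set

# Assumption: topics are represented by plain strings in this sketch.
Topic = str

# IRI of the XML Schema string datatype, used as the variant datatype here.
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Variant:
    value: str
    datatype: str             # an IRI
    scope: FrozenSet[Topic]   # context in which this variant applies


@dataclass
class TopicName:
    type: Topic
    value: str                                  # the base (default) value
    scope: FrozenSet[Topic] = frozenset()
    variants: Set[Variant] = field(default_factory=set)


# Illustrative data: one name with a single Turkish variant.
istanbul = TopicName(
    type="name",
    value="Istanbul",
    variants={Variant("İstanbul", XSD_STRING, frozenset({"Turkish"}))},
)
```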

kill Variants?

TopicName:

type: Topic

value: String

scope: Set<Topic>

variants: Set<Variant>

Variant:

value: String

datatype: IRI

scope: Set<Topic>

☠

What are variants?

„A variant name is an alternative form of a topic name that may be more suitable in a certain context than the corresponding base name.“

Well, then we can actually drop variants and replace them with topic names.

What are variants?

„A variant name is an alternative form of a topic name that may be more suitable in a certain context than the corresponding base name.“ [TMDM]

When dropping variants, we lose the correspondence. Oh, there is a correspondence.

If there is correspondence, then:

Each variant's value overrides its topic name's value (in a certain context).
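
The override rule can be sketched as a small function. The matching rule used here (a variant applies when every topic in its scope is present in the context) is an assumption made for this sketch; the TMDM itself does not prescribe a selection algorithm, and `displayed_value` is a hypothetical helper name.

```python
def displayed_value(base_value, variants, context):
    """Return the first variant value whose scope fits the context.

    variants: list of (scope, value) pairs, scope being a frozenset of topics.
    context:  set of topics describing the current context.
    Falls back to the topic name's base value when no variant applies.
    """
    for scope, value in variants:
        if scope <= context:   # assumption: all scoping topics must be in the context
            return value
    return base_value


variants = [(frozenset({"Turkish"}), "İstanbul")]

in_turkish = displayed_value("Istanbul", variants, {"Turkish"})   # variant overrides
in_english = displayed_value("Istanbul", variants, {"English"})   # base value is kept
```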

Default values of topic names

Consider a topic name which contains some variants.

As an author:

How to determine the default value? Choose any of the variants' values? Which?

Throw the dice?

Ask a sun^W^Wan oracle?

Take the variant you are most familiar with?

Your default value is most likely culture-dependent.

No default values of topic names!

Cultural bias in default values of topic names?

Cultural bias in Topic Maps.

Should be avoided.

Proposed solution:

Drop default values of topic names.

What makes names different?

Consider the city at 41°N 29°E

Different or not different?

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

Are these names different?

Maybe...

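(Figure: „Konstantinopolis“ and „Constantinople“ aligned letter by letter; they differ only in the initial K/C and the ending -olis/-le.)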

Different or not different?

„Konstantiniyye“

„قسطنطينيه“

Are these names different?

Looks like they are different.

Different or not different?

„Konstantiniyye“ (Ottoman Turkish)

„قسطنطينيه“ (Ottoman Turkish)

Are these names different?

Well...

Different or not different?

„Konstantiniyye“ (Ottoman Turkish)

„قسطنطينيه“ (Ottoman Turkish)

Both names encode the same sound.

Are these names different?

Uh!

More candidates

„Istanbul“ (English)

„İstanbul“ (Turkish)

Are these names different?

More candidates

„Istanbul“ (English)

„İstanbul“ (Turkish)

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

„Konstantiniyye“ (Ottoman Turkish, Latin script)

„قسطنطينيه“ (Ottoman Turkish, Arabic script)

Are these names different?

Groups of names

„Istanbul“ (English)

„İstanbul“ (Turkish)

„Constantinople“ (English)

„Konstantinopolis“ (Turkish)

„Konstantiniyye“ (Ottoman Turkish, Latin script)

„قسطنطينيه“ (Ottoman Turkish, Arabic script)

Apparently, there is some „natural grouping“.

Grouping properties

Names within each group are „somehow“ similar.

For each scope, there is only one name per group.
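
The second property can be checked mechanically. The sketch below writes the two groups of names for the city as plain data and reports any scope that occurs more than once within a group. The group labels and the helper `scopes_used_twice` are hypothetical names introduced for this illustration.

```python
from collections import Counter

# Each group maps to a list of (member, scope of member) pairs.
groups = {
    "Name #1": [
        ("Istanbul", "English"),
        ("İstanbul", "Turkish"),
    ],
    "Name #2": [
        ("Constantinople", "English"),
        ("Konstantinopolis", "Turkish"),
        ("Konstantiniyye", "Ottoman Turkish (Latin-based script)"),
        ("قسطنطينيه", "Ottoman Turkish (Arabic-based script)"),
    ],
}


def scopes_used_twice(members):
    """Return the scopes that carry more than one name within a single group."""
    counts = Counter(scope for _, scope in members)
    return [scope for scope, n in counts.items() if n > 1]


# An empty list for a group means the property holds for that group.
violations = {group: scopes_used_twice(members) for group, members in groups.items()}
```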

Group<->members

Name<->variants

It looks like there is a structural match between the observed patterns and the TMDM.

Let's check the match

group      member              scope of member
(Name #1)  “Istanbul”          English
           “İstanbul”          Turkish
(Name #2)  “Constantinople”    English
           “Konstantinopolis”  Turkish
           “Konstantiniyye”    Ottoman Turkish (Latin-based script)
           “قسطنطينيه”          Ottoman Turkish (Arabic-based script)

Blueprint for topic names and variants

Except:

There is no value for the topic name.

There are only values for the variants.

Thus:

Abandon default values for topic names!

How to fix (the TMDM)?

Well, we know:

TMDM is not going to change any time soon.

But maybe later.

3 possible solutions:

Making the topic name's value property optional.

Removing the topic name's value property.

Removing the topic name.

Making the topic name item's value property optional

Plain implementation of the requirement.

Softly requires apps to employ value selection algorithms.

Allows for bad Topic Maps design (e.g. choosing default value anyway).

Perfectly compatible with existing Topic Maps.

May be too weak to actually drive change.
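
A minimal sketch of this first option, assuming a simplified name item whose variants are kept as a mapping from scope to value. The class layout and the `select_value` helper are assumptions for illustration, not part of the TMDM.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class TopicName:
    type: str
    value: Optional[str] = None   # optional: existing maps keep it, new ones may omit it
    variants: Dict[FrozenSet[str], str] = field(default_factory=dict)  # scope -> value


def select_value(name, context):
    """Pick a variant whose scope fits the context; otherwise the default, if any."""
    for scope, value in name.variants.items():
        if scope <= context:
            return value
    return name.value             # None when the author chose no default


# An existing topic map stays valid: the default value is simply still there.
legacy = TopicName("name", value="Istanbul")

# A name without a default value: the application has to select a variant.
unbiased = TopicName("name", variants={frozenset({"Turkish"}): "İstanbul"})
```

Because `value` may still be set, nothing stops an author from choosing a default anyway, which is the weakness noted above.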

Removing the topic name item's value property

TopicName:

type: Topic

value: String

scope: Set<Topic>

variants: Set<Variant>

Variant:

value: String

datatype: IRI

scope: Set<Topic>

Removing the topic name item's value property

For each old topic name, create an additional new variant.

Need to remove scope-restriction on variants as well.

Now apps are forced to employ value selection algorithms.
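
The migration step can be sketched as follows, with name items reduced to plain dictionaries (an assumption of this sketch, as is the helper name `migrate`). The promoted variant inherits the scope of the name itself. The current TMDM requires a variant's scope to be a true superset of its name's scope, which is why that restriction has to be removed as well.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def migrate(old_name):
    """Turn an old-style name (with 'value') into a value-less name.

    The old default value is kept as an additional variant.
    """
    promoted = {
        "value": old_name["value"],
        "datatype": XSD_STRING,
        "scope": frozenset(old_name["scope"]),   # same scope as the name itself
    }
    return {
        "type": old_name["type"],
        "scope": frozenset(old_name["scope"]),
        "variants": [promoted] + list(old_name["variants"]),
    }


old = {
    "type": "name",
    "value": "Istanbul",
    "scope": frozenset(),
    "variants": [
        {"value": "İstanbul", "datatype": XSD_STRING, "scope": frozenset({"Turkish"})},
    ],
}
new = migrate(old)
```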

Removing the topic name item

What?!?

Removing the topic name item

TopicName:

type: Topic

value: String

scope: Set<Topic>

variants: Set<Variant>

Variant:

value: String

datatype: IRI

scope: Set<Topic>

Replacing the topic name item

TopicName:

type: Topic

value: String

scope: Set<Topic>

variants: Set<Variant>

Variant:

value: String

datatype: IRI

scope: Set<Topic>

NameRendering:

type: Topic

value: String

datatype: IRI

scope: Set<Topic>

Replacing the topic name item

NameRendering is binary-compatible with Occurrence

Looks like Characteristic

More opportunity to simplify the TMDM

Still compatible with the current TMDM

Model grouping of names using the TMDM, not within the TMDM

using a „name rendering group“

Disadvantage: complex query if only one rendering per group should be retrieved.
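
The extra query work can be illustrated with a short sketch. Here the „name rendering group“ is reduced to a plain `group` field, which is a stand-in assumed for this example; in a real topic map the grouping would be expressed with ordinary TMDM constructs. `one_per_group` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import FrozenSet

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class NameRendering:
    type: str
    value: str
    datatype: str
    scope: FrozenSet[str]
    group: str   # assumption: stand-in for membership in a name rendering group


def one_per_group(renderings, context):
    """Return one rendering value per group: the first whose scope fits the context."""
    chosen = {}
    for r in renderings:
        if r.group not in chosen and r.scope <= context:
            chosen[r.group] = r.value
    return chosen


renderings = [
    NameRendering("name", "Istanbul", XSD_STRING, frozenset({"English"}), "istanbul"),
    NameRendering("name", "İstanbul", XSD_STRING, frozenset({"Turkish"}), "istanbul"),
    NameRendering("name", "Constantinople", XSD_STRING, frozenset({"English"}), "constantinople"),
    NameRendering("name", "Konstantinopolis", XSD_STRING, frozenset({"Turkish"}), "constantinople"),
]
turkish = one_per_group(renderings, {"Turkish"})
```

With names and variants, the grouping comes for free from the containment of variants in a name; with flat renderings, every consumer has to perform this grouping step itself.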

Outlook

How, actually, should a value selection algorithm work?

User-culture-dependent, not author-culture-dependent

How to model names for analysis?

Patterns for speech recognition

What about sortnames?
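
As one possible starting point for the first question, a user-dependent selection could take an ordered list of the user's preferred scopes. This is only a sketch of one option, not a proposed answer: the preference list, the flat scope labels and the helper `select_for_user` are all assumptions.

```python
def select_for_user(renderings, user_preferences):
    """Return the rendering for the user's most preferred available scope.

    renderings:       dict mapping a scope label to a rendering value
    user_preferences: scope labels ordered from most to least preferred
    Returns None when none of the user's preferred scopes is available.
    """
    for scope in user_preferences:
        if scope in renderings:
            return renderings[scope]
    return None


renderings = {"English": "Istanbul", "Turkish": "İstanbul"}

for_turkish_user = select_for_user(renderings, ["Turkish", "English"])
for_german_user = select_for_user(renderings, ["German", "English"])
```

The choice is driven entirely by the reader's preference list; the author supplies no default.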

Finish

спасибо (Russian)

დიდი მადლობა (Georgian)

شكرا (Arabic)

謝謝 (Mandarin)

ありがとう (Japanese)

רב תודות (Hebrew)

ᖁᔭᓇᐃᓐᓂ (Inuktitut)

ki'esai (Lojban)