Hochschule München, University of Applied Sciences
Faculty of Geoinformation
Bachelor's thesis submitted for the academic degree of Bachelor of Engineering (B.Eng.)
in the degree programme Geoinformatics and Satellite Positioning
Full-Text Search with Elasticsearch in the Geodata Domain
(Volltextsuche mit Elasticsearch im Geodaten-Umfeld)
submitted by
Mosaab Asli, student ID number: 61825514
submitted on 9 April 2018
Supervisors: Prof. Dr. Gerhard Joos
Dipl.-Ing. Florian Mückl
Table of Contents
List of Listings
List of Figures
List of Tables
List of Abbreviations
Preface
Abstract
1 Introduction
  1.1 Problem Statement
  1.2 Objectives
  1.3 Structure of the Thesis
2 Evolution of Database Systems
3 Elasticsearch
  3.1 Key Features of Elasticsearch
  3.2 Basic Concept
    3.2.1 Logical Layout
    3.2.2 Physical Layout
  3.3 Index Structure
    3.3.1 Index Settings
    3.3.2 Mapping Type
    3.3.3 Aliases
  3.4 Working Mechanism
  3.5 Search
    3.5.1 Match_All Query
    3.5.2 Full-Text Search
    3.5.3 Term-Level Query
  3.6 Highlighting
  3.7 Aggregation
  3.8 Communicating with Elasticsearch
    3.8.1 HTTP Communication
    3.8.2 Native Communication
    3.8.3 cURL
    3.8.4 Kibana Dev-Tools
4 Elasticsearch in Comparison
  4.1 Comparison with Selected DBMS
  4.2 Performance Comparison with PostgreSQL
5 Programming and Implementation
  5.1 Separate Data Storage
  5.2 Docker
  5.3 Approach
    5.3.1 Creating and Adapting the View
    5.3.2 From PostgreSQL to Elasticsearch
    5.3.3 Generalizing the Java Application
    5.3.4 Defining Mapping and Settings
    5.3.5 Data Import
    5.3.6 Bash Scripts
6 REST Facade
  6.1 Preparing the Search Query
  6.2 ElasticAdress-API
  6.3 REST Endpoint
  6.4 Further Aspects of Software Development
    6.4.1 Logging
    6.4.2 Unit Tests
    6.4.3 Upgrading to a Newer Version
7 Conclusion and Outlook
Bibliography
Appendices
List of Listings
Listing 1: Elasticsearch Index Settings
Listing 2: Tokenizer
Listing 3: HTML-Strip-Char-Filter
Listing 4: Index Aliases
Listing 5: Configuring Shards and Replicas
Listing 6: Match_All Query
Listing 7: Term-Based Query