OpenSeaMap-dev:IHO Hydographic Dictionary S-32: Difference between revisions

Revision as of 18 October 2016, 17:44

Data base description...

HowTo

Store the folder "S-32" (with the two files "IHO_Hydrographic_Dictionary_S-32.xls" and "IHO Hydrographic Dictionary S-32.mdb")
wherever you like.

For editing the database:

  1. In Excel, set the macro security level to "low" (menu "Extras > Macros > Security")
  2. Start Excel with "IHO_Hydrographic_Dictionary_S-32.xls"
  3. In table "Start": click "Edit database by User form"

You see:

a "User-form" with the "IHO Definition" (English) on the left and the editable "National Language" on the right
an "Excel sheet" (behind it) with all "Terms" in English, Français and Español.

Excel sheet

In the Excel sheet you can:

  • Compare all terms in English, Français and Español
  • Edit a dataset (by double-clicking on a term)

Click exactly on a term (if you miss it, the selection jumps to the end).

User-form

In the User-form you can:

  • Set the language displayed in the right side for editing
  • Search in "Term":
    "xx" finds all Terms with "xx" somewhere in them
    "x*y" finds all Terms containing "x", followed by any text, followed by "y"
  • Scroll between datasets (by arrow-button)
  • Edit each data field
  • Check each data field and confirm the check with "Full check"
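
The two search forms above can be sketched in Python. This is only an illustration, not the macro code of the User-form: the function name term_matches is hypothetical, and it is an assumption that the search is case-insensitive and that "*" may also stand for an empty text.

```python
import re


def term_matches(pattern: str, term: str) -> bool:
    """Return True if `term` matches a User-form style search pattern."""
    # Escape every literal piece so that only "*" acts as a wildcard.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    # Each "*" stands for any run of characters (assumption: may be empty).
    regex = ".*".join(pieces)
    # Assumption: the pattern may match anywhere in the term, ignoring case.
    return re.search(regex, term, flags=re.IGNORECASE) is not None


terms = ["alignment correction by tape", "base tape", "control point"]
print([t for t in terms if term_matches("tape", t)])
print([t for t in terms if term_matches("c*point", t)])
```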

Requirements

  • Excel 2003 or later
    Menu: Extras > Macros > Security > low
  • Folder "S-32" with:
    IHO Hydrographic Dictionary S-32.xls
    IHO Hydrographic Dictionary S-32.mdb

Status

3 | Languages | eng, fra, esp
11 | Attributes | ID, old IHO-HD-No, term, description, type, sex, plural, deleted, SaveDat, LastCheck, BE/AE
7000 | Terms | max-length: 60 characters
7000 | Descriptions | max-length: 1500 characters
20,000 | Records |

ToDo

OK Task Who Until Done Remarks
OK Investigate the source documents OpenSeaMap 2015-12-
OK Transform the source into a database OpenSeaMap 2016-01-10 2015-01-20
OK Showcase Word.doc "Terms in English, Français, Español" OpenSeaMap 2016-01-30
OK Develop a tool for editing datasets OpenSeaMap 2016-01-05
Approve tool for editing datasets IHO/OpenSeaMap
Decisions about DB design IHB ASAP
Implement new design OpenSeaMap
Decisions about rules IHB ASAP
Implement new rules OpenSeaMap
Edit the "English master" IHB ASAP
Add 3 more languages OpenSeaMap 2016-11 ger, zho, lzh
Prepare for unlimited new languages OpenSeaMap Wikipedia languages
Define 50 Terms/Descriptions in 3 languages IHO 2016-10 XLS for ger, zho, lzh
..

Data improvements already done

  1. Unique ID for each Term (in addition to the old number).
  2. Split Term and Description.
  3. Extract semantic data from the text (type of word, grammatical gender ("sex"), plural, en-UK/en-US)
    (~90% of all terms, some errors included)
  4. Add metadata about the datasets (save date, last-check date, deleted)
  5. Delete non-breaking hyphens.
  6.  ? First letter of each Term as capital letter.
  7.  ? Dot at the end of each Description.

Rules

clear structure

Use only:

  • Term
  • Description
  • Attributes

All texts in UTF-8.

  • IHO Master is the English version of the database
    (pragmatically, "General English" is used; in case of doubt: British English).

Attributes

Attribute | Type | Length | Content | Description
ID | incremental | | | Unique ID for all languages
number | text | 6 | #### a | Old number for old terms (may become obsolete in future)
term | text | 255 | | Term of one or more words, without special marks, without abbreviations, optionally with one (1) colon (:)
description | memo | 65,535 | | Description of the term
sex | text | 1 | f; m; n | feminine; masculine; neuter
plural | text | 2 | pl; <empty> | term is commonly used in plural; term in singular
type | text | 5 | subst; adj; adv; vi; vt | substantive; adjective; adverb; verb intransitive; verb transitive
deleted | boolean | | yes/no | mark as deleted if no longer used
en-AB | text | | en-BE; en-AE; en-both | whether a term is only British, only American, or both
savedat | date | | | Date of the last save
lastcheck | date | | | Date of the last approved full check of a dataset
..
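
The attribute table above could be mirrored by a small record type, for example to check exported datasets outside of Excel. This is a sketch only: the class name TermRecord, the method invalid_fields and the field name en_ab (for "en-AB") are hypothetical, and it is an assumption that sex, plural, type and en-AB may be empty when nothing was extracted.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Allowed codes per attribute, taken from the table; "" means "not set" (assumption).
ALLOWED = {
    "sex": {"", "f", "m", "n"},
    "plural": {"", "pl"},
    "type": {"", "subst", "adj", "adv", "vi", "vt"},
    "en_ab": {"", "en-BE", "en-AE", "en-both"},
}
# Maximum text lengths from the "Length" column.
MAX_LENGTH = {"number": 6, "term": 255, "description": 65535}


@dataclass
class TermRecord:
    id: int
    term: str
    description: str = ""
    number: str = ""
    sex: str = ""
    plural: str = ""
    type: str = ""
    deleted: bool = False
    en_ab: str = ""
    savedat: Optional[date] = None
    lastcheck: Optional[date] = None

    def invalid_fields(self) -> List[str]:
        """Return the names of all attributes that violate the table."""
        bad = [name for name, allowed in ALLOWED.items()
               if getattr(self, name) not in allowed]
        bad += [name for name, limit in MAX_LENGTH.items()
                if len(getattr(self, name)) > limit]
        return bad


record = TermRecord(id=1756, term="fair chart", number="1756", en_ab="en-BE")
print(record.invalid_fields())
```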

IHO tasks/decisions ToDo

use only words in a term

old:

86 | alignment correction(tape)

new:

86 | alignment correction by tape
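
A check for this rule might look like the following sketch. The function name term_problems is hypothetical; it is an assumption that words may contain any letters (including accented ones) and may be joined by spaces, hyphens or apostrophes, and that at most one colon is allowed, as stated for the "term" attribute.

```python
import re

# One or more words made of letters only, joined by space, hyphen or apostrophe.
_WORDS = re.compile(r"[^\W\d_]+(?:[ '’-][^\W\d_]+)*")


def term_problems(term: str) -> list:
    """Return a list of rule violations; an empty list means the term is fine."""
    problems = []
    if term.count(":") > 1:
        problems.append("more than one colon")
    # Check the text on each side of the (optional) colon separately.
    for part in term.split(":"):
        part = part.strip()
        if not _WORDS.fullmatch(part):
            problems.append(f"not plain words: {part!r}")
    return problems


for candidate in ["alignment correction(tape)",
                  "alignment correction by tape",
                  "tape: alignment correction"]:
    print(candidate, "->", term_problems(candidate) or "ok")
```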


use speakable terms

use speakable terms and link to the ordered term (or vice versa)

1. example
86 Alignment correction by tape see "tape: alignment correction"
5292 tape: alignment correction (Description)
2. example
5 Absolute accuracy see "Accuracy: absolute"
22 Accuracy: absolute (Description)
Best solution
Sort
5 Absolute accuracy Accuracy (Description)
22 Accuracy: absolute! (Description)
86 alignment correction by tape Tape (Description)
5292 tape: alignment correction! (Description)

split Sub-Languages

(but be sure it is not only a dialect)

en into en-BE and en-AE
es into es-ESP and es-ARG

old:

1756 fair chart (Brit) (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (US) see "fair chart"

new en-BE:

1756 fair chart (Description)
1757 fair sheet see "fair chart"
4793 (nothing) see "fair chart"

new en-AE:

1756 smooth sheet (Description)
1757 (nothing) see "smooth sheet"
4793 (nothing) see "smooth sheet"

If you prefer not to split Sub-Languages:

decide which is the Master Language

Suggestion: en-BE (which may be a political issue)

old:

1756 fair chart (Brit) (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (US) see "fair chart"

new:

1756 fair chart (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (AE) see "fair chart"
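
Whichever option is chosen, the "(Brit)" and "(US)" markers have to be taken out of the term text. A minimal sketch, assuming the markers always stand at the end of the term and that a term without a marker counts as "en-both" (both are assumptions; the function name split_variant is hypothetical):

```python
import re

# Markers seen in the examples above, mapped to the values of the en-AB attribute.
_MARKERS = {"Brit": "en-BE", "US": "en-AE", "AE": "en-AE"}
_MARKER_RE = re.compile(r"\s*\((Brit|US|AE)\)\s*$")


def split_variant(term: str):
    """Return (term without marker, en-AB value)."""
    match = _MARKER_RE.search(term)
    if match is None:
        # Assumption: no marker means the term is used in both variants.
        return term.strip(), "en-both"
    return term[:match.start()].strip(), _MARKERS[match.group(1)]


for old in ["fair chart (Brit)", "fair sheet", "smooth sheet (US)"]:
    print(split_variant(old))
```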

split Multi-Terms

old:

998 continental (or island) shelf

new:

998 a continental shelf
998 b island shelf

old:

398 base tape (or wire)

new:

398 a base tape
398 b base wire


old:

3104 marine nature reserve (U.S. marine sanctuary)

new en-BE:

3104 marine nature reserve

new en-AE:

3104 marine sanctuary
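
The "(or ...)" cases above could be split automatically. The sketch below assumes that the alternative in parentheses replaces exactly the one word in front of it, which holds for the two examples but would need checking against the whole dictionary; the function name split_multi_term is hypothetical.

```python
import re
import string

# head + word + "(or alt)" + tail, e.g. "" + "continental" + "(or island)" + " shelf"
_OR_RE = re.compile(
    r"^(?P<head>.*?)(?P<word>\S+)\s*\(or (?P<alt>[^)]+)\)(?P<tail>.*)$"
)


def split_multi_term(number, term):
    """Return a list of (number, term) pairs, one per variant."""
    match = _OR_RE.match(term)
    if match is None:
        # No "(or ...)" part: keep the entry unchanged.
        return [(str(number), term)]
    head, word, alt, tail = match.group("head", "word", "alt", "tail")
    variants = [head + word + tail, head + alt + tail]
    # Suffix the old number with a, b, ... and normalise the spacing.
    return [(f"{number} {letter}", " ".join(variant.split()))
            for letter, variant in zip(string.ascii_lowercase, variants)]


print(split_multi_term(998, "continental (or island) shelf"))
print(split_multi_term(398, "base tape (or wire)"))
```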

split Multi-Descriptions

This has to be discussed... (how to find/list identical terms?)

old:

1015 control point (Description 1)... (Description 2)...

new:

1015 control point (Description 1)
#### control point (Description 2)

split Synonyms

Synonyms are possible in the form:

#### Term_1 (Description)
#### Synonym_1 of Term_1 see: Term_1

Abbreviations need their own term

old:

1276 deep scattering layer (DSL)

new:

1276 a deep scattering layer (description)
1276 b DSL see "deep scattering layer"

but how to deal with multiple meanings:

1276 c DSL see "Digital Subscriber Line"
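
Splitting off an abbreviation could be sketched as follows. It is an assumption that an abbreviation is a run of two or more capital letters in parentheses at the end of the term; variant markers such as "(US)" would match as well and have to be removed first. The function name split_abbreviation is hypothetical, and the sketch does not solve the question of multiple meanings.

```python
import re

# Full term followed by an all-capitals abbreviation in parentheses.
_ABBR_RE = re.compile(r"^(?P<full>.*\S)\s*\((?P<abbr>[A-Z]{2,})\)$")


def split_abbreviation(number, term):
    """Return a list of (number, term, remark) entries."""
    match = _ABBR_RE.match(term.strip())
    if match is None:
        return [(str(number), term, "")]
    full, abbr = match.group("full", "abbr")
    return [
        (f"{number} a", full, ""),               # keeps the description
        (f"{number} b", abbr, f'see "{full}"'),  # only refers to the full term
    ]


for entry in split_abbreviation(1276, "deep scattering layer (DSL)"):
    print(entry)
```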

use Singular

old:

1424 divider(s)

new:

1424 a divider

or split if necessary:

1424 a divider
1424 b dividers
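
For terms written with an optional "(s)", both forms can be derived mechanically. A small sketch (function name hypothetical), assuming the plural is always formed by appending "s", which is true for "divider(s)" but not for irregular plurals:

```python
def split_optional_plural(term: str):
    """Return (singular, plural); plural is None if the term has no "(s)"."""
    if term.endswith("(s)"):
        singular = term[:-3]
        # Assumption: the plural is simply the singular plus "s".
        return singular, singular + "s"
    return term, None


print(split_optional_plural("divider(s)"))
```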

Errors

English

Some small typos...

French

The system does not match the English system...

Some terms have double numbers:

1408-1411 Aberration radiale

Some terms have no number:

... Abaque d’échelle

Spanish

For terms marked with "(Esp)", the marking is split between Term and Declaration.
For terms marked with "(Arg)", the marking is split between Term and Declaration.

Use of the database

as dictionary

Read-only user-form
with "Search in Term-text"
left: Language-1, right: Language-2
Read-only Excel-sheet
left: Language-1, right: Language-2
Read-only Word-document
left: Language-1, right: Language-2
Read-only PDF
left: Language-1, right: Language-2

Append database

User-form with "Search in Term-text"
left: Language-en, right: Language-2
append a new term, marked as new suggestion to the IHB

Missed terms

ID | OK | Term:en | Term:de | Remarks
 | | nautical light | Leuchtfeuer |
3486 | | Nominal range | Tragweite | see: Range
 | | | Feuerhöhe |
 | | | Turmhöhe |
3771 | | Period (of nautical light) | Wiederkehr () |

Maintain database

Search and list all last FullCheck
Search and list all last FullCheck before <date>
Search all Deleted
Search all Combobox attributes
Do statistics
Export
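
Two of these maintenance queries, sketched on plain Python dictionaries instead of the real .mdb file. The function names are hypothetical, the dates in the sample data are invented for illustration only, and it is an assumption that datasets which were never fully checked are not listed by the first query.

```python
from datetime import date


def checked_before(records, cutoff):
    """List datasets whose last approved full check is older than `cutoff`."""
    return [r for r in records
            if r["lastcheck"] is not None and r["lastcheck"] < cutoff]


def deleted_records(records):
    """List all datasets marked as deleted."""
    return [r for r in records if r["deleted"]]


# Sample data: IDs and terms from this page, dates and flags invented.
records = [
    {"id": 1756, "term": "fair chart", "deleted": False, "lastcheck": date(2016, 1, 20)},
    {"id": 1757, "term": "fair sheet", "deleted": False, "lastcheck": None},
    {"id": 4793, "term": "smooth sheet", "deleted": True, "lastcheck": date(2016, 9, 1)},
]

print([r["id"] for r in checked_before(records, date(2016, 6, 1))])
print([r["id"] for r in deleted_records(records)])
```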

Frank Richter http://dict.tu-chemnitz.de Phone: +49 (0)371 531-31879 Frank.Richter@hrz.tu-chemnitz.de

Other dictionaries

IALA-Dictionary (2700)
IALA Acronyms (1000)
S-100
INT-1 (~1200)
WP: Glossary of nautical terms (~1600)
Wiktionary: Category:en:Nautical (2200)

Languages and Writing

ISO 15924: Writing systems
Wikimedia Language-Code
Wikipedia languages