OpenSeaMap-dev:IHO Hydographic Dictionary S-32: Difference between revisions

Version of 30 January 2016, 04:38

Database description...

HowTo

For editing the database:

  1. Start Excel (IHO_Hydrographic_Dictionary_S-32_Terms.xls)
  2. In table "Start": click "Edit database"

You see:

  • An "Excel sheet" with all "Terms" in English, Français and Español.
  • A "User-form" with the "English master" (left) and the "Editable language" (right).

Excel sheet

In the Excel sheet you can:

  • Compare all terms in English, Français and Español
  • Edit a dataset (by double-clicking a term)

User-form

In the User-form you can:

  • Set the language displayed on the right side for editing
  • Search in "Term":
    "xx" finds all Terms with "xx" somewhere in them
    "(*)" finds all Terms containing "(", then something in between, then ")"
  • Scroll between datasets (with the arrow buttons)
  • Edit each data field
  • Check each data field and confirm this as a "Full check"
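
The search behaviour described above can be sketched in Python. This is only an illustration, not the code behind the Excel user-form: the function name `search_terms` is hypothetical, and it assumes that a query matches anywhere in the Term and that `*` stands for any run of characters (including an empty one).

```python
from fnmatch import fnmatchcase


def search_terms(terms, query):
    """Return the terms that contain `query`; `*` matches any run of characters.

    Assumption: matching is case-sensitive. Note that fnmatch also treats
    `?` and `[...]` as wildcards, which the real tool may not do.
    """
    pattern = f"*{query}*"  # allow the query to match anywhere in the term
    return [term for term in terms if fnmatchcase(term, pattern)]


terms = [
    "alignment correction(tape)",
    "continental (or island) shelf",
    "fair chart",
]

print(search_terms(terms, "ch"))   # substring search
print(search_terms(terms, "(*)"))  # terms with "(" ... ")"
```

A search for "(*)" is a quick way to list the terms that still need to be cleaned up under the rules further down.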

ToDo

OK Task Who Until Done Remarks
OK Investigate the source documents OpenSeaMap 2015-12-
OK Transform the source into a database OpenSeaMap 2016-01-10 2015-01-20
OK Showcase Word.doc "Terms in English, Français, Español" OpenSeaMap
OK Develop a tool for editing datasets OpenSeaMap
Decisions about rules IHB
Implement new rules OpenSeaMap
Edit the "English master" IHB
..

Improvements already done

  1. Unique ID for each Term (in addition to the old number).
  2. Split Term and Description.
  3. Extract semantic data from text (Type of word, Sex of word, Plural, en-UK/en-US)
    (~90% of all, some errors included)
  4. Add meta data about the datasets (Save date, LastCheck date, Deleted)
  5. First letter of each Term as capital letter.
  6. Dot at the end of each Description.
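
Improvements 5 and 6 are simple text normalisations. A minimal Python sketch of what they amount to is given here; the function names are hypothetical and this is not the procedure that was actually used on the database.

```python
def normalize_term(term: str) -> str:
    """Make the first letter of a Term a capital letter; leave the rest unchanged."""
    term = term.strip()
    return term[:1].upper() + term[1:]


def normalize_description(description: str) -> str:
    """Make sure a non-empty Description ends with a dot."""
    description = description.strip()
    if description and not description.endswith("."):
        description += "."
    return description


print(normalize_term("alignment correction by tape"))
print(normalize_description("Description of the term"))
```

Only the first character is changed, so abbreviations such as "DSL" keep their spelling.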

Rules

clear structure

Use only:

  • Term
  • Description
  • Attributes

All texts in UTF-8.

Attributes

Attribute | Type | Length | Content | Description
ID | incremental | | | Unique ID for all languages
number | text | 6 | #### a | Old number for old terms (may become obsolete in future)
term | text | 255 | | Term with one or more words, without special marks, without abbreviations, optionally one (1) colon (:)
description | memo | 65,535 | | Description of the term
sex | text | 1 | f, m, n | feminine, masculine, neuter
plural | text | 2 | pl, <empty> | pl: term is commonly used in plural; <empty>: term in singular
type | text | 5 | subst, adj, adv, vi, vt | substantive, adjective, adverb, intransitive verb, transitive verb
deleted | boolean | | yes/no | Mark as deleted if no longer used
en-AB | text | | en-BE, en-AE, en-both | Whether a term is only British, only American, or both
savedat | date | | | Date of the last save
lastcheck | date | | | Date of the last approved full check of a dataset
..
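
The attribute list can be read as a record definition with a few checks. The Python sketch below is one possible reading, not the actual database schema: the class name `TermRecord`, the field names `word_type` and `en_ab`, and the choice to allow empty values for sex, type and en-AB are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional

# Allowed codes from the attribute list; "" (empty) is an assumption for
# sex, type and en-AB, the list only names <empty> for plural.
ALLOWED_SEX = {"", "f", "m", "n"}
ALLOWED_PLURAL = {"", "pl"}
ALLOWED_TYPE = {"", "subst", "adj", "adv", "vi", "vt"}
ALLOWED_EN_AB = {"", "en-BE", "en-AE", "en-both"}


@dataclass
class TermRecord:
    id: int                      # unique ID for all languages
    term: str                    # max. 255 characters, at most one colon
    description: str = ""        # memo field, max. 65,535 characters
    number: str = ""             # old number, max. 6 characters
    sex: str = ""
    plural: str = ""
    word_type: str = ""          # the "type" attribute
    deleted: bool = False
    en_ab: str = ""              # the "en-AB" attribute
    savedat: Optional[date] = None
    lastcheck: Optional[date] = None

    def problems(self) -> list[str]:
        """Return a list of rule violations; an empty list means the record is fine."""
        found = []
        if not 1 <= len(self.term) <= 255:
            found.append("term must have 1 to 255 characters")
        if self.term.count(":") > 1:
            found.append("term may contain at most one colon")
        if len(self.number) > 6:
            found.append("number must have at most 6 characters")
        if len(self.description) > 65535:
            found.append("description is too long")
        if self.sex not in ALLOWED_SEX:
            found.append("sex must be f, m or n")
        if self.plural not in ALLOWED_PLURAL:
            found.append("plural must be pl or empty")
        if self.word_type not in ALLOWED_TYPE:
            found.append("type must be subst, adj, adv, vi or vt")
        if self.en_ab not in ALLOWED_EN_AB:
            found.append("en-AB must be en-BE, en-AE or en-both")
        return found


record = TermRecord(id=22, number="22", term="Accuracy: absolute", word_type="subst")
print(record.problems())
```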

use only words in a term

old:

86 | alignment correction(tape)

new:

86 | alignment correction by tape


use speakable terms

use speakable terms and link to the ordered term (or vice versa)

1. example
86 Alignment correction by tape see "tape: alignment correction"
5292 Tape: alignment correction (Description)

Other idea:

Sort
86 alignment correction by tape Tape (Description)
2. example
5 Absolute accuracy see "Accuracy: absolute"
22 Accuracy: absolute (Description)

Other idea:

Sort
22 Absolute accuracy Accuracy (Description)
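
With such links, a lookup has to follow the "see" reference to the entry that carries the description. A small Python sketch of that step is given here; the function name `resolve`, the (term, description) tuple layout and the case-insensitive comparison are assumptions made for illustration.

```python
import re

# Matches descriptions such as: see "Accuracy: absolute"   or   see: Term_1
SEE_REFERENCE = re.compile(r'^see\b:?\s*"?(?P<target>[^"]+)"?$', re.IGNORECASE)


def resolve(entries, term, max_hops=10):
    """Follow "see" references until an entry with a real description is found.

    `entries` is a list of (term, description) tuples. Terms are compared
    case-insensitively, because the examples link to "tape: ..." while the
    entry is written "Tape: ...". An unknown target raises KeyError.
    """
    by_term = {t.casefold(): (t, d) for t, d in entries}
    current = term
    for _ in range(max_hops):
        canonical, description = by_term[current.casefold()]
        match = SEE_REFERENCE.match(description.strip())
        if match is None:
            return canonical, description
        current = match.group("target").strip()
    raise ValueError(f"too many 'see' hops starting from {term!r}")


entries = [
    ("Alignment correction by tape", 'see "tape: alignment correction"'),
    ("Tape: alignment correction", "(Description)"),
    ("Absolute accuracy", 'see "Accuracy: absolute"'),
    ("Accuracy: absolute", "(Description)"),
]
print(resolve(entries, "Absolute accuracy"))
```

The hop limit guards against two entries that point at each other.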

split Sub-Languages

(but be sure it is not only a dialect)

en into en-BE and en-AE
es into es-ESP and es-ARG

old:

1756 fair chart (Brit) (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (US) see "fair chart"

new en-BE:

1756 fair chart (Description)
1757 fair sheet see "fair chart"
4793 (nothing) see "fair chart"

new en-AE:

1756 smooth sheet (Description)
1757 (nothing) see "smooth sheet"
4793 (nothing) see "smooth sheet"
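
The first step of such a split is to move the "(Brit)" / "(US)" marker out of the term text and into the en-AB attribute. A possible Python sketch follows; the function name `split_region` and the decision to treat unmarked terms as "en-both" are assumptions.

```python
import re

# Region markers found in the old term text and the en-AB value they map to.
REGION_TAGS = {"Brit": "en-BE", "US": "en-AE"}
TAG = re.compile(r"\s*\((?P<tag>Brit|US)\)\s*$")


def split_region(term):
    """Return (term without marker, en-AB value).

    Assumption: a term without a marker is used in both variants.
    """
    match = TAG.search(term)
    if match is None:
        return term.strip(), "en-both"
    return term[:match.start()].strip(), REGION_TAGS[match.group("tag")]


for old in ["fair chart (Brit)", "fair sheet", "smooth sheet (US)"]:
    print(split_region(old))
```

Deciding which rows stay empty in each sub-language, as in the lists above, still needs an editor.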

If you prefer not to split Sub-Languages:

decide which is the Master Language

Suggestion: en-BE (which may be a political issue)

old:

1756 fair chart (Brit) (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (US) see "fair chart"

new:

1756 fair chart (Description)
1757 fair sheet see "fair chart"
4793 smooth sheet (AE) see "fair chart"

split Multi-Terms

old:

998 continental (or island) shelf

new:

998 a continental shelf
998 b island shelf

old:

398 base tape (or wire)

new:

398 a base tape
398 b base wire


old:

3104 marine nature reserve (U.S. marine sanctuary)

new en-BE:

3104 marine nature reserve

new en-AE:

3104 marine sanctuary
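
The "(or ...)" cases in the first two examples follow a regular pattern: the word directly before the bracket is exchanged for the alternative. A Python sketch under that assumption is given here; the function name `split_multi_term` is hypothetical, and terms that do not fit the pattern (such as 3104) are returned unchanged for manual editing.

```python
import re

# prefix + word + "(or alternative)" + rest
OR_ALTERNATIVE = re.compile(
    r"^(?P<prefix>.*?)(?P<word>\S+)\s*\(or\s+(?P<alt>[^)]+)\)(?P<rest>.*)$"
)


def split_multi_term(number, term):
    """Split "x (or y) z" into "x z" and "y z", numbered "a" and "b"."""
    match = OR_ALTERNATIVE.match(term)
    if match is None:
        return [(number, term)]  # nothing to split
    first = f"{match['prefix']}{match['word']}{match['rest']}"
    second = f"{match['prefix']}{match['alt']}{match['rest']}"
    return [(f"{number} a", first), (f"{number} b", second)]


print(split_multi_term("998", "continental (or island) shelf"))
print(split_multi_term("398", "base tape (or wire)"))
```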

split Multi-Descriptions

This has to be discussed... (how to deal with identical terms?)

old:

1015 control point (Description 1)... (Description 2)...

new:

1015 control point (Description 1)
#### control point (Description 2)

split Synonyms

Synonyms are possible in the form:

#### Term_1 (Description)
#### Synonym_1 of Term_1 see: Term_1

Abbreviations need their own term

old:

1276 deep scattering layer (DSL)

new:

1276 a deep scattering layer
1276 b DSL see "deep scattering layer"

but how to deal with multiple meanings:

1276 c DSL see "Digital Subscriber Line"
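
The split of 1276 can be sketched in Python as follows. The function name `split_abbreviation` is hypothetical, and the sketch assumes that an abbreviation is a bracket at the end of the term that contains only capital letters and digits. Region markers such as "(US)" would match as well and have to be removed first. The question of multiple meanings is not addressed.

```python
import re

# Term followed by a trailing bracket with capitals/digits only, e.g. "(DSL)".
ABBREVIATION = re.compile(r"^(?P<term>.+?)\s*\((?P<abbr>[A-Z][A-Z0-9]+)\)$")


def split_abbreviation(number, term):
    """Return (number, term, description) rows; the abbreviation links back."""
    match = ABBREVIATION.match(term.strip())
    if match is None:
        return [(number, term, "")]
    full = match["term"]
    return [
        (f"{number} a", full, ""),
        (f"{number} b", match["abbr"], f'see "{full}"'),
    ]


print(split_abbreviation("1276", "deep scattering layer (DSL)"))
```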

use Singular

old:

1424 divider(s)

new:

1424 a divider

or split if necessary:

1424 a divider
1424 b dividers
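
For terms written with a trailing "(s)", the second variant above can be produced mechanically. A short Python sketch, with the hypothetical function name `split_optional_plural` and the assumption that the plural is formed by simply appending "s":

```python
def split_optional_plural(number, term):
    """Turn "divider(s)" into a singular row "a" and a plural row "b"."""
    if not term.endswith("(s)"):
        return [(number, term)]
    singular = term[: -len("(s)")]
    return [(f"{number} a", singular), (f"{number} b", singular + "s")]


print(split_optional_plural("1424", "divider(s)"))
```

Whether the plural row is kept is an editorial decision for each term.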

Errors

English

Some small typos...

French

Some terms have double numbers:

1408-1411 Aberration radiale

Some terms have no number:

... Abaque d’échelle

Spanish

Terms marked with "(Esp)": the marker is split between Term and Description.
Terms marked with "(Arg)": the marker is split between Term and Description.

Use of the database

as dictionary

Read-only user-form
with "Search in Term-text"
left: Language-1, right: Language-2
Read-only Excel-sheet
left: Language-1, right: Language-2
Read-only Word-document
left: Language-1, right: Language-2
Read-only PDF
left: Language-1, right: Language-2

Append database

User-form with "Search in Term-text"
left: Language-en, right: Language-2
append a new term, marked as new suggestion to the IHB

Missing terms

ID OK Term:en Term:de Remarks
nautical light Leuchtfeuer
3486 Nominal range Tragweite see: Range
Feuerhöhe
Turmhöhe
3771 Period (of nautical light) Wiederkehr ()