DSF/ISO/DIS 24614-1
Language resource management - Word segmentation of written texts for monolingual and multilingual information processing - Part 1: Basic concepts and general principles
| Organization: | DS |
| Status: | inactive |
| Page Count: | 19 |
| ICS Code (Writing and transliteration): | 01.140.10 |
Scope:
This standard is the first part of the series of ISO standards that are targeted at word segmentation (3.28) in written languages, with special attention to Chinese, Japanese, and Korean. Part 1 focuses on the basic concepts and general principles of word segmentation (3.28) that can be applied independently of the actual language at hand. The subsequent parts will focus on the issues specific to particular languages. In actual applications, particularly when representing lexical items (3.14) in a lexicon or when representing the results of word segmentation (3.28) in text, this standard is used in compliance with ISO CD 24612 Language resource management - Linguistic annotation framework (LAF) and in conjunction with ISO FDIS 24613 Language resource management - Lexical markup framework (LMF) and ISO DIS 24611 Language resource management - Morpho-syntactic annotation framework (MAF). Word segmentation (3.28) is applied at a high level of linguistic annotation, similar to lexical markup and morpho-syntactic annotation.
Document History