DSF/ISO/CD 24622-1
Language resource management -- Component Metadata Infrastructure -- The Component Metadata Model (CMDI-1)
| Organization: | DS |
| Status: | pending |
| Page Count: | 21 |
| ICS Code (Information sciences): | 01.140.20 |
Scope:
The scope of this standard CMD-1 is to describe a model that enables the flexible construction of interoperable metadata schemas for Language Resources (LRs). The metadata schemas based on this model can be used to describe resources at different levels of granularity (e.g. descriptions both on the collection level and on the level of individual resources). The model description is the first part of an infrastructure that forms a complete package for the creation of metadata schemas. As stated in the Foreword, the complete infrastructure standard contains, in addition to this Component Metadata Model specification (Part 1), one or more metadata component specification languages (Part 2) and a number of recommended metadata components and profiles (Part 3). Since this standard CMD-1 specifies an abstract model, we will rely mainly on UML [7] to describe it.
This standard addresses the basic need to provide a model that makes it easy for metadata modelers (e.g. researchers and resource description experts) to create new metadata schemas, which can in turn be used either to describe new types of resources or to enable a more appropriate description for resources in specific circumstances. The metadata schema is instantiated into metadata records (i.e. the metadata descriptions that describe the actual resource(s)) (Figure 1).
Figure 1. Describing resources with metadata
The context of this desire for flexible metadata modeling is that for scientific work there are usually various requirements for the proper description of LRs, and these requirements can derive from the specific needs of a project or from the facility or repository that will be used to store the resource for future use.
This variation requires a flexible framework that enables the easy creation of new metadata schemas for different purposes, but is also a framework (i) in which the instantiations have a strictly defined format so that at least syntactic correctness can be checked, and (ii) which provides explicit semantics for the schema elements for interpretation of the metadata record content.
COPYRIGHT © Danish Standards Foundation. Not for commercial use or reproduction. ISO/DIS 24622-1. © ISO 2013. All rights reserved.
The metadata descriptions generated by schemas compliant with this model will also be compliant with other TC37 standards, for example, those requiring that references to the described resources and resource parts use ISO 24619:2011 PISA-compatible persistent identifiers (PIDs).
The definition of a resource in this context is very broad. This standard takes a pragmatic view: for example, an image can be a resource in itself when it is associated with a PID and can be referenced as such, or it can be part of a document where it lacks an identity of its own. In addition, a reference can point to a part of this image. An individual resource can stand alone in one environment and be treated as part of a collection in another environment. Also, metadata descriptions describe resources, but they, too, are a resource in different contexts. This standard needs to support all such cases, and the model needs to provide descriptions at all levels of granularity.
This standard takes two types of collection into account:
1. A complex resource may have been created as a collection originally and, versioning aside, it will exist as such in a rather static published form. Its specification will be treated as an independent entity by the responsible archiving institution that also provides a PID for such a collection. In the context of this standard, the metadata for the collection is the collection specification. The archiving institution is responsible for maintaining the metadata representing the collection.
2. In contrast, a different type of collection is one that was not planned and designed as a collection by its creators or by the holding archive, but achieves its status as a federated resource based on research that needs to be verifiable. Such collections, although purposefully constructed by the researcher, may not have any significance outside the context of the research for which they were created. Referring from the research documents to the collection may also become tedious if the collection contains hundreds of individual resources. It follows that there is a need to capture these types of collection with a metadata record that is associated with all its constituent resources and appropriate metadata, but only as the incarnation of this collection. There is no natural responsible party to maintain this metadata record. It is unlikely that the researcher who created the 'virtual' collection (VC) has any way of consistently maintaining and curating this metadata record in the long term. There may be special registries maintained by digital archives or publishers where researchers can register such virtual collections.
Both types of collection are identified with the PID that refers to the collection metadata.
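The ideas in this scope statement (schemas assembled from reusable parts, records whose syntax can be checked, semantics attached to schema elements, and collections referenced by PID) can be illustrated with a minimal Python sketch. It is not the model defined by the standard, which is specified in UML. The class names `Element` and `Component`, the `validate` function, the `concept` field, the example.org URIs and the `hdl:0000/...` identifiers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A leaf field of a schema; `concept` is an assumed link to its meaning."""
    name: str
    concept: str
    required: bool = True


@dataclass(frozen=True)
class Component:
    """A reusable group of elements and nested components (assumed structure)."""
    name: str
    elements: tuple = ()
    components: tuple = ()


def validate(schema, record):
    """Return a list of syntactic errors for `record`, given as a nested dict."""
    errors = []
    allowed = {e.name for e in schema.elements} | {c.name for c in schema.components}
    # Fields that the schema does not define are reported first.
    for key in record:
        if key not in allowed:
            errors.append(f"{schema.name}: unexpected field '{key}'")
    # Required elements must be present.
    for element in schema.elements:
        if element.required and element.name not in record:
            errors.append(f"{schema.name}: missing element '{element.name}'")
    # Nested components are checked recursively when they occur in the record.
    for child in schema.components:
        if child.name in record:
            errors.extend(validate(child, record[child.name]))
    return errors


# Hypothetical schema for a collection-level description.
creator = Component(
    "Creator",
    elements=(Element("name", "http://example.org/concept/name"),),
)
collection_schema = Component(
    "Collection",
    elements=(
        Element("pid", "http://example.org/concept/identifier"),
        Element("title", "http://example.org/concept/title"),
        Element("members", "http://example.org/concept/member", required=False),
    ),
    components=(creator,),
)

# A metadata record instantiating the schema; members are referenced by
# (made-up) PIDs, as a virtual collection would reference its resources.
virtual_collection = {
    "pid": "hdl:0000/vc-1",
    "title": "Recordings cited in a study",
    "members": ["hdl:0000/res-1", "hdl:0000/res-2"],
    "Creator": {"name": "A. Researcher"},
}

print(validate(collection_schema, virtual_collection))
print(validate(collection_schema, {"title": "No PID", "year": 2013}))
```

The first call reports no errors; the second reports the undefined `year` field and the missing `pid`. The same `Creator` component could be reused in a schema for individual resources, which is the kind of flexibility the scope describes.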
Document History