(since v2.6.0)
1. Introduction
This transformation provides rules for populating code lists contained in the model with information, in particular their codes. Typically, the codes of a code list are managed outside of an application schema. Thus, even though the content of an application schema may be fixed in a specific version of the schema, the codes of the code lists used by the schema can vary. Externally managed code lists are typically modelled without any codes.
A use case where the transformation is useful is encoding a code list in a specific format, like an ISO 19139 code list dictionary. The transformation loads the codes of a code list from its authoritative source. Subsequent ShapeChange processes, like the ISO 19139 Codelist Dictionary target, can then derive a dictionary in the required format. The same workflow can be used to update the dictionary at any time.
2. Configuration
2.1. Class
The class for this transformer implementation is de.interactive_instruments.shapechange.core.transformation.adding.CodeListLoader
2.2. Transformation Rules
2.2.1. rule-trf-cls-loadCodes
This rule loads codes from authoritative code list sources into the model.
The transformation looks at each code list contained in the schemas selected for processing. If the code list defines the source, then the codes will be retrieved from the source and added to the code list.
Note: A code list which defines a code list source may already contain codes (typically modeled as attributes). Loading codes from the source may result in duplicate codes. By setting transformation parameter removeExistingCodesBeforeLoading to true, the pre-existing codes can be removed before codes are loaded from the source. This behavior is available since v4.0.0 of ShapeChange, and only applies to the code lists for which a code list source is defined.
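A minimal sketch of the setting described in the note, assuming the usual ShapeChange layout in which transformation parameters are given as ProcessParameter elements inside a parameters element that precedes the rules. The ids and the rule set name are taken from the configuration example in section 3; the exact element order should be checked against the ShapeChange configuration schema.

```xml
<Transformer class="de.interactive_instruments.shapechange.core.transformation.adding.CodeListLoader"
  id="TRF_CL_LOADER" input="IDENTITY" mode="enabled">
  <parameters>
    <!-- Assumed syntax: remove pre-existing codes of code lists that
         define a code list source, before loading codes from the source -->
    <ProcessParameter name="removeExistingCodesBeforeLoading" value="true"/>
  </parameters>
  <rules>
    <ProcessRuleSet name="cl_loader_rules">
      <rule name="rule-trf-cls-loadCodes"/>
    </ProcessRuleSet>
  </rules>
</Transformer>
```

Code lists without a code list source are not affected by this parameter, so their manually modelled codes remain in place.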
The source is defined by tagged value codeListSource, which provides a link to a remote or local resource in a specific representation. The representation is defined via tagged value codeListSourceRepresentation (or, since v4.0.0, using transformation parameter defaultCodeListSourceRepresentation as fallback). At the moment, the following representations are supported:
- Representation application/x.iso639_2:
  - Definition: The CSV-based list of ISO 639-2 language codes, as published by the Library of Congress.
  - Code retrieval details: The code name is the alpha-3 code (the bibliographic code, if both a bibliographic and a terminologic code are available). The English name is used as the definition and documentation of the code.
- Representation application/x-re3gistry-json:
  - Definition: The JSON format for code list encoding of the Re3gistry implementation. Tested with the GDI-DE Re3gistry instance of the German mapping agencies.
  - Code retrieval details: The code list source is a code list hosted by a Re3gistry instance, in a specific register. The code list source is either a URL or the path to a local JSON file (especially useful for unit tests, or in cases where the Re3gistry is not readily available). If a URL is given and it ends with '.json', it is assumed that the URL links directly to the JSON representation in the desired language (see parameter re3gistryLang). Otherwise, the URL is assumed to link to the base resource of the code list in a specific register (which is determined from the URL), and
    '/' + {code list name} + '.' + {value of parameter re3gistryLang} + '.json'
    is added to the source URL. If the code list source is a local JSON file, parameter re3gistryRegister must be set in order to identify the relevant register. From the JSON document, the objects in the member that represents the register are parsed:
    - The definition/text is set as the definition of the code list, unless the code list already has a non-empty definition.
    - For each item object within containedItems, its value member is retrieved. It represents a particular code. If the URL in value/status/id ends with /valid or /retired, the code is loaded. Otherwise, it is ignored. Thus, only relevant codes are loaded from the registry. The value of member value/CodeListValue_local_Id/text is used as the initial value for the new attribute that will be added to the code list to represent the code, while the value of member value/label/text is used as attribute name.
Additional representations can be added in the future.
The tagged value codeListSourceCharset can be used to define the character set of the resource. This information is used to correctly read the code list resource. If the tagged value is blank, then UTF-8 is used by default. You can use one of the character sets supported by Java. For further details, see the documentation of the Java class Charset and its method forName(String charsetName).
2.3. Parameters
2.3.1. defaultCodeListSourceRepresentation
(since v4.0.0)
Alias: none
Type: enum, one of 'application/x.iso639_2' and 'application/x-re3gistry-json'
Default value: none
Behavior
Default representation for code list sources.
Applies to Rule(s)
2.3.2. re3gistryLang
(since v4.0.0)
Alias: none
Type: string (should be a code for one of the languages supported by the relevant Re3gistry instance(s))
Default value: en
Behavior
Defines the code of the language in which code (list) metadata is retrieved.
Applies to Rule(s)
2.3.3. re3gistryRegister
(since v4.0.0)
Alias: none
Type: string
Default value: none
Behavior
Identifier of the register in which the code lists are located in the Re3gistry. Only relevant if the code list source is a local JSON file. If the source is an HTTP URL, the register is determined automatically from the URL.
Applies to Rule(s)
3. Configuration Example
<Transformer class="de.interactive_instruments.shapechange.core.transformation.adding.CodeListLoader"
  id="TRF_CL_LOADER" input="IDENTITY" mode="enabled">
  <rules>
    <ProcessRuleSet name="cl_loader_rules">
      <rule name="rule-trf-cls-loadCodes"/>
    </ProcessRuleSet>
  </rules>
</Transformer>
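The example above relies on the parameter defaults. A variant that sets the parameters from section 2.3 might look like the following sketch. It assumes the usual ShapeChange layout with ProcessParameter elements inside a parameters element placed before the rules. The language code de and the register identifier my-register are hypothetical placeholder values, not taken from a real Re3gistry instance.

```xml
<Transformer class="de.interactive_instruments.shapechange.core.transformation.adding.CodeListLoader"
  id="TRF_CL_LOADER" input="IDENTITY" mode="enabled">
  <parameters>
    <!-- Fallback for code lists without tagged value codeListSourceRepresentation -->
    <ProcessParameter name="defaultCodeListSourceRepresentation" value="application/x-re3gistry-json"/>
    <!-- Hypothetical value: a language supported by the Re3gistry instance (default is en) -->
    <ProcessParameter name="re3gistryLang" value="de"/>
    <!-- Hypothetical value: only relevant if the code list source is a local JSON file -->
    <ProcessParameter name="re3gistryRegister" value="my-register"/>
  </parameters>
  <rules>
    <ProcessRuleSet name="cl_loader_rules">
      <rule name="rule-trf-cls-loadCodes"/>
    </ProcessRuleSet>
  </rules>
</Transformer>
```

If all code list sources are HTTP URLs, re3gistryRegister can be omitted, since the register is then determined from the URL.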