[DDI-users] Unrestricted usage of scopeOfUniqueness leads to ambiguities in reference resolution
Jani Hautamäki
Jani.Hautamaki at staff.uta.fi
Fri Dec 19 08:43:41 EST 2014
I could not find any restrictions on scoping the uniqueness of
maintainable IDs in either the field-level documentation or the technical
documentation (Part I),
http://www.ddialliance.org/Specification/DDI-Lifecycle/3.2/XMLSchema/HighLevelDocumentation/DDI_Part_I_TechnicalDocument.pdf
I am inclined to interpret the absence of any restrictions
to mean that @scopeOfUniqueness="Maintainable" may be set
on any AbstractMaintainableType element.
Is my interpretation of the documentation correct?
For reference, here are the relevant parts regarding scopeOfUniqueness
that I found in the technical documentation,
page 11:
[Unique ID in DDI is...] An identification which is unique within
a) the agency (sub-agency), or b) within the parent maintainable.
If the context is the parent maintainable the Unique ID is the ID
of the parent maintainable plus the ID of the object within that
maintainable separated by a ".".
page 11:
Identifiable objects have a unique ID within the context
of their specified scope of uniqueness.
page 12:
Versionable objects have a unique ID within the context
of their specified scope of uniqueness.
(NOTE: The page continues with Maintainable objects, but
in this case nothing is said about the uniqueness of the ID)
page 17:
DDI 3.2 supports scoping the uniqueness of identifier
to the parent Maintainable or to the Agency (sub-agency).
[...]
When the ID is scoped to the Maintainable the unique identification of
a non-Maintainable object requires the Agency, ID of the parent
Maintainable, the ID of the object, and the Version Number of the object.
[...]
This attribute [scopeOfUniqueness] defines how the ID will be expressed
in the Canonical URN and what is required for a complete reference
to the object within the Maintaining Agency.
page 18:
If the scopeOfUniqueness equals "Maintainable" the ID of a non-Maintainable
object is structured as follows:
"urn:ddi:agency[.sub-agency]:MaintainableID.ObjectID:Version".
Obviously, a reference should be unambiguously resolvable to
a specific data element. However, this is not always possible within
the current specification.
Example in DDI v3.2 (ambiguous1.xml)
-----8<-----8<-----8<-----8<-----8<-----
<?xml version="1.0" encoding="utf-8"?>
<ddi:DDIInstance
    xmlns:ddi="ddi:instance:3_2"
    xmlns:r="ddi:reusable:3_2"
    xmlns:s="ddi:studyunit:3_2"
    xmlns:c="ddi:conceptualcomponent:3_2"
    xml:lang="en"
    scopeOfUniqueness="Maintainable"
>
  <r:Agency>org.acme</r:Agency>
  <r:ID>unique_within_parent</r:ID>
  <r:Version>1</r:Version>
  <s:StudyUnit scopeOfUniqueness="Maintainable">
    <r:Agency>org.acme</r:Agency>
    <r:ID>unique_within_parent</r:ID>
    <r:Version>1</r:Version>
    <!-- FIRST conceptual component -->
    <c:ConceptualComponent scopeOfUniqueness="Maintainable">
      <r:Agency>org.acme</r:Agency>
      <r:ID>conceptual_component_first</r:ID>
      <r:Version>1</r:Version>
      <c:ConceptScheme scopeOfUniqueness="Maintainable">
        <r:Agency>org.acme</r:Agency>
        <r:ID>unique_within_parent</r:ID>
        <r:Version>1</r:Version>
        <c:Concept scopeOfUniqueness="Maintainable">
          <r:Agency>org.acme</r:Agency>
          <r:ID>unique_within_parent</r:ID>
          <r:Version>1</r:Version>
          <c:ConceptName>
            <r:String>the concept name</r:String>
          </c:ConceptName>
        </c:Concept>
      </c:ConceptScheme>
    </c:ConceptualComponent>
    <!-- SECOND conceptual component -->
    <c:ConceptualComponent scopeOfUniqueness="Maintainable">
      <r:Agency>org.acme</r:Agency>
      <r:ID>conceptual_component_second</r:ID>
      <r:Version>1</r:Version>
      <c:ConceptScheme scopeOfUniqueness="Maintainable">
        <r:Agency>org.acme</r:Agency>
        <r:ID>unique_within_parent</r:ID>
        <r:Version>1</r:Version>
        <c:Concept scopeOfUniqueness="Maintainable">
          <r:Agency>org.acme</r:Agency>
          <r:ID>unique_within_parent</r:ID>
          <r:Version>1</r:Version>
          <c:ConceptName>
            <r:String>different concept name</r:String>
          </c:ConceptName>
        </c:Concept>
      </c:ConceptScheme>
    </c:ConceptualComponent>
  </s:StudyUnit>
</ddi:DDIInstance>
-----8<-----8<-----8<-----8<-----8<-----
I was not able to figure out from the specifications
whether the canonical URN for a Maintainable with
scopeOfUniqueness="Maintainable" should include
the parent maintainable or not.
Page 17 of the technical documentation comes close,
but does not cover this case:
"When the ID is scoped to the Maintainable the unique identification
of a non-Maintainable object requires [...]"
Here the ID is scoped to the Maintainable,
and we are concerned about the unique identification
of a Maintainable object (instead of non-Maintainable).
However, as there are only two alternatives, it is
possible to explore both:
Case 1 (include parent maintainable in the URN)
In this case the urn would be:
urn:ddi:org.acme:unique_within_parent.unique_within_parent:1
Case 2 (do not include parent maintainable in the URN)
In this case the urn would be:
urn:ddi:org.acme:unique_within_parent:1
As it turns out, both of these cases fail to provide an unambiguous
identification sequence.
Including the TypeOfObject information will not resolve the issue,
since it is possible to have two or more non-Maintainables
that all have the same reference.
For instance, in the example there are two different <Concept> Versionables
that both have the same reference:
<!-- Reference for "the concept name" concept -->
<r:ConceptReference>
  <r:URN>urn:ddi:org.acme:unique_within_parent.unique_within_parent:1</r:URN>
  <r:TypeOfObject>Concept</r:TypeOfObject>
</r:ConceptReference>
<!-- Reference for the "different concept name" concept -->
<r:ConceptReference>
  <r:URN>urn:ddi:org.acme:unique_within_parent.unique_within_parent:1</r:URN>
  <r:TypeOfObject>Concept</r:TypeOfObject>
</r:ConceptReference>
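To make the collisions easier to check, here is a minimal Python sketch that
models the objects of ambiguous1.xml and builds the URN for each of them under
Case 1 and Case 2. It is only an illustration: the names Node, urn and
collisions are hypothetical, the top-level DDIInstance (which has no parent)
is assumed to use its own ID only, and a Concept is assumed to always carry
the ID of its nearest parent maintainable, as in the page 18 format. Objects
are grouped by (URN, element name) to mimic a reference with TypeOfObject.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

AGENCY = "org.acme"
VERSION = "1"


@dataclass
class Node:
    kind: str                       # element name, e.g. "ConceptScheme"
    id: str                         # content of r:ID
    maintainable: bool              # True for maintainable elements
    parent: Optional["Node"] = None


def nearest_maintainable(node):
    """Return the closest ancestor that is a maintainable, or None."""
    p = node.parent
    while p is not None and not p.maintainable:
        p = p.parent
    return p


def urn(node, include_parent):
    """Build a URN; include_parent=True is Case 1, False is Case 2."""
    parent = nearest_maintainable(node)
    # Non-maintainables always get "MaintainableID.ObjectID" (page 18).
    if parent is not None and (include_parent or not node.maintainable):
        ident = f"{parent.id}.{node.id}"
    else:
        ident = node.id
    return f"urn:ddi:{AGENCY}:{ident}:{VERSION}"


# The object tree of ambiguous1.xml.
U = "unique_within_parent"
inst = Node("DDIInstance", U, True)
su = Node("StudyUnit", U, True, inst)
cc1 = Node("ConceptualComponent", "conceptual_component_first", True, su)
cs1 = Node("ConceptScheme", U, True, cc1)
c1 = Node("Concept", U, False, cs1)
cc2 = Node("ConceptualComponent", "conceptual_component_second", True, su)
cs2 = Node("ConceptScheme", U, True, cc2)
c2 = Node("Concept", U, False, cs2)
nodes = [inst, su, cc1, cs1, c1, cc2, cs2, c2]


def collisions(include_parent):
    """Return the (URN, element name) pairs shared by more than one object."""
    groups = defaultdict(list)
    for n in nodes:
        groups[(urn(n, include_parent), n.kind)].append(n)
    return {key: members for key, members in groups.items() if len(members) > 1}


for label, flag in (("Case 1", True), ("Case 2", False)):
    for (u, kind), members in collisions(flag).items():
        print(label, kind, u, "is shared by", len(members), "objects")
```

Under these assumptions the two Concepts share one URN in both cases, and in
Case 2 the two ConceptSchemes share one URN as well.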
It is apparent that in order to provide an unambiguous reference using
a sequence of object IDs, each ID must be unique within the preceding object.
For instance, in the canonical URN of a non-Maintainable object
the sequence of IDs is:
(agency, maintainable, identifiable)
For this to work, agency must be globally unique, maintainable must be
unique within the agency and identifiable must be unique within
the maintainable. This suggests that @scopeOfUniqueness for Maintainables
should be restricted to allow "Agency" only.
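The proposed restriction could be checked mechanically. The following Python
sketch is only an illustration of such a check, not part of any DDI tool: the
function name check_maintainables and the tuple layout are hypothetical, all
objects are assumed to belong to one agency and one version, and IDs are
assumed to be compared across all maintainable types of that agency.

```python
from collections import Counter

# (element name, r:ID, scopeOfUniqueness) for each maintainable
# in ambiguous1.xml; agency and version are the same for all of them.
maintainables = [
    ("DDIInstance", "unique_within_parent", "Maintainable"),
    ("StudyUnit", "unique_within_parent", "Maintainable"),
    ("ConceptualComponent", "conceptual_component_first", "Maintainable"),
    ("ConceptScheme", "unique_within_parent", "Maintainable"),
    ("ConceptualComponent", "conceptual_component_second", "Maintainable"),
    ("ConceptScheme", "unique_within_parent", "Maintainable"),
]


def check_maintainables(items):
    """Return human-readable violations of the proposed restriction."""
    problems = []
    # Rule 1: a maintainable may only be scoped to the agency.
    for kind, ident, scope in items:
        if scope != "Agency":
            problems.append(
                f"{kind} '{ident}': scopeOfUniqueness must be 'Agency'"
            )
    # Rule 2: a maintainable ID must not repeat within the agency.
    counts = Counter(ident for _, ident, _ in items)
    for ident, n in counts.items():
        if n > 1:
            problems.append(
                f"ID '{ident}' is used by {n} maintainables in the agency"
            )
    return problems


for line in check_maintainables(maintainables):
    print(line)
```

With the example above, every maintainable fails the first rule and the ID
unique_within_parent fails the second one.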
An alternative way to resolve the issue is to modify the URN structure to allow
arbitrary identification sequences. In this case, an additional restriction is
also needed: objects which are not contained (or cannot be contained)
within any other identified object (e.g. DDIInstance) must be scoped to
be unique within the agency. In this approach the identification sequence
for an object that has scopeOfUniqueness="Maintainable" would need to
include the IDs of all the ancestor maintainables up to a maintainable that
has scopeOfUniqueness="Agency". An example of such an identification
sequence would be:
(agency, unique_within_agency, unique_within_parent, unique_within_parent, ...)
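The second approach can be sketched in Python as a walk from the object up
through its ancestors until one scoped to the agency is reached. This is only
an illustration under assumptions: the names Obj and identification_sequence
are hypothetical, the tree is a variation of ambiguous1.xml in which the
DDIInstance is given the ID unique_within_agency and scopeOfUniqueness="Agency",
and every ancestor in the chain is taken to be a maintainable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    id: str                         # content of r:ID
    scope: str                      # "Agency" or "Maintainable"
    parent: Optional["Obj"] = None


def identification_sequence(agency, obj):
    """Collect IDs from obj up to the nearest ancestor scoped to the agency."""
    ids = []
    node = obj
    while node is not None:
        ids.append(node.id)
        if node.scope == "Agency":
            break
        node = node.parent
    else:
        # Ran out of ancestors without finding an agency-scoped one.
        raise ValueError("no ancestor is scoped to the agency")
    return (agency, *reversed(ids))


U = "unique_within_parent"
root = Obj("unique_within_agency", "Agency")          # hypothetical DDIInstance
su = Obj(U, "Maintainable", root)                     # StudyUnit
cc1 = Obj("conceptual_component_first", "Maintainable", su)
cc2 = Obj("conceptual_component_second", "Maintainable", su)
concept1 = Obj(U, "Maintainable", Obj(U, "Maintainable", cc1))
concept2 = Obj(U, "Maintainable", Obj(U, "Maintainable", cc2))

print(identification_sequence("org.acme", concept1))
print(identification_sequence("org.acme", concept2))
```

In this variation the two Concepts get different sequences, because the IDs
of the two ConceptualComponents differ.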
I guess the first approach might be the simplest solution for resolving
the ambiguity introduced by the unrestricted usage of @scopeOfUniqueness.
Hopefully the issue will be resolved in one way or the other.
The presented options are just the two most obvious solutions.