[DDI-users] Unrestricted usage of scopeOfUniqueness leads to ambiguities in reference resolution

Dan Smith dan at colectica.com
Fri Dec 19 12:23:36 EST 2014


Hi Jani,

Yes, you are correct: maintainable IDs are always scoped to an agency.

"This suggests that @scopeOfUniqueness for Maintainables
should be restricted to allow "Agency" only."

Sure, the attribute could be declared with a fixed value in the schema so
that developers have to obey this constraint.

In Colectica we scope every item to the Agency, which is also the
default setting in the schema.
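
Until the schema enforces it, a check along these lines could flag
offending maintainables. This is only a rough sketch in Python: the list
of maintainable element names is an assumption limited to the elements in
your example, the function names are made up for illustration, and a
missing attribute is treated as "Agency" because that is the schema default.

```python
import xml.etree.ElementTree as ET

# Assumption: only the maintainable element names used in the quoted example.
MAINTAINABLES = {"DDIInstance", "StudyUnit", "ConceptualComponent", "ConceptScheme"}
R_ID = "{ddi:reusable:3_2}ID"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def badly_scoped_maintainables(xml_text):
    """Return (element name, ID) for maintainables not scoped to the Agency."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        if local_name(elem.tag) not in MAINTAINABLES:
            continue
        # A missing attribute is treated as the default, "Agency".
        scope = elem.get("scopeOfUniqueness", "Agency")
        if scope != "Agency":
            found.append((local_name(elem.tag), elem.findtext(R_ID)))
    return found


SAMPLE = """<ddi:DDIInstance xmlns:ddi="ddi:instance:3_2" xmlns:r="ddi:reusable:3_2"
    xmlns:s="ddi:studyunit:3_2">
  <r:ID>instance_1</r:ID>
  <s:StudyUnit scopeOfUniqueness="Maintainable">
    <r:ID>unique_within_parent</r:ID>
  </s:StudyUnit>
</ddi:DDIInstance>"""

print(badly_scoped_maintainables(SAMPLE))
```

Run against the full ambiguous1.xml, the same function would presumably
report every maintainable in the file, since all of them set
scopeOfUniqueness="Maintainable".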

I am also working on the DDI 3.2 examples you asked about at EDDI, and
I will try to publish them soon.

Cheers
Dan

On 12/19/2014 7:43 AM, Jani Hautamäki wrote:
> 
> I could not find any restrictions on the scope of uniqueness of
> maintainable IDs in the field-level documentation or in the technical
> documentation (part 1):
> 
>     http://www.ddialliance.org/Specification/DDI-Lifecycle/3.2/XMLSchema/HighLevelDocumentation/DDI_Part_I_TechnicalDocument.pdf
> 
> I am inclined to interpret the absence of any restrictions
> to mean that @scopeOfUniqueness="Maintainable" is allowed
> for any AbstractMaintainableType element.
>     
> Is my interpretation of the documentation correct?
> 
> For reference, here are the relevant parts regarding scopeOfUniqueness
> that I found in the technical documentation:
> 
>     page 11:
>     [Unique ID in DDI is...] An identification which is unique within 
>     a) the agency (sub-agency), or b) within the parent maintainable. 
>     If the context is the parent maintainable the Unique ID is the ID
>     of the parent maintainable plus the ID of the object within that
>     maintainable separated by a ".".
> 
>     page 11:
>     Identifiable objects have a unique ID within the context 
>     of their specified scope of uniqueness.
> 
>     page 12:
>     Versionable objects have a unique ID within the context
>     of their specified scope of uniqueness.
> 
>     (NOTE: The page continues with Maintainable objects, but
>     in this case nothing is said about the uniqueness of the ID)
>         
>     page 17:
>     DDI 3.2 supports scoping the uniqueness of identifier 
>     to the parent Maintainable or to the Agency (sub-agency).
>     [...]
>     When the ID is scoped to the Maintainable the unique identification of
>     a non-Maintainable object requires the Agency, ID of the parent 
>     Maintainable, the ID of the object, and the Version Number of the object.
>     [...]
>     This attribute [scopeOfUniqueness] defines how the ID will be expressed
>     in the Canonical URN and what is required for a complete reference 
>     to the object within the Maintaining Agency.
> 
>     page 18:
>     If the scopeOfUniqueness equals "Maintainable" the ID of a non-Maintainable
>     object is structured as follows: 
>     "urn:ddi:agency[.sub-agency]:MaintainableID.ObjectID:Version".
> 
> Obviously, a reference should be unambiguously resolvable to 
> a specific data element. However, this is not always possible within 
> the current specification.
> 
> Example in DDI v3.2 (ambiguous1.xml)
> -----8<-----8<-----8<-----8<-----8<-----
> <?xml version="1.0" encoding="utf-8"?>
> <ddi:DDIInstance
>     xmlns:ddi="ddi:instance:3_2"
>     xmlns:r="ddi:reusable:3_2"
>     xmlns:s="ddi:studyunit:3_2"
>     xmlns:c="ddi:conceptualcomponent:3_2"
>     xml:lang="en"
>     scopeOfUniqueness="Maintainable"
>     >
>   <r:Agency>org.acme</r:Agency>
>   <r:ID>unique_within_parent</r:ID>
>   <r:Version>1</r:Version>
> 
>   <s:StudyUnit scopeOfUniqueness="Maintainable">
>     <r:Agency>org.acme</r:Agency>
>     <r:ID>unique_within_parent</r:ID>
>     <r:Version>1</r:Version>
> 
>     <!-- FIRST conceptual component -->
> 
>     <c:ConceptualComponent scopeOfUniqueness="Maintainable">
>       <r:Agency>org.acme</r:Agency>
>       <r:ID>conceptual_component_first</r:ID>
>       <r:Version>1</r:Version>
>       
>       <c:ConceptScheme scopeOfUniqueness="Maintainable">
>         <r:Agency>org.acme</r:Agency>
>         <r:ID>unique_within_parent</r:ID>
>         <r:Version>1</r:Version>
>         
>         <c:Concept scopeOfUniqueness="Maintainable">
>           <r:Agency>org.acme</r:Agency>
>           <r:ID>unique_within_parent</r:ID>
>           <r:Version>1</r:Version>
>           
>           <c:ConceptName>
>             <r:String>the concept name</r:String>
>           </c:ConceptName>
>         </c:Concept>
> 
>       </c:ConceptScheme>
>     </c:ConceptualComponent>
>     
>     <!-- SECOND conceptual component -->
>     
>     <c:ConceptualComponent scopeOfUniqueness="Maintainable">
>       <r:Agency>org.acme</r:Agency>
>       <r:ID>conceptual_component_second</r:ID>
>       <r:Version>1</r:Version>
>       
>       <c:ConceptScheme scopeOfUniqueness="Maintainable">
>         <r:Agency>org.acme</r:Agency>
>         <r:ID>unique_within_parent</r:ID>
>         <r:Version>1</r:Version>
>         
>         <c:Concept scopeOfUniqueness="Maintainable">
>           <r:Agency>org.acme</r:Agency>
>           <r:ID>unique_within_parent</r:ID>
>           <r:Version>1</r:Version>
>           
>           <c:ConceptName>
>             <r:String>different concept name</r:String>
>           </c:ConceptName>
>         </c:Concept>
> 
>       </c:ConceptScheme>
>     </c:ConceptualComponent>
>     
>   </s:StudyUnit>
> </ddi:DDIInstance>
> -----8<-----8<-----8<-----8<-----8<-----
> 
> I was not able to figure out from the specifications 
> whether the canonical URN for a Maintainable with 
> scopeOfUniqueness="Maintainable" should include 
> the parent maintainable or not.
> 
> Page 17 (in the technical documentation) comes close,
> but does not cover this case:
>     
>     "When the ID is scoped to the Maintainable the unique identification 
>     of a non-Maintainable object requires [...]"
> 
> Here the ID is scoped to the Maintainable,
> and we are concerned about the unique identification
> of a Maintainable object (instead of non-Maintainable).
> 
> However, as there are only two alternatives, it is
> possible to explore both:
> 
> Case 1 (include parent maintainable in the URN)
> 
>     In this case the urn would be:
>     urn:ddi:org.acme:unique_within_parent.unique_within_parent:1
> 
> Case 2 (do not include parent maintainable in the URN)
>     
>     In this case the urn would be:
>     urn:ddi:org.acme:unique_within_parent:1
> 
> As it turns out, both of these cases fail to provide an unambiguous
> identification sequence.
> 
> Including the TypeOfObject information will not resolve the issue,
> since it is possible to have two or more non-Maintainables
> that all have the same reference.
>     
> For instance, in the example there are two different <Concept> Versionables
> that both have the same reference:
> 
>     <!-- Reference for "the concept name" concept -->
>     <r:ConceptReference>
>       <r:URN>urn:ddi:org.acme:unique_within_parent.unique_within_parent:1</r:URN>
>       <r:TypeOfObject>Concept</r:TypeOfObject>
>     </r:ConceptReference>
> 
>     <!-- Reference for the "different concept name" concept -->
>     <r:ConceptReference>
>       <r:URN>urn:ddi:org.acme:unique_within_parent.unique_within_parent:1</r:URN>
>       <r:TypeOfObject>Concept</r:TypeOfObject>
>     </r:ConceptReference>
> 
> It is apparent that in order to provide an unambiguous reference using
> a sequence of object IDs, each ID must be unique within the preceding object.
> 
> For instance, in the canonical URN of a non-Maintainable object 
> the sequence of IDs is:
>     
>     (agency, maintainable, identifiable)
> 
> For this to work, agency must be globally unique, maintainable must be
> unique within the agency and identifiable must be unique within 
> the maintainable. This suggests that @scopeOfUniqueness for Maintainables
> should be restricted to allow "Agency" only.
> 
> An alternative way to resolve the issue is to modify the URN structure to allow
> arbitrary identification sequences. In this case, an additional restriction is
> also needed: objects which are not contained (or cannot be contained)
> within any other identified object (e.g. DDIInstance) must be scoped to
> be unique within the agency. In this approach the identification sequence
> for an object that has scopeOfUniqueness="Maintainable" would need to 
> include the IDs of all the ancestor maintainables up to a maintainable that
> has scopeOfUniqueness="Agency". An example of such an identification 
> sequence would be:
> 
>     (agency, unique_within_agency, unique_within_parent, unique_within_parent, ...)
> 
> I guess the first approach might be the simplest solution for resolving
> the ambiguity introduced by the unrestricted usage of @scopeOfUniqueness.
> 
> Hopefully the issue will be resolved in one way or the other. 
> The presented options are just the two most obvious solutions.
> 
> _______________________________________________
> DDI-users mailing list
> DDI-users at icpsr.umich.edu
> http://lists.icpsr.umich.edu/mailman/listinfo/ddi-users
> 
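
For what it's worth, the collision in the quoted example is easy to
reproduce. A minimal sketch in Python, assuming the URN layout quoted from
page 18 of the technical documentation; the function name and the
hand-copied tuples are only illustrative and are not part of any DDI tooling.

```python
from collections import defaultdict


def urn_maintainable_scoped(agency, maintainable_id, object_id, version):
    """URN layout quoted from page 18 for scopeOfUniqueness="Maintainable"."""
    return f"urn:ddi:{agency}:{maintainable_id}.{object_id}:{version}"


# The two Concepts from ambiguous1.xml, copied by hand:
# (agency, parent ConceptScheme ID, Concept ID, version, ConceptName)
concepts = [
    ("org.acme", "unique_within_parent", "unique_within_parent", "1",
     "the concept name"),
    ("org.acme", "unique_within_parent", "unique_within_parent", "1",
     "different concept name"),
]

# Group the concept names by the URN a reference to them would carry.
targets = defaultdict(list)
for agency, scheme_id, concept_id, version, name in concepts:
    urn = urn_maintainable_scoped(agency, scheme_id, concept_id, version)
    targets[urn].append(name)

# Any URN with more than one target cannot be resolved unambiguously.
ambiguous = {urn: names for urn, names in targets.items() if len(names) > 1}
for urn, names in ambiguous.items():
    print(urn, "->", names)
```

Both concepts end up under the single URN
urn:ddi:org.acme:unique_within_parent.unique_within_parent:1, so a
resolver has nothing to choose between them. If the ConceptScheme IDs were
unique within the agency, the two URNs would differ.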


-- 
Dan Smith
+1 608-213-2867
Colectica - Statistical Data Management
http://www.colectica.com

