[DDI-users] DDI 3.2: Schema allows double identification sequence

Jani Hautamäki Jani.Hautamaki at staff.uta.fi
Wed Dec 3 17:58:00 EST 2014


From what you said, I get the impression that this is a bug in the XML Schema: the intention is expressed in the specification, but you were unable to formulate the restriction in XML Schema. Is this correct?


The patch was included in the message, and the resulting .xsd was tested against libxml2 v2.7.8. The patched schema worked as intended. Including the patch in the XML Schema would reduce the number of documents that are formally valid but invalid according to the specification.


Of course, accepting the fix to the schema would imply that the similar structure in ReferenceType should also be fixed in the same manner. In my opinion, however, that would only be a good thing.


________________________________
From: ddi-users-bounces at icpsr.umich.edu <ddi-users-bounces at icpsr.umich.edu> on behalf of Wendy Thomas <wlt at umn.edu>
Sent: Thursday, December 4, 2014 00:27
To: Data Documentation Initiative Users Group
Subject: Re: [DDI-users] DDI 3.2: Schema allows double identification sequence

Yes, we know about this, but we have been unable in XML Schema to allow only one of each while requiring at least one or the other. We have had the bug filed since version 3.0. The best we could do was document what was supposed to be done.

URN OR sequence OR both is the intention, and it is documented as such.

However, 2 URNs or 2 sequences will validate against the XML Schema.

Wendy


On Wed, Dec 3, 2014 at 4:16 PM, Jani Hautamäki <Jani.Hautamaki at staff.uta.fi> wrote:

In DDI-Lifecycle 3.2 the narrative documentation

http://www.ddialliance.org/Specification/DDI-Lifecycle/3.2/XMLSchema/FieldLevelDocumentation/schemas/reusable_xsd/complexTypes/AbstractIdentifiableType.html

for the type

{ddi:reusable:3_2}AbstractIdentifiableType

states that

"An entity can either be identified either by a URN and/or an identification
sequence. At a minimum, one or the other is required. "

However, according to the XML Schema, the following XML document is valid.

---8<---8<---8<---
<?xml version="1.0" encoding="utf-8"?>
<ddi:DDIInstance
    xmlns:ddi="ddi:instance:3_2"
    xmlns:r="ddi:reusable:3_2"
    >

  <r:Agency>acme.org</r:Agency>
  <r:ID>ddi_instance</r:ID>
  <r:Version>1</r:Version>

  <r:Agency>acme.org</r:Agency>
  <r:ID>another_ddi_instance</r:ID>
  <r:Version>2</r:Version>

</ddi:DDIInstance>
---8<---8<---8<---

This is invalid according to the specification, but the restriction
is not expressed formally in the XML Schema.

My question is then:

Is this a mistake/bug in the XML Schema? (If not, please explain why
it is better to formally allow such invalid documents.)

The XML Schema language allows one to express formally the restriction
"either one or both". The details are laid down, for instance, in the answer
http://stackoverflow.com/questions/9863056/xsd-schema-either-one-or-both
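
As a minimal sketch of that pattern (this is not the patch itself, and it is
deliberately simplified): the type name IdentificationSketchType is
hypothetical, the element name URN and the xs:string types are assumptions
made only to keep the fragment self-contained, and the other children of the
real AbstractIdentifiableType are left out. The idea is to replace a repeated
choice with a choice between "URN, optionally followed by the sequence" and
"the sequence alone":

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="ddi:reusable:3_2"
           targetNamespace="ddi:reusable:3_2"
           elementFormDefault="qualified">

  <!-- Simplified stand-in declarations (assumed, not the real DDI ones) -->
  <xs:element name="URN" type="xs:string"/>
  <xs:element name="Agency" type="xs:string"/>
  <xs:element name="ID" type="xs:string"/>
  <xs:element name="Version" type="xs:string"/>

  <!-- Hypothetical type illustrating "either one or both" -->
  <xs:complexType name="IdentificationSketchType">
    <xs:choice>
      <!-- Branch 1: a URN, optionally followed by one identification sequence -->
      <xs:sequence>
        <xs:element ref="URN"/>
        <xs:sequence minOccurs="0">
          <xs:element ref="Agency"/>
          <xs:element ref="ID"/>
          <xs:element ref="Version"/>
        </xs:sequence>
      </xs:sequence>
      <!-- Branch 2: exactly one identification sequence and no URN -->
      <xs:sequence>
        <xs:element ref="Agency"/>
        <xs:element ref="ID"/>
        <xs:element ref="Version"/>
      </xs:sequence>
    </xs:choice>
  </xs:complexType>

</xs:schema>
```

Because the two branches begin with different elements (URN versus Agency),
the content model stays deterministic. The trade-off in this sketch is that,
when both forms are present, the URN has to come before the sequence; two
URNs or two sequences no longer validate.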

Here's a patch for the file "reusable.xsd" that is included in the current
distribution package of the DDI-Lifecycle 3.2:


http://www.pastebucket.com/72481







--
Wendy L. Thomas                              Phone: +1 612.624.4389
Data Access Core Director                 Fax:   +1 612.626.8375
Minnesota Population Center             Email: wlt at umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455