[DDI-users] A "home" for the DDI

Mark R. Diggory ddi-users@icpsr.umich.edu
Tue, 19 Aug 2003 13:12:55 -0400


ikuo@icpsr.umich.edu wrote:
> Quoting "Mark R. Diggory" <mdiggory@latte.harvard.edu>:
>>Again, this embedding is just a characteristic capability of XSD; it's 
>>just the capability of an XSD "container" element: you can place 
>>any other content into that container element (in OAI's approach, they 
>>require it to carry a schema that can be validated against). From our 
>>experience, no major modification to the content's XSD (the DDI 
>>in this case) needs to occur to make it "embeddable" in OAI.
>>
>>I think the issue is actually introducing this same container-like 
>>functionality into the DDI, for example, with the DDI containing MARCXML 
>>markup or Dublin Core XML markup; that would totally break any 
>>compatibility. It would be impossible to maintain a comparable DTD if 
>>such features were introduced. I'm sure there are other features that I'm 
>>not strongly familiar with that would have such requirements as well.
> 
> 
> Let's think about this in a little more detail. If you use DDI as a wrapper, 
> then you're right, you don't need to make structural changes; you can just use 
> namespaces. But let me say that I don't think that the DDI should be used to 
> wrap one big MARCXML or Dublin Core document. To put it another way, I don't 
> think that the DDI should be used as some kind of envelope protocol the way 
> that SOAP is or that TCP/IP is. My feeling is that an envelope type of 
> protocol should be lightweight and single-minded. The DDI is just too big and 
> clumsy to be the wrapper of a large object. Also, this kind of usage of the 
> DDI does not offer any advantages over simply keeping the DDI and MARCXML or 
> Dublin Core in separate documents but zipped together for easy transport. The 
> value of the embedding has to come from some interaction between DDI and the 
> other XML that's embedded in it.

Very true on all counts. Those were probably poor examples. A more 
realistic example comes from ideas we've had about embedding our own 
elements into the DDI for administrative management purposes inside the 
VDC itself.

For example, in our meetings we have occasionally explored the 
following kind of embedding:

<codeBook>
...
<otherMat xmlns:vdc="..." ...>
    <vdc:category>foo</vdc:category>
    <vdc:class>bar</vdc:class>
    <vdc:mimetype>bam</vdc:mimetype>
</otherMat>
</codeBook>

as opposed to

<codeBook>
...
<otherMat xmlns:vdc="..." ...>
    <notes type="vdc:category">foo</notes>
    <notes type="vdc:class">bar</notes>
    <notes type="vdc:mimetype">bam</notes>
</otherMat>
</codeBook>


> 
> The reason that DDI should not be used as a container/wrapper for MARCXML or 
> DCXML is that they all have overlapping concerns. On the other hand, it's OK 
> if DDI contains XSL-FO or XHTML, because the concerns of DDI don't overlap 
> the display concerns of XSL-FO or XHTML (I'm ignoring the well-intentioned but 
> misguided inclusion of formatting TEI tags). This separation of concerns is 
> really the same principle underlying the 7 layers of the transport protocol -- 
> each layer has its own concern so they can be independent.

This is another strong example of embedding, yes.

You are right about overlapping concerns; this is why there is 
significant work being done to provide "crosswalks" between the various 
formats: MARCXML, Dublin Core XML, ..., and DDI. This is a critical 
aspect of the OAI-PMH protocol as well.

> 
> So, in my opinion, if the DDI is to contain MARCXML or Dublin Core, then the 
> external XML has to be broken up and stored in individual elements of the DDI. 
> This would require structural changes because the concepts in MARCXML or 
> Dublin Core aren't organized in the same way (we're finding this out as the 
> DDI is trying to harmonize itself with ISO 11179. Many attributes match up but 
> they're organized differently). XSD offers the capability of namespaces, but 
> in my opinion, mixing namespaces doesn't do any good unless the elements in 
> the individual namespaces play nicely with each other.
> 

This is a very important project that the technical committee could 
"spearhead". If I could identify the issues that a technical committee 
should be addressing, I think it would follow these lines:

A.) Addressing the technical issues involved in mapping between DDI and 
other standard metadata formats: identifying and authoring appropriate 
"crosswalks" from DDI to MARCXML, Simple Dublin Core, METS, etc.; 
providing technical implementations of those crosswalks (say, in XSLT); 
and recommending appropriate changes to the DDI to make it more 
"amenable" to crosswalking.

B.) Establishing "best practices" for embedding content from other 
namespaces in the DDI: which formats are allowed, and which 
elements act as containers for them? XSD provides both 1.) very 
selective control over which namespaces are allowed in an element and 2.) 
very generic control, where an element can contain just about any markup 
in any namespace. So there is a wide range of control that can be 
applied here. What is the best practice for allowing such 
content expansion?

C.) Providing expertise on which elements and attributes in the DDI 
could have XSD datatyping applied to them; dates and IDs are examples.
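
To make points B and C a little more concrete, here is a rough XSD sketch 
of the two kinds of wildcard control and of simple datatyping. It is only 
an illustration, not a proposal for the actual DDI schema: the namespace 
URI http://example.org/vdc is a made-up placeholder, and the choice of 
otherMat, notes and prodDate as the elements to show it on is my own 
assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <!-- 1.) Selective control: only elements from one listed namespace
       may appear inside otherMat. The URI is a placeholder. -->
  <xs:element name="otherMat">
    <xs:complexType>
      <xs:sequence>
        <xs:any namespace="http://example.org/vdc"
                processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Datatyping an identifier: xs:ID values must be unique
           within the document. -->
      <xs:attribute name="ID" type="xs:ID"/>
    </xs:complexType>
  </xs:element>

  <!-- 2.) Generic control: notes may hold text mixed with markup
       from any namespace, and that markup is not validated. -->
  <xs:element name="notes">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:any namespace="##any"
                processContents="skip"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Datatyping a date: the date attribute must be a full
       calendar date such as 2003-08-19. -->
  <xs:element name="prodDate">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="date" type="xs:date"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Between these two extremes, the namespace attribute of xs:any also 
accepts "##other" or a list of several URIs, and processContents can be 
"strict", "lax" or "skip". Note that xs:date would reject partial dates 
(a year alone, say), so a looser type may be needed if those have to 
remain valid.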

> On the other hand, the OAI seems to be a good candidate for a protocol used to 
> wrap DDI, MARCXML, Dublin Core, etc., but I don't know if it's designed that 
> way.

The default "record" format that all OAI providers need to support within 
the wrapper protocol is "Simple Dublin Core", so our providers need 
to crosswalk from DDI to this Dublin Core XML format. A provider can 
then supply other "crosswalks" and disseminate its content in any 
number of other formats, which the client can query and select from. You're 
right, the OAI-PMH wrapper is basically used as a "transport" protocol 
for the various dissemination formats.
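
As a minimal sketch of what such a DDI to Simple Dublin Core crosswalk 
could look like in XSLT: this is not the stylesheet we actually run, the 
choice of which DDI elements feed which Dublin Core elements is my own 
assumption, and it assumes a DTD-style DDI instance with no namespace on 
its elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

  <xsl:output method="xml" indent="yes"/>

  <!-- One DDI codeBook becomes one oai_dc:dc record. -->
  <xsl:template match="/codeBook">
    <oai_dc:dc>
      <!-- Study title(s) map to dc:title. -->
      <xsl:for-each select="stdyDscr/citation/titlStmt/titl">
        <dc:title><xsl:value-of select="normalize-space(.)"/></dc:title>
      </xsl:for-each>
      <!-- Authoring entities map to dc:creator. -->
      <xsl:for-each select="stdyDscr/citation/rspStmt/AuthEnty">
        <dc:creator><xsl:value-of select="normalize-space(.)"/></dc:creator>
      </xsl:for-each>
      <!-- Study identification numbers map to dc:identifier. -->
      <xsl:for-each select="stdyDscr/citation/titlStmt/IDNo">
        <dc:identifier><xsl:value-of select="normalize-space(.)"/></dc:identifier>
      </xsl:for-each>
      <!-- Abstracts map to dc:description. -->
      <xsl:for-each select="stdyDscr/stdyInfo/abstract">
        <dc:description><xsl:value-of select="normalize-space(.)"/></dc:description>
      </xsl:for-each>
    </oai_dc:dc>
  </xsl:template>

</xsl:stylesheet>
```

The mapping is lossy in this direction: most of the DDI (variables, file 
descriptions, and so on) has no home in Simple Dublin Core, which is one 
reason a provider would want to offer the full DDI as an additional 
dissemination format.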

-Mark