[DDI-users] A "home" for the DDI

Mark R. Diggory ddi-users@icpsr.umich.edu
Wed, 13 Aug 2003 14:21:42 -0400


I'd like to get back to this subject matter:

ikuo@icpsr.umich.edu wrote:

>>2.) Considering the above situation, versioning is critical to managing
>>the technical implementations of the DDI above and beyond its conceptual
>>versioning. While the currently generated w3c schemas on the DDI site do
>>meet the DDI specification, I can tell you right now, there are a number
>>of errors which I have corrected in my own versions of the w3c schema
>>that are not reflected in the current versions released on the ICPSR DDI
>>site. These were only discovered through discussion and interaction
>>between Matthew Richardson, Sanda Ionescu and myself. We have had a few
>>discussions concerning where to track these changes and where to house
>>these w3c schema copies/versions. Clearly there is a technical and
>>development related versioning issue here that is above that of
>>conceptual versioning. I would contend that this should be appropriately
>>tracked; in my opinion CVS is the best tool to manage this.
> 
> 
> This would, I think, be an appropriate use of CVS, because you have multiple 
> people working on the same set of documents. All right, so you've convinced me. 
> 
> Now, if you were just to set up a private CVS, I don't think there would be any 
> objection to it. But as you say, it would be far easier to set this up via 
> SourceForge. However, whether this process should be placed on SourceForge with 
> an open license is something that would require the blessing of the DDI 
> Council. I'd certainly like to see this made an item on the next Council agenda.
> 

Does it necessarily need to be a private facility? In terms of
SourceForge, the ability to "modify" the contents of the CVS repository
is controlled by the admin/development team, and this control is very
private and secure. One would set up a core group of administrators to
manage the project and control the access rights of the developers who
are doing the modifications.

It's only read access to the content itself that is exposed to the
public. This transparency has its value: the users of the DDI may not
necessarily be its developers or even on its committee. Exposing this
process to the public provides room for comment from such users "prior"
to a version release (not after). Relating positively to user needs
is a critical aspect of producing a spec that the community will
continue to want to use.

> 
>>4.) I required the development of a w3c Schema for many specific reasons 
>>to deal with the limitations in the integration of DTD based XML 
>>content. OAI's Harvesting Protocol is the largest requirement for xsd 
>>based validation and a central location for DDI w3c schemas. It would 
>>be a false statement to suggest that these w3c Schema implementations 
>>had any Council involvement beyond my direct interaction with the DDI 
>>group. Unfortunately, I was not able to attend the Conference last 
>>month, so I do not know if any discussions occurred around this subject 
>>at the meeting.
> 
> 
> I'm not familiar with OAI Harvesting Protocol so I'm not sure how it mandates 
> the use of a schema, but your problem seems to be a general problem of ensuring 
> that the markup meets certain standards. You've chosen (or OAI mandates that 
> you choose) to use XSD to do this.
> 
> I have a similar problem in that my application accepts DDI documents to be 
> input into the database so that a search can be performed on the variable 
> level. However, I've found that documents that validate to the DTD do not 
> necessarily provide high-enough quality markup for a search to be effective. 
> My solution to the problem, which I've started working on, is an XSLT 
> quality-checker stylesheet to supplement DTD validation.
> 
> The advantages of this XSLT approach are:
> 
>   - I can restrict attribute type even if the specifications do not.
>   - I can check validity, type, and number of ID references. Even XSD cannot do 
> this, and there's a lot more I can do via XSLT that XSD will never be able to 
> do.
>   - While this "XSLT validation" overlaps with DDI specification development, 
> it does not actually conflict with it. I will have no problems if and when the 
> DDI becomes an XSD. Using your approach, because a single XML document cannot 
> validate against two XSDs (unless namespaces are used), you would have to 
> abandon your XSD when an official DDI XSD came out which did not adopt all your 
> suggested changes.
>   - Because I'm not writing a DTD or XSD, I don't need pre-approval from the 
> Council, I can go ahead and start working.
>   - I'm effectively separating validation into mostly "routine validation" to 
> be handled by DTD/XSD, and a little bit of "custom validation" to be handled 
> by XSLT.
> 
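
The kind of "custom validation" described above can be sketched outside
XSLT too. What follows is only a minimal illustration in Python, not the
stylesheet being discussed: it checks that each ID reference resolves,
points at the expected kind of element, and does not exceed a maximum
count. The helper name check_id_refs, the sample document, and the use of
the sdatrefs attribute pointing at sumDscr are assumptions made for the
example.

```python
import xml.etree.ElementTree as ET


def check_id_refs(root, ref_attr, expected_tag, max_refs=None):
    """Return a list of problems found in one IDREFS-style attribute.

    Hypothetical helper: ref_attr is the attribute holding
    whitespace-separated ID references, expected_tag is the element
    type those references are supposed to point at.
    """
    # Index every element that carries an ID attribute.
    ids = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    problems = []
    for el in root.iter():
        value = el.get(ref_attr)
        if value is None:
            continue
        refs = value.split()
        # Check the number of references.
        if max_refs is not None and len(refs) > max_refs:
            problems.append(
                f"{el.tag}: {len(refs)} references in {ref_attr}, "
                f"at most {max_refs} allowed"
            )
        for ref in refs:
            target = ids.get(ref)
            if target is None:
                # Check that the reference resolves at all.
                problems.append(
                    f"{el.tag}: {ref_attr} points to unknown ID {ref!r}"
                )
            elif target.tag != expected_tag:
                # Check the type of the referenced element.
                problems.append(
                    f"{el.tag}: {ref_attr} points to <{target.tag}>, "
                    f"expected <{expected_tag}>"
                )
    return problems


# Illustrative, heavily simplified document (not a complete DDI instance).
SAMPLE = """<codeBook>
  <sumDscr ID="S1"/>
  <var ID="V1" sdatrefs="S1"/>
  <var ID="V2" sdatrefs="V1"/>
  <var ID="V3" sdatrefs="S9"/>
</codeBook>"""

problems = check_id_refs(ET.fromstring(SAMPLE), "sdatrefs", "sumDscr")
for problem in problems:
    print(problem)
```

Here V2 is flagged because its reference resolves to the wrong kind of
element, and V3 because its reference does not resolve at all.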

An interesting idea. I expect that any development in the area of XSD
would currently be required to maintain a certain amount of "backward
compatibility", in terms of validation against the DTD implementation
for that version of the DDI.

But in reality, there are technical limits to how far that can go:

1.) More "specific" control of content where it is appropriate. This can
maintain "backward compatibility": an XML document validated against the
XSD would also be valid against the DTD, but not necessarily the other
way around.

2.) Embedding the DDI into other XML content and embedding other content
into the DDI (such as OAI). This would not be able to maintain backward
compatibility because DTD just doesn't support anything near this
capability.
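
To make point 2 concrete, here is a minimal sketch in Python of DDI
content embedded in an OAI record by means of XML namespaces. The OAI
namespace URI is the one used by OAI-PMH 2.0; the DDI namespace URI is a
hypothetical placeholder, and the codeBook/titl nesting is flattened for
illustration rather than taken from the real DDI structure.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DDI_NS = "http://example.org/ddi"  # hypothetical placeholder, not the real DDI namespace

# Choose readable prefixes for serialization.
ET.register_namespace("oai", OAI_NS)
ET.register_namespace("ddi", DDI_NS)

# Outer OAI record with a metadata container.
record = ET.Element(f"{{{OAI_NS}}}record")
metadata = ET.SubElement(record, f"{{{OAI_NS}}}metadata")

# DDI content embedded inside the OAI metadata element.
code_book = ET.SubElement(metadata, f"{{{DDI_NS}}}codeBook")
titl = ET.SubElement(code_book, f"{{{DDI_NS}}}titl")
titl.text = "Example study"

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# A consumer can pick out the DDI elements by namespace alone.
parsed = ET.fromstring(xml_text)
ddi_elements = [
    el.tag for el in parsed.iter() if el.tag.startswith(f"{{{DDI_NS}}}")
]
print(ddi_elements)
```

Because each vocabulary lives in its own namespace, a schema-aware
processor can validate the OAI wrapper and the DDI payload against
separate schemas.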

> 
> I would love to have a developer community associated with the DDI project. 
> However, I think a serious obstacle to that is the orientation of the DDI 
> Group. From my point of view, a lot of the changes have been made without 
> taking into account how difficult the changes would be to implement. It 
> seems to me that to the DDI Group, "difficult to implement" means "difficult to 
> mark up a study", whereas to me, a developer, it should also mean "difficult to 
> get a machine to process the resultant markup".
> 

This is really where the process of a technical committee would provide a
"control" in terms of the "usability" of the DDI. I think this is very
necessary for the DDI's future.


-Mark