[DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable
Hoyle, Larry
larryhoyle at ku.edu
Thu Mar 14 09:22:02 EDT 2019
Issue 1
In DDI4 we used an approach from ISO/IEC 11404 to describe the datatype of a missing value: the missing values can be enumerated, described, or both. As the comments on the issue note, sometimes missing values cannot be described with a list (e.g. a number greater than 9, or a negative number; both would be infinite lists). Some missing value domains are really a mixture (e.g. 9 or <0). The description could be verbal or actionable (e.g. a regular expression or a mathematical expression like <0).
Both the description and the enumeration could be extensions of a Concept. What seems to be unclear in concept.png is the explicit use of a description.
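To make the "mixture" case concrete, here is a rough sketch in Python of a missing value domain that carries both an enumeration and an actionable description. The class name SentinelValueDomain and its fields are made up for illustration; they are not DDI4 or Disco names, and the predicate stands in for whatever actionable form (regular expression, mathematical expression) a model would really carry.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class SentinelValueDomain:
    """Hypothetical missing-value domain: an enumeration, a description, or both."""

    # Explicitly listed missing values (may be empty).
    enumerated: FrozenSet[float] = frozenset()
    # Verbal description, kept for human readers.
    description: str = ""
    # Actionable description; assumed here to be a plain Python predicate.
    predicate: Optional[Callable[[float], bool]] = None

    def contains(self, value: float) -> bool:
        """Return True if the value is listed or satisfies the predicate."""
        if value in self.enumerated:
            return True
        return self.predicate is not None and self.predicate(value)


# The "9 or <0" example: one enumerated code plus an open-ended range.
mixed = SentinelValueDomain(
    enumerated=frozenset({9}),
    description="9 or <0",
    predicate=lambda v: v < 0,
)
```

With this, mixed.contains(9) and mixed.contains(-3.5) are both true and mixed.contains(4) is false, which a finite list alone could not express.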
Issue 2
Again looking at what we did in DDI4, what distinguishes a ConceptualVariable from a Concept is the ability to tie together a Concept, a UnitType, and conceptual domains for the measure and sentinel (e.g. missing) values. Those domains might be the difference between the concept of gender and the concept of gender (of people) with only two categories, male and female.
Of course you might say that "dichotomous person gender" is itself a concept. We have political fights over what gender means, and those fights include the conceptual value domain. But if all of that is described only in a text string, it might be more difficult for software to discover when you are searching for data on persons only.
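As a rough sketch of why the structured form helps discovery, the following Python fragment ties a concept, a unit type, and the two conceptual domains together and then filters on unit type. The field names, the helper find_by_unit_type, and the catalog entries are all invented for illustration; nothing here is taken from the DDI4 or Disco models.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ConceptualVariable:
    """Hypothetical stand-in: a concept plus unit type and conceptual domains."""

    concept: str                            # e.g. "gender"
    unit_type: str                          # e.g. "Person"
    substantive_domain: Tuple[str, ...]     # categories for the measure
    sentinel_domain: Tuple[str, ...] = ()   # categories for missing values


def find_by_unit_type(
    variables: List[ConceptualVariable], unit_type: str
) -> List[ConceptualVariable]:
    """Return only the variables defined for the given unit type."""
    return [v for v in variables if v.unit_type == unit_type]


# Made-up catalog entries, only to exercise the search.
catalog = [
    ConceptualVariable("gender", "Person", ("male", "female"), ("refused",)),
    ConceptualVariable("sex", "Animal", ("male", "female")),
]

persons_only = find_by_unit_type(catalog, "Person")
```

Because the unit type is its own field, the search is an exact comparison and does not have to parse a label like "dichotomous person gender".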
From: ddi-srg-bounces at icpsr.umich.edu <ddi-srg-bounces at icpsr.umich.edu> On Behalf Of Wackerow, Joachim
Sent: Thursday, March 14, 2019 7:29 AM
To: DDI Structural Reform Working Group. <ddi-srg at icpsr.umich.edu>
Cc: Zapilko, Benjamin <Benjamin.Zapilko at gesis.org>
Subject: [DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable
Benjamin Zapilko and I are currently reviewing the open issues of Disco. The goal is to resolve the issues and finally prepare Disco for publication.
Now I have questions on two issues:
--
There is an issue on how to describe missing values for a numeric response domain.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/130.
My question:
Is something really missing in Disco? I don't have that impression. But maybe I misunderstood something.
--
The other issue is that the conceptual variable of DDI 3.2 does not exist in Disco.
The hierarchy is only Variable, RepresentedVariable, skos:Concept.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/226.
The whole approach of Disco is to focus on a simple subset of DDI Codebook and DDI Lifecycle for Discovery purposes. It is not a 1:1 representation.
My question:
Is it really important to be able to search for the ConceptualVariable in addition to Variable, RepresentedVariable, and Concept?
Is this addition really worth it? It might result in some work for Disco.
Any thoughts would be helpful.
Thanks
Achim