[DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable

Gillman, Daniel - BLS Gillman.Daniel at bls.gov
Thu Mar 14 08:56:48 EDT 2019


Achim,

I have some reactions to your questions:


1)      The allowed values, or value domain, for a variable come in 2 kinds and 2 roles. They can be enumerated (all allowed values are listed) or described (some expression or rule determines which values are valid). Value domains also have roles: the substantive domain for the allowed subject matter values, and the sentinel domain for missing values. Separating the substantive and sentinel domains makes data management easier. Finally, in either case (substantive or sentinel) a value domain can be mixed, with some values enumerated and some described. So, in the case of a numeric response domain, it sounds as though the numeric domain is substantive and the missing values are sentinels. Manage them separately.

2)      Whether you will miss the conceptual variable depends on the use cases and on the complexity of the variables you want to describe. In general, I'd say keep it. A concept that sits above the conceptual variables can be used to group semantically related variables. Consider the concept income. This could include conceptual variables covering wages (money earned directly for work), income (wages plus investment and retirement income), and compensation (income plus monetary value of benefits). Each will have its own conceptual variable, yet the concept linking them is semantically useful.
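
To make point 1 a bit more concrete, here is a minimal Python sketch of keeping the substantive and sentinel domains separate for a numeric response. It is only an illustration, not anything defined by Disco or DDI: the names ValueDomain and classify, the age range 0 to 120, and the missing-value codes -9 and -8 are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class ValueDomain:
    """Hypothetical value domain: enumerated, described, or mixed."""

    # Values that are listed explicitly (enumerated part).
    enumerated: FrozenSet[object] = frozenset()
    # Rule that decides whether a value is valid (described part).
    described: Optional[Callable[[object], bool]] = None

    def contains(self, value: object) -> bool:
        if value in self.enumerated:
            return True
        return self.described is not None and self.described(value)


def classify(value: object, substantive: ValueDomain, sentinel: ValueDomain) -> str:
    """Return 'sentinel', 'substantive', or 'invalid' for one response value."""
    # The sentinel domain is checked first so that missing-value codes are
    # never mistaken for subject matter values.
    if sentinel.contains(value):
        return "sentinel"
    if substantive.contains(value):
        return "substantive"
    return "invalid"


# Assumed example: age in years as a described (rule-based) substantive domain.
AGE_SUBSTANTIVE = ValueDomain(
    described=lambda v: isinstance(v, int) and 0 <= v <= 120
)

# Assumed example: two enumerated missing-value codes as the sentinel domain.
AGE_SENTINEL = ValueDomain(enumerated=frozenset({-9, -8}))

# A mixed domain: one enumerated value plus a rule for everything else.
MIXED = ValueDomain(
    enumerated=frozenset({999}),
    described=lambda v: isinstance(v, int) and 0 <= v <= 120,
)
```

With these definitions, classify(35, AGE_SUBSTANTIVE, AGE_SENTINEL) falls in the substantive domain and classify(-9, AGE_SUBSTANTIVE, AGE_SENTINEL) in the sentinel domain, so the two can be managed separately.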

Yours,
Dan

From: ddi-srg-bounces at icpsr.umich.edu [mailto:ddi-srg-bounces at icpsr.umich.edu] On Behalf Of Wackerow, Joachim
Sent: Thursday, March 14, 2019 8:29 AM
To: DDI Structural Reform Working Group. <ddi-srg at icpsr.umich.edu>
Cc: Zapilko, Benjamin <Benjamin.Zapilko at gesis.org>
Subject: [DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable

Benjamin Zapilko and I are currently reviewing the open issues of Disco. The goal is to resolve the issues and finally prepare Disco for publication.

Now I have questions on two issues:

--
There is an issue on how to describe missing values for a numeric response domain.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/130.

My question:
Is something really missing in Disco? I don't have that impression. But maybe I misunderstood something.

--
The other issue is that the conceptual variable of DDI 3.2 does not exist in Disco.
The hierarchy is only Variable, RepresentedVariable, skos:Concept.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/226.

The whole approach of Disco is to focus on a simple subset of DDI Codebook and DDI Lifecycle for Discovery purposes. It is not a 1:1 representation.

My question:
Is it really important to be able to search for the ConceptualVariable in addition to Variable, RepresentedVariable, and Concept?
Is this addition really worth it? This might result in some work for Disco.
Any thoughts would be helpful.

Thanks
Achim

