[DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable
Wackerow, Joachim
Joachim.Wackerow at gesis.org
Sat Mar 16 06:40:52 EDT 2019
Wendy,
Yes, this all makes complete sense to me. Disco can indicate per code or value whether it is missing or not. This seems sufficient to represent 2.5 and 3.1.
The question is whether it should also represent 3.2 in this regard. Is the more advanced description of missing values really important for a data search?
A related change would require some work.
I am inclined to add a limitations section to Disco stating that the ConceptualVariable of 3.2 and the advanced description of missing values in 3.2 cannot be represented.
This is my understanding of 3.2:
Variable has VariableRepresentation 0..1
VariableRepresentation has ValueRepresentation 0..1 (i.e. CodeRepresentation, NumericRepresentation) and MissingValuesReference 0..1
CodeRepresentation has CodeListReference
CodeListReference points to CodeList
CodeList has Code 0..*
NumericRepresentation has NumberRange 0..*
MissingValuesReference points to ManagedMissingValuesRepresentation
ManagedMissingValuesRepresentation has MissingCodeRepresentation 0..*, MissingNumericRepresentation 0..*
MissingCodeRepresentation has CodeListReference 0..1
MissingNumericRepresentation has NumberRange 0..*
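The containment listed above can be sketched in Python as a reading aid. This is only an illustration under assumptions: the class names follow the element names in the list, but the attribute names, the (low, high) tuple used for NumberRange, the direct object links standing in for the references, and the is_missing helper are all hypothetical and not part of DDI 3.2 or Disco.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

NumberRange = Tuple[float, float]  # assumed shape: inclusive (low, high)


@dataclass
class Code:
    value: str
    label: str = ""


@dataclass
class CodeList:
    codes: List[Code] = field(default_factory=list)  # Code 0..*


@dataclass
class CodeRepresentation:
    code_list: CodeList  # stands in for CodeListReference -> CodeList


@dataclass
class NumericRepresentation:
    number_ranges: List[NumberRange] = field(default_factory=list)  # 0..*


@dataclass
class MissingCodeRepresentation:
    code_list: Optional[CodeList] = None  # CodeListReference 0..1


@dataclass
class MissingNumericRepresentation:
    number_ranges: List[NumberRange] = field(default_factory=list)  # 0..*


@dataclass
class ManagedMissingValuesRepresentation:
    missing_codes: List[MissingCodeRepresentation] = field(default_factory=list)
    missing_numerics: List[MissingNumericRepresentation] = field(default_factory=list)


@dataclass
class VariableRepresentation:
    # ValueRepresentation 0..1 and MissingValuesReference 0..1
    value_representation: Optional[Union[CodeRepresentation, NumericRepresentation]] = None
    missing_values: Optional[ManagedMissingValuesRepresentation] = None


@dataclass
class Variable:
    name: str
    representation: Optional[VariableRepresentation] = None  # 0..1


def is_missing(variable: Variable, value: str) -> bool:
    """Hypothetical helper: is `value` declared missing for `variable`?"""
    rep = variable.representation
    if rep is None or rep.missing_values is None:
        return False
    managed = rep.missing_values
    # Check the missing code lists first.
    for missing_code_rep in managed.missing_codes:
        code_list = missing_code_rep.code_list
        if code_list and any(code.value == value for code in code_list.codes):
            return True
    # Then check the missing numeric ranges, if the value is numeric.
    try:
        number = float(value)
    except ValueError:
        return False
    return any(
        low <= number <= high
        for missing_numeric_rep in managed.missing_numerics
        for (low, high) in missing_numeric_rep.number_ranges
    )


# Example: a numeric variable with substantive range 0..120 and
# missing (sentinel) range 997..999. The values are made up.
age = Variable(
    "age",
    VariableRepresentation(
        value_representation=NumericRepresentation([(0, 120)]),
        missing_values=ManagedMissingValuesRepresentation(
            missing_numerics=[MissingNumericRepresentation([(997, 999)])]
        ),
    ),
)
```

In this sketch the substantive values and the missing values sit in separate branches of VariableRepresentation, which is the part of 3.2 that a per-code or per-value missing flag does not mirror one to one.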
Achim
From: Wendy Thomas [mailto:wlt at umn.edu]
Sent: Friday, 15 March 2019 15:40
To: Wackerow, Joachim
Cc: DDI Structural Reform Working Group.; Zapilko, Benjamin
Subject: Re: [DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable
RE: missing values
Achim,
As I recall, Disco was created prior to our division of substantive and sentinel values. Earlier DDI-Lifecycle used something similar to Codebook, where you could designate blank as missing and specify which values were missing, either by listing them in an attribute (3.1) or by indicating that a value in a catVal was determined to be missing. If Disco covers that, it will work with all versions of Codebook and Lifecycle. When we added the ability to show missing values as a separate representation, we did not remove the shorthand approach, so that major surgery on earlier versions was not required.
Wendy
On Fri, Mar 15, 2019 at 7:14 AM Wackerow, Joachim <Joachim.Wackerow at gesis.org> wrote:
Many thanks to Dan, Larry, and Wendy for your thoughts.
First, I would like to restate the frame of this discussion.
Our focus here is: what can we do for Disco as it currently stands?
The whole approach of Disco is to focus on a simple subset of DDI Codebook and DDI Lifecycle for Discovery purposes.
It is not a 1:1 representation of DDI Codebook or Lifecycle. It is not related to DDI 4 which is a moving target.
The intention is to finalize Disco, not to make Disco as good as or better than DDI 4.
Furthermore, any changes shouldn’t be extensive; that wouldn’t be affordable.
Re: ConceptualVariable
My thinking here is similar to Wendy's. For the purpose of Disco, i.e. for searches on specific data, does the ConceptualVariable really add substantial value?
I tend to leave the current Disco structure unchanged.
Re: missing values for numeric response domain
An actionable item (math expression) looks like the right way to go. My impression is that Disco has Representation but doesn’t distinguish between categorical and numeric representation.
Would the simple approach be that Representation has a property (i.e. missingValue) with the type skos:Concept?
See the variable diagram of Disco: https://raw.githubusercontent.com/linked-statistics/disco-spec/master/diagrams/variable.png
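To make the suggested property concrete, here is a minimal sketch that uses plain Python tuples as RDF-style triples. It rests on assumptions: disco:missingValue is only the property proposed in this message and not an existing Disco term, the ex: resources and their notations and labels are invented for illustration, and missing_notations is a hypothetical helper.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as prefixed names

# A Representation that points at two skos:Concept resources through the
# proposed (hypothetical) disco:missingValue property.
triples = [
    ("ex:ageRepresentation", "rdf:type", "disco:Representation"),
    ("ex:ageRepresentation", "disco:missingValue", "ex:refused"),
    ("ex:ageRepresentation", "disco:missingValue", "ex:dontKnow"),
    ("ex:refused", "rdf:type", "skos:Concept"),
    ("ex:refused", "skos:notation", "998"),
    ("ex:refused", "skos:prefLabel", "Refused"),
    ("ex:dontKnow", "rdf:type", "skos:Concept"),
    ("ex:dontKnow", "skos:notation", "999"),
    ("ex:dontKnow", "skos:prefLabel", "Don't know"),
]


def missing_notations(graph: Iterable[Triple], representation: str) -> Set[str]:
    """Collect the skos:notation values of all concepts that a given
    representation marks as missing via disco:missingValue."""
    graph = list(graph)
    concepts = {
        obj
        for subj, pred, obj in graph
        if subj == representation and pred == "disco:missingValue"
    }
    return {
        obj
        for subj, pred, obj in graph
        if subj in concepts and pred == "skos:notation"
    }
```

Because the property would sit on Representation itself, the same lookup works whether the representation is categorical or numeric, which fits the observation that Disco does not distinguish the two.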
Achim
From: Wendy Thomas [mailto:wlt at umn.edu]
Sent: Thursday, 14 March 2019 15:29
To: Wackerow, Joachim
Cc: DDI Structural Reform Working Group.; Zapilko, Benjamin
Subject: Re: [DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable
I think it is not as important to have that step in the hierarchy in Disco. The purpose of Disco, at least initially, was to facilitate discovery of data and related metadata in an RDF environment. What seems important is that one can locate a concept and link it to an existing variable. The value of a Represented Variable in the discovery process is the ability to track variable reuse across iterations of a study, or to track a common variable, such as the U.S. OMB definition of the Race variable, across studies. Unless there is some discovery advantage to exposing a Conceptual Variable, I don't think its expression in Disco is vital.
Wendy
On Thu, Mar 14, 2019 at 7:29 AM Wackerow, Joachim <Joachim.Wackerow at gesis.org> wrote:
Benjamin Zapilko and I are currently reviewing the open issues of Disco. The goal is to resolve the issues and finally prepare Disco for publication.
Now I have questions on two issues:
--
There is an issue on how to describe missing values for a numeric response domain.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/130
My question:
Is something really missing in Disco? I don’t have that impression, but maybe I misunderstood something.
--
The other issue is that the conceptual variable of DDI 3.2 does not exist in Disco.
The hierarchy is only Variable, RepresentedVariable, skos:Concept.
For details and my comment, see https://github.com/linked-statistics/disco-spec/issues/226
The whole approach of Disco is to focus on a simple subset of DDI Codebook and DDI Lifecycle for Discovery purposes. It is not a 1:1 representation.
My question:
Is it really important to be able to search for the ConceptualVariable in addition to Variable, RepresentedVariable, and Concept?
Is this addition really worth it? This might result in some work for Disco.
Any thoughts would be helpful.
Thanks
Achim
_______________________________________________
DDI-SRG mailing list
DDI-SRG at icpsr.umich.edu
http://lists.icpsr.umich.edu/mailman/listinfo/ddi-srg
--
Wendy L. Thomas Phone: +1 612.624.4389
Data Access Core Director Fax: +1 612.626.8375
Minnesota Population Center Email: wlt at umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455