[DDI-SRG] [disco] issues on missing values for numeric response domain and on conceptual variable

Wendy Thomas wlt at umn.edu
Sat Mar 16 12:56:12 EDT 2019


Achim,
Your understanding is correct. Note that 3.2 also retains the 3.1 means of
identifying missing values. Larry's point is also valid. The extended
details matter more for working with and analyzing the data
than for searching for data. Essentially the difference would be that
in 2.5 and 3.1 Disco would pull from a single depiction of a value
representation, while in 3.2 it should pull from both the value representation
and the missing value representation. Values drawn from a missing value
representation would then be flagged as missing in whatever way Disco
relays that. This would ensure that those using ONLY the new structure
could also map missing values to Disco. For the most part people don't
search for missing values when searching for data :-)
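
As a rough illustration of the two ways of pulling values described above,
here is a minimal Python sketch. It is not part of Disco or DDI; the names
FlaggedValue, from_single_depiction and from_split_depiction are
hypothetical, and codes are simplified to plain strings. The point is only
that both routes can end in the same per-value "missing" flag.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FlaggedValue:
    """One code or value plus a missing flag (hypothetical shape)."""
    value: str
    is_missing: bool


def from_single_depiction(values: Iterable[str],
                          missing: Iterable[str]) -> List[FlaggedValue]:
    """2.5 / 3.1 style: one list of values, some of them marked as missing."""
    missing_set = set(missing)
    return [FlaggedValue(v, v in missing_set) for v in values]


def from_split_depiction(substantive: Iterable[str],
                         sentinel: Iterable[str]) -> List[FlaggedValue]:
    """3.2 style: value representation and missing value representation
    are separate sources; everything from the second one is flagged."""
    flagged = [FlaggedValue(v, False) for v in substantive]
    flagged += [FlaggedValue(v, True) for v in sentinel]
    return flagged


# Same variable described both ways (made-up codes).
old_style = from_single_depiction(["1", "2", "8", "9"], missing=["8", "9"])
new_style = from_split_depiction(["1", "2"], sentinel=["8", "9"])
```

With these made-up codes, both calls yield the same flagged list, which is
the sense in which a 3.2-only description could still be mapped.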

Wendy

On Sat, Mar 16, 2019 at 5:41 AM Wackerow, Joachim <
Joachim.Wackerow at gesis.org> wrote:

> Wendy,
>
>
>
> Yes, this all makes absolute sense to me. Disco can indicate per code or
> value whether it is missing or not. This seems to be sufficient to represent
> 2.5 and 3.1.
>
>
>
> The question is whether it should represent 3.2 in this regard. Is the more
> advanced description of missing values really important for a data search?
>
>
>
> A related change would require some work.
>
>
>
> I am inclined to add a section on limitations to Disco, mentioning
> that the ConceptualVariable in 3.2 and the advanced description of missing
> values in 3.2 cannot be represented.
>
>
>
> This is my understanding of 3.2:
>
>
>
> Variable has VariableRepresentation 0..1
>
> VariableRepresentation has ValueRepresentation 0..1 (i.e.
> CodeRepresentation, NumericRepresentation) and MissingValuesReference 0..1
>
>
>
> CodeRepresentation has CodeListReference
>
> CodeListReference points to CodeList
>
> CodeList has Code 0..*
>
>
>
> NumericRepresentation has NumberRange 0..*
>
>
>
> MissingValuesReference points to ManagedMissingValuesRepresentation
>
> ManagedMissingValuesRepresentation has MissingCodeRepresentation 0..*,
> MissingNumericRepresentation 0..*
>
> MissingCodeRepresentation has CodeListReference 0..1
>
> MissingNumericRepresentation has NumberRange 0..*
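
The structure listed above can be sketched as Python dataclasses. This is
only an illustration of the cardinalities as stated in the list, not DDI
3.2 itself: the element names come from the list, while the Python
attribute names, the (low, high) tuple for NumberRange, and the helper
functions are assumptions. References are resolved directly to objects
for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

# Assumed simplification: a NumberRange is an inclusive (low, high) pair.
NumberRange = Tuple[float, float]


@dataclass
class CodeList:
    codes: List[str] = field(default_factory=list)  # Code 0..*


@dataclass
class CodeRepresentation:
    code_list: CodeList  # reached via CodeListReference


@dataclass
class NumericRepresentation:
    ranges: List[NumberRange] = field(default_factory=list)  # NumberRange 0..*


@dataclass
class MissingCodeRepresentation:
    code_list: Optional[CodeList] = None  # CodeListReference 0..1


@dataclass
class MissingNumericRepresentation:
    ranges: List[NumberRange] = field(default_factory=list)  # NumberRange 0..*


@dataclass
class ManagedMissingValuesRepresentation:
    missing_codes: List[MissingCodeRepresentation] = field(default_factory=list)
    missing_numerics: List[MissingNumericRepresentation] = field(default_factory=list)


@dataclass
class VariableRepresentation:
    # ValueRepresentation 0..1
    value_representation: Optional[
        Union[CodeRepresentation, NumericRepresentation]] = None
    # MissingValuesReference 0..1
    missing_values: Optional[ManagedMissingValuesRepresentation] = None


@dataclass
class Variable:
    representation: Optional[VariableRepresentation] = None  # 0..1


def missing_codes(variable: Variable) -> List[str]:
    """Collect every code listed under the missing values representation."""
    rep = variable.representation
    if rep is None or rep.missing_values is None:
        return []
    found: List[str] = []
    for mcr in rep.missing_values.missing_codes:
        if mcr.code_list is not None:
            found.extend(mcr.code_list.codes)
    return found


def is_missing_number(variable: Variable, x: float) -> bool:
    """True if x falls inside any range of a MissingNumericRepresentation."""
    rep = variable.representation
    if rep is None or rep.missing_values is None:
        return False
    return any(low <= x <= high
               for mnr in rep.missing_values.missing_numerics
               for (low, high) in mnr.ranges)


# Made-up example: a numeric variable with 998-999 reserved as missing.
age = Variable(VariableRepresentation(
    NumericRepresentation([(0, 120)]),
    ManagedMissingValuesRepresentation(
        missing_numerics=[MissingNumericRepresentation([(998, 999)])])))
```

Flattening this into a per-value flag, as Disco does today, would only
need the two helper functions; the rest of the hierarchy is what a
limitations section would say is not carried over.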
>
>
>
> Achim
>
>
>
> *From:* Wendy Thomas [mailto:wlt at umn.edu]
> *Sent:* Friday, 15 March 2019 15:40
> *To:* Wackerow, Joachim
> *Cc:* DDI Structural Reform Working Group.; Zapilko, Benjamin
> *Subject:* Re: [DDI-SRG] [disco] issues on missing values for numeric
> response domain and on conceptual variable
>
>
>
> RE: missing values
>
> Achim,
>
> As I recall, Disco was created prior to our division of substantive and
> sentinel values. Earlier DDI-Lifecycle used something similar to Codebook,
> where you could designate blank as missing and specify which values were
> missing either by listing them in an attribute (3.1) or by indicating whether
> a value in a catVal was determined to be missing. If Disco covers that, it
> will work with all versions of Codebook and Lifecycle. When we added the
> ability to show missing values as a separate representation, we did not
> remove the shorthand approach, so that major surgery on earlier versions was
> not required.
>
>
>
> Wendy
>
>
>
> On Fri, Mar 15, 2019 at 7:14 AM Wackerow, Joachim <
> Joachim.Wackerow at gesis.org> wrote:
>
> Many thanks to Dan, Larry, and Wendy for your thoughts.
>
>
>
> First, I would like to restate the frame of this discussion.
>
> Our focus here is: What can we do for Disco as it currently stands?
>
> The whole approach of Disco is to focus on a simple subset of DDI Codebook
> and DDI Lifecycle for Discovery purposes.
>
> It is not a 1:1 representation of DDI Codebook or Lifecycle. It is not
> related to DDI 4 which is a moving target.
>
> The intention is to finalize Disco, not to make Disco as good as or better
> than DDI 4.
>
> Furthermore, any changes shouldn’t be extensive. This wouldn’t be
> affordable.
>
>
>
> Re: ConceptualVariable
>
> My thinking here is similar to Wendy's. For the purpose of Disco, i.e.
> for searches on specific data, does the ConceptualVariable really add
> substantial value?
>
> I am inclined to leave the current Disco structure unchanged.
>
>
>
> Re: missing values for numeric response domain
>
> An actionable item (math expression) seems to be the right
> way to go. My impression is that Disco has Representation but doesn’t make
> a distinction between categorical and numeric representation.
>
> Would the simple approach be that Representation has a property (i.e.
> missingValue) with the type skos:Concept?
>
> See the variable diagram of Disco:
> https://raw.githubusercontent.com/linked-statistics/disco-spec/master/diagrams/variable.png
>
>
>
>
> Achim
>
>
>
> *From:* Wendy Thomas [mailto:wlt at umn.edu]
> *Sent:* Thursday, 14 March 2019 15:29
> *To:* Wackerow, Joachim
> *Cc:* DDI Structural Reform Working Group.; Zapilko, Benjamin
> *Subject:* Re: [DDI-SRG] [disco] issues on missing values for numeric
> response domain and on conceptual variable
>
>
>
> I think that it is not as important to have that step in the hierarchy in
> Disco. The purpose of Disco, at least initially, was to facilitate
> discovery of data and related metadata in an RDF environment. Since one can
> locate and link a concept to an existing variable, that is what seems to be
> important. The value of a Represented Variable in the discovery process is
> the ability to track variable reuse across iterations of a study, or a
> common variable, such as the U.S. OMB definition of the Race variable,
> across studies. Unless there is some discovery advantage to exposing a
> Conceptual Variable, I don't think its expression in Disco is vital.
>
>
>
> Wendy
>
>
>
> On Thu, Mar 14, 2019 at 7:29 AM Wackerow, Joachim <
> Joachim.Wackerow at gesis.org> wrote:
>
> Benjamin Zapilko and I are currently reviewing the open issues of Disco.
> The goal is to resolve the issues and finally prepare Disco for
> publication.
>
>
>
> Now I have questions on two issues:
>
>
>
> --
>
> There is an issue on how to describe missing values for a numeric response
> domain.
>
> For details and my comment, see
> https://github.com/linked-statistics/disco-spec/issues/130.
>
>
>
> My question:
>
> Is something really missing in Disco? I don’t have that impression.
> But maybe I misunderstood something.
>
>
>
> --
>
> The other issue is that the conceptual variable of DDI 3.2 does not exist
> in Disco.
>
> The hierarchy is only Variable, RepresentedVariable, skos:Concept.
>
> For details and my comment, see
> https://github.com/linked-statistics/disco-spec/issues/226.
>
>
>
> The whole approach of Disco is to focus on a simple subset of DDI Codebook
> and DDI Lifecycle for Discovery purposes. It is not a 1:1 representation.
>
>
>
> My question:
>
> Is it really important to be able to search for the ConceptualVariable in
> addition to Variable, RepresentedVariable, and Concept?
>
> Is this addition really worth it? It might result in some work for Disco.
>
> Any thoughts would be helpful.
>
>
>
> Thanks
>
> Achim
>
>
>
>
>
> _______________________________________________
> DDI-SRG mailing list
> DDI-SRG at icpsr.umich.edu
> http://lists.icpsr.umich.edu/mailman/listinfo/ddi-srg
>
>
>
> --
>
> Wendy L. Thomas                              Phone: +1 612.624.4389
>
> Data Access Core Director                 Fax:   +1 612.626.8375
>
> Minnesota Population Center             Email: wlt at umn.edu
>
> University of Minnesota
>
> 50 Willey Hall
>
> 225 19th Avenue South
>
> Minneapolis, MN 55455


-- 
Wendy L. Thomas                              Phone: +1 612.624.4389
Data Access Core Director                 Fax:   +1 612.626.8375
Minnesota Population Center             Email: wlt at umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455