[DDI-users] standard missing values via DDI

Wendy Thomas wlt at umn.edu
Wed Jun 25 10:31:50 EDT 2014


Does this DDI 3.2 structure do what you need? It can be referenced from any
variable, and it can be noted as the default missing values for a
LogicalRecord and a Physical Instance.

<r:ManagedMissingValuesRepresentation>
  <!-- note: I've left off the identification and other versionable type information -->
  <r:ManagedMissingValuesRepresentationName>Combined Missing Types</r:ManagedMissingValuesRepresentationName>
  <r:MissingCodeRepresentation>
    <r:RecommendedDataType>integer</r:RecommendedDataType>
    <r:CodeListReference/>  <!-- to a CodeList with name Missing at Random -->
  </r:MissingCodeRepresentation>
  <r:MissingCodeRepresentation>
    <r:RecommendedDataType>integer</r:RecommendedDataType>
    <r:CodeListReference/>  <!-- to a CodeList with name Missing by Design -->
  </r:MissingCodeRepresentation>
</r:ManagedMissingValuesRepresentation>
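
As a rough illustration of how a consumer might read the structure above, here is a minimal Python sketch. It assumes the r: prefix is bound to the namespace "ddi:reusable:3_2", and it adds that declaration to the snippet so it parses on its own. The function name summarize is hypothetical, and the CodeListReference elements stay empty, as in the snippet.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the r: prefix in DDI 3.2.
NS = {"r": "ddi:reusable:3_2"}

# The snippet from the message, with a namespace declaration added
# so that it can be parsed standalone.
SNIPPET = """
<r:ManagedMissingValuesRepresentation xmlns:r="ddi:reusable:3_2">
  <r:ManagedMissingValuesRepresentationName>Combined Missing Types</r:ManagedMissingValuesRepresentationName>
  <r:MissingCodeRepresentation>
    <r:RecommendedDataType>integer</r:RecommendedDataType>
    <r:CodeListReference/>
  </r:MissingCodeRepresentation>
  <r:MissingCodeRepresentation>
    <r:RecommendedDataType>integer</r:RecommendedDataType>
    <r:CodeListReference/>
  </r:MissingCodeRepresentation>
</r:ManagedMissingValuesRepresentation>
"""


def summarize(xml_text):
    """Return the representation name and the data type of each missing code block."""
    root = ET.fromstring(xml_text.strip())
    name = root.findtext("r:ManagedMissingValuesRepresentationName", namespaces=NS)
    data_types = [
        rep.findtext("r:RecommendedDataType", namespaces=NS)
        for rep in root.findall("r:MissingCodeRepresentation", NS)
    ]
    return name, data_types


if __name__ == "__main__":
    print(summarize(SNIPPET))
```

Each MissingCodeRepresentation would correspond to one type of missing value, so a reader could loop over them and resolve each CodeListReference once the references are filled in.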


On Wed, Jun 25, 2014 at 2:39 AM, Adrian Dușa <dusa.adrian at unibuc.ro> wrote:

> Dear All,
>
> Following a private discussion, an idea emerged that I think is useful
> to circulate and discuss.
>
> From what I understand, SAS codes special missing values as extremely low
> values, while Stata went the opposite way, coding them as extremely
> large values.
>
> Those are decisions which are software specific, and it is unlikely that
> other software packages will follow one trend or another.
>
> There might be a way to solve all particular needs, using DDI as a
> mediator and most importantly using only "normal" values.
>
> The main goal is to differentiate between types of missing values. In R, and
> I'm sure DDI can do that too, each variable can have a list of attributes
> attached. One component of that list could be dedicated to the missing
> values, and further differentiate among them:
> - "missing at random": 1, 5, 9
> - "missing by design": 8, 15, 78
>
> Here, the (simple integer) numbers 1, 5, 8, 9, 15 and 78 are nothing but
> the row indices (i.e. the cases) where the missing values
> reside in a particular variable.
>
> If I had this kind of information in the DDI XML file, I could then
> instruct my R function to create <specific> setup files for SAS or Stata
> using .r and .d in those specific cases, while in R all missing values
> could remain as simple NAs but users can still differentiate between
> missings by just looking at the list of attributes.
>
> This way it would also meet the other need, avoiding accidental mistakes,
> and it is both package independent and package specific at the same time,
> using DDI as an exchange platform.
>
> Recoding specific missing values is trivial in R, but I have to confess I
> don't know if and how this might be done in other software via setup files.
> People using specific software packages might confirm if this approach is
> possible or not. Raw data should be read by all packages from a .csv file
> where missing values are system missing (empty) values.
>
> Best wishes,
> Adrian
>
>
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> 1, Schitu Magureanu Bd.
> 050025 Bucharest sector 5
> Romania
> Tel.:+40 21 3126618 \
>         +40 21 3120210 / int.101
> Fax: +40 21 3158391
>
>
> _______________________________________________
> DDI-users mailing list
> DDI-users at icpsr.umich.edu
> http://lists.icpsr.umich.edu/mailman/listinfo/ddi-users
>
>

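The proposal in the quoted message (row indices per missing type, turned into package-specific setup commands) could be sketched roughly as follows. This is only an illustration in Python, not the R function Adrian describes. The variable name income, the dictionary layout, and the letters r and d are assumptions taken from the ".r and .d" example. The SAS line assumes the data are read sequentially in a DATA step, so that _N_ matches the row number.

```python
# Hypothetical sketch: names and structure are illustrative only.

# Row indices are 1-based, as in the quoted example.
MISSING_ROWS = {
    "missing at random": [1, 5, 9],
    "missing by design": [8, 15, 78],
}

# Assumed mapping from missing type to an extended/special missing letter.
MISSING_LETTER = {
    "missing at random": "r",
    "missing by design": "d",
}


def stata_commands(variable, missing_rows, letters):
    """Return one Stata `replace` command per listed row."""
    lines = []
    for kind, rows in missing_rows.items():
        code = "." + letters[kind]
        for row in rows:
            lines.append(f"replace {variable} = {code} in {row}")
    return lines


def sas_statements(variable, missing_rows, letters):
    """Return one SAS DATA step statement per missing type, keyed on _N_."""
    lines = []
    for kind, rows in missing_rows.items():
        code = "." + letters[kind].upper()
        row_list = ", ".join(str(r) for r in rows)
        lines.append(f"if _N_ in ({row_list}) then {variable} = {code};")
    return lines


if __name__ == "__main__":
    for line in stata_commands("income", MISSING_ROWS, MISSING_LETTER):
        print(line)
    for line in sas_statements("income", MISSING_ROWS, MISSING_LETTER):
        print(line)
```

The raw data stay free of package-specific codes. Only the generated setup file carries .r/.d (Stata) or .R/.D (SAS).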

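The other half of the quoted idea, reading a .csv where every missing value is simply empty and keeping the missing type as side information, might look like this. The data, the column names id and income, and the lookup table are all made up for illustration. Python's None stands in for R's NA.

```python
import csv
import io

# Hypothetical raw data: empty fields are system-missing values.
CSV_TEXT = "id,income\n1,\n2,1200\n3,950\n"

# Hypothetical attribute: 1-based row index -> missing type.
MISSING_TYPE = {1: "missing at random"}


def read_column(text, column):
    """Return the values of one column, with empty fields as None."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] or None for row in reader]


def describe_missing(values, missing_type):
    """Pair each missing row (1-based) with its recorded missing type."""
    result = []
    for index, value in enumerate(values, start=1):
        if value is None:
            result.append((index, missing_type.get(index, "unspecified")))
    return result


if __name__ == "__main__":
    income = read_column(CSV_TEXT, "income")
    print(income)
    print(describe_missing(income, MISSING_TYPE))
```

All missing values look the same in the data itself. The distinction lives only in the attribute table, which is the part that would come from the DDI file.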
-- 
Wendy L. Thomas                              Phone: +1 612.624.4389
Data Access Core Director                 Fax:   +1 612.626.8375
Minnesota Population Center             Email: wlt at umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455