[DDI-users] [DDI-SRG] ISSUE 602

Wendy Thomas wlt at umn.edu
Sun Jun 2 12:06:20 EDT 2013


Hence my question to the list: should we add an attribute to
DynamicText, @primaryLanguage type="xml:lang", or perhaps
audienceLanguage or something similar?

Your example indicates that we should. To disambiguate it from an
xml:lang attribute which designates the language of the content, this
would be used to designate the language of the intended audience.
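
As a rough sketch only, not an agreed resolution: the fragment below shows
how such an attribute might be declared on DynamicTextType. The name
audienceLanguage is just one of the names floated above, and xs:language
is my assumption for the type (type="xml:lang" as written above is not
itself a schema type). Annotations are left out for brevity.

```xml
<xs:complexType name="DynamicTextType">
  <xs:sequence>
    <xs:element ref="TextContent" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="isStructureRequired" type="xs:boolean" default="false"/>
  <!-- Hypothetical addition: language of the intended audience,
       as distinct from xml:lang on the content of each segment. -->
  <xs:attribute name="audienceLanguage" type="xs:language" use="optional"/>
</xs:complexType>
```

An instance might then look like this, assuming Text carries xml:lang
directly through its r:ContentType base:

```xml
<d:QuestionText audienceLanguage="en">
  <d:LiteralText>
    <d:Text xml:lang="en" xml:space="preserve">What is your understanding of the German word </d:Text>
  </d:LiteralText>
  <d:LiteralText>
    <d:Text xml:lang="de" isTranslatable="false">"Kölsch"?</d:Text>
  </d:LiteralText>
</d:QuestionText>
```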

On Sun, Jun 2, 2013 at 6:00 PM, Jeremy Iverson <jeremy at colectica.com> wrote:
> Excellent. This will give each LiteralText a Content element with xml:lang.
>
> There remains a separate problem. When we specify the language individually
> for each segment, it is impossible for a machine to know the actual language
> of the full question. For example, in a question with three segments, two in
> English and one in German:
>
>   QuestionText [lang = ???]
>     LiteralText
>       Text
>         Content lang="en"
>     LiteralText
>       Text
>         Content lang="de"
>     LiteralText
>       Text
>         Content lang="en"
>
> I see a few ways we could guess:
>
> - the language of the first segment
> - the language of the last segment
> - the language with the most words
> - pick one randomly
> - telephone the author and ask what they intended
>
> These are not precise. Putting the language on QuestionText allows us to
> explicitly state what we mean.
>
>
> On 6/2/2013 5:56 PM, Wendy Thomas wrote:
>>
>> I'll note correction of line 1588 from name= to ref=  which should
>> fix this.
>>
>>
>> On Sun, Jun 2, 2013 at 5:46 PM, Jeremy Iverson <jeremy at colectica.com>
>> wrote:
>>>
>>> Wendy,
>>>
>>> There is a bug here. The samples in Mantis issue #602 do not
>>> validate.
>>>
>>> In proposed3.2/schema/datacollection.xsd (svn revision 157,
>>> lines 1581-1596), LiteralTextType is extended from
>>> TextContentType, which gives it an r:Description element.
>>> LiteralTextType also defines an element named Text, which has no
>>> type information.
>>>
>>> Line 1588 should have a ref= instead of a name= to tie it to the
>>> type defined on line 1597. As it is now, the type of Text element
>>> is xs:anyType.
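>>>
>>> A minimal sketch of that correction, with annotations omitted and
>>> assuming the global Text element is the one declared with
>>> type="TextType":
>>>
>>> ```xml
>>> <xs:complexType name="LiteralTextType">
>>>   <xs:complexContent>
>>>     <xs:extension base="TextContentType">
>>>       <xs:sequence>
>>>         <!-- was: <xs:element name="Text">, a local element that
>>>              defaults to xs:anyType -->
>>>         <xs:element ref="Text"/>
>>>       </xs:sequence>
>>>     </xs:extension>
>>>   </xs:complexContent>
>>> </xs:complexType>
>>>
>>> <!-- Global declaration that ref="Text" resolves to -->
>>> <xs:element name="Text" type="TextType"/>
>>> ```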
>>>
>>>
>>>
>>> On 6/2/2013 5:06 PM, Wendy Thomas wrote:
>>>>
>>>>
>>>> See under LiteralText/Text  TextType extension base="r:Content"
>>>> @xml:space
>>>>
>>>> r:Content is the language specific subelement of
>>>> StructuredStringType
>>>>
>>>>
>>>> <xs:complexType name="DynamicTextType"> <xs:annotation>
>>>> <xs:documentation>Structure supporting the use of dynamic text,
>>>> where portions of the textual content change depending on
>>>> external information (pre-loaded data, response to an earlier
>>>> query, environmental situations, etc.).</xs:documentation>
>>>> </xs:annotation> <xs:sequence> <xs:element ref="TextContent"
>>>> maxOccurs="unbounded"> <xs:annotation> <xs:documentation>This is
>>>> the head of a substitution group and is never used directly as an
>>>> element name. Instead it is replaced with either LiteralText or
>>>> ConditionalText.</xs:documentation> </xs:annotation>
>>>> </xs:element> </xs:sequence> <xs:attribute
>>>> name="isStructureRequired" type="xs:boolean" default="false">
>>>> <xs:annotation> <xs:documentation>If textual structure (e.g.
>>>> size, color, font, etc.) is required to understand the meaning of
>>>> the content change value to "true".</xs:documentation>
>>>> </xs:annotation> </xs:attribute> </xs:complexType> <xs:element
>>>> name="TextContent" type="TextContentType" abstract="true">
>>>> <xs:annotation> <xs:documentation>Abstract type existing as the
>>>> head of a substitution group. May be replaced by any valid member
>>>> of the substitution group TextContent.</xs:documentation>
>>>> </xs:annotation> </xs:element> <xs:complexType
>>>> name="TextContentType" abstract="true"> <xs:annotation>
>>>> <xs:documentation>Abstract type existing as the head of a
>>>> substitution group. May be replaced by any valid member of the
>>>> substitution group TextContent. Provides the common element
>>>> Description to all members using TextContent as an extension
>>>> base.</xs:documentation> </xs:annotation> <xs:sequence>
>>>> <xs:element ref="r:Description" minOccurs="0"> <xs:annotation>
>>>> <xs:documentation>A description of the content and purpose of
>>>> the text segment. May be expressed in multiple languages and
>>>> supports the use of structured content.</xs:documentation>
>>>> </xs:annotation> </xs:element> </xs:sequence> </xs:complexType>
>>>> <xs:element name="LiteralText" type="LiteralTextType"
>>>> substitutionGroup="TextContent"> <xs:annotation>
>>>> <xs:documentation>A substitution for TextContent containing the
>>>> static (unchanging) text.</xs:documentation> </xs:annotation>
>>>> </xs:element> <xs:complexType name="LiteralTextType">
>>>> <xs:annotation> <xs:documentation>Literal (static) text to be
>>>> used in the instrument using the StructuredString structure plus
>>>> an attribute allowing for the specification of white space to be
>>>> preserved.</xs:documentation> </xs:annotation>
>>>> <xs:complexContent> <xs:extension base="TextContentType">
>>>> <xs:sequence> <xs:element name="Text"> <xs:annotation>
>>>> <xs:documentation>The value of the static text string. Supports
>>>> the optional use of XHTML formatting tags within the string
>>>> structure. If the content of a literal text contains more than
>>>> one language, i.e. "What is your understanding of the German word
>>>> 'Gesundheit'?", the foreign language element should be placed in
>>>> a separate LiteralText component with the appropriate xml:lang
>>>> value and, in this case, isTranslatable set to "false". If the
>>>> existence of white space is critical to the understanding of the
>>>> content (such as inclusion of a leading or trailing white space),
>>>> set the attribute of Text xml:space to
>>>> "preserve".</xs:documentation> </xs:annotation> </xs:element>
>>>> </xs:sequence> </xs:extension> </xs:complexContent>
>>>> </xs:complexType> <xs:element name="Text" type="TextType">
>>>> <xs:annotation> <xs:documentation>The static portion of the text
>>>> expressed as a StructuredString with the ability to preserve
>>>> whitespace if critical to the understanding of the
>>>> content.</xs:documentation> </xs:annotation> </xs:element>
>>>> <xs:complexType name="TextType"> <xs:annotation>
>>>> <xs:documentation>The static portion of the text expressed as a
>>>> StructuredString with the ability to preserve whitespace if
>>>> critical to the understanding of the content.</xs:documentation>
>>>> </xs:annotation> <xs:complexContent> <xs:extension
>>>> base="r:ContentType"> <xs:attribute ref="xml:space"
>>>> default="default"> <xs:annotation> <xs:documentation>The default
>>>> setting states that leading and trailing white space will be
>>>> removed and multiple adjacent white spaces will be treated as a
>>>> single white space. If the existence of any of these white spaces
>>>> is critical to the understanding of the content, change the value
>>>> of this attribute to "preserve".</xs:documentation>
>>>> </xs:annotation> </xs:attribute> </xs:extension>
>>>> </xs:complexContent> </xs:complexType> <xs:element
>>>> name="ConditionalText" type="ConditionalTextType"
>>>> substitutionGroup="TextContent"> <xs:annotation>
>>>> <xs:documentation>A substitution for TextContent, contains
>>>> command code or source of the dynamic (changing)
>>>> text.</xs:documentation> </xs:annotation> </xs:element>
>>>> <xs:complexType name="ConditionalTextType"> <xs:annotation>
>>>> <xs:documentation>Text which has a changeable value depending on
>>>> a stated condition, response to earlier questions, or as input
>>>> from a set of metrics (pre-supplied data).</xs:documentation>
>>>> </xs:annotation> <xs:complexContent> <xs:extension
>>>> base="TextContentType"> <xs:choice> <xs:element ref="Expression"
>>>> minOccurs="0"> <xs:annotation> <xs:documentation>The condition
>>>> on which the associated text varies expressed by a command code.
>>>> For example, a command that inserts an age by calculating the
>>>> difference between today’s date and a previously defined date
>>>> of birth.</xs:documentation> </xs:annotation> </xs:element>
>>>> <xs:element ref="r:SourceParameterReference" minOccurs="0">
>>>> <xs:annotation> <xs:documentation>This allows for the simple
>>>> insert of a piece of information from another specified
>>>> parameter. For example, if the text of the item using conditional
>>>> text included the respondent’s name use
>>>> SourceParameterReference to reference the InParameter of the
>>>> question that is bound to the OutParameter of the question:
>>>> “What is your name?” </xs:documentation> </xs:annotation>
>>>> </xs:element> </xs:choice> </xs:extension> </xs:complexContent>
>>>> </xs:complexType>
>>>>
>>>>
>>>> On Sun, Jun 2, 2013 at 4:36 PM, Jeremy Iverson
>>>> <jeremy at colectica.com> wrote:
>>>>>
>>>>>
>>>>> Hi Wendy,
>>>>>
>>>>> Where is the Content element where you can specify the
>>>>> language? I only see this structure, which uses xs:anyType for
>>>>> the Text, not Content. Content is used for Description, but
>>>>> that is not actually the question text, it is a description of
>>>>> the question text.
>>>>>
>>>>> QuestionItem
>>>>>   QuestionText
>>>>>     LiteralText
>>>>>       Description
>>>>>         Content
>>>>>       Text
>>>>>         xs:anyType
>>>>>     ConditionalText
>>>>>
>>>>> If Text becomes TextType as you note, this would allow the
>>>>> language to be specified at the segment level. However, if the
>>>>> language is only specified for each segment, it is impossible
>>>>> to know the actual language of the question: is it the language
>>>>> of the first segment, or the one with the most words, or
>>>>> something else? Those are not precise. Putting the language on
>>>>> QuestionText lets us be explicit.
>>>>>
>>>>> If I am asking the question in English and happen to use a
>>>>> single German word, is it really necessary to document the fact
>>>>> that the single word is German? This seems like overkill, but
>>>>> if somebody has raised this as a use case I'd be curious to
>>>>> find out more.
>>>>>
>>>>> I am not sure I understand the idea behind the Description,
>>>>> either. Would a Description on the QuestionItem be more
>>>>> appropriate, rather than having a Description of each segment
>>>>> of a question's text?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jeremy
>>>>>
>>>>> -- Jeremy Iverson +1 608-213-1637 http://www.colectica.com/
>>>>> Colectica - Statistical Data Management
>>>>>
>>>>> On 6/2/2013 12:28 PM, Wendy Thomas wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am sending this out as it seems to be a general interest
>>>>>> question and I'd like broader feedback. There is a specific
>>>>>> question regarding the resolution of this issue stated within
>>>>>> the Note below. The brief answer to the issue as stated is
>>>>>> that you can declare language in a QuestionItem and other
>>>>>> DynamicText, it's just that the language and translation tags
>>>>>> lie within the Content tag (which is the language specific
>>>>>> string in a StructuredStringType). The question is whether or
>>>>>> not we need a top level "primary language" attribute to
>>>>>> clarify when the content of a single language example
>>>>>> contains foreign text. See details below.
>>>>>>
>>>>>> Please make your comments known as soon as possible. --
>>>>>> Wendy
>>>>>>
>>>>>>
>>>>>>
>>>>>> Summary  0000602: QuestionText no longer has xml:lang.
>>>>>> Cannot specify the language of questions.
>>>>>>
>>>>>> Description: The QuestionText element no longer has xml:lang,
>>>>>> so it is impossible to specify the language of question text,
>>>>>> or to specify questions with translations.
>>>>>>
>>>>>> Apologies if this has already been resolved as part of some
>>>>>> other issue. Or am I missing something here? This seems
>>>>>> quite serious.
>>>>>>
>>>>>> Proposed Solution: Restore xml:lang on QuestionText. This
>>>>>> would be consistent with the documentation for QuestionText,
>>>>>> which states "Note that when using QuestionText, the full
>>>>>> QuestionText must be repeated for multi-language versions of
>>>>>> the content".
>>>>>>
>>>>>> NOTE 1654 In all cases of DynamicText we decided that the
>>>>>> object itself must repeat to clearly provide a language
>>>>>> alternative. All XxxxText objects of DynamicTextType reside
>>>>>> in a parent complex object that is the one carrying the ID.
>>>>>> The documentation states that the XxxxText object is
>>>>>> repeatable for the purpose of expressing multiple languages
>>>>>> and that the assumption is that the content of each
>>>>>> repetition within the parent object is equivalent content in
>>>>>> an alternate language.
>>>>>>
>>>>>> LiteralText is no longer a StructuredStringType but contains
>>>>>> the repeatable object Content which is the language specific
>>>>>> subelement of a StructuredString.
>>>>>>
>>>>>>
>>>>>> So in a QuestionItem:
>>>>>>
>>>>>> <d:QuestionText><d:LiteralText><r:Content
>>>>>> xml:lang="de">Kommen Sie
>>>>>> mit?</r:Content></d:LiteralText></d:QuestionText>
>>>>>> <d:QuestionText><d:LiteralText><r:Content xml:lang="en">Do
>>>>>> you want to come
>>>>>> with?</r:Content></d:LiteralText></d:QuestionText>
>>>>>>
>>>>>> This was done because a question could have multiple
>>>>>> language segments and because the dynamic text may fall in
>>>>>> different locations in various language strings. We felt it
>>>>>> was confusing to mix multiple language strings into a single
>>>>>> QuestionText under such conditions and could even be
>>>>>> impossible to parse out.
>>>>>>
>>>>>> So at the moment it is a matter of digging further into the
>>>>>> DynamicText content to determine language. The question we
>>>>>> should address is the following:
>>>>>>
>>>>>> Do we need to provide information on the primary language of
>>>>>> the DynamicText content at the parent object level?
>>>>>>
>>>>>> Pro: Saves digging into the question and also clarifies the
>>>>>> primary language for multi-language content within a
>>>>>> question, e.g. the following:
>>>>>>
>>>>>> <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What
>>>>>> is your understanding of the German word
>>>>>> </r:Content><r:Content
>>>>>> xml:lang="de">"Kölsch"?</r:Content></d:LiteralText></d:QuestionText>
>>>>>>
>>>>>> Con: What is the rule for language identification conflicts between
>>>>>> primary language information at DynamicText level and
>>>>>> Content level? For example, I could be asking a question in
>>>>>> one language for a questionnaire that was intended for use in
>>>>>> another language group. In short, resolving conflicts is not a
>>>>>> one-answer-fits-all situation.
>>>>>>
>>>>>> Note that "Content" has the full set of language and
>>>>>> translation information found in any international or
>>>>>> structured string. Also note that for ALL other string types
>>>>>> that support multiple languages the language and translation
>>>>>> information is contained in the sub-element. The object that
>>>>>> is of InternationalStringType or StructuredStringType is a
>>>>>> means of binding multiple language equivalences together.
>>>>>>
>>>>>> -- Wendy L. Thomas                              Phone: +1
>>>>>> 612.624.4389 Data Access Core Director                 Fax:
>>>>>> +1 612.626.8375 Minnesota Population Center
>>>>>> Email: wlt at umn.edu University of Minnesota 50 Willey Hall 225
>>>>>> 19th Avenue South Minneapolis, MN 55455
>>>>>>
>>>>>> _______________________________________________ DDI-SRG
>>>>>> mailing list DDI-SRG at icpsr.umich.edu
>>>>>> http://lists.icpsr.umich.edu/mailman/listinfo/ddi-srg
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 
Wendy L. Thomas                              Phone: +1 612.624.4389
Data Access Core Director                 Fax:   +1 612.626.8375
Minnesota Population Center             Email: wlt at umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455


