[DDI-users] [DDI-SRG] ISSUE 602

Wendy Thomas wlt at umn.edu
Sun Jun 2 09:43:22 EDT 2013


Sam

We changed the structure of all International and StructuredStrings
to bundle together sets of language-equivalent texts, at the request
of a large group of users who explored this change. There is no
"Text" within "Content"; there is just either xs:string or the option
for XHTML tags. The case I was using was one where, regardless of the
language of the primary part of the text ("What is your understanding
of the German word 'Kolsch'?"), the term Kolsch would always be there
and always be in German. If we want to indicate the language of the
reader, we need to use something other than xml:lang directly. The
attribute xml:lang has a specific usage: it indicates the language of
the text string of the object it is an attribute of. We could have an
intended language of reader, user, etc., but this is the first time
this has been raised.

We adopted a specific approach to Dynamic Text so that you didn't end
up with something like this:

<LiteralText><Content xml:lang="en">Blah Blah Blah</Content></LiteralText>
<LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>
<ConditionalText>whatever the conditional section is</ConditionalText>
<LiteralText><Content xml:lang="en">Blah Blah</Content></LiteralText>
<LiteralText><Content xml:lang="de">Etwas Etwas Etwas</Content></LiteralText>
<ConditionalText>a bit of extra because of the idiosyncrasies of the
German language</ConditionalText>
<LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>

How do you parse that out?

BTW...if you are changing beers for each language, I would think you
should use conditional text and have the value based on language.

Oh, and if we went back to 3.1 for DynamicText, the new
StructuredStringText would result in the following:

<LiteralText><Content xml:lang="en">What is your understanding of the
German word </Content><Content xml:lang="es">[whatever that is in
Spanish]</Content></LiteralText><LiteralText><Content xml:lang="de"
isTranslatable="false">Kolsch?</Content></LiteralText>

So how would you parse that for a Spanish reader?
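
To make the difficulty concrete, here is a rough sketch (Python,
standard library only) of one heuristic a consuming application might
fall back on: for each LiteralText, take the Content in the reader's
language if there is one, otherwise take a Content flagged
isTranslatable="false". The QuestionText wrapper, the unprefixed
element names, and the function name are assumptions for illustration
only, not anything defined by the schema.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The fragment from this message, wrapped in a single root element
# (an assumption) so that it is well-formed. The Spanish text is the
# same placeholder used above, not a real translation.
SAMPLE = (
    '<QuestionText>'
    '<LiteralText>'
    '<Content xml:lang="en">What is your understanding of the German word </Content>'
    '<Content xml:lang="es">[whatever that is in Spanish] </Content>'
    '</LiteralText>'
    '<LiteralText>'
    '<Content xml:lang="de" isTranslatable="false">Kolsch?</Content>'
    '</LiteralText>'
    '</QuestionText>'
)


def render_for_reader(question_xml, reader_lang):
    """Join one Content per LiteralText for the given reader language."""
    root = ET.fromstring(question_xml)
    parts = []
    for literal in root.iter("LiteralText"):
        contents = literal.findall("Content")
        # First choice: a Content tagged with the reader's language.
        chosen = next(
            (c for c in contents if c.get(XML_LANG) == reader_lang), None
        )
        # Fallback: a Content explicitly marked as not translatable.
        if chosen is None:
            chosen = next(
                (c for c in contents if c.get("isTranslatable") == "false"),
                None,
            )
        if chosen is not None:
            parts.append(chosen.text or "")
    return "".join(parts)


if __name__ == "__main__":
    print(render_for_reader(SAMPLE, "es"))
```

Note that a reader whose language has no Content at all (say "fr")
would get only the untranslatable fragment back, which is exactly the
kind of guesswork I would like to avoid.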

I think that a) leaving the structure of DynamicText as it is in 3.2
and b) adding a top-level @primaryLanguage of type xml:lang would let
us indicate the intended usage language for a mixed-language question
where one part is translatable and the other isn't.
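
A minimal sketch of how that could look to a consumer, in Python with
the standard library only. Everything here is an assumption for
discussion: "primaryLanguage" is a hypothetical attribute that does
not exist in any released schema, the element names are the
simplified, unprefixed ones used in the examples in this thread, and
the Spanish text is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: QuestionText repeats once per intended reader
# language, and each repetition carries the proposed primaryLanguage.
SAMPLE = (
    '<QuestionItem>'
    '<QuestionText primaryLanguage="en">'
    '<LiteralText>'
    '<Content xml:lang="en">What is your understanding of the German word </Content>'
    '<Content xml:lang="de" isTranslatable="false">Kolsch?</Content>'
    '</LiteralText>'
    '</QuestionText>'
    '<QuestionText primaryLanguage="es">'
    '<LiteralText>'
    '<Content xml:lang="es">[whatever that is in Spanish] </Content>'
    '<Content xml:lang="de" isTranslatable="false">Kolsch?</Content>'
    '</LiteralText>'
    '</QuestionText>'
    '</QuestionItem>'
)


def question_text_for(item_xml, reader_lang):
    """Return the text of the QuestionText intended for reader_lang.

    The repetition is chosen by primaryLanguage alone; every Content
    inside it is then kept in document order, whatever its xml:lang.
    Returns None when no repetition targets the reader's language.
    """
    root = ET.fromstring(item_xml)
    for question_text in root.findall("QuestionText"):
        if question_text.get("primaryLanguage") == reader_lang:
            return "".join(
                c.text or "" for c in question_text.iter("Content")
            )
    return None


if __name__ == "__main__":
    print(question_text_for(SAMPLE, "en"))
```

With this arrangement xml:lang keeps its normal meaning on each text
string, and the selection for a given reader no longer depends on
inspecting the individual Content elements.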

Wendy

p.s. See 223 for why LiteralText is now a structured string, and see
315 for the full discussion of how multilingual material is
represented.

On Sun, Jun 2, 2013 at 3:09 PM, Samuel Spencer
<theodore.therone at gmail.com> wrote:
> Wendy,
>
> Your proposed rationale for attaching the language at the content level
> actually does fit a defined use case.
> First of all, if I as a developer were to render this question for an English
> speaker, the German-tagged section of the question should presumably not be
> displayed.
>
> Secondly, what happens if instead of Kolsch we ask for the reader's
> interpretation of the Czech word "pilsner"? The problem is that Pilsner is a
> Czech word for a specific beer, but it is now also an English word for the
> same thing. xml:lang cannot be repeated and its content cannot be space
> delimited, so is Pilsner in this context a Czech word or an English word?
> See also entrepreneur (French), veranda (Indian), sushi (Japanese).
> <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> understanding of the Czech word </r:Content><r:Content
> xml:lang="cs">"pilsner"?</r:Content></d:LiteralText></d:QuestionText>
>
> Consider also the German word "tagfraggensicht":
> <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> understanding of the German word </r:Content><r:Content
> xml:lang="??">"tagfraggensicht"?</r:Content></d:LiteralText></d:QuestionText>
>
> In this example, the word "tagfraggensicht" isn't German; in fact it isn't a
> word. Perhaps the intent of the question is to examine lying? Under your
> proposed solution I would have to have a null xml:lang tag, as the string
> content has no language.
>
> As an alternative, I would suggest that in the context of the question the
> lang tag identifies the language of the intended reader, so instead:
> <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> understanding of the German word
> "tagfraggensicht"?</r:Content></d:LiteralText></d:QuestionText>
>
> However, as you have pointed out, the Content tag needs a Text tag, so it
> becomes:
> <d:QuestionText><d:LiteralText><r:Content xml:lang="en"><r:Text>What is
> your understanding of the German word
> "tagfraggensicht"?</r:Text></r:Content></d:LiteralText></d:QuestionText>
>
> Which is one more layer of detail than in the original schema! The Text
> components for questions in DDI 3.1 were a little verbose and could use
> simplification, but they worked and were very descriptive. This change not
> only adds additional elements, it invalidates vast amounts of existing
> content and makes development against the schema harder.
>
> I vote that the bug is acknowledged as is, and the QuestionText element is
> rolled back to the 3.1 schema definition.
>
> Cheers,
> Sam.
>
> --- Specificity is the soul of all good communication ---
> --- When the game is over, the king and the pawn go into the same box ---
> Find out more about me: http://about.me/legostormtroopr
>
>
> On 2 June 2013 20:48, Wendy Thomas <wlt at umn.edu> wrote:
>>
>> Minor update....note that in the example below "r:Content" should be
>> d:Text of type d:TextType.
>>
>> d:TextType uses the extension base of r:Content and adds an attribute
>> that allows for the recognition of leading and trailing spaces.
>>
>> My error...I should have looked at the schema more closely and not relied
>> on my memory...which was close but clearly not firing on all cylinders. I
>> have edited the note in Mantis to relay corrected information. Also
>> added the missing end tags and start tag for the second literal text
>> segment in the second example. Clearly had a fun and exhausting time
>> at IASSIST!
>>
>> Wendy
>>
>> On Sun, Jun 2, 2013 at 12:28 PM, Wendy Thomas <wlt at umn.edu> wrote:
>> > I am sending this out as it seems to be a general interest question
>> > and I'd like broader feedback. There is a specific question regarding
>> > the resolution of this issue stated within the Note below. The brief
>> > answer to the issue as stated is that you can declare language in a
>> > QuestionItem and other DynamicText, it's just that the language and
>> > translation tags lie within the Content tag (which is the language
>> > specific string in a StructuredStringType). The question is whether or
>> > not we need a top level "primary language" attribute to clarify when
>> > the content of a single language example contains foreign text. See
>> > details below.
>> >
>> > Please make your comments known as soon as possible. -- Wendy
>> >
>> >
>> >
>> > Summary  0000602: QuestionText no longer has xml:lang. Cannot specify
>> > the language of questions.
>> >
>> > Description: The QuestionText element no longer has xml:lang, so it is
>> > impossible to specify the language of question text, or to specify
>> > questions with translations.
>> >
>> > Apologies if this has already been resolved as part of some other
>> > issue. Or am I missing something here? This seems quite serious.
>> >
>> > Proposed Solution: Restore xml:lang on QuestionText. This would be
>> > consistent with the documentation for QuestionText, which states "Note
>> > that when using QuestionText, the full QuestionText must be repeated
>> > for multi-language versions of the content".
>> >
>> > NOTE 1654
>> > In all cases of DynamicText we decided that the object itself must
>> > repeat to clearly provide a language alternative. All XxxxText objects
>> > of DynamicTextType reside in a parent complex object that is the one
>> > carrying the ID. The documentation states that the XxxxText object is
>> > repeatable for the purpose of expressing multiple languages and that
>> > the assumption is that the content of each repetition within the
>> > parent object is equivalent content in an alternate language.
>> >
>> > LiteralText is no longer a StructuredStringType but contains the
>> > repeatable object Content which is the language specific subelement of
>> > a StructuredString.
>> >
>> >
>> > So in a QuestionItem:
>> >
>> > <d:QuestionText><d:LiteralText><r:Content xml:lang="de">Kommen Sie
>> > mit?</r:Content></d:LiteralText></d:QuestionText>
>> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">Do you want
>> > to come with?</r:Content></d:LiteralText></d:QuestionText>
>> >
>> > This was done because a question could have multiple language
>> > segments and because the dynamic text may fall in different locations
>> > in various language strings. We felt it was confusing to mix multiple
>> > language strings into a single QuestionText under such conditions and
>> > could even be impossible to parse out.
>> >
>> > So at the moment it is a matter of digging further into the
>> > DynamicText content to determine language. The question we should
>> > address is the following:
>> >
>> > Do we need to provide information on the primary language of the
>> > DynamicText content at the parent object level?
>> >
>> > Pro: Saves digging into the question and also clarifies the primary
>> > language for multi-language content within a question, e.g. the
>> > following:
>> >
>> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
>> > understanding of the German word </r:Content><r:Content
>> > xml:lang="de">"Kölsch"?</r:Content></d:LiteralText></d:QuestionText>
>> >
>> > Con: What is the rule for language identification conflicts between
>> > primary language information at DynamicText level and Content level?
>> > For example I could be asking a question in one language for a
>> > questionnaire that was intended for use in another language group. In
>> > short, resolving conflicts is not a one-answer-fits-all situation.
>> >
>> > Note that "Content" has the full set of language and translation
>> > information found in any international or structured string. Also note
>> > that for ALL other string types that support multiple languages, the
>> > language and translation information is contained in the sub-element.
>> > The object that is of InternationalStringType or StructuredStringType
>> > is a means of binding multiple language equivalencies together.
>> >
>> > --
>> > Wendy L. Thomas                              Phone: +1 612.624.4389
>> > Data Access Core Director                 Fax:   +1 612.626.8375
>> > Minnesota Population Center             Email: wlt at umn.edu
>> > University of Minnesota
>> > 50 Willey Hall
>> > 225 19th Avenue South
>> > Minneapolis, MN 55455
>>
>>
>>
>
>





