[DDI-users] [DDI-SRG] ISSUE 602

Samuel Spencer theodore.therone at gmail.com
Sun Jun 2 10:05:30 EDT 2013


Wendy,


Referring to this block:
--------------------------------------------------
We adopted a specific approach to Dynamic Text so that you didn't end
up with something like this:

<LiteralText><Content xml:lang="en">Blah Blah Blah</Content></LiteralText>
<LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>
<ConditionalText>whatever the conditional section is</ConditionalText>
<LiteralText><Content xml:lang="en">Blah Blah</Content></LiteralText>
<LiteralText><Content xml:lang="de">Etwas Etwas
Etwas</Content></LiteralText>
<ConditionalText>a bit of extra because of the idiosyncrasies of the
German language</ConditionalText>
<LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>

How do you parse that out?
--------------------------------------------------

You couldn't parse that in German anyway. German and English have entirely
different word orders. Imagine the question "Where does your [husband/wife]
come from?" The German syntax runs entirely the other way to English,
something like "Where come does your spouse?" Sentence structure, especially
with dynamic text, differs so much between natural languages that what
you're proposing wouldn't work.

The solution that I have seen independently developed in tools is to use
the QuestionText to identify the language, and repeat that for different
languages. Any other solution for displaying becomes difficult to implement.
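
A minimal sketch of that approach in Python, assuming the 3.1-style layout
in which QuestionText itself carries xml:lang and is repeated once per
language. The namespace URI and the helper name question_text_for are
placeholders for illustration, not taken from the DDI schema:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Placeholder namespace URI, assumed for illustration only.
NS = {"d": "urn:example:datacollection"}

SAMPLE = """
<d:QuestionItem xmlns:d="urn:example:datacollection">
  <d:QuestionText xml:lang="de">Kommen Sie mit?</d:QuestionText>
  <d:QuestionText xml:lang="en">Do you want to come with?</d:QuestionText>
</d:QuestionItem>
"""


def question_text_for(item, lang):
    """Return the text of the QuestionText repetition tagged with lang, or None."""
    for qt in item.findall("d:QuestionText", NS):
        if qt.get(XML_LANG) == lang:
            # Join all nested text so child elements do not hide content.
            return "".join(qt.itertext()).strip()
    return None


item = ET.fromstring(SAMPLE)
print(question_text_for(item, "en"))
```

The tool picks one whole repetition per reader language and never has to
interleave fragments from different languages.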

You have said that many people have asked for this change, but I think
again it would be useful to know who, so we can construct a workable use
case. Why do they want this change, how do they envisage this happening in
software, can it be achieved in other ways, and more bluntly, do they
actually need this content at all?

As remarked, this change invalidates massive amounts of existing working
DDI, and will mean a large change for tool developers. At best, I would
assume that many tools would just use the "primaryLanguage" attribute and
ignore any child language attributes.
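
To illustrate what that shortcut might look like in a tool, here is a rough
Python sketch. It assumes a hypothetical primaryLanguage attribute on
QuestionText (proposed in this thread, not an existing schema attribute) and
placeholder namespace URIs; the function name render is also made up:

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, assumed for illustration only.
NS = {"d": "urn:example:datacollection", "r": "urn:example:reusable"}

SAMPLE = """
<d:QuestionText xmlns:d="urn:example:datacollection"
                xmlns:r="urn:example:reusable"
                primaryLanguage="en">
  <d:LiteralText>
    <r:Content xml:lang="en">What is your understanding of the German word </r:Content>
    <r:Content xml:lang="de">"Kölsch"?</r:Content>
  </d:LiteralText>
</d:QuestionText>
"""


def render(question_text):
    """Return (primary language, display text), ignoring child xml:lang values."""
    lang = question_text.get("primaryLanguage")  # hypothetical attribute
    contents = question_text.findall("d:LiteralText/r:Content", NS)
    # Concatenate every Content segment in document order, whatever its language.
    return lang, "".join(c.text or "" for c in contents)


print(render(ET.fromstring(SAMPLE)))
```

Under these assumptions the per-Content xml:lang values carry no weight at
display time, which is the concern raised above.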

Additionally, contrary to George Bush's belief, "entrepreneur" is a French
word. But in DDI 3.2 is it tagged as en or fr? It can't be both, and even in
the context of the above, it is a word in two languages. How would it be
tagged? How would it be tagged if the intent of the question is to gauge
English speakers' familiarity with foreign loan words in English?

Sam.

--- Specificity is the soul of all good communication ---
--- When the game is over, the king and the pawn go into the same box ---
Find out more about me: http://about.me/legostormtroopr


On 2 June 2013 23:43, Wendy Thomas <wlt at umn.edu> wrote:

> Sam
>
> We have changed the structure of all International and
> StructuredStrings in order to bundle together sets of language
> equivalent texts at the request of a large group of users who explored
> and requested this change. There is no "Text" within "Content", there
> is just either xs:string or the option for XHTML tags. The case I was
> using was one where, regardless of the language of the primary part of
> the text ("What is your understanding of the German word 'Kölsch'?"), the
> term Kölsch would always be there and always be in German. If we want
> to indicate the language of the reader we need to use something other
> than xml:lang directly. The attribute xml:lang has a specific usage
> and indicates the language of the text string of the object it is an
> attribute of. We could have an intended language of reader, user, etc.
> but this is the first time this has been raised.
>
> We adopted a specific approach to Dynamic Text so that you didn't end
> up with something like this:
>
> <LiteralText><Content xml:lang="en">Blah Blah Blah</Content></LiteralText>
> <LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>
> <ConditionalText>whatever the conditional section is</ConditionalText>
> <LiteralText><Content xml:lang="en">Blah Blah</Content></LiteralText>
> <LiteralText><Content xml:lang="de">Etwas Etwas
> Etwas</Content></LiteralText>
> <ConditionalText>a bit of extra because of the idiosyncrasies of the
> German language</ConditionalText>
> <LiteralText><Content xml:lang="de">Etwas Etwas</Content></LiteralText>
>
> How do you parse that out?
>
> BTW...if you are changing beers for each language you should use
> conditional text and have the value based on language I would think.
>
> Oh if we went back to 3.1 for DynamicText the new StructuredStringText
> would result in the following:
>
> <LiteralText><Content xml:lang="en">What is your understanding of the
> German word </Content><Content xml:lang="es">[whatever that is in
> Spanish]</Content></LiteralText><LiteralText><Content xml:lang="de"
> isTranslatable="false">Kölsch?</Content></LiteralText>
>
> So how would you parse that for a Spanish reader?
>
> I think a) leaving the structure of DynamicText as it is in 3.2 and b)
> adding a top level @primaryLanguage type xml:lang could indicate the
> intended usage language for a mixed language question where one part
> is translatable and the other isn't.
>
> Wendy
>
> p.s. see 223 for why LiteralText is now a structured string and for
> full discussion regarding representation of multilingual material see
> 315
>
> On Sun, Jun 2, 2013 at 3:09 PM, Samuel Spencer
> <theodore.therone at gmail.com> wrote:
> > Wendy,
> >
> > Your proposed rationale for attaching the language at the content level
> > actually does fit a defined use case.
> > First of all, if I as a developer were to render this question for an
> > English speaker, the German-tagged section of the question should
> > presumably not be displayed.
> >
> > Secondly, what happens if instead of "Kölsch" we ask for the reader's
> > interpretation of the Czech word "pilsner"? The problem is that "pilsner"
> > is a Czech word for a specific beer, but it is now also an English word
> > for the same thing. xml:lang cannot be repeated and its content cannot be
> > space delimited, so is "pilsner" in this context a Czech word or an
> > English word? See also entrepreneur (French), veranda (Indian), sushi
> > (Japanese).
> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> > understanding of the Czech word </r:Content><r:Content
> > xml:lang="cs">"pilsner"?</r:Content></d:LiteralText></d:QuestionText>
> >
> > Consider also the German word "tagfraggensicht":
> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> > understanding of the German word </r:Content><r:Content
> > xml:lang="??">"tagfraggensicht"?</r:Content></d:LiteralText></d:QuestionText>
> >
> > In this example, the word "tagfraggensicht" isn't German; in fact it
> > isn't a word at all. Perhaps the intent of the question is to examine
> > lying? Under your proposed solution I would have to have a null xml:lang
> > tag, as the string content has no language.
> >
> > As an alternative, I would suggest that in the context of the question
> > the lang tag identifies the language of the intended reader, so instead:
> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> > understanding of the German word
> > "tagfraggensicht"?</r:Content></d:LiteralText></d:QuestionText>
> >
> > However, as you have pointed out, the Content tag needs a Text tag, so it
> > becomes:
> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en"><r:Text>What is
> > your understanding of the German word
> > "tagfraggensicht"?</r:Text></r:Content></d:LiteralText></d:QuestionText>
> >
> > Which is one more layer of nesting than in the original schema! The Text
> > components for questions in DDI 3.1 were a little verbose and could use
> > simplification, but they worked and were very descriptive. This change
> > not only adds additional elements, it invalidates vast amounts of
> > existing content and makes development against the schema harder.
> >
> > I vote that the bug is acknowledged as is, and the QuestionText element
> > is rolled back to the 3.1 schema definition.
> >
> > Cheers,
> > Sam.
> >
> >
> > On 2 June 2013 20:48, Wendy Thomas <wlt at umn.edu> wrote:
> >>
> >> Minor update....note that in example below "r:Content" should be
> >> d:Text of type d:TextType.
> >>
> >> d:TextType uses the extension base of r:Content and adds an attribute
> >> that allows for the recognition of leading and trailing spaces.
> >>
> >> My error...I should have looked at the schema more closely and not relied
> >> on my memory...which was close but clearly not firing on all cylinders. I
> >> have edited the note in Mantis to relay corrected information. Also
> >> added the missing end tags and start tag for the second literal text
> >> segment in the second example. Clearly had a fun and exhausting time
> >> at IASSIST!
> >>
> >> Wendy
> >>
> >> On Sun, Jun 2, 2013 at 12:28 PM, Wendy Thomas <wlt at umn.edu> wrote:
> >> > I am sending this out as it seems to be a general interest question
> >> > and I'd like broader feedback. There is a specific question regarding
> >> > the resolution of this issue stated within the Note below. The brief
> >> > answer to the issue as stated is that you can declare language in a
> >> > QuestionItem and other DynamicText, it's just that the language and
> >> > translation tags lie within the Content tag (which is the language
> >> > specific string in a StructuredStringType). The question is whether or
> >> > not we need a top level "primary language" attribute to clarify when
> >> > the content of a single language example contains foreign text. See
> >> > details below.
> >> >
> >> > Please make your comments known as soon as possible. -- Wendy
> >> >
> >> >
> >> >
> >> > Summary  0000602: QuestionText no longer has xml:lang. Cannot specify
> >> > the language of questions.
> >> >
> >> > Description: The QuestionText element no longer has xml:lang, so it is
> >> > impossible to specify the language of question text, or to specify
> >> > questions with translations.
> >> >
> >> > Apologies if this has already been resolved as part of some other
> >> > issue. Or am I missing something here? This seems quite serious.
> >> >
> >> > Proposed Solution: Restore xml:lang on QuestionText. This would be
> >> > consistent with the documentation for QuestionText, which states "Note
> >> > that when using QuestionText, the full QuestionText must be repeated
> >> > for multi-language versions of the content".
> >> >
> >> > NOTE 1654
> >> > In all cases of DynamicText we decided that the object itself must
> >> > repeat to clearly provide a language alternative. All XxxxText objects
> >> > of DynamicTextType reside in a parent complex object that is the one
> >> > carrying the ID. The documentation states that the XxxxText object is
> >> > repeatable for the purpose of expressing multiple languages and that
> >> > the assumption is that the content of each repetition within the
> >> > parent object is equivalent content in an alternate language.
> >> >
> >> > LiteralText is no longer a StructuredStringType but contains the
> >> > repeatable object Content which is the language specific subelement of
> >> > a StructuredString.
> >> >
> >> >
> >> > So in a QuestionItem:
> >> >
> >> > <d:QuestionText><d:LiteralText><r:Content xml:lang="de">Kommen Sie
> >> > mit?</r:Content></d:LiteralText></d:QuestionText>
> >> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">Do you want
> >> > to come with?</r:Content></d:LiteralText></d:QuestionText>
> >> >
> >> > This was done because a question could have multiple language
> >> > segments and because the dynamic text may fall in different locations
> >> > in various language strings. We felt it was confusing to mix multiple
> >> > language strings into a single QuestionText under such conditions and
> >> > could even be impossible to parse out.
> >> >
> >> > So at the moment it is a matter of digging further into the
> >> > DynamicText content to determine language. The question we should
> >> > address is the following:
> >> >
> >> > Do we need to provide information on the primary language of the
> >> > DynamicText content at the parent object level?
> >> >
> >> > Pro: Saves digging into the question and also clarifies the primary
> >> > language for multi-language content within a question, e.g. the
> >> > following:
> >> >
> >> > <d:QuestionText><d:LiteralText><r:Content xml:lang="en">What is your
> >> > understanding of the German word </r:Content><r:Content
> >> > xml:lang="de">"Kölsch"?</r:Content></d:LiteralText></d:QuestionText>
> >> >
> >> > Con: What is the rule for language identification conflicts between
> >> > primary language information at DynamicText level and Content level?
> >> > For example I could be asking a question in one language for a
> >> > questionnaire that was intended for use in another language group. In
> >> > short, resolving conflicts is not a one-answer-fits-all situation.
> >> >
> >> > Note that "Content" has the full set of language and translation
> >> > information found in any international or structured string. Also note
> >> > that for ALL other string types that support multiple languages the
> >> > language and translation information is contained in the sub-element.
> >> > The object that is of InternationalStringType or StructuredStringType
> >> > is a means of binding multiple language equivalencies together.
> >> >
> >> > --
> >> > Wendy L. Thomas                              Phone: +1 612.624.4389
> >> > Data Access Core Director                 Fax:   +1 612.626.8375
> >> > Minnesota Population Center             Email: wlt at umn.edu
> >> > University of Minnesota
> >> > 50 Willey Hall
> >> > 225 19th Avenue South
> >> > Minneapolis, MN 55455