[DDI-users] Storing XML

Joachim Wackerow ddi-users@icpsr.umich.edu
Fri, 08 Nov 2002 20:08:46 +0100


Hi I-Lin,

thank you for your detailled answer to my Oracle questions. It's very 
interesting for me, now I got some more questions.

Do you store the DDI files as external LOBs (BFILE)?
Do you store the variable information in a XMLType as CLOB or in the 
object-relational way?

Do I understand you right, that, with your approach, it is not possible 
to make a search in every DDI field, but only in the var field?

For our project I'm wondering, if it would be nice to give the authors 
the possibility to update DDI elements in the database. So I read a bit 
in the documentation of Oracle Release 2 (9.2). As a new feature of XML 
DB is mentioned a piecewise update possibility of XML via XPath, if the 
XMLType instances are stored object-relationally (p. 3 [1]). Perhaps it 
would be possible to build an update feature with WebDAV for XMLSpy or 
similar tools. But if the XML schema changes, it is necessary to change 
the object-relational XMLType, not the XMLType stored as internal or 
external CLOB (Details to piecewise update p. 5-70 [2]).
Propably your way is easier to realize: to edit the DDI file first and 
then to store it as a whole file in the database.

In the new documentation is also mentioned the extraction of a XML 
fragment of a XMLType (p.4-21 [3]). If it's really as documented, than 
it would be possible, to get only a part of the DDI file stored in 
XMLType as a result of a query via an Xpath expression (example on p. 
4-23 [3]).

It sounds powerfull and not to far away from XML standards. Probably I 
will try these features in some tests.

I-Lin Kuo wrote:
>  >- Do you store the whole DDI file of a study in one XMLType field?
>         The entire DDI will be stored in files in a special area outside 
> the DB and a link to the file will be stored in the DB. While the entire 
> DDI can be stored in one XMLType field, that type of storage implies 
> that a search result would return the entire DDI Document -- the study. 
> For our project which searches for variables between various studies, we 
> want a search to return the variable within the study. Therefore, that 
> type of storage would be just plain wrong, and so instead, we store the 
> variable information <var> in an XMLType.
>         While Oracle Text can do indexing within an XMLType via an 
> XPath-like syntax, I'm uncertain about its performance, so I'm also 
> pre-extracting certain data from the var and placing it in fields -- but 
> I'm only doing this for a very few fields which I think will be stable 
> as the DDI evolves. This design has the advantage that I retain all the 
> information in the XMLType field without having to explicitly remap 
> everything if the schema evolves. The other nice consequence is that I 
> don't have to worry so much about not allocating wide enough fields....
> 
>  > - Is it possible to update only some DDI fields inside the XMLType?
>         No. Currently, the contents of the XMLType must be updated as a 
> whole, I think, but I'm not sure. However, it doesn't apply to be 
> because of the reason below:
>         I have also made the conscious decision that only entire DDI 
> documents are to be uploaded to our database, and therefore users will 
> not be able to edit individual entries in the database without editing 
> the entire document. This is to resolve synchronization problems that 
> can arise regarding which version of the data -- the document or the 
> database content -- is the master version. With this decision, the DDI 
> document itself is always the master.
> 
>  > - Is the XPath capability of Oracle enough for your needs?
>         If they are as advertised, then they are sufficient. I'll find 
> out more as I get deeper into the coding.
> 
>  > - How is the performance doing queries to the XMLType?
>         I'll find out once I have actual DDI documents to search on, but 
> I don't anticipate any problems. As I sort of mentioned before, I'm 
> pre-extracting the fields to be displayed in a search summary and I'm 
> indexing the other search fields within the XMLType. This way, a search 
> doesn't really involve querying within the XMLType. I only goI use the 
> XMLType at the end to pull up search result details for display.

Regards, Achim

[1] Oracle XML DB : Key Features in Oracle9i Database Release 2
http://otn.oracle.com/tech/xml/xmldb/pdf/xmldb92tfovw.pdf
[2] Oracle9i XML Database Developer's Guide -- Oracle XML DB 
http://otn.oracle.com/docs/products/oracle9i/doc_library/release2/appdev.920/a96620.pdf

PS: I'm not sure, if this Oracle details are of interest for other DDI 
mailing list participants.