[DDI-users] Storing XML

Mark R. Diggory ddi-users@icpsr.umich.edu
Wed, 06 Nov 2002 11:33:44 -0500


Andrew,

Both eXist and Xindice were developed from the original dbXML codebase, 
which I think is still available somewhere on the www.xmldb.org website. 
eXist was the first to move to XML-RPC and away from CORBA, but Xindice 
has also made that move within the last year.

Both offer the following:
An XML-RPC service that runs as a server.
An XML:DB client that can connect both to local database instances 
  (internal to an application) and to remote XML-RPC interfaces 
  (running on either a local or a remote server).
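
For anyone who has not looked at XML-RPC itself: a call is just a small 
XML document sent to the server over HTTP. Here is a minimal sketch in 
plain Java of building such a request body. The method name 
"example.query" and the parameter values are hypothetical; they are not 
the actual remote methods exposed by eXist or Xindice, so check each 
project's documentation for the real ones.

```java
// Sketch only: builds the body of an XML-RPC <methodCall> with string
// parameters. Nothing is sent over the network here.
class XmlRpcEnvelopeSketch {

    // Escape the characters that must not appear raw in XML text.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Build a methodCall document; every parameter is sent as a <string>.
    static String methodCall(String methodName, String... stringParams) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>");
        sb.append("<methodCall>");
        sb.append("<methodName>").append(escape(methodName)).append("</methodName>");
        sb.append("<params>");
        for (String p : stringParams) {
            sb.append("<param><value><string>")
              .append(escape(p))
              .append("</string></value></param>");
        }
        sb.append("</params>");
        sb.append("</methodCall>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // "example.query" and "/db/ddi" are made-up names for illustration.
        System.out.println(methodCall("example.query", "/db/ddi", "//titl"));
    }
}
```

In practice you would not build this by hand; the XML:DB client hides 
the transport, which is why the same client code can talk to a local or 
a remote instance.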

1.) In terms of features:
Both support XQuery and a certain subset of XPath.
Xindice has XUpdate, whereas in eXist it is still in development.
Both Xindice and eXist have a number of APIs available for access from 
within a web application (Cocoon XML Generator, JSP Taglib, XML:DB API).

2.) In terms of performance:
eXist can support larger document sizes than Xindice. This has a lot to 
do with the developer of eXist focusing on improving the "paging memory" 
implementation that both eXist and Xindice use from the old dbXML codebase.
Xindice can support multiple databases in its configuration on one server.
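
To make the query and update features in point 1 a bit more concrete, 
here is a minimal sketch of the kind of XPath lookup and in-place change 
you would hand to either database. It uses only the XPath and DOM 
support built into the JDK, not eXist or Xindice, and the class name and 
sample title are made up. The second method does by hand roughly what I 
would expect an XUpdate "update" instruction with the same select path 
to do on the server side.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Sketch only: queries and updates a tiny DDI-style codebook in memory.
class CodebookXPathSketch {

    // Stand-in for a DDI codebook; the title text is invented.
    static final String CODEBOOK =
        "<codeBook><stdyDscr><citation><titlStmt>"
      + "<titl>Example Arts Survey</titl>"
      + "</titlStmt></citation></stdyDscr></codeBook>";

    // Path to the study title in the sample document above.
    static final String TITLE_PATH = "/codeBook/stdyDscr/citation/titlStmt/titl";

    // Parse an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Evaluate the XPath expression and return the title as text.
    static String queryTitle(Document doc) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(TITLE_PATH, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath query failed", e);
        }
    }

    // Select the title node by XPath and replace its text content.
    static void updateTitle(Document doc, String newTitle) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node node = (Node) xpath.evaluate(TITLE_PATH, doc, XPathConstants.NODE);
            if (node == null) {
                throw new IllegalStateException("no title element found");
            }
            node.setTextContent(newTitle);
        } catch (javax.xml.xpath.XPathExpressionException e) {
            throw new IllegalStateException("XPath update failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(CODEBOOK);
        System.out.println(queryTitle(doc));
        updateTitle(doc, "Revised Arts Survey");
        System.out.println(queryTitle(doc));
    }
}
```

With a real XML database the document would live in a collection and 
the path would be sent through the XML:DB API instead of being 
evaluated against an in-memory DOM.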

Basically, you're looking at two different development branches of the 
dbXML codebase. eXist development has moved towards an internal support 
library for XML databases (like dbm). Xindice has moved more towards 
being an independent service (like http). I think both packages have the 
same capabilities; it's just that each does a particular task better 
than the other.

Personally, I like the performance of eXist, and I find it easier to 
install and customize than Xindice. However, I've been waiting on the 
development of XUpdate in eXist to really move forward with my 
projects. Currently, rather than install the application, I've just 
been using the jar libraries within my own web applications running on 
Apache Jakarta Tomcat. This way I can secure access to the database 
from within my web application instead of in the external operating 
system. It also makes installation of my web applications (the front 
end of the VDC) much simpler.

In terms of installation, I'd say that eXist is "lighter" than Xindice. 
I've been able to drop the eXist jars into a web application and fire up 
my own instance of the eXist db internally for my own use with minimal 
development on my part.

I would recommend getting familiar with both of them through some 
experimentation. As an open source development advocate, I'd have to say 
both are novel, cutting-edge solutions to the problem of XML storage 
and retrieval. But, as with any open source project, support comes from 
getting involved in "the community" and not from some watered-down 
"service contract" you'd have to pay cash for. Don't be afraid of 
getting involved with their user/developer lists or of opening up the 
code and taking a look.

Cheers,

Mark Diggory
Harvard MIT Data Center


Andrew Dzhigo wrote:

>Thank you for your reply. I will definitely take a look at the Virtual Data Center web site.
>Since you have already done some research on eXist and Xindice, could you, please, share with me 
>your opinion of these two systems?
>
>With regards,
>
>Andrew Dzhigo
>Applications Programmer
>Cultural Policy and the Arts
>National Data Archive (CPANDA),
>Princeton University
>adzhigo@princeton.edu
>1-609-258-7561
>
>----- Original Message -----
>From: "Mark R. Diggory" <mdiggory@latte.harvard.edu>
>Date: Monday, November 4, 2002 11:44 am
>Subject: Re: [DDI-users] Storing XML
>
>  
>
>>Hi Andrew,
>>
>>This is a shameless plug for a system we have been developing.
>>
>>We are working on an open source Digital Library System that utilizes 
>>the DDI as its primary storage XML format. This system is primarily for 
>>Social Science Research data such as that published at the ICPSR. It 
>>provides Indexing/Searching capabilities and server side data analysis 
>>tools for subsetting and manipulating the datasets. We are preparing for 
>>a software release towards the end of this year.
>>
>>Feel free to find out more about this project at the following sites:
>>
>>http://www.hmdc.harvard.edu
>>http://www.thedata.org
>>http://thedata.sourceforge.net
>>
>>We have a couple of production systems running at Harvard. They run an 
>>older version of the "Virtual Data Center", but will be upgraded to our 
>>release version once it is completed.
>>http://vdc-prod.hmdc.harvard.edu
>>
>>-Mark Diggory
>>Project Manager / Software Engineer
>>Harvard MIT Data Center
>>http://www.hmdc.harvard.edu
>>
>>p.s. We currently archive the DDIs in a custom repository backed by 
>>PostgreSQL. I have been researching eXist (http://exist.sourceforge.net) 
>>or Xindice as possible future repository implementations and have some 
>>rudimentary tests/implementations in the works.
>>
>>
>>Andrew Dzhigo wrote:
>>
>>>Hi, all
>>>
>>>I am an applications programmer for the Cultural Policy and the Arts 
>>>National Data Archive (CPANDA). Here at CPANDA we use the DDI DTD to 
>>>create XML codebooks for the datasets we archive.
>>>
>>>With the number of datasets growing, we are now facing the problem of 
>>>storing and managing our XML files. At this stage of development we are 
>>>investigating the possibilities of using some sort of XML-aware 
>>>database management system. Currently, I am considering the Xindice 
>>>system (formerly dbXML), developed by the Apache Software Foundation, 
>>>as a possible tool for our needs. However, I would very much like to 
>>>know how other developers address similar problems. What tools, 
>>>database management systems and languages do they use?
>>>
>>>Any help would be greatly appreciated.
>>>With regards,
>>>Andrew Dzhigo
>>>Applications Programmer
>>>Cultural Policy and the Arts
>>>National Data Archive (CPANDA),
>>>Princeton University
>>>adzhigo@princeton.edu
>>>1-609-258-7561
>>>
>>>_______________________________________________
>>>DDI-users mailing list
>>>DDI-users@icpsr.umich.edu
>>>http://www.icpsr.umich.edu/mailman/listinfo/ddi-users