XML Parsers Archives

June 28, 2007

Apache Xerces

The Xerces Java Parser 1.4.4 supports the XML 1.0 recommendation and contains advanced parser functionality, such as support for the W3C's XML Schema recommendation version 1.0, DOM Level 2 version 1.0, and SAX Version 2, in addition to supporting the industry-standard DOM Level 1 and SAX version 1 APIs.
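As a rough illustration of the two API families named above, the following sketch parses one small document with SAX 2 (event callbacks) and with DOM (an in-memory tree). It goes through the standard JAXP factories rather than Xerces-specific classes, so it runs with whichever parser JAXP finds: a Xerces jar on the classpath, or otherwise the parser bundled with the JDK. The class name, method names, and sample XML are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ParseSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<order id=\"42\"><item>pen</item><item>ink</item></order>";

    /** SAX 2: the parser pushes events to a handler while it reads the input. */
    static List<String> elementNamesViaSax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName); // record each element as it opens
                }
            });
        return names;
    }

    /** DOM: the parser builds a tree first; the caller navigates it afterwards. */
    static String rootAttributeViaDom(String xml, String attr) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getAttribute(attr);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNamesViaSax(XML));        // [order, item, item]
        System.out.println(rootAttributeViaDom(XML, "id")); // 42
    }
}
```

SAX tends to suit large documents that are read once, while DOM is more convenient when the document has to be revisited or modified.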

Apache Xerces is a collaborative software development project dedicated to providing robust, full-featured, commercial-quality, and freely available XML parsers and closely related technologies on a wide variety of platforms supporting several languages. This project is managed in cooperation with various individuals worldwide (both independent and company-affiliated experts), who use the Internet to communicate, plan, and develop XML software and related documentation.

Apache Xerces exists to promote the use of XML. We view XML as a compelling paradigm that structures data as information, thereby facilitating the exchange, transformation, and presentation of knowledge. The ability to transform raw data into usable information has great potential to improve the functionality and use of information systems. We intend to build freely available XML parsers and closely related technologies in order to engender such improvements.

The Apache Xerces parsers support standard APIs (formal, de facto, or proposed). They are designed to be high performance, reliable, and easy to use. To facilitate easy porting of ideas between languages, the supported APIs should be as similar as possible, given the constraints of the languages and existing architectures. Apache Xerces parsers should also be designed to work efficiently with other Apache projects that deal with XML whenever possible.

We believe that the best way to further these goals is by having both individuals and corporations collaborate on the best possible infrastructure, APIs, code, testing, and release cycles. Components must be vendor neutral and usable as core components for all.

In order to achieve a coherent architecture between Apache Xerces parsers and other components and applications, standards (formal or de facto) will be used as much as possible for both protocols and APIs. Where appropriate, experiences and lessons learned will be fed back to standards bodies in an effort to assist in the development of those standards. We will also encourage the innovation of new protocols, APIs, and components in order to seed new concepts not yet defined by standards.

Apache Xerces Project Home Page
http://xerces.apache.org/xerces-j/

Download Apache Xerces
http://archive.apache.org/dist/xml/xerces-j/

Apache XMLBeans

XMLBeans is a technology for accessing XML by binding it to Java types. XMLBeans provides several ways to get at the XML, including:

* Through XML schema that has been compiled to generate Java types that represent schema types. In this way, you can access instances of the schema through JavaBeans-style accessors after the fashion of "getFoo" and "setFoo". The XMLBeans API also allows you to reflect into the XML schema itself through an XML Schema Object model.

* A cursor model through which you can traverse the full XML infoset.

* Support for XML DOM.

XMLBeans provides intuitive ways to handle XML that make it easier for you to access and manipulate XML data and documents in Java.

Characteristics of the XMLBeans approach to XML:

* It provides a familiar Java object-based view of XML data without losing access to the original, native XML structure.

* The XML's integrity as a document is not lost with XMLBeans. XML-oriented APIs commonly take the XML apart in order to bind to its parts. With XMLBeans, the entire XML instance document is handled as a whole. The XML data is stored in memory as XML. This means that the document order is preserved as well as the original element content with whitespace.

* With types generated from schema, access to XML instances is through JavaBean-like accessors, with get and set methods.

* It is designed with XML schema in mind from the beginning: XMLBeans supports all XML schema definitions.

* Access to XML is fast.

The starting point for XMLBeans is XML schema. A schema (contained in an XSD file) is an XML document that defines a set of rules to which other XML documents must conform. The XML Schema specification provides a rich data model that allows you to express sophisticated structure and constraints on your data. For example, an XML schema can enforce control over how data is ordered in a document, or constraints on particular values (for example, a birth date that must be later than 1900). Unfortunately, the ability to enforce rules like this is typically not available in Java without writing custom code. XMLBeans honors schema constraints.

Note: Where an XML schema defines rules for an XML document, an XML instance is an XML document that conforms to the schema.
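The birth-date constraint mentioned above can be sketched with the JDK's standard `javax.xml.validation` API. This is not XMLBeans code; it only shows the kind of rule a schema can express and a validator can enforce. The schema, the element name `birthDate`, and the reading of "later than 1900" as "on or after 1901-01-01" are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class BirthDateSchemaSketch {
    // Hypothetical schema: one <birthDate> element, restricted to dates after 1900.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "  <xs:element name=\"birthDate\">"
        + "    <xs:simpleType>"
        + "      <xs:restriction base=\"xs:date\">"
        + "        <xs:minInclusive value=\"1901-01-01\"/>"
        + "      </xs:restriction>"
        + "    </xs:simpleType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Returns true if the instance document conforms to the schema above. */
    static boolean isValid(String instanceXml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler set, validation errors are thrown as exceptions.
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<birthDate>1985-06-28</birthDate>")); // true
        System.out.println(isValid("<birthDate>1899-12-31</birthDate>")); // false
    }
}
```

The rule lives in the schema rather than in hand-written Java checks, which is the point the paragraph above makes.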

You compile a schema (XSD) file to generate a set of Java interfaces that mirror the schema. With these types, you process XML instance documents that conform to the schema. You bind an XML instance document to these types; changes made through the Java interface change the underlying XML representation.
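To make the idea of a typed interface over live XML concrete, here is a minimal hand-written sketch. It does not use the XMLBeans API or its generated code. The `Customer` interface and `DomBackedCustomer` class are hypothetical stand-ins, backed by a plain DOM tree from the JDK, that only illustrate how a setter call can change the underlying XML rather than a detached copy.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BoundCustomerSketch {
    /** Hypothetical interface of the kind a schema compiler might generate. */
    interface Customer {
        String getName();
        void setName(String name);
    }

    /** Hand-written stand-in: each accessor reads or writes the backing XML. */
    static final class DomBackedCustomer implements Customer {
        private final Element root;

        DomBackedCustomer(Element root) {
            this.root = root;
        }

        private Element nameElement() {
            return (Element) root.getElementsByTagName("name").item(0);
        }

        @Override
        public String getName() {
            return nameElement().getTextContent();
        }

        @Override
        public void setName(String name) {
            nameElement().setTextContent(name);
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<customer><name>Ada</name></customer>");
        Customer customer = new DomBackedCustomer(doc.getDocumentElement());
        customer.setName("Grace");
        // The change is visible in the XML tree itself, not only in the Java object.
        System.out.println(doc.getDocumentElement().getTextContent()); // Grace
    }
}
```

In this sketch the Java object holds no data of its own, so the XML document stays the single source of truth.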

Previous options for handling XML include using XML programming interfaces (such as DOM or SAX) or an XML marshalling/binding tool (such as JAXB). Because a DOM-oriented model lacks strong schema-oriented typing, navigating it is more tedious and requires an understanding of the complete object model. JAXB provides support for the XML schema specification, but handles only a subset of it; XMLBeans supports all of it. Also, by storing the data in memory as XML, XMLBeans is able to reduce the overhead of marshalling and unmarshalling.

XMLBeans Project Home Page
http://xmlbeans.apache.org/

Download XMLBeans
http://xmlbeans.apache.org/documentation/conInstallGuide.html