Designing an Open, Standards-Based Reporting System - XML meets the challenges and design goals of a business reporting system

As XML has grown more prevalent as a data delivery mechanism, so too has the need to use it for presentation in a wide variety of reporting formats. XML is useful for more than just the delivery of information, however. It can be used to help solve a wide range of problems encountered when designing a business data reporting solution, from specifying the layout of the reports themselves to controlling where the data used in the report comes from.

At Panscopic, we develop an enterprise-class data reporting and analytics product that consists of two main elements: the Panscopic Scope Server, which takes report definition files and executes them to produce finished output, and the Scope Creation Suite, a set of client-side authoring tools that create the report definitions that the server executes (reports are called "Scopes" in Panscopic's parlance). Some of the challenges and design goals that we faced when designing the product architecture were:

  • Report definitions needed to be in a form, preferably text-based, that was familiar to developers and could be edited outside of our tools if necessary.
  • The report definition syntax needed to maintain a clear separation between the report data content and its presentation, and promote the reuse of basic report objects such as queries, layouts, and parameters.
  • The system had to be extensible, so that new data sources, layout components, etc., could be introduced over time and from outside of our development organization.
  • The product needed to be capable of extracting information directly from XML-formatted data in a natural, standardized way.

    We found that XML was particularly well suited to solving these design challenges. For example, the reports themselves are created and stored as XML files, which are then loaded and executed by the Scope Server. Each report file also contains a reference to one or more data sources that are exposed by the server. The server maintains the list of available data sources in a separate XML file, which an administrator can edit to point to whatever data source is desired. Because the reports refer to data sources only through this abstraction layer, administrators can change the data source pointers as often as they like without affecting the reports (as long as the new data source follows the same schema, of course). Finally, we used XPath, the W3C standard for addressing parts of an XML document, to extract information from XML-formatted data.
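The indirection between reports and data sources can be illustrated with a small Python sketch. It is not Panscopic's implementation: the registry format, the element and attribute names, and the URLs are all hypothetical, chosen only to show a report resolving a logical name through a server-side XML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical server-side registry. The element and attribute names are
# illustrative assumptions, not Panscopic's actual configuration format.
REGISTRY_XML = """
<dataSources>
  <dataSource name="sales" type="rdbms" url="jdbc:example://db1/sales"/>
  <dataSource name="orders" type="xml" url="http://example.com/orders.xml"/>
</dataSources>
"""


def load_registry(xml_text):
    """Map each logical data source name to its connection settings."""
    root = ET.fromstring(xml_text)
    return {ds.get("name"): dict(ds.attrib) for ds in root.findall("dataSource")}


def resolve(registry, name):
    """Look up the data source that a report refers to by logical name."""
    try:
        return registry[name]
    except KeyError:
        raise LookupError(f"unknown data source: {name!r}") from None


registry = load_registry(REGISTRY_XML)
print(resolve(registry, "sales")["url"])  # prints "jdbc:example://db1/sales"
```

A report that asks for "sales" keeps working if an administrator edits the `url` attribute in the registry, because the report only ever holds the logical name.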

    The Report Definition Language
    To achieve our higher-level design goals, such as separating content from layout, promoting object reuse, and extracting specific information from XML data, we needed to go a little further with our design than just finding a representation for everything as XML tags. The solution we came up with was the Report Definition Language, or RDL (pronounced "riddle"). RDL is an XML-based file format that describes how a report retrieves its data, manipulates it, and realizes the result. RDL files are divided into top-level sections, each represented by an XML tag, that contain the different parts of the report. The most important sections of an RDL file are <rdl:parameters>, <rdl:content>, and <rdl:layout>. Listing 1 shows an example RDL file.
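To make the sectioned structure concrete, here is a minimal Python sketch that parses a skeleton document and lists its top-level sections. Only the three section tag names come from the description above. The root element name and the namespace URI are hypothetical placeholders, and the skeleton is not a reproduction of Listing 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and root element; only the section tag names
# (parameters, content, layout) are taken from the article.
RDL_NS = "urn:example:rdl"

SKELETON = f"""
<rdl:report xmlns:rdl="{RDL_NS}">
  <rdl:parameters/>
  <rdl:content/>
  <rdl:layout/>
</rdl:report>
"""


def section_names(rdl_text):
    """Return the local names of the top-level sections, in document order."""
    root = ET.fromstring(rdl_text)
    # ElementTree reports namespaced tags as "{uri}local"; keep the local part.
    return [child.tag.split("}", 1)[1] for child in root]


print(section_names(SKELETON))  # prints ['parameters', 'content', 'layout']
```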

    The parameters section contains descriptions of the report's parameters, which act essentially like variables passed to the report at runtime. Parameters can be either fixed-form, meaning that the value is restricted to one from a predefined list of values, or free-form, in which case the value can be anything. Each parameter can be assigned to a form control on a Web page that supplies its value, or the value can be assigned directly in the URL that is used to request the report from the server. A parameter can also be given a default value to be used if the user supplies none, and can be marked as mandatory, indicating that the user must supply a value for it.
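The parameter rules described above (fixed-form versus free-form, defaults, and mandatory values) can be sketched in Python as follows. The `Parameter` class and `resolve_parameters` function are hypothetical names introduced for illustration. How the Scope Server actually validates parameters is not specified here, so treat this as one plausible reading.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Parameter:
    name: str
    allowed: Optional[Sequence[str]] = None  # set only for fixed-form parameters
    default: Optional[str] = None
    mandatory: bool = False


def resolve_parameters(params, supplied):
    """Combine user-supplied values with defaults and enforce the declared rules."""
    values = {}
    for p in params:
        if p.name in supplied:
            value = supplied[p.name]
        elif p.mandatory:
            # Assumption: a mandatory parameter must come from the user,
            # so a default does not satisfy it.
            raise ValueError(f"missing mandatory parameter: {p.name}")
        else:
            value = p.default
        if value is None:
            continue  # optional parameter with no default and no supplied value
        if p.allowed is not None and value not in p.allowed:
            raise ValueError(f"{p.name}={value!r} is not one of {list(p.allowed)}")
        values[p.name] = value
    return values


params = [
    Parameter("region", allowed=("east", "west"), default="east"),
    Parameter("customer", mandatory=True),
]
# The supplied values might come from a Web form or from the request URL.
print(resolve_parameters(params, {"customer": "Acme"}))
# prints {'region': 'east', 'customer': 'Acme'}
```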

    The content section indicates where the data for the report will be drawn from. Inside the content section are one or more <rdl:data> tags, each of which specifies a data source that supplies data to the report. These data sources are maintained in a list by the server (as described earlier); Listing 2 shows an example of a data source entry that might appear in the server's configuration file. Each data tag contains sub-tags that are specific to the type of data source being accessed. The example shown in Listing 1 uses a connection that resolves to a relational data source (as indicated by the <rdl:rdbms> tag). The type of data source and the way it exposes data columns are hidden from the layout by the <rdl:return> section contained within the data section. It is the job of the <rdl:return> section to expose the columns of data returned by the data source to the layout section in a uniform way. This approach allows vastly different types of data and layout components to be hooked together seamlessly.

    The layout section determines how the report will be visualized for the user. Of course, not all reports are necessarily consumed by humans: the report may be delivered in XML format for consumption by another service. Inside the layout section are one or more <rdl:useComponent> tags, which refer to layout components used to format the data specified in the content section. Layout components are specified and configured in XML, but are implemented on the back end as JSP pages. Each component has the built-in ability to realize the report as one of a number of different formats, such as HTML, XML, or PDF.
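The idea that one layout component can realize the same data in several output formats can be sketched in Python. This is only an illustration: the actual components are JSP pages, and the function names, the `RENDERERS` table, and the output markup below are assumptions. PDF output is omitted to keep the sketch within the standard library.

```python
from html import escape
import xml.etree.ElementTree as ET


def render_html(columns, rows):
    """Render rows as a simple HTML table."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(r[c]))}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def render_xml(columns, rows):
    """Render rows as XML; column names must be valid XML element names."""
    root = ET.Element("rows")
    for r in rows:
        row = ET.SubElement(root, "row")
        for c in columns:
            ET.SubElement(row, c).text = str(r[c])
    return ET.tostring(root, encoding="unicode")


# Hypothetical dispatch table: one component, several output formats.
RENDERERS = {"html": render_html, "xml": render_xml}


def realize(fmt, columns, rows):
    """Produce the report output in the requested format."""
    return RENDERERS[fmt](columns, rows)


data = [{"name": "Acme", "total": 10}]
print(realize("xml", ["name", "total"], data))
# prints <rows><row><name>Acme</name><total>10</total></row></rows>
```

Because both renderers take the same columns and rows, the content side of a report does not need to know which format was requested.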

    By keeping these sections distinctly separate, the report is broken up into its constituent parts, each of which can be saved and reused in other reports. For example, a query that is written for one report can easily be saved and stored in the server's network-accessible catalog for use by another developer in a different report, possibly with an entirely different layout. Similarly, a particular layout can be used again and again with other data queries.

    Extracting Data from XML Data Sources
    To address the design requirement of being able to extract information directly from an XML data source, we turned to XPath, the W3C standard for navigating among the nodes of an XML structure.

    The example shown in Listing 3 illustrates a content section that uses an XML data source (indicated here by the <rdl:xmlsource> tag). You can see from this example that the <rdl:return> section mentioned earlier makes use of attributes that contain XPath syntax. The <rdl:return> tag itself has a "selectNode" attribute, and each of the <rdl:column> tags has a "fieldPath" attribute. These attributes contain XPath expressions that refer to specific nodes in a returned XML data structure.

    The selectNode attribute identifies a set of nodes in the XML data that corresponds to repeating, "record-style" information from which data is to be extracted. The Scope Server iterates over the set of nodes that matches this expression and evaluates each <rdl:column> tag's fieldPath expression against each of those nodes. In this way, the data is extracted from the XML and presented to the layout section in the same row-and-column form as data from traditional JDBC sources, which reduces complexity and shortens the learning curve for the developer. In addition, the XPath expressions can be written to provide further filtering and processing on the returned data.
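The selectNode and fieldPath mechanism can be sketched in a few lines of Python. The sample XML, the `extract_rows` function, and the column names are hypothetical. Python's xml.etree.ElementTree supports only a subset of XPath 1.0, so this shows the iteration pattern rather than the full expressiveness available to the Scope Server.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data source content.
ORDERS_XML = """
<orders>
  <order><customer>Acme</customer><total>120.50</total></order>
  <order><customer>Globex</customer><total>75.00</total></order>
</orders>
"""


def extract_rows(xml_text, select_node, field_paths):
    """Turn record-style XML into a list of rows.

    select_node plays the role of the selectNode attribute: it picks the
    repeating nodes. field_paths maps each output column name to a path
    evaluated relative to each selected node, like the fieldPath attribute.
    """
    root = ET.fromstring(xml_text)
    return [
        {column: node.findtext(path) for column, path in field_paths.items()}
        for node in root.findall(select_node)
    ]


columns = {"customer": "customer", "total": "total"}
rows = extract_rows(ORDERS_XML, "./order", columns)
print(rows)
# prints [{'customer': 'Acme', 'total': '120.50'}, {'customer': 'Globex', 'total': '75.00'}]

# A predicate in the selecting expression filters the records.
acme_only = extract_rows(ORDERS_XML, "./order[customer='Acme']", columns)
```

The result is a flat list of rows with named columns, the same form in which rows from a relational query would reach a layout.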

    Conclusion
    Using XML to meet our design requirements had several beneficial results. First, we were able to take advantage of a wide range of available open-source code to perform common XML operations, such as parsing report files into DOM trees for editing and using SAX to process the files on the server. Second, using XML allowed us to keep the format of our reports open and text-based, which in turn allows developers to use whichever tools they are comfortable with and to work with a syntax that is already familiar to them. This also made it easy for us to define extensibility APIs that allow customers to add their own components to the product in a uniform, easily understood way, and that simplify administration tasks such as adding new data sources to the system. Finally, we are better able to take advantage of XML technologies such as XPath and XQuery for working with native XML data as they become available.

    About Joe Marini
    Joe Marini is a senior engineer at Panscopic Corporation (www.panscopic.com), a provider of reporting solutions based on XML and J2EE standards. Joe has written and collaborated on a series of books about Web development.

    Reader Feedback

    Barry Schaeffer wrote: Couldn't agree with Joe more strongly. We have just completed the design of a complete legislative information system for a large state Senate, using XML as the foundation of a unified information life cycle. The Senate took the innovative step of also mandating integration of many of its legacy systems in the XML environment. We found appropriate software to support this and have developed the entire enterprise information model based on XML. My own conviction is that this is the future of complex data.

