Populating Word Documents on the Server with Microsoft .NET
Using VSTO to create and manipulate data islands in Office documents

Consider the following portion of an all-too-common server scenario. An authenticated user, perhaps a salesperson, requests a Word document from a server. The document is an expense report, and the server is an ASP, ASP.NET, or SharePoint Server. The server code looks up some information about the user from a database, Active Directory, or Web service. For example, perhaps the server has a list of recent corporate credit card activity that it will prepopulate into the expense list. The server starts up Word but keeps it "invisible" because there is no interactive user on the server. It then uses the Word object model to insert the data into a table, saves the result, and serves up the resulting file to the user.

This document life cycle is poor for two reasons. First, Microsoft does not support it and strongly recommends against it. Word and Excel were designed to be run interactively on client machines with perhaps a few instances of each running at the same time. They were not designed to be scalable and robust in the face of thousands of Web server hits creating many instances on "headless" servers that allow no graphical user interfaces.

Second, this process thoroughly conflates the "view" with the data. The server needs to know exactly how the document is laid out visually so that it can insert and remove the right fields in the right places. A simple change in the document format can necessitate many tricky changes in the server code.

However, automatically serving up documents full of a user's data is such a compelling scenario that many organizations have ignored Microsoft's guidelines and built solutions around server-side manipulation of Word and Excel documents. Those solutions tend to have serious scalability and robustness problems.

What can we do to mitigate these two problems?

Data-Aware VSTO Documents
One way to solve this problem is to move the processing onto the client. Visual Studio 2005 Tools for the Microsoft Office System (VSTO), which targets Office 2003, allows you to use Visual Studio to associate a customization written in C# or Visual Basic 2005 with a Word or Excel document. You can serve up a blank document that has a VSTO-managed customization behind it that runs on the client. When the document is opened, the customization can connect to the database server, retrieve the data, and place it in the document. When the client is ready to send the data back, the customization connects again and updates the database by reading the updated data from the document. No special document customization has to happen on the server at all, and the database server is doing exactly what it was designed to do.

This solution has a major drawback, however: it requires that every user have access to the database. From a security perspective, it might be smarter to only give the document server access to the database, thereby decreasing the "attack surface" exposed to malicious hackers. What we really want to do is have the document ready to go with the user data in it from the moment the user obtains the document, but without having to start up Word or Excel on the server.

XML File Formats
Avoiding the necessity of starting up a client application on the server is key. Consider the first half of the scenario above: the server takes an existing on-disk document and uses Word to produce a modified version of the document. Word is just a means to an end; if you know what changes need to be made to the bits of the document and how to manipulate the file format, you have no need to start up the client application.

The Word and Excel binary file formats are "opaque," but Word and Excel now support persisting documents in a much more transparent XML format. It is not too hard to write a program that manipulates the XML document without ever starting up Word or Excel. Word provides mechanisms to map an XML schema into the document and then create an XSLT that can transform XML data that matches that schema into the original mapped document.
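As a rough illustration of manipulating the XML format without starting Word, the following C# sketch loads a Word 2003 XML document with System.Xml and replaces a placeholder in its text runs. The `{EmpName}` placeholder convention, the `FillWordXml` class, and the `ReplaceToken` helper are assumptions made for this example and are not part of Word or VSTO. The document in `Main` is a tiny hand-written stand-in for a file saved by Word.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class FillWordXml
{
    // Namespace of the Word 2003 XML (WordprocessingML) vocabulary.
    const string W = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Hypothetical helper: replaces a placeholder token in every text run (w:t).
    // Returns the number of runs that were changed.
    public static int ReplaceToken(XmlDocument doc, string token, string value)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", W);

        // Copy the matches first so the tree is not modified while it is queried.
        List<XmlNode> runs = new List<XmlNode>();
        foreach (XmlNode t in doc.SelectNodes("//w:t", ns))
        {
            runs.Add(t);
        }

        int count = 0;
        foreach (XmlNode t in runs)
        {
            if (t.InnerText.Contains(token))
            {
                t.InnerText = t.InnerText.Replace(token, value);
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        // Stand-in for a document saved by Word in its XML format.
        string xml =
            "<w:wordDocument xmlns:w=\"" + W + "\">" +
            "<w:body><w:p><w:r><w:t>Employee: {EmpName}</w:t></w:r></w:p></w:body>" +
            "</w:wordDocument>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        int replaced = ReplaceToken(doc, "{EmpName}", "Jane Doe");
        Console.WriteLine("Runs changed: " + replaced);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Note that Word may split a single visible string across several runs, so a plain token search like this one only works when the placeholder sits inside one `w:t` element. That fragility is one reason the view and the data are better kept apart.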

However, the XML file formats have some drawbacks. Although it is certainly faster and easier to manipulate the XML format directly, parsing large XML files is still not blazingly fast. XML files tend to be quite a bit larger than the corresponding binary files. Word's schema mapping capability is sometimes too constraining for certain solutions - for example, Word's schema mapping is element-centric and doesn't do very well when mapping schemas that are attribute-heavy, such as a typed dataset schema.

We need a way to solve these additional problems. We need a solution that works on binary, non-human-readable files, works with VSTO-customized documents, handles cases that are difficult to achieve with Word's XML and XSLT transform techniques, and cleanly separates view from data.

The VSTO Data Island
VSTO allows you to associate a managed class called a "host item" with a Word document. VSTO allows you to cache the state of public host item class members that contain data in a "data island" so that they are persisted into the Word document as XML, independent of their user-interface representation. The document format can be either the standard Word binary DOC file format or the new Word XML format.

You can cache almost any kind of data in the XML data island. To be cacheable by the VSTO run time, the data must meet the following criteria:

  • The data must be stored in a public member variable or property of a host item (e.g., a customized Word document class)
  • If stored in a property, the property must have no parameters and be both readable and writable
  • The run-time type of the data must be dataset (or a subclass), data table (or a subclass), or any type that is serializable by the System.Xml.Serialization.XmlSerializer object

To tell VSTO that you would like to cache a member variable, just add the Cached attribute to its declaration. Before using the member variable in your code, you can check whether it still needs to be filled (that is, whether it was not loaded from the data island) by calling the NeedsFill method provided in a VSTO host item class.
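One way to check the third criterion before marking a member as cached is to round-trip a sample value through XmlSerializer outside of Office. The sketch below does this in a plain console program. The `ExpenseItem` type and the `RoundTrip` helper class are hypothetical names invented for this example; only `System.Xml.Serialization.XmlSerializer` comes from the framework.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical data type. It is public, has a parameterless constructor,
// and exposes public read/write members, which is what XmlSerializer expects.
public class ExpenseItem
{
    public string Description;
    public decimal Amount;
}

public class RoundTrip
{
    // Serializes the item to an XML string.
    public static string ToXml(ExpenseItem item)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ExpenseItem));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }

    // Rebuilds an item from the XML produced by ToXml.
    public static ExpenseItem FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ExpenseItem));
        using (StringReader reader = new StringReader(xml))
        {
            return (ExpenseItem)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        ExpenseItem item = new ExpenseItem();
        item.Description = "Taxi";
        item.Amount = 23.50m;

        // If the type is not serializable, the XmlSerializer constructor
        // or the Serialize call throws instead of producing XML.
        ExpenseItem copy = FromXml(ToXml(item));
        Console.WriteLine(copy.Description + " " + copy.Amount);
    }
}
```

A type that survives this round trip with its values intact is a reasonable candidate for a cached member. A type that throws here would presumably fail in the data island as well.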

Creating a Word VSTO Customization
Listing 1 shows a simple Word VSTO customization that has two cached member variables called EmpName and Expenses and a Bookmark called EmployeeNameBookmark. To create this customization, launch VSTO or Visual Studio Team System. Choose File > New > Project. Click the Office category under Visual C# as shown in Figure 1. Then click on Word Document as the project type and click OK. Name the project ExpenseReport.

A second dialog will appear prompting you to pick a document to use. Select the "create a new document" option to have VSTO create a new empty Word document. In the newly created project you will see the host item called ThisDocument.cs. Double click on ThisDocument.cs to display a Word editing view inside of Visual Studio. While in the Word editing view, select a place in the Word document where you want to insert the bookmark. From the Insert menu choose Bookmark and name the bookmark EmployeeName. Figure 2 shows the editing experience inside of Visual Studio. You can edit both the Word document itself and the host item code associated with the Word document without leaving the Visual Studio environment.

Now, right-click the ThisDocument.cs host item and choose View Code. Edit the code to look like Listing 1. The code declares two cached member variables - EmpName and Expenses. In the ThisDocument_Startup handler, it checks whether these cached member variables have been filled from the cache. If the string EmpName is filled, the bookmark we created is accessed to set its text to the value of the EmpName string. If the dataset Expenses is filled, we iterate over the dataset and put the data into a Word table - the code to do this is omitted for brevity.
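Listing 1 itself is not reproduced on this page, so the following is only a sketch of what the code described above might look like, not the authors' listing. It compiles only inside a VSTO Word document project, where Visual Studio generates the other half of the partial ThisDocument class. The name of the bookmark host control (`EmployeeName` here) is an assumption based on the bookmark name chosen in the designer, and the designer-generated region is abbreviated.

```csharp
using System;
using System.Data;
using Microsoft.VisualStudio.Tools.Applications.Runtime;

namespace ExpenseReport
{
    public partial class ThisDocument
    {
        // Public members marked Cached are persisted into the data island.
        [Cached]
        public string EmpName;

        [Cached]
        public DataSet Expenses;

        private void ThisDocument_Startup(object sender, EventArgs e)
        {
            if (this.NeedsFill("EmpName"))
            {
                // Nothing was found in the data island for EmpName.
                EmpName = "Unknown Employee";
            }
            else
            {
                // Assumed host control name for the bookmark created earlier.
                this.EmployeeName.Text = EmpName;
            }

            if (!this.NeedsFill("Expenses"))
            {
                // Iterate over Expenses and write the rows into a Word table.
                // Omitted here, as in the article.
            }
        }

        private void ThisDocument_Shutdown(object sender, EventArgs e)
        {
        }

        // Abbreviated version of the designer-generated event wiring.
        private void InternalStartup()
        {
            this.Startup += new EventHandler(ThisDocument_Startup);
            this.Shutdown += new EventHandler(ThisDocument_Shutdown);
        }
    }
}
```

The handler never touches the database or the server. It only reads whatever the data island already contains, which is what lets a server fill the island without starting Word.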

Press F5 to run the document customization and verify the cached data feature. Word will start up and the ThisDocument_Startup method will be called. On the first run, the data island will be empty, so the first call to NeedsFill will return true. The code sets EmpName to the string "Unknown Employee" but does nothing more. Save the document and close it. As the document is saved, the VSTO run time detects that a member variable marked as cached was changed and saves the state of that variable into the data island in the document - in this case the value of the variable EmpName. Next, reopen the document. On the second run, the call to NeedsFill will return false because the member variable EmpName is found in the data island. The code will then set the EmployeeName bookmark's text to the string read out of the data island.

About Eric Carter
Eric Carter is the development manager for the Visual Studio Tools for Office (VSTO) team at Microsoft. He helped invent, design, and implement many of the features that are in VSTO today. Previously at Microsoft he worked on Visual Studio for Applications, the Visual Studio Macros IDE, and Visual Basic for Applications for Office 2000 and Office 2003. For more information about VSTO, visit Eric's blog at http://blogs.msdn.com/eric_carter/default.aspx.

About Eric Lippert
Eric Lippert's primary focus during his nine years at Microsoft has been on improving the lives of developers by designing and implementing useful programming languages and development tools. He has worked on the Windows Scripting family of technologies, Visual Studio Tools for Office, and most recently, the new Language Integrated Query features of C# 3.0. For more information about VSTO, visit Eric's blog at http://blogs.msdn.com/ericlippert/.

Your Feedback
Frans wrote: The need to customize each client before one can view the word documents makes this feature useless. Also the xml loads extreeeemely slow. A waste of developer-time so far.