rsync and the Unsung Command Line
How to use rsync to keep data on your Unix computers synchronized perfectly

(LinuxWorld) -- This week's topic is a salute to the command line. It was inspired by a reader named Kevin, who recently brought to my attention some interesting limitations of Windows XP's new feature called "fast user switching." In case you missed the hype, "fast user switching" is Microsoft's name for "multi-user system." It lets more than one user log into Windows XP at the same time on the same machine.

For more details on Kevin's observations, see the resources section for a link to his comments posted on VarLinux.org. You'll also find a link to a column I wrote about fast user switching, in which I assumed that Microsoft had finally delivered a multi-user system with Windows XP. I've also written up some speculation on the matter in my Computerworld column for January 14, so when the 14th rolls around you may want to pay a visit to www.computerworld.com and browse through the columnists section for that particular article. The bottom line is that I admit I was wrong to assume fast user switching is Microsoft's delivery of multi-user capabilities. Based on Kevin's observations, it's obvious Microsoft still hasn't turned Windows NT into a true multi-user system.

How does this relate to the command line? Rather tangentially, I must admit. Bear with me as I walk you through the twisted thought process that led me there.

One of Kevin's observations is that you cannot use fast user switching if you are also using the Offline Folders feature of Windows XP. Offline Folders is a Microsoft Exchange feature that works something like the Briefcase in Windows 9x. It allows you to work on documents that are normally stored on a network even when you are disconnected from the network. When you reconnect, the documents are synchronized through local replication. You can check the resources section for links to the full descriptions of Offline Folders on Microsoft's site, but to quote the relevant section (OST refers to an Offline Storage file):

"If both the offline information store and the information store on the server have changed at the time they are synchronized, changed data on the OST is first copied to the server, and then changed data on the server is copied to the OST. If this is an automatic synchronization -- that is, it is occurring because the user has reconnected to the server -- data is copied in only one direction from the OST to the server, regardless of whether data on the server has changed."

 

Money for nothing

There are two things that strike me as odd about the Microsoft approach to this problem as compared to Linux or any other Unix. First, why does one have to purchase Microsoft Exchange to get this feature? I can accomplish the same task several different ways in Unix with free software that isn't even remotely related to a message store or even a database.

The obvious answer is that Microsoft is using the Offline Folders feature as one of its many crowbars, designed to crack open its customers' doors (not to mention their wallets) and shove Exchange into their enterprises.

There is no such agenda in Linux, so there is no need to complicate the process of synchronizing documents by wedging the folder replication feature into a message store. There are several ways you can accomplish the same goal in Linux, but the one that seems closest to duplicating the function of Microsoft's Offline Folders is rsync, which is the utility Kevin mentioned in his VarLinux.org comment.

I don't often find the need to work offline, but I happen to use rsync to synchronize my entire home directory across three machines. The rsync utility is handy for this purpose because it's very quick after the first synchronization is done. That's because rsync only copies information that has changed. This is similar to using the command cp -u, which only copies files that have been updated, except that cp -u checks only whether the source file is newer, while rsync by default compares each file's size and modification time and then uses a checksum algorithm to send only the parts of a file that differ. (The -c option makes rsync compare whole-file checksums instead.)
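To make the date-based comparison concrete, here is a small sketch of how cp -u decides what to copy. The scratch directory and the file name column.txt are made up for the demonstration, and it assumes a cp that supports -u (GNU coreutils does).

```shell
#!/bin/sh
# Scratch area so the demonstration touches nothing real (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"

# 1. The destination does not exist yet, so cp -u copies the file.
echo "first draft" > "$demo/src/column.txt"
cp -u "$demo/src/column.txt" "$demo/dst/"

# 2. The destination copy is edited and the source is given an old timestamp.
#    The source is now older than the destination, so cp -u skips it.
echo "edited on the other machine" > "$demo/dst/column.txt"
touch -t 200101010000 "$demo/src/column.txt"
cp -u "$demo/src/column.txt" "$demo/dst/"
skipped_result=$(cat "$demo/dst/column.txt")

# 3. The source is rewritten (fresh timestamp) and the destination is aged,
#    so the source is newer and cp -u copies it.
echo "second draft" > "$demo/src/column.txt"
touch -t 200101010000 "$demo/dst/column.txt"
cp -u "$demo/src/column.txt" "$demo/dst/"
copied_result=$(cat "$demo/dst/column.txt")

echo "after step 2: $skipped_result"
echo "after step 3: $copied_result"
rm -rf "$demo"
```

Step 2 is the interesting one: the two copies differ, but because the decision rests on the timestamp alone, nothing is copied.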

I also use rsync to make a local backup of the entire Documents directory tree on my file server. While I don't happen to work in disconnected mode very often, I could certainly do so thanks to this rsync process. If for some reason my file server failed, I could continue to write new columns and modify old ones with the assurance that all these files would be synchronized properly when my server came back up.

 

How it is done

One of the advantages of using command-line utilities over GUI configuration tools is that GUI tools tend to confine your options to whatever the GUI designer imagined you would want. The command line gives you almost unlimited options as to how you want to manage any given administrative task.

For example, I could create a shell script that synchronizes the Documents directory tree and place that script in my startup folder for KDE. That way I would be assured that the local Documents would always be synchronized before I could start up the KDE word processor, KWord.

If you're only interested in synchronizing the files once per day, you could instead create a cron job that runs the script once daily. In the case of Debian (and probably many other distributions), you can simply place the shell script in the directory /etc/cron.daily.
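One detail worth checking before you drop a script into /etc/cron.daily on Debian: that directory is processed by run-parts, which by default skips files whose names contain anything other than letters, digits, underscores, and hyphens. The sketch below tests a name against that rule. The function valid_cron_daily_name and the script name sync-documents are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Returns success if run-parts (in its default mode) would accept the name:
# only letters, digits, underscores, and hyphens are allowed.
valid_cron_daily_name() {
    case "$1" in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# sync-documents is a hypothetical name for the synchronization script.
for name in sync-documents sync-documents.sh; do
    if valid_cron_daily_name "$name"; then
        echo "$name: will be run by run-parts"
    else
        echo "$name: will be skipped by run-parts"
    fi
done

# Installing the script (as root) would then look something like:
#   install -m 755 sync-documents /etc/cron.daily/sync-documents
```

In other words, a .sh extension is enough to keep the script from ever running, so leave it off.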

The shell script might look something like this:

 

#!/bin/bash

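# rsync reads USER and RSYNC_PASSWORD from the environment, so export them
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
export USER="me"
export RSYNC_PASSWORD="secret-password"

echo Synchronizing Documents

# client to server first, then server back to client
rsync -bHlpogtr /var/Documents/* myserver::Documents
rsync -bHlpogtr myserver::Documents/* /var/Documents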

This script does a two-way synchronization that mimics the behavior of Microsoft's Offline Folders. It first copies any new files or changed files from the client to the server, and then copies anything that has changed at the server back to the client. Personally, I do the synchronization in reverse of this order, since I tend to work with files on the server and only store copies on my client as a backup.

The long list of command-line switches (-bHlpogtr) tells rsync to recurse through directories, preserve user and group ownership, keep permissions and timestamps intact, and so on. You can browse through the various options with the command man rsync.
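If the bundled switches are hard to read, each one has a long-form spelling. The sketch below only prints the equivalent command rather than running it, since it assumes nothing about your server. RSYNC_LONG_OPTS is a variable name made up for the example.

```shell
#!/bin/sh
# Long-form equivalents of the switches bundled in -bHlpogtr:
#   -b  --backup       keep a backup copy of any file that gets replaced
#   -H  --hard-links   preserve hard links
#   -l  --links        copy symbolic links as symbolic links
#   -p  --perms        preserve permissions
#   -o  --owner        preserve the owning user
#   -g  --group        preserve the owning group
#   -t  --times        preserve modification times
#   -r  --recursive    recurse into directories
RSYNC_LONG_OPTS="--backup --hard-links --links --perms --owner --group --times --recursive"

# Print the command instead of executing it.
echo rsync $RSYNC_LONG_OPTS '/var/Documents/*' myserver::Documents
```

The long form is more typing, but it is easier to review six months later in a cron script.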

You may be uncomfortable with the fact that the rsync password is integrated into the script itself, and justifiably so.

This is only necessary because the process of synchronization is automated. If you use rsync interactively, you probably want to use secure shell (SSH) as your transport, in which case rsync does not allow you to automate the process of entering a password. If you use SSH (or even rsh - remote shell) with rsync, you need to be there to type it in yourself. You can only automate the process of entering a password if you are using rsync to talk to an rsync server.

If you're going to automate the process, you'll have to set up an rsync server at the other end, configure the rsync server to recognize passwords, and then store the password somewhere on your local machine. Fortunately, rsync has an option called --password-file that allows you to store the password in a file that you can restrict to root access and hide somewhere. That isn't a perfectly secure solution, but you may prefer it to including the password in the cron script itself. If so, then you probably want to configure your script to look more like the following:

 

#!/bin/bash

# rsync reads USER from the environment, so export it
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
export USER="me"

echo Synchronizing Documents

# client to server first, then server back to client
rsync --password-file=/home/me/.rsyncpwd -bHlpogtr /var/Documents/* myserver::Documents
rsync --password-file=/home/me/.rsyncpwd -bHlpogtr myserver::Documents/* /var/Documents

In the above example, you'll have to create a file called /home/me/.rsyncpwd that contains the text secret-password.
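Creating that file might look something like the sketch below. Keep in mind that rsync refuses to use a password file that is world readable, so the permissions matter. The helper write_rsync_pwfile is hypothetical, and the demonstration writes to a scratch file rather than the real /home/me/.rsyncpwd.

```shell
#!/bin/sh
# write_rsync_pwfile is a hypothetical helper: it writes the password given
# as $2 into the file $1 and leaves the file readable by its owner only.
write_rsync_pwfile() {
    ( umask 077 && printf '%s\n' "$2" > "$1" ) && chmod 600 "$1"
}

# Demonstration against a scratch file; for real use, pass /home/me/.rsyncpwd.
scratch=$(mktemp)
write_rsync_pwfile "$scratch" "secret-password"
ls -l "$scratch"
```

Passing the password as an argument exposes it briefly in the process list, so on a shared machine you may prefer to create the file in an editor and then run chmod 600 on it.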

 

The rest of the rsync configuration

I only wanted to make the point about the superiority of command-line processes versus GUI administration, but in case you're interested in using rsync and haven't learned how to use it, here's the rest of what you'd have to do to make the above script work.

First, you need to make sure that your server (called "myserver" in this example) is set up to run rsync as a daemon. You can simply run rsync from an initialization script with the --daemon option, but I prefer to use the inetd approach. If you do, too, here's the line you want to add to /etc/inetd.conf (assuming, of course, that rsync is located in the /usr/bin directory).

 

rsync   stream  tcp     nowait  root   /usr/bin/rsync rsyncd --daemon

You need two more files to make this work: /etc/rsyncd.conf and /etc/rsyncd.secrets. The first file will look something like this (run man rsyncd.conf for more details):

 

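[Documents]
uid = me
gid = me
path = /var/stuff/Documents
comment = All server-stored documents
# Modules are read-only by default; allow the client to upload changes.
read only = false
# The secrets file is consulted only when auth users is set.
auth users = me
secrets file = /etc/rsyncd.secrets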

Then you'll need to specify a user and password for the rsync user called me in rsyncd.secrets. Make sure the file is not readable by other users, or the rsync daemon will refuse to use it. The entry should look something like this:

 

me:secret-password

That's all there is to it. Restart inetd and you should be able to run your synchronization script at the client.

 

Getting really twisted

My coverage of rsync was inspired by the need to duplicate what Microsoft offers through Exchange, but it's hardly an ideal example of how flexible and powerful the command line can be when compared to a GUI administration tool. You can do so much more at the Unix command line. While it may be possible to duplicate the functions by designing a flexible GUI, I can't imagine why anyone would bother doing so, since it would require a significant effort that wouldn't pay off in the end.

For example, here's a command I used to extract a list of unique host names of computers that probed my Web servers for nimda and other similar Windows Internet Information Server security holes. (My servers run Linux, so they are immune to such probes, but I was curious as to how many probes I was receiving per day. During the height of nimda's popularity, I received at least 30,000 probes the first weekend.)

 

egrep --regexp="^.*\.(exe|dll|ida).*"  \
/var/log/apache/access.log | cut -f 1 -d ' ' | sort | uniq

The above command searches the file /var/log/apache/access.log for log entries that contain any of the following file extension strings: .exe, .dll, or .ida. If it finds a match, it will output only the first field from that log entry, which is the host name of the computer that probed the web site for vulnerability to the nimda worm. It then sorts the output of all those sites, and eliminates any duplicate entries.

The power here lies in the ability to pipe the output of one command to another, then to another, and so on, so that you end up with the results you want using a single command. One of the coolest portions of this command line is the cut -f 1 -d ' ' part. As the egrep command finds matching text lines from the access.log file, this cut command cuts out the desired "field" from the text line. The -f 1 tells it to grab the first field, which is where the host name can be found. The -d ' ' portion tells it that all the fields in this text line are delimited by a space. Obviously, cut is a very powerful command, since it allows you to grab just about any information imaginable from a text line as long as you know how that text line is formatted. And if cut doesn't cut it for you, then there's always sed, a more powerful stream editor that lets you apply very complex search conditions to extract text from the output of a prior command.
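You can try the pipeline without touching a real log file. The sketch below feeds it a few made-up lines in Apache's common log format, using grep -E as the equivalent spelling of egrep. The host names and the function names sample_log, probe_hosts, and probe_counts are all invented for the example. The second function is a small variation that counts probes per host, which is one way to answer the "how many probes" question.

```shell
#!/bin/sh
# Made-up log lines in Apache common log format (hypothetical hosts).
sample_log() {
cat <<'EOF'
host-a.example.com - - [18/Sep/2001:09:14:02 -0400] "GET /scripts/root.exe?/c+dir HTTP/1.0" 404 276
host-b.example.com - - [18/Sep/2001:09:14:05 -0400] "GET /index.html HTTP/1.0" 200 1024
host-a.example.com - - [18/Sep/2001:09:14:09 -0400] "GET /default.ida?XXXX HTTP/1.0" 404 276
host-c.example.com - - [18/Sep/2001:09:15:30 -0400] "GET /msadc/Admin.dll HTTP/1.0" 404 276
EOF
}

# Same idea as the pipeline above, reading the log from standard input:
# keep the matching lines, cut out the first space-delimited field,
# then sort and drop duplicates.
probe_hosts() {
    grep -E '\.(exe|dll|ida)' | cut -f 1 -d ' ' | sort | uniq
}

# Variation: count the probes from each host, busiest host first.
probe_counts() {
    grep -E '\.(exe|dll|ida)' | cut -f 1 -d ' ' | sort | uniq -c | sort -rn
}

sample_log | probe_hosts
sample_log | probe_counts
```

To run either function against the real log, redirect the file into it, for example probe_hosts < /var/log/apache/access.log.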

In conclusion, I'll admit that GUIs are wonderful and I enjoy using the extremely powerful KDE for most of my work. I'll even confess that I'm a sucker for things like the "mosfet" theme for KDE that adds features like translucent menus and Macintosh-like liquid components. (For information on how to get these features, see resources.) But when it comes to getting serious work done, there's no administration tool that compares to the Unix command line.

About Nicholas Petreley
Nicholas Petreley is a computer consultant and author in Asheville, NC.

Reader Feedback

Quentin Neill wrote: You can setup non-interactive ssh connections by storing key material locally (in a .ssh directory) which rsync can use. Then at least your key material or password is stored within your home directory, and multiple users could use the rsync script.