Book Excerpt: Making Software a Service
Part 1 - Instead of developing a simple desktop application, you can develop your software as a service
By: Chris Moyer
Jul. 21, 2011 10:00 AM
Developing your Software as a Service (SaaS) takes you away from the dark ages of programming and into the new age in which copyright protection, DRM, and piracy don't exist. In the current age of computing, people don't expect to pay for software but instead prefer to pay for the support and other services that come with it. When was the last time anyone paid for a web browser? With the advent of Open Source applications, the majority of paid software is moving to hosted systems that rely less on the users' physical machines. This means you don't need to support additional hardware or work around other things on the user's machine that may conflict with your software, such as permissions, firewalls, and antivirus software.
Instead of developing a simple desktop application that you need to defend and protect against piracy and cloning, you can develop your software as a service, releasing updates and new content seamlessly while charging your users on a monthly basis. With this method, you can charge your customers a small monthly fee instead of making them pay a large amount for the program upfront, and you can make more money in the long run. For example, many people pirate Microsoft Office instead of shelling out $300 upfront for a legal copy, whereas if it were offered online in a format such as Google Docs, those same people might gladly pay $12.50 a month for the service. Not only do they get a web-based version that they can use on any computer, but everything they save is stored online and backed up. After two years of that user paying for your service, you've made as much money from that client as you would have from the desktop version, and you're ensuring that they'll stay with you as long as they want to have access to those documents. On the other hand, if your users try the software for a month and decide they don't like it, they don't need to continue the subscription, and they have lost only a small amount of money. If you offer a trial-based subscription, users can test your software at no cost, which means they're more likely to sign up.
Tools Used in This Book
Signing Up for Amazon Web Services
After you create your account, log in to your portal by clicking Account and then choosing Security Credentials. Here you can see your Access Credentials, which will be required in the configuration section later. At any given time you may have two access keys associated with your account; these are your private credentials for accessing Amazon Web Services. You may also deactivate either key, which helps when migrating to a new set of credentials because both can remain active until everything has been moved over to your new keys.
The next step will be to actually install the boto package. As with any Python package, this is done using the setup.py file, with either the install or develop command. Open up a terminal, or command shell on Windows, change the directory to where you downloaded the boto source code, and run
$ python setup.py install
Depending on what type of system you run, you may have to do this as root or administrator. On UNIX-based systems, this can be done by prepending sudo to the command:
$ sudo python setup.py install
On Windows, you should be prompted for your administrative login if it's required, although most likely it's not.
Setting Up the Environment
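You can make this the active credential file by setting an environment variable AWS_CREDENTIAL_FILE and pointing it to the full location of this file. On bash-based shells, this is done with the export built-in, in the form export AWS_CREDENTIAL_FILE=/full/path/to/the/file, substituting the real path for the placeholder.
You can also add this to your shell's RC file, such as .bashrc or .zshrc. If you use T-Shell instead, add the setenv equivalent to your .tcshrc, in the form setenv AWS_CREDENTIAL_FILE /full/path/to/the/file.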
For boto, create a boto.cfg that enables you to configure some of the more boto-specific aspects of your systems. Just like in the previous example, you need to make this file and then set an environment variable, this time BOTO_CONFIG, to point to the full path of that file. Although this configuration file isn't completely necessary, some things can be useful for debugging purposes, so go ahead and make your boto.cfg:
# File: boto.cfg
# Set the default SDB domain
# Set up base logging
The first thing to do here is set up an [Instance] section that makes your local environment act like an EC2 instance. This section is automatically added when you launch a boto-based EC2 instance by the startup scripts that run there. These configuration options may be referenced by your scripts later, so adding this section means you can test those locally before launching an EC2 instance.
Next, set the default SimpleDB domain to "default," which will be used by the Object Relational Mappings you'll experiment with later in this excerpt. For now, all you need to know is that this will store all your examples and tests in a domain called "default," and that you'll create this domain in the following testing section.
Finally, you set up a few configuration options for the Python logging module, which specify that all logging should go to standard output, so you'll see it when running from a console. These options can be customized to send the logging to a file and to use any other format you may want, but for the basics here just dump it to your screen and show only log messages at or above the INFO level. If you encounter any issues, you can drop this down to DEBUG to see the raw queries being sent to AWS.
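As a rough illustration of what the logging options described above amount to, the following sketch sets up the same behavior directly in Python rather than through boto.cfg. It is not the content of the configuration file: the helper name configure_console_logging, the message format, and the use of a logger named "boto" are assumptions made for this example.

```python
import logging
import sys


def configure_console_logging(level=logging.INFO):
    """Send log records to standard output at the given level or above."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s [%(levelname)s] %(message)s")
    )
    logger = logging.getLogger("boto")  # assumed logger name for this sketch
    logger.handlers = [handler]         # replace handlers left by earlier calls
    logger.setLevel(level)
    logger.propagate = False            # avoid duplicate lines via the root logger
    return logger


log = configure_console_logging()
log.info("shown: INFO is at the configured level")
log.debug("hidden: DEBUG is below INFO")

# Drop the level to DEBUG when troubleshooting.
log = configure_console_logging(logging.DEBUG)
log.debug("shown now that the level is DEBUG")
```

Keeping the level as a single parameter mirrors the advice in the text: the only change needed while debugging is to lower INFO to DEBUG.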
Testing It All
>>> import boto
>>> boto.connect_sdb().create_domain("default")
The preceding code tests your connectivity to SimpleDB and creates the default domain referenced in the previous configuration section. This domain is used in later sections of this excerpt, so make sure you don't get any errors. If you get an error message indicating you haven't signed up for the service, go to the AWS portal and sign up for SimpleDB. If you get another error, you may have configured something incorrectly, so read the error message to see what the problem may have been. If you're still having issues, you can always head over to the boto home page: http://github.com/boto/boto or ask for help in the boto users group: http://groups.google.com/group/boto-users.
What Does Your Application Need?
Think about this application as a typical nonstatic website that requires some sort of execution environment or web server, such as an e-commerce site or web blog. When a request comes in, you need to return an HTML page, or perhaps an XML or JSON representation of just the data, that may be either static or dynamically created. To determine this, you need to process the actual request using your compute power. This process also requires fast temporary storage to store the request and build the response. It may also require you to pull information about the users out of a queryable long-term storage location. After you look up the users' information, you may need to pull out some larger long-term storage information, such as a picture that they may have requested or a specific blog entry that is too large to store in a smaller queryable storage engine. If the users request to upload a picture, you may have to store that image in your larger long-term storage engine and then request that the image be resized to multiple sizes, so it may be used for a thumbnail image. Each of these requirements your application has on the backend may be solved by using services offered by your cloud provider.
If you expand this simple website to include any service, you can see that all your applications need exactly the same things. If you split this application into multiple layers, you can begin to understand what it truly means to build SaaS instead of just the typical desktop application. One major advantage of SaaS is that it lends itself to subscription-based software, which doesn't require complex licensing or distribution points. This not only cuts cost but also ensures that you won't have to worry about piracy. Because you're actually providing a service, you're locking your clients into paying you every time they want to use the service. Clients also prefer this method because, just like with a cloud-hosting provider, they don't have to pay as much upfront, and they can typically buy into a small trial account to see if it will work for them. They also don't have to invest in any local hardware and can access their information and services from any Internet connection. This type of application moves away from requiring big applications on your clients' systems to processing everything on your servers, which means clients need less money to get into your application.
Taking a look back at your website, you can see that there are three main layers of this application. This is commonly referred to as a three-tier application pattern and has been used for years to develop SaaS. The three layers include the data layer to store all your long-term needs, the application layer to process your data, and the client or presentation layer to present the data and the processes you can perform for your client.
Another large part of this layer is the small, fast, and queryable information. In most typical systems, this is handled by a database. This is no different in cloud-based applications, except for how you host this database.
Introducing the AWS Databases
RDS is Amazon's solution for applications that cannot be built using SDB because they place complex requirements on their databases, such as complex reporting, transactions, or stored procedures. If you need your application to do server-based reports that use complex select queries joining multiple objects, or you need transactions or stored procedures, you probably need to use RDS. This new service is Amazon's answer to running your own MySQL database in the cloud and is actually nothing more than an Amazon-managed MySQL installation. You can use this solution if you're comfortable with MySQL because Amazon manages your database for you, so you don't have to worry about any of the IT-level details. It has support for cloning, backing up, and restoring based on snapshots or points in time. In the near future, Amazon will be releasing support for more database engines and expanding its solutions to support high availability (write clustering) and read-only clustering.
If you can't figure out which solution you need to use, you can always use both. If you need the flexibility and power of SDB, use that for creating your objects, and then run scripts to push that data to MySQL for reporting purposes. In general, if you can use SDB, you probably should because it is generally a lot easier to use. SDB is split into a simple three-level hierarchy of domain, item, and key-value pairs. A domain is almost identical to a "database" in a typical relational DB; an item can be thought of as a table that doesn't require any schema, and each item may have multiple key-value pairs below it that can be thought of as the columns and values in each item. Because SDB is schema-less, it doesn't require you to predefine the possible keys that can be under each item, so you can push multiple item types under the same domain.
In Figure 1, the connection between an item and its key-value pairs is a one-to-many relation, so you can have multiple key-value pairs for each item. Additionally, the keys are not unique, so you can have multiple key-value pairs with the same key, which is essentially the same thing as a key having multiple values.
Figure 1: The SDB hierarchy
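To make the hierarchy concrete, here is a minimal in-memory sketch of the domain, item, and key-value structure using plain Python dictionaries. It does not talk to SDB; the helper put_attribute and the sample item names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# domain -> item name -> key -> list of values (every value is a string)
domain = defaultdict(lambda: defaultdict(list))


def put_attribute(domain, item_name, key, value):
    """Add one key-value pair to an item; repeated keys accumulate values."""
    domain[item_name][key].append(str(value))


# Two different kinds of item can share one domain because there is no schema.
put_attribute(domain, "user-1", "name", "Alice")
put_attribute(domain, "user-1", "tag", "admin")
put_attribute(domain, "user-1", "tag", "beta")   # same key, second value
put_attribute(domain, "photo-7", "width", 640)

print(dict(domain["user-1"]))   # {'name': ['Alice'], 'tag': ['admin', 'beta']}
print(dict(domain["photo-7"]))  # {'width': ['640']}
```

Storing each key's values in a list is one way to model the point that a repeated key is equivalent to a key with multiple values.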
Connecting to SDB
>>> import boto
This returns a single item by its name, which is logically equivalent to selecting all attributes by an ID from a standard database. You can also perform simple queries on the database, as shown here:
>>> db.select("SELECT * FROM `my_domain_name` WHERE `name` LIKE '%foo%' ORDER BY `name` DESC")
The preceding example works much like a standard relational DB query, returning all attributes of every item whose name key contains foo anywhere in its value, sorted by name in descending order. SDB sorts and compares lexicographically and handles only string values, so it doesn't understand that -2 is less than -1. The SDB documentation provides more details on this query language for more complex requests.
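The effect of string-only, lexicographical comparison can be reproduced in plain Python without any connection to SDB. The sample values below are made up for the demonstration.

```python
values = ["10", "9", "2", "-1", "-2"]

# Strings are compared character by character, not by numeric value.
print("-2" < "-1")               # False: after the '-', '2' sorts after '1'
print("10" < "9")                # True: '1' sorts before '9'

print(sorted(values))            # ['-1', '-2', '10', '2', '9']
print(sorted(values, key=int))   # ['-2', '-1', '2', '9', '10']
```

The two sorted lists differ, which is why numeric values need some form of encoding before they are stored as strings.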
Using an Object Relational Mapping
from boto.sdb.db.model import Model
This code creates two classes (which can be thought of like tables). The first, SimpleObject, contains a name, a number, and a multivalued property of strings. The number is automatically converted by adding an offset to the value when it is stored and is properly loaded back by subtracting that offset. This conversion ensures that the number stored in SDB is always positive, so lexicographical sorting and comparison always work. The multivalue property acts just like a standard Python list, enabling you to store multiple values in it and even remove values. Each time you save the object, everything that was stored for it is overwritten. Each object also has an id property by default that is actually the name of the item because that is a unique ID. It uses Python's UUID module to generate this ID automatically if you don't manually set it. This UUID module generates random, unique strings, so you don't rely on a single point of failure to generate sequential numbers. The collection_name attribute on the object_link property of AnotherObject is optional but enables you to specify the property name that is automatically created on the SimpleObject. This reverse reference is generated for you automatically when you import the second object.
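The number conversion described above can be sketched in a few lines. This is an illustration of the idea under stated assumptions, not boto's own code: the offset of 2**31, the fixed width of ten digits, and the names encode_int and decode_int are all chosen for this example. The zero padding is an addition here, needed so that strings of different magnitudes still compare correctly.

```python
OFFSET = 2 ** 31   # assumed offset: shifts signed 32-bit integers to be non-negative
WIDTH = 10         # assumed width: 2**32 - 1 has ten decimal digits


def encode_int(value):
    """Return a fixed-width string whose string order matches numeric order."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value outside the signed 32-bit range")
    return "%0*d" % (WIDTH, value + OFFSET)


def decode_int(text):
    """Reverse encode_int by subtracting the offset again."""
    return int(text) - OFFSET


numbers = [10, 9, 2, -1, -2]
encoded = [encode_int(n) for n in numbers]

# Sorting the encoded strings gives the same order as sorting the numbers.
print([decode_int(s) for s in sorted(encoded)])   # [-2, -1, 2, 9, 10]
```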
boto enables you to create and query these objects in the database in another simple manner. It provides a few methods that use the values available in boto's SDB connection objects for you, so you don't have to worry about building your query. To create an object, you can use the following code:
>>> my_obj = SimpleObject("object_id")
To create the link to the second object, you have to actually save the first object unless you specify the ID manually. If you don't specify an ID, it will be set automatically for you when you call the put method. In this example, the ID of the first object is set but not for the second object.
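The ID behavior described above can be sketched without SDB. The class below is a hypothetical stand-in, not boto's Model class; it only shows the rule that a manually chosen ID is kept, while a missing ID is filled in with a random UUID at the moment put is called.

```python
import uuid


class SketchObject:
    """Hypothetical stand-in for a stored object; not boto's Model class."""

    def __init__(self, id=None):
        self.id = id

    def put(self):
        # Assign a random UUID only if the caller did not choose an ID.
        if self.id is None:
            self.id = str(uuid.uuid4())
        # A real implementation would write the attributes to storage here.
        return self


first = SketchObject("object_id").put()
second = SketchObject()
print(second.id)    # None until put() is called
second.put()
print(first.id)     # object_id
print(second.id)    # a random 36-character UUID string
```

This is why a link to another object needs either a saved target or a manually set ID: before put is called, there is no ID to point at.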
To select an object given an ID, you can use the following code:
>>> my_obj = SimpleObject.get_by_id("object_id")
This call returns an instance of the object and enables you to retrieve any of the attributes contained in it. There is also a "lazy" reference to the second object, which is not actually fetched until you specifically request it:
You call next() on the other_objects property because what's returned is actually a Query object. This object operates exactly like a generator and only performs the SDB query if you actually iterate over it. Because of this, you can't do something like this:
This feature is implemented for performance reasons because the query could actually return a list of thousands of records, and performing an SDB request would consume a lot of unnecessary resources unless you're actually looking for that property. Additionally, because it is a query, you can filter on it just like any other query:
>>> query = my_obj.other_objects
In the preceding code, you would then be looping over each object that has a name ending with Other, sorting in descending order on the name. After returning all matching results, a StopIteration exception is raised, which results in the loop terminating.
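The lazy behavior of such a query can be imitated with a small generator-based class. LazyQuery and fetch_rows are hypothetical names for this sketch and are not part of boto; the sketch also assumes that indexing is the kind of operation a generator-like object does not support.

```python
class LazyQuery:
    """Hypothetical stand-in for a query object; not boto's Query class."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that performs the expensive lookup
        self.calls = 0        # counts how often the lookup actually ran

    def __iter__(self):
        self.calls += 1       # runs only once iteration actually starts
        for row in self._fetch():
            yield row


def fetch_rows():
    # Placeholder for a remote request returning matching records.
    return ["Second Other", "First Other"]


query = LazyQuery(fetch_rows)
print(query.calls)            # 0: creating the query fetched nothing

for row in query:             # iterating triggers the lookup
    print(row)                # the loop ends when StopIteration is raised
print(query.calls)            # 1

first = next(iter(query))     # fetch only the first result

try:
    query[0]                  # no __getitem__, so indexing is unsupported
except TypeError as exc:
    print("indexing failed:", exc)
```

Deferring the lookup until iteration means an object with thousands of linked records costs nothing extra to load unless those records are actually requested.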