Finally: Crash-consistent multiple-volume snapshots in AWS

Last week, Amazon Web Services (AWS) announced support for crash-consistent snapshots of multiple EBS volumes: https://aws.amazon.com/blogs/storage/taking-crash-consistent-snapshots-across-multiple-amazon-ebs-volumes-on-an-amazon-ec2-instance/. This is a big deal, especially for database workloads, because it is now much easier to take a consistent snapshot of all volumes (RAID, data + log, file groups, etc.) involved in a single database instance, and therefore to build safe disaster recovery solutions for databases running on EC2. Big step forward for AWS!
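For those who want to try it, here is a minimal sketch using the AWS SDK for .NET (AWSSDK.EC2) of the CreateSnapshots API behind the announcement. The instance ID is a placeholder, and excluding the boot volume is just one possible choice for a database workload:

    using System;
    using System.Threading.Tasks;
    using Amazon.EC2;
    using Amazon.EC2.Model;

    class MultiVolumeSnapshot
    {
        static async Task Main()
        {
            var ec2 = new AmazonEC2Client(); // region/credentials come from the environment

            // A single CreateSnapshots call snapshots all EBS volumes attached
            // to the instance at the same point in time (crash-consistent).
            var response = await ec2.CreateSnapshotsAsync(new CreateSnapshotsRequest
            {
                InstanceSpecification = new InstanceSpecification
                {
                    InstanceId = "i-0123456789abcdef0", // placeholder instance ID
                    ExcludeBootVolume = true            // only the data + log volumes
                },
                Description = "Crash-consistent snapshot of all SQL Server volumes"
            });

            foreach (var snapshot in response.Snapshots)
                Console.WriteLine($"{snapshot.SnapshotId} ({snapshot.VolumeId})");
        }
    }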

SQL Server on Amazon Web Services: 2014 in review

I have a particular interest in the combination of Microsoft SQL Server and the cloud services of Amazon Web Services, for the following reasons:

  • For almost all web applications (and for all SaaS applications in particular), the database system is by far the most business-critical component, where absolutely nothing should go wrong, in terms of both performance and reliability.
  • Microsoft SQL Server has always been an incredibly robust and user-friendly relational database system, and of course a no-brainer if you work within the Microsoft ecosystem. Its (unavoidable) move to core-based licensing has driven up licensing costs (and lost it some friends), but the licensing costs of SQL Server Web Edition in particular are still very low (the Enterprise edition is far too expensive, though, keeping it and its features out of reach of many companies).
  • Amazon Web Services (AWS) dwarfs all other cloud-service providers, and is in most cases the default choice, even when using Microsoft products. 2014 has seen a very rapid pace of innovation from Amazon. The only time AWS wasn’t the leader was in March 2014, when Google drastically reduced the prices for its cloud storage (to $0.026/GB/month); only a day later AWS followed with a similar price reduction of around 60%.

So, in review, some of the important features and innovations of AWS in 2014, relating to SQL Server:

  1. Cloud storage price reduction (March): The prices of Amazon S3 have been reduced by around 60% (to $0.03/GB/month). This makes it even easier to store a very generous amount of SQL Server backups and snapshots in S3; it will still be a negligible amount on your monthly bill from AWS. My backup strategy is now to make EBS snapshots of the primary SQL Server data disks every 10 minutes, make transaction-log backups every 15 minutes and snapshot the disks holding the log backups, and make nightly full SQL Server backups that are stored separately in S3 (a sketch of that last step follows after this list).
  2. New memory-optimized instances (April): The R3 instance types use new Intel Ivy Bridge processors, and go up to 32 cores and 244 GB RAM. These instances should now be the default choice when setting up your own SQL Server instance. There are some important caveats with these instances, though:
    • The R3 instances have no direct competition from Google, so they have not been included in any price reductions and are not very cheap: r3.2xlarge, with 8 cores, 61 GB RAM, and SQL Server Web Edition, costs $1.555 per hour.
    • SQL Server Web Edition (the best choice for starting web companies) can only use 64 GB of RAM, so any instance type bigger than r3.2xlarge is wasted.
  3. Multi-Availability-Zone (mirroring) support for SQL Server in Amazon’s Relational Database Service (RDS) (May): Although I’m personally moving away from Amazon RDS (Amazon’s RDBMS PaaS solution), mainly because of its lack of backup options at the database level and its lack of storage scalability, RDS is still a great way to start with Amazon’s cloud services. This Multi-AZ support is based on SQL Server database mirroring, which brings a few issues:
    • SQL Server mirroring is not supported for the SQL Server Web Edition, so no Multi-AZ if you use this version of SQL Server.
    • Microsoft seems to be moving away from mirroring as a disaster recovery method, towards AlwaysOn availability groups.
  4. SSD-backed Elastic Block Storage (June): Previously, only the instance storage (which is lost on stopping and starting an instance) could be an SSD. Now all virtual drives can (and should) be SSD storage. This is a huge gain for SQL Server, which benefits tremendously from the much lower seek times of SSDs. When using SSD drives on Amazon I don’t even bother anymore with provisioned IOPS, because an SSD volume of 1 TB (currently the maximum size) already gives you 3,000 IOPS.
  5. New Amazon region in Frankfurt (October): Until now, Ireland was the only region (a set of datacenters) of Amazon within the EU. With this second region, it has become an option to spread your SQL Server instances, backups, and snapshots over multiple regions, while still keeping all data within the EU.
  6. Private DNS (November): You can now give your SQL Server instances easy-to-use network names.
  7. New payment options for reserved instances (December): In most cloud solutions, the database servers are by far the most expensive instances. This also makes the upfront costs for reserving these instances a big hurdle, even if the costs over the full term are lower. There are now payment options for reserved instances without upfront payments, where you still get a 30% price reduction if you reserve the instance for a year.
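As promised in point 1, here is a minimal sketch of the nightly full-backup step: run a full backup through SqlClient, then copy the file to S3 with the AWS SDK for .NET. The database name, file paths, and bucket name are placeholders:

    using System.Data.SqlClient;
    using Amazon.S3;
    using Amazon.S3.Transfer;

    class NightlyFullBackup
    {
        static void Run()
        {
            const string backupFile = @"D:\Backups\MyDatabase_Full.bak";

            using (var conn = new SqlConnection(@"Server=.;Integrated Security=true"))
            using (var cmd = new SqlCommand(
                "BACKUP DATABASE [MyDatabase] TO DISK = @file WITH INIT, COMPRESSION", conn))
            {
                cmd.CommandTimeout = 0; // full backups can run long; don't let the client give up
                cmd.Parameters.AddWithValue("@file", backupFile);
                conn.Open();
                cmd.ExecuteNonQuery();
            }

            // TransferUtility transparently uses multipart upload for large backup files.
            new TransferUtility(new AmazonS3Client())
                .Upload(backupFile, "my-backup-bucket", "sql/MyDatabase_Full.bak");
        }
    }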

Announced, but not yet available:

  • EBS volumes larger than 1 TB. I have always been skeptical of software-based RAID over virtual drives in a cloud environment, so it’s great to see that EBS volumes of up to 16 TB have been announced, making life a lot easier for administrators of large SQL Server databases.

Curious about:

  • Amazon Aurora, a new “cloud-optimized” relational database engine. I’m always a bit skeptical when cloud companies feel the need to build their own “cloud-optimized” relational database engine. As we have seen with Azure SQL, companies are reluctant or unable to re-architect their applications to use these database engines effectively, and they also really need the full feature set of the full database product (such as database backups, point-in-time restores, etc.). But we’ll see how this turns out.

Wishlist for 2015:

  • SQL Server Enterprise as “license-included”. Currently, Amazon does not offer server instances that include SQL Server Enterprise. You have to buy your own licenses, which must include Microsoft Software Assurance, and somehow certify these licenses for use in the cloud. The upfront cost and hassle of this is simply too great. Microsoft Azure offers instances with SQL Server Enterprise included, so is there a reason why Amazon can’t, or is not allowed to?
  • Faster EBS snapshots. I really like the snapshot mechanism of EBS, but at the moment snapshots often take so long that in practice you can take them (at best) only every 15 minutes (I would prefer every 5 minutes), making them less than ideal as a disaster recovery solution.
  • Make RDS SQL Server more useful. Currently, RDS SQL Server has some limitations that make it impractical for quite a few use cases:
    1. Make storage scalable. Currently, the storage used by a SQL Server RDS instance cannot be increased.
    2. Make database-level backups available.
    3. Lift the limit of 30 databases.
  • Improve the AWS console UI. This is a common complaint from almost all AWS users: the web-based console UI is still very primitive (especially for such crucial areas as IAM security), and it’s always a guess which feature works in which browser.

All in all, it’s been a pretty good year for all of us running SQL Server on AWS, with the general availability of SSD drives as the highlight. Happy 2015, everyone!

The accumulating cost of digital certificates

Over the last few months I’ve had to go through several of the annoying certificate-renewal processes that recur every year. It’s always painful: you invariably start the process too late, it’s a lot of effort and paperwork (which never goes smoothly), and you realize that you’re paying quite a lot of money. For what, exactly?
We are a small SaaS (Software as a Service) company, and we have the following digital certificates:

  • Two wildcard SSL certificates. Every SaaS company that offers a separate sub-domain for each customer needs these. We pay €775 for each wildcard certificate, with a 2-year validity (from Thawte).
  • A code-signing certificate. We offer a few rich-client applications from our web applications, using ClickOnce technology. These applications need a digital signature. The cost is €385 for a 2-year code-signing (Authenticode) certificate from Thawte.
  • A BAPI certificate. This is needed for (e-mail) communications with the Dutch tax authorities. Costs €240, with a validity of 3 years.
  • A PKI certificate from the Dutch government, for SOAP-webservice communications with the Dutch authorities (Government, Chambers of Commerce, Banks). Cost: €600 for a certificate with a validity of 3 years.

In my case, the renewal of Thawte’s code-signing certificate was especially painful, as it didn’t work on Windows 8, and I was sent into several dead ends by Thawte’s support, such as “try it in Firefox” (Authenticode certificate renewal only works in IE), or “start IE9 in Windows 8” (how exactly?).

All in all, it may not seem like a lot of money in total, but if you include the effort it takes every time, I find it quite a burden. It would be nice if, in the future, you only needed a single digital certificate for the company as a whole.

Why are there no out-of-the-box high-availability solutions for SQL Server in the cloud?

When selecting any cloud service, you would like to get some guarantees about the reliability (uptime) and durability (the chance that your data is lost) of the service. Now, let’s see what we can find for cloud relational database services, either as a PaaS (Platform as a Service), or installed as part of an IaaS (Infrastructure as a Service, i.e. virtual machines) configuration.

  • Windows Azure SQL Database gives an uptime SLA of 99.9%. The database is replicated threefold (within the same datacenter), but Microsoft gives no SLA for durability of the data. It indicates that to secure your data against widespread datacenter failure, you will need to export your complete database yourself (as a SQL script, or a BACPAC file).
  • Amazon RDS (Relational Database Service) has no SLA at all for uptime or durability. In the past, RDS has turned out to be the most unreliable of Amazon’s services, because it is built on top of the EC2 service (virtual machines) and the EBS service (virtual drives). RDS uses SQL Server log files to be able to restore a complete server instance (not separate databases) to a point in time at most 5 minutes in the past. It also maintains daily snapshots of the instance, which can be retained for a maximum of 35 days. RDS also offers live replication and failover to another datacenter (Multi-AZ), but this is not available for SQL Server.
  • SQL Server running on virtual machines from Amazon (EC2) or Microsoft (Azure Virtual Machines) has no SLA of its own. There is only an SLA of 99.95% uptime for the underlying virtual machines, from both Amazon and Microsoft.

So, from this I have to conclude that there’s really a lot of uncertainty about the reliability and durability of any database service in the cloud. It also means that for SaaS applications where you cannot afford a few hours (or more) of downtime, you have to look into separate high-availability solutions. What is available in this area, specifically for SQL Server?

  • Windows Azure SQL Database has no built-in high-availability solution. You can copy a database to a new database in the same datacenter, but to copy it to another datacenter you have to export/import all data to a new database. Failover to a database copy will have to be managed manually.
  • Amazon RDS has a Multi-Availability-Zone option (Multi-AZ), where a database replica is maintained in another datacenter in the same region. This option, however, is only available for Oracle and MySQL, not for SQL Server. For SQL Server your only option is to make instance snapshots, but you cannot copy RDS snapshots to another datacenter.
  • On Amazon or Microsoft Azure virtual machines, you can implement any of the standard high-availability solutions of SQL Server (SQL Server 2012 AlwaysOn, database mirroring, log shipping). But keep in mind the following:
    • SQL Server 2012 AlwaysOn, and asynchronous database mirroring, are only available in the Enterprise Edition of SQL Server. You will have to buy your own license for this, which is extremely pricey ($7,000 per core).

    All of these high-availability solutions that you implement yourself on top of cloud virtual machines require a lot of expertise, ongoing management, and more than double the cost of the solution.

Conclusion: In general, for SaaS suppliers durability of data is much more important than uptime. In most cases 99.9% uptime would be sufficient (less than 1 hour of downtime each month), but data durability has to be “nine nines” (99.9999999%), because any permanent data loss is unacceptable. Currently, none of the cloud relational database suppliers can offer this reliability “out of the box” for SQL Server. Amazon RDS’s Multi-AZ solution comes closest (at almost double the price), but it is not available for SQL Server. The suggested manual solutions (based on export/import of data) are in many cases completely unrealistic for large amounts of data, because an import of terabytes of data could take days to complete.

With all this, are cloud relational database services currently an option for business-critical applications? I think they are, but the situation with regard to high availability needs to improve rapidly, so that high availability is simply an option you check. In the meantime, especially when you need to keep costs in check, we will need to rely on old-fashioned methods like log shipping and data export to safeguard our own data. To give an idea of how little is involved conceptually, one iteration of a do-it-yourself log-shipping loop is sketched below.
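A minimal sketch, assuming a standby that was seeded from a full backup restored WITH NORECOVERY, and a shared path both servers can reach. Server names, paths, and the database name are placeholders; a real implementation also has to copy files between machines, keep the backups in sequence, and handle failures:

    using System.Data.SqlClient;

    class LogShipping
    {
        // Ship one transaction-log backup from the primary to the standby.
        static void ShipOnce(string sharedPath)
        {
            Execute("Server=primary;Integrated Security=true",
                $"BACKUP LOG [MyDatabase] TO DISK = '{sharedPath}' WITH INIT");

            // WITH NORECOVERY keeps the standby ready to accept the next log backup.
            Execute("Server=standby;Integrated Security=true",
                $"RESTORE LOG [MyDatabase] FROM DISK = '{sharedPath}' WITH NORECOVERY");
        }

        static void Execute(string connectionString, string sql)
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn) { CommandTimeout = 0 })
            {
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }

Run ShipOnce every few minutes, and your maximum data loss is roughly that interval.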

Amazon RDS for SQL Server has some ridiculous storage constraints

When you think about cloud computing, you think scalability, flexibility, etc. Right? Not really, when you look at the Amazon Relational Database Service (RDS) for SQL Server, at least at the storage part of that service. RDS is Amazon’s database PaaS (Platform as a Service) offering for relational databases, and it does a lot of things right: new instances running with a few clicks, fully-automated backups with restore-to-point-in-time functionality, good monitoring tools, etc. However, when it comes to storage for the SQL Server flavor of RDS, there are some serious constraints:

  • You cannot change the allocated data storage after creating the SQL Server instance. That’s right: you have to allocate, and pay for, the maximum storage that you will ever use in the future. Not only is this impossible to predict, but it means that you pay for a lot of unused storage, something “the cloud” was supposed to prevent.
  • When selecting guaranteed I/O performance (provisioned I/O per second, or IOPS), additional constraints apply:
    • The storage must be an exact multiple of 100 GB. Storage costs $0.138 per GB/month.
    • The selected IOPS must be exactly 10 times the storage in GB (so with 200 GB of storage you must select 2,000 IOPS). IOPS cost $0.11 per IOPS/month.

Now, for any serious database use, you need reliable performance, which means that you almost always have to select the provisioned-IOPS option. So if I create a database instance with 1 TB of storage (the maximum, which I have to select because later expansion is not possible), I have to select the 10,000 IOPS option, and I am paying $138 per month for the storage and $1,100 per month for the IOPS. In addition to that, of course, you also pay for the instance itself.

Strangely enough, the storage constraints for Oracle and MySQL are much less strict: you can scale the storage up at any time, without downtime, and when selecting provisioned IOPS the storage can be any value between 100 GB and 3 TB, with an IOPS/storage factor of anywhere between 3 and 10.

So what is going on here? Do Windows Server and/or SQL Server have such inherent constraints that the same storage flexibility as for Oracle/MySQL can never be achieved? Or has the Amazon RDS team just not yet had the time to fully come to grips with the Microsoft systems?

SQL Server connection timeout in .NET client also used as transaction timeout? Mmm…

Sometimes you spend several days (or nights) trying to solve a problem, only to find out your problem is caused by some inexplicable design decision you couldn’t possibly anticipate. In this case, I was investigating errors we’d been seeing at peak hours on our SQL Server database server for the last two months: “Timeout expired” errors on commit or rollback of database transactions.
Now, our database server probably doesn’t have enough capacity to deal with its load at peak hours, but still, we hadn’t had any errors like this before, even at the busiest of times. Completion of transactions (and especially commits) has always been very fast on SQL Server, in my experience over the last 20 years.

When searching the internet, it turns out quite a few people have this same problem, but nobody knows exactly why. In the end, there’s exactly one place to find the surprising answer, provided by Matt Neerincx of the SQL Server team in a forum post:

The value used in the .NET SQLClient class for the client-side transaction timeout is the value set for the connection timeout of the database connection.

In his post, Matt explains that in this case they had no other value available in the class interface, so they decided to re-use the existing value of the connection timeout. Now, to me these two values are completely unrelated. Sometimes you want to set the connection timeout pretty low, so that an unavailable server, or an incorrectly entered server name, is detected without a very long wait. I had done just that, reducing our default connection timeout from 30 to 20 seconds, never suspecting this would have such drastic side effects on the completion of transactions! The sketch below shows the interaction.
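A minimal repro sketch; server, database, and table names are placeholders:

    using System.Data.SqlClient;

    class TransactionTimeoutDemo
    {
        static void Main()
        {
            // "Connect Timeout" looks like it only governs how long we wait
            // for the connection to be established...
            const string connectionString =
                "Server=myserver;Database=MyDatabase;Integrated Security=true;Connect Timeout=20";

            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (var tx = conn.BeginTransaction())
                using (var cmd = new SqlCommand("UPDATE dbo.Orders SET Processed = 1", conn, tx))
                {
                    cmd.CommandTimeout = 300; // covers the statement itself, nothing more
                    cmd.ExecuteNonQuery();

                    // ...but SqlClient re-uses those same 20 seconds as the
                    // client-side timeout for Commit() and Rollback(). Under heavy
                    // I/O load, a commit that takes longer than 20 seconds throws
                    // "Timeout expired" here, whatever CommandTimeout says.
                    tx.Commit();
                }
            }
        }
    }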

Ah well, problem solved, on to the next one: this morning we had one of the infamous “Deadlock detected” errors in our ASP.NET web application. That’s another can of worms, trying to make sense of the myriad of threadpool-related settings in ASP.NET.

I have to take a test before I’m allowed to give Microsoft money?

Today I was trying to renew the Microsoft Action Pack (MAP) subscription for my company. Now, the MAP really is a good deal: for €380 per year, any small company (fewer than 100 employees) that creates or integrates IT solutions based on Microsoft technology gets access to almost all Microsoft software, with a small number of licenses for each product that can be used for regular business use, including 3 full MSDN subscriptions. I had to do this renewal during my vacation, because Microsoft uses some pretty strong language about what happens if you let the subscription slip for even one day (you have to destroy all media and remove all licensed software, and even if you subscribe again later, you never get back the rights to use older software versions).

The “interesting” part is that, just before you try to use your credit card to pay, you get the message that you first have to complete an online seminar, with an accompanying assessment (test) on which you have to score 70%, before being allowed to proceed. So you have to go through 30 or so slides of an online seminar (which seems to be pretty outdated, with no mention of Windows 8 at all), trying to remember the differences between the various versions of Windows Server 2008. This takes maybe 30 minutes, including the test. And after that, as it turns out, you have to take another online seminar and test, this time with specific questions about the Development & Design MAP (with questions like what LINQ stands for). Not only is it a little absurd that you have to pass some tests before being allowed to pay money to Microsoft, but also that it is assumed that the person who does the ordering and payment is able to follow these seminars and do the tests.

So, after an hour I passed all the tests (mostly by answering “All of the above” to questions about what a specific piece of Microsoft software can do), and finally I was allowed to enter my credit-card details. Well, at least now I know that Windows Server 2008 Foundation can do nothing useful, and that the Microsoft licensing options are too complex for anyone to understand. Still love Visual Studio 2012, though (after removing the all-caps menus), so all’s good!

Amazon EC2 monthly price tables for SaaS-relevant configurations, in euros

EC2 is Amazon’s cloud service for flexible use of virtual machines. One of the big advantages of this service is that Amazon’s hourly prices also include the cost for the Windows and/or SQL Server license. Amazon has just announced lower prices for Windows EC2 instances.

It’s not always easy to calculate total monthly costs for various configurations, so in this post I will try to give an indication of monthly costs for configurations that would be relevant for SaaS applications with 1,000 – 50,000 customers.
Note: In the monthly prices below, the upfront costs for 1- and 3-year contracts have been incorporated, spread out over the full term. The small helper below shows the calculation.
All prices are calculated for the EU region (Ireland).
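A sketch of how the monthly figures are derived; the 730 hours/month is my own assumption (roughly 365 × 24 / 12):

    class ReservedInstancePricing
    {
        // Effective monthly cost: upfront fee spread over the term,
        // plus a full month of hourly charges.
        static decimal MonthlyCost(decimal upfrontFee, int termMonths, decimal hourlyRate)
        {
            const decimal hoursPerMonth = 730m; // ~365 * 24 / 12
            return upfrontFee / termMonths + hourlyRate * hoursPerMonth;
        }
    }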

Windows web server. No additional storage.

Configuration | Resources | On-demand (billed hourly) p/m | 1-year term p/m | 3-year term p/m
M1 large instance | 2 cores, 7.6 GB memory | €200 | €131 | €99
M1 extra large instance | 4 cores, 15 GB memory | €400 | €262 | €198
M3 double extra large instance | 8 “second-generation” cores, 30 GB memory | €859 | €554 | €419

Traffic out (to the internet) is an additional €92 per TB.

Load-balanced high-activity Windows web server.

  • Amazon Elastic Load Balancer
  • 3 M3 double extra large EC2 instances on 3-year terms
  • 5 TB of internet traffic per month

Total cost per month: €1763

Database server: EC2 instance with Windows Server 2008/2012 and SQL Server Web Edition 2008/2012.

Prices below include 1 TB of database storage on EBS (Elastic Block Storage, Amazon’s virtual drives). For instances without guaranteed I/O I have included 500 million I/O requests per month. Upfront costs have been spread over the months of the term.

Configuration | Resources | Guaranteed I/O | On-demand (billed hourly) p/m | 1-year term p/m | 3-year term p/m
High-memory extra-large instance | 2 cores, 17 GB memory | (no guaranteed I/O) | €463 | €292 | €249
High-memory double extra-large instance | 4 cores, 34 GB memory | 500 Mbps | €798 | €467 | €384
High-memory quadruple extra-large instance | 8 cores, 68 GB memory | 1000 Mbps | €1491 | €829 | €666

Note that you would also need additional storage for database backups, which could add €100-500 per month, depending on backup methods and backup retention.

Meta-database design in the cloud

When designing the (relational) database infrastructure for a SaaS (Software as a Service) application, one of the most important decisions is how to store the data of each SaaS customer:

  1. Store the data of each customer in a separate database
  2. Store the data of all customers in a single database
  3. Store the data of each customer in a separate schema in the same database
  4. Store groups of customers in the same database, using several (but a relatively small number of) databases

Here I’d like to discuss my thoughts and preferences about some of these options.

Database terminology

I am using the terminology as used for Microsoft SQL Server (which is mostly the same for MySQL), but which differs from Oracle’s:

  • A database-server instance is the complete database service. In Oracle this is called a database.
  • A database is a group of data with its own database schema, storage, security, and backup/recovery. By default, each SQL Server database has one data file and one log file as its storage.
    Oracle doesn’t have this concept of a database; what comes closest is an Oracle schema, but that is not exactly the same. Nevertheless, for simplicity most people equate “database in SQL Server/MySQL” with “schema in Oracle”.
  • A schema is a group of data objects within a database, with its own definition. A database can contain multiple schemas. Usually a schema is tied to a user (in Oracle the correspondence between a user and a schema is still strict), but in SQL Server (since SQL Server 2005) a schema is no longer strictly tied to a specific user. MySQL doesn’t have the concept of multiple schemas within a database.

Advantages of having a separate database for each customer

Now, in my experience, how to design the meta-database structure for a SaaS application is something that is not always well thought out, or sometimes not thought about at all. Also, in internet discussions no consensus or guidelines on this subject can be found. Personally, I am in favour of option 1 above, giving each customer a separate database, for these reasons:

  1. Security of database queries. The data of each customer is separated from that of all other customers, and there is no need to ensure that every SQL query or statement includes a customer identifier. This is a huge issue, because small mistakes, like adding an OR-clause to an SQL query without the correct parentheses, can have a dramatic impact if, for example, you delete data for all customers instead of just one (see the sketch after this list).
  2. Straightforward scalability. The databases of the customers can be spread out over as many servers as you like (if the SaaS application is designed to use a separate database connection for each database), resulting in very easy scaling when the number of customers grows. It is also very easy to redistribute databases to different servers.
  3. Easy backup and restore of the data of a single customer. With each customer having its own database, restoring data is simply a case of restoring the backup of a single database (which is pretty easy). With all customers in a single database, this is very complicated, requiring custom tools, or even impossible.
  4. Separate SQL query plans for each customer. Having all customers in the same database can give huge problems with the query plans the database system sets up. Basically, the plan for a query is compiled for the first set of parameter values seen after server startup (“parameter sniffing”). If other customers have a different data distribution, that plan will not work well for them at all.
  5. Performance tuning per customer. There are customers with just a few MBs of data, and customers with TBs of data. Data will be distributed differently for each customer. In each separate database you can tune the indexes, views, and stored procedures to give optimal results for that customer. For example, in SQL Server, very small databases might need no indexes at all, whereas very large databases might need extensive index tuning, or even things like materialized views.
  6. Version migration. Updating the SaaS software to a newer version almost always requires changes to the database schema. With all customers in the same database, all customers must always be migrated simultaneously. With a separate database for each customer, you can upgrade customers as you see fit, enabling, for example, piloting changes with specific customers.
  7. Database customization. Usually, application customizations for specific customers also require extra tables or columns in the database. Again, with a separate database for each customer this is much easier to realize than when all customers are in the same database (and share exactly the same database schema).
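To illustrate reason 1, here is the classic operator-precedence mistake. Table and column names are hypothetical; the point is that in SQL, AND binds tighter than OR:

    class TenantFilterPitfall
    {
        // The OR-clause added in a hurry: because AND binds tighter than OR,
        // this deletes 'cancelled' orders for EVERY customer, not just @cid.
        const string Wrong =
            "DELETE FROM orders WHERE customer_id = @cid AND status = 'expired' OR status = 'cancelled'";

        // What was meant: parentheses keep the per-customer filter in force.
        const string Right =
            "DELETE FROM orders WHERE customer_id = @cid AND (status = 'expired' OR status = 'cancelled')";
    }

With a separate database per customer, even the wrong statement can only damage the data of the one customer the connection points at.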

Drawbacks of having a separate database for each customer

Of course, almost just as many drawbacks can be found:

  1. Maintenance on many databases. If you are used to having a DBA (Database Administrator) who does lots of manual, daily maintenance tasks on a single, large database, this will not work if you have hundreds or thousands of databases. All database-maintenance tasks will need to be fully automated (which is a good thing anyway, I think).
  2. Limit on number of databases per server. SQL Server has a hard limit of 32,767 databases per server instance. Long before you reach this number, the number of databases will be limited by the amount of memory and other server resources. However, I have had 30,000 databases (lightly used) running on a single server with 8GB of RAM, so this hard limit is not as theoretical as some people think.
  3. Cross-customer queries. Having all customers in the same database makes it easier to run queries over all customers (or groups of customers, such as all customers managed by the same accountant). With separate databases you will need a separate data warehouse that collects all the data needed across customers (but having a separate data warehouse for this is not a bad idea anyway).
  4. Database mirroring. Trying to do real-time database mirroring on a large number of databases will not work (because of the network resources and threads needed for each database). Microsoft gives approximately 200 as the maximum number of databases you can mirror on a single server. If you have thousands of separate databases, fully-automated log shipping is currently probably the best way to minimize data loss.
  5. Direct SAN-level snapshots cannot be used. Many SANs (and similar network-based storage solutions) offer quick storage backups at the block level of the storage. For this to give consistent backups of a database server, all database I/O to the storage needs to stop for a moment. SAN snapshot tools achieve this in cooperation with the Windows Volume Shadow Copy Service (VSS). However, this stops all I/O to the volume for a few seconds for each database; for thousands of databases this is not an option.
    The best solution, when you have thousands of separate databases, is to first write standard database backups to a separate storage volume, and then take a SAN snapshot of that backup volume.

Cloud-based relational database service options with regard to number and size of databases

OK, let’s now look at how all of this is relevant when moving the database infrastructure to the cloud. Currently, the cloud-based relational database options are:

  • Amazon Relational Database Service (RDS). This is a service built on top of Amazon’s EC2 (virtual machines) and EBS (virtual hard drives). With it, you get a ready-to-use SQL Server, Oracle, or MySQL instance, but without access to the underlying OS or file system.
    • Maximum number of databases per instance is 30.
    • Maximum storage per instance is 1 TB for SQL Server, and 3 TB for MySQL or Oracle.
    • Default maximum number of instances is 10. Need to ask Amazon for more.
    • Price (for SQL Server Web Edition) is €100-1500 per instance per month (depending on CPU and memory).
  • Microsoft Azure SQL Database. This cloud service offers “databases as a service”: you don’t get a full server instance, but completely separate databases.
    • Maximum number of databases is 150. Credit check by Microsoft for more.
    • Maximum database size is 150GB.
    • Price is €4 per month for a database < 100 MB to €170 per month for a database of 150GB.
  • Cloud-based virtual machines running a database server. You can, of course, always run a full database server instance on any cloud-based virtual machine. You then have to set up the storage (RAID), backup, and failover systems yourself, which requires a lot of expertise.

With this, it is clear that the current cloud-based database services have problems for any SaaS meta-database infrastructure:

  • The number of databases you can have in a cloud-database service is quite low, and/or prohibitively expensive for very many (thousands of) databases.
  • The maximum database size is also quite low, so putting all SaaS customers in a single database is also not an option (besides negating the advantages of having a separate database for each customer that I have mentioned above).

Then what should you do if you have thousands of customers for your SaaS application, and want to move your database infrastructure to the cloud? Right now the solution would be:

  • Put your biggest databases (databases with master data, central customer data, or the databases of your biggest customers) into a cloud-based database service, for the best reliability and performance.
  • Run all other customer databases on various cloud-based virtual machines with your own DBMS (Database Management System) configuration.

Conclusion

All in all, the situation with respect to cloud-based relational database services is not ideal yet, and I hope that both the manageability and the pricing of having very many databases in a cloud-based relational database service improve soon.

Iain Banks, world’s greatest living science-fiction writer, is terminally ill

Most of his readers probably already know this, but I only read about it yesterday: Iain Banks, in my opinion the greatest science-fiction writer of the last 25 years, has announced that he’s terminally ill, with possibly only a few months to live. Sad news. Best wishes to all.
Having discovered his books 4 years ago, I started reading science fiction again after many, many years of disinterest (I read all the classic science-fiction works by Heinlein, Herbert, Vance, etc. in my teens). Nobody in human history has described a utopian society, with all its challenges and possibilities, in as much detail as Banks has done with his “Culture”. His books are by turns very clever, witty and uplifting, and at other times bleak and even hopeless.

Some Banks trademarks, which have become science-fiction standards:

  • The AI characters (in the form of spaceships or drones) are the funniest and smartest in the book.
  • The spaceship names are poetry in themselves. Some of my favorite spaceship names in Banks’ books:
    • “Attitude Adjuster”
    • “All through with this Niceness and Negotiations Stuff”
    • “Me, I’m Counting”

    Too many to mention; there is a full list online.

My favorite Banks novels:

  • Against a Dark Background. Great adventure (a classic quest, the search for a superweapon, the “Lazy Gun”), superb heroine and supporting characters. Great, if bleak, ending.
  • Use of Weapons. Banks’ masterpiece. Very complex (but extremely rewarding) story. Has Banks’s funniest AI (the drone Skaffen-Amtiskaw), and his most enduring heroine (Diziet Sma). A must-read.