“Going Big” With a Website, Part 1: Infrastructure that Scales
Editor’s Note: Once again, it’s time to address a common question that is beyond the scope of even our managed support — how to design a website that scales well. To address this popular topic, we’ve asked web developer, writer and ServInt customer, Larry Ullman, to share his thoughts.
As a web developer, writer, and public speaker, I often interact with people of various skill levels, talents, and interests. This is one of the joys of my career: I’m fortunate enough to bear witness to the thoughts and experiences of other programmers, “idea” people, and just plain dreamers.
One of the common topics that comes up, or that I am directly asked about, is how one “goes big” with a website. A reader will email me, as she intends to create the next Facebook, and wants to know how many servers to buy and of what type. I appreciate the enthusiasm, but these interactions send shivers down my spine.
The frequency with which these kinds of questions come up suggests there’s a lack of good knowledge as to how one “goes big” with a website. In this three-part series, I will explain everything I believe and know (or think I know) when it comes to this subject. Here in Part 1, I cover the myths of going big, what infrastructure you’ll eventually need, and how one should start developing a new project. In Part 2, I will discuss how one writes code that can scale well should your site “go big”. And in Part 3, I’ll turn to designing databases that can handle the traffic of a “big” site.
The Myths of Going Big
One of the most common, and most potentially ruinous, statements I see beginning web developers make is:
I have this great idea for a website, and it’s going to be huge, so I need to start with the infrastructure that can support a huge site.
I’ve seen this far too many times, often with people who only have an idea: no site, no development skills, sometimes not even the domain name yet!
Now, to be clear, dreaming is fine. And pursuing a project because you have a great idea is certainly justifiable. But the reason sentences like the above frighten me is that acting on them is expensive. Putting the cart before the horse is one way businesses fail and people lose lots of money. The resources required by a busy site–hardware, networks, and staffing–are especially expensive. These are resources you shouldn’t spend money on until you absolutely have to, or almost so.
This brings me to what I consider to be the two biggest myths of going big:
- Your website will ever go big.
- Going big is a difficult transition to make on the fly.
Calling the first one a myth sounds pessimistic, but the statistics back it up. The top two websites (in terms of pageviews per day) are Google and Facebook. They each average around 400–450 million pageviews per day. That is the highest echelon for “big”.
Amazon is extremely popular, ranked around #10 among the busiest sites, and it gets around 45 million pageviews per day. This means that the #10 ranked site gets about one-tenth the traffic of the top site. ESPN, one of my favorite sites, is also extremely popular, but it ends up being around #100, and gets around 12 million pageviews per day. Evernote, a popular software company, is ranked around #500, and gets about 500,000 pageviews per day. The #500 ranked site gets about one one-thousandth of the traffic of the top site. This is a good place to stop, as 500,000 pageviews per day reasonably counts as “big”. (A proper, definitive threshold for “big” is somewhere lower than that, though, and would depend upon what, exactly, the site does: 100,000 views of mostly text content is a different beast than 100,000 views of mostly video.)
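To make those ratios concrete, here is a minimal Python sketch that recomputes them from the approximate figures quoted above. The numbers are the article's rough estimates, not measurements, and using 425 million (the midpoint of the 400–450 million range) for the top site is an assumption made only for illustration.

```python
# Approximate pageviews per day, taken from the rough figures quoted above.
# Assumption: the top site is set to 425 million, the midpoint of the
# 400-450 million range.
PAGEVIEWS_PER_DAY = {
    "top site (rank ~1)": 425_000_000,
    "Amazon (rank ~10)": 45_000_000,
    "ESPN (rank ~100)": 12_000_000,
    "Evernote (rank ~500)": 500_000,
}


def share_of_top(pageviews: int, top: int) -> float:
    """Return a site's traffic as a fraction of the top site's traffic."""
    return pageviews / top


top = PAGEVIEWS_PER_DAY["top site (rank ~1)"]
for name, views in PAGEVIEWS_PER_DAY.items():
    fraction = share_of_top(views, top)
    # Express the fraction as "about 1/N" to match the wording in the text.
    print(f"{name:<22} {views:>12,} views/day, about 1/{round(1 / fraction):,} of the top site")
```

The exact denominators shift with whichever point in the range you pick for the top site, but the shape is the same: traffic falls off very steeply as rank drops.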
With those numbers in mind, what are the odds that your site will end up in the Top 500 of all websites in the world (regardless of how good your idea is)? The odds are not good. The assumption that your site will be a huge success is a bad one to make. Spending money based upon that assumption is a catastrophe.
I suspect beginners make the mistake of thinking they need to support “big” from the outset due to a lack of knowledge, a misunderstanding of what’s possible and what’s required. This is the second myth: that it’s hard to transition from little or moderate traffic to handling a lot of traffic. Since 1999, I’ve switched hosts (a few times), domain names (twice), and hosting packages (several times), all with no downtime, or almost none. In the process, I’ve gone from spending $5/month (all prices US) to spending around $65/month (now). Clearly, it would have been foolish to have spent $65/month (let alone the hundreds per month that a dedicated server costs) all those years ago before I needed those kinds of resources.
So what should you do? Keep reading and I’ll tell you. But first, let me just add that I don’t think having a high traffic site should ever be a goal. Create a great website that addresses a need, that solves a problem, and sufficient popularity (and income, if that’s a hope) will follow.
A “Big” Infrastructure
To clear up some of the lack of knowledge, let’s look at what kind of infrastructure a “big” site needs. Again, in terms of traffic, let’s look at sites in the 100,000 – 10 million pageviews per day category. Below that range, you probably only need a single server (or a virtual server or a shared host, depending upon how far below it you are). Above that range, for the likes of Amazon, YouTube, Twitter, Google, or Facebook, it’s not a question of how many servers are required, but how many buildings of servers are needed.
As a specific example, a friend of mine is responsible for a site that receives between 10 and 20 million pageviews per day. This is definitely a Top 50 site. The site is a mixture of text content, images, and video. What do you think is required to handle that kind of demand? The answer may surprise you.
In this particular case, two and a half people maintain the site. The site runs on eight (8) web servers, with two database servers, all in one location. They also use a Content Delivery Network (CDN). That’s it. And these are actually Windows servers, running .NET (I don’t know how the performance of Windows servers vs. Unix servers compares, and I’m not trying to start that debate, I’m just saying…)
Servers are very capable machines. When configured properly, a little bit of hardware can go a long way. One server, or even one VPS, can handle a fair amount of traffic.
For my own site, www.LarryUllman.com, I use a VPS at ServInt (and have for years), plus a CDN. This easily handles around 20,000 pageviews per day (definitely not a “big” site, but not insignificant, either). Moreover, many of those requests use WordPress. I mention that fact as WordPress is a notoriously poor performer. I’m sure that if I spent some time tweaking things, and possibly integrating Varnish, I could get even better performance out of this single VPS. With the proper software and tuning, I expect I could gracefully handle upwards of 40,000 pageviews per day with a VPS.
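One way to see why so little hardware goes so far is to convert pageviews per day into average pageviews per second per server. The following is a back-of-the-envelope sketch only, and it assumes traffic is spread evenly across the day and across servers, which real traffic is not. Peaks can be several times the average, and a single pageview usually triggers several HTTP requests, many of which a CDN may absorb. The function name is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_pageviews_per_second(pageviews_per_day: float, servers: int = 1) -> float:
    """Average pageviews per second each server must handle.

    Assumption: load is spread evenly over the day and over all servers.
    """
    return pageviews_per_day / SECONDS_PER_DAY / servers


# The 10-20 million pageviews/day site running on eight web servers.
friend_low = average_pageviews_per_second(10_000_000, servers=8)
friend_high = average_pageviews_per_second(20_000_000, servers=8)

# A single VPS handling roughly 20,000 pageviews/day.
single_vps = average_pageviews_per_second(20_000)

print(f"Eight-server site: {friend_low:.1f} to {friend_high:.1f} pageviews/sec per server")
print(f"Single VPS:        {single_vps:.2f} pageviews/sec")
```

Under these assumptions, even the very busy site averages only a few dozen pageviews per second per web server, and a 20,000 pageviews/day site averages well under one per second.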
How To Start a New Site
If what I’m saying is you shouldn’t go out and buy multiple servers from the get-go (and I am saying that), then how should you develop a new site? If you were a complete beginner starting today, I would suggest you (all prices in USD):
- Buy the domain name for the site (cost: $10/year or so).
- Develop the entire site on your own computer (cost: $0!).
- Have your friends and family try out your site on your computer (cost: $0!).
- Read more, practice more, study more.
- Rebuild the site from scratch on your computer. This time the result will be at least 20% better (cost: $0!).
- Move the site to a quality shared hosting situation (cost: $15/month or so).
- When your site is busy enough or you need a level of customer service that shared hosting simply can’t offer, move to a basic VPS package with a quality host like ServInt (cost: $50/month or so).
- When your site’s traffic grows, add on a CDN (cost: $5–15/month or so).
- When your site’s traffic outgrows the VPS, get a better VPS package (cost: $80/month or so).
- When your site’s traffic outgrows any VPS, move to a dedicated server (cost: $200/month and up).
Using this sequence, you’ll see that you’ve only spent a total of about $10 for the first several months of the project. This is as it should be for most people. Even for a somewhat busy site, such as mine, I’m still only talking about a few hundred dollars per year. Let your infrastructure grow with the demand. If you work with quality hosting companies, there will be little or no downtime, even as you switch from one hosting plan to another, or from one hosting company to another. This is how it’s done all the time.
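To put rough numbers on the difference, the sketch below totals the cost of growing through the stages above and compares it with renting a dedicated server from day one. The monthly prices come from the list, with the CDN taken as $10/month, the midpoint of the quoted range. The stage durations are entirely hypothetical, since every project grows at its own pace, so treat the totals as an illustration rather than a forecast.

```python
import math
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    monthly_cost: float  # USD per month, from the list above
    months: int          # hypothetical duration, NOT from the article


DOMAIN_PER_YEAR = 10.0       # USD, from the list above
DEDICATED_PER_MONTH = 200.0  # USD, low end of "and up"
CDN_PER_MONTH = 10.0         # assumption: midpoint of the $5-15 range

# The durations below are made-up assumptions for illustration only.
STAGES = [
    Stage("local development", 0.0, 6),
    Stage("shared hosting", 15.0, 12),
    Stage("basic VPS", 50.0, 12),
    Stage("basic VPS + CDN", 50.0 + CDN_PER_MONTH, 12),
    Stage("better VPS + CDN", 80.0 + CDN_PER_MONTH, 12),
]


def domain_cost(months: int, per_year: float) -> float:
    """Domain fees, charged per started year."""
    return per_year * math.ceil(months / 12)


def staged_total(stages: list[Stage], domain_per_year: float) -> float:
    """Total spend when the infrastructure grows stage by stage."""
    months = sum(s.months for s in stages)
    hosting = sum(s.monthly_cost * s.months for s in stages)
    return hosting + domain_cost(months, domain_per_year)


def dedicated_total(months: int, per_month: float, domain_per_year: float) -> float:
    """Total spend when paying for a dedicated server from the first month."""
    return per_month * months + domain_cost(months, domain_per_year)


total_months = sum(s.months for s in STAGES)
staged = staged_total(STAGES, DOMAIN_PER_YEAR)
dedicated = dedicated_total(total_months, DEDICATED_PER_MONTH, DOMAIN_PER_YEAR)

print(f"Over {total_months} months: staged ${staged:,.0f} vs. dedicated from day one ${dedicated:,.0f}")
```

Changing the durations changes the totals, but as long as the early stages last more than a few months, paying only for what the traffic demands comes out far cheaper.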
There are many things I love about web development and software development in general. The first is how democratic it is: you don’t need to go to a special school or know someone important; if you have a computer and internet access, you can give it a go. The second is that web and software development is affordable. Tremendously affordable. Take advantage of these truths and save yourself as much money as you can.
You’ll also notice that I’ve added an iterative component to this process: create the site, get some input, get better yourself, and recreate it. Obviously I don’t do this myself on new projects, but the fact is that you’ll learn and become more skilled on every project you work on. The site I begin tomorrow will be slightly better than the one I finished yesterday (hopefully). When you’re just getting started, the quality difference between that first project and the second will be dramatic.
Finally, it would be disingenuous not to mention Amazon Web Services (AWS) as a global and easily expanded solution. But AWS takes time and skill to master and use, in my opinion and experience. This article, and my recommendations, are for the common person, implementing what should be a good idea. If you’re on your second or third good idea, have a track record of success, have a cloud-savvy programming team, and have some funding, the game plan may be different.
There you have my premise for “going big” with a site, including what I see the myths to be, and how most people should begin new projects. In the second and third parts of this series, I’ll talk more specifically about how you code and design a site that’s capable of “going big”.