Upgrading to 100 Billion.

I forgot to turn on my auto delete  for the “Who viewed me” section of the website after making some changes and I managed to accumulate over 1 Billion records in a very short time.    I still get awesome performance  with the SQL Query selecting who viewed you and applying  search criteria in under 7 Milliseconds.    I’m pretty sure no one else has a table with a billion records in SQL server and is insane enough to have it on One Machine,  with  only a hand full of internal drives.

So today I fired off an order to get a bigger storage array.   My biggest problem is all the space it takes up.  With a relatively small investment I should be able to store  100 Billion records.   My database is getting scary now,   every month or 2 when I look at the tables they end up being a lot bigger.     I’ve got a LOT of tables with hundreds of millions of records in them.   Last year they only had 10’s of millions.    At any rate i’m going to be ordering more servers and breaking up my database more so that I can scale past 100 Million daily pageviews.   Anyone else finding their original designs  just not cutting it anymore ?

16 Responses to “Upgrading to 100 Billion.”

  1. Trevor Says:

    Yes. Very much so. Although on a much much lower scale. I’m trying to get my PHP/MySQL application to scale beyond 100,000 pageviews a day and it’s not the simplest way.

    Caching is one of the trickiest aspects, along with SQL-statement optimization. Since I’m no MySQL guru I find it near random what works and what doesn’t in the area of tweaking MySQL. But that’s not a problem for you I take it.

  2. Nigel Says:

    Trevor I would suggest a book called “High Performance MySQL” by O’rielly if you are new to MySQL performance stuff. I found it gave me a good basic overview of MySQL and exactly how it works, and new way to think of queries, and debug them. It covers basic stuff like the different table types, and indexing and issues with indexing and such, but should be good read if you are new.

    Also I would suggest checking out the caching articles on how livejournal.com scaled. Also I went to a presentation on facebook.com’s use of APC when they came to Vancouver over a month ago. Check out tekrat.com for that presentation.

  3. Guy Ellis Says:

    Yes! Each time I hit a limit it requires a slight change to boost performance. A lot of the code that I write is initially prototype / proof-of-concept. I release it into the wild and it works and then later I need to come back and optimize it when it proves to be popular. The question is: Why didn’t I optimize it in the first place?

    Well if I have 10 ideas and each take 1 hour to prototype but 5 hours to write in an optimized format then I can do 10 prototypes in a day and it would take 5 days to implement them in an optimized format. If only 1 of those ideas takes off and creates a performance problem then I can come back and optimize it and I will have saved the time from the ideas that never worked out.

    Some of the best performance gains I’ve seen have been to move sql code to stored procs and optimize those procs.

  4. zach Says:

    @Nigel – Could you post a link to those livejournal articles, I’d find it interesting. Of other note is according to the latest blog at twitter, they are planning to move away from a database in some aspects.

  5. Marcel Says:

    Geee :)
    I hope i’ll have your problems one day ;)
    And ya one database server for 1 gazilion records is pretty insane, but i’m happy it’s still working for you!

  6. Trevor Says:

    @Nigel:

    Thanks. Sincerely thank you for this tips.

  7. Nigel Says:

    I would post the livejournal links if I could find them. I stumbled upon them a few months ago, but can’t remember where I found them linked. They were one of the developers day to day issues and investigations of different caching techniques. I think Livejournal was basically having the same sort of bad public headaches as twitter back in the day. The posts were a few years old, so things may have changed, but I believe they settled on using memcache.

  8. zach Says:

    http://glinden.blogspot.com/2007/05/slides-on-livejournal-architecture.html

    Found that, are those the slides you were talking about?

  9. Guy Ellis Says:

    Coincidentally I was just updating some notes that I took from the Desert Code Camp that I attended last year and one of the topics was SQL Server 2005 Do’s and Don’ts for Developers. There are only a few lines but you may find one of them useful: http://guyellisrocks.com/coding/sql-server-2005-do-s-and-don-ts-for-developers/

  10. Nick B Says:

    Well I keep having problems with email. I write an email, I click send, the page loads and says “Message Sent”. I then go into my Sent mailbox, and the message is not there.

  11. simonmelbourne Says:

    Hi can anyone recomend who to use to have web 2.0 style website designed for me (dating type thing)?? Any help would be great, thanks!

  12. Nick B Says:

    Alright the problem seems to have been corrected. *shrug*

  13. FunkySouth Says:

    Im just in the process of upgrading a social network im running, its a bandwidth sucker !

  14. KATIE Says:

    Someone on plentyoffish.com is using my picutre and there is NO way for me to contact anyone about this! I have searched over and over and it shouldn’t be so hard to do- if it’s such a HUGE successful business now, well then why can’t they get some freakin’ customer service!!!!

  15. Thomas Says:

    Hi Mark,
    Just reading your post here and from my understanding you’re running this off MS SQL 2005. I’m sure you get alot of requests but I would be interested in shadowing you on your database related work. The experience from such an intensive OLTP database would be amazing and could be beneficial to both, for bouncing ideas around. I work at a local college in Vancouver but we simply don’t have databases of that size. I am writing my SQL MCTS exam and this type of fine tuning and profiling is extremely interesting to me.

  16. Fish Killer Says:

    Hey Marcus what are you doing about the serial killer on your fish site that has been killing women. How is it that its not all over the place yet. Your system sucks and it can not stay free forever. You will be forced to charge a fee to stop these maniacs from killing again. I recommend a paid service. Marcus is only init for the money the ads provide for him. I have met marcus in person years back and he really needs to do something before he finds himself in a bad place.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

Join 273 other followers

%d bloggers like this: