Tech Blog

Scalable MySQL Technologies

The recent emergence of scalable MySQL databases has a lot of developers and companies cheering. There are two technologies in particular that I would like to talk about, along with why they are essential in moving toward the future.

What is MySQL

MySQL is an open source relational database management system. A fun fact about its name: it combines "My," the name of co-founder Michael Widenius's daughter, and "SQL," the abbreviation for Structured Query Language. It is published under the GNU General Public License (GPL), with some other licensing agreements also available. It is a common default database choice at many companies. To give anyone a rough concept of it, think of a giant Excel sheet: data lives in tables made of rows and columns.
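To make the spreadsheet analogy concrete, here is a minimal sketch in Python. It assumes SQLite (bundled with Python) as a stand-in for MySQL so it runs without a server, and the videos table and its rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, views INTEGER)"
)
conn.executemany(
    "INSERT INTO videos (id, title, views) VALUES (?, ?, ?)",
    [(1, "Cat video", 1200), (2, "Dog video", 300), (3, "Conference talk", 4500)],
)

# Rows and columns look like a spreadsheet, but one query can filter and sort them.
popular = conn.execute(
    "SELECT title, views FROM videos WHERE views > ? ORDER BY views DESC",
    (1000,),
).fetchall()
print(popular)
```

The table behaves like a sheet with three columns, and the SELECT statement is where a relational database goes beyond a spreadsheet.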

CockroachDB

Cockroach Labs has a cool product called CockroachDB, which is a distributed SQL database. (One caveat: it speaks the PostgreSQL wire protocol rather than MySQL's, but it tackles the same scaling problem.) Being distributed means there are a lot of different nodes you can connect to, and any of them can serve your query. It also saves you from manual sharding. Sharding is what happens when a table gets too big for one server: we programmers "shard" it, meaning we create another database with the same schema, like an empty copy of the Excel sheet, and make that available for data ingestion.

SQL is pretty neat in the queries it can do. However, if you have a massive amount of data, you end up with a lot of shards (typically 10 GB each), and you have to query each one of them to get your full data set, which can be cumbersome. CockroachDB handles all of that for you: it shards automatically and lets you query all of the shards in one go. On top of that, it runs as a distributed cluster, so the load is spread across many machines. All in all, it's a pretty neat product and company, and I would highly recommend checking them out.
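For a feel of what manual sharding involves, here is a rough Python sketch of the do-it-yourself approach. It is not how CockroachDB works internally: in-memory SQLite databases stand in for separate database servers, and the shard count, the events table, and the helper names are all hypothetical.

```python
import sqlite3
import zlib

NUM_SHARDS = 3  # hypothetical; real deployments pick this based on data size


def make_shard():
    # Each shard is an empty database with the same schema as the others.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, action TEXT)")
    return conn


shards = [make_shard() for _ in range(NUM_SHARDS)]


def shard_for(user_id):
    # A stable hash sends the same user to the same shard every time.
    return shards[zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS]


def insert_event(user_id, action):
    shard_for(user_id).execute(
        "INSERT INTO events (user_id, action) VALUES (?, ?)", (user_id, action)
    )


def count_events(action):
    # The cumbersome part: every shard must be queried and the results merged.
    total = 0
    for conn in shards:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM events WHERE action = ?", (action,)
        ).fetchone()
        total += n
    return total


for i in range(100):
    insert_event(f"user{i}", "click" if i % 2 == 0 else "view")

print(count_events("click"))
```

Every query that spans users has to repeat the loop in count_events, and adding a shard means moving existing rows around. That bookkeeping is what a database with automatic sharding takes off your hands.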

Vitess

When I heard the story behind this project, I was truly impressed by what its creators accomplished. Here is a brief summary.

Vitess is a technology developed by YouTube to shard large MySQL databases across multiple servers. It has become the 16th hosted project of the Cloud Native Computing Foundation, which is a massive accomplishment, as only tools and technologies considered critical to cloud native development meet the criteria for acceptance.
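To picture what spreading one database across multiple servers means, here is a small Python sketch of range-based routing on a hashed key. It is only loosely inspired by the idea and is not Vitess's actual code: the four-shard layout, the shard labels, the use of MD5 as the hash, and all function names are assumptions made for illustration.

```python
import bisect
import hashlib

# Hypothetical layout: each shard owns a range of first-byte values of the hashed key.
# The labels are illustrative range names, e.g. "-40" covers bytes below 0x40.
SHARD_STARTS = [0x00, 0x40, 0x80, 0xC0]
SHARD_NAMES = ["-40", "40-80", "80-c0", "c0-"]


def keyspace_id(sharding_key: str) -> bytes:
    # MD5 is a stand-in hash chosen for this sketch, not the hash Vitess uses.
    return hashlib.md5(sharding_key.encode("utf-8")).digest()[:8]


def shard_for_keyspace_id(kid: bytes) -> str:
    # Find the last range whose start is <= the first byte of the hashed key.
    index = bisect.bisect_right(SHARD_STARTS, kid[0]) - 1
    return SHARD_NAMES[index]


def shard_for(sharding_key: str) -> str:
    return shard_for_keyspace_id(keyspace_id(sharding_key))


for user in ["alice", "bob", "carol"]:
    print(user, "->", shard_for(user))
```

Because routing depends only on the hashed key, the application can keep issuing ordinary queries while a routing layer decides which server holds the row.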

Vitess was created for “people who love MySQL for its functionality, but have chosen not to use it because it does not scale well,” said Sugu Sougoumarane, one of the creators of Vitess, who is now co-founder and chief technology officer at PlanetScale Data, a still-stealth startup centered around Vitess, in an interview with The New Stack.

The year was 2010, and YouTube had a problem: scaling its MySQL databases. Two early engineers at YouTube were contemplating this problem and built the open source system Vitess to solve it. As YouTube migrated toward Google's internal cloud system, so did Vitess. Sugu Sougoumarane, one of those engineers, is now running his own startup with co-founder Jitendra Vaidya. I would highly recommend checking these rockstars out, especially once their startup takes off.

Why Scale

These days, companies collect so much data that it is hard to build database systems that scale to capture all of it. There are other technologies, such as NoSQL databases (Dynamo...), but they generally don't offer the same rich SQL querying, such as ad hoc joins. Scalability is essential, and cloud technologies make it easier than ever to achieve. It should be exciting to watch these companies move forward as their technologies develop. I would especially keep an eye on any of the Cloud Native Computing Foundation's hosted projects. Good luck and happy scaling!

dan flan