Why did I create my own database?

I know this sounds crazy and, a few months ago, I never would have imagined such a feat or even raised such a possibility.

Image for post
Image for post
Why, huh?

I have been developing software for a few years, and I have already used several database solutions in my projects, such as:

  • MySQL

Therefore, creating my own database was never a necessity in the projects that I participated in all these years, having always chosen some ready and specific solution for each case.

Until I started developing Vemto, my most recent project. Vemto is a code generator for the Laravel framework (a PHP framework), to facilitate the initial creation of Laravel applications, automating several things such as the generation of models, migrations, controllers, CRUD, the suggestion of Field Types, assembly of the Database Schema, etc.

Image for post
Image for post
Vemto demonstration

I started planning Vemto at the end of 2019 and the development started in April 2020. It is a software strongly focused on processing relatively complex data. So, it would need a database (instead of something simpler like saving a JSON or YAML file).

Also, there was another very important factor:

The Template Engine that I developed for Vemto processes the logic directly in the template files.

Image for post
Image for post
Vemto template example

Therefore, it was necessary that it was very simple to access related data, in a way similar to what ORMs do. For example, if I wanted to get all fields from a table I could do something like:

table.fields.forEach(field => {}) //…

Database choice …

As Vemto is a Desktop application developed with Javascript/Electron, I discarded solutions that need installation such as MySQL, MongoDB, etc. from the start. It needed to be a Built-in database.

At first, I considered using SQLite, a well-known relational database that I already have experience with. But during planning, I realized that I would need something that would offer a more Flexible Schema.

This would allow me to try and pivot the tool quickly while allowing me to quickly reach the competition (a very important requirement that I used in other parts of the software and I intend to talk about in another post).

I researched a lot at the beginning, tested several solutions, and the choice was with IndexedDB, as it is a database that comes practically ready for use in Electron (because of the Chromium used internally), it is fast and simple to use, in addition to having great flexibility, as it is a document-oriented database.

The headache …

Although IndexedDB is an excellent database, it was not the right choice for this project. And I only started to notice this around the second month of development, but at that moment, I decided to continue, since I had a lot of things ready.

Note to the Future: I had better stop and fix it soon, right?

I chose this database for its flexibility in data modeling, but I was wrong to disregard a very important detail:

Vemto handles Relational Data. A project has models and tables, a model or table has fields, a field has relationships with other tables, and so on …

And this weighs heavily on the choice because IndexedDB has two important characteristics: it is not relational, and it is completely asynchronous.

So when using it for a relational data model, you need to manually deal with things like:

  • Async / Await Hell — As all operations are asynchronous, you are required to make a chain of asynchronous calls. For example, if you click on a button to delete a project, it must delete the related entities. To do this, you need to call an asynchronous method that uses an await to obtain related entities, and this forces all previous methods to be asynchronous (from the method called at the click of the button to the more internal methods).

In short, when I started working on the most complex parts, like the Schema Editor and the Initial Code Generation, I realized that the choice was not correct.

I was wasting a lot of time, in addition to generating numerous bugs and going crazy from stress. Seriously, I was going crazy and losing sleep over the “workarounds” that started to appear, damaging even the testability of the project.

I urgently needed a solution!

Looking for a solution …

Although I needed a relational database, non-relational document-oriented databases have some characteristics that I could not give up like the flexibility of the data scheme.

According to everything I realized so far, I needed a database with the following characteristics (all important):

  • Support relationships — this includes things like indexes, CASCADE DELETE, constraints, foreign keys, etc.

In other words, I needed what we know as Hybrid Database, a little known category of databases, since use in common applications is quite rare.

More specifically, I needed an object-relational database (ORD) for Electron (something I didn’t know until recently).

I searched for weeks for a database with such characteristics. I found very little, and my last requirement, to be Built-in and compatible with Electron was the biggest obstacle to find something (I really believe that there was no database with all these requirements. If anyone knows one, let me know in the comments, please).

At this point, I was feeling quite pressured, as there was already a list of thousands of people registered on the Landing Page waiting to test Vemto, the months were passing and my schedule was very late, with Pre-Alpha postponed to September 2020 (Initially it would be in August).

In mid-June 2020, I had a sudden idea in a brief moment of rest.

Why not create a “parasitic” database layer, using a key/value database (like LocalStorage, Redis, etc.), and implement a hybrid database on top of it?

The idea seemed good, and at that moment I was able to visualize it “almost completely” in my head. I made some notes and created a prototype, but after focusing on it for a while, I thought it would be a lot of work and delay even more.

I decided to make a huge effort to finish the MVP with IndexedDB, at least to launch the Pre-Alpha version. It was very challenging and stressful, I had to make up most of the bugs with “lots of workarounds”, but I managed to release this version, which ended up attracting more people and making me even more desperate for a solution to the problem.

Image for post
Image for post
Me in September …

Developing the Database …

I couldn’t leave the software that way.

It would be impossible to maintain the code and beat the competitors promptly with so many difficulties, workarounds, and impaired testability.

So, in September I made the decision. I would finish implementing the database idea that I had in June and rewrite all the parts of the system that used IndexedDB to use this new database (with the provisional name that became definitive due to laziness: RelaDB).

Incredibly, finishing the database development was a very quick task, since I had a partially functional prototype.

If I knew I would have finished it before 😅

It took me about two weeks to finish the initial implementation. The entire development of the database was done through TDD (Test Driven Development), and I had a good idea of how the mechanisms should work.

Image for post
Image for post
Some database features test example

The database was created with an intrinsic ORM layer (strongly inspired by Laravel’s Eloquent ORM), thus facilitating its use.

The ORM currently supports “1:N” and “1:1” relationships, and “N:N” relationships have not been necessary so far so I left it to implement in the future (although the database internally supports them, since it is only necessary to create a Pivot table ).

It does not support query languages like SQL, as I decided to implement my own simple solution at the “low-level”, since I had well-defined requirements, and decided that the main interface for accessing the database would be the ORM itself, through Model classes.

Image for post
Image for post
Model example

RelaDB is basically composed of three layers:

  • The Storage Layer — following the original idea of a parasitic database, I ended creating a Storage layer with a behavior similar to key-value databases to store the basic low-level data. So, it is easy to write a driver to use an existing key-value database to be used to store the final data, but RelaDB has its own native Storage Logic

As it “can use” other key-value databases to store the basic low-level data, I developed it first by simulating a key-value database with arrays in memory.

It has the concept of “Drivers” (the Database Layer communicates with the Storage Layer using this driver), so, to adapt it to a new storage strategy or to a key-value database, just create a new Driver that has few lines of code (this makes it possible to use it with LocalStorage, Redis, or its own native driver).

With that in mind, I ended up creating my own storage strategy that saves the data to disk and then created a simple native Driver using this implementation.

Image for post
Image for post
The driver that allows saving JSON to the disk

In the short video below you can see an example of using the RelaDB ORM in an Electron application:

Changing the Database …

That part, yes, I would say it was very difficult!

I started refactoring code on September 28, 2020:

The day I started refactoring the Vemto database layer

It took me about three weeks. They were very, very intense days.

Probably one of the most difficult tasks of my professional life, since I wanted to launch until November 2020, so it was huge pressure.

I’m tired to this day hahaha

I had to review practically all of the application’s code, a lot of it was rewritten and a lot of it was thrown away (mainly the manual tasks I mentioned earlier since now the database took care of everything automatically). I was also discovering several small bugs in RelaDB in the meantime that were being solved in parallel.

And then, on October 20, 2020, I finally did this monstrous merge (29,815 insertions and 8,352 deletions only refactoring).

My journey in developing my own database seemed to be nearing its end:

And finally, on October 29, much more relaxed and happy for the result of all the work, I launched Vemto officially:

Post-launch issues …

I had only two problems with the database after launch:

  • Problem with numbers sorting — when performing a query sorting by a numeric field, for example, Project.orderBy(‘position’).get(), the database was sorting it alphabetically. It was a very quick bug to resolve.

After solving these two, I had no more problems with the database.

Conclusion

Developing my own database was certainly the best decision I made in the context of this project. I would never do that if I was developing a common application, but in this case, it was a game-changer.

It is so simple to add new data and reflect it directly in the Template Engine, that I realized that the set of these two parts of the system is the reason to be able to launch new features quickly (in a few months, Vemto not only can solve most of the problems that competitors solve, as it brings a few more “little things”).

For this reason, I decided to keep the Database as Closed-Source for a while. But in the future, with it improved and properly documented (my documentation today is TDD), I intend to leave it as Open-Source.

I finish my adventure as a much more experienced developer than a few months ago. I never imagined that I would do that, but in the end, everything worked out.

Thank you very much for having accompanied me here, I wanted to write a lot more details but the text is getting too long.

I’m always posting my code adventures on Twitter. If you want to stay inside, please consider following me: https://twitter.com/Tiago_Ferat

Cheers and until next time!!

And a great new year for everyone!

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store