Over the last 10 years or so, the mechanisms for managing data have exploded from the relatively standard relational database to a plethora of other options: document stores, column stores and more. Where and how these databases are hosted has also changed rapidly, with cloud deployment and easy scaling becoming key concerns. This leaves technology companies and developers with some difficult decisions: it can be hard to choose the right data technology and deployment architecture for each job, and the CAP theorem (the trade-off between consistency and availability when a network partition occurs) has become an increasingly important consideration for data architects.
Having built an initial commercial project using a relational database, and then moved to a hybrid of relational, column and NoSQL databases, I'm in a good position to help and advise on the best choices for companies dealing with data. I've also dealt with data at all scales, from billions of ticks of stock market data to complex spreadsheet data that needed visualising in an intuitive way.
I've worked with many databases over the years, including relational databases (e.g. MSSQL, MySQL), NoSQL databases (e.g. MongoDB, RavenDB) and other databases (e.g. the KDB column database). Some key questions need to be asked when working out which technology to use:
- Scale: How much data is going to be stored? How fast might a solution need to scale?
- Reliability: How damaging to the business is a loss of data access at any point? How up to date does the data need to be for the people viewing it?
- Speed: How important is the speed of the various queries that might be run on the data? Are there areas where speed is more important than others?
- Geography: Is the data to be used across multiple locations? Does the data from one location need to be immediately visible to all locations?
- Security: All databases can be made secure, but can the data architecture be designed in a way that increases security?
The above are just some of the questions that need to be considered when planning a data architecture, determining which technologies to use, and deciding how to build out both the architecture and the infrastructure underpinning your data solution.