By: Justin Harrigan, Account Executive and Chris Saso, SVP of Technology
This past March, we blogged about Big Data to give an overview of the space and a background of the problems our clients are trying to solve with Big Data tools. Today we are going to dive a bit deeper into one such example where one of our clients used HP Vertica to solve a Big Data Analytics problem.
HP Vertica is an analytics database that enables organizations to better store, analyze and understand their data. HP offers a free Community Edition that can be downloaded and tested in your environment with your data to help you understand the quantifiable business value it can bring to your organization. Dasher has helped many clients with their Big Data initiatives. Below is a mini case study that highlights a client’s Big Data issue and how Dasher helped them use HP Vertica to solve their business challenge.
Our client’s data set was so large that their traditional relational database took an exceptionally long time to respond to large queries. For example, a complex join would take at least 3 hours to run. This prevented employees from running reports and exploring data during the workday, so they started these long jobs when they left work for the night.
Dasher first analyzed the size of the database. In this case, we were not concerned with the volume (<1TB) but with the number of rows (>80,000,000) in the database. Certain tasks run against their traditional RDBMS would time out and fail to produce actionable data. Dasher engineers realized that a more modern approach, a scalable columnar database, would be a better fit. They set up a three-node Vertica cluster to run a POC assessing the modern OLAP database’s performance for this particular use case. We were able to load the client’s data and run the three-hour join in three minutes.
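To give a feel for why a columnar layout suits this kind of analytic query, here is a minimal Python sketch. It is a toy model only, not how HP Vertica is implemented, and the table, field names and helper functions are all made up for illustration. It shows that an aggregate over one field has to touch every field of every record in a row-oriented layout, but only one column in a column-oriented layout.

```python
# Toy illustration only: hypothetical data and helpers, not Vertica internals.

# A small "table" stored row by row, as a traditional RDBMS would.
rows = [{"id": i, "region": i % 4, "amount": float(i)} for i in range(1000)]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def sum_row_store(records, field):
    """Sum one field; every full record is visited to reach that field."""
    return sum(record[field] for record in records)


def sum_column_store(columns, field):
    """Sum one field; only that column's values are visited."""
    return sum(columns[field])


columns = to_columns(rows)

# Values a scan has to pass over in each layout for SUM(amount).
values_scanned_row_store = len(rows) * len(rows[0])
values_scanned_column_store = len(columns["amount"])

# Speedup reported in the POC: a three-hour join ran in three minutes.
poc_speedup = (3 * 60) / 3

print(sum_row_store(rows, "amount"), sum_column_store(columns, "amount"))
print(values_scanned_row_store, values_scanned_column_store)
print(poc_speedup)
```

In this toy table the row layout passes over three times as many values as the column layout for the same answer. Real tables are usually much wider, and the POC result above (three hours down to three minutes) works out to a 60x improvement for that join.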
HP Vertica allows flexible deployment scenarios. It can be deployed in a client’s own data centers as well as in the cloud or a managed services environment. Dasher is working to set up and run the database on hardware already deployed by the client’s hosting provider. This shortened the time to deployment and to the return of true business value to the client.
Dasher provided the client with a full day of immersive knowledge transfer on the GUI and feature set of the Vertica DB. With knowledge of traditional databases, our client was able to learn how to set up projections and connect their Tableau visualization software to analyze new information in minutes, not hours.
Dasher continues to help the client test solutions for ingesting data into and exporting data from the Vertica DB. This client plans to grow data volumes from <1TB to 4TB in 9 months. In order to maintain efficiencies for data ingest and prevent bottlenecks, Dasher is evaluating OLTP solutions that remove traditional locking and caching mechanisms and improve throughput from 3,000 transactions per second to 100,000 on a single node. More on this in a future blog post.
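For a rough sense of scale, the figures above can be turned into a growth rate and a throughput ratio. This is a back-of-the-envelope Python sketch, not a capacity plan: it assumes the data starts at exactly 1TB (the client's actual starting volume is somewhat less) and that growth compounds evenly month over month, neither of which comes from the client's plan.

```python
# Back-of-the-envelope sketch; starting size and even compounding are assumptions.

start_tb = 1.0    # assumed: the post only says "<1TB"
target_tb = 4.0   # planned volume
months = 9        # planning horizon

# Constant monthly growth factor that takes start_tb to target_tb in `months`.
monthly_growth = (target_tb / start_tb) ** (1 / months)

# Throughput improvement being evaluated for ingest, on a single node.
current_tps = 3_000
target_tps = 100_000
throughput_gain = target_tps / current_tps

print(f"monthly growth: {(monthly_growth - 1) * 100:.1f}%")
print(f"throughput gain: {throughput_gain:.1f}x")
```

Under those assumptions the data would need to grow by a bit under 17% every month, and the ingest tier being evaluated would handle roughly 33 times the current transaction rate.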
The client is finalizing the platform, which will scale with their business as they acquire more clients and collect more data. The goal in architecting their solution was to prevent the rip-and-replace scenarios that you may run into with traditional architectures, while keeping SQL access intact to leverage current development efforts. Dasher achieved this by recommending and implementing the HP Vertica DB, which is ACID compliant and SQL-99 compliant no matter how many nodes are in the cluster.
Download the HP Vertica Community Edition and start analyzing your Big Data today. Contact Dasher so that we can help you determine whether HP Vertica or one of the other Data Analytics tools we provide is the right fit for your environment.