8 Proven Steps to Starting a Big Data Analytics Project

Feb 10, 2014
Scott Raspa

Initiating a big data analytics solution can take anywhere from a few weeks to several years. The timeline depends on many factors: how well you understand your requirements, your perceived information technology (IT) hurdles, your access to the data you require, the complexity of the analytics, and many more. So while big data organizations may have differing opinions on how long it actually takes to start and implement a solution, it doesn’t mean that one is right and the others are wrong.

Solutions Should Be Business-Driven

One thing that all vendors and organizations should agree on is that a big data analytics solution should be a business decision, not an IT decision. Numerous resources explain how to start and implement big data solutions, such as Intel’s planning guide for getting started with big data and IBM’s big data hub, which lists 10 big data implementation best practices. Both are really informative. We would also like to share our thoughts on the topic based on our own experiences.

An Eight-Step Approach to a Big Data Analytics Project

We’ve created a high-level list of 8 steps to consider when initiating a big data analytics project.

  1. Problem. Determine what problems you want to solve. Here you need to identify what issues your organization is facing and envision what the solutions to those problems might be.

  2. Impact. Understand how these problems impact your business and then develop use case(s). Are you losing millions? Is your staff wasting time by doing more data entry and less analysis? How is this problem impacting your organization?

  3. Success criteria. How will you measure success? What are the top metrics you need to track throughout this process?

  4. Value & Impact. You need to clearly understand what it would mean for your organization if this problem were solved. This is typically one of the most crucial steps, as it helps determine if, how, and when you should move forward with this project. It also provides context for determining the budget for your solution. For example, let’s say you work for a financial institution and your group is dealing with fraudulent activities that cost the company $5 million per year. Your goal is to reduce that to $2 million or less in the next year. Spending $3 million on a solution wouldn’t really provide the ROI that you’d like to see, since the cost would consume the entire $3 million in first-year savings. However, if you could implement a solution for, let’s just pick a number, $700k, that’s an ROI goal that’s definitely worth pursuing. If $5 million of fraudulent activity has a significant impact on your business, then this is probably a high-priority problem and you want to solve it as soon as possible. So understanding how your specific problem impacts your business is crucial in implementing the right solution.
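
To make the arithmetic in the fraud example concrete, here is a minimal sketch in Python. It assumes a simple first-year view in which savings equal the reduction in annual fraud losses and ROI is net benefit divided by solution cost. The function name `first_year_roi` and that ROI definition are our illustrative assumptions, not a standard formula from any particular tool, and a real business case would also account for ongoing costs and multi-year benefits.

```python
def first_year_roi(current_loss, target_loss, solution_cost):
    """Return (net_benefit, roi) for the first year.

    Assumption: savings are simply the reduction in annual losses,
    and ROI is net benefit divided by the cost of the solution.
    """
    savings = current_loss - target_loss      # e.g. $5M - $2M = $3M
    net_benefit = savings - solution_cost     # what is left after paying for the solution
    roi = net_benefit / solution_cost
    return net_benefit, roi


# The two hypothetical budgets from the example above.
for cost in (3_000_000, 700_000):
    net, roi = first_year_roi(5_000_000, 2_000_000, cost)
    print(f"cost ${cost:,}: net benefit ${net:,}, ROI {roi:.0%}")
```

Under these assumptions, the $3 million solution nets out to zero in the first year, while the $700k solution leaves roughly $2.3 million of net benefit.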

If you can’t clearly define and articulate steps 1-4, there is no point in moving to step 5. Also, note that the first 4 steps have little or nothing to do with the technology. This is intentional as you don’t want to force technology to solve your business problems. You are starting with the business problem and will map the appropriate technology to solve it.

  5. Cloud or On-Premise. Decide where the solution should live and whether it should be a cloud, on-premise, or hybrid solution.

  6. Data requirements. Evaluate your data requirements and understand what data is required to solve this problem. Is it data you already have? Is it data you need to go out and get? What is that data and what are your requirements for it? What are the throughput and performance requirements for the data? What are your retention and retrieval requirements?

  7. Identify gaps. Determine if this is something that your organization can accomplish with existing or in-house resources and technologies or if you need help from vendors. Do you have enough staff to solve this problem? Are they capable of solving this problem? Will you need additional hardware or software to solve it? Identify those gaps and make sure you plan accordingly.

  8. Agile or iterative approach. Start with a pre-production or a pilot implementation. Set goals and milestones and break them up into manageable chunks. Once the pilot is up and running and you see value from it, roll it out into production and enterprise-wide use.

Breaking Down Big Data Analytics Into Smaller Components

We typically start our new clients off with small, focused implementations that solve specific problems or issues. These implementations are very manageable and shorten the timeline to a valuable solution. Typically, that means a few weeks to a few months, so our clients start to see value sooner through iterative successes. The result is a pretty quick turnaround and the ability to identify the ROI much faster.

This approach also significantly reduces risk for our clients, because it doesn’t require multi-year commitments or vendor lock-in. They can implement smaller pilot solutions to make sure the solution will solve the problems at hand.

Another important benefit of this approach is its flexibility. Requirements change, needs change, the data that you’re looking at may change. All of these require a very flexible solution in order to be most effective. Once the implementation has demonstrated the capability to solve the problem, then it should be rolled out to production for enterprise-wide use.

There are many variables in implementing a big data analytics solution. Starting off with a very focused, targeted use case can break the work into more manageable chunks and make it easier to realize the ROI and value from that solution. That’s our approach here at IKANOW.
