Big Data with BizViz?

by Avin Jain (CEO, BizViz)

Background

The classic analytics space is made up of companies that run algorithmic number crunching and share the findings as white papers. Those findings, however, go stale within a short time. Top-tier vendors such as SAP, Oracle, Microsoft and IBM charge licensing costs for the database, the platform and roughly 25 other products they have acquired over more than a decade. These four big vendors are coming up with good Big Data solutions, but all of them are extremely costly and the talent base for them is small.

With the advent of Big Data and Hadoop-based infrastructure, today's analytics companies need the full set of skills: they should be able to set up a Big Data engineering lab with engineers who know the best methods for dealing with large amounts of data and who are data scientists as well. Finally, you need a strong BI visualization framework or tool whose dashboards can be seen on all devices. Companies like BizViz, which have the hard-earned capability to work end to end using home-grown utilities and open source software, can execute this job in the most economical manner and with the best control over real-time analytics.

The motive behind collecting huge amounts of data is to gain deeper insight through analytics and to identify correlations in the data, which are worked on in later phases. However, the collected data has a very diverse and unstructured format and is of no use to analytics in its raw form. The data is stored in Hadoop clusters (in HDFS).

What is Big Data?

Well, everybody knows this by now. Let’s represent this figuratively as explained by our experts, clubbing all sorts of data into a data cobweb.

Why is Big Data Analytics required?

Everyone knows that data volumes are growing exponentially. What's not so clear is how to unlock the value they hold. To improve the health of a person we monitor all parameters, but in an organization more than 50% of the data is unstructured or partially structured, and we don't use it to check the health of the organization. The health of an organization is always relative. With the Internet of Things, we need to keep a close watch on what is happening in our business dimension vis-à-vis the competition. Organizations that don't do Big Data analytics will probably perish in the next 10 years; put another way, the organizations that did Big Data in the last 5-10 years are ruling the internet world today. The Internet of Things will bring all business platforms onto the internet, and Big Data will decide the financial growth, competitiveness and target markets of any progressive organization.

Key Skills to provide Big Data Analytics

In this document there will be repetition around these 4 Skills. In my view all these 4 skills are important to deliver a profitable real time Big Data Platform to an enterprise.

BizViz, either directly or through an SME partner, is able to provide all 4 skills. This way the customer can be sure that one vendor takes care of all 4 legs of Big Data Analytics and that world-class support is available for a long time.

What are the pre-requisites for a Big Data Implementation?

Different analysts have been talking about ROI from Big Data implementations. Some say a 25% return, some say a return of 55 cents on every $1 invested. These figures are baffling. In our view there should be a minimum return of $30 for every $1 invested over 5 years; otherwise something is grossly wrong. One of the key points is to ensure that all of the following pre-requisites are met before you start a Big Data project.

  1. Conversion of all the text data to lower cases for exact matches
  2. You have a full-fledged BI implementation system and you are happy with the ROI it has delivered to your organization.
  3. You have done 'What-IF' analysis of your key Financial Parameters and have been flexible enough to make changes in your organization.
  4. You have been taking regular feedback or Survey from your customers, partners, vendors etc.
  5. You have defined a clear problem which you want to solve by Big Data Implementation.
  6. You have allocated a specific budget to solve this problem. Part of the budget covers the time of internal resources who can work with a Big Data partner/vendor like BizViz.

Note: In case you have not done 1, 2 & 3, we suggest that you use BizViz's BI consultancy services for a complete BI implementation and ask your team to put it into practice.
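
The ROI figures quoted above are stated in different units, so it can help to put them on one scale. The sketch below is only an illustration: it assumes that "25% return" means $1.25 back per $1 invested, and the function names are hypothetical.

```python
def roi_percent(invested: float, returned: float) -> float:
    """Net gain (or loss) as a percentage of the amount invested."""
    return (returned - invested) / invested * 100.0


def yearly_growth_factor(invested: float, returned: float, years: int) -> float:
    """Constant yearly multiplier that turns `invested` into `returned`."""
    return (returned / invested) ** (1.0 / years)


# Amount returned per $1 invested for each figure quoted in the text.
scenarios = {
    "25% return": 1.25,        # assumption: $1.25 comes back per $1
    "55 cents per $1": 0.55,
    "$30 per $1 over 5 years": 30.0,
}

for label, returned in scenarios.items():
    print(f"{label}: ROI = {roi_percent(1.0, returned):.0f}%")

factor = yearly_growth_factor(1.0, 30.0, 5)
print(f"$30 per $1 over 5 years needs about {factor:.2f}x growth per year")
```

Read this way, 55 cents per $1 is a net loss, while the $30 target means the investment has to roughly double in value every year for 5 years.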

Exception to Pre-Requisites

Suppose you have identified a separate module where you want to leverage a Hadoop-based data warehouse implementation directly to save cost per TB. Then you should go ahead, but this is not a Big Data implementation project; it is more like a POC.

1. Subject Matter Expert at Work

  1. The domain expert understands the problem statement and defines the problem in a simpler way that the technical team can use to solve it.
  2. Knows where the data resides, which data is useful and which data can be used for which type of analytics.
  3. Can design different scorecards, algorithmic charts, benchmarking analysis reports & dashboards.
  4. Works with the end client to define the real problem, define and agree on the scope, and drives the technical team.

2. Unlock Big Data - Working on Data Collection Layer

  1. Understand existing data sources.
  2. Search and navigate data within existing systems.
  3. Reading web data - crawling or scraping of data. Data can even be scraped from image, PDF, Doc, audio and video files.
  4. Reading social media data - we have a connector through which we can read data from Facebook, Twitter, LinkedIn etc. This is a tool developed by BizViz. More details on this are in another blog on the BizViz website.
  5. Reading structured or unstructured data from web data apps like Salesforce, Google Analytics etc.
  6. Providing end-to-end survey services where data can be captured as a normal survey or as a text-based survey. BizViz has a complete end-to-end survey platform and services [www.BizVizsurvey.com] & dashboards.
  7. We have dynamic HTML5 forms where metric data can be entered from mobile devices and used directly for dashboards. This is a tool provided by BizViz as part of our HTML5 portal, which runs on all devices.
3. Data Processing Layer

Data Clean Up - The unstructured data can be very large and most of it might be meaningless on its own, but its collective message is meaningful and impactful. One needs to filter this data: convert it to lower case, remove punctuation marks, stem words so that they match exactly, and so on. Use NLP (Natural Language Processing) techniques at this step.
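
The clean-up steps above (lower-casing, removing punctuation, stemming) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the tooling used by BizViz; the suffix list and the function names are assumptions, and a real project would use an established stemmer.

```python
import string

# Assumption: a tiny illustrative suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def crude_stem(word: str) -> str:
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_text(text: str) -> list:
    """Lower-case the text, drop punctuation, split on whitespace and stem."""
    lowered = text.lower()
    without_punct = lowered.translate(PUNCT_TABLE)
    return [crude_stem(token) for token in without_punct.split()]


print(clean_text("Customers LOVED the dashboards!"))
print(clean_text("Loving, loves"))
```

After cleaning, "Loving" and "loves" reduce to the same token, which is what makes exact matching possible in the later steps.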

Categorization & Classification of Data - Use machine learning tools like Apache Mahout or Enterprise R. The two tools provide different algorithms for clustering and classification. Automated text conversion is also used here for proper classification of unstructured data.
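
To make the classification step concrete, here is a minimal sketch of one common text classifier, multinomial naive Bayes with add-one smoothing, written in plain Python. It is not Mahout or R code, and the class name, labels and training tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, tokens, label):
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already cleaned training data.
model = NaiveBayesText()
model.train(["late", "delivery", "refund"], "complaint")
model.train(["broken", "refund", "angry"], "complaint")
model.train(["great", "dashboard", "thanks"], "praise")
model.train(["love", "fast", "support"], "praise")

print(model.classify(["refund", "late"]))
print(model.classify(["great", "support"]))
```

The input tokens are assumed to come out of the clean-up step, so that the same word always appears in the same form.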

Find the relations between the different data and push the result into the Hadoop file system. This step uses various tools and paves the way for modern data warehousing that will change the way we think of a conventional database.

Hadoop Framework -

Hadoop-based DWH Implementation Benefits:

  1. With Hadoop, you don't need to know which questions will be asked before designing the data warehouse
  2. Simple algorithms on Big Data outperform complex models
  3. Powerful ability to analyse unstructured data
  4. You can save millions in TCO
  5. 10x faster, 100x cheaper long-term solution
  6. Maintains the same SLAs as you have been maintaining
  7. Changes can be implemented without impacting users

Data Organization Layer

  1. The relevant data can now be moved into Hive [which exposes the data as row-column tables, much like MySQL or Oracle].
  2. A query can be written on this data [Hive Query Language to get relevant data].
  3. Query latency in Hive is high, so a data mart layer is created. In-memory engines like Spark/Shark are in the R&D stage but will come out soon to give an in-memory flavour to open source databases.
  4. Here tools like Cloudera Impala can also be used as an MPP [massively parallel processing] query engine.
  5. Here we can arrange data from other structured databases as well.
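
The idea behind the data mart layer can be sketched as follows: run the slow aggregate query once in batch and let dashboards read the small result table. In this sketch an in-memory SQLite database stands in for both Hive and the data mart, which is purely an assumption for illustration; the table and column names are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the Hive store and the data mart.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback_detail (region TEXT, sentiment TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO feedback_detail VALUES (?, ?, ?)",
    [
        ("north", "positive", 0.9),
        ("north", "negative", 0.2),
        ("south", "positive", 0.8),
        ("south", "positive", 0.7),
    ],
)

# Batch step: run the expensive aggregate once and store the result.
conn.execute(
    """
    CREATE TABLE feedback_mart AS
    SELECT region, COUNT(*) AS responses, AVG(score) AS avg_score
    FROM feedback_detail
    GROUP BY region
    """
)


def region_summary(region: str):
    """Dashboard-side lookup that reads only the small mart table."""
    return conn.execute(
        "SELECT responses, avg_score FROM feedback_mart WHERE region = ?",
        (region,),
    ).fetchone()


print(region_summary("south"))
```

The dashboard never touches the detail table, so its response time no longer depends on the latency of the underlying store.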

Data Connector Layer of Analytic Engine

  1. Once we have the relevant data, a connector can read it into an analytics engine. This can be R Server or any third-party analytic application like Tableau, Jasper, SAP Business Objects or BizViz's BizViz.
  2. A query can be written on this data [Hive Query Language to get relevant data].
  3. The data is fed into R Server and the results are pushed back into the server layer of the BI visualization framework.

The BizViz or UI Layer - Completely developed by BizViz.

  1. The entire business can be visualized using our dashboarding product called BizViz. This actually represents the entire Big Data Analytics Framework.
  2. BizViz has a Designer which helps us to select different charts and design the UI layer as envisaged by the SME.
  3. The analytics data is passed to the UI server and on to the relevant charts, where predictive components are handled.
  4. The normal data can be passed to relevant charts by writing query services.
  5. The dashboard can be made on Benchmarking data, Individual customer data, Group data, Survey data or a combination of all of them.
  6. We have an HTML5 Designer and HTML5-based technology, so the dashboards can run on all devices and the latest browsers.

The Hosting and Display of Analytics/Dashboards - BizViz has HTML5 Portal

  1. To display dashboards we need a portal that can host dashboards.
  2. This portal can be hosted on premise as well as in the cloud.
  3. The portal is developed in HTML5 so that it can run on all devices and all the latest browsers. The portal and dashboards run seamlessly on iPad and other mobile devices.
  4. The portal has strong Security, User Admin & Audit features.

Note: Setting up the infrastructure involves setting up Hadoop clusters, Hadoop administration, and setting up other servers for activities like BI/Reporting/Analytics/R Server etc. This is part of our Big Data Services.

TCO of a Big Data Project

With BizViz the TCO is very low compared to other Big Data vendors. We are able to achieve this for the following reasons:

  1. Quick time to value, since we have all 4 Big Data Analytics capabilities in hand.
  2. World-class support at offshore rates.
  3. Full open source compatibility & integrated installation of other components.
  4. Enhanced business knowledge with the flexible BizViz Analytics platform.
  5. Reduced operational risk, since we have experience of working on different Big Data projects.
  6. Strong delivery platform - HTML5 dashboards with world-class UI [already being used by a few Fortune 500 clients].

How to ensure ROI from a Big Data Project

For organizations across different continents, the following is the current ratio of structured, semi-structured and unstructured data. One can easily see that when business decisions are based on only 50% of the data, most of those decisions will fall flat within a few years.

With the world getting digitized and huge amounts of data being added daily, this share will fall further, from 50% to 20-25%, in the next few years. Now think of two competing companies, one that has invested in Big Data and analyses 100% of its data, and another that does traditional BI. The change in business dynamics will be so large that the second company will certainly be wiped out in a few years. So the first goal is to survive, and to survive one has to do Big Data Analytics.

  1. Ensure that your team is not resistant to change. One needs to apply the new findings intelligently.
  2. There is a lot of iterative and exploratory analysis - take small steps and, once you get results, increase the intensity.
  3. Start with a proper POC that doesn't involve too much cost. The really high cost is in real-time analytics. Before moving to real time, ensure the findings are working for you.
  4. Go module by module. It is better to implement Big Data for a new module.
  5. Once the solution is delivering ROI, invest in making it real time.

When do you stop a Big Data Project?

  1. If the Big Data project is not resulting in any ROI, then look at the problem statement & the recommendations from the vendor.
  2. In case you have not implemented the vendor's recommendations, then that needs to be done first to get ROI. If you can't do it, then stop the Big Data project.
  3. In case you have implemented the recommendations and you are getting either no results or negative results, then change the vendor immediately.

Note1: Generally one should never stop a Big Data project. One can change the vendor if there are no signs of ROI within 12-18 months.

Note2: Since Big Data is currently new, one will find thousands of Big Data vendors who sell Big Data services without actually knowing the field. They may be strong on sales but lack implementation capabilities. Be aware that Big Data is not just Hadoop, and not every new Big Data start-up knows its complexities.

Big Data Comparison Table

The above table will give you an excellent idea of who is providing what. Most of these are licensed products and some of them are extremely high cost.

BizViz Value Statement

  1. No Software Licenses - No need to buy any software licenses from any third party. BizViz does sell licenses of its dashboarding product but those licenses come free along with a large Big Data Project.
  2. Everything from BizViz - Get Big Data Analytics from a company which does everything from gathering requirements to setting up a Big Data lab to efficiently managing and upgrading the system in any technology.
  3. Get up and running quickly - With the different Big Data tools built by BizViz, any customer can expect results in a few weeks or months.
  4. Highly Cost effective - Please see the TCO section for more details.
  5. Experiment with analysis on different data and combine them with other sources.
  6. Perfect merging of Traditional BI and Big Data Approaches.
  7. Top end Business Visualization that runs on all devices.
  8. Top end services from a team which has worked in BI for the last 15 years.
  9. The BizViz team already has exposure to Big Data projects in the Financial Services, Marketing Intelligence, Education, Banking and Automobile industries.