Data growth curve: Terabytes -> Petabytes -> Exabytes -> Zettabytes -> Yottabytes -> Brontobytes -> Geopbytes. It is getting more interesting.
Analytical Infrastructure curve: Databases -> Datamarts -> Operational Data Stores (ODS) -> Enterprise Data Warehouses -> Data Appliances -> In-Memory Appliances -> NoSQL Databases -> Hadoop Clusters
In most enterprises, public or private, there is typically a mountain of structured and unstructured data that contains potential insights about how to serve customers better, how to engage with them better and how to make processes run more efficiently. Consider this:
- Online firms, including Facebook, Visa and Zynga, use Big Data technologies like Hadoop to analyze massive amounts of business-transaction, machine-generated and application data.
- Wall Street investment banks, hedge funds, and algorithmic and low-latency traders are leveraging data appliances such as EMC Greenplum hardware with Hadoop software to do advanced analytics in a “massively scalable” architecture.
- Retailers use HP Vertica or Cloudera to analyze massive amounts of data simply, quickly and reliably, resulting in “just-in-time” business intelligence.
- New public and private “data cloud” software startups capable of handling petascale problems are emerging to create a new category – Cloudera, Hortonworks, Northscale, Splunk, Palantir, Factual, Datameer, Aster Data, TellApart.
Data is seen as a resource that can be extracted, refined and turned into something powerful. It takes a certain amount of computing power to analyze the data and to pull out and use those insights. That is where new tools like Hadoop, NoSQL, in-memory analytics and other enablers come in.
What business problems are being targeted?
Why are some companies in retail, insurance, financial services and healthcare racing to position themselves in Big Data and in-memory data clouds while others don’t seem to care?
World-class companies are targeting a new set of business problems that were hard to solve before – Modeling true risk, customer churn analysis, flexible supply chains, loyalty pricing, recommendation engines, ad targeting, precision targeting, PoS transaction analysis, threat analysis, trade surveillance, search quality fine tuning, and mashups such as location + ad targeting.
To address these petascale problems, an elastic, adaptive infrastructure for data warehousing and analytics is emerging that converges three capabilities:
- Ability to analyze transactional, structured and unstructured data on a single platform
- Low-latency in-memory or Solid State Device (SSD) storage for super-high-volume web and real-time apps
- Scale-out on low-cost commodity hardware, with distributed processing and workloads
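The third capability, scaling out by distributing processing, can be illustrated with a minimal map/reduce-style sketch. This is only an illustration under stated assumptions: the record layout, the function names and the use of local threads as stand-ins for commodity nodes are all hypothetical, and a real Hadoop cluster works with distributed storage and a job scheduler rather than an in-process pool.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts chunks (round-robin), one per worker."""
    return [records[i::n_parts] for i in range(n_parts)]


def map_partition(chunk):
    """Map step: total sales per store within a single chunk."""
    totals = Counter()
    for store, amount in chunk:
        totals[store] += amount
    return totals


def reduce_totals(partials):
    """Reduce step: merge the per-chunk totals into one result."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return merged


def scale_out_totals(records, n_workers=4):
    """Run the map step on each chunk in parallel, then reduce.

    Threads stand in for cluster nodes here (an assumption made
    purely to keep the sketch self-contained).
    """
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_partition, chunks))
    return reduce_totals(partials)


# Hypothetical point-of-sale records: (store_id, amount_in_cents)
sales = [
    ("store_a", 1200),
    ("store_b", 450),
    ("store_a", 300),
    ("store_c", 999),
    ("store_b", 50),
]
totals = scale_out_totals(sales, n_workers=2)
```

The design point is that each chunk is processed independently, so adding workers (or machines) increases throughput without changing the map or reduce logic.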
As a result, a new BI and Analytics framework is emerging to support public and private cloud deployments.
Data overload is becoming a huge challenge for businesses and a headache for decision makers. Public and private sector corporations are drowning in data — from sales, transactions, pricing, supply chains, discounts, product, customer process, projects, RFID smart tags, tracking of shipments, as well as e-mail, Web traffic and social media.
I see this data problem getting worse. Enterprise software, Web and mobile technologies are more than doubling the quantity of business data every year, and the pace is quickening. But the data/information tsunami is also an enormous opportunity if and only if tamed by the right organization structure, processes, people and platforms.
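To make the pace concrete, here is a small back-of-the-envelope sketch of what annual doubling implies. The starting volume of 1 petabyte is a hypothetical figure, and the factor of 2 is a floor, since the claim above is "more than doubling."

```python
import math


def volume_after(start, years, annual_factor=2.0):
    """Data volume after a number of years of compound growth."""
    return start * annual_factor ** years


def years_to_grow(multiple, annual_factor=2.0):
    """Years needed for data to grow by the given multiple."""
    return math.log(multiple) / math.log(annual_factor)


# Hypothetical starting point: 1 petabyte of business data.
after_10_years_pb = volume_after(1, 10)   # 1024.0 PB, roughly an exabyte
years_for_1000x = years_to_grow(1000)     # just under 10 years
```

In other words, at a doubling rate each step up the Terabytes -> Petabytes -> Exabytes ladder takes about a decade, and less if growth is faster than doubling.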
A BI CoE (also called BI Shared Services or BI Competency Centers) is all about enabling this disciplined transformation along the information value chain: “Raw Data -> Aggregated Data -> Intelligence -> Insights -> Decisions -> Operational Impact -> Financial Outcomes -> Value creation.” A BI CoE can improve operating efficiencies by eliminating duplication and streamlining processes.
In this posting we are going to look at several aspects of executing a BI CoE:
- What does a BI CoE need to do?
- Insourcing or Outsourcing the BI CoE
- Why do BI CoEs Fail?
- BI CoE Implementation Checklist
Everyone has data, but the more elusive goal is getting value out of that data. The growing challenge in corporations is how to organize for “data as a platform.” What is the right organizational structure that will help monetize data?
John Wanamaker, considered a pioneer in modern advertising, said: “Half the money I spend on advertising is wasted; the problem is I don’t know which half.” Today, we can say the same of enterprise investment in business intelligence (BI), analytics, and big data.
Even after organizations have done their best for over 20 years to build centralized, scalable information architectures, I have found that only a small percentage of their data is actually converted to useful information in time to leverage it for better insight and decisions.
At both strategic and tactical levels, much of this gap can be explained by the fundamental disconnect in goals, objectives, priorities, and methods between IT professionals and the business users they should ideally serve.
The other challenge facing leadership is the rapid evolution of the data platform (see below.) How do you create strategies that adapt to a changing landscape?
How do you become a world-class data-driven firm? What portfolio of projects do you execute to mature the capabilities?
If you’re an executive, manager, or team leader, one of your toughest responsibilities is managing and organizing your BI, Reporting or Analytics initiative. While the nuances (skillsets, toolsets and datasets) are different for each initiative, the fundamentals of managing, organizing and structuring are pretty much the same.
Almost every Fortune 1000 company’s management is increasingly focused on monetizing small data, big data or fast data, and how to gain a real-time competitive edge from their information. How can firms achieve positive returns on their analytic investments by taking advantage of the growing amounts of data?
So what’s the right organizational model that will help them achieve the “ten second advantage”? Competency Centers, Centers of excellence (CoE) or Shared Services models are execution models to enable the corporate or strategic vision to create an enterprise that uses data and analytics for business value.
The goal of every World-class CoE is the same – enable the right combination of toolsets, skillsets, mindsets and datasets for better, faster, cheaper and more repeatable analytics, reporting or platform development.
Evolution of BI/Reporting/Analytics
- Data is Growing Faster than Budgets
- Demand is Growing, Speed to Insight is Crucial
- Modifying large, existing applications is NOT the path forward.
- Skills are lagging behind new tooling
As a result, Enterprise BI and Analytics strategies need to evolve. The evolution tends to happen in 3 phases:
- Department Solutions – Many companies deploy Analytics (and BI) applications as departmental solutions and, in the process, accumulate a large collection of disparate BI technologies (SAP BusinessObjects, IBM Cognos, MicroStrategy, Oracle OBIEE, Microsoft, QlikView, Tableau, Spotfire, etc.). Each distinct technology supports a specific user population and database within a well-defined “island of analytics.” At first, these departmental islands satisfy the initial needs of the business, but early success in departmental deployment sows the seeds for new problems as the applications grow.
- Successful applications and platforms always expand. The second phase of Analytics (and BI) is one of tremendous growth, where platform solutions are no longer isolated islands. Instead, they overlap in user populations, data access, and analytic coverage. As a result, organizations face an untenable situation: the enterprise gets conflicting versions of the truth from multiple disparate BI systems, and there is no way to harmonize them without an extraordinary ongoing manual effort of synchronization, validation and quality checks. Equally problematic, business users are forced to use many different BI tools depending on what data they want.
- The third phase of Analytics (and BI) is one where the executives have had enough. They simply decide to rationalize to a single platform or a centralized model that is sold as a “magic nirvana” solution that delivers one version of the truth (a golden source of data) to all people across the enterprise. It can access all of the data, administer all of the people, eliminate repetitive data access, reduce the administrative effort, and reduce the time to deploy new BI applications.
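The "conflicting versions of the truth" problem from the second phase can be sketched as a simple cross-system check. Everything here is hypothetical: the system names, the metric names, the figures and the `find_conflicts` helper are illustrative assumptions, not part of any BI product.

```python
def find_conflicts(reports, tolerance=0):
    """Return metrics whose values disagree across BI systems.

    reports maps a system name to a dict of {metric: value}.
    A metric is flagged when at least two systems report it and
    the spread between values exceeds the tolerance.
    """
    all_metrics = set()
    for values in reports.values():
        all_metrics.update(values)

    conflicts = {}
    for metric in sorted(all_metrics):
        seen = {
            system: values[metric]
            for system, values in reports.items()
            if metric in values
        }
        if len(seen) > 1 and max(seen.values()) - min(seen.values()) > tolerance:
            conflicts[metric] = seen
    return conflicts


# Hypothetical figures reported by three departmental BI systems.
reports = {
    "finance_bi": {"q1_revenue": 1_250_000, "active_customers": 8_400},
    "sales_bi": {"q1_revenue": 1_310_000, "active_customers": 8_400},
    "marketing_bi": {"active_customers": 8_400},
}
conflicts = find_conflicts(reports)
```

Here `conflicts` contains only `q1_revenue`, with the two disagreeing values. Running checks like this by hand across many tools and metrics is the kind of ongoing validation effort that pushes firms toward a centralized model.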
“Time to decisions, scope of decisions, disconnected toolsets and cost of decisions” are deemed unacceptable within and across functional areas. This typically drives a new phase: a centralized BI, Reporting or Analytics CoE.
For example, at a Fortune 500 company, a costly self-service environment, static reports, departmental solutions and other issues (shown below) forced a rethink and re-engineering of the enterprise BI solution. The firm set new target objectives: (1) shorter time to insights; (2) greater leverage for the analytics team; (3) accelerated product innovation; and (4) a 20% reduction in BI support costs.
Centralization of BI, Reporting and Analytics can enable organizations to reduce their IT delivery costs by up to 40%. However, a failure to align the level of centralization closely with long-term business and IT strategic goals, and to manage the transition to centralized delivery carefully, can not only erode the expected savings but also increase the cost of delivering IT services by up to 30-45% compared to a pre-centralization baseline. This is where good management can make a big difference.
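A quick worked example shows how wide that range of outcomes is. The baseline of 10,000,000 in annual IT delivery cost is a hypothetical number chosen for easy arithmetic; only the percentages come from the paragraph above.

```python
def cost_after_change(baseline, percent_change):
    """Apply a percentage change to a baseline cost (integer arithmetic)."""
    return baseline * (100 + percent_change) // 100


# Hypothetical pre-centralization baseline for annual IT delivery cost.
baseline = 10_000_000

best_case = cost_after_change(baseline, -40)   # well-managed: 6,000,000
worst_low = cost_after_change(baseline, 30)    # poorly managed: 13,000,000
worst_high = cost_after_change(baseline, 45)   # poorly managed: 14,500,000

# Gap between the best outcome and the worst outcome.
swing = worst_high - best_case                 # 8,500,000
```

On this hypothetical baseline, the difference between a well-run and a badly run centralization is up to 85% of the original budget.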
BI CoE Elements for Faster, Better, Cheaper Execution
A BI CoE (or an Analytics CoE, Big Data CoE or Integration CoE) is an organizing mechanism to align People, Process, Technology and Culture. The target benefits include:
- Better collaboration between Business and IT
- Increased adoption and use of BI and Analytics in the lines of business.
- Better data management, quality and reporting
- Cost savings from eliminating redundant functions
CoE elements include:
The “Raw Data -> Aggregated Data -> Intelligence -> Insights -> Decisions” is a differentiating causal chain in business today. To service this “data->decision” chain a very large industry is emerging.
Business Intelligence, Performance Management and Data Analytics form a large, confusing software category with multiple sub-categories: mega-vendors (full stack), niche vendors, data discovery, visualization, data appliances, Open Source, Cloud/SaaS, Data Integration, Data Quality, Mobile BI, Services and Custom Analytics.
But interest in BI and analytics is surging. Arnab Gupta, CEO of Opera Solutions, explains why analytics are taking center stage: “We live in a world where computers, not people, are in the driver’s seat. In banking, virtually 100% of the credit decisions are made by machines. In marketing, advanced algorithms determine messages, sales channels, and products for each consumer. Online, more and more volume is spurred by sophisticated recommender engines. At Amazon.com, 40% of business comes from its ‘other people like you bought…’ program.” (Businessweek, September 29, 2009).
Here is a list of vendors who participate in this marketspace: