Friday, May 29, 2009

Prediction for Britain's Got Talent 2009

I predict that Susan Boyle will not win Britain's Got Talent.

My money is on Diversity. I am less sure of them winning than of Susan Boyle losing.

I think the shock value (extraordinary voice of an ordinary looking spinster) has propelled Susan Boyle into superstardom. But the novelty has worn off and people have seen in the semi-finals that her voice isn't the greatest they have heard.

Diversity, on the other hand, have significantly upped the ante. They are radical, nimble, and eminently likeable. And there are so many of them (think of their family and friends all voting for them).

And I hope they win.

Update: 31/05/2009 - Hahahaha!! Hahahahaha!!

Tuesday, May 26, 2009

NEWS: News Effectively Worth Squat

Why do I need to know that Mike Tyson's daughter is critically ill?

Or that her name is Exodus for that matter.

Update - More of the same.
Chitra Singh's daughter commits suicide

Friday, May 22, 2009

Article published in Integration Consortium

My article on "More the data, less the business intelligence" is published here.
On my blog here

Friday, May 15, 2009

Respect vs Authority - IT view

In one of the training sessions I attended recently, I was involved in a fierce debate about the 'need to rein in IT programmers'. A lady said that on many occasions, the programmers go on and do things as they want, and there was a need to check this. Another gentleman piped in, saying that "this is the only profession where you don't need to have any qualification to be called an expert". That's where, he thought, "IT ALL WENT WRONG".

The disdain was rankling. For one, I think the advances that have happened in and through IT have far surpassed anything else in the last two decades. Think of home PCs, mobiles, gaming, the internet for God's sake (the lady chose to differ!!). One of the primary reasons this has happened is the immense freedom that IT developers get. These programmers have either been working independently or have had enough freedom in their jobs to go away and do something that they thought was path-breaking, interesting and fun. Because the resources available to a programmer are essentially free, he can go about this task all alone, with minimal supervision and funding.

IT is a field that allows for the greatest exchange of ideas, simply because you can share the essential building block - the code - with anyone in the world. You can copy already available code and create something completely new. If what you come up with is good, people will use, approve, and promote it. If not, you get feedback to improve or bin it.

To say that management always needs to rein in these guys is to not understand how fundamentally differently things operate in the computing world. Creativity in any field is fostered through breaking down barriers and allowing maximum freedom. It's especially so in IT. Of course there need to be checks on what the final output is, but even here, the old way of using authority to do that is obsolete and obstructive. Rather, we now have what is called respect. If you create something great - an article, paper, code, or design - you publish it to the world. The world studies it, reviews it, and scores you on your artifact. This is a much more democratic, personality-independent method of weighing the value of a creation.

You see it in blogs, movie ratings, articles and code. Many times the rating is accompanied by enlightened comments from the reviewers, clearly articulating what aspects they liked or disliked. This feedback is much more useful than quality checks done by a single dedicated team of people.

Of course, sometimes the artifact you create cannot be published to the whole world. But then you can use a private cloud or intranet to reach a vast pool of people in your organization to review your product.

But the point is, in World 2.0, RESPECT WORKS BETTER THAN AUTHORITY. It's high time we recognize this fact and use it to our advantage.

Wednesday, May 13, 2009

CBIP recognition in UK

From my experience, it seems that there isn't enough recognition of the CBIP certification in the UK. The best measure of this is to compare the number of data warehouse / BI jobs mentioning CBIP that are advertised in the UK and in the US.

In the US
Let's try Monster - it throws up 3 jobs with CBIP as a requirement or preferred

At Jobster - it has 7


In the UK
Monster - 0

At IT Job Board - 0

On any given day, you will find a similar trend - definitely recognized in the US, but not in the UK. Wonder why that is? Not enough marketing/awareness building by TDWI?

Friday, May 08, 2009

More the data, less the business intelligence

SUMMARY
Often, there is a quest to do more with the data we have. The notion is - "We have this piece of data, probably, it means something important. Let's find out!" This has led to the creation of mammoth corporate data warehouses which are poor in imparting knowledge and big on maintenance. Most of the data that they contain is fractured, incorrect, and redundant. Good decision making doesn't need ALL the data that is available. In fact, good decision making demands limited data, i.e. KPIs. This means that data warehouses, which are our decision support engines, need to be built as per specs (not as per what source systems contain). Moreover, we need to educate ourselves to look beyond data - recent global events have highlighted acutely the inherent limitations of data and of the people who use it.


DATA WAREHOUSES AS DUMPING GROUNDS
As organizations become more tech-savvy, they want to do more with the data they generate. Letting the data be just created and consumed for its fundamental need – facilitating a core business transaction - is unacceptable. Organizations want to store, examine, tweak and mix data, and hope that it will be useful in some way.

More and more buzzwords - business performance management, digital dashboards, scorecards, web analytics, on-demand BI - have added to the confusion in the information management space, all the while making the CIO worried that he is not doing enough with the data he owns. Organizations are trying to extract the last ounce of meaning from the smallest data element. The notion is - "We have this piece of data, probably, it means something important. Let's find out!" I wonder if it is the right approach.

Data stores are typically built for analysis and provide "business intelligence". Large ones - data warehouses - contain an aggregate of all the organizational data. Data marts focus on specific subject areas or departments. Operational data stores cater to immediate needs - minute-by-minute or daily updates - and are smaller in size. The thing to note is that these stores are not core transactional systems themselves. They are decision support instruments that help management in its decision making process and in coming up with business plans based on it.
Capturing all the available data in these stores is fraught with numerous problems.

1. The more data we bring into them, the more noise we add. Some of this noise can be attributed to incorrect system entry or untidy manual processing. However, sometimes there is a genuine anomaly, e.g. sudden freak rains on a given day mean that footfall on the high street was reduced by 40%. But does this information help us in business planning? Not really.

2. It is difficult to sift through loads and loads of data. It obscures rather than informs. You might organize all the data in neat, fancy multi-dimensional reports. However, someone has to trawl through them to understand what the hell all those arrays, tags, and numbers are. By the time you do that, the time to make a critical decision may have already passed.

3. Capturing minutiae into a data warehouse requires huge spending. The infrastructure to clean the data, pump it into the warehouse, and store it adds massively to the costs. For companies already struggling to build a plain vanilla data warehouse, it's plain vanilla foolishness to waste money on a real-time, all singing, all dancing BI behemoth.

4. Be prepared for poor performance from a large data warehouse. Normal databases cannot match the performance of data warehouse appliances, which are specifically targeted at large data volumes and sophisticated data analysis. However, such appliances are extremely expensive. Regular RDBMS databases struggle to process vast quantities of data; the load times and SQL queries are much slower, whatever the optimization strategy used. A lot of the user queries result in full table scans and the fetching of millions of records at a time. This is thanks to the approach "Let's see what I can get out of the system" rather than "What do I really want?" It puts undue pressure on the system, leading to poor performance. No wonder it leaves users dissatisfied, and in the end they turn away from the system.

5. With the oceans of data available, the decision makers are free to glean information that fits their views or needs. This phenomenon is called confirmation bias. Managers can use it to justify the spending last year or budget for the next, demonstrate exceptional performance, or apportion blame for poor performance on some external factors. It doesn't matter whether people do this selective data filtering consciously or sub-consciously. The fact is we do it.

WHY DOES ALL THE DATA GET ASKED FOR?
Most of the projects are still run as IT change initiatives. This means that business is considered ancillary to the whole activity. The IT project managers don’t put sufficient demand for business resources. Unfortunately, the team managers do not want to take out key resources from their day-to-day activities to engage in something that provides no immediate value-add and may harm their team's performance. In the absence of clear business requirements, the IT team is forced to design a system to cater to any eventuality. Alas, that is not possible.

Some blame rests on the IT team as well. The business representatives don't know what features or functionality will be available in the new world. Business could do with some handholding about the capabilities of the new architecture – screen layouts, filters, collaboration, scheduling, security, flexibility etc. – so that it knows what to expect and can provide its requirements accordingly. A few mock-ups go a long way in flicking the light bulb on, giving the business users a number of ideas about ways to use the new architecture effectively, and also demonstrating its limitations. In the absence of any business-IT workshops early on, business will keep drifting and won’t know what to expect. They will keep asking for everything.

DATA AND DECISION MAKING
Decision support stores provide a chance to see through the convoluted, shifting data spaghetti and glean out the most pertinent information from it. The starting point always has to be 'what do we need to find out' backed by a strong 'why' and then move to 'How can we achieve this?'

We fail to understand the intrinsic difference between data, information, and knowledge. Data is unprocessed content. Information is a higher abstraction which provides additional meaning to the data. Knowledge is the application of the underlying information or data; it's about a true understanding of 'what lies beneath'. The important thing to note is that more data doesn’t mean more knowledge. And more information doesn’t equal more knowledge. In fact, just the opposite may be true.

Good decision making doesn't need ALL the data that is available. In fact, good decision making demands limited data, what is known as KPIs - key performance indicators. E.g. a KPI for a recruitment team is the recruitment increase percentage. It can be derived from the number of new recruits this year over the previous year. You might capture a whole lot of information about the new recruits, but most of it is peripheral to the success of the recruitment team. A doctor doesn't need to record all your physiological variables to correctly diagnose a problem. Just 3-4 key indicators are enough to tell him if you are having a heart attack. However, we humans believe that the more information one can acquire to make a decision, the better. This is known as information bias. Yet extra information that cannot affect our decision is simply not worth knowing. Based on this, it is critical to limit the content of any decision support system - be it a warehouse or an operational data store.

Some of the really good decision making is intuitive. With just a handful of information and unconscious rapid cognition, we can arrive at an accurate judgement very quickly. It is what Malcolm Gladwell calls thin-slicing in his fascinating book, Blink. It is a powerful, sophisticated tool for taking quick decisions with minimum information. In fact, in the face of mountains of data, this ability no longer functions. Intuition, by definition, is fragile and short-lived, and too much information can often paralyze it.
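As a small illustration of how little data such a KPI needs, here is a minimal Python sketch. It assumes the recruitment increase percentage means the year-on-year percentage change in new recruits; the function name and the sample figures are made up for illustration.

```python
def recruitment_increase_pct(recruits_this_year: int, recruits_last_year: int) -> float:
    """Year-on-year percentage change in new recruits (assumed KPI definition)."""
    if recruits_last_year <= 0:
        # A percentage change is undefined without a positive baseline.
        raise ValueError("recruits_last_year must be positive")
    return (recruits_this_year - recruits_last_year) * 100.0 / recruits_last_year


# Hypothetical figures: 100 recruits last year, 120 this year.
kpi = recruitment_increase_pct(120, 100)
print(f"Recruitment increase: {kpi:.1f}%")
```

Two numbers are all the calculation consumes; none of the peripheral detail captured about each recruit enters into it.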

WHAT TRUE ANALYSIS DEMANDS
Real, meaningful analysis needs time and distance. There is no benefit in seeing all the variables right this second to be able to make a decision about the ‘next XYZ strategy’. Over a period of time, the data will be sufficiently stable to be able to give a reliable picture of current business trends and decide on how to tackle the future ones. Distance refers to the fact that one doesn’t need granular, detailed data for analysis. One needs to zoom out slightly to see it in a broader context. Summarized data rolled up at reasonable levels spliced across relevant dimensions will provide a more substantive, vivid depiction of the overall performance and help sharper, quicker decision making.
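To make the idea of "distance" concrete, here is a rough Python sketch of rolling granular records up to a coarser level. The record layout (date, region, amount) and the sample values are hypothetical; a real warehouse would do this in its own ETL or SQL layer.

```python
from collections import defaultdict
from datetime import date


def roll_up(transactions):
    """Summarize (date, region, amount) records into totals per (month, region)."""
    totals = defaultdict(float)
    for day, region, amount in transactions:
        # Zoom out from individual days to calendar months.
        totals[(day.strftime("%Y-%m"), region)] += amount
    return dict(totals)


# Hypothetical daily sales records.
sample = [
    (date(2009, 4, 3), "North", 120.0),
    (date(2009, 4, 17), "North", 80.0),
    (date(2009, 4, 21), "South", 50.0),
    (date(2009, 5, 2), "North", 200.0),
]

summary = roll_up(sample)
for (month, region), total in sorted(summary.items()):
    print(month, region, total)
```

The four detailed rows collapse into three monthly figures per region, which is the level at which a trend becomes visible.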

It’s a fallacy that ALL the answers are there in the data we capture. A school might be deemed to have an exceptional teaching model based on the performance of its students in Grade 10 tests. However, this conclusion doesn’t take into consideration the fact that the school has a rigorous test-based admission process, ensuring only the best students get admitted to the school. No school maintains un-admitted students on its rolls. This is called survivorship bias, or the problem of silent evidence.

The other trouble is the inability to predict cataclysmic events like earthquakes, terrorist attacks, fires or, for that matter, a recession. Such is the disruptive force of these events that the forecasting and predictive models just don’t stand up to scrutiny. This, at a time when you expect such tools to guide you the most. The need is to look beyond the obvious.

Massive data stores give us a sense of being all-knowing. We think we can just immerse ourselves in them and come out enlightened. The truth, as explained, is not that simple. In reality, the tools are only as good as the data they contain and the people that use them. Unfortunately, both are inherently flawed. We need to remember two things - data doesn’t tell everything, and you need to know what you are looking for.
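The school example can be sketched as a toy simulation in Python. Everything in it is an assumption made for illustration: applicant ability is drawn from a normal distribution, the school admits the top 10% by entrance test, and its teaching adds nothing at all to the scores.

```python
import random
from statistics import mean


def simulate_selective_school(n_applicants=1000, n_admitted=100, seed=42):
    """Return (average of all applicants, average of admitted students)."""
    rng = random.Random(seed)
    # Hypothetical ability scores for everyone who applied.
    abilities = [rng.gauss(50, 10) for _ in range(n_applicants)]
    # The admission test keeps only the strongest applicants.
    admitted = sorted(abilities, reverse=True)[:n_admitted]
    # Assumption: the teaching model contributes nothing to the final score.
    teaching_effect = 0.0
    admitted_scores = [ability + teaching_effect for ability in admitted]
    return mean(abilities), mean(admitted_scores)


all_applicants_avg, school_avg = simulate_selective_school()
print(f"All applicants: {all_applicants_avg:.1f}, admitted students: {school_avg:.1f}")
```

Even with a teaching effect of zero, the school's average comes out well above that of the applicant pool. The rejected applicants are the silent evidence that the school's own records never show.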