Data Mining: Online Examples

Posted in Data mining, Online, VLAB on January 21, 2010 by Shankar Saikia

Roughly three years ago, I received the following nugget of wisdom regarding entrepreneurship: do something at the “intersection of your interests, skills and where the market is going.” This came from a presenter at VLAB who gave the same advice to her son as he was heading to college and felt that the advice was appropriate for aspiring entrepreneurs as well.

With so much going on in the glamorous worlds of mobile, social networks, music etc., why would one pick something as boring (or mundane, or prosaic; fill in your favorite synonym for “boring” here) as data? For me it’s because I’m a numbers guy who likes the worlds of planning, forecasting etc., and data analysis appears to be emerging as an area of growth. For me, data mining is at the intersection of my skills, likes and where the market is going.

What is data mining? My informal definition is that data mining is the process of getting some benefit, whether economic or non-economic, from information. A simple example, courtesy of Roger Magoulas from O’Reilly Media, is an online bookseller’s listing of each book’s “Sales Rank” – that single piece of data helps buyers make decisions.

Everyone is aware of the tremendous growth of social networks. Anyone who uses LinkedIn has probably noticed the “People You May Know” box – that’s a case of LinkedIn using information on connections between your contacts and the contacts’ contacts – another example of data mining. If you do a search for “data mining” on a job site like Simply Hired, you may find social networking companies like Yelp and Facebook advertising for data mining experts. Today I noticed that Simply Hired itself has added a neat capability – it can show your LinkedIn contacts next to a job listing – the value being that you can ask your LinkedIn contact to possibly give you a referral – another example of using data for your benefit.
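
To make the “People You May Know” idea concrete, here is a minimal Python sketch of one way such a suggestion could be computed: count how many contacts you share with each person you are not yet connected to. This is only an illustration and not LinkedIn’s actual method; the function name suggest_contacts and the sample network are made up for this example.

```python
from collections import Counter

def suggest_contacts(network, person):
    """Rank non-contacts of `person` by the number of shared contacts.

    `network` maps each person to the set of their direct contacts.
    It is a hypothetical in-memory structure, not a real LinkedIn API.
    """
    direct = network.get(person, set())
    shared = Counter()
    for contact in direct:
        for candidate in network.get(contact, set()):
            # Skip the person themselves and anyone already connected.
            if candidate != person and candidate not in direct:
                shared[candidate] += 1
    # Most shared contacts first; ties are broken alphabetically.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))

# Made-up sample data for illustration only.
network = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}
suggestions = suggest_contacts(network, "ann")
print(suggestions)
```

In this toy network, the person who shares the most contacts with “ann” is listed first. A real site would presumably weigh many more signals than shared contacts alone.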

What about those “other” companies?  Can a “normal” company, not just a social networking site, also mine online data? Sure. Take a look at this chart that shows trends for Google searches for rental car companies:

Trends in Google Searches

In this case each rental car company can investigate why there were relatively more searches for Enterprise Rental Car – was it because Enterprise advertised more in a specific location? This is an example of data mining of external information (i.e., information that does not reside within the corporate technology systems). You can view the actual chart here, and even drill into specific locations (for example, there were more searches for Dollar Rental Car in Hawaii).

Hal Varian, an economist who works for Google, recently said that statistics may be a good career choice in the future: “… The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill..”

I hope these examples gave you a better understanding of data mining as it pertains to online data. What do you think?


VLAB: Data Exhaust Alchemy Event (January 19, 2010)

Posted in Data mining, VLAB on January 19, 2010 by Shankar Saikia


I attended the VLAB event “Data Exhaust Alchemy: Turning the Web’s Waste Into Solid Gold” at Stanford GSB today. It was great – Bishop Auditorium was packed. Here’s a very brief summary of what I learned from each speaker:

1. Roger Magoulas, Director of Market Research, O’Reilly Media: He related the story of how an online bookseller, simply by adding the popularity ranking of books, was able to add a lot of value for customers – a great example of using the data exhaust.

2. JB (Mike John-Baptiste), CEO, PeerSet: PeerSet has developed algorithms that mine web data to help advertisers target the right audience.

3. Mark Breier, General Partner, In-Q-Tel: The venture arm of the CIA has invested in the following companies: Visible Technologies, Palantir and Fortius One. There are many security-related applications of the data exhaust.

4. Jeff Hammerbacher, Vice President of Products and Chief Scientist, Cloudera: He left Facebook because he felt that he did not understand consumer technologies such as online advertising. Cloudera offers a distribution of Hadoop, the open source framework that implements the MapReduce programming model developed by Google.

5. Dr. DJ Patil, Chief Scientist and Sr. Director of Product, LinkedIn: He preferred the word “ecosystem” (over the phrase “data exhaust”) to describe the data created on the web. He mentioned that with every passing day there are fewer people who are not on LinkedIn.

6. Pablos Holman, Futurist, Inventor, Security Expert, and Notorious Hacker: He was AWESOME. He stressed that, from a security perspective, everything that we do online and using mobile devices is in the cloud. He showed a cool demo of how our credit cards are NOT that secure.
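
For readers who have not seen MapReduce (mentioned in item 4 above), here is a minimal single-machine sketch of the idea in Python, using the classic word-count example. It is an illustration only: it does not use Hadoop’s API, and the function names map_phase, shuffle, reduce_phase and word_count are my own. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group the emitted values by key (here, by word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values for each key to get a total per word."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))

# Made-up sample input for illustration only.
documents = ["data is the new oil", "big data is big"]
counts = word_count(documents)
print(counts)
```

The point of splitting the work this way is that each document can be mapped independently and each word can be reduced independently, which is what makes the approach suitable for very large volumes of data.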

My overall opinion is that the speakers and their respective organizations are working on some very difficult and exciting problems related to the growing volume of data. A point that Richard made was that mining the social graph (e.g., our Facebook friends and the things we do, like etc. as recorded on Facebook) is very challenging. The good news is that companies like LinkedIn have been able to extract value from their data and have added cool capabilities such as recommending people we can connect to.

Bottom line:  did the meeting give me any ideas for products or services that I can sell to my enterprise customers? ….. Yes.

Data Deluge!

Posted in Data mining, Enterprise Software Sales, Twitter on January 12, 2010 by Shankar Saikia

The year was 2006, George W was in the White House and Google was the king of search. If you wanted a restaurant review you probably either looked at a copy of Zagat (the book, not online!) or you did a Google search. Fast forward to 2010 and what’s changed? Beyond the obvious change inside 1600 Pennsylvania Avenue, now if you want to research a restaurant you do a Yelp search – what a difference four years makes! What else has changed? The biggest change is the growth of data – mobile interactions, Google searches, Twitter tweets, Yelp reviews, Facebook friends and pokes and pictures … it’s a data deluge out there!

I’m increasingly convinced that the next big tech opportunity lies in being able to do something with the data deluge. A recent column in Gigaom mentions that Facebook’s greatest asset is its social graph (i.e., the connections between people and their friends, pokes, pictures etc.). Facebook is working very hard at extracting value from this data. Similarly, Yelp is trying to mine its own user-generated restaurant reviews.

The key to solving this data-mining challenge is to engage in “non-linear thinking.” It’s important to keep in mind that there is more to data mining than the IT-focused steps such as data collection, aggregation, modeling, validation etc. One of the most difficult parts of mining social media information is that the data is mostly unstructured (i.e., text, pictures etc.).

I’m really excited about the data challenges – this is the age of big data – for more on big data read this.

What do you think? Do you see an opportunity in the data deluge?

Think “Process” NOT “Software”

Posted in Business Process, Entrepreneurship, Pain versus Need on December 4, 2009 by Shankar Saikia

On Saturday evening I met a gentleman who had been attending football games at Stanford for 53 years. We started to talk about Stanford football, which has always been one of my favorite topics. Later our conversation drifted to entrepreneurship and he told me a story about how he recently advised two formerly homeless men who started a diaper and toilet-paper distribution company. The lesson he gave me is to “sell what the market needs” – go here for more on this fascinating story.

As someone in the enterprise applications software industry I often dream up ideas for new, innovative products. The goal is always to solve a business problem. Lately I have been interviewing business owners and asking them the following question:

“What are your two or three biggest pains?”

Here are the most popular answers:

  • CASH FLOW: having cash to run the business
  • PERSONNEL: while “personnel” is somewhat of an archaic term for human resource professionals (“human capital” seems to be the new phrase), one business owner told me that people are important because “during bad times they will help you, during good times they will help you”
  • LEADS: one business owner told me that during good times generating leads is not as important as during bad times such as now

These answers are fascinating because, while they are challenging problems, solving them requires changing business processes. Software consultants and salespeople often like the peanut-butter strategy of answering every question with the words “yes, our software can do that” ;). The reality is that software alone does not solve most problems.

The next time you work on entrepreneurial ideas or are trying to sell enterprise software, think of how you can solve your customer’s problems with better business processes.


10 Small Business Social Media Marketing Tips

Posted in Social Media Marketing, Twitter on November 19, 2009 by Shankar Saikia


Social Media Marketing – What is it?

Posted in Social Media Marketing on November 18, 2009 by Shankar Saikia

Earlier today I heard one of the client’s executives walking past our conference room say “oh wow, since you guys are talking about social media marketing, it must be cool stuff ... I want to be in YOUR meeting.” That was a revealing comment because it gave me more evidence that the term “social media” has entered the corporate lexicon. Later in the day, a friend of mine, who is also a top executive at a tech company, mentioned that he wanted to get a better understanding of social media marketing. Let me see if I can explain social media marketing.

I like to compare “traditional media” with “social media”. Traditional media is content created by companies – either media companies like newspapers, magazines and television broadcasters, or non-media companies like your friendly automobile company or hotel chain. Social media is content that is created by … society, i.e., you and me! Writing on a Facebook wall and writing or commenting on a blog are examples of social media. Social media marketing is the use of socially generated content to build awareness of a product or service. Writing a blog post about a product or service and posting updates on Twitter are examples of social media marketing.

You may ask: what does Web 2.0 have to do with social media or social media marketing? Web 2.0 is a general term for technologies that enable normal ;) people like you and me to create our own content. Technologists like to use the words “interactivity and interface” to describe the main characteristics of Web 2.0.

Do you have a better understanding of social media and social media marketing now?


E-mail Tactics for the Smartphone Age

Posted in Communication, E-mail on September 24, 2009 by Shankar Saikia


If most of your official (i.e., work-related, important etc.) e-mails are both direct and concise, then you do not need to read any further.


I am beginning to realize the following regarding my own usage of e-mail:

– the e-mails that I send tend to be relatively long
– the e-mails that I receive tend to be relatively short (with words like “sounds good”, “agreed” etc.).

These days it seems that many people read and reply to e-mails on their iPhones, BlackBerrys and other smartphones. In addition, most readers do not have a lot of time to read or write e-mails. The challenge then is to craft an e-mail that is direct (to-the-point) as well as concise. Whether your e-mail contains a request or a marketing communication, your goal is to convey the message quickly and in as few words as possible.

It is not easy to be direct and concise in written communication, but we need to develop the “direct and concise” writing style if we want our e-mail communication to be effective.

Do you agree?
