Understanding enterprise search
August 18, 2006, 10:41pm PDT | Length: 00:03:10
For businesses, conventional search engines often deliver too many results or irrelevant information. Brian Babineau of Enterprise Strategy Group explains why enterprise search is a more dynamic approach to accessing corporate data.
Hi, my name's Brian Babineau. I'm an analyst with the Enterprise Strategy Group and I've been covering the information management software market for the past four years, including enterprise search. Today I'd like to talk to you a little bit about how enterprise search works; to truly understand enterprise search.
Now what we're talking about today is not web searches, like your Googles and Yahoos. Let me give you an example of an enterprise search activity. Let's say we need to find the one email that Martha Stewart sent to her broker about ImClone. That'd be a very difficult task if we were using basic web tools to do that.
First, why do we need enterprise search tools? We need them to retrieve information. And why do we retrieve information? Well, number one, we lost it or we deleted it. Number two, as in the Martha Stewart case, some external party like a lawyer or a regulator wants us to produce it.
Now that we know why we need to retrieve information and understand our information a little bit better, let's talk about what we're going to be searching. What types of information formats? Now as I said, we all equate web pages to search, but in reality, enterprise information encompasses three types: number one, emails; number two, structured data; and finally one that we all know, our Microsoft Office and general purpose files. These information formats comprise our enterprise information, thus we're going to need to search them.
Now let's talk a little bit about the back end of search. When we need to search something, we need to have an index to actually locate our data specific files. We can build an index in two ways, attribute index or content index. Now there are pluses and minuses to either of these formats and I'll go through those real quickly.
First, an attribute index of data descriptors. Data descriptors may be the file name, the owner, when it was created. This index is very small. And the pluses are you can search it quickly, find things faster and you don't have to store as much information. The minuses are you can't really do keyword searches or number patterns. Full content indexing means you analyze all of the data within the file, all of the words. It's a bigger index. The pluses are you can do keywords. But searches take longer because we're going against a much larger subset of data.
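To make the trade-off concrete, here is a minimal Python sketch of the two index types. It is an illustration only, not how any particular enterprise search product works. The Document class, the two builder functions, and the sample files are all hypothetical, and the content index is a plain word-to-file lookup with no ranking or pattern matching.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record: three descriptors plus the full text.
    name: str
    owner: str
    created: str  # ISO date string, e.g. "2006-07-14"
    body: str


def build_attribute_index(docs):
    """Index only the descriptors; the body is never read."""
    index = {}
    for doc in docs:
        for field in ("name", "owner", "created"):
            key = (field, getattr(doc, field))
            index.setdefault(key, set()).add(doc.name)
    return index


def build_content_index(docs):
    """Index every word in every body (lowercased, punctuation trimmed)."""
    index = {}
    for doc in docs:
        for raw in doc.body.lower().split():
            word = raw.strip(".,;:!?")
            if word:
                index.setdefault(word, set()).add(doc.name)
    return index


# Made-up sample data for illustration.
docs = [
    Document("q3-forecast.xls", "alice", "2006-07-01",
             "Revenue forecast for the third quarter."),
    Document("broker-note.eml", "bob", "2006-07-14",
             "Please sell the shares before the announcement."),
]

attribute_index = build_attribute_index(docs)
content_index = build_content_index(docs)

# Attribute lookup: quick, but only answers questions about descriptors.
by_owner = attribute_index.get(("owner", "bob"), set())

# Keyword lookup: only the content index knows what is inside the file.
by_keyword = content_index.get("shares", set())

print(by_owner, by_keyword)
print(len(attribute_index), len(content_index))
```

Even with two short files, the content index ends up with more entries than the attribute index, which is the size and speed trade-off described above. The attribute index also has no way to answer the keyword query, because it never looked inside the files.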
Today I've talked about the why we need to search our enterprise information, what we're going to search, the different files, and then how we're going to search them. So the next time when you're looking for that email or you need to produce that file as a part of discovery, you can do so quickly, meeting your business requirements.