If you recommend only search engines to users who need to find information on the Web, think again. We'll help you provide your users with search tools to give them access to the best hits in a very short time.
Doing research across the World Wide Web is like sifting for gold: you know the information you need is out there, but you don't have the patience to sift through every site to dig it up. And if you've been asked to deploy a standard search solution for your users, you'll have to sort through users' skill sets (Boolean versus concept searches) to find the tools that work best.
This comparison stems from these frustrations. How do your users get the most recent information, keep track of the competition, and follow all the particular markets that are valuable for your business? The ideal tool would provide consistently accurate hits, an easy way to summarise the results, and a way to track valuable sites for future reference. We didn't care about a fancy interface or the highest number of hits. Our bottom line: the tool had to take us to the valuable sites we knew were out there.
The search engine solutions (AltaVista and Yahoo) were our baseline. They are the most commonly used and readily accessible tools, but they often offer thousands more hits than answers.
Hit hard, hit often
We chose AltaVista (www.altavista.digital.com) because it is the most often used, and Yahoo (www.yahoo.com) because it's a directory of information similar to a library index. We anticipated that these would be the bare bones of the search tools, providing basic hits, unrefined search terms, and the kiss of Web death: too much information. We also tested client-based monitoring solutions (Smart Bookmarks 3.0 and Tierra Highlights2) combined with the Excite search engine. Excite identified appropriate sites, then the monitoring products notified us of any changes to those sites and to the Excite search lists.
Metasearch software solutions (Internet FastFind and WebCompass 2.0) use agents to find relevant Web sites and then update the lists of sites in the software on a schedule. The agents can search more than one search engine, store queries, and update links to relevant sites.
Metasite solutions (MetaCrawler and SavvySearch) are Web sites that let you search a broader range of search engines but with clearer results. MetaCrawler (www.metacrawler.com) has its own list of sites that it will search while SavvySearch (guaraldi.cs.colostate.edu:2000/) lets you choose which group of search tools you want to search.
Frittered away by detail
For building a query, you'll have to know if your users understand Boolean operators or if you have the time to train them. Some tools let users simply type in the concept to find what they need. For the best results, understand how you operate when you build a query. If you're intuitive, the concept search will probably be best. If you're deductive, you're probably comfortable with operators. Should this matter? Well, your own vocabulary is the best tool you have when you search. Know what you want, and make the tools do the work to find it.
When scoring our search results, we were very strict. Who wants to page through 1000 hits on the wrong topic? Getting the right information was not enough: most solutions came up with the same sites in one way or another. The key to solving our problem was organisation, not a fancy interface. We wanted to know what search engines were checked, and we wanted duplicates removed, links updated, and recent changes highlighted so we could find them easily. Basically, we wanted the search tool to give us the most relevant sites at the top of the list.
Because success is in the details, if you get stuck in the middle of a search, the online documentation should help, not hinder. If the only support available was to send e-mail to the Webmaster, then we tracked how long the wait was and whether the problem got fixed. We were disappointed that only one solution had above-average support.
Tools of the trade
Access to a wealth of information is one of the Web's greatest attractions, but the tools that perform searches need more work. They waste a lot of your time as you sift through lists of inaccurate or misleading hits. Vendors should realise the only survivors in the gold rush were those who made money selling picks, axes, and Levi's, not the gold miners themselves. If information is gold, current vendors need to provide better solutions for users to find what they need. The thing is, as users we've been so pleased to get to all that information that we've settled for inadequate tools. It's time to raise our standards and force vendors to improve tools to perform comprehensive searches and deliver accurate hits as well as useful results.
Search engine solutions:
AltaVista - AltaVista Internet Software
Yahoo - Yahoo
Monitoring and search engine solutions:
Smart Bookmarks 3.0 and Excite - FirstFloor Software and Excite
Tierra Highlights2 and Excite - Tierra Communications and Excite
Metasearch software solutions:
Internet FastFind - Symantec
WebCompass 2.0 - Quarterdeck
Metasite solutions:
MetaCrawler - Go2Net
SavvySearch - Colorado State University
The question: Your users are spending too much time not finding the information they need on the Web. For a search tool that's easy for them to use and offers accurate results, should you buy or browse?
The issues: Ease of building a query, quality and accuracy of results, usefulness of online documentation, level of support; cost.
The options: Search engine solutions, monitoring and search engine solutions, metasearch software solutions, metasite solutions.
The answer: Save your money: metasites provide helpful hits and accurate results, and they're free.
Results at a glance
If you're frustrated using the same old search tools, try metasites. We tested four types of search technologies and found the best information with metasites. They forward your queries to many search engines simultaneously, then integrate and organise the results better than the other solutions we tested. Internet FastFind scored high with its monitoring tool, but metasites offer better results overall. And did we mention that they're free?
MetaCrawler (Metasite solution)
Bottom Line 6.4
MetaCrawler stands on the shoulders of giants to fulfil the metasite searching dream. By tapping many search engines at once, it provides consistently above-average and well-organised results. Previously run by the University of Washington, MetaCrawler has been acquired by Go2Net and since then has been in transition. We were forced to lower our scores with the interim version.
Pros: Many query building options; integrates results from its sources
Cons: Results span several pages with ads
Internet FastFind (Metasearch software solution)
Bottom Line 5.8
Internet FastFind's WebFind tool does a decent job of organising hits. The Notify tool only monitors individual pages, not search engines, but it provides many query options. The combination of the search tool and monitoring agent helped this Internet suite, but separately, they would not have scored as high.
Pros: The only solution that organised results by site or page; removes dead links
Cons: Not enough good hits
SavvySearch (Metasite solution)
Bottom Line 5.7
SavvySearch consistently delivered accurate results. However, too many hits on its results page lacked summaries, which are critical to culling out irrelevant hits. Internet traffic slowed it down, too. If not for these crucial weaknesses, SavvySearch could lead the pack.
Pros: Offers many places to search; straightforward interface
Cons: Can't build custom groups
Yahoo (Search engine solution)
Bottom Line 5.0
Yahoo organises sites into a logical hierarchy of categories, and if it covers what you're looking for, you can be assured many good hits. The Yahoo staff catalogues the content, and it's not as inclusive as other engines. But it's thorough with the topics it covers.
Pros: Directory covers general topics
Cons: Unrelated sites show up in stem searches
Smart Bookmarks 3.0 and Excite (Monitoring and search engine solution)
Bottom Line 4.8
Smart Bookmarks used agents to reliably track changes. But Smart Bookmarks doesn't distinguish all changes from appropriate changes, interpreting frequently changing ad links as valid site changes. However, Smart Bookmarks is better than Highlights2 at monitoring conventional pages.
Pros: Excite is the strongest search engine we tested
Cons: Smart Bookmarks doesn't highlight changes
AltaVista (Search engine solution)
Bottom Line 4.8
Once a leading Internet search engine, AltaVista has been eclipsed by innovative metasites and more agile search engines. We suggest the simple query, which is better than advanced searches at placing relevant documents at the top. It has a massive database of Web resources but returns far too many hits.
Pros: Live Topics search tables help reduce irrelevant hits
Cons: Returns far too many hits; ranking is inconsistent
Tierra Highlights2 and Excite (Monitoring and search engine solution)
Bottom Line 4.3
Again, Excite is excellent to identify the sound sites that Highlights2 monitors. Excite has clean summaries and detailed results, but Highlights2 produces false alarms from rotating advertisements and is even less accurate when tracking changes to search engine pages. Its saving grace is that it highlights change, so you're drawn to what's new.
Pros: Highlights changes
Cons: Can't build a reusable search profile; clumsy stacked interface
WebCompass 2.0 (Metasearch software solution)
Bottom Line 3.2
WebCompass has received a great deal of praise, but it's overrated. The best thing that we can say is that it retains and organises queries. We retrained it to try to improve document ranking, but we still got only average results. It takes too long to move through summaries, too.
Pros: Can easily add new resources
Cons: Users can skew ranking; hard-to-use summary
Search engine solutions
Technology summary
Search engine sites that once led the pack, such as AltaVista, are losing ground to more agile competitors that provide superior results and use interfaces that are more forgiving of user queries. And current leaders, such as Excite, are reinventing themselves with specialised content in order to maintain their lead.
Monkey see, monkey find. If a search engine doesn't know about a site, the users of that engine won't know about it, either. Each engine employs software procedures, called spiders, to traverse the World Wide Web in search of new pages. Variances in both the ranking methods and the spiders introduce discrepancies from one search engine to the next.
AltaVista
Although AltaVista deservedly earned accolades when it was first introduced, it has failed to keep up with other more agile solutions. As longtime users of AltaVista, we were most surprised by the disparity in results between its simple and advanced searches. In the simple - or concept - search, users enter a list of terms to describe the topics they are looking for. The advanced search, on the other hand, is more rigid: it uses complex Boolean expressions (and, or, not, near, and parentheses).
Simplify, simplify, simplify
Generally, the simple search returned much more accurate results for our test queries because it ranked results more effectively. We recommend the simple search as the best way to use AltaVista. Its free-form structure is more accessible to the average business person than the advanced search screen. For advanced queries, AltaVista either does not rank the results at all or ranks them badly. Users can compensate by manually entering several terms from their query as ranking criteria, which pushes the matching documents to the top.
Live Topics is a feature that helps modify queries and cull out unwanted documents. It displays related search terms that can be included or excluded in the found documents. This feature is helpful in moderation, but if you check too many terms in one search, you risk excluding otherwise relevant hits.
AltaVista ranks its results but doesn't display confidence scores, which would provide more feedback. Most of the other solutions at least try to do this.
We were also irritated by AltaVista's inconsistent treatment of duplicates. At times, it recognised duplicates and displayed them together beneath a common summary. At other times, we found occurrences where more than one hit pointed to the identical URL and document.
Yahoo
Yahoo is an Internet directory whose superior organisation and high relevancy ranking are the result of 50 librarians rather than silicon brawn. But that's also its greatest weakness.
It organises sites in a hierarchy of categories, with some sites cross-referenced. Most sections of Yahoo are organised in an intuitive manner, such as "news and media: newspapers: regional: US States" and so on.
Working with Yahoo can be a grab bag. If Yahoo covers the topic you seek, you can be assured of many relevant hits. But if it doesn't cover your topic, you can come up empty or sorely lacking. Yahoo's a solid way to begin your search: just bear in mind that as good as it is, it's far from comprehensive.
If you need to search within Yahoo, drilling down for other topics can be less organised and a bit awkward. It's best to build a search of only a few terms; entering too many terms causes Yahoo's internal search to come up empty. It will then search AltaVista, and you won't derive any benefit from Yahoo.
It also treats all terms as stems, so the word "bald" is sought as "bald*" and captures unrelated hits such as "Baldy" and "Baldwin".
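The stem behaviour described here can be illustrated with a minimal sketch. It assumes a stem search is nothing more than a prefix match on each word; we don't know Yahoo's actual matching rules, and the function names and sample titles are hypothetical.

```python
import string


def words_of(title):
    """Split a title into lowercase words with surrounding punctuation removed."""
    return [w.strip(string.punctuation).lower() for w in title.split()]


def stem_match(term, titles):
    """Treat `term` as `term*`: keep titles with any word that starts with it."""
    prefix = term.lower()
    return [t for t in titles if any(w.startswith(prefix) for w in words_of(t))]


def exact_match(term, titles):
    """Keep only titles that contain `term` as a whole word."""
    target = term.lower()
    return [t for t in titles if target in words_of(t)]


# Hypothetical directory entries, used for illustration only.
TITLES = [
    "Bald eagle habitats",
    "Mount Baldy ski report",
    "Alec Baldwin filmography",
    "Eagle nesting sites",
]

stem_hits = stem_match("bald", TITLES)    # also picks up Baldy and Baldwin
exact_hits = exact_match("bald", TITLES)  # only the bald eagle entry
```

Comparing the two result lists shows why a stem search widens recall at the cost of unrelated hits.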
Point of no return
If Yahoo's ranking were better, sites that included more of the search terms would appear first, but they don't. The Options link near the search button is designed to narrow or expand a search, but it's not terribly useful. We were unprepared for the number of links that pointed to nonexistent Web pages.
Other solutions, such as Internet FastFind, were better at removing dead links. Yahoo should have an automated process to check on the health of its links.
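An automated link check of the kind we have in mind could look roughly like the sketch below. It is not how Yahoo or Internet FastFind works internally; the function names are hypothetical, and the rule that any status below 400 counts as a live link is our assumption.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def prune_dead_links(urls, status_of=http_status):
    """Split `urls` into (live, dead) lists using the supplied status checker."""
    live, dead = [], []
    for url in urls:
        status = status_of(url)
        if status is not None and status < 400:
            live.append(url)
        else:
            dead.append(url)
    return live, dead


# Offline demonstration with a stub checker instead of real network calls.
KNOWN_STATUS = {
    "http://example.com/": 200,
    "http://example.com/gone": 404,
    "http://unreachable.invalid/": None,
}
live_links, dead_links = prune_dead_links(list(KNOWN_STATUS), status_of=KNOWN_STATUS.get)
```

Passing the checker in as a parameter keeps the pruning logic testable without a network connection. A scheduled job could call `prune_dead_links` with the default checker.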
Monitoring and search engine solutions
There's still a place in your search arsenal for monitoring tools. Once you find that relevant site - we teamed these tools with the Excite search engine - you'll want to stay on top of it. For example, if you want to instantly know when a competitor's Web site changes, you can create an agent to monitor it. Monitoring software acts as an intermediary between the Internet and the user by watching the sites that users select.
News or noise? However, monitoring agents report even insignificant changes, including advertisement changes, and some sites change frequently. It's important that users choose sites and schedules (hourly, daily, weekly) carefully, or blinking icons will alert them to every change. We were also intrigued with the prospect of defining agents to monitor pages from search engine queries. But the software didn't fulfill its promise. Periodically, it submitted saved queries to search engines to determine if there were new links, but the results were inconsistent.
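To show why advertisement changes trigger false alarms, and one way an agent might avoid them, here is a minimal sketch. It assumes pages wrap their ads in `<!-- ad -->` and `<!-- /ad -->` comment markers. That is a hypothetical convention chosen for illustration, and neither product we tested is known to work this way.

```python
import hashlib
import re

# Hypothetical ad markers; real pages would need site-specific rules.
AD_BLOCK = re.compile(r"<!--\s*ad\s*-->.*?<!--\s*/ad\s*-->", re.DOTALL | re.IGNORECASE)


def fingerprint(html, ignore_ads=True):
    """Hash the page text, optionally with ad blocks stripped out first."""
    if ignore_ads:
        html = AD_BLOCK.sub("", html)
    normalised = " ".join(html.split())  # collapse whitespace differences
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def has_changed(old_html, new_html, ignore_ads=True):
    """Report whether two snapshots of a page differ in a way worth an alert."""
    return fingerprint(old_html, ignore_ads) != fingerprint(new_html, ignore_ads)


# Three hypothetical snapshots of the same competitor page.
MONDAY = "<h1>Prices</h1><!-- ad -->Buy widgets<!-- /ad --><p>Widget: $10</p>"
TUESDAY = "<h1>Prices</h1><!-- ad -->Fly cheap<!-- /ad --><p>Widget: $10</p>"
WEDNESDAY = "<h1>Prices</h1><!-- ad -->Fly cheap<!-- /ad --><p>Widget: $12</p>"
```

With `ignore_ads=False`, the Monday and Tuesday snapshots differ and would raise an alert, even though only the banner rotated. With the default setting, only the Wednesday price change counts.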
Smart Bookmarks 3.0 and Excite
We paired Smart Bookmarks with the Excite search engine. We liked Smart Bookmarks better than Highlights2, but we still didn't feel that we were working smarter.
In contrast with AltaVista, Excite offers one intelligent search screen that processes both concept and complex searches. And with Excite's solid ranking, its accuracy doesn't falter nearly as much as AltaVista's when switching between the types of queries. Excite identified the sites we wanted to track, and Smart Bookmarks let us bookmark those sites and monitor them. The split-screen interface includes folders to organise and group queries on the left; hits and sites are on the right. We liked being able to organise these folders. The Results button instantly organised our results; it listed all the updated sites, so we didn't have to move through one folder at a time. Instead - bam! - they were listed right there for us. This feature is unique to Smart Bookmarks.
Preset agents, called Who's on First, are intended as examples if you install Smart Bookmarks on a company's intranet to monitor work-related sites. FirstFloor pushes the content from sites of general interest to keep in touch with its customers. But unless you deactivate the default sites, you'll get updates on things you never requested (such as gardening tips). You can click on the agent to grey it out and disable it. But FirstFloor received so many complaints, it posted instructions on its site to delete the agents.
Once the results are organised, you can click on the site name to launch your Web browser. We wished Highlights2 could do this. You can also place a local copy of the site on your hard drive. The site can be dynamically linked so that changes will show up locally when Smart Bookmarks updates results.
Smart Bookmarks assigns agents simply. You can drag a URL to Smart Bookmarks, and the site can be dropped directly into an existing agent. This one-step process made monitoring our sites easy.
Although Smart Bookmarks is better than Highlights2, too many false alarms from default sites lowered its score.
Tierra Highlights2 and Excite
We also paired Excite with Highlights2. But Highlights2 added little to the powerful search engine. Once relevant sites were located by Excite, we created assistants (Tierra's term for agents) to monitor those sites. We used Netscape Navigator's link feature to drag the site into Highlights2, and an assistant window popped up. We still had to do some work to set up the assistant for the site.
All assistants are stacked on the control bar and can be arranged to show changed sites at the top, but the vertical list is still long and cumbersome. When we had more than 18 assistants, scrolling through them was time-consuming.
Once a change in a site is found, Highlights2 places a check mark in the assistant label. An audible beep is optional for high-priority changes. To bring up the site on the local drive, we simply double-clicked on the assistant label. Unfortunately, Highlights2 acts like an offline browser, so there is no way to go directly to the site from the assistant. Highlights2 also comes with several default assistants, but they can easily be changed or deleted. Site changes are highlighted, which saves time when looking for them. However, Highlights2 notified us even when the change was an advertisement. Tierra's technical support confirmed that this is a bug, and the company is currently working on it. Tierra was the only vendor in this comparison to offer above-average technical support.
We wish more vendors did the basics as well.
The highlighted changes caught our eye, but the query-building and monitoring capabilities leave much to be desired.
Metasearch software solutions
We expected stronger results from the metasearch client software; instead, we were disappointed by their weak performance. What good is the ability to organise queries from multiple search engines without solid hits? After all, these products submit your query to many Internet search engines simultaneously.
PC power? The metasearch solutions should harness the processing power on your desktop to relentlessly analyse and organise the returned hits, but the hits that are returned are mediocre. If a user wants 50 hits, for example, these products should spend whatever time it takes to visit 500 sites to return the top 50 available. Metasites use similar techniques but post consistently more accurate results. There's no reason why metasearch software can't do the same and add the strength of your PC to store, manage, and analyse hits in a well-formatted summary.
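The over-fetch idea (visit many candidate pages, keep only the best few) can be sketched as follows. The relevance measure here is a plain count of query-term occurrences, which is our simplification and not the scoring any tested product uses. The function names and sample pages are hypothetical.

```python
import heapq


def relevance(text, terms):
    """Count how many times the query terms occur in the page text."""
    words = text.lower().split()
    return sum(words.count(term.lower()) for term in terms)


def top_hits(pages, terms, wanted):
    """Score every fetched page and return the `wanted` best (url, score) pairs.

    `pages` maps each URL to the text already downloaded for it, so the
    caller can fetch far more candidates than it finally shows the user.
    """
    scored = [(url, relevance(text, terms)) for url, text in pages.items()]
    scored = [pair for pair in scored if pair[1] > 0]  # drop pages with no match
    return heapq.nlargest(wanted, scored, key=lambda pair: pair[1])


# Hypothetical candidate pages standing in for the 500 sites visited.
PAGES = {
    "http://example.com/rush": "gold mining gold rush",
    "http://example.com/silver": "silver mining",
    "http://example.com/recipes": "cooking recipes",
    "http://example.com/prices": "gold prices gold",
}
best = top_hits(PAGES, ["gold", "mining"], wanted=2)
```

Because the scoring runs on the desktop, the number of candidates can grow without adding load to the search engines beyond the initial queries.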
Internet FastFind
Internet FastFind is a suite of tools designed to help navigate the Internet; we tested only the search agent (WebFind) and the monitoring tool (Notify) to solve our business problem.
Learning to use WebFind took no time at all. We could change the search time and number of hits that we wanted returned. WebFind was the only one with a configuration tool as concise as MetaCrawler's, which boosted its score. It also sorts results by page or by site. One problem was that certain search engines returned no hits, but we knew from experience that there were relevant sites on that engine. Symantec is fixing this bug.
The Notify tool alerts you to changes in monitored sites when an icon flashes in the taskbar. Click the icon and a pop-up window appears on your screen to tell you what site has changed and to ask if you want to open it. None of the other solutions had this feature. Also, if there is a problem with a link to a monitored page, Notify will alert you. Internet FastFind discards invalid links to sites that it has found. Although this process took a few extra minutes immediately after the search results were found, it saved us time because all the sites we visited had active links. In terms of technical support, PC Anywhere and Norton's Utilities and AntiVirus products have a monopoly on the technical support line: there's no room for Internet FastFind.
Taken separately, WebFind and Notify return average results, but the combination, unlike the other solutions we tested, makes a more powerful tool.
WebCompass 2.0
Based on our research, WebCompass seemed like just the tool we needed. But every time we thought we were getting closer to the information we wanted, we'd run into a wall. WebCompass turned out to be a maze.
The three-pane screen was a familiar interface: query terms were on the left, hits appeared on the right, and summaries of the hits were on the bottom. But it wasn't easy to build a query. It took several searches that returned irrelevant information to learn that topic names were used as query terms. After adjusting our searches, we got better results. Also, you can't count the number of hits without opening the browser.
In the document list panel, we clicked column headings to rearrange sites, search engines, and results alphabetically or numerically, which we found useful. The summary screen wasn't as helpful: we wanted to dump it to gain more room. In the summary, the text is small and difficult to read and includes keywords from the document, which we found to be inconsistent and random. We were forced to open myriad documents to determine their relevancy.
Waste not, want not
Irrelevant documents were the bane of our searches.
WebCompass adjusts the ranking of documents based on whether the reader opens the site. It does this to train the software to recognise useful sites. Sites that are deleted indicate a negative response, so the software won't choose such sites again. But for us, if a summary didn't provide enough information to decide what to open, we inadvertently changed the ranking by opening the sites.
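The training behaviour described above, and the way it can be skewed, can be modelled with a small sketch. This is our own simplified model, not WebCompass's algorithm: the class name, the boost for opening a site, and the penalty for deleting one are all assumed values.

```python
class FeedbackRanker:
    """Re-rank hits using a per-site adjustment learned from user actions."""

    def __init__(self, open_boost=1.0, delete_penalty=5.0):
        self.open_boost = open_boost          # assumed reward for opening a site
        self.delete_penalty = delete_penalty  # assumed cost of deleting a site
        self.adjustment = {}

    def record_open(self, url):
        """Opening a site is read as a sign that the site is useful."""
        self.adjustment[url] = self.adjustment.get(url, 0.0) + self.open_boost

    def record_delete(self, url):
        """Deleting a site is read as a negative response."""
        self.adjustment[url] = self.adjustment.get(url, 0.0) - self.delete_penalty

    def rank(self, base_scores):
        """Return URLs ordered by base score plus learned adjustment."""
        def adjusted(url):
            return base_scores[url] + self.adjustment.get(url, 0.0)
        return sorted(base_scores, key=lambda url: (-adjusted(url), url))


# Hypothetical hits with the scores the search engines assigned.
BASE = {"a.example": 2.0, "b.example": 1.5, "c.example": 1.0}
ranker = FeedbackRanker()
before = ranker.rank(BASE)
ranker.record_open("b.example")   # opened only because the summary was unclear
after_open = ranker.rank(BASE)
```

In this model, one exploratory click on `b.example` is enough to lift it above `a.example`, which mirrors how we skewed the ranking without meaning to.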
The results improved once we trained the software.
One good feature is that WebCompass retains an unlimited number of queries and searches, whereas Internet FastFind only retains your last 10 results. After an exhausting trip with WebCompass, we weren't any closer to the right information.
If you've got a good sense of direction, you don't need it.
Metasite solutions
Metasites draw on the strength of many search engines and are the most compelling way to search the Internet. They promise wider Internet coverage and deliver reliable results. And as if that weren't enough, they're also free.
No assembly required. These sites accept a user query, submit it to many search engines in parallel, then organise and display the results. In other words, they do the work for you. MetaCrawler and SavvySearch allowed us to choose the number of hits we could manage and found useful new sites that the other solutions missed. This process at once leverages the strength of relevant search engines while compensating for weak ones. Plus it eliminated duplicates from multiple engines - if only all the solutions could do this! Our ideal site would combine MetaCrawler's organisational strength with SavvySearch's consistently high hit rates.
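The process described above (submit one query to several engines in parallel, merge the results, drop duplicates) can be sketched as follows. The two engines are stub functions returning canned hits, since we can't reproduce the real services. Summing the engines' scores to rank merged hits is our assumption, not MetaCrawler's or SavvySearch's documented method.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def normalise(url):
    """Reduce trivially different URLs (host case, trailing slash) to one form."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def metasearch(query, engines, max_hits=10):
    """Query every engine in parallel, merge duplicates, return the top hits."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(search, query) for name, search in engines.items()}
        results = {name: future.result() for name, future in futures.items()}

    merged = {}
    for name, hits in results.items():
        for url, score in hits:
            entry = merged.setdefault(normalise(url), {"score": 0.0, "sources": []})
            entry["score"] += score       # assumed: agreement between engines adds up
            entry["sources"].append(name)

    ranked = sorted(merged.items(), key=lambda item: (-item[1]["score"], item[0]))
    return ranked[:max_hits]


# Stub engines with canned results, used in place of real search services.
def engine_one(query):
    return [("http://Example.com/gold/", 0.9), ("http://example.org/silver", 0.4)]


def engine_two(query):
    return [("http://example.com/gold", 0.8), ("http://example.net/copper", 0.5)]


ENGINES = {"engine_one": engine_one, "engine_two": engine_two}
hits = metasearch("gold", ENGINES)
```

The page both engines returned appears once in `hits`, with both sources recorded. That is the behaviour we wished every solution offered.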
MetaCrawler
Just when we thought it was safe to search the World Wide Web, our favourite site got a facelift. Created as a project at the University of Washington, MetaCrawler was purchased in February by Go2Net. During the current transition from freeware to a commercial vehicle, some of our favourite features have been redesigned or put on hold.
Building a query is simple: search any or all keywords or search a phrase (no operators are necessary) and go. But we really liked the options for configuring the search: we set a filter by country to search, the number of results per page or per source, and a time-out. We preferred the way the site used to let us set the configuration for all searches during our testing. The vendor confirmed that this option would return. We also couldn't save our queries for later use, which we had been able to do before. But this is a quibble: no site offers this feature.
As we've said, presentation is everything when trying to decide what hits to read. In the old version, a Java applet popped up a pie chart of search engines checked, engines without hits, and the total number of results. We had a visual map to assure us that the search was thorough, and the hits were easy to sort through. All the hits were on one page to make scanning and opening hits fast and easy.
Now the interface is redesigned, the server hardware is upgraded, code is enhanced, and banner advertisements are on several results pages. Either way, MetaCrawler has always integrated the results from multiple sources, and you still get an update of resources checked.
Have the changes broken the winner? No, but the transition is a good reminder that the Web is a mercurial environment. In fact, MetaCrawler is faster and cleaner than before. But most important, we trust that MetaCrawler finds the best sites that it has access to. Now it's the first place we go to search.
SavvySearch
No advertisements, no long waits, no flashy animation - SavvySearch's interface is plain, and we trusted its results. Currently maintained by Colorado State University, SavvySearch has also changed recently. Be careful: there is now a site being built at www.savvysearch.com that has no affiliation with the SavvySearch site we tested.
There are configuration options to search all or any terms and search a phrase; to set the number of results to be retrieved; to display the summary as brief, normal, or verbose; to integrate results; and to set the language you want. There's even an experimental interface that groups Web resources for you to search into resources, indexes, directories, Usenet, reference tools, tech reports, and entertainment.
Getting access to these databases is great, but there's one drawback that held back SavvySearch's score: it groups databases into subgroups, so you can only search a few engines at a time. The other drawback is that when the Internet itself is busy, SavvySearch slows down as it goes to all the search engines.
The search-results page doesn't have an applet or fancy interface: it lists the hits grouped by the source. The results also include links to the pages or the search engine source. The summaries come from the Web sites themselves, and if there isn't a summary on the site, SavvySearch doesn't provide one.
This makes it difficult to tell if the site is worth visiting. We also kept wondering whether the results were really the best available. If we could set which search engines we wanted to search or if we knew that SavvySearch brought back the best information from all of them, we would have raved. Currently, however, better results need to be collected through several searches, which loses the benefit of integrated searches.
Still, like MetaCrawler, SavvySearch found valuable sites that the other solutions missed. If you can do some of the legwork yourself, SavvySearch is a resourceful tool.