ARN

Web Content Mining with Java - Techniques for Exploiting the World's Biggest Information Resource

Unlock the potential of the world's biggest database.

This practical book shows you how to build portals, search engines and other knowledge-based applications that mine the information you need from the Web.

* Written by a developer for developers
* A practical, hands-on approach
* Illustrates how Java and associated tools (XML, HTML) can be combined with database technology to display and manipulate Web-derived information more effectively
* Demonstrates how to build a structure browser, a portal and a meta-search engine, and how to make 'Talking Pages'

Table of Contents

Preface.

About the Author.

Acknowledgements.

Surveying the Scene

Language of the Web

HTML and XML Parsing

Data Filters and Structured Queries

Building a Portal with Java

Building a Search Engine with Java

Mail Mining with Java

Introduction to Text Mining

Introduction to Data Mining

Loose Ends and Looking Ahead

Appendix A: Software Installation and Configuration

Appendix B: Javadoc Extracts

Appendix C: Earlier Versions of JAXP

Appendix D: License and Copyright Statements

Appendix E: Census 1891 Data XML

Appendix F: Share Price Cluster Data

Appendix G: Glossary of Acronyms

References

Further Reading

Index