Database Research Group (CIS)

Title

Building Intelligent Web Applications Using Lightweight Wrappers

Document Type

Journal Article

Date of this Version

March 2001

Comments

Postprint version. Published in Data and Knowledge Engineering, Volume 36, Issue 3, 2001, pages 283-316.
Publisher URL: http://dx.doi.org/10.1016/S0169-023X(00)00051-3

Abstract

The Web has so far been incredibly successful at delivering information to human users. So successful, in fact, that there is now an urgent need to go beyond a browsing human. Unfortunately, the Web is not yet a well-organized repository of nicely structured documents but rather a conglomerate of volatile HTML pages. To address this problem, we present the World Wide Web Wrapper Factory (W4F), a toolkit for generating wrappers for Web sources that offers: (1) an expressive language to specify the extraction of complex structures from HTML pages; (2) a declarative mapping to various data formats like XML; (3) visual tools to make the engineering of wrappers faster and easier.
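
As a rough illustration of what a wrapper does (extract structure from an HTML page, then map it to a format such as XML), here is a minimal sketch in Python using only the standard library. It does not use W4F or its extraction language; the class name MovieListParser, the function wrap, the sample page, and the output element names are all hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class MovieListParser(HTMLParser):
    """Hypothetical extractor: collects the <h1> text and every <li> text."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose text is currently being captured
        self.title = ""
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "li"):
            self._current = tag
            if tag == "li":
                self.items.append("")  # start a new, empty item

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "h1":
            self.title += data
        elif self._current == "li":
            self.items[-1] += data


def wrap(html: str) -> str:
    """Extraction step followed by a fixed mapping to an XML string."""
    parser = MovieListParser()
    parser.feed(html)
    parser.close()
    root = ET.Element("page", {"title": parser.title.strip()})
    for text in parser.items:
        ET.SubElement(root, "item").text = text.strip()
    return ET.tostring(root, encoding="unicode")


# Assumed sample input, standing in for a page fetched from a Web source.
SAMPLE_HTML = (
    "<html><body><h1>Films</h1>"
    "<ul><li>Alien</li><li> Brazil </li></ul></body></html>"
)

print(wrap(SAMPLE_HTML))
```

In this sketch the extraction rules are hard-coded in the parser class and the XML mapping is fixed inside wrap. The abstract describes a toolkit in which both are instead specified declaratively, so the code above should be read only as an analogy for the two stages.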

Keywords

web, XML, information extraction, wrappers

Date Posted: 08 June 2007

This document has been peer reviewed.