Building Intelligent Web Applications Using Lightweight Wrappers

Penn collection
Database Research Group (CIS)
Subject
web
XML
information extraction
wrappers
Author
Sahuguet, Arnaud
Azavant, Fabien
Abstract

The Web has so far been remarkably successful at delivering information to human users. It has been so successful, in fact, that there is now an urgent need to go beyond human browsing. Unfortunately, the Web is not yet a well-organized repository of nicely structured documents but rather a conglomerate of volatile HTML pages. To address this problem, we present the World Wide Web Wrapper Factory (W4F), a toolkit for generating wrappers for Web sources. It offers: (1) an expressive language for specifying the extraction of complex structures from HTML pages; (2) a declarative mapping to various data formats such as XML; and (3) visual tools that make the engineering of wrappers faster and easier.
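
To make the idea of a wrapper concrete, the sketch below shows the two stages the abstract names, extraction from HTML and mapping to XML, using only the Python standard library. It is an illustration under stated assumptions, not W4F itself: it does not use W4F's extraction language or mapping rules, and the names TitleLinkExtractor, wrap, SAMPLE_HTML, and the output element names (page, title, links, link) are hypothetical. Page retrieval is left out; the HTML is supplied as a string.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TitleLinkExtractor(HTMLParser):
    """Extraction stage: collect the page title and every hyperlink."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []            # list of (href, anchor text) pairs
        self._in_title = False
        self._current_href = None  # href of the <a> element being read
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._current_href = href
                self._current_text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._current_href is not None:
            text = "".join(self._current_text).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def wrap(html: str) -> ET.Element:
    """Mapping stage: turn the extracted values into an XML tree."""
    extractor = TitleLinkExtractor()
    extractor.feed(html)
    extractor.close()

    root = ET.Element("page")
    ET.SubElement(root, "title").text = extractor.title.strip()
    links = ET.SubElement(root, "links")
    for href, text in extractor.links:
        link = ET.SubElement(links, "link", href=href)
        link.text = text
    return root


# Hypothetical input page, used only for illustration.
SAMPLE_HTML = """<html><head><title>Movie list</title></head>
<body><a href="/m/1">Casablanca</a> <a href="/m/2">Vertigo</a></body></html>"""

if __name__ == "__main__":
    print(ET.tostring(wrap(SAMPLE_HTML), encoding="unicode"))
```

A hand-written wrapper like this hard-codes both the extraction logic and the output shape, so it must be rewritten whenever the page layout changes. That maintenance cost is what a toolkit with a declarative extraction language and mapping aims to reduce.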

Publication date
2001-03-01
Comments
Postprint version. Published in Data and Knowledge Engineering, Volume 36, Issue 3, 2001, pages 283-316. Publisher URL: http://dx.doi.org/10.1016/S0169-023X(00)00051-3