Abstract
The World Wide Web is a major source of textual information in human-readable, semi-structured formats, covering multiple domains, some of them highly complex. Traditional ETL approaches, which develop specific source code for each data source and rely on repeated interactions between domain experts and computer-science experts, are an inadequate solution: time-consuming and error-prone. This paper presents a novel approach to ETL based on its decomposition into two phases: ETD (Extraction, Transformation and Data Delivery) and IL (Integration and Loading). The ETD proposal is supported by a declarative language for expressing ETD statements and a graphical application for interacting with the domain expert. Applying ETD requires mainly domain expertise, while computer-science expertise is concentrated in the IL phase, which links the processed data to target system models; this enables a clearer separation of concerns. The paper also describes how ETD has been integrated, tested and validated in a space-domain project currently operational at the European Space Agency for the Galileo Mission.
Original language | English |
---|---|
Title of host publication | ICEIS 2007: PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS: DATABASES AND INFORMATION SYSTEMS INTEGRATION |
Editors | J. Cardoso, J. Cordeiro, J. Filipe |
Publisher | INSTICC-INST SYST TECHNOLOGIES INFORMATION CONTROL & COMMUNICATION |
Pages | 199-205 |
Number of pages | 7 |
ISBN (Print) | 978-972-8865-88-7 |
Publication status | Published - 1 Dec 2007 |
Event | 9th International Conference on Enterprise Information Systems, ICEIS 2007 - Funchal, Madeira, Portugal. Duration: 12 Jun 2007 → 16 Jun 2007 |
Conference
Conference | 9th International Conference on Enterprise Information Systems, ICEIS 2007 |
---|---|
Country/Territory | Portugal |
City | Funchal, Madeira |
Period | 12/06/07 → 16/06/07 |
Keywords
- Declarative language
- ETD
- ETL
- IL
- Semi-structured
- Text files