Disentangling the Structure of Tables in Scientific Literature

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Within the scientific literature, tables are commonly used to
present factual and statistical information in a compact form that is easy
for readers to digest. The ability to “understand” the structure of tables is
key for information extraction in many domains. However, the complexity
and variety of presentation layouts and value formats make it difficult to
automatically extract the roles and relationships of table cells. In this paper,
we present a model that structures tables in a machine-readable way and
a methodology to automatically disentangle and transform tables into the
modelled data structure. The method was tested in the domain of clinical
trials: it achieved an F-score of 94.26 % for cell function identification and
94.84 % for identification of inter-cell relationships.

Bibliographical metadata

Original language: English
Title of host publication: Natural Language Processing and Information Systems
Subtitle of host publication: 21st International Conference on Applications of Natural Language to Information Systems, NLDB 2016, Salford, UK, June 22-24, 2016, Proceedings
Place of Publication: Switzerland
Publisher: Springer Nature
Pages: 162-174
Number of pages: 13
Volume: 9612
ISBN (Electronic): 978-3-319-41754-7
ISBN (Print): 978-3-319-41753-0
Publication status: Published - 17 Jun 2016
Event: 21st International Conference on Applications of Natural Language to Information Systems - Media City, Salford, United Kingdom
Event duration: 22 Jun 2016 to 24 Jun 2016
Conference number: 21
http://www.salford.ac.uk/conferencing-at-salford/conference-management/current-conference/nldb-conference

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 9612

Conference

Conference: 21st International Conference on Applications of Natural Language to Information Systems
Abbreviated title: NLDB 2016
Country: United Kingdom
City: Salford
Period: 22/06/16 to 24/06/16