TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition
Conference paper, 2022

Abstract

Growing interest in time-domain astronomy over the last decades has substantially increased the number of published observation reports, saturating astrophysicists' capacity to read, analyze, and classify this information. Because detected astronomical events are short-lived, information characterizing new phenomena must be communicated and analyzed very rapidly so that other observatories can react and conduct follow-up observations. This paper introduces TDAC, a Time-Domain Astrophysics Corpus. TDAC is the first corpus based on astrophysical observation reports. We also present our NLP experiments on named entity recognition, based on our own annotations and on annotations from the WIESP DEAL shared task.

Main file: Alkan_WIESP2022.pdf (356.24 KB)
Origin: Publisher files allowed on an open archive
Licence: CC BY - Attribution

Dates and versions

hal-04046837, version 1 (26-03-2023)

Identifiers

  • HAL Id: hal-04046837, version 1

Cite

Atilla Kaan Alkan, Cyril Grouin, Fabian Schüssler, Pierre Zweigenbaum. TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition. Workshop on Information Extraction from Scientific Publications, Nov 2022, Taipei (Online), Taiwan. ⟨hal-04046837⟩