New approaches for extracting heterogeneous reference data

  • Start: May 15, 2023 01:00 PM (Local Time Germany)
  • End: May 16, 2023 06:15 PM
  • Location: mpilhlt & online

Monday 15 May 2023

13:00-13:30 Opening

13:30-14:50 Presentations 1

Aleksandra Kaye/Bernardo S. Buarque/Malte Vogl/Raphael Schlattmann (ModelSEN project, MPI for History of Science): Socio-epistemic networks of Polish migrants in Latin America – a challenge for heterogeneous reference extraction

Raphael Schlattmann/Malte Vogl/Aleksandra Kaye/Bernardo S. Buarque (ModelSEN project, MPI for History of Science): Automated construction of historical semiotic networks – Can person-names within textual contexts be mapped to structured references?


14:50-15:00 Coffee Break

15:00-16:20 Presentations 2

Johannes Riedl (University Library Tübingen): A Description of the Work in Progress of the Transformation of the Handbuch der Keilschriftenliteratur

Victor Westrich (Academy of Sciences and Literature, Mainz): Extraction of primary and secondary sources from the Regesta Imperii

16:20-16:30 Coffee Break

16:30-18:15 Presentations 3

Kilian Lüders/Bent Stohlmann (HU Berlin): Extraction of string-citations in court data

Thiago Freitas Hansen/Rafael Castro Andrade (Universidade Federal do Paraná/Positivo University, Curitiba): Words and Social Rights in Brazil (1933-1941)

Will Hanley/J. Hernandez/M. Nagales/B. Goff (Florida State University): Citation Analysis of International Law Journals, 1869-1939


18:15-18:45 Break

18:45-19:30 Discussion of Whitepaper idea

Tuesday 16 May 2023

10:00-11:20 Presentations 4

Diego de la Hera/Dominic Dall’Osto (Wikimedistas Calamuchita/Univ. of Zurich): Using Cita to support reference extraction workflows from Zotero

Julia Hütten/Jeff Huang (mpilhlt/Brown University): Automatically characterizing the heterogeneous interpretations of a citation for incorporating into a hosted dataset


11:20-11:40 Coffee Break

11:40-13:00 Presentations 5

Christian Boulanger (mpilhlt): Order from Chaos: Potential and Limits of CRF-based Reference Extraction from Footnotes

Andreas Wagner (mpilhlt): Obtaining Training Data - Different Tasks, Different Options


13:00-14:30 Lunch Break

14:30-15:50 Presentations 6

Olga Pagnotta/Silvio Peroni (University of Bologna): Measuring the Performances of AnyStyle and Grobid against a Gold Standard

Anastasiia Iurshina/Tobias Backes/Ahsan Shahid/Philipp Mayer (Univ. of Stuttgart): Outcite: end-to-end reference processing pipeline


15:50-16:15 Coffee Break

16:15-18:00 How to go forward?

18:00-18:15 Closing
