CENL News

27th April 2023

Network Group “AI in Libraries” Webinars 2023

In May and June 2023, the CENL “AI in Libraries” Network Group will host three webinars on various uses of Artificial Intelligence (AI) in national libraries.

The online events last approximately 45 minutes each.

For more information, please see below or contact the chair of the group, Jean-Philippe Moreux of the National Library of France, at jean-philippe.moreux@bnf.fr.

 

Automatic classification of subject groups of e-dissertation

Marcel Gygli

The National Library of Switzerland receives from university libraries one copy of each dissertation produced in Switzerland, a large proportion of which are now digital. Each dissertation has to be classified into one of approximately 100 subject groups. The aim of this project is to test open source algorithms that classify the theses automatically. It is analogous to projects at the German National Library and the National Library of Finland.

 

Friday, 12 May 11:00 CET

Download presentation: Automatic classification of subject groups of e-dissertation

External link: Video recording

 

Fine grained language identification in multilingual corpus with OCR errors

Yves Maurer


The National Library of Luxembourg has digitized more than 1.1 million pages of newspapers from the 19th and 20th centuries, processed them with OCR and is making them available to users through eluxemburgensia.lu. Luxembourg is a multilingual country and the newspapers reflect that: most titles print articles in several languages, even on the same page, and readers are simply expected to know each language. To use search or machine learning algorithms effectively, it is important to know the language of each article. The presentation describes how we identified 18 different languages in the 8 million articles, ranging from dominant languages such as French and German, which account for most articles, to rare cases such as Latin or Esperanto, using a combination of existing models and dictionaries.
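As a rough illustration of the dictionary side of such an approach, the sketch below counts how many tokens of an article appear in small per-language word lists and picks the language with the most hits. It is not the method used by the National Library of Luxembourg: the word lists, the `identify_language` function and the `min_hits` threshold are all assumptions made for this example. Counting dictionary hits tolerates some OCR noise, because corrupted tokens simply fail to match and only a few intact common words are needed.

```python
import re
from collections import Counter

# Tiny illustrative word lists (hypothetical). A real system would rely on
# full dictionaries and trained language identification models.
STOPWORDS = {
    "fr": {"le", "la", "les", "et", "de", "des", "un", "une", "est", "dans"},
    "de": {"der", "die", "das", "und", "ist", "ein", "eine", "nicht", "mit", "von"},
    "lb": {"dat", "ass", "vun", "net", "fir", "mat", "op", "eng", "och", "vum"},
}


def identify_language(text, min_hits=2):
    """Return the language code with the most dictionary hits.

    Returns None when fewer than `min_hits` tokens match any list, so that
    very short or heavily corrupted articles are left unlabelled.
    """
    # Keep runs of letters only; digits and punctuation are dropped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    hits = Counter()
    for lang, words in STOPWORDS.items():
        hits[lang] = sum(1 for tok in tokens if tok in words)
    lang, count = hits.most_common(1)[0]
    return lang if count >= min_hits else None


french = identify_language("Le chat est dans la maison et les enfants")
german = identify_language("Der Hund ist nicht mit der Katze von nebenan")
luxembourgish = identify_language("Dat ass net fir den Hond")
noise = identify_language("xqz 1234 ###")
```

In practice, a fallback such as this would typically be combined with a statistical model, with the dictionary deciding only when the model is unsure.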

Tuesday, 30 May 14:00 CET

Download presentation: Fine grained language identification in multilingual corpus with OCR errors

External link: Video recording

 

Image classification and image retrieval using CLIP models

Jean-Philippe Moreux (BnF), Javier De La Rosa (NLB), Horace Lee (University of Oxford)


The CLIP model (OpenAI, 2021) learns visual concepts from natural language supervision. It has been trained on a large dataset of (image, text) pairs. It can be instructed in several natural languages and can be applied to any visual classification or retrieval use case by providing the names of the visual categories to be recognized. The National Library of France, the National Library of Norway and the University of Oxford will present experiments and applications leveraging CLIP.
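The classification step described above can be sketched in a few lines. CLIP maps an image and each category name to vectors in a shared space; the category whose text vector is most similar to the image vector wins. The sketch below shows only that scoring step, with cosine similarity followed by a softmax. The three-dimensional vectors are made-up stand-ins for real CLIP embeddings, and `zero_shot_classify`, the prompts and the `scale` value of 100 are assumptions for illustration, not the presenters' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_vec, label_vecs, scale=100.0):
    """Turn image-to-label similarities into probabilities with a softmax."""
    sims = {label: cosine(image_vec, vec) for label, vec in label_vecs.items()}
    top = max(sims.values())  # subtracted for numerical stability
    exps = {label: math.exp(scale * (s - top)) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical embeddings: in a real setting these would come from CLIP's
# text encoder (one per category prompt) and image encoder.
label_vecs = {
    "a photo of a map": [0.9, 0.1, 0.0],
    "a photo of a portrait": [0.1, 0.9, 0.1],
    "a photo of a newspaper page": [0.0, 0.2, 0.9],
}
image_vec = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_vec, label_vecs)
best = max(probs, key=probs.get)
```

Image retrieval uses the same similarity in the other direction: a text query is embedded once and the images are ranked by their similarity to it.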


Thursday, 29 June 11:00 CET

Download presentation: Image classification and image retrieval using CLIP models

External link: Video recording
