Active Tigger @ FOSDEM 2025

Accelerate text annotation for social science

Émilien Schultz
CREST/GENES

2025-01-02

A need to equip social scientists

  • Shitload of digital text data (newspapers, social media, speech-to-text…)
    • from hundreds to millions of documents…
  • Fancy new horizons with foundation models & LLMs
  • No-code tools to democratize/stabilize computational social science (CSS) practices

Rationale for Active Tigger

An open-source web application for text classification:

  • Collaborate on text annotation
  • Accelerate the annotation of specific elements (active learning)
  • Scale predictions to the larger dataset (BERT model fine-tuning)
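As an illustration of the active learning step, here is a minimal sketch of uncertainty sampling: the documents whose predicted class probabilities are closest to a coin flip get proposed for annotation first. This is one common strategy and is not necessarily how Active Tigger selects elements; the function names, document ids, and probabilities below are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k=2):
    """Return the ids of the k unlabeled documents with the highest entropy.

    `predictions` maps a document id to the class probabilities produced by
    whatever model is currently trained on the labeled examples.
    """
    ranked = sorted(
        predictions,
        key=lambda doc_id: entropy(predictions[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical probabilities for three unlabeled documents and two classes.
predictions = {
    "doc_a": [0.98, 0.02],  # the model is confident
    "doc_b": [0.55, 0.45],  # the model hesitates: most useful to annotate
    "doc_c": [0.80, 0.20],
}

to_annotate = select_most_uncertain(predictions, k=2)
print(to_annotate)
```

Annotating the uncertain documents first tends to move the decision boundary faster than annotating random ones, which is why mixing random and active selection is a common practice.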

“Even on complex tasks, a well-trained supervised model […] can equal, and sometimes even surpass human annotation”2

Technical overview

  • Backend: Python + FastAPI
  • Frontend: React + Python client
  • Models: BERT + API

Three guiding values:

  • Tool dedicated to a specific task
  • Integrate (if possible) annotation best practices
  • Keep flexibility to allow reuse/modifications

Current state of development

Currently in beta testing / Code on GitHub

Next milestones:

  • Integrate external APIs (Hugging Face, etc.) for annotation
  • Better rights management & a monitoring panel for multi-user projects
  • Stable “classical” version by June, in Docker

Let’s jump to a quick demo

Do I need a classifier right now? Maybe!

  • I need to pick the talk I want to attend after this one
  • I have all the abstracts of the conference

(Yes, this is not really social science, but you get the idea: it would be similar to annotating different frames in newspaper coverage.)

How to proceed

  • Secure the FOSDEM dataset
  • Go to an Active Tigger instance
    • Create a project
    • Create your codebook/labels + embeddings
    • Annotate randomly + actively
    • Train a model / evaluate / iterate
    • Extend the annotation to the dataset
    • Download the results
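The last steps (train, extend the annotation, download) can be sketched outside the tool as follows. This is a toy stand-in under loud assumptions: a nearest-centroid classifier over made-up 2-D embeddings replaces the BERT fine-tuning done in Active Tigger, and the talk ids, labels, and helper names are hypothetical.

```python
import csv
import io


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(embeddings, labels):
    """Fit one centroid per label from the annotated documents."""
    by_label = {}
    for doc_id, label in labels.items():
        by_label.setdefault(label, []).append(embeddings[doc_id])
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is closest to the vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


# Made-up embeddings for six abstracts; only four are annotated by hand.
embeddings = {
    "talk_1": [0.9, 0.1],
    "talk_2": [0.8, 0.2],
    "talk_3": [0.1, 0.9],
    "talk_4": [0.2, 0.7],
    "talk_5": [0.85, 0.15],
    "talk_6": [0.15, 0.8],
}
labels = {"talk_1": "attend", "talk_2": "attend", "talk_3": "skip", "talk_4": "skip"}

model = train(embeddings, labels)

# Extend the annotation: keep manual labels, predict the rest.
results = {
    doc_id: labels.get(doc_id) or predict(model, vector)
    for doc_id, vector in embeddings.items()
}

# "Download the results" as CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "label"])
writer.writerows(sorted(results.items()))
export = buffer.getvalue()
print(export)
```

In practice the evaluate/iterate step would sit between training and extending: hold out some annotated documents, compare predictions with the manual labels, and annotate more if the scores are too low.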

To try

fosdem/fosdem

Contributors & Funding

Funders: DRARI + CREST + Progédo

Contributors:

  • Étienne Ollion
  • Julien Boelaert
  • Annina Claesson
  • Emma Bonnutti
  • Paul Girard (Ouestware)
  • Léo Mignot
  • Jean-Baptiste Richardet

& all the people who helped and gave us feedback.