Active Tigger @ FOSDEM2025
Accelerate text annotation for social science
Émilien Schultz
CREST/GENES
2025-01-02
A need to equip social scientists
- Shitload of digital text data (newspapers, social media, speech-to-text…)
- from hundreds to millions of documents…
- Fancy new horizons with foundation models & LLMs
- No-code tools to democratize/stabilize computational social science (CSS) practices
Rationale for Active Tigger
An open-source web application for text classification:
- Collaborate on text annotation
- Accelerate the annotation of specific elements (active learning)
- Scale predictions to the larger dataset (BERT model fine-tuning)
“Even on complex tasks, a well-trained supervised model […] can equal, and sometimes even surpass human annotation”
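To make the active learning step concrete, here is a minimal sketch of uncertainty sampling, one common active-learning strategy and not necessarily the one Active Tigger implements. The function names, the document ids and the probability values are all hypothetical; the idea is simply to send to the annotator the documents the current model is least sure about.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k=2):
    """Return the ids of the k documents with the highest predictive entropy.

    `predictions` maps a document id to the class probabilities given by
    the current model (hypothetical input format, assumed for this sketch).
    """
    ranked = sorted(
        predictions,
        key=lambda doc_id: entropy(predictions[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical probabilities for four unlabeled abstracts over two labels.
predictions = {
    "talk_a": [0.98, 0.02],
    "talk_b": [0.55, 0.45],
    "talk_c": [0.80, 0.20],
    "talk_d": [0.50, 0.50],
}

to_annotate = select_most_uncertain(predictions, k=2)
print(to_annotate)  # the two most ambiguous documents come first
```

Annotating the ambiguous cases first tends to improve the model faster than purely random annotation, which is why the two are usually combined.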
Technical overview
- Backend: Python + FastAPI
- Frontend: React + Python client
- Models: BERT + API
3 guiding values:
- Tool dedicated to a specific task
- Integrate (if possible) annotation best practices
- Keep flexibility to allow reuse/modifications
Current state of development
Currently in beta testing / Code on GitHub
Next milestones:
- Integrate external APIs (Hugging Face, etc.) for annotation
- Better rights management & monitoring panel for multi-user projects
- Stable “classical” version by June, in Docker
Let’s jump to a quick demo
Do I need a classifier right now? Maybe!
- I need to pick the talk I want to attend after this one
- I have all the abstracts of the conference
(yes, this is not really social science stuff, but you get the idea: it would be similar for annotating different frames in newspaper coverage)
How to proceed
- Secure the FOSDEM dataset
- Go to an Active Tigger instance
- Create a project
- Create your codebook/labels + embeddings
- Annotate randomly + actively
- Train a model / evaluate / iterate
- Extend the annotation to the dataset
- Download the results
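The last steps above (evaluate, extend, download) can be sketched in plain Python. This is only an illustration under stated assumptions: `toy_model` is a hypothetical keyword rule standing in for a fine-tuned classifier, the abstracts and labels are invented, and the export format is a guess, not Active Tigger's actual output.

```python
import csv
import io


def f1_score(gold, predicted, positive):
    """F1 for one label, computed from paired gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def toy_model(text):
    """Stand-in for a trained classifier (hypothetical keyword rule)."""
    return "attend" if "annotation" in text.lower() else "skip"


# Evaluate on a small held-out set of manually annotated abstracts.
held_out = [
    ("Text annotation with active learning", "attend"),
    ("Kernel scheduling internals", "skip"),
    ("Annotation tools for newspapers", "attend"),
    ("Collaborative labelling of corpora", "attend"),
]
gold = [label for _, label in held_out]
predicted = [toy_model(text) for text, _ in held_out]
score = f1_score(gold, predicted, positive="attend")

# Extend the predictions to the unlabeled documents and export them as CSV.
unlabeled = ["Scaling annotation campaigns", "Packaging Rust crates"]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["text", "prediction"])
for text in unlabeled:
    writer.writerow([text, toy_model(text)])
export = buffer.getvalue()
```

If the score on the held-out set is too low, the usual move is to go back to annotation (randomly and actively) and retrain before extending to the full dataset.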
To try
fosdem/fosdem
Contributors & Funding
Funders: DRARI + CREST + Progédo
Contributors:
- Étienne Ollion
- Julien Boelaert
- Annina Claesson
- Emma Bonnutti
- Paul Girard (Ouestware)
- Léo Mignot
- Jean-Baptiste Richardet
& all the people who helped and gave us feedback.