Leveraging an unprecedented corpus of newspaper and radio archives, **Impresso – Media Monitoring of the Past** is an interdisciplinary research project that uses machine learning to pursue a paradigm shift in the processing, semantic enrichment, representation, exploration, and study of historical media across modalities, time, languages, and national borders.
I just learned about the Swiss project Impresso: Media Monitoring of the Past. This project has an impressive Web application that lets you search across 76 newspapers in two languages from two countries.
Key to the larger project is using machine learning to work across multiple modalities, languages, and borders, such as:
- News text and radio broadcasts
- Text and images
- French and German
- Different countries
A Data Lab that uses IPython is coming soon. The project also has documentation about a Topic Modelling tool, but I couldn’t find the tool itself.
Anyway, this strikes me as an example of an advanced multi-modal news research environment.