Mir, a Media Information Retrieval System

30 minutes



Mir (https://github.com/grubert65/Mir) is a Media Information Retrieval system built entirely in Perl and still under heavy development (I started it just a few months ago, and it is basically a one-man project, one of many I carry on). It can fetch and extract text from different sources (web sites, directories, emails, ...) and is built around a distributed architecture, so it can scale easily. So far my company has used the platform to index some branches of its internal documentation on projects and CVs (roughly 1 million docs).
The talk will address these topics:
- Distributed architecture design
- Tips to handle iterative tasks
- Modern Perl (Moose)
- Information Retrieval (basics)