Gransk
Document processing for investigations
Gransk is a free and open source tool that aims to be a Swiss army knife of document processing and analysis. Its primary objective is to quickly give users insight into their documents during investigations. It includes a processing engine written in Python and a web interface. Under the hood it uses Apache Tika for content extraction, Elasticsearch for data indexing, and dfVFS to unpack disk images.
Given a bunch of documents and the question "has a crime been committed here?", Gransk will help you with the following:
- Pull out all text and metadata, and make it searchable (supporting more than 200 document types)
- Organize the documents by metadata, like content type, document authors and email recipients
- Highlight names, email addresses and more from text to help guide the investigation
- Automatically unpack containers like ZIP archives and disk images
- Make it easy to find relevant data using powerful filtering and aggregation
- Help you discover related information, like phone numbers or email addresses associated with a name
- Provide a user interface designed for rapid data discovery, making it fun to work with data!
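To make the "highlight" and "related information" items above more concrete, here is a minimal Python sketch of the general idea. It is not Gransk's actual implementation: the function names `extract_emails` and `related_emails`, the simplified email pattern, and the sample documents are all assumptions made for illustration. It pulls email addresses out of plain text and lists the ones that appear in the same document as a given name.

```python
import re
from collections import defaultdict

# Simplified pattern for illustration only; real-world email matching
# needs more care than this (assumption, not Gransk's own pattern).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return the unique email addresses in text, in order of appearance."""
    seen = []
    for match in EMAIL_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def related_emails(documents, name):
    """Map each email address to the documents where it co-occurs with name."""
    related = defaultdict(list)
    for doc_id, text in documents.items():
        # Case-insensitive substring match stands in for real name detection.
        if name.lower() in text.lower():
            for email in extract_emails(text):
                related[email].append(doc_id)
    return dict(related)


# Hypothetical sample data.
documents = {
    "memo.txt": "Contact Jane Doe at jane.doe@example.com before Friday.",
    "notes.txt": "John Smith (john@example.org) has not replied.",
    "mail.eml": "Jane Doe forwarded this to john@example.org.",
}

print(related_emails(documents, "Jane Doe"))
```

Co-occurrence within a single document is only a rough signal that a name and an address belong together, but it shows the kind of lead such a feature can surface for an investigator.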
It's super easy to get started! Just download this VirtualBox image, import it and double-click on it: https://drive.google.com/uc?export=download&id=0B6iPjQOwe4MKOVhma2VhWmpWaEE. Then, after a couple of seconds, go to http://localhost:8084 (just press "enter" in the login dialog).