Automated intelligence gathering and analysis platform

What is Zoogma?

Zoogma is a text intelligence platform. At its core it is a text repository that:
  1. collects text from documents in the locations you specify
  2. analyzes that text to surface what you need for research, such as concepts, people, places, organizations and dates
  3. indexes the original text along with the products of the analysis
  4. provides a web services interface that allows your portal to run searches.
We call it an intelligence platform because it lets you build intelligence products, such as market intelligence portals, document analysis tools, competitive intelligence dashboards and incident report aggregators, without dealing with the complexities of text search, text analytics, metadata model design and everything else that gets in the way of building your research application.

What does Zoogma do?

Once you have collected data, you need to analyze it. You can either employ teams of human analysts to pore over your data or have Zoogma read it for you. Zoogma is an automated intelligence gathering and analysis platform. It uses statistics and natural language processing (NLP) to find clues in unstructured text and make those clues searchable. Zoogma takes data in and puts intelligence out.

Zoogma saves you time by:

  1. Collecting information from web scrapers, databases and document repositories
  2. Storing information for analysis and search
  3. Analyzing information to assign concepts and discover facts
  4. Indexing information so users can search the results of the analysis step
  5. Delivering information to your applications using web services
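
The five steps above can be pictured with a small Python sketch. It only illustrates the collect, store, analyze, index and deliver flow and is not Zoogma's implementation: the TextStore class, the CONCEPT_KEYWORDS table and the simple keyword matching are all hypothetical stand-ins, and the search method merely stands in for what a web service call might return.

```python
import re
from collections import defaultdict

# Hypothetical concept vocabulary, for illustration only.
CONCEPT_KEYWORDS = {
    "Explosives": {"explosive", "explosives", "detonator"},
    "Finance": {"bank", "loan", "invoice"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def analyze(text):
    """Step 3: return every concept with at least one keyword in the text."""
    words = set(tokenize(text))
    return sorted(c for c, kws in CONCEPT_KEYWORDS.items() if words & kws)


class TextStore:
    """Toy repository: keeps the original text plus an index of concepts."""

    def __init__(self):
        self.documents = {}                    # step 2: doc_id -> original text
        self.concept_index = defaultdict(set)  # step 4: concept -> doc_ids

    def ingest(self, doc_id, text):
        """Steps 1 to 4 for one collected document."""
        self.documents[doc_id] = text
        for concept in analyze(text):
            self.concept_index[concept].add(doc_id)

    def search(self, concept):
        """Step 5: return ids of documents tagged with the concept."""
        return sorted(self.concept_index.get(concept, set()))


store = TextStore()
store.ingest("report-1", "A detonator was found near the bank.")
store.ingest("report-2", "The invoice is overdue.")
print(store.search("Explosives"))
print(store.search("Finance"))
```

Keeping the original text next to the analysis results mirrors the idea that a search can run against either one; a real deployment would replace the keyword table with statistical and NLP analyzers.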

Zoogma Analyzers

Analyzers are software components that scan through your data to find things for you. Each analyzer has a specific purpose:

  Analyzer            Purpose
  Categorizer         assigns predetermined concepts to documents
                      Ex. search for a concept like "Explosives"
  Entity Miner        discovers people, places, organizations and dates in documents
                      Ex. Who are the 10 most mentioned names in the document set?
  Near Dupe Detector  identifies documents that are almost the same
                      Ex. Document A is nearly identical to Document B

More to come... we have several new analyzers in development.
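
To make the near-duplicate idea concrete, here is a minimal Python sketch of one common approach: break each document into overlapping word triples (shingles) and compare the resulting sets with Jaccard similarity. This is an assumption chosen for illustration, not a description of how Zoogma's Near Dupe Detector works; the function names, the shingle size and the 0.5 threshold are all hypothetical.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Too short to shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5):
    """Return pairs of document ids whose similarity meets the threshold."""
    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    ids = sorted(sets)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if jaccard(sets[left], sets[right]) >= threshold:
                pairs.append((left, right))
    return pairs


docs = {
    "A": "The shipment arrived at the port on Monday morning",
    "B": "The shipment arrived at the port on Tuesday morning",
    "C": "Quarterly revenue grew faster than analysts expected",
}
print(near_duplicates(docs))
```

Comparing every pair of documents is fine for a handful of texts, but it grows quadratically with the size of the collection, so larger sets usually call for an approximate technique such as MinHash.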