# Understanding Krawler

krawler is powered by Feathers and relies on two of its main abstractions: services and hooks. We assume you are familiar with this technology.

# Main concepts

krawler manipulates three kinds of entities:

  • a store defines where the extracted/processed data will reside,
  • a task defines what data to extract and how to query it,
  • a job defines which tasks to run to fulfill a request (i.e. sequencing).
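To make the relationship between the three entities concrete, here is a minimal sketch of how they might be modelled as plain data. The field names (`id`, `type`, `options`, `store`, `tasks`), the example values, and the helper `tasksWithUnknownStore` are assumptions made for illustration; they are not krawler's actual schema or API.

```typescript
// Hypothetical shapes, for illustration only; not krawler's actual schema.
interface Store {
  id: string;
  type: string; // assumed backend label, e.g. "fs" or "s3"
  options: Record<string, unknown>;
}

interface Task {
  id: string;
  store: string; // id of the store that will receive the extracted data
  type: string; // assumed label for how the data is queried, e.g. "http"
  options: Record<string, unknown>;
}

interface Job {
  id: string;
  tasks: Task[]; // the tasks to run, in sequence, to fulfill the request
}

const store: Store = { id: "output", type: "fs", options: { path: "./data" } };

const job: Job = {
  id: "daily-download",
  tasks: [
    { id: "page-1.json", store: "output", type: "http", options: { url: "https://example.com/data?page=1" } },
    { id: "page-2.json", store: "output", type: "http", options: { url: "https://example.com/data?page=2" } },
  ],
};

// Returns the ids of tasks that reference a store missing from the given list.
function tasksWithUnknownStore(j: Job, stores: Store[]): string[] {
  const known = stores.map((s) => s.id);
  return j.tasks.filter((t) => known.indexOf(t.store) === -1).map((t) => t.id);
}
```

In this sketch, `tasksWithUnknownStore(job, [store])` returns an empty array, because every task points at the `output` store.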

On top of this, hooks provide a set of functions that can typically be run before/after a task/job, such as a conversion after a download or task generation before a job run. In effect, this allows you to build a processing pipeline.
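The before/after pipeline idea can be sketched without any framework. The example below is a simplified, synchronous illustration: `TaskState`, `Hook`, `addExtension`, `fakeDownload`, `csvToRecords` and `runTask` are all hypothetical names, the download is faked with a hard-coded CSV string, and real Feathers hooks are usually asynchronous and receive a context object rather than the task itself.

```typescript
// Hypothetical task state passed along the pipeline (not krawler's API).
interface TaskState {
  id: string;
  raw: string; // raw downloaded content
  records: Record<string, string>[]; // converted content
}

type Hook = (task: TaskState) => TaskState;

// "Before" hook: prepare the task before it runs.
const addExtension: Hook = (task) => {
  return { id: task.id + ".csv", raw: task.raw, records: task.records };
};

// Stand-in for the actual task: pretends to download a small CSV file.
function fakeDownload(task: TaskState): TaskState {
  return { id: task.id, raw: "name,value\na,1\nb,2", records: task.records };
}

// "After" hook: convert the downloaded CSV text into records.
const csvToRecords: Hook = (task) => {
  const lines = task.raw.split("\n");
  const keys = lines[0].split(",");
  const records = lines.slice(1).map((line) => {
    const values = line.split(",");
    const record: Record<string, string> = {};
    keys.forEach((key, i) => {
      record[key] = values[i];
    });
    return record;
  });
  return { id: task.id, raw: task.raw, records: records };
};

// Runs the before hooks, then the task, then the after hooks, in order.
function runTask(task: TaskState, before: Hook[], after: Hook[]): TaskState {
  const prepared = before.reduce((acc, hook) => hook(acc), task);
  const done = fakeDownload(prepared);
  return after.reduce((acc, hook) => hook(acc), done);
}

const result = runTask({ id: "report", raw: "", records: [] }, [addExtension], [csvToRecords]);
```

Each hook takes the current state and returns a new one, so hooks can be reordered or reused across tasks and jobs without knowing about each other.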

For store management we rely on abstract-blob-store, which abstracts many different storage backends (local file system, AWS S3, Google Drive, etc.) and is already used by feathers-blob.
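The benefit of such an abstraction is that code writing results does not need to know which backend sits behind the store. The sketch below illustrates this with a deliberately simplified, synchronous interface; `SimpleBlobStore`, `createMemoryStore` and `saveResult` are hypothetical names, and the real abstract-blob-store interface is stream- and callback-based rather than the one shown here.

```typescript
// Simplified stand-in for a blob store interface (an assumption for
// illustration; abstract-blob-store itself works with streams and callbacks).
interface SimpleBlobStore {
  write(key: string, data: string): void;
  read(key: string): string | undefined;
  exists(key: string): boolean;
  remove(key: string): void;
}

// In-memory backend; a file system or cloud backend would expose the same methods.
function createMemoryStore(): SimpleBlobStore {
  const blobs: Record<string, string> = {};
  return {
    write: (key, data) => {
      blobs[key] = data;
    },
    read: (key) => blobs[key],
    exists: (key) => Object.prototype.hasOwnProperty.call(blobs, key),
    remove: (key) => {
      delete blobs[key];
    },
  };
}

// Code written against the interface is independent of the backend in use.
function saveResult(target: SimpleBlobStore, taskId: string, data: string): void {
  target.write(taskId, data);
}

const memory = createMemoryStore();
saveResult(memory, "report.json", JSON.stringify({ ok: true }));
```

Swapping the in-memory backend for another one would only change the call to `createMemoryStore`, not `saveResult`.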

# Global overview

The following figure depicts the global architecture and all concepts at play:

Figure: Architecture

# What is inside?

krawler is made possible and mainly powered by the following stack: