Services
Stores
The stores service allows you to manage in-memory data stores with the following operations:
- create(data): create a store based on the properties of the provided data object
  - id: unique store ID
  - type: store type (e.g. fs)
  - options: options specific to the store implementation
- remove(id): remove the store with the given ID
- get(id): retrieve the store with the given ID
The returned store objects comply with the abstract-blob-store interface. Available store types are the following:
Tasks
The tasks service allows you to manage individual task execution with the following operations:
- create(data): create a task based on the properties of the provided data object
  - id: unique task ID
  - type: task type (e.g. http)
  - attemptsLimit: if specified, a failing task is run again up to this number of times before being declared as failed
  - attemptsOptions: if specified, each retry is run after merging in the options given for that retry in this array
  - faultTolerant: catch any error raised by the task execution, so that the hook chain is stopped but the job continues anyway
  - options: options specific to the task implementation, plus
    - outputType: the type of output produced by this task, defaults to intermediate
- remove(id): remove the task with the given ID; this actually removes the produced output from the store given as a (query) parameter
The returned task objects contain an additional property for each output type, holding an array of the produced output files. This is used by the clearOutputs hook to perform cleanup.
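The retry behaviour described by attemptsLimit and attemptsOptions can be pictured with a small sketch. This is not krawler's code: runWithAttempts and the run callback are hypothetical names, the URLs are placeholders, and the rule that entry n - 1 of attemptsOptions is merged in for retry n is an assumption about how the array is indexed.

```javascript
// Hypothetical helper, not part of krawler: calls `run` until it succeeds or
// `attemptsLimit` attempts have been made.
async function runWithAttempts (task, run) {
  const attemptsLimit = task.attemptsLimit || 1
  const attemptsOptions = task.attemptsOptions || []
  let lastError
  for (let attempt = 0; attempt < attemptsLimit; attempt++) {
    // Assumption: the first attempt uses the base options only,
    // retry n merges in attemptsOptions[n - 1] when it exists
    const extra = attempt > 0 ? attemptsOptions[attempt - 1] : undefined
    const options = Object.assign({}, task.options, extra)
    try {
      return await run(Object.assign({}, task, { options }))
    } catch (error) {
      lastError = error
    }
  }
  // Every attempt failed: the task is declared as failed
  throw lastError
}

// Simulated task implementation that fails on the primary (placeholder) URL
const task = {
  id: 'task',
  type: 'http',
  attemptsLimit: 3,
  attemptsOptions: [{ url: 'http://mirror.example.com' }],
  options: { url: 'http://primary.example.com', timeout: 1000 }
}
const attempted = []
const result = runWithAttempts(task, async t => {
  attempted.push(t.options.url)
  if (t.options.url.includes('primary')) throw new Error('unreachable')
  return 'ok'
})
```

In this sketch the second attempt keeps the base timeout and only overrides url, which is the kind of per-retry variation attemptsOptions appears intended for.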
By default a task implementation returns a stream to extract data from, which is piped to the target store. Available task types are the following:
- http: for HTTP requests
- wms: for HTTP requests targeting WMS services
- wcs: for HTTP requests targeting WCS services
- wfs: for HTTP requests targeting WFS services
- overpass: for HTTP requests to query OpenStreetMap data
- store: to read input data from a store
- noop: when you don't need to read anything and the purpose is just to launch the hooks; returns an undefined stream
If the task type is written type-stream (the type name followed by -stream), the stream is not piped directly to the store but returned in a stream property for further usage by hooks.
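The difference between the default form and the type-stream form can be sketched as follows. This is only an illustration under stated assumptions: runTask, fromText and memoryStore are hypothetical names, and the in-memory store merely mimics the createWriteStream(key) shape of a blob store rather than reproducing any real krawler store.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical dispatcher, not krawler's actual code: `createStream` stands
// for a task implementation, `store` for an object with createWriteStream().
function runTask (task, createStream, store) {
  const stream = createStream(task.options)
  if (task.type.endsWith('-stream')) {
    // type-stream form: hand the stream over to hooks instead of piping it
    return Promise.resolve(Object.assign({}, task, { stream }))
  }
  // Default form: pipe the stream to the target store under the task ID
  return new Promise((resolve, reject) => {
    stream
      .on('error', reject)
      .pipe(store.createWriteStream(task.id))
      .on('error', reject)
      .on('finish', () => resolve(task))
  })
}

// Minimal in-memory stand-in for a store, for illustration only
const files = {}
const memoryStore = {
  createWriteStream (key) {
    const chunks = []
    return new Writable({
      write (chunk, encoding, callback) { chunks.push(chunk); callback() },
      final (callback) { files[key] = Buffer.concat(chunks).toString(); callback() }
    })
  }
}
// Fake task implementation producing a one-chunk stream
const fromText = options => Readable.from([options.text])

const piped = runTask({ id: 'a', type: 'http', options: { text: 'hello' } }, fromText, memoryStore)
const deferred = runTask({ id: 'b', type: 'http-stream', options: { text: 'world' } }, fromText, memoryStore)
```

Here the first task ends up written to the store, while the second one resolves with an unread stream property that a hook could consume later.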
Jobs
The jobs service allows you to manage job execution with the following operations:
- create(data): create a job based on the properties of the provided data object
  - id: unique job ID
  - type: job type (e.g. async)
  - tasks: tasks to be run by the job
  - options: options specific to the job implementation
- remove(id): remove the job with the given ID; this actually removes the produced output from the store given as a (query) parameter
The returned job object is a promise that is resolved when the job is finished, or rejected when it has failed.
Available common job options are the following:
- workersLimit: the maximum number of tasks to be run in parallel by the job
- attemptsLimit: if specified, each failing task is run again up to this number of times before being declared as failed
- faultTolerant: catch erroneous tasks so that the job continues anyway; the hook chain is still stopped on the faulty tasks
- timeout: stop the job and flag it as erroneous after the given timeout (ms); tasks currently being processed are allowed to finish first
Available job types are the following:
- async: to run tasks in parallel by batch
- kue: to run tasks with the Kue job sequencer; the available specific options are
  - attemptsLimit: the maximum number of attempts for a task before it is declared as failed by Kue
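To make the interplay of workersLimit and faultTolerant more concrete, here is a minimal sketch of running tasks in parallel by batch. It is not the implementation of the async job type: runByBatch and fakeTask are hypothetical names, the default limit of 4 is an arbitrary assumption, and recording the error on the returned task object is just one possible way to keep track of faulty tasks.

```javascript
// Hypothetical sketch, not krawler's implementation: runs tasks in successive
// batches of at most `workersLimit`, each batch executing in parallel.
async function runByBatch (tasks, runTask, options = {}) {
  const workersLimit = options.workersLimit || 4 // default value is an assumption
  const results = []
  for (let i = 0; i < tasks.length; i += workersLimit) {
    const batch = tasks.slice(i, i + workersLimit)
    const batchResults = await Promise.all(batch.map(task =>
      runTask(task).catch(error => {
        // With faultTolerant the faulty task is recorded and the job goes on
        if (options.faultTolerant) return Object.assign({}, task, { error })
        throw error
      })
    ))
    results.push(...batchResults)
  }
  return results
}

// Fake asynchronous task that tracks how many tasks run at the same time
let running = 0
let peak = 0
const fakeTask = async task => {
  running++
  peak = Math.max(peak, running)
  await new Promise(resolve => setTimeout(resolve, 10))
  running--
  if (task.id === 'task-3') throw new Error('failed')
  return task
}
const tasks = [1, 2, 3, 4, 5].map(n => ({ id: 'task-' + n }))
const job = runByBatch(tasks, fakeTask, { workersLimit: 2, faultTolerant: true })
```

Without faultTolerant the first failing task would reject the whole promise, which matches the description of a job promise being rejected when the job has failed.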
Task templates
When creating a job, if a taskTemplate object is provided, it is automatically merged into all job tasks, so you can use it to store options common to all your tasks. It also provides task ID templating based on the injected jobId and taskId variables. So if you provide the following task template:
{
  id: 'job',
  taskTemplate: {
    store: 'job-store',
    id: '<%= jobId %>-<%= taskId %>',
    type: 'http',
    options: {
      url: 'xxx',
      parameter1: xxx
    }
  }
}
And submit the following task to your job:
{
  id: 'task',
  options: {
    parameter2: xxx
  }
}
The final task to be executed will be:
{
  store: 'job-store',
  id: 'job-task',
  type: 'http',
  options: {
    url: 'xxx',
    parameter1: xxx,
    parameter2: xxx
  }
}
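The merge described above can be reproduced with a few lines of plain JavaScript. This is a sketch under assumptions, not krawler's code: deepMerge, renderId and applyTemplate are hypothetical helpers, the placeholder substitution is a simple regular expression rather than a full template engine, and the numeric parameter values stand in for the elided ones.

```javascript
// Hypothetical helper: recursive merge of plain objects, source wins on conflicts
function deepMerge (target, source) {
  const isObject = v => v !== null && typeof v === 'object' && !Array.isArray(v)
  const result = Object.assign({}, target)
  for (const [key, value] of Object.entries(source)) {
    result[key] = isObject(value) && isObject(result[key])
      ? deepMerge(result[key], value)
      : value
  }
  return result
}

// Hypothetical helper: replaces '<%= name %>' placeholders by variable values
function renderId (template, variables) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, (match, name) => variables[name])
}

function applyTemplate (job, task) {
  const merged = deepMerge(job.taskTemplate, task)
  // The task's own ID feeds the template, then the rendered ID replaces it
  merged.id = renderId(job.taskTemplate.id, { jobId: job.id, taskId: task.id })
  return merged
}

const job = {
  id: 'job',
  taskTemplate: {
    store: 'job-store',
    id: '<%= jobId %>-<%= taskId %>',
    type: 'http',
    options: { url: 'xxx', parameter1: 1 }
  }
}
const finalTask = applyTemplate(job, { id: 'task', options: { parameter2: 2 } })
console.log(finalTask.id) // 'job-task'
```

Note that options are merged key by key, so the task only needs to carry what differs from the template.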
Complete Example
Here's an example of a Feathers server that uses the complete set of krawler services:
const feathers = require('feathers');
const rest = require('feathers-rest');
const hooks = require('feathers-hooks');
const bodyParser = require('body-parser');
const errorHandler = require('feathers-errors/handler');
const plugin = require('krawler');

// Initialize the application
const app = feathers()
  .configure(rest())
  .configure(hooks())
  // Parse JSON and URL-encoded request bodies for the REST API
  .use(bodyParser.json())
  .use(bodyParser.urlencoded({ extended: true }))
  .configure(plugin())
  // Initialize your feathers plugin services
  .use('/stores', plugin.stores())
  .use('/tasks', plugin.tasks())
  .use('/jobs', plugin.jobs())
  .use(errorHandler());

// Define the required hooks for your app
app.service('jobs').hooks({ ... });
app.service('tasks').hooks({ ... });

app.listen(3030);
console.log('Feathers app started on 127.0.0.1:3030');

// You can now call services in REST or programmatically
app.service('jobs').create({ ... })
  .then(tasks => {
    console.log('Job terminated, ' + tasks.length + ' tasks ran');
  })
  .catch(error => {
    console.log(error.message);
  });