Jobs¶
Jobs are the things that urlwatch can monitor.
The list of jobs to run is stored in the configuration file urls.yaml, which can be edited with the command urlwatch --edit. Jobs are separated from each other by a line containing only ---.
While optional, it is recommended that each job start with a name entry:
name: "This is a human-readable name/label of the job"
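As a minimal sketch of how several jobs might sit together in urls.yaml, the fragment below places a “URL” job and a “Shell” job in one file, separated by a line containing only ---. The comments are for illustration only:

```yaml
# First job: a "URL" job
name: "urlwatch homepage"
url: "https://thp.io/2008/urlwatch/"
---
# Second job: a "Shell" job
name: "What is in my Home Directory?"
command: "ls -al ~"
```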
URL¶
This is the main job type – it retrieves a document from a web server:
name: "urlwatch homepage"
url: "https://thp.io/2008/urlwatch/"
Required keys:
- url: The URL to the document to watch for changes
Job-specific optional keys:
- cookies: Cookies to send with the request (see Advanced Topics)
- method: HTTP method to use (default: GET)
- data: HTTP POST/PUT data
- ssl_no_verify: Do not verify SSL certificates (true/false)
- ignore_cached: Do not use cache control (ETag/Last-Modified) values (true/false)
- http_proxy: Proxy server to use for HTTP requests
- https_proxy: Proxy server to use for HTTPS requests
- headers: HTTP header to send along with the request
- encoding: Override the character encoding from the server (see Advanced Topics)
- timeout: Override the default socket timeout (see Advanced Topics)
- ignore_connection_errors: Ignore (temporary) connection errors (see Advanced Topics)
- ignore_http_error_codes: List of HTTP errors to ignore (see Advanced Topics)
- ignore_timeout_errors: Do not report errors when the timeout is hit
- ignore_too_many_redirects: Ignore redirect loops (see Advanced Topics)
- user_visible_url: Different URL to show in reports (e.g. when watched URL is a REST API URL, and you want to show a webpage)
(Note: url implies kind: url)
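The following sketch shows how a few of the optional keys might be combined in a single “URL” job. The endpoint, the form data and the header value are hypothetical, and the value formats shown for data (a form-encoded string) and headers (a mapping of header names to values) are assumptions; see Advanced Topics for the authoritative formats:

```yaml
name: "Search results (POST request)"
# Hypothetical endpoint, used only for illustration
url: "https://example.org/search.cgi"
method: "POST"
# Assumed format: form-encoded request body
data: "q=something&category=4"
# Assumed format: mapping of header names to values
headers:
  User-Agent: "urlwatch-example"
# Skip ETag/Last-Modified cache control values
ignore_cached: true
# Do not report errors when the timeout is hit
ignore_timeout_errors: true
```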
Browser¶
This job type is a resource-intensive variant of “URL” to handle web pages requiring JavaScript in order to render the content to be monitored.
The optional pyppeteer package must be installed to run “Browser” jobs (see Dependencies).
At the moment, the Chromium version used by pyppeteer only supports macOS (x86_64), Windows (both x86 and x64) and Linux (x86_64). See this issue in the Pyppeteer issue tracker for progress on getting ARM devices supported (e.g. Raspberry Pi).
Because pyppeteer downloads a special version of Chromium (~ 100 MiB), the first execution of a browser job could take some time (and bandwidth). It is possible to run pyppeteer-install to pre-download Chromium.
name: "A page with JavaScript"
navigate: "https://example.org/"
Required keys:
- navigate: URL to navigate to with the browser
Job-specific optional keys:
- wait_until: Either load, domcontentloaded, networkidle0, or networkidle2 (see Advanced Topics)
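As a small sketch, a “Browser” job that sets wait_until could look like the following. The choice of networkidle0 is arbitrary here; which value suits a given page depends on how that page loads (see Advanced Topics):

```yaml
name: "A page with JavaScript"
navigate: "https://example.org/"
# One of: load, domcontentloaded, networkidle0, networkidle2
wait_until: networkidle0
```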
As this job uses Pyppeteer to render the page in a headless Chromium instance, it requires massively more resources than a “URL” job. Use it only on pages where url does not give the right results.
Hint: in many instances, instead of using a “Browser” job, you can use the much faster “URL” job type to monitor the output of an API that the site calls during page loading and that contains the information you’re after.
(Note: navigate implies kind: browser)
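To illustrate the hint above, a “URL” job that watches an API response instead of the rendered page might look like this sketch. Both URLs are hypothetical; in practice the API endpoint would be found by inspecting the requests the page makes while loading:

```yaml
name: "Item list (via API instead of browser)"
# Hypothetical JSON endpoint that the page itself requests while loading
url: "https://example.org/api/items.json"
# Hypothetical human-facing page to show in reports instead of the API URL
user_visible_url: "https://example.org/items"
```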
Shell¶
This job type allows you to watch the output of arbitrary shell commands, which is useful for e.g. monitoring an FTP uploader folder, output of scripts that query external devices (RPi GPIO), etc…
name: "What is in my Home Directory?"
command: "ls -al ~"
Required keys:
- command: The shell command to execute
Job-specific optional keys:
- none
(Note: command implies kind: shell)
Optional keys for all job types¶
- name: Human-readable name/label of the job
- filter: filters (if any) to apply to the output (can be tested with --test-filter)
- max_tries: Number of times to retry fetching the resource
- diff_tool: Command to a custom tool for generating diff text
- diff_filter: filters (if any) to apply to the diff result (can be tested with --test-diff-filter)
- treat_new_as_changed: Will treat jobs that don’t have any historic data as CHANGED instead of NEW (and create a diff for new jobs)
- compared_versions: Number of versions to compare for similarity
- kind (redundant): Either url, shell or browser. Automatically derived from the unique key (url, command or navigate) of the job type
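The sketch below shows how some of these common keys might be attached to a job. The filter name html2text and the diff tool wdiff are assumptions chosen for illustration; check the filter documentation and your system for what is actually available:

```yaml
name: "urlwatch homepage"
url: "https://thp.io/2008/urlwatch/"
# Redundant: already implied by the url key
kind: url
# Assumed filter name, for illustration only
filter: html2text
max_tries: 3
# Assumed: a diff-producing command installed on the system
diff_tool: "wdiff"
# Report a job without historic data as CHANGED instead of NEW
treat_new_as_changed: true
```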
Settings keys for all jobs at once¶
See Job Defaults for how to configure keys for all jobs at once.