
Cron Jobs

Crons

To run the jobs automatically, you can create cron jobs. The exact implementation depends on your hosting provider.

There are four different options to run cron jobs. 

  • Using the administrator cron module. This module provides a pseudo cron that runs in the background while you are logged in. This keeps the extracted and checked links up to date while you are working in the administrator.
  • Using the HTTP interface from your browser or using tools like wget and curl.
  • Using Joomla's built-in task scheduler.
  • Using Joomla's command-line interface (CLI).

The Maintenance menu in the administrator shows the links and command to use for the HTTP and CLI cron.

Whether you can run the CLI command from a cron job depends on your host's configuration. Alternatively, you can head over to System ⇾ Scheduled Tasks and create a GET Request task using the URLs provided.
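If you prefer a small script over wget or curl for the HTTP option, the request can be sketched in Python as below. This is only an illustration: the function name trigger_http_cron and its parameters are made up for this example, and the URL must be the one shown in the Maintenance menu.

```python
import urllib.request


def trigger_http_cron(url, timeout=60, opener=urllib.request.urlopen):
    """Send one GET request to the cron URL and return (status, body_text).

    `url` is the HTTP cron link copied from the Maintenance menu.
    `opener` is injectable only so the function can be tested offline.
    """
    with opener(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Call trigger_http_cron() with the copied URL from whatever scheduler your host offers. A status other than 200 is a hint that the run did not complete.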

Using the Command Line is the most effective method to parse and check links in batches. 

Cron Frequency

Extracting frequency

With the Admin Pseudo Cron Module active, items should be reprocessed while you are active in the administrator.

With the global or extractor setting 'Extract on Save' active, items should be reprocessed on save.

Furthermore, articles and most other content types in Joomla have a 'modified on' date, so only changed items will be reprocessed. In general, the extract cron therefore doesn't need a high frequency. A daily run may be enough to process items without modified dates.
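The modified-date rule can be pictured with a short Python sketch. It is an assumption about the general idea, not the extension's actual code, and the names needs_reextract, modified_on and last_extracted_on are hypothetical.

```python
from datetime import datetime


def needs_reextract(modified_on, last_extracted_on):
    """Return True when an item should be parsed for links again."""
    if last_extracted_on is None:
        return True  # never extracted before
    if modified_on is None:
        return True  # no modified date, so a change cannot be ruled out
    return modified_on > last_extracted_on
```

Items without a modified date always return True here, which is why a periodic run is still useful for them.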

If your site has a lot of unattended changes, such as articles with publish-up and publish-down times, you might need to decrease the interval or set the option 'Only extract from published content' to off.

Check frequency

The frequency to recheck extracted links depends on the number of links and the recheck interval.

You will find an estimate on the Maintenance page.
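To get a feel for the numbers yourself, a rough back-of-the-envelope calculation can be written in Python. This is an assumed formula for illustration only and not necessarily how the Maintenance page computes its estimate. The parameter names are hypothetical.

```python
import math


def runs_needed_per_day(total_links, recheck_interval_days, batch_size):
    """Cron runs per day needed so every link is rechecked once per interval."""
    if total_links < 0 or recheck_interval_days <= 0 or batch_size <= 0:
        raise ValueError("link count must be >= 0, interval and batch size > 0")
    # Links that fall due each day if rechecks are spread evenly.
    links_due_per_day = total_links / recheck_interval_days
    # Each run checks at most one batch, so round up to whole runs.
    return math.ceil(links_due_per_day / batch_size)
```

For example, 12000 links with a 7-day recheck interval and a batch size of 100 give runs_needed_per_day(12000, 7, 100) == 18, so a run roughly every 80 minutes would keep up. Domain throttling can push the real number higher.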

Domain throttling

To prevent flooding a remote host, there is a request limit per domain. You can change the period between two requests to the same host in the BLC Options.

If you have a lot of links to the same domain, you might need several cron runs to check them all.
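Why several runs can be needed is easier to see in a simplified model. The Python sketch below is a generic per-domain throttle, not the extension's implementation. It treats a run as a single moment in time, and the names select_checkable, last_request_at and min_interval are hypothetical.

```python
from urllib.parse import urlsplit


def select_checkable(links, last_request_at, now, min_interval):
    """Split links into those that may be checked now and those deferred.

    last_request_at maps a hostname to the time of its last request (seconds).
    min_interval is the required pause between two requests to the same host.
    """
    checkable, deferred = [], []
    seen = dict(last_request_at)  # work on a copy; the caller's dict is untouched
    for link in links:
        host = urlsplit(link).hostname
        last = seen.get(host)
        if last is not None and now - last < min_interval:
            deferred.append(link)  # host was contacted too recently
        else:
            checkable.append(link)
            seen[host] = now
    return checkable, deferred
```

In this model, three links to the same host with nothing checked before yield one checkable link and two deferred ones, so the remaining links wait for later runs.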

 

Other Options

There are quite a few settings to control the cron jobs, such as batch sizes and check frequency. These options are described on the main options page.

Maximum execution time

There is no maximum execution time setting. An aborted operation will simply restart on the next run.

If multiple cron jobs run simultaneously, inserting links into the database can collide, and you might get database errors (duplicate entry).
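One common way to avoid overlapping runs is to wrap the cron command in a lock. The Python sketch below is a generic wrapper for Unix-like hosts (it relies on the fcntl module) and is not part of the extension. The function name run_exclusively and the lock path are assumptions, and the command is whatever the Maintenance menu shows for the CLI cron.

```python
import fcntl
import subprocess


def run_exclusively(command, lock_path):
    """Run command unless another process holds the lock.

    Returns the command's exit code, or None when the run was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None  # a previous run is still active; skip this one
        # The lock is released when lock_file is closed, even after a crash.
        return subprocess.run(command).returncode
```

Pass the command as a list of arguments and choose a lock path that every cron entry shares. A skipped run is harmless, since the next run picks up the remaining work.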