Schedule Tasks With cron & anacron
by Pete Kelly (critter)
Imagine having a faithful servant that could perform all your regular, mundane and necessary tasks for you. One who never forgets to do something, performs their duties precisely as they have been instructed and only ever needs to be told once how to do something. If only such a servant existed.
Well, in the Linux/Unix world there is just such a servant, and its name is cron. Actually there are a few variations of this available, but let's concern ourselves with the basic cron that is available on all systems.
The first version of cron to appear in Unix was written, as were so many of the basic Unix tools, by Brian Kernighan. More modern versions usually follow the format of the version by Paul Vixie, so you may sometimes see it referred to as vixie-cron. Cron is intended to take some of the drudgery out of a system administrator's life and to ensure that critical tasks are not overlooked. Cron is an essential tool for system maintenance, but can equally well help to keep a much simpler system, on a laptop or desktop PC, trim, tidy and backed up.
If all of this sounds too good to be true, well then I am sorry, but there are a couple of minor issues that you may not like.
The first issue is that this is a command line utility and so you will need to enter a terminal to set it up. It is really easy to do once you have the information. Cron is a daemon, which means that once started, it sits in the background until it is needed, consuming very few resources. It is started by one of the start-up scripts when you boot the computer, so all you have to do is to configure it once and then forget it.
The other thing to be aware of is that cron is not intelligent. Faithful, accurate and reliable – yes, but not too bright. Tell cron to do something really stupid and it is done regardless of the consequences. Stupid is as stupid does, as our beloved Mr. Gump would say. Beware.
To configure cron, you have to provide a list of instructions, along with times and frequencies to carry them out. This is usually done in a plain text file known as a crontab. The format of this text file is simple but strict, and there is a command named crontab that allows you to manipulate it, as we shall see. Although the crontab file is a plain text file, it is not meant to be opened directly in a text editor as you would normally do. There is a good reason for using the crontab command to create and edit your crontab file: when you save the file, the command checks it for syntax errors and reports any that it finds.
There is a file named crontab in the /etc/ directory that is used by the system for its maintenance scripts, but each user can have their own personal crontab. Let's take a look at the global system /etc/crontab.
01 * * * * root nice -n 19 run-parts --report /etc/cron.hourly
02 4 * * * root nice -n 19 run-parts --report /etc/cron.daily
22 4 * * 0 root nice -n 19 run-parts --report /etc/cron.weekly
42 4 1 * * root nice -n 19 run-parts --report /etc/cron.monthly
The file begins with a few settings, which come before the job lines shown here. The first, SHELL, just tells cron which command shell to use to interpret the commands you want to execute. The next, PATH, tells it where to look for the commands. The third, MAILTO, tells cron to mail the results to root. Yes, it can do that, although I prefer to redirect any output that I want with a command similar to:
cron-command >> /home/me/cron-output.txt
Lastly, HOME sets the home directory that cron should use. If this is not specified, then cron will use the home directory of the user as specified in the /etc/passwd file.
Now we come to the bit that actually tells cron what to do and when to do it.
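The first 5 positions, separated by spaces or tabs, tell cron when to execute the command, in this order:
minute (0-59)
hour (0-23)
day of the month (1-31)
month (1-12)
day of the week (0-7, where both 0 and 7 mean Sunday)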
An asterisk in any position means first-last, that is, every possible value. Values may be given as lists (1,3,5) or as ranges (1-5). A step may be given as value/step: in the hours field, 0-23/4 would execute the command every 4 hours, and */4 achieves the same result.
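To see how lists, ranges and steps combine, here is a minimal sketch of a shell function that prints the values a single time field matches. The name expand_field is my own invention and is not part of cron; it only understands plain numbers, so treat it as an illustration of the rules above rather than a copy of what cron does internally.

```shell
#!/bin/sh
# expand_field FIELD MIN MAX
# Print, on one line, every value that a single crontab time field matches.
# Hypothetical helper for illustration only: it handles *, lists (1,3,5),
# ranges (1-5) and steps (0-23/4 or */4), with numeric values only.
expand_field() {
    field=$1 min=$2 max=$3
    # Split the field on commas so that each list item is handled in turn.
    printf '%s\n' "$field" | tr ',' '\n' | while read -r part; do
        range=${part%%/*}                 # the text before any /step
        step=1
        case $part in */*) step=${part##*/} ;; esac
        case $range in
            '*') lo=$min hi=$max ;;                      # every possible value
            *-*) lo=${range%%-*} hi=${range##*-} ;;      # a range such as 1-5
            *)   lo=$range hi=$range ;;                  # a single value
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            echo "$v"
            v=$((v + step))
        done
    done | sort -n -u | xargs
}

expand_field '0-23/4' 0 23    # prints: 0 4 8 12 16 20
expand_field '*/4' 0 23       # prints: 0 4 8 12 16 20
expand_field '1,3,5' 0 7      # prints: 1 3 5
```

The second and third arguments are the lowest and highest values allowed in that position, which is what an asterisk expands to.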
Additionally these five positions may be replaced by one of the following short-cuts:
@reboot Run once after reboot
@yearly Run once a year
@annually Run once a year
@monthly Run once a month
@weekly Run once a week
@daily Run once a day
@hourly Run once an hour
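If you want to know exactly when these short-cuts fire, each one except @reboot stands for an ordinary set of five fields. The little function below, whose name shortcut_fields is only an illustration of mine, prints the usual equivalents as documented for Vixie-style cron; check the crontab manual page on your own system if in doubt.

```shell
#!/bin/sh
# shortcut_fields NAME
# Print the five-field schedule that a crontab short-cut stands for.
# Hypothetical helper for illustration; @reboot has no five-field equivalent.
shortcut_fields() {
    case $1 in
        @yearly | @annually) echo '0 0 1 1 *' ;;   # midnight on 1 January
        @monthly)            echo '0 0 1 * *' ;;   # midnight on the 1st of each month
        @weekly)             echo '0 0 * * 0' ;;   # midnight every Sunday
        @daily)              echo '0 0 * * *' ;;   # midnight every day
        @hourly)             echo '0 * * * *' ;;   # on the hour, every hour
        *) echo "no five-field equivalent for $1" >&2
           return 1 ;;
    esac
}

shortcut_fields @weekly    # prints: 0 0 * * 0
```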
After these five fields comes the name of the user who runs the command. Here it is root, but this field is neither required nor allowed in a user's personal crontab.
Finally we have the command to execute. The actual command need not concern us but for those who may be interested:
nice -n 19 tells the process scheduler to give the following command the lowest priority (in other words, be nice and don't hog too much of the processor's time when it may be needed by others).
run-parts --report runs all the executable files in the directory named after it.
So, for example, the third command down will execute all the files in /etc/cron.weekly at 4:22am every Sunday, i.e. weekly, and at a time when system usage is likely to be low. If you want to make use of this feature and add your own scripts to one of these directories, then make sure that the script name does not contain a period: myscript, not myscript.sh. From the run-parts documentation: “the names must consist entirely of upper and lower case letters, digits, underscores, and hyphens.”
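Before dropping a script into one of these directories, you could check its name against that rule. This is only a sketch, assuming that the rule quoted from the run-parts documentation applies to the version on your system; the function name runparts_name_ok is made up for the example.

```shell
#!/bin/sh
# runparts_name_ok NAME
# Succeed if NAME consists entirely of upper and lower case letters, digits,
# underscores and hyphens, as the quoted run-parts rule requires.
# Hypothetical helper for illustration; it does not call run-parts itself.
runparts_name_ok() {
    case $1 in
        '' | *[!A-Za-z0-9_-]*) return 1 ;;   # empty, or contains some other character
        *) return 0 ;;
    esac
}

for name in myscript myscript.sh; do
    if runparts_name_ok "$name"; then
        echo "$name: ok"
    else
        echo "$name: would be skipped"
    fi
done
```

Here myscript passes and myscript.sh does not, because of the period.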
The first time that you use the crontab command, if no crontab file exists for you then it creates one but you won't find a file named crontab anywhere in your home directory. The file is named with your user name and is put in /var/spool/cron/.
The crontab command has few options and only three that are likely to be useful. They are:
-e edit or create the file
-l list the contents of the file
-r remove the file
Typing “crontab -e” will open the file in the vi or vim editor ready to be edited. If the idea of having to use the vi editor fills you with dismay, then don't despair. To force crontab to use a friendlier editor, such as nano, you can enter this on the command line:
export EDITOR=nano
You can use whatever your favorite editor happens to be. You can even add this to your .bashrc file to make the change permanent. You may notice that the file you are editing is not called crontab, but has a rather strange name. Just accept the name and all will be well.
It’s now time for an example.
First of all, check that cron is actually running with this command
ps aux | grep crond
You should get output similar to this
root 2955 0.0 0.0 4512 1144 ? Ss 09:51 0:00 crond
pete 15518 0.0 0.0 4300 720 pts/3 S+ 13:38 0:00 grep --color crond
The first line is cron running. The second is the command we used to find that out. In the extremely unlikely event that cron is not running, then the easiest way for PCLinuxOS users to start it is with the PCC control panel.
PCC > System > Manage system services.
Make sure that crond is running by clicking the start button and check the 'On Boot' box.
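If the extra grep line in the ps output bothers you, the pgrep command can do the same check without listing itself. A minimal sketch follows, assuming that pgrep (from the procps package) is installed and that the daemon is named crond as on PCLinuxOS; some other distributions call it cron. The function name daemon_running is my own.

```shell
#!/bin/sh
# daemon_running NAME
# Succeed if at least one process whose name is exactly NAME is running.
# Assumes pgrep is available; -x asks for an exact match on the process name.
daemon_running() {
    pgrep -x "$1" > /dev/null
}

if daemon_running crond; then
    echo "cron is running"
else
    echo "cron is not running"
fi
```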
Suppose I want to wrap the contents of a folder into a tarball (one big file) every day, compress it, and save it to a remote directory under a unique file name that includes the date, and I want all of this executed automatically for me at 2:25am every day. I could write a simple script to do that, called maybe docsbak, like this:
#!/bin/sh
tar -czf "/backups/$(date +%d-%m-%Y)-backup.tar.gz" /home/pete/Documents/*
make it executable with
chmod +x docsbak
edit my crontab by typing crontab -e and add this line:
25 02 * * * /home/pete/docsbak
Then after a few days I might see a directory listing like this
pete@connaught$ ll -h /backups/
-rw-r--r-- 1 pete pete 19M Sep 11 02:25 11-09-2011-backup.tar.gz
-rw-r--r-- 1 pete pete 19M Sep 12 02:25 12-09-2011-backup.tar.gz
-rw-r--r-- 1 pete pete 21M Sep 13 02:25 13-09-2011-backup.tar.gz
-rw-r--r-- 1 pete pete 18M Sep 14 02:25 14-09-2011-backup.tar.gz
I could now add a command to delete files that are more than say 30 days old and have that automatically executed monthly.
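That clean-up could be sketched with the find command. The function name prune_backups and the script name prunebak mentioned in the comments are assumptions of mine for illustration; the file name pattern matches the tarballs created by docsbak. Note that find also descends into any sub-directories.

```shell
#!/bin/sh
# prune_backups DIR DAYS
# Delete backup tarballs under DIR that were last modified more than DAYS days ago.
# Hypothetical helper for illustration. Test it with -print in place of
# "-exec rm -f {} +" before trusting it with real backups.
prune_backups() {
    find "$1" -type f -name '*-backup.tar.gz' -mtime +"$2" -exec rm -f {} +
}

# Assumed usage, with the directory from the example above:
#   prune_backups /backups 30
# Saved as an executable script, say /home/pete/prunebak, it could be run
# once a month with a crontab line such as:
#   @monthly /home/pete/prunebak
```

Remember the warning about cron not being too bright: a wrong directory or pattern here will be deleted from just as faithfully as the right one.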
Of course this assumes that your computer is running continuously and I know many Linux users leave their machines running for months at a time.
But what if you don't leave your machine running? And what about laptop users? If the time to perform a task falls when the machine is off then the job doesn't get done. In these cases we need anacron.
The cron daemon runs permanently in the background and wakes up every minute to check if there is something that needs to be done, then either does it or goes back to sleep for another minute. Anacron employs a different strategy: it looks not at the current time but at how long it is since a task was carried out.
There are other differences, too. There are no individual user configuration files. All tasks are executed as root and listed in the global configuration file /etc/anacrontab, unless directed to another file by using the -t option (actually root may use his powers to execute the file as another user for reasons of security). Also, anacron does not use minutes and hours to time the execution of tasks, only days.
The configuration file has a similar layout to crontab but has four fields per task line.
- The period in days before a task needs to be executed
- A delay in minutes before starting
- A unique job identifier
- The command to execute
The first field tells anacron how many days should have passed before the task is executed. This can be replaced by one of the short-cuts as used by crontab, with the obvious exception of @hourly.
The second field is to tell anacron how long it should wait after it is started before executing a due task. Why should you want a delay? Well anacron is not a daemon hanging around in the background like cron. It is executed, does what is necessary and then exits. Normally, it is executed by a start-up script on boot up, so it is a good idea to let things settle down and allow the user to get started using the system before quietly getting on with the housekeeping and maintenance tasks assigned to it.
Field number three is a name you give to the task. This can be anything you want that is unique, reasonable and meaningful to you. Anacron will create a file of this name in /var/spool/anacron. This file contains simply a time-stamp of the form year month day, e.g. 20110828. This is what anacron looks at when executed. If this date plus the number of days in field one is less than or equal to now (now being the current date), then the task will be carried out and the time-stamp in the file updated to now.
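The time-stamp test described above can be sketched in a few lines of shell. This is not anacron's own code, just an illustration of the rule; the function name anacron_due is made up, and the date arithmetic assumes GNU date, whose -d option accepts strings like "20110828 +7 days".

```shell
#!/bin/sh
# anacron_due STAMP PERIOD TODAY
# Succeed if a job last run on STAMP (YYYYMMDD) with a period of PERIOD days
# is due on TODAY (YYYYMMDD), i.e. if STAMP + PERIOD <= TODAY.
# Hypothetical helper for illustration; assumes GNU date.
anacron_due() {
    due=$(date -u -d "$1 +$2 days" +%Y%m%d) || return 2
    # Dates written as YYYYMMDD can be compared as plain numbers.
    [ "$due" -le "$3" ]
}

if anacron_due 20110828 7 20110904; then
    echo "due"
else
    echo "not due yet"
fi
```

With GNU date the call above reports the job as due, since 28 August plus 7 days is 4 September. With a TODAY of 20110903 it would not yet be due.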
Finally comes the command to carry out, which is often preceded by the nice command to introduce a degree of niceness to reduce system load. Here's part of a basic PCLinuxOS anacrontab:
# the jobs will be started during the following hours only
START_HOURS_RANGE=6-22
#period delay job-identifier command
1 5 cron.daily nice -n 19 run-parts /etc/cron.daily
7 25 cron.weekly nice -n 19 run-parts /etc/cron.weekly
@monthly 45 cron.monthly nice -n 19 run-parts /etc/cron.monthly
You may notice that anacron is being told to execute some of the same tasks as cron. So what is going on here?
There is some clever interplay between cron and anacron. If the system is down for a period of time, then certain tasks set to be run by cron will be missed. Then these can be checked and completed by anacron. Similarly if cron has completed a task, then you don't need anacron to repeat the process.
This is achieved in various ways by different distributions. PCLinuxOS does it by having anacron look at daily, weekly, monthly and possibly yearly cron tasks to see if they have been missed, and also by having cron run a little script to update the anacron time-stamp of any tasks it runs, so that anacron knows that the task has been completed.
The START_HOURS_RANGE variable limits the time period when anacron is permitted to execute tasks, here 6AM to 10PM, which ensures that anacron will not be executing tasks when cron is most active.