
F. Invoking the LinkController Programs

Because they use the Perl Getopt::Mixed module, all of the LinkController command-line programs accept both short POSIX-style options and long GNU-style options. Every program implements at least the following two options.

`--help'
This option will give a list of all of the options understood by the program along with brief explanations of what they do.
`--version'
This option will give some version information for the program.

You can use the `--help' option to get help on each program, for example:

 
extract-links --help

You can then use that information to get the program to do what you want.
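
As a quick check that the whole suite is installed, the same idea can be applied to every program in turn. The following is a minimal shell sketch rather than part of LinkController: the function name `show_versions' is made up for illustration, and it assumes the six programs described in this appendix are expected to be on your PATH.

```shell
# Hypothetical helper: print version information for each LinkController
# program, or a note when a program cannot be found on PATH.
show_versions() {
    for prog in link-report test-link extract-links \
                fix-link check-page build-schedule; do
        if command -v "$prog" >/dev/null 2>&1; then
            "$prog" --version
        else
            echo "$prog: not found on PATH"
        fi
    done
}

show_versions
```

Replacing `--version' with `--help' in the loop would print the option summary of each installed program instead.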

F.1 Invoking link-report  link-report usage summary
F.2 Invoking test-link  test-link usage summary
F.3 Invoking extract-links  extract-links usage summary
F.4 Invoking fix-link  fix-link usage summary
F.5 Invoking check-page  check-page usage summary
F.6 Invoking build-schedule  build-schedule usage summary



F.1 Invoking link-report

The `link-report' program prints out status information about links, allowing the user to see what needs to be fixed. By default it prints all of the links that currently occur on the user's web pages and are either broken or redirected.

Before running `link-report' you should probably use test-link (see section F.2 Invoking test-link) to check which links are broken. That may not be needed if your system administrator does it for you. After you have identified broken links you may want to use fix-link (see section F.4 Invoking fix-link) to repair the broken links.

The primary configuration file used by link-report is the `.link-control.pl' file. This tells it where the schedule file and LinkController database are. See section 2.2 Setting Configuration Variables, for how to control the contents of this file.

In the case of the `--long-list' report, a second configuration file, the `infostrucs' file, is used. This contains the information needed to know where to extract links from by default. See section 2.4 Configuring Infostructures, for more details on configuring this.

FIXME this section should give a better description of each option.

 
link-report [options]

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is 
                         doing.  Set value to control what information
                         is given.

 -U --uri=URIs           Give URIs which are to be reported on.
 -f --uri-file=FILENAME  Read all URIs in a file (one URI per line).
 -E --uri-exclude=EXCLUDE RE Add a regular expression for URIs to 
                         ignore.
 -I --uri-include=INCLUDE RE Give regular expression for URIs to check
                         (if this option is given others aren't 
                         checked).
 -e --page-exclude=EXCLUDE RE Add a regular expression for pages to 
                         ignore.
 -i --page-include=INCLUDE RE Give regular expression for pages to check
                         (if this option is given others aren't 
                         checked).

 -a --all-links          Report information about every URI.
 -b --broken             Report links which are considered broken.
 -n --not-perfect        Report any URI which wasn't okay at last test.
 -r --redirected         Report links which are redirected.
 -o --okay               Report links which have been tested okay.
 -d --disallowed         Report links for which testing isn't allowed.
 -u --unsupported        Report links which we don't know how to test.
 -m --ignore-missing     Don't complain about links which aren't in the
                         database.
 -g --good               Report links which are probably worth listing.

 -N --no-pages           Report without page list.
    --config-file=FILENAME Load in an additional configuration file
    --link-index=FILENAME Use the given file as the index of which file
                         has what link.
    --link-database=FILENAME Use the given file as the dbm containing 
                         links.

 -l --long-list          Where possible, identify the file and long 
                         list it (implies infostructure).  This is used
                         for emacs link-report-dired.
 -R --uri-report         Print URIs on separate lines for each link.
 -H --html               Report status of links in html format.



F.2 Invoking test-link

The test-link program tests all of the links in the LinkController database storing information about any problems found. It works as a robot contacting the servers where the target of each link is stored and verifying that the resource the link points to is really there.

Before running test-link you should probably use extract-links (see section F.3 Invoking extract-links) to collect all of the links you want to test and then build-schedule (see section F.6 Invoking build-schedule).
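
The order described above can be sketched as a short shell function. This is an illustrative outline, not a script shipped with LinkController: the names `step', `check_cycle' and `DRY_RUN' are invented here, the 30 minute limit is arbitrary, and it assumes that `--default-infostrucs' is the appropriate way to extract from your configured pages.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
step() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Collect links, schedule them, then test for at most 30 minutes.
# The chain stops at the first step that fails.
check_cycle() {
    step extract-links --default-infostrucs &&
    step build-schedule &&
    step test-link --halt-time=30
}

# Preview the three commands without contacting any server.
DRY_RUN=1 check_cycle
```

Running `check_cycle' without `DRY_RUN' set executes the three programs for real, so read the cautions later in this section first.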

The configuration file used by test-link is the `.link-control.pl' file. This tells it where the schedule file and LinkController database are. See section 2.2 Setting Configuration Variables, for how to control the contents of this file.

FIXME this section should give a better description of each option.

 
test-link [arguments]

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is
                         doing.  Set value to control what information
                         is given.
 --quiet -q --silent     Program should generate no output except in
                         case of error.
    --no-warn            Avoid issuing warnings about non-fatal 
                         problems.

 -c --config-file=FILENAME Load in an additional configuration file
 -u --user-address=STRING Email address for user running link testing.
 -H --halt-time=MINUTES  stop after given number of minutes

    --never-stop         keep running without stopping
    --no-robot           Don't follow robot rules.  Dangerous!!!
 -w --no-waitre=NETLOC-REGEX Home HOST regex: no robot rules.. 
                         (danger?)!!!
    --test-now           Test links now not when scheduled (testing 
                         only)
    --untested           Test all links which have not been tested.
    --sequential         Put links into schedule in order tested (for 
                         testing)
 -m --max-links=INTEGER  Maximum number of links to test (-1=no limit)

Several of the options could potentially lead to overloading networks and even other people's computer systems:

Don't use --no-robot except when you are doing local testing (that is, when you aren't connected to the Internet proper).

Don't use --never-stop or --test-now except when you are watching what is happening.

Generally you should be somewhat careful about running this program since it does automatically connect to other servers on the internet. Reasonable care has been taken to ensure it does this in a responsible way, but you must make sure that anybody who is inconvenienced has a good route for communicating this problem back to you.
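
One way to follow this advice is to always bound a run and to identify yourself. The sketch below is only an illustration: the function name `careful_test_link', the `TEST_LINK' override variable, the example address and the default limits of 30 minutes and 500 links are all assumptions of this example, not recommendations from LinkController itself.

```shell
# Hypothetical wrapper: a bounded, identifiable test-link run.
#   $1  contact email address (required)
#   $2  minutes before stopping (default 30, chosen arbitrarily here)
#   $3  maximum number of links to test (default 500, also arbitrary)
# Set TEST_LINK="echo test-link" to preview the command without running it.
careful_test_link() {
    address=${1:?usage: careful_test_link EMAIL [MINUTES] [MAX_LINKS]}
    minutes=${2:-30}
    max_links=${3:-500}
    ${TEST_LINK:-test-link} --user-address="$address" \
        --halt-time="$minutes" --max-links="$max_links"
}

# Preview a 15 minute run for a made-up address.
TEST_LINK="echo test-link" careful_test_link webmaster@example.org 15
```

The wrapper deliberately never passes --no-robot, --never-stop or --test-now, so those have to be typed explicitly on the occasions when they are really wanted.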



F.3 Invoking extract-links

The extract-links program walks through the user's web pages, collecting all of the links from those pages and storing them in a database for later checking by the test-link program (see section F.2 Invoking test-link). It can also write the links it finds to a given file.

After running extract-links you should use build-schedule (see section F.6 Invoking build-schedule), which will make sure that any new links discovered are scheduled for checking.

There are two configuration files used by extract-links. The `.link-control.pl' file is the first. This tells it where the various files it uses are. See section 2.2 Setting Configuration Variables, for how to control the contents of this file. The second file is the `infostrucs' file. This contains the information needed to know where to extract links from by default. See section 2.4 Configuring Infostructures, for more details on configuring this.

FIXME this section should give a better description of each option.

 
extract-links [arguments] [url-base [file-base]]

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is
                         doing.  Set value to control what information
                         is given.
 --quiet -q --silent     Program should generate no output except in
                         case of error.

 -e --exclude-regex=REGEX Exclude expression for excluding files.
 -p --prune-regex=REGEX  Regular expression for excluding entire 
                         directories.
 -d --default-infostrucs handle all default infostrucs (as well as ones 
                         listed on command line)

 -l --link-database=FILENAME Database to create link records into.
 -c --config-file=FILENAME Load in an additional configuration file

 -o --out-url-list=FILENAME File to output the URL of each link found to
 -i --in-url-list=FILENAME File to input URLs from to create links



F.4 Invoking fix-link

The fix-link program is designed to repair a broken link across all of the files which LinkController is managing. It does this by looking in the index files to see which files contain the broken link and then doing a textual substitution in each of those files. This makes it much faster than searching through all of the files in a set of web pages to see which pages have the broken link.

For fix-link to work properly, extract-links (see section F.3 Invoking extract-links) must have been run first to build up the index databases that fix-link uses.

There are two configuration files used by fix-link. The file `.link-control.pl' is the first. This tells it where the other configuration file and index files are. See section 2.2 Setting Configuration Variables, for how to control the contents of this file. The second file is the `infostrucs' file. This contains the information needed to relate broken links to the files which need to be repaired. See section 2.4 Configuring Infostructures, for more details on configuring this.

 
fix-link [options] old-link new-link

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is 
                         doing. Set value to control what information is 
                         given.
 -q --quiet --silent     Program should generate no output except in
                         case of error.
    --no-warn            Avoid issuing warnings about non-fatal 
                         problems.

    --directory=DIRNAME  correct all files in the given directory.

 -r --relative           Fix relative links (expensive??).
 -t --tree               Fix the link and any others based on it.
 -b --base=FILENAME      Base URI of the document or directory to be 
                         fixed.

    --config-file=FILENAME Load in an additional configuration file
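
Because fix-link rewrites files, it can be worth previewing the command before running it. The following shell sketch shows one way to do that. The function name `replace_link', the `FIX_LINK' override variable and the example.com URLs are invented for illustration, and the guard against identical arguments is this example's own addition, not a documented behaviour of fix-link.

```shell
# Hypothetical wrapper: replace one link everywhere LinkController has
# indexed it. Set FIX_LINK="echo fix-link" to preview the command without
# changing any files.
replace_link() {
    old=$1
    new=$2
    if [ "$old" = "$new" ]; then
        echo "replace_link: old and new links are identical" >&2
        return 1
    fi
    ${FIX_LINK:-fix-link} "$old" "$new"
}

# Preview a substitution between two made-up URLs.
FIX_LINK="echo fix-link" replace_link \
    http://www.example.com/old.html http://www.example.com/new.html
```

Keeping the pages under version control, or taking a backup first, gives a way back if the substitution turns out to be wrong.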



F.5 Invoking check-page

The check-page program is useful when broken links in files need to be corrected manually. It outputs a list of the line numbers where interesting links occur, allowing the user to find those lines and correct the mistakes. The output format is compatible with the Emacs compile mode, which allows fast access to the problem locations.

There are two configuration files used by check-page. The file `.link-control.pl' is the first. This tells it where the link database is. See section 2.2 Setting Configuration Variables, for how to control the contents of this file. The second file is the `infostrucs' file. This allows check-page to know what the base URI of the file being checked is and so check relative links within the page correctly. See section 2.4 Configuring Infostructures, for more details on configuring this.

 
check-page [options] filename...

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is 
                         doing.  Set value to control what information 
                         is given.

 -r --redirect           Report links which are redirected.
 -m --ignore-missing     Don't complain about links which aren't in 
                         database.

    --link-index=FILENAME Use the given file as the index of which 
                         file has what link.
    --link-database=FILENAME Use the given file as the dbm containing
                         links.



F.6 Invoking build-schedule

The build-schedule program makes a schedule for testing links. If run with no options it will make sure that all the links in the LinkController database will be checked at some point in the future.

Before running build-schedule you should probably use extract-links (see section F.3 Invoking extract-links) to collect all of the links you want to test. Afterwards you should use test-link to check which ones are broken (see section F.2 Invoking test-link).

The configuration file used by build-schedule is the `.link-control.pl' file. This tells it where the schedule file and LinkController database are. See section 2.2 Setting Configuration Variables, for how to control the contents of this file.

 
build-schedule [options]

 -V --version            Give version information for this program
 -h --help --usage       Describe usage of this program.
    --help-opt=OPTION    Give help information for a given option
 -v --verbose[=VERBOSITY] Give information about what the program is
                         doing.  Set value to control what information 
                         is given.
 --quiet -q --silent     Program should generate no output except in
                         case of error.
    --no-warn            Avoid issuing warnings about non-fatal 
                         problems.

 -l --url-list=FILENAME  File with complete list of URLs to schedule
 -s --schedule=FILENAME  Override location of the schedule
 -t --spread-time=SECONDS Time over which to spread checking; default 10 
                         days
 -S --start-offset=SECONDS Time offset from now for starting work (can
                         be negative)
 -d --ignore-db          Set the time with no regard to current setting
 -i --ignore-link        Set the time with no regard to link status
    --config-file=FILENAME Load in an additional configuration file
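
Since --spread-time takes its value in seconds while the default is described in days, a small conversion helper avoids arithmetic slips. This is a hedged sketch: the function names `days_to_seconds' and `schedule_over_days' and the `BUILD_SCHEDULE' override variable are made up for this example, and only the option name is taken from the listing above.

```shell
# Convert a whole number of days to seconds using shell arithmetic.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# Hypothetical wrapper: spread link checking over the given number of days.
# Set BUILD_SCHEDULE="echo build-schedule" to preview the command.
schedule_over_days() {
    ${BUILD_SCHEDULE:-build-schedule} --spread-time="$(days_to_seconds "$1")"
}

# Preview a schedule spread over three days (3 * 86400 = 259200 seconds).
BUILD_SCHEDULE="echo build-schedule" schedule_over_days 3
```

With the override set, the last line prints `build-schedule --spread-time=259200'. The documented default of 10 days corresponds to 864000 seconds.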



This document was generated by Michael De La Rue on February 3, 2002 using texi2html