root/Oni2/Validate External Links/Read Me First.rtf
Revision: 1070
Committed: Tue Oct 3 03:01:32 2017 UTC by iritscen
Content type: application/rtf
File size: 3645 byte(s)
Log Message:
ValExtLinks improvements:
- Now advises reader of external internal links.
- The exceptions file now allows finer-grained exemption of a URL by matching to the specific page that contains it instead of exempting all occurrences of that URL (but the '*' wildcard will match all containing pages). Currently you can only list a URL once, however.
- The exceptions file now allows external internal and potential intrawiki links to be exempted from the report.
- The path to Google Chrome (for taking screenshots) is now external to the script, supplied as an argument after "--take-screenshots".
- All of OniGalore's interwiki shortcuts are now recognized.
- Protection against failed retrieval of redirect URL.
- Better recognition of unimportant redirects (http->https, added ending slash).

File Contents

# Content
1 {\rtf1\ansi\ansicpg1252\cocoartf1504\cocoasubrtf830
2 {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
3 {\colortbl;\red255\green255\blue255;}
4 {\*\expandedcolortbl;;}
5 \margl1440\margr1440\vieww12820\viewh10560\viewkind0
6 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\qc\partightenfactor0
7
8 \f0\b\fs28 \cf0 Validate External Links
9 \b0 \
10 by Iritscen\
11 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
12 \cf0 \
13 The ValExtLinks script is a Bash shell script developed on a Mac, but it should run in any Bash shell from v3.2 onward. Non-Bash shells will not work, as the script uses a lot of Bash-specific syntax. The Unix binaries that the script invokes are mostly standard, but make sure that you have "curl" and "expect" installed.\
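As a quick pre-flight check for the requirement above, the following sketch reports any of the named tools that cannot be found on your PATH. It is not part of ValExtLinks, and the function name missing_tools is made up for this illustration.

```shell
# Hypothetical pre-flight check; not part of ValExtLinks itself.

# Print the name of each listed tool that cannot be found on the PATH.
missing_tools()
{
   for tool in "$@"; do
      command -v "$tool" >/dev/null 2>&1 || echo "$tool"
   done
}

MISSING=$(missing_tools bash curl expect)
if [ -n "$MISSING" ]; then
   # Left unquoted on purpose so that the names print on one line.
   echo "Please install:" $MISSING
else
   echo "bash, curl and expect were all found."
fi
```

This only confirms that the tools exist; it does not check their versions.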
14 \
15 The purpose of this read-me is not to tell you how to use the script; running the script with the --help option should give you that information. This read-me simply draws your attention to the following items, some or all of which you will need to adapt to your system before you can use ValExtLinks:\
16 \
17 - "Run validate_external_links.command" is intended to be the means of running the script. The idea is to make sure this file is executable and then to double-click it (or invoke it from a cron job) when you want to run ValExtLinks. Before the first run, you will need to set the variables at the top of the script to be correct for your system.\
18 \
19 - For testing and development, please note the following:\
20 - You can use the _LOCAL variables in "Run validate_external_links.command" to supply the exact links and exception files that you want the script to process. Sample alternate invocations of ValExtLinks are also provided.\
21 - Currently, the extlinks.csv file hosted on oni2.net refreshes twice a day, at 06:20 and 14:20 GMT.\
22 - The "Sample files" folder contains:\
23 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1606\fi-1607\pardirnatural\partightenfactor0
24 \cf0 - archive.txt: a sample of the output of the Internet Archive's availability API.\
25 - extlinks.csv: the current format of oni2.net's wiki link dump.\
26 - header-redirect.txt: a sample of what "curl" sees when it is run with the --head option and gets a redirection (302) response.\
27 - header.txt: a sample of what "curl" sees when it is run with the --head option and gets an OK (200) response.\
28 - ValExtLinks report.htm/.rtf/.txt: a reference sample of the output you should get.\
29 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
30 \cf0 \
31 - sftp_login.txt should be populated with your SFTP login info and the path that you want the report uploaded to. The ValExtLinks script will parse this file and pass its info to val_expect_sftp.txt, which performs the actual upload of the ValExtLinks report. The script was written to use SFTP because oni2.net does not support regular FTP.\
32 \
33 - The AGENT variable should contain a reasonably up-to-date user agent string, preferably one generated from Google Chrome, since that is the browser used to take screenshots of pages. Various web sites will show you the user agent string of the browser you visit them with, or you can simply upload the supplied print_agent.php and then visit it in Chrome.\
34 \
35 - The CURL_CODES and HTTP_CODES variables need to be set to wherever you are uploading those two files so that the reader can be directed to them properly from the header of the ValExtLinks report.\
36 \
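To make the header.txt and header-redirect.txt samples mentioned above more concrete, here is a minimal sketch of pulling the status code and the redirect target out of the kind of output that "curl" produces with the --head option. The function names and the sample header are made up for this illustration, and ValExtLinks itself may well parse its headers differently.

```shell
# Hypothetical helpers; ValExtLinks may parse its headers differently.

# Print the HTTP status code from the first line of a header dump on stdin.
status_code()
{
   tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Print the value of the Location header, if the dump on stdin has one.
redirect_target()
{
   tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }'
}

# A made-up header dump standing in for header-redirect.txt.
sample_header()
{
   printf 'HTTP/1.1 302 Found\r\n'
   printf 'Location: https://example.com/moved\r\n'
   printf '\r\n'
}

sample_header | status_code       # prints 302
sample_header | redirect_target   # prints https://example.com/moved
```

Against a live URL, the same helpers could be fed from curl --silent --head "$URL", with --user-agent "$AGENT" added to send the user agent string described above; URL here is a placeholder for whatever link you want to test.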
37 Have fun validating those external links!}