root/Oni2/Validate External Links/Read Me First.rtf
Revision: 1127
Committed: Sat Mar 28 02:08:29 2020 UTC (5 years, 6 months ago) by iritscen
Content type: application/rtf
File size: 3714 byte(s)
Log Message:
Val now counts redirects from youtu.be to youtube.com as OK links.  These links will be reported on if the argument --show-yt-redirects is used.  Renamed --show-https-upgrade to --show-https-upgrades for consistency.  Also sorted the file and page suffix arrays and added some more items to them.  Now handling status codes 400, 418, 502 and 530.  Fixed incorrect nbsps in HTML report.  Val is no longer confused by URLs ending in '(' or ')', or which contain a '%' towards the end.

File Contents

# Content
1 {\rtf1\ansi\ansicpg1252\cocoartf2512
2 \cocoatextscaling0\cocoaplatform0{\fonttbl\f0\fswiss\fcharset0 Helvetica-Bold;\f1\fswiss\fcharset0 Helvetica;}
3 {\colortbl;\red255\green255\blue255;}
4 {\*\expandedcolortbl;;}
5 \margl1440\margr1440\vieww12820\viewh10560\viewkind0
6 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\qc\partightenfactor0
7
8 \f0\b\fs28 \cf0 Validate External Links
9 \f1\b0 \
10 by Iritscen\
11 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
12 \cf0 \
13 The ValExtLinks script is a Bash shell script developed on a Mac, but any Bash shell from v3.2 onward should be able to run it. Other shells will not work, because the script uses a lot of Bash-specific syntax. The Unix binaries that the script invokes are mostly standard, but you should make sure that you have "curl" and "expect" installed.\
14 \
15 The purpose of this read-me is not to tell you how to use the script; running the script with the --help option should give you that information. This read-me only draws your attention to the following items that you will, or may, need to adapt to your system before you can use ValExtLinks:\
16 \
17 - "Run validate_external_links.command" is intended to be the means of running the script. The idea is to make sure this file is executable and then to double-click it (or invoke it from a cron event) when you want to run ValExtLinks. First you will need to set the variables at the top of the script to be correct for your system.\
18 \
19 - For testing and development, please note the following:\
20 - You can use the _LOCAL variables in "Run validate_external_links.command" to supply the exact links and exception files that you want the script to process. Sample alternate invocations of ValExtLinks are also provided.\
21 - Currently, the extlinks.csv file hosted on oni2.net refreshes twice a day, at 06:20 and 14:20 GMT.\
22 - The "Sample files" folder contains:\
23 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1606\fi-1607\pardirnatural\partightenfactor0
24 \cf0 - archive-response.txt: a sample of the output of the Internet Archive's availability API.\
25 - extlinks.csv: the current format of oni2.net's wiki link dump.\
26 - header-redirect.txt: a sample of what "curl" sees when it is run with the --head option and gets a redirection (302) response.\
27 - header-ok.txt: a sample of what "curl" sees when it is run with the --head option and gets an OK (200) response.\
28 - ValExtLinks report.htm/.rtf/.txt: a reference sample of the output you should get.\
29 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
30 \cf0 \
31 - sftp_login.txt should be populated with your SFTP login info and the path that you want the report uploaded to. The ValExtLinks script will parse this file and pass its info to val_expect_sftp.txt, which performs the actual upload of the ValExtLinks report. The script was written to use SFTP because Oni2.net does not support regular FTP.\
32 \
33 - The AGENT variable should contain a reasonably up-to-date user agent string, preferably generated from Google Chrome since that is the browser used to take screenshots of pages. Various web sites will show you your browser's user agent string when you visit them, or you can simply upload and then visit the supplied print_agent.php.\
34 \
35 - The CURL_CODES and HTTP_CODES variables need to be set to wherever you are uploading those two files so that the reader can be directed to them properly from the header of the ValExtLinks report.\
36 \
37 Have fun validating those external links!}
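
The header-ok.txt and header-redirect.txt samples mentioned above show what "curl" returns when run with the --head option. As a rough illustration of how such a response can be reduced to a status code, here is a minimal Bash sketch. The function names status_from_headers and classify_status are hypothetical and are not taken from the ValExtLinks script, which may well handle this differently; the sketch only assumes that the first line of the response is a standard HTTP status line.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of ValExtLinks.

# Print the three-digit status code found on the first line of a saved
# "curl --head" response read from standard input,
# e.g. "HTTP/1.1 302 Found" prints 302.
status_from_headers() {
    local line
    # Keeping the pattern in a variable avoids quoting pitfalls with =~ in older Bash versions
    local re='^HTTP/[0-9.]+[[:space:]]+([0-9][0-9][0-9])'
    # Only the status line is needed
    IFS= read -r line
    # HTTP header lines end in CRLF, so drop a trailing carriage return
    line=${line%$'\r'}
    if [[ $line =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

# Sort a status code into a coarse category (assumed categories, for illustration).
classify_status() {
    case "$1" in
        200) echo "ok" ;;
        301|302|303|307|308) echo "redirect" ;;
        *) echo "other" ;;
    esac
}
```

With these defined, something like `classify_status "$(status_from_headers < header-redirect.txt)"` could be tried against the sample files, or the function could be fed live output from `curl --silent --head` followed by a URL.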