root/Oni2/Validate External Links/validate_external_links.sh
Revision 1201 - (view) - [selected]
Modified Mon Oct 20 02:41:05 2025 UTC (5 days, 18 hours ago) by iritscen
File length: 62624 byte(s)
Diff to previous 1193
ValExtLinks: Don't assume that curl will return any result code at all (it might crash).
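The defensive check described in this commit could be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `get_http_code` and `normalize_code`, the chosen curl options, and the use of "000" as the fallback value are all assumptions made for the example.

```shell
# Hypothetical helper: ask curl for just the HTTP status code of a URL.
# If curl crashes, this prints nothing at all.
get_http_code() {
  curl --silent --head --max-time 10 --output /dev/null \
       --write-out '%{http_code}' "$1"
}

# Hypothetical helper: accept only a three-digit code; treat anything
# else (including an empty string from a crashed curl) as "000".
normalize_code() {
  case $1 in
    [0-9][0-9][0-9]) printf '%s\n' "$1" ;;
    *)               printf '000\n' ;;
  esac
}
```

A caller would combine the two, for example `code=$(normalize_code "$(get_http_code "$url")")`, so that later comparisons never operate on an empty value.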
Revision 1193 - (view) - [select for diffs]
Modified Thu Oct 31 22:03:44 2024 UTC (11 months, 3 weeks ago) by iritscen
File length: 62622 byte(s)
Diff to previous 1190 , to selected 1201
ValExtLinks now recognizes private YT videos.
Revision 1190 - (view) - [select for diffs]
Modified Sun Jun 23 23:36:00 2024 UTC (16 months ago) by iritscen
File length: 62383 byte(s)
Diff to previous 1189 , to selected 1201
ValExtLinks: Added archive.ph as an alternate domain to archive.is.
Revision 1189 - (view) - [select for diffs]
Modified Sun Apr 21 16:08:13 2024 UTC (18 months ago) by iritscen
File length: 62359 byte(s)
Diff to previous 1184 , to selected 1201
ValExtLinks: Added a file suffix, a page suffix and an HTTP response code which were previously unrecognized.
Revision 1184 - (view) - [select for diffs]
Modified Sun May 21 22:22:55 2023 UTC (2 years, 5 months ago) by iritscen
File length: 62343 byte(s)
Diff to previous 1183 , to selected 1201
ValExtLinks: Added --only-200-ok argument.
Revision 1183 - (view) - [select for diffs]
Modified Tue May 16 01:10:09 2023 UTC (2 years, 5 months ago) by iritscen
File length: 61176 byte(s)
Diff to previous 1182 , to selected 1201
ValExtLinks now knows to recommend that a template or image intrawiki link start with a ':' so it's a clickable link.
Revision 1182 - (view) - [select for diffs]
Modified Sun May 7 19:53:19 2023 UTC (2 years, 5 months ago) by iritscen
File length: 60903 byte(s)
Diff to previous 1178 , to selected 1201
ValExtLinks: Added special code for checking OneDrive file links. Added some safety code around 'curl' usage on YT and OneDrive link checks. Fixed a logic error in the handling of a wildcard in a URL.
Revision 1178 - (view) - [select for diffs]
Modified Mon Jan 23 01:51:32 2023 UTC (2 years, 9 months ago) by iritscen
File length: 59157 byte(s)
Diff to previous 1177 , to selected 1201
ValExtLinks now supports wildcards in the URL too, not just the containing page. Added Cloudflare's error code 520 to list of recognized errors.
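One way to match a URL against an exception entry containing a '*' wildcard is the shell's own pattern matching. The sketch below assumes that approach; the function name `url_matches` is hypothetical, and the real script may match differently.

```shell
# Hypothetical helper: succeed if the URL ($1) matches the exception
# pattern ($2), where '*' in the pattern matches any run of characters.
url_matches() {
  case $1 in
    $2) return 0 ;;   # unquoted, so $2 is treated as a pattern
    *)  return 1 ;;
  esac
}

# Example: one pattern covers every thread under the forum path.
if url_matches 'http://example.com/forum/thread/42' 'http://example.com/forum/*'; then
  echo "URL is covered by the exception"
fi
```

Note that with this approach '?' and '[' in the pattern are also treated as pattern characters, so a pattern containing a query string matches slightly more loosely than a literal comparison would.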
Revision 1177 - (view) - [select for diffs]
Modified Fri Jan 13 22:26:56 2023 UTC (2 years, 9 months ago) by iritscen
File length: 58831 byte(s)
Diff to previous 1175 , to selected 1201
ValExtLinks now skips URLs that aren't HTTP(S) protocol. Added some error-checking on line parsing. Added my email address.
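A protocol check of this kind can be sketched in a few lines. The function name `is_http_url` is an assumption for illustration, and this version is case-sensitive, which the real script may or may not be.

```shell
# Hypothetical helper: succeed only for URLs that start with
# http:// or https:// (lowercase only in this sketch).
is_http_url() {
  case $1 in
    http://*|https://*) return 0 ;;
    *)                  return 1 ;;
  esac
}

# Example: decide whether each link should be checked or skipped.
for url in 'https://example.com/' 'ftp://example.com/file' 'mailto:someone@example.com'; do
  if is_http_url "$url"; then
    echo "check: $url"
  else
    echo "skip:  $url"
  fi
done
```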
Revision 1175 - (view) - [select for diffs]
Modified Tue Aug 23 14:15:48 2022 UTC (3 years, 2 months ago) by iritscen
File length: 58253 byte(s)
Diff to previous 1160 , to selected 1201
ValExtLinks: Added audit feature which tells the user if there are items in the exception list which are no longer present on the wiki or no longer return the given error code.
Revision 1160 - (view) - [select for diffs]
Modified Sun Aug 15 14:20:21 2021 UTC (4 years, 2 months ago) by iritscen
File length: 56060 byte(s)
Diff to previous 1158 , to selected 1201
ValExtLinks: Added some entries to the lists of known file and page suffixes.
Revision 1158 - (view) - [select for diffs]
Modified Sun Jun 13 20:50:43 2021 UTC (4 years, 4 months ago) by iritscen
File length: 56043 byte(s)
Diff to previous 1157 , to selected 1201
ValExtLinks: Added argument to 'curl' that prevents some sites from rejecting it. Val now skips archive.is links too when skipping archive.org links.
Revision 1157 - (view) - [select for diffs]
Modified Sun May 9 21:53:48 2021 UTC (4 years, 5 months ago) by iritscen
File length: 55936 byte(s)
Diff to previous 1149 , to selected 1201
ValExtLinks: Make sure that bad YT links count as NG. Various tweaks to project organization.
Revision 1149 - (view) - [select for diffs]
Modified Sun Feb 7 22:36:56 2021 UTC (4 years, 8 months ago) by iritscen
File length: 55804 byte(s)
Diff to previous 1148 , to selected 1201
ValExtLinks: The messages about skipping URLs now show the wiki page's namespace. Added 504 to known response codes.
Revision 1148 - (view) - [select for diffs]
Modified Thu Feb 4 23:15:20 2021 UTC (4 years, 8 months ago) by iritscen
File length: 55740 byte(s)
Diff to previous 1147 , to selected 1201
ValExtLinks: Val can now recognize bad YouTube links (no thanks to YouTube). Fixed some math errors. Added error 429 to known codes.
Revision 1147 - (view) - [select for diffs]
Modified Tue Feb 2 20:10:39 2021 UTC (4 years, 8 months ago) by iritscen
File length: 54585 byte(s)
Diff to previous 1146 , to selected 1201
ValExtLinks: Changed --suggest-snapshots to --suggest-snapshots-ng and added --suggest-snapshots-ok for getting snapshot URLs for all good links. This can be used to confirm that sites are backed up in case they die in the future, but note that this argument will take hours to run due to the API rate limit. Added awareness of the API rate limit so that Archive.org will not start blocking the script.
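Rate-limit awareness can be as simple as enforcing a minimum gap between API calls. The sketch below assumes that approach; `MIN_INTERVAL` and its value of 5 seconds are hypothetical (the actual Archive.org limit is not stated in this log), as is the function name `seconds_to_wait`.

```shell
# Hypothetical minimum number of seconds between two API calls.
MIN_INTERVAL=5

# Hypothetical helper: given the epoch time of the last call ($1) and
# the current epoch time ($2), print how many seconds to sleep.
seconds_to_wait() {
  elapsed=$(( $2 - $1 ))
  if [ "$elapsed" -ge "$MIN_INTERVAL" ]; then
    echo 0
  else
    echo $(( MIN_INTERVAL - elapsed ))
  fi
}
```

Before each query, a script could run `sleep "$(seconds_to_wait "$last_call" "$(date +%s)")"` and then record the new call time in `last_call`.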
Revision 1146 - (view) - [select for diffs]
Modified Sun Nov 1 18:55:05 2020 UTC (4 years, 11 months ago) by iritscen
File length: 52586 byte(s)
Diff to previous 1145 , to selected 1201
ValExtLinks: Replace all occurrences of HTML-encoded '&'s in exception URL, not just the first.
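The difference between replacing the first occurrence and replacing all of them can be shown with sed, where the 'g' flag makes the substitution global. This is only an illustration of the class of bug; the function name `decode_amps` is hypothetical and the script may use a different mechanism.

```shell
# Hypothetical helper: turn every HTML-encoded '&amp;' into a plain '&'.
# The replacement is written '\&' because a bare '&' in a sed
# replacement means "the matched text". Without the trailing 'g',
# only the first '&amp;' on the line would be replaced.
decode_amps() {
  printf '%s\n' "$1" | sed 's/&amp;/\&/g'
}

decode_amps 'http://example.com/?a=1&amp;b=2&amp;c=3'
# prints http://example.com/?a=1&b=2&c=3
```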
Revision 1145 - (view) - [select for diffs]
Modified Sun Oct 11 14:58:59 2020 UTC (5 years ago) by iritscen
File length: 52585 byte(s)
Diff to previous 1144 , to selected 1201
ValExtLinks: Added .do to recognized page suffixes.
Revision 1144 - (view) - [select for diffs]
Modified Sun Sep 6 20:51:22 2020 UTC (5 years, 1 month ago) by iritscen
File length: 52582 byte(s)
Diff to previous 1142 , to selected 1201
ValExtLinks: Changed --skip-archive-links argument to --check-archive-links because the default should be to skip them. Val now uploads all three formats of its report, and links to the RTF and TXT versions from the HTML one. Val can also now tell whether each upload succeeded. A report with no link issues will print a placeholder message in that section of the report. Fixed a bug where Val thought a link should be an interwiki link when it was really a link to an archive.org snapshot from said wiki.
Revision 1142 - (view) - [select for diffs]
Modified Fri Sep 4 03:07:08 2020 UTC (5 years, 1 month ago) by iritscen
File length: 50927 byte(s)
Diff to previous 1141 , to selected 1201
Val now tries each URL three times. This has proven more effective than giving Val a long timeout and trying each URL once. The summary report has been refined a bit; the most notable change is that the final number and breakdown of link issues leaves out the excepted links. Also stopped Val from getting confused by HTML-encoded '&'s in the exceptions list.
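The retry behavior described here could look roughly like the loop below. It is a sketch under assumptions: the function name `retry` is hypothetical, and whether the real script pauses between attempts is not stated in this log.

```shell
# Hypothetical helper: run a command up to $1 times, stopping at the
# first success. Returns 0 on success, 1 if every attempt failed.
retry() {
  max_tries=$1
  shift
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    try=$((try + 1))
  done
  return 1
}
```

A link check could then be wrapped as `retry 3 curl --silent --head --max-time 10 --output /dev/null "$url"`, giving three short attempts instead of one long one.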
Revision 1141 - (view) - [select for diffs]
Modified Fri Sep 4 02:54:30 2020 UTC (5 years, 1 month ago) by iritscen
File length: 49822 byte(s)
Diff to previous 1137 , to selected 1201
Committing the changes to Val which I meant to commit over a week ago. I committed everything but the updated script itself. See last Val commit message for list of changes.
Revision 1137 - (view) - [select for diffs]
Modified Tue Jul 21 14:16:54 2020 UTC (5 years, 3 months ago) by iritscen
File length: 50434 byte(s)
Diff to previous 1136 , to selected 1201
ValExtLinks: Added '.full' as a recognized page suffix.
Revision 1136 - (view) - [select for diffs]
Modified Mon Jul 20 15:58:39 2020 UTC (5 years, 3 months ago) by iritscen
File length: 50429 byte(s)
Diff to previous 1135 , to selected 1201
ValExtLinks now reads its exceptions list from a wiki page instead of a text file that only I can change.
Revision 1135 - (view) - [select for diffs]
Modified Sun Jul 12 23:57:00 2020 UTC (5 years, 3 months ago) by iritscen
File length: 49467 byte(s)
Diff to previous 1127 , to selected 1201
Added option to not validate archive.org URLs, as those are unlikely to go bad, and we have an increasing number of them. Val now reports trivial redirect settings in Config section.
Revision 1127 - (view) - [select for diffs]
Modified Sat Mar 28 02:08:29 2020 UTC (5 years, 6 months ago) by iritscen
File length: 48069 byte(s)
Diff to previous 1125 , to selected 1201
Val now counts redirects from youtu.be to youtube.com as OK links. These links will be reported on if the argument --show-yt-redirects is used. Renamed --show-https-upgrade to --show-https-upgrades for consistency. Also sorted the file and page suffix arrays and added some more items to them. Now handling status codes 400, 418, 502 and 530. Fixed incorrect nbsps in HTML report. Val is no longer confused by URLs ending in '(' or ')', or which contain a '%' towards the end.
Revision 1125 - (view) - [select for diffs]
Modified Wed Mar 25 21:50:30 2020 UTC (5 years, 7 months ago) by iritscen
File length: 46766 byte(s)
Diff to previous 1124 , to selected 1201
Val no longer prints the result codes for IW and EI links, since that doesn't really make sense. Fixed a spacing issue in HTML report.
Revision 1124 - (view) - [select for diffs]
Modified Wed Mar 25 01:59:27 2020 UTC (5 years, 7 months ago) by iritscen
File length: 46522 byte(s)
Diff to previous 1123 , to selected 1201
Val now removes the annoying ':80' in many Archive links.
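Stripping the default port from a snapshot URL is a one-line substitution. The sketch below assumes the ':80' always appears directly before a '/', which may not cover every case the script handles; the function name `strip_port_80` is hypothetical.

```shell
# Hypothetical helper: remove ':80' where it directly precedes a '/'.
# A port such as ':8080' is left alone because it does not contain
# the exact sequence ':80/'.
strip_port_80() {
  printf '%s\n' "$1" | sed 's|:80/|/|g'
}

strip_port_80 'https://web.archive.org/web/2017/http://example.com:80/page'
# prints https://web.archive.org/web/2017/http://example.com/page
```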
Revision 1123 - (view) - [select for diffs]
Modified Sat Mar 21 22:08:35 2020 UTC (5 years, 7 months ago) by iritscen
File length: 46369 byte(s)
Diff to previous 1122 , to selected 1201
Fixed bug in Val that was causing dozens of 403 errors to be returned unnecessarily. Polished report wording and messages a little.
Revision 1122 - (view) - [select for diffs]
Modified Fri Mar 20 22:13:48 2020 UTC (5 years, 7 months ago) by iritscen
File length: 45860 byte(s)
Diff to previous 1120 , to selected 1201
Val now links to wiki pages using HTTPS instead of HTTP. Fixed code that exempts minor forms of redirects from being listed. New arguments --show-added-slashes and --show-https-upgrade allow one to turn off these exemptions. Reworked summary section extensively to be more readable.
Revision 1120 - (view) - [select for diffs]
Modified Wed Mar 18 17:08:59 2020 UTC (5 years, 7 months ago) by iritscen
File length: 43115 byte(s)
Diff to previous 1119 , to selected 1201
Val's reports now print section headers for the init/config stage and for the link results themselves.
Revision 1119 - (view) - [select for diffs]
Modified Wed Mar 18 00:24:42 2020 UTC (5 years, 7 months ago) by iritscen
File length: 42960 byte(s)
Diff to previous 1118 , to selected 1201
Properly fixed Val's parsing of Archive API responses this time. Added a little space between each link result, making report much easier to read.
Revision 1118 - (view) - [select for diffs]
Modified Tue Mar 17 16:07:35 2020 UTC (5 years, 7 months ago) by iritscen
File length: 42591 byte(s)
Diff to previous 1075 , to selected 1201
Fixed ValExtLinks' reading of Archive API replies. Fix for reading links that happen to have a shebang in them. Now knows how to handle NULL namespace links. Now prints elapsed time.
Revision 1075 - (view) - [select for diffs]
Modified Fri Oct 6 02:02:16 2017 UTC (8 years ago) by iritscen
File length: 41547 byte(s)
Diff to previous 1070 , to selected 1201
ValExtLinks: Fixed bug in interwiki link suggestions. Corrected documentation error.
Revision 1070 - (view) - [select for diffs]
Modified Tue Oct 3 03:01:32 2017 UTC (8 years ago) by iritscen
File length: 41586 byte(s)
Diff to previous 1069 , to selected 1201
ValExtLinks improvements:
- Now advises reader of external internal links.
- The exceptions file now allows finer-grained exemption of a URL by matching to the specific page that contains it instead of exempting all occurrences of that URL (but the '*' wildcard will match all containing pages). Currently you can only list a URL once, however.
- The exceptions file now allows external internal and potential intrawiki links to be exempted from the report.
- The path to Google Chrome (for taking screenshots) is now external to the script, supplied as an argument after "--take-screenshots".
- All of OniGalore's interwiki shortcuts are now recognized.
- Protection against failed retrieval of redirect URL.
- Better recognition of unimportant redirects (http->https, added ending slash).
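The two kinds of unimportant redirect named in the last item (http->https and an added ending slash) can each be detected with a plain string comparison. The following is a sketch only; the function names `is_https_upgrade` and `is_added_slash` are hypothetical and the script's own tests may be more lenient or more strict.

```shell
# Hypothetical helper: succeed if the destination ($2) is the original
# URL ($1) with only the scheme changed from http:// to https://.
is_https_upgrade() {
  case $1 in
    http://*) [ "$2" = "https://${1#http://}" ] ;;
    *)        return 1 ;;
  esac
}

# Hypothetical helper: succeed if the destination ($2) is the original
# URL ($1) with a single '/' appended.
is_added_slash() {
  [ "$2" = "$1/" ]
}
```

A redirect for which either function succeeds could be tallied as OK rather than reported as a redirect.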
Revision 1069 - (view) - [select for diffs]
Modified Wed Aug 2 04:26:48 2017 UTC (8 years, 2 months ago) by iritscen
File length: 37409 byte(s)
Diff to previous 1067 , to selected 1201
ValExtLinks: IW links now reported as separate category from OK links. RD links that are just redirecting from http:// to https:// are now regarded as OK.
Revision 1067 - (view) - [select for diffs]
Modified Tue Aug 1 17:09:42 2017 UTC (8 years, 2 months ago) by iritscen
File length: 36804 byte(s)
Diff to previous 1066 , to selected 1201
Val now understands HTTP redirect responses and will report the URL we're redirected to. Also now tallies IW links.
Revision 1066 - (view) - [select for diffs]
Modified Tue Aug 1 14:30:24 2017 UTC (8 years, 2 months ago) by iritscen
File length: 33956 byte(s)
Diff to previous 1064 , to selected 1201
Updating Val to new location of files on oni2.net and slight changes to Archive API.
Revision 1064 - (view) - [select for diffs]
Added Sun Jul 2 21:50:22 2017 UTC (8 years, 3 months ago) by iritscen
File length: 33963 byte(s)
Diff to selected 1201
Committing my wiki link validation script, as it is reasonably mature now.
