Bioequivalence and Bioavailability Forum • Documents in the Internet Archive

Documents in the Internet Archive [Tips / Tricks]

posted by Helmut – Vienna, Austria, 2013-11-21 17:45 – Posting: # 11938

Dear all,

sometimes one needs a previous version of a document that is no longer available on an agency’s website. Why don’t agencies have the version control / audit trail they require from us?

Example: A BE study on alendronate was performed in August 2008 according to FDA’s draft guidance (January 2008). Owing to improvements in bioanalytical technology, the October 2011 revision requires plasma data instead of urine data. The study was submitted to Oman’s authority, which required a copy of the old guidance before accepting the study. Procedure:

Find the URL of the current guidance at FDA’s site:
http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm082421.pdf
Fire up the Internet Archive and paste the URL into the Wayback Machine’s BROWSE HISTORY field.
Move along the timeline to the earliest year (2010) and click on the earliest snapshot (March 9th): http://web.archive.org/web/20100309050736/http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm082421.pdf
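The manual steps above can also be scripted. The following is only a sketch, assuming the Internet Archive's public availability endpoint (https://archive.org/wayback/available) still accepts `url` and `timestamp` parameters and still returns JSON with an `archived_snapshots.closest` entry. The function names are made up for illustration. Asking for a timestamp well before the first crawl (e.g. 20080101) should return the earliest snapshot, since the endpoint reports the closest one.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
# Prefix of snapshot URLs, as seen in the example above.
SNAPSHOT_PREFIX = "http://web.archive.org/web/"


def availability_query(url, timestamp):
    """Build the query asking for the snapshot closest to `timestamp` (YYYYMMDD)."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": url, "timestamp": timestamp})


def closest_snapshot(response_text):
    """Return (timestamp, snapshot_url) from a JSON response, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def split_snapshot_url(snapshot_url):
    """Split a snapshot URL into its timestamp and the original URL."""
    if not snapshot_url.startswith(SNAPSHOT_PREFIX):
        raise ValueError("not a Wayback Machine snapshot URL")
    timestamp, _, original = snapshot_url[len(SNAPSHOT_PREFIX):].partition("/")
    return timestamp, original


def fetch_closest(url, timestamp):
    """Network call: look up the snapshot nearest to `timestamp` for `url`."""
    with urlopen(availability_query(url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Usage would be something like `fetch_closest("http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm082421.pdf", "20080101")`. The limitation remains: one has to know the URL first.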

Problems:

FDA redesigned its website in 2009. Since then, the file names of all guidances start with “ucm” followed by a six-digit number. Before mid-2009 the file name was a four-digit number followed by “dft” (draft) or “fnl” (final). You can search the Internet Archive only for a URL, not for a document’s contents. In other words, if you don’t know the old URL, you will not find the document, although it might exist.
If a site / directory has a low number of visits or backlinks, the Alexa crawler will visit it infrequently and miss intermediate revisions. That is why only three previous versions of EMA’s Q&A document are archived. Furthermore, the archive lags behind by 6–24 months.
The site’s owner might decide to prevent archiving. This works even retrospectively. If
User-agent: ia_archiver
Disallow: /
is added to the site’s robots.txt, the site will not be crawled any more and previous versions will be removed from the archive. Bad luck.
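Whether a site carries such a rule can be checked before hunting for old versions. A minimal sketch using Python's standard `urllib.robotparser`; the helper name `archiver_allowed` and the example URLs are made up for illustration, and the check only tells you what the robots.txt says, not how the archive actually reacts to it.

```python
from urllib.robotparser import RobotFileParser


def archiver_allowed(robots_txt, url, agent="ia_archiver"):
    """Return True if the robots.txt text permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The rule quoted above, written on two lines as it appears in a robots.txt.
blocking_rule = "User-agent: ia_archiver\nDisallow: /\n"

print(archiver_allowed(blocking_rule, "http://www.example.org/guidance.pdf"))
print(archiver_allowed(blocking_rule, "http://www.example.org/guidance.pdf", agent="SomeOtherBot"))
```

To test a live site, download its robots.txt (e.g. with `urllib.request.urlopen`) and pass the decoded text to the helper.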

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
