4

Scanning the Source Code

In This Chapter:
The Politics of Cryptographic Source Code
The Paper Publishing Exception
Scanning
Bootstrapping

The next few chapters of this book contain specially formatted versions of the documents that we wrote to design the DES Cracker. These documents are the primary sources of our research in brute-force cryptanalysis, which other researchers would need in order to duplicate or validate our research results.

The Politics of Cryptographic Source Code

Since we are interested in the rapid progress of the science of cryptography, as well as in educating the public about the benefits and dangers of cryptographic technology, we would have preferred to put all the information in this book on the World Wide Web. There it would be instantly accessible to anyone worldwide who has an interest in learning about cryptography.

Unfortunately, the authors live and work in a country whose policies on cryptography have been shaped by decades of a secrecy mentality and covert control. Powerful agencies which depend on wiretapping to do their jobs--as well as to do things that aren't part of their jobs, but which keep them in power--have compromised both the Congress and several Executive Branch agencies. They convinced Congress to pass unconstitutional laws which limit the freedom of researchers--such as ourselves--to publish their work. (All too often, convincing Congress to violate the Constitution is like convincing a cat to follow a squeaking can opener, but that doesn't excuse the agencies for doing it.) They pressured agencies such as the Commerce Department, State Department, and Department of Justice to not only subvert their oaths of office by supporting these unconstitutional laws, but to act as front-men in their repressive censorship scheme, creating unconstitutional regulations and enforcing them against ordinary researchers and authors of software.

The National Security Agency is the main agency involved, though they seem to have recruited the Federal Bureau of Investigation in the last several years. From the outside we can only speculate what pressures they brought to bear on these other parts of the government. The FBI has a long history of illicit wiretapping, followed by use of the information gained for blackmail, including blackmail of Congressmen and Presidents. FBI spokesmen say that was "the old bad FBI" and that all that stuff was cleaned up after J. Edgar Hoover died and President Nixon was thrown out of office. But these agencies still do everything in their power to prevent ordinary citizens from being able to examine their activities, e.g. stonewalling those of us who try to use the Freedom of Information Act to find out exactly what they are doing.

Anyway, these agencies influenced laws and regulations which now make it illegal for U.S. crypto researchers to publish their results on the World Wide Web (or elsewhere in electronic form).

The Paper Publishing Exception

Several cryptographers have brought lawsuits against the U.S. Government because their work has been censored by the laws restricting the export of cryptography. (The Electronic Frontier Foundation is sponsoring one of these suits, Bernstein v. Department of Justice, et al.)* One result of bringing these practices under judicial scrutiny is that some of the most egregious past practices have been eliminated.

For example, between the 1970s and the early 1990s, NSA actually did threaten people with prosecution if they published certain scientific papers, or put them into libraries. They also had a "voluntary" censorship scheme for people who were willing to sign up for it. Once they were sued, the Government realized that their chances of losing a court battle over the export controls would be much greater if they continued censoring books, technical papers, and such.

Judges understand books. They understand that when the government denies people the ability to write, distribute, or sell books, there is something very fishy going on. The government might be able to pull the wool over a few judges' eyes about jazzy modern technologies like the Internet, floppy disks, fax machines, telephones, and such. But they are unlikely to fool the judges about whether it's constitutional to jail or punish someone for putting ink onto paper in this free country.

_________________

* See http://www.eff.org/pub/Privacy/ITAR_export/Bernstein_case/.



Therefore, the last serious update of the cryptography export controls (in 1996) made it explicit that these regulations do not attempt to regulate the publication of information in books (or on paper in any format). They waffled by claiming that they "might" later decide to regulate books--presumably if they won all their court cases--but in the meantime, the First Amendment of the United States Constitution is still in effect for books, and we are free to publish any kind of cryptographic information in a book. Such as the one in your hand.

Therefore, cryptographic research, which has traditionally been published on paper, continues to be published on paper, while other forms of scientific research are rapidly moving online.

The Electronic Frontier Foundation has always published most of its information electronically. We produce a regular electronic newsletter, communicate with our members and the public largely by electronic mail and telephone, and have built a massive archive of electronically stored information about civil rights and responsibilities, which is published for instant Web or FTP access from anywhere in the world.

We would like to publish this book in the same form, but we can't until our court case succeeds in having this research censorship law overturned. Publishing exactly the same information as a paper book in electronic form is seriously illegal in the United States if it contains cryptographic software. Even communicating it privately in electronic form to a friend or colleague who happens not to live in the United States is considered illegal by the government.

The U.S. Department of Commerce has officially stated that publishing a World Wide Web page containing links to foreign locations which contain cryptographic software "is not an export that is subject to the Export Administration Regulations (EAR)."* This makes sense to us--a quick reductio ad absurdum shows that to make a ban on links effective, they would also have to ban the mere mention of foreign Uniform Resource Locators. URLs are simple strings of characters, like http://www.eff.org; it's unlikely that any American court would uphold a ban on the mere naming of a location where some piece of information can be found.

Therefore, the Electronic Frontier Foundation is free to publish links to where electronic copies of this book might exist in free countries. If we ever find out about such an overseas electronic version, we will publish a link to it from the page at http://www.eff.org/pub/Privacy/Crypto_misc/DES_Cracking/.

___________________

* In the letter at http://samsara.law.cwru.edu/comp_law/jvd/pdj-bxa-gjs070397.htm, which is part of Professor Peter Junger's First Amendment lawsuit over the crypto export control regulations.


Scanning

When printing this book, we used tools from Pretty Good Privacy, Inc. (which has since been merged into Network Associates, Inc.). They built a pretty good set of tools for scanning source code, and for printing source code for scanning. The easiest way to handle the documents we are publishing in this book is to use their tools and scanning instructions.

PGP published the tools in a book, naturally, called "Tools for Publishing Source Code via OCR", by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann, ISBN 1-891064-02-9. The book was printed in 1997, and is sold by Printers Inc. Bookstore, 301 Castro St, Mountain View, California 94041 USA; phone +1 650 961 8500; http://www.pibooks.com.

The tools and instructions from the OCR Tools book are now available on the Internet as well as in PGP's book. See http://www.pgpi.com/project/, and follow the link to "proof-reading utilities". If that doesn't work because the pages have been moved or rearranged, try working your way down from the International PGP page, http://www.pgpi.com.

PGP's tools produce per-line and per-page checksums, and make normally invisible characters like tabs and multiple spaces explicit. Once you obtain these tools, we strongly suggest reading the textual material in the book, or the equivalent README file in the online tool distribution. It contains very detailed instructions for scanning and proofreading listings like those in this book. The instructions that follow in this chapter are a very abbreviated version.

The first two parts of converting these listings to electronic form are to scan in images of the pages, then convert the images into an approximation of the text on the pages. The first part is done by a mechanical scanner; the second is done by an Optical Character Recognition (OCR) program. You can sometimes rent time at a local "copy shop" on a computer that has both a scanner and an OCR program.

When scanning the sources, we suggest "training" your OCR program by scanning the test-file pages that follow, and some of the listings, and correcting the OCR program's idea of what the text actually said. The details of how to do this will depend on your particular OCR program. But if you straighten it out first about the shapes of the particular characters and symbols that we're using, the process of correcting the errors in the rest of the pages will be much easier.

Some unique characters are used in the listings; train the OCR program to convert them as follows:



Right pointing triangle (used for tabs) - currency symbol (byte value octal 244)

Tiny centered triangle "dot" (used for multiple spaces) - center dot or bullet (byte value octal 267)

Form feed - yen (byte value octal 245)

Big black square (used for line continuation) - pilcrow or paragraph symbol (byte value octal 266).
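As a rough illustration of the mapping above, here is a minimal sketch in Python (it is not part of PGP's tools). It records the four byte values given in the list and decodes them as ISO 8859-1 (Latin-1). That encoding is an assumption, chosen because those octal values correspond to the currency, yen, pilcrow, and center-dot characters there. The names MARKERS, marker_chars, and count_markers are hypothetical.

```python
# Marker bytes from the list above, written in octal as in the text.
# The meanings are copied from the list; the Latin-1 decoding is an assumption.
MARKERS = {
    0o244: "tab (right-pointing triangle)",
    0o267: "multiple spaces (tiny centered dot)",
    0o245: "form feed",
    0o266: "line continuation (big black square)",
}


def marker_chars():
    """Return each marker byte decoded as an ISO 8859-1 character."""
    return {bytes([b]).decode("latin-1"): meaning for b, meaning in MARKERS.items()}


def count_markers(ocr_bytes):
    """Count how often each marker byte occurs in raw OCR output."""
    return {b: ocr_bytes.count(bytes([b])) for b in MARKERS}
```

Counting the markers in a freshly OCR'd page is a quick way to see whether the OCR program was trained to emit them at all: a listing page with zero tab or space markers most likely means the special symbols were recognized as something else.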

Once you've scanned and OCR'd the pages, you can run them through PGP's tools to detect and correct errors, and to produce clean online copies.

Bootstrapping

By the courtesy of Philip R. Zimmermann and Network Associates, to help people who don't have the PGP OCR tools, we have included PGP's bootstrap and bootstrap2 pages. (The word bootstrap refers to the concept of "pulling yourself up by your bootstraps", i.e. getting something started without any outside help.) If you can scan and OCR the pages in some sort of reasonable way, you can then extract the corrected files using just this book and a Perl interpreter. It takes more manual work than if you used the full set of PGP tools.

The first bootstrap program is one page of fairly easy to read Perl code. Scan in this page, as carefully as you can: you'll have to correct it by hand. Make a copy of the file that results from the OCR, and manually delete the checksums, so that it will run as a Perl script. Then run this Perl script with the OCR result (with checksums) as the argument. If you've corrected it properly, it will run and produce a clean copy of itself, in a file called bootstrap. (Make sure none of your files have that name.) If you haven't corrected it properly, the Perl script will die somehow and you'll have to compare it to the printed text to see what you missed.

When the bootstrap script runs, it checks the checksum on each line of its input file. For any line that is incorrect, the script drops you into a text editor (set by the EDITOR environment variable) so you can fix that line. When you exit the editor, it starts over again.
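The check-edit-restart loop described above can be sketched as follows. This is a Python illustration only, not the Perl bootstrap script itself. The line layout (a four-digit hex prefix, a space, then the text) and the checksum (the low 16 bits of CRC-32) are stand-ins invented for this sketch; PGP's real checksum format is defined by its own tools. The "+N" editor argument is likewise an assumption that holds for vi-style editors.

```python
import os
import subprocess
import zlib


def line_checksum(text):
    """Stand-in checksum: low 16 bits of CRC-32 as 4 hex digits (NOT PGP's format)."""
    return format(zlib.crc32(text.encode("latin-1")) & 0xFFFF, "04x")


def first_bad_line(lines):
    """Return the 1-based number of the first line whose checksum fails, or None."""
    for number, line in enumerate(lines, start=1):
        prefix, _, text = line.partition(" ")
        if prefix != line_checksum(text):
            return number
    return None


def open_editor(path, line_number):
    """Open the editor named by EDITOR on the file (assumes it accepts +N)."""
    editor = os.environ.get("EDITOR", "vi")
    subprocess.call([editor, f"+{line_number}", path])


def verify(path, fix=open_editor):
    """Re-check the file from the top after every edit until all lines pass.

    Returns the line texts with the checksum prefixes stripped.
    """
    while True:
        with open(path, encoding="latin-1") as handle:
            lines = handle.read().splitlines()
        bad = first_bad_line(lines)
        if bad is None:
            return [line.partition(" ")[2] for line in lines]
        fix(path, bad)
```

Passing the editor step in as the fix parameter keeps the loop itself easy to exercise without a terminal; in interactive use the default simply launches the editor and then starts over, as the text describes.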

Once the bootstrap script has produced a clean version of itself, you can run it against the scanned and OCR'd copy of the bootstrap2 page. Correct it the same way, line by line until bootstrap doesn't complain. This should leave you with a clean copy of bootstrap2.

The bootstrap2 script is what you'll use to scan in the rest of the book. It works like the bootstrap script, but it can detect more errors by using the page checksum. Again, it won't correct most errors itself, but will drop you into an editor to correct them manually. (If you want automatic error correction, you have to get the PGP book.)
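To see why a page checksum can catch errors that per-line checksums miss, consider a line that the OCR step drops entirely: every remaining line still passes its own check, but the page as a whole no longer matches. The Python sketch below illustrates that one case. Both checksum functions and the page layout (a separate page header value plus prefixed lines) are hypothetical stand-ins, not the format bootstrap2 actually reads.

```python
import zlib


def line_checksum(text):
    """Stand-in per-line checksum: low 16 bits of CRC-32 (NOT PGP's format)."""
    return format(zlib.crc32(text.encode("latin-1")) & 0xFFFF, "04x")


def page_checksum(texts):
    """Stand-in page checksum: CRC-32 over all line texts joined by newlines."""
    return format(zlib.crc32("\n".join(texts).encode("latin-1")), "08x")


def check_page(header, lines):
    """Return (bad_line_numbers, page_ok) for one hypothetical page.

    header is the page checksum printed on the page; lines are the
    checksum-prefixed lines as they came out of OCR.
    """
    texts, bad = [], []
    for number, line in enumerate(lines, start=1):
        prefix, _, text = line.partition(" ")
        if prefix != line_checksum(text):
            bad.append(number)
        texts.append(text)
    return bad, header == page_checksum(texts)
```

A result of no bad lines together with a failed page check points at a structural problem, such as a missing or duplicated line, rather than a misread character.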

All the scannable listings in this book are in the public domain, except the test-file, bootstrap, and bootstrap2 pages, which are copyrighted, but which Network Associates permits you to freely copy. So none of the authors have put restrictions on your right to copy their listings for friends, reprint them, scan them in, publish them, use them in products, etc. However, if you live in an unfree country, there may be restrictions on what you can do with the listings or information once you have them. Check with your local thought police.


pp. 4-7 to 4-14 in preparation: Six pages of test files, 1 page of bootstrap, 1 page of bootstrap2.

