Ideally, the original message is not noticeably degraded by the presence of a hidden message. As a result, the most effective techniques tend to make use of data that contains a lot of redundancy, such as raw audio and image files. Steganography works much less effectively, if at all, with efficiently compressed formats such as JPEG and MPEG.
Unfortunately, sending large amounts of raw audio and image data can arouse suspicion, and the pseudo-English encoding schemes are not sophisticated enough to fool a human observer.
The snow program runs in two modes: message concealment and message extraction. During concealment, the following steps are taken.
Each of the steps is described in detail below.
If you want to compress a long message, or one not containing standard text, you would be better off compressing the message externally with a specialized compression program, and bypassing snow's optional compression step. This usually results in a better compression ratio.
The lower 7 bits of each character in the password are packed into an array, which is used to set the encryption key. The ICE encryption algorithm can operate at different levels, with higher levels using longer keys and providing more security. The ICE level appropriate for the password length is used.
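The packing step can be sketched roughly as follows. This is an illustrative Python sketch, not snow's actual source: the name `pack_password`, the most-significant-bit-first packing order, and the rule of choosing the smallest level whose key holds every password bit (never below level 1) are assumptions. It relies on ICE at level n taking a key of 64n bits.

```python
def pack_password(password):
    """Pack the low 7 bits of each character into a key (hypothetical helper).

    Returns (key_bytes, level). Bit order and level selection are assumptions.
    """
    bits = []
    for ch in password:
        value = ord(ch) & 0x7F  # keep only the lower 7 bits
        # Emit the 7 bits most-significant first (assumed order).
        bits.extend((value >> shift) & 1 for shift in range(6, -1, -1))

    # Smallest level whose 64*level-bit key can hold all the bits.
    level = max(1, (len(bits) + 63) // 64)

    # Zero-fill the unused tail of the key.
    bits += [0] * (level * 64 - len(bits))

    key = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        key.append(byte)
    return bytes(key), level
```

Under these assumptions a 9-character password (63 bits) fits a level 1 key, while a 10-character password (70 bits) needs level 2 and a 16-byte key.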
CFB mode makes use of an initialization vector (IV), which is initially set to the first 64 bits of the key encrypted by itself. Each time a bit is encrypted, the IV is encrypted, and the leftmost bit of the encrypted IV is XORed with the bit. The IV is then shifted left one bit, and the ciphertext bit is added to the right. Decryption reverses this process.
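The one-bit feedback loop described above might look like the sketch below. ICE is not available in the Python standard library, so `block_encrypt` is a stand-in built on keyed BLAKE2b; it is not ICE and is used only so the example runs. The names `block_encrypt`, `initial_iv`, and `cfb1` are hypothetical. CFB only ever uses the cipher in the encrypt direction, which is why a non-invertible stand-in is enough to show the structure.

```python
import hashlib

MASK64 = (1 << 64) - 1

def block_encrypt(key, block):
    """Stand-in for ICE: a keyed 64-bit mixing function (NOT the real cipher)."""
    digest = hashlib.blake2b(block.to_bytes(8, "big"),
                             key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big")

def initial_iv(key):
    """IV = first 64 bits of the key, encrypted with the key itself."""
    first64 = int.from_bytes(key[:8].ljust(8, b"\0"), "big")
    return block_encrypt(key, first64)

def cfb1(key, bits, decrypt=False):
    """Encrypt or decrypt a sequence of 0/1 bits in 1-bit CFB mode."""
    iv = initial_iv(key)
    out = []
    for bit in bits:
        # Leftmost bit of the encrypted IV is the keystream bit.
        keystream_bit = block_encrypt(key, iv) >> 63
        result = bit ^ keystream_bit
        # The ciphertext bit is always what gets fed back.
        cipher_bit = bit if decrypt else result
        iv = ((iv << 1) & MASK64) | cipher_bit
        out.append(result)
    return out
```

Decryption runs the same loop; the only difference is that the incoming bit, being ciphertext already, is the one shifted into the IV.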
Data is written 3 bits at a time, with each 3-bit group coding for 0 to 7 spaces. A message whose length is not a multiple of 3 bits is padded with zeroes. During extraction, the extra one or two bits at the end are ignored (fortunately there are no two-bit Huffman codes to confuse things).
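As a small illustration of the grouping and padding, the sketch below turns a bit sequence into space counts and back. The function names are hypothetical, and treating the first bit of each group as the most significant is an assumption the text does not state.

```python
def bits_to_groups(bits):
    """Split 0/1 bits into 3-bit groups, zero-padding the last group.

    Each returned value (0..7) is the number of spaces to write.
    """
    padded = list(bits) + [0] * (-len(bits) % 3)
    return [padded[i] * 4 + padded[i + 1] * 2 + padded[i + 2]
            for i in range(0, len(padded), 3)]

def groups_to_bits(groups):
    """Inverse mapping; any padding bits come back as trailing zeroes."""
    bits = []
    for value in groups:
        bits.extend([(value >> 2) & 1, (value >> 1) & 1, value & 1])
    return bits
```

For a 4-bit message such as `[1, 0, 1, 1]`, two groups are produced, and the two trailing zero bits recovered on extraction are the ones the decoder discards.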
An alternative scheme was considered, where bits were written one at a time as either a space or a tab. Although this scheme adds fewer characters per bit (1 vs 1.5), it requires more columns per bit (4.5 vs 2.67), and column space is the limiting factor.
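One way to arrive at the figures quoted above is sketched below. It assumes all bit values are equally likely, tab stops every 8 columns, and that a tab in the one-bit scheme costs a full 8 columns; these assumptions are a reconstruction and are not stated in the text.

```python
from fractions import Fraction

TAB_WIDTH = 8  # assumed tab stop spacing

# Scheme used: each 3-bit group becomes 0..7 spaces followed by a tab.
avg_spaces = Fraction(sum(range(8)), 8)           # 3.5 spaces on average
chars_per_bit_grouped = (avg_spaces + 1) / 3      # spaces plus one tab, per bit
cols_per_bit_grouped = Fraction(TAB_WIDTH, 3)     # one tab stop per 3 bits

# Alternative: each bit becomes one space (1 column) or one tab (8 columns).
chars_per_bit_single = Fraction(1)
cols_per_bit_single = Fraction(1 + TAB_WIDTH, 2)  # average of the two cases

print(float(chars_per_bit_single), float(chars_per_bit_grouped))
print(float(cols_per_bit_single), round(float(cols_per_bit_grouped), 2))
```

Since the number of columns available at the end of each line is fixed, the scheme with fewer columns per bit stores more data in the same text, even though it writes more characters.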
Tabs are used to separate the blocks of spaces. Thus 3 bits are usually coded in 8 columns of text, and given that the default line length is 80 characters, this allows 30 bits (10 groups of 3) to be stored on each empty line. A tab is not appended to the end of a line unless the last 3 bits coded to zero spaces, in which case it is needed to show that some bits are actually there.
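The layout of a single empty line can be sketched as follows. This is a simplified illustration with hypothetical function names: it handles only empty lines, assumes tab stops every 8 columns, and ignores whatever snow does when whitespace is appended to a line that already contains text.

```python
TAB_WIDTH = 8      # assumed tab stop spacing
LINE_LENGTH = 80   # default line length from the text

def bits_per_empty_line(line_length=LINE_LENGTH):
    """Each 3-bit group occupies one 8-column tab stop."""
    return (line_length // TAB_WIDTH) * 3

def encode_line(values):
    """Encode 3-bit values (0..7) as runs of spaces separated by tabs."""
    if any(not 0 <= v <= 7 for v in values):
        raise ValueError("each value must fit in 3 bits")
    if not values:
        return ""
    line = "\t".join(" " * v for v in values)
    # A trailing tab is needed only when the final run has zero spaces;
    # otherwise the run of spaces itself shows the bits are there.
    if values[-1] == 0:
        line += "\t"
    return line

def decode_line(line):
    """Recover the 3-bit values from an encoded empty line."""
    if line == "":
        return []
    runs = line.split("\t")
    if line.endswith("\t"):
        runs.pop()  # the piece after the closing tab is not a run
    return [len(run) for run in runs]
```

With these assumptions, ten groups always fit within 80 columns: a full line of sevens expands to 79 columns, and a full line of zeroes (ten tabs) to exactly 80.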
If a message will not fit into the available text, empty lines will be appended and used to contain the overflow. A warning message will also be produced, since this affects the look of the original text.