Ivars Peterson's MathTrek

May 4, 1998

### Cracking a Medieval Code

The first printed book on cryptology was written by Johannes Trithemius (1462-1516), an abbot in Spanheim, Germany, who was one of the leading intellectuals of his day. Bearing the title Polygraphiae libri sex ("Six Books of Polygraphy"), it was published in 1518 after Trithemius's death.

The first of the six books contains 384 columns of Latin words, two columns per page. Each word stands for a letter of the alphabet. Here's a sample from the first page:

    a Deus        a clemens
    b Creator     b clementissimus
    c Conditor    c pius

By taking the words standing for the letters of a secret message from consecutive columns, it's possible to construct passages that make sense as innocent prayers. For example, enciphering the letters of the word abbot would generate the Latin sentence DEUS CLEMENTISSIMUS REGENS AEVUM INFINIVET.
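The column-by-column substitution can be sketched in a few lines of Python. This is only an illustration: the tables below hold just the entries quoted in this article (the first-page sample plus the words of the abbot example), whereas Trithemius's real columns cover the whole alphabet. The names `COLUMNS`, `encipher`, and `decipher` are made up for the sketch.

```python
# Partial word tables, one dict per consecutive column. Only entries quoted
# in the article are included; the real columns cover every letter.
COLUMNS = [
    {"a": "Deus", "b": "Creator", "c": "Conditor"},
    {"a": "clemens", "b": "clementissimus", "c": "pius"},
    {"b": "regens"},
    {"o": "aevum"},
    {"t": "infinivet"},
]


def encipher(plaintext):
    """Replace the i-th letter with its word from the i-th column."""
    words = [COLUMNS[i][letter] for i, letter in enumerate(plaintext.lower())]
    return " ".join(words).upper()


def decipher(cover_text):
    """Look each word up in its own column to recover the hidden letter."""
    letters = []
    for column, word in zip(COLUMNS, cover_text.lower().split()):
        reverse = {w.lower(): letter for letter, w in column.items()}
        letters.append(reverse[word])
    return "".join(letters)


print(encipher("abbot"))
print(decipher("DEUS CLEMENTISSIMUS REGENS AEVUM INFINIVET"))
```

Because each position uses a different column, the same letter (here the two b's of abbot) comes out as different words.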

The remaining books of Polygraphia introduce additional cryptographic schemes, accompanied by lengthy tables, for ingeniously hiding information.

Polygraphia wasn't Trithemius's first venture into cryptology. In 1499, he had composed a controversial, cryptic volume called Steganographia (meaning "covered writing"). For years, it circulated privately in manuscript form before finally being printed in 1606, then placed on the official list of prohibited books in 1609. Ostensibly, it explained how to employ spirits to send secret messages.

The first two books of Steganographia contain numerous examples of some simple types of ciphers. For instance, in the message beginning PARAMESIEL OSHURMI DELMUSON THAFLOIN PEANO CHARUSTREA MELANY LYAMUNTO . . . , the first nonsense word signals the specific cryptographic system being used. The decipherer then knows to extract every other letter of every other word, starting with the second word, to get the message (in Latin): Sum tali cautela ut . . . .
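The extraction rule is easy to try out on the eight words quoted above. The sketch below assumes that the "every other letter" alternation runs continuously across the selected words rather than restarting in each word; that reading is an assumption, but it is the one that reproduces the Latin given in the text. The function name `extract` is illustrative.

```python
MESSAGE = (
    "PARAMESIEL OSHURMI DELMUSON THAFLOIN "
    "PEANO CHARUSTREA MELANY LYAMUNTO"
)


def extract(message):
    """Take every other word from the second, then every other letter."""
    words = message.split()
    selected = words[1::2]      # second, fourth, sixth, ... word
    joined = "".join(selected)  # assumption: alternation spans word breaks
    return joined[1::2]         # second, fourth, sixth, ... letter


print(extract(MESSAGE))
```

Run on these words, the sketch yields SUMTALICAUTELAUT, which reads as "Sum tali cautela ut" once word breaks are restored.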

Book III consists largely of tables of numbers, whose columns are headed by zodiacal and planetary symbols, suggesting astronomical data. Unlike the first two books, however, there are few clues to help decipher the contents.

For centuries, scholars debated whether the incomplete third book of Steganographia contains any examples of enciphered messages. Many concluded that it does not hold cryptographic secrets and merely represents magical operations of interest only to occultists.

Nonetheless, the preface of Book III begins by announcing the provocative goal of presenting a method of transmitting messages afar without the use of word, book, or messenger. Trithemius warns, however, that he has deliberately expressed himself obscurely:

"This I did that to men of learning and men deeply engaged in the study of magic, it might, by the Grace of God, be in some degree intelligible, while on the other hand, to the thick-skinned turnip-eaters it might for all time remain a hidden secret, and be to their dull intellects a sealed book forever."

Those are fighting words to a cryptanalyst, and Jim Reeds of AT&T Labs in Florham Park, N.J., couldn't resist taking on the centuries-old challenge.

"On receiving a photocopy of the Steganographia, I decided to see ifI could find any hidden messages in Book III," Reeds recounts in a paper to be published in the journal Cryptologia. "I knew that Book III was probably in a draft state. Hence, it might be missing important information; the printed version had of course not received the author's proof corrections."

At the same time, he notes, "if Book III was anything like Books I and II, it was probably pointless to try to follow the instructions given in the text. Moreover, I could expect that any plain [deciphered] texts would be short and banal."

Reeds made a lucky guess. He assumed that the cipher was numerical and that the tables were to be read in columns vertically. He also decided that the table accompanying the preface was in the form of a key, in which successive lines describe blocks of 25 numbers each, which might specify distinct letters-to-numbers encryption formulas.

Reeds started by rewriting the first numerical table, turning the columns into rows, excluding all headings and data not appearing within the original columns, and replacing any nonnumerical symbols with a / sign. Here's a sample:

    / 644 650 629 650 645 635 646 636 632 646 639 634 641 642 649 642 648 638 634 647
      632 630 642 633 648 650 655 626 650 644 638 633 635 642 632 640 637 643 638 634
    / 669 675 654 675 670 660 675 661 651 671 664 659 666 667 674 667 673 663 659 672
      657 655 667 658 673 675 660 651 675 669 663 658 660 667 637 665 662 668 663 659
    / 694 700 679 700 695 685 696 686

He noticed that the slashes divide the first 160 numbers into four blocks of exactly 40 numbers each. Moreover, almost all the numbers in each block fall within a particular numerical range.
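A short Python sketch can check this observation against the sample printed above. It uses only the numbers shown in this article, so the third block is cut off after eight entries. The ranges 626 to 650 and 651 to 675 are the ones Reeds names; extending the same width of 25 to later blocks is an assumption of the sketch, as are the function names.

```python
SAMPLE = """
/ 644 650 629 650 645 635 646 636 632 646 639 634 641 642 649 642 648 638 634 647
  632 630 642 633 648 650 655 626 650 644 638 633 635 642 632 640 637 643 638 634
/ 669 675 654 675 670 660 675 661 651 671 664 659 666 667 674 667 673 663 659 672
  657 655 667 658 673 675 660 651 675 669 663 658 660 667 637 665 662 668 663 659
/ 694 700 679 700 695 685 696 686
"""


def split_blocks(text):
    """Split the number stream into blocks at each slash."""
    blocks = []
    for chunk in text.split("/"):
        numbers = [int(token) for token in chunk.split()]
        if numbers:  # skip the empty chunk before the first slash
            blocks.append(numbers)
    return blocks


def count_in_range(block, low, width=25):
    """Count the numbers that fall in low .. low + width - 1."""
    return sum(low <= n < low + width for n in block)


blocks = split_blocks(SAMPLE)
for index, block in enumerate(blocks):
    low = 626 + 25 * index  # assumed start of this block's range
    print(len(block), "numbers,", count_in_range(block, low), "in range")
```

In the sample, each of the two complete blocks has 40 numbers, and in each all but one fall inside the expected range (the strays are 655 in the first block and 637 in the second).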

Reeds wrote the four 40-number blocks in four rows, one underneath the other, to see if there was any similarity in structure among the rows:

    644 650 629 650 645 635 646 636 632 646 . . .
    669 675 654 675 670 660 675 661 651 671 . . .
    694 700 679 700 695 685 696 686 632 696 . . .
    719 725 704 725 720 710 721 711 707 721 . . .

He found that, with few exceptions, a number in a given row is 25 greater than the corresponding number in the row above.
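The pattern can be checked directly on the ten numbers shown from each row. The sketch below is only an illustration of that comparison; the names `ROWS` and `row_differences` are made up here.

```python
# First ten numbers of each 40-number block, as printed in the article.
ROWS = [
    [644, 650, 629, 650, 645, 635, 646, 636, 632, 646],
    [669, 675, 654, 675, 670, 660, 675, 661, 651, 671],
    [694, 700, 679, 700, 695, 685, 696, 686, 632, 696],
    [719, 725, 704, 725, 720, 710, 721, 711, 707, 721],
]


def row_differences(upper, lower):
    """Subtract each number from the one directly beneath it."""
    return [below - above for above, below in zip(upper, lower)]


match_counts = []
for upper, lower in zip(ROWS, ROWS[1:]):
    diffs = row_differences(upper, lower)
    match_counts.append(sum(d == 25 for d in diffs))
    print(diffs)

print(match_counts)
```

Of the 30 differences in this excerpt, 25 are exactly 25; the five exceptions all sit in the seventh and ninth columns.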

"Although I still did not know that there was a cipher present," Reeds says, "it was clear from the emergence of this pattern that there was enough truth in my initial guesses about column reading and the importance of the number 25 to continue further."

"And if there were a cipher present, this finding would surely be due to the presence of four copies of an isolog: four copies of the same plain text encrypted in different but related ways," he continues. "If I knew how to read those parts of the text encoded with numbers in the range 626 through 650, I could probably use the same recipe to read those parts encoded with numbers in the range 651 through 675: Simply subtract 25 from each number and proceed as before."

Reeds went on to check how often each of the 25 different numbers of a row was used:

    626  1    631  0    636  1    641  1    646  2
    627  0    632  3    637  1    642  4    647  1
    628  0    633  2    638  3    643  1    648  2
    629  1    634  3    639  1    644  2    649  1
    630  1    635  2    640  1    645  1    650  4

The counts looked uneven enough to be consistent with a Latin or German text rather than just random outcomes.
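The tally for the first row can be reproduced with Python's standard `collections.Counter`. This is a sketch using the 40 numbers of the first block from the sample above; `BLOCK1` and `tally` are illustrative names, and the stray value 655 is left out because it lies outside the 626 to 650 range.

```python
from collections import Counter

# First 40-number block from the sample in the article.
BLOCK1 = [
    644, 650, 629, 650, 645, 635, 646, 636, 632, 646,
    639, 634, 641, 642, 649, 642, 648, 638, 634, 647,
    632, 630, 642, 633, 648, 650, 655, 626, 650, 644,
    638, 633, 635, 642, 632, 640, 637, 643, 638, 634,
]


def tally(block, low=626, high=650):
    """Count how often each value in low..high occurs, zeros included."""
    seen = Counter(n for n in block if low <= n <= high)
    return {value: seen[value] for value in range(low, high + 1)}


counts = tally(BLOCK1)
for value, count in counts.items():
    print(value, count)
```

The most frequent values, 642 and 650, each occur four times, while 627, 628, and 631 never occur.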

A little bit of experimentation revealed that a reversed 22-letter alphabet apparently fit the observed frequency distribution: 650 = A, 649 = B, and so on, through an alphabet consisting of the letters A, B, C, D, E, F, G, H, I, L, M, N, O, P, Q, R, S, T, U, X, Y, Z, along with three additional symbols beyond Z (arbitrarily labeled alpha, beta, and gamma).

Applied to the 40 letters of the first row, this guess yields: gazafrequenslibicosduyitca?[gamma]agotriumphos. That's certainly pronounceable, and it has hints of meaningful Latin words.
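This first guess is straightforward to replay in Python. The sketch below applies the provisional alphabet described above (650 = A, counting downward) to the first block of the sample. Numbers outside 626 to 650 are printed as a question mark, and the three extra symbols keep their placeholder names; `PROVISIONAL_KEY` and `decode` are illustrative names.

```python
# Reeds's provisional 22-letter alphabet plus three placeholder symbols.
SYMBOLS = list("abcdefghilmnopqrstuxyz") + ["[alpha]", "[beta]", "[gamma]"]

# 650 = a, 649 = b, ... down to 626 = [gamma].
PROVISIONAL_KEY = {650 - i: symbol for i, symbol in enumerate(SYMBOLS)}

# First 40-number block from the sample in the article.
BLOCK1 = [
    644, 650, 629, 650, 645, 635, 646, 636, 632, 646,
    639, 634, 641, 642, 649, 642, 648, 638, 634, 647,
    632, 630, 642, 633, 648, 650, 655, 626, 650, 644,
    638, 633, 635, 642, 632, 640, 637, 643, 638, 634,
]


def decode(block, key):
    """Map each number to its symbol; unknown numbers become '?'."""
    return "".join(key.get(n, "?") for n in block)


print(decode(BLOCK1, PROVISIONAL_KEY))
```

The single question mark comes from 655, the one number in the block that falls outside the 626 to 650 range.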

Additional clues helped Reeds unveil the scheme that Trithemius had used. For example, the symbol beta is really the common German letter sequence sch, and what Reeds had thought to be x is w. He then discovered that alpha is tz and gamma is th.

"One final piece of luck cinched the identification of the cipher alphabet," Reeds says. He performed an Internet search for the two-word phrase gaza frequens and came up with the Latin passage Gaza frequens Libycos duxit Carthago triumphos. . . . This confirmed that gamma is th and suggested that the letter Reeds had labeled as y is actually x.

The Book III ciphers turn out to be numerical substitution ciphers, with multiple numerical equivalents supplied for each plain text letter, Reeds concludes.

    Th  Sch Tz  Z   X   W   U   T   S   R   Q   P
    01  02  03  04  05  06  07  08  09  10  11  12
    26  27  28  29  30  31  32  33  34  35  36  37
    51  52  53  54  55  56  57  58  59  60  61  62
    76  77  78  79  80  81  82  83  84  85  86  87

    O   N   M   L   I   H   G   F   E   D   C   B   A
    13  14  15  16  17  18  19  20  21  22  23  24  25
    38  39  40  41  42  43  44  45  46  47  48  49  50
    63  64  65  66  67  68  69  70  71  72  73  74  75
    88  89  90  91  92  93  94  95  96  97  98  99  00
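As a rough illustration, the table can be turned into a small Python decoder. One assumption is made here: a three-digit number from the tables is matched to the two-digit entry with the same last two digits, which agrees with 650 = A in the account above. Since the table repeats every 25 entries, that amounts to taking the number's position within its block of 25. The names `equivalents`, `decode_number`, and `decode` are illustrative.

```python
# Column headings of the table, in order from entry 01 to entry 25.
HEADINGS = [
    "Th", "Sch", "Tz", "Z", "X", "W", "U", "T", "S", "R", "Q", "P",
    "O", "N", "M", "L", "I", "H", "G", "F", "E", "D", "C", "B", "A",
]

# First 40-number block from the sample in the article.
BLOCK1 = [
    644, 650, 629, 650, 645, 635, 646, 636, 632, 646,
    639, 634, 641, 642, 649, 642, 648, 638, 634, 647,
    632, 630, 642, 633, 648, 650, 655, 626, 650, 644,
    638, 633, 635, 642, 632, 640, 637, 643, 638, 634,
]


def equivalents(letter):
    """List the four two-digit table entries that stand for one letter."""
    first = HEADINGS.index(letter) + 1
    return [f"{(first + 25 * row) % 100:02d}" for row in range(4)]


def decode_number(n):
    """Assumption: the position within a block of 25 selects the letter."""
    return HEADINGS[(n - 1) % 25]


def decode(block, low, high):
    """Decode one block; numbers outside low..high are shown as '?'."""
    return "".join(
        decode_number(n).lower() if low <= n <= high else "?" for n in block
    )


print(equivalents("A"))
print(decode(BLOCK1, 626, 650))
```

With the corrected alphabet, the first block reads gazafrequenslibicosduxitca?thagotriumphos, close to the Latin line Reeds found, with the out-of-range 655 still marked as unknown.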

What did Trithemius go to so much trouble to encipher? The text turns out to be somewhat garbled, probably indicating that pieces have been lost over the years or were missing to start with. Thus, the text now available represents little more than a collection of isolated sentence fragments. Those fragments reveal nothing astonishing: mundane Latin and German phrases, including one that can be loosely translated as "the bringer of this letter is a bad rogue and a thief."

"Book III contains cryptograms," Reeds says. "Like those in Books I and II, they are disguised and presented in a context of angelic magic." However, the cryptographic technique is different because the letters are represented by numbers, disguised as astronomical data, instead of being hidden within a larger mass of letters.

Trithemius might have chosen angel language not to promote magic but as a ploy to capture the reader's interest. "If so," Reeds says, "he was vastly successful, even if he completely miscalculated how his book would be received."

In one final twist to this tale, it turns out that Reeds was not the first to reveal the ciphers in Book III of Trithemius's Steganographia. Thomas Ernst, now a professor of German at La Roche College in Pittsburgh, had resolved the problem several years earlier as a graduate student at the University of Pittsburgh. He had written a paper in German describing his solution, which was published in 1996 in the Dutch journal Daphnis, but apparently no one had paid much attention to it.

In 1676, an obscure figure named Wolfgang Heidel had also claimed that he had deciphered Book III, but his findings were disputed when he insisted on writing about the discovery in his own secret code. Ernst now strongly suspects that Heidel had actually figured out Trithemius's code.

The magic of Trithemius's cryptic work has finally evaporated in the face of determined code-breaking.