With one last code modification, you're in, and the contents of the vault are yours! Cracking Codes with Python is not quite about breaking into banks or pulling off elaborate heists, but it's always fun to dream.
Author's note: This was originally published on EH-Net, so definitely give them a visit! I have done some editing for spelling, grammar, and readability for this version. That said, the content itself should match the original review.
Cracking Codes with Python - Introduction
Cracking Codes with Python by Al Sweigart is a newly published (January 2018), 424-page book from No Starch Press that bills itself as "An Introduction to Building and Breaking Ciphers." You can pick up a copy of this book (ISBN-13: 978-1-59327-822-9) from No Starch for $29.95 with a free eBook. Al Sweigart is a professional software developer who teaches programming to kids and adults. He is the author of Automate the Boring Stuff with Python, Invent Your Own Computer Games with Python, and Scratch Programming Playground, also from No Starch Press. His other books are also freely available under a Creative Commons license on his website https://inventwithpython.com/.
This book serves as an introduction to Python, as well as programming in general. The only prerequisites for this book are a computer that can run Python and the desire to learn.
Before I dive into my review, I'd like to cover who I am, what I do, and my interest in this book. Hopefully this will explain my existing strengths and weaknesses going into the book. Additionally, it might help you decide if this book is for you. I am currently a Principal Penetration Testing Consultant for Secureworks, and have been pentesting for over 5 years. I was previously a developer, though I never wrote any production-level Python (plenty of scripting though). I've also had plenty of CTF experience, so I at least have a vague interest (and some experience) in cryptography. That said, I am far from an expert in cryptography, and have never taken any training or coursework in it.
Instead of a rote chapter by chapter summary, I'd like to organize this review a bit differently. I will break each major section up into parts, and go over what the section covers. After I cover the book, I'll go over some of the applications that you'll develop over the course of it. I'll also be including a link to my versions of the applications in my GitHub. Finally, I'll wrap up the book in its entirety and add any parting words or suggestions.
Cracking Codes with Python - Details
The Introduction and Chapter 1 go over a few great topics, and build up what the book is really about. The author covers a few basic cryptosystems, as well as an introduction to cryptography in general. I really like the notes about former export laws and the RSA encryption scheme being included. This is important for readers to understand how far cryptography, as well as computer law, have come since the 1990s. The paper cryptography tools section was fun, though I wasn't the biggest fan of the included virtual cipher wheel. It might have been nice to cover one more manual cipher, but I understand that other ciphers are covered later on.
Chapters 2 and 3 are purely an introduction to Python. I'll be honest, when I first glanced at the table of contents, I was a bit skeptical that any Python could really be covered in the 28 pages of these two chapters. That said, the author does a great job of introducing basic concepts, and building on them piece by piece. I would have preferred 3 chapters, with the IDLE/Hello World chapter being a bit longer, but that's probably personal preference. By the end of these chapters, most neophytes should have a basic understanding of Python and what is to come. All of the following chapters will build upon the knowledge of these chapters. With that in mind, this section is not intended as a fully fledged Python tutorial.
After the introduction to Python, it is time for some more ciphers. Chapters 4 through 8 follow a similar structure to each other. They introduce a new cipher, build on the previous Python knowledge, and then demonstrate how to break the cipher. The Caesar cipher was a great first choice, as the reader should already be familiar with it from Chapter 1. The book was clearly designed for reading the chapters in order, especially for readers who are learning Python. That said, this is a hands-on guide for beginners, so more experienced developers might find the early chapters a bit boring outside of the encryption/decryption algorithms. I really like the Summary at the end of the chapters, as it reinforces the covered concepts to new developers.
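To make the "introduce, implement, break" pattern concrete, here is a minimal Python 3 sketch of a Caesar cipher and a brute-force attack against it. This is my own illustration, not the book's code: the names SYMBOLS, caesar, and brute_force are hypothetical, and the symbol set is limited to uppercase letters for brevity.

```python
import string

# Hypothetical symbol set for illustration; the book's own set may differ.
SYMBOLS = string.ascii_uppercase


def caesar(message, key, decrypt=False):
    """Shift every character found in SYMBOLS by key positions."""
    shift = -key if decrypt else key
    out = []
    for ch in message:
        idx = SYMBOLS.find(ch)
        if idx == -1:
            out.append(ch)  # leave characters outside SYMBOLS untouched
        else:
            out.append(SYMBOLS[(idx + shift) % len(SYMBOLS)])
    return "".join(out)


def brute_force(ciphertext):
    """Return one candidate plaintext per possible key."""
    return {key: caesar(ciphertext, key, decrypt=True)
            for key in range(len(SYMBOLS))}


if __name__ == "__main__":
    secret = caesar("ATTACK AT DAWN", 13)
    for key, guess in brute_force(secret).items():
        print(key, guess)
```

With only 26 keys, a person can read every candidate and spot the plaintext by eye, which is why the Caesar cipher makes a gentle first target.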
Chapter 9 strays from the formula slightly, but I think this chapter is great. While it doesn't necessarily mention unit testing or TDD, it does say that the reader is developing automated testing. I think that this is a perfect topic to cover at this point in the Python tutorial. This also verifies that the reader copied or coded the previous two programs correctly. I even learned about the random.SystemRandom class, which I've never used before.
Chapters 10 through 12 build on the Python knowledge with a slightly different formula. Chapter 10 covers file I/O, which is very important for new developers to understand. Additionally, it allows readers to see the usefulness of the encryption programs that they've built, by encrypting a file.
Chapter 11 builds on that knowledge in another Python-centric chapter. It walks the reader through a class that detects whether or not a string is English based on some minimum word and letter requirements. The detectEnglish class is nice to have, and I'll probably start using it during some of my CTFs.
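As a rough idea of how such a check can work, here is a minimal Python 3 sketch. It is not the book's detectEnglish code: the function names and the threshold defaults (20% words, 85% letters) are my own assumptions, and a real run would load a full word list rather than the tiny set shown in the usage note.

```python
def strip_non_letters(message):
    """Keep only letters and whitespace."""
    return "".join(ch for ch in message if ch.isalpha() or ch.isspace())


def english_word_ratio(message, english_words):
    """Fraction of whitespace-separated words found in english_words."""
    candidates = strip_non_letters(message).upper().split()
    if not candidates:
        return 0.0
    matches = sum(1 for word in candidates if word in english_words)
    return matches / len(candidates)


def is_english(message, english_words,
               min_word_ratio=0.2, min_letter_ratio=0.85):
    """Assumed thresholds: enough real words AND mostly letters/spaces."""
    if not message:
        return False
    letter_ratio = len(strip_non_letters(message)) / len(message)
    return (english_word_ratio(message, english_words) >= min_word_ratio
            and letter_ratio >= min_letter_ratio)
```

For example, with english_words set to {"THE", "QUICK", "BROWN", "FOX"}, the call is_english("The quick brown fox!", english_words) passes both checks, while a string of random symbols fails the word check.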
Finally, Chapter 12 is a great one, as it introduces the first real cryptanalysis technique. Note that I was unable to get the script from Chapter 12 to work initially, because isEnglish was returning false even with the original plaintext. The cause was that each of my dictionary keys ended in a carriage return. Once I fixed the file for my Mac (cat dictionary.txt | sed "s/$(printf '\r')\$//" > dictionary-fixed.txt), I was good to go. This chapter did not introduce a lot of new Python, which allows the reader to focus on how they can perform an attack against the Transposition Cipher using the detectEnglish class.
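The same line-ending problem can also be handled on the Python side instead of rewriting the file. The following is a minimal Python 3 sketch under that assumption; load_dictionary is a hypothetical helper name, not a function from the book.

```python
def load_dictionary(path):
    """Read one word per line, tolerating Windows-style line endings."""
    words = set()
    # newline="" disables newline translation, so "\r\n" reaches us as-is
    # and is removed explicitly by strip() below.
    with open(path, "r", newline="") as handle:
        for line in handle:
            word = line.strip()  # drops "\n", "\r\n", and stray spaces
            if word:
                words.add(word.upper())
    return words
```

Stripping each line when loading means the lookup keys never carry a trailing carriage return, regardless of which platform produced the file.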
Cracking Codes with Python - Chapters 13-18
Chapters 13 through 18 are similar to the earlier chapters. They continue to introduce a new cipher, build on the crypto and Python knowledge, and then break the cipher. It was handy how the author only briefly mentioned the Euclidean algorithms. The reader is then allowed to do more research on their own. I was also able to slightly increase my affine cipher algorithm's security during this chapter. I used all of the ASCII characters between 32 and 127 for my SYMBOLS. This increases the key possibilities from 1320 to 6840, though this is still fairly trivial to brute-force.
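The key counts above can be reproduced with a short calculation. This Python 3 sketch assumes that a valid affine key is a pair (A, B), where A must be relatively prime to the symbol set size and B can be any position in the symbol set. It also assumes a 66-character symbol set for the original count and the 95 printable ASCII characters (codes 32 through 126) for the larger one. The function name is my own.

```python
from math import gcd


def affine_key_count(symbol_count):
    """Count (A, B) pairs where A is coprime to the symbol set size."""
    valid_a = sum(1 for a in range(1, symbol_count)
                  if gcd(a, symbol_count) == 1)
    return valid_a * symbol_count


if __name__ == "__main__":
    printable_ascii = "".join(chr(code) for code in range(32, 127))
    print(affine_key_count(66))                    # 1320
    print(affine_key_count(len(printable_ascii)))  # 6840
```

Both totals are small enough that trying every key takes a laptop well under a second, which is the point made above.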
While the affine cipher isn't much more secure than the Caesar cipher, it was still fun to learn about it. Chapter 17 was another crypto attack, and the first that didn't use any manner of brute forcing. I knew that substitution ciphers could easily be solved using online tools etc., but it was nice to walk through it step by step and cover why a brute-force attack is infeasible. Finally, the Vigenère chapter introduces a cipher that cannot be defeated by brute-force or word pattern analysis!
Before I go any further on the Vigenère cipher, I also wanted to mention the string concatenation notes. Not only will it depend on your Python version, but there is an even faster method. In Python 2.7.13, using stringTest.py, I got a time of 20.535 seconds with concatenation, 20.188 seconds with list concatenation, and 17.540 seconds with an inline list comprehension. For more information, you can visit the following article.
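For readers who want to repeat this kind of comparison, here is a minimal Python 3 sketch of the three approaches. It is not the book's stringTest.py: the function names, the workload, and the repeat counts are my own assumptions, and the timings will differ by machine and Python version.

```python
import timeit


def concat_strings(n):
    """Build the result with repeated string concatenation."""
    result = ""
    for i in range(n):
        result += str(i)
    return result


def append_and_join(n):
    """Collect pieces in a list, then join once at the end."""
    parts = []
    for i in range(n):
        parts.append(str(i))
    return "".join(parts)


def comprehension_join(n):
    """Join an inline list comprehension."""
    return "".join([str(i) for i in range(n)])


if __name__ == "__main__":
    for func in (concat_strings, append_and_join, comprehension_join):
        seconds = timeit.timeit(lambda: func(10000), number=20)
        print(func.__name__, round(seconds, 3))
```

All three functions return the same string, so any difference in the printed numbers comes purely from how the string is assembled.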
The Vigenère cipher was a fun one, and it was great to see how such a "simple" change to the Caesar cipher made it "unbreakable" for so long.
Chapters 19 and 20 cover frequency analysis, and applying this technique to break the Vigenère cipher. Chapter 19 was a great walkthrough for the technique. It also presented a much cleaner solution than I would have tried for my first attempt. I probably would have gone through the message letter by letter, counted the character counts, and compared them to an English frequency order manually. The dictionary attack was similar to the others, and fairly straightforward. Beyond a dictionary attack, I never actually knew how a substitution cipher could be broken.
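To give a feel for the technique, here is a minimal Python 3 sketch of letter frequency scoring. It is my own simplification rather than the book's code: the ETAOIN string is a commonly cited English frequency ordering, the 0 to 12 score counts how many of the six most common and six least common English letters land in the same group for the message, and breaking ties in reverse ETAOIN order is an assumption on my part.

```python
import string
from collections import Counter

# Commonly cited English letter ordering, most to least frequent.
ETAOIN = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def frequency_order(message):
    """Return all 26 letters sorted from most to least frequent."""
    counts = Counter(ch for ch in message.upper()
                     if ch in string.ascii_uppercase)
    # Ties (including letters that never appear) fall back to reverse
    # ETAOIN order, an assumption made for this sketch.
    return "".join(sorted(string.ascii_uppercase,
                          key=lambda l: (-counts[l], -ETAOIN.index(l))))


def match_score(message):
    """Score 0-12: overlap with English's top six and bottom six letters."""
    order = frequency_order(message)
    score = sum(1 for letter in ETAOIN[:6] if letter in order[:6])
    score += sum(1 for letter in ETAOIN[-6:] if letter in order[-6:])
    return score
```

A candidate decryption that scores high is more likely to be English, which lets a program rank guesses without a human reading each one.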
The author does a great job walking the reader through Kasiski elimination, which really is a cool technique. Note that this is a much longer script than any of the previous chapters. Be sure to follow along with the source code and the text, if you'd like to fully understand it. I also really liked how the author covered the technique in-depth before showing any Python code. This definitely helped me to have a basic understanding of the attack beforehand. It might also help to have the script open while going through the chapter. It is pretty easy to get lost while the author walks through it.
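The first half of the technique, guessing the key length, fits in a short sketch. The Python 3 code below is my own reduced illustration and is far shorter than the book's script: it finds repeated three-letter sequences, measures the spacing between repeats, and tallies which factors divide those spacings. The function names and the max_key_len default are hypothetical.

```python
from collections import Counter


def repeated_sequence_spacings(ciphertext, seq_len=3):
    """Distances between repeated letter sequences of length seq_len."""
    letters = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    positions = {}
    for i in range(len(letters) - seq_len + 1):
        positions.setdefault(letters[i:i + seq_len], []).append(i)
    spacings = []
    for indexes in positions.values():
        for a in range(len(indexes)):
            for b in range(a + 1, len(indexes)):
                spacings.append(indexes[b] - indexes[a])
    return spacings


def likely_key_lengths(ciphertext, max_key_len=16):
    """Rank candidate key lengths by how many spacings they divide."""
    factor_counts = Counter()
    for spacing in repeated_sequence_spacings(ciphertext):
        for factor in range(2, max_key_len + 1):
            if spacing % factor == 0:
                factor_counts[factor] += 1
    return [factor for factor, _ in factor_counts.most_common()]
```

The intuition is that a repeated ciphertext sequence often means the same plaintext was encrypted at the same key position, so the true key length tends to divide the spacing. Each candidate length can then be attacked with frequency analysis.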
Chapter 21 covers the One-Time Pad Cipher, which is a slight modification of the Vigenère cipher that makes it unbreakable. While the one-time pad wasn't covered as a script, I have included it along with my scripts as a basic example. Although this was a shorter chapter, it was definitely worth covering, especially if the reader is hoping to actually use any of the encryption methods in this book. I am not sure if I will use the one-time pad for anything personally. That said, I can definitely think of a few CTF challenges that could revolve around it!
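As a basic illustration of the idea, here is a minimal Python 3 sketch that treats the one-time pad as a Vigenère cipher whose key is random and as long as the message. This is not the book's code or the script from my repository: the names generate_pad and vigenere are hypothetical, and using the secrets module for key generation is my own choice.

```python
import secrets
import string

LETTERS = string.ascii_uppercase


def generate_pad(length):
    """Random uppercase key; must be used for one message only."""
    return "".join(secrets.choice(LETTERS) for _ in range(length))


def vigenere(message, key, decrypt=False):
    """Vigenere shift that consumes one key letter per message letter."""
    letter_count = sum(1 for ch in message if ch.upper() in LETTERS)
    if len(key) < letter_count:
        raise ValueError("pad must be at least as long as the message")
    out = []
    key_index = 0
    for ch in message:
        pos = LETTERS.find(ch.upper())
        if pos == -1:
            out.append(ch)  # pass through spaces and punctuation
            continue
        shift = LETTERS.index(key[key_index])
        if decrypt:
            shift = -shift
        new = LETTERS[(pos + shift) % len(LETTERS)]
        out.append(new if ch.isupper() else new.lower())
        key_index += 1
    return "".join(out)


if __name__ == "__main__":
    message = "Meet me at the vault"
    pad = generate_pad(len(message))
    secret = vigenere(message, pad)
    print(secret, vigenere(secret, pad, decrypt=True))
```

The security rests entirely on the pad being truly random, at least as long as the message, and never reused. Reusing it turns the scheme back into an ordinary, breakable Vigenère cipher.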
Cracking Codes with Python - Chapters 22-24
Chapters 22 and 23 build up to the RSA algorithm. The prime number testing and generation was a good start, and I had personally never seen the Rabin-Miller algorithm before. I have used the Sieve of Eratosthenes for Project Euler problem #3 though.
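For anyone else who has not seen it, here is a minimal Python 3 sketch of a Rabin-Miller style primality test. It is my own version, not the book's: the function name, the list of small primes used as a shortcut, and the default of 20 rounds are assumptions. The test is probabilistic, so a True result means "almost certainly prime" rather than a proof.

```python
import random


def is_probable_prime(n, rounds=20):
    """Rabin-Miller test; False is definite, True is highly likely."""
    if n < 2:
        return False
    # Quick trial division by a few small primes (an optional shortcut).
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    rng = random.SystemRandom()
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue  # this base does not expose n as composite
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

Each round that passes cuts the chance of a composite slipping through by at least a factor of four, so 20 rounds leaves a negligible error probability for practical purposes.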
Chapter 24 covers "textbook" RSA, and it does so in a fairly straightforward way. I had a few questions that weren't answered until the middle of the chapter. That may have been due to my thought process/personal preference though. I've used a few tools to encrypt/decrypt or "attack" messages encrypted with RSA before. With that in mind, I now have a better understanding of the algorithm and how it works. I am glad that the author mentions not rolling your own cryptography at the end, but mentioning it more than once might be nice. There were a few things that I wish the author had covered in this chapter though, such as:
- How does the algorithm weaken if either p or q isn't actually prime?
- What attacks could be performed if an attacker discovers either p or q (common in CTFs)?
- How to perform a brute-force attack against a much smaller key-size.
- What are some of the "advanced techniques" that cryptographers use to break the algorithm demonstrated?
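To illustrate the second question in that list, here is a minimal Python 3.8+ sketch of why a leaked prime is fatal for textbook RSA. It is my own toy example, not something from the book: the function name is hypothetical, the numbers are deliberately tiny, and it works on a single integer rather than the block format the book's program uses.

```python
def recover_private_exponent(n, e, p):
    """Rebuild d from the public key (n, e) and one leaked prime p."""
    if n % p != 0:
        raise ValueError("p does not divide n")
    q = n // p                  # the other prime falls out immediately
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)      # modular inverse (Python 3.8+)


if __name__ == "__main__":
    n, e = 3233, 17             # toy public key built from 61 * 53
    d = recover_private_exponent(n, e, p=61)
    ciphertext = pow(65, e, n)  # "encrypt" the number 65
    print(d, pow(ciphertext, d, n))
```

Once either prime is known, a single division reveals the other, and the private exponent follows from one modular inverse. That is why CTF challenges that leak p or q are usually solved in a few lines.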
Cracking Codes with Python - My Contributions
I've also released my versions of the scripts, so you can see what you will be creating during the book. You can find them in my GitHub repository, so please feel free to take a look, copy, change, or share! Note that I am using Python 2.7, and have some experience in Python. This means that the scripts will not match the book exactly. Additionally, some of the examples from the book won't work exactly in my scripts. This is because they either require more input, or allow more symbols. Note that I have modified any of the files that begin with a number, and I have accounted for this in the import statements. I tried not to mess with the "library" modules such as cryptoMath, so those are exactly the same.
Note that a number of the scripts contain duplicated code. In the future, it might be worthwhile to add logic for selection to a master class. For example, the transpositionCipherHacker and affineCipherHacker are almost exactly the same, with the only real difference being the imported classes and references. I may do this in the future for my versions of the scripts. For now, I leave this as an exercise to the reader.
Cracking Codes with Python - Final Thoughts
If you are new to Python, or development in general, then I highly recommend you follow the author's suggestion of reading through the book beginning to end. That said, if you already have Python experience, then I'd suggest a slightly different approach to the book. In that case, I would start by reading the beginning of the chapter. Once you get the hang of what the algorithm does, and how it works, I'd move on to the code. If you can understand how the script works, and what it does, then move on to the next chapter. If not, you can follow along with the author as he steps through it.
In conclusion, this book was definitely worth the read, even as an experienced Python developer. I learned more about cryptography, and even a few new Python tricks. It was also really cool to see scripts that I wrote actually encrypt/decrypt something. I even found a few errors in the book that I submitted to the author! It would be great to see a sequel to this book, covering more "advanced" algorithms such as RC4, DES, AES, etc. I understand that it might not fit the author's theme, as that will likely be more about math and cryptography.
Cracking Codes with Python - Real Life Example
Finally, I also have a funny note about the following quote from the author: "The ciphers in this book (except for the public key cipher in Chapters 23 and 24) are all centuries old, but any laptop has the computational power to hack them. No modern organizations or individuals use these ciphers anymore, but by learning them, you’ll learn the foundations cryptography was built on and how hackers can break weak encryption." This isn't necessarily true, as I've seen a Caesar cipher used to encrypt sensitive data in a production environment.