Hello and welcome to NCC Group’s Cryptopals guided tour! This post is the first in a series of eight installments covering the solutions to the Cryptopals Crypto Challenges. These have been a long time coming, and we’re excited to finally start bringing them to you.
For those who don’t know, Cryptopals is a series of eight sets of challenges covering common cryptographic constructs and how they can be attacked. You can read more about Cryptopals at https://cryptopals.com/.
There’s a lot of practical knowledge wrapped up in these challenges, and working through them is an excellent way for programmers to learn more about cryptography – or for cryptographers to learn more about programming. We strongly encourage you to give them a try and to see how far you can get on your own.
The series is meant as something you can check after finishing a challenge or set. It’s an opportunity to see how else you might’ve solved the challenges, to check for anything you might’ve missed, and to see the sights surrounding each problem. This is what we mean by “guided tour”: more than just a walkthrough, it’s a walk through the scenic route!
Of course, we can’t deliver on that promise through code alone. We tried writing up blog posts, but they started to run long and ended up looking more intimidating than they really were.
The format we finally settled on is screencasts: for each challenge, we’re releasing a recording of yours truly (Eli Sohl from NCC Group’s Cryptography Services practice) solving the challenge live and narrating the process from start to finish. Each video comes with a timestamped index of content, plus a list of links for further reading. We’ll release these videos one set at a time, starting with this post which covers the first set of challenges.
So, without further ado, here are the videos for Cryptopals Set 1. We hope you find them helpful, and we look forward to sharing the videos for the following challenge sets as soon as they’re ready.
Oh, and by the way: if you just want to see the finished solution code, you can find that here.
Set 1, Challenge 1: Convert hex to base64
Challenge link: https://cryptopals.com/sets/1/challenges/1
00:00 – Intro
01:10 – Exploring options for hex-decoding in an interpreter
02:00 – Introducing b16decode
03:15 – Writing the script’s main logic
04:00 – Format string f”{expr=}” syntax
04:50 – Testing the solution
05:00 – Cleaning up the solution by refactoring
06:00 – Adding a __name__ guard (and explaining why we need one)
07:00 – Adding an explicit output check and testing the solution again
Further reading:
https://www.ietf.org/rfc/rfc4648.txt
https://docs.python.org/3/library/base64.html
https://docs.python.org/3/reference/import.html?highlight=name#name__
Set 1, Challenge 2: Fixed XOR
Challenge link: https://cryptopals.com/sets/1/challenges/2
00:00 – Intro
01:00 – Iterating over two bytestrings in tandem
01:50 – Iterating by index
02:45 – Iterating by zip
03:40 – How to benchmark a function in iPython; benchmarking index and zip methods
04:50 – Implementing xor (first method: appending to bytestring)
06:48 – Implementing xor (second method: appending to list)
07:20 – Implementing xor (third method: joining bytes from a generator expression)
08:16 – Implementing xor (fourth method: calling bytes() on a generator expression)
08:47 – Benchmarking xor implementations
10:10 – Starting the solution script
10:55 – Wrapping two-argument bytes_xor, adding more features
13:47 – When to use assertions in Python (and when not to)
15:35 – Writing the main block
17:00 – Testing the solution
Further reading:
https://en.wikipedia.org/wiki/Exclusive_or#Computer_science
https://docs.python.org/3/library/functions.html#zip
https://www.python.org/dev/peps/pep-0289/
https://docs.python.org/3/glossary.html#term-generator
https://docs.python.org/3/using/cmdline.html#cmdoption-O
https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit
Set 1, Challenge 3: Single-byte XOR cipher
Challenge link: https://cryptopals.com/sets/1/challenges/3
00:00 – Intro
00:35 – Describing frequency analysis
01:10 – Downloading a book from Project Gutenberg and counting character frequencies
03:20 – Looking at frequency counts by character set
04:10 – Starting the solution script, talking about Black (“The uncompromising Python code formatter”)
04:45 – Writing a function to score a plaintext’s plausibility by its frequency counts
06:58 – Writing the XOR cipher cracking function (first version)
07:55 – How (and why) to use infinity in Python
10:30 – Writing crack_xor_cipher_worse so we can view top 5 candidate plaintexts
11:20 – Testing the cracking function
12:05 – Talking about ASCII table layout
13:37 – Talking about sample sizes, swapping out frequency count dictionary
15:20 – Modifying “score_text” to only count lowercase frequencies
15:45 – Solving the challenge, starting to refactor
16:20 – Introducing dataclasses and creating ScoredGuess dataclass
20:30 – Rewriting crack_xor_cipher to use ScoredGuess
23:12 – Optimizing the scoring algorithm
27:30 – Testing and benchmarking the optimized algorithm (it works really well!)
Further reading:
https://en.wikipedia.org/wiki/Substitution_cipher
https://en.wikipedia.org/wiki/Frequency_analysis
https://www.python.org/dev/peps/pep-0008/
https://github.com/psf/black
https://en.wikipedia.org/wiki/Etaoin_shrdlu
https://en.wikipedia.org/wiki/ASCII#Design_considerations
https://docs.python.org/3/library/dataclasses.html
https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit
Set 1, Challenge 4: Detect Single-Character XOR
Challenge link: https://cryptopals.com/sets/1/challenges/4
00:00 – Intro
00:30 – Explaining the solution strategy
01:00 – Starting the solution script
02:05 – Writing the search loop and solving the challenge
02:50 – Cleaning up the script
03:25 – Explaining how the “progress meter” works
05:23 – Retesting the solution
Further reading:
https://en.wikipedia.org/wiki/Distinguishing_attack
Set 1, Challenge 5: Implement repeating-key XOR
Challenge link: https://cryptopals.com/sets/1/challenges/5
00:00 – Intro
00:35 – Starting the solution script and explaining how it will work
01:00 – Talking about itertools
02:10 – Writing a main block (at hyperspeed)
02:25 – Talking about string literal concatenation
03:55 – Testing the solution
Further reading:
https://docs.python.org/3/library/itertools.html
https://docs.python.org/3/glossary.html#term-slice
https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher#Kasiski_examination
https://docs.python.org/3/reference/lexical_analysis.html#string-literal-concatenation
Set 1, Challenge 6: Break repeating-key XOR
Challenge link: https://cryptopals.com/sets/1/challenges/6
00:00 – Intro
00:45 – High-level overview of solution algorithm
01:30 – Talking about Hamming distance and Hamming weight
03:10 – Implementing Hamming weight (first method: converting to string, counting “1”s)
03:57 – Implementing Hamming weight (second method: checking lowest bit and shifting)
05:38 – Implementing Hamming weight (third method: fancy in-place digit sum)
12:10 – Implementing Hamming weight (fourth method: using subtraction to zero each 1 bit)
15:17 – Implementing Hamming weight (fifth method: lookup table)
16:02 – Implementing Hamming distance in terms of XOR and Hamming weight
16:38 – Benchmarking Hamming weight functions
18:40 – Starting the solution script, copying in Hamming distance and weight functions
20:30 – Moving on to guessing key size
21:10 – Explaining why this method for guessing key size works (Kasiski examination)
22:35 – “That’s enough talking, let’s get to actually writing this thing”
24:35 – Cutting a corner by generating transpositions of blocks using slice syntax
26:00 – Cracking each transposed cipher and combining the results
27:00 – Writing the main block
29:30 – Testing the solution
Further reading:
https://en.wikipedia.org/wiki/Hamming_weight
https://en.wikipedia.org/wiki/Hamming_distance
https://en.wikipedia.org/wiki/Hacker%27s_Delight
https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher#Kasiski_examination
Set 1, Challenge 7: AES in ECB mode
Challenge link: https://cryptopals.com/sets/1/challenges/7
00:00 – Intro
01:10 – Discussing options: “Cryptography” library
01:35 – Discussing options: Pure Python implementation
02:05 – Discussing options: CryptoHack’s AES challenges
02:35 – Discussing options: “pycrypto” library
03:08 – Discussing options: “pycryptodome” library
03:33 – Creating a requirements.txt file and a virtual environment
04:20 – Starting the solution script
04:58 – Starting the main block
05:28 – Why don’t we always use AES-ECB?
07:05 – Testing the solution
Further reading:
https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Electronic_codebook_(ECB)
https://github.com/pycrypto/pycrypto/issues/176
https://www.pycryptodome.org/en/latest/
https://www.pycryptodome.org/en/latest/src/cipher/aes.html
https://cryptohack.org/
Set 1, Challenge 8: Detect AES in ECB mode
Challenge link: https://cryptopals.com/sets/1/challenges/8
00:00 – Intro
00:30 – Describing the solution strategy, showing the famous penguin picture
01:50 – Implementing bytes_to_chunks
04:35 – Starting the main block
05:50 – Deduplicating blocks by converting to set
07:38 – Testing the solution
Further reading:
https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Electronic_codebook_(ECB)
https://en.wikipedia.org/wiki/Distinguishing_attack
Thank you!
Before wrapping up this post, I’d like to take a moment to thank Parnian Alimi and Marie-Sarah Lacharite for their valuable feedback and detailed code review.
I would also like to thank the authors of the Cryptopals challenges. I’ve never met any of them, but I’ve spent a lot of time with their work and I appreciate the effort they’ve put into it.