Post

PicoCTF 2023 - PowerAnalysis: Part 1

Here is the description of the challenge:

Description

Before this I knew nothing about cryptography basically, and had no idea what power analysis was so I knew this is going to be a hard one. So I did what everyone would do in this case, looked up power analysis, what it is, how it works.

One of the things I found was this link, this pretty much explains what power analysis is, basically, when encrypting something with AES, the power consumption is different when it does different tasks, and we can figure out what keys it uses for encryption. This requires us to have a lot of data. In any other case, we would need an oscilloscope, but in this case, we can just retrieve data from the instance we can launch. So, we have two tasks

  • retrieve enough data for power analysis
  • somehow do power analysis

If we want to do power analysis, we need to figure out in what format we have to save the data, and since there is no way I’m writing a script myself for power analysis, let’s find one, hopefully in python.

I found this link which is a writeup of another CTF, containing a script named power.py, which does exactly what we need, here is how it looks:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
# source: https://github.com/newaetech/chipwhisperer-jupyter/blob/master/courses/sca101/SOLN_Lab%204_2%20-%20CPA%20on%20Firmware%20Implementation%20of%20AES.ipynb

import numpy

sbox = [
	0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76, 
	0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0, 
	0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15, 
	0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75, 
	0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84, 
	0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf, 
	0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8, 
	0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2, 
	0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73, 
	0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb, 
	0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79, 
	0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08, 
	0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a, 
	0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e, 
	0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf, 
	0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
]

def aes_internal(inputdata, key):
	return sbox[inputdata ^ key]

HW = [bin(n).count("1") for n in range(0, 256)]

def mean(X):
	return numpy.sum(X, axis=0)/len(X)

def std_dev(X, X_bar):
	return numpy.sqrt(numpy.sum((X-X_bar)**2, axis=0))

def cov(X, X_bar, Y, Y_bar):
	return numpy.sum((X-X_bar)*(Y-Y_bar), axis=0)

textin_array = numpy.load("textin_array.npy")
trace_array = numpy.load("trace_array.npy")

t_bar = numpy.sum(trace_array, axis=0)/len(trace_array)
o_t = numpy.sqrt(numpy.sum((trace_array - t_bar)**2, axis=0))

key = [0] * 16
for bnum in range(0, 16):
	maxcpa = [0] * 256
	for kguess in range(0, 256):
		hws = numpy.array([[HW[aes_internal(textin[bnum],kguess)] for textin in textin_array]]).transpose()
		hws_bar = mean(hws)
		o_hws = std_dev(hws, hws_bar)
		correlation = cov(trace_array, t_bar, hws, hws_bar)
		cpaoutput = correlation/(o_t*o_hws)
		maxcpa[kguess] = max(abs(cpaoutput))
	key[bnum] = numpy.argmax(maxcpa)
	print(f"found byte {bnum}")

print()
print(f"key: {''.join(f'{b:02x}' for b in key)}")

Kinda a lot to take in, but we don’t have to understand everything to make this work. What we need is somehow figure out how to make our data so it can be used by this script. We can see that is uses numpy which is a python library used for quick access of data for example. Here is the line we are concerned about:

1
2
textin_array = numpy.load("textin_array.npy")
trace_array = numpy.load("trace_array.npy")

So, let’s download the two files in the example and let’s see what kind of data is it, let’s try using cat:

1
2
3
4
❯ cat trace_array.npy  
�NUMPYv{'descr': '<i8', 'fortran_order': False, 'shape': (500, 2666), }                                                     
�rfS2MEX7Wml[aW^5VU^K<(!"984#*\uC@XSO,ISxcdMvW\<bj����tZ:5>JA,)EZ_Ugt�SonlQ'VIJ&(6JD`hgU2RDON`vhnM]Ti]>LYzmrbJTUmgylYO=QV}U?1c[f;@:U^=6XbN@by�WZNR6 ?i��^:1<lqp[;.5@fT{Ra[svar�sfGcLmfu^W]Snz��x]=42-9QN<@Y���\KTbp���vpa~���~g>On��j�gvQzZmiol]z���hZDVl�lR<JM]reePgt��T9>q�}TTVW9Xjr?%"&15NP`h�p|~���������������������������������������������������������������������������������������������������������zs��������v�������z������������������������z����������|������������������`Qo����������l���|����������������x�������������������������� 
[...]

Well, this is not really useful. Let’s try loading it and printing it out with python instead.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
 python3               
Python 3.10.10 (main, Mar  5 2023, 22:26:53) [GCC 12.2.1 20230201] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> trace_array = numpy.load("trace_array.npy")
>>> print(trace_array)
[[143 127 114 ... 250 227 228]
 [103  66  40 ... 259 271 194]
 [142 139 116 ... 141 200 244]
 ...
 [105  89 107 ... 207 208 231]
 [ 70  72  71 ... 251 227 273]
 [116 127 106 ... 199 226 236]]

npy files are actually so called 2 dimensional arrays, when we ask python to print it, it doesn’t print the whole thing, it only prints the starts and ends of the arrays, and only the first and last 3 ones. So, what did we learn with this? well, we will need to make a 2d array and convert it to npy somehow.

So this is useful, but we still need to somehow get the data that we can convert someohw, let’s look at what our instance does:

1
2
❯ nc saturn.picoctf.net 50306
Please provide 16 bytes of plaintext encoded as hex: 

Pretty straigh forward, let’s see what it does if we do this:

1
2
3
❯ nc saturn.picoctf.net 50306
Please provide 16 bytes of plaintext encoded as hex: 61616161616161616161616161616161
power measurement result:  [88, 70, 97, 106, 146, 159, 139, 112, 106, 122, 142, 146, 131,      ...      207, 156, 162, 169, 132, 143, 187, 239, 280, 312, 299, 291, 264, 228, 233, 241, 272, 227, 225, 193, 221, 242, 263, 263, 181, 207, 188, 204, 159, 197, 212, 215]

we can see it returns a loooong array of numbers, (the dots are included by me, the returned array is much longer).

Now that we know how we get the data and what we have to turn it into, let’s write a script:

1
2
3
4
5
6
import random
import string
import re
import numpy

from pwn import *

Let’s import everything we might need, we will see where we use these things. We could use sockets to connect to the server but it’s easier with pwn.

We have 2 arrays that we will need, one for the input text, and one for the array that we get:

1
2
main_trace_array = []
main_input_array = []
1
2
3
letters = string.ascii_letters
ip = "saturn.picoctf.net"
port = int(input("Port: "))

first we get all the letters that we will randomly choose from for out inputs, then, since the ip is always the same we can just hardcode the domain name there, and since the port is changing we can ask that from the user to input.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
for _ in range(0, 500):

    rand_string = ''.join(random.choice(letters) for i in range(16))

    int_list = []
    for x in rand_string:
        int_list.append(int(ord(x)))

    conn = remote(ip, port)
    conn.recv()

    rand_string = rand_string.encode("utf-8").hex()
    bytes_string = bytes(rand_string, 'utf-8')

    conn.sendline(bytes_string)
    answer = conn.recvall(timeout=3.5)
    conn.close()

    trace_list = []
    trace_string = re.findall("([0-9]+)", str(answer))
    for x in trace_string:
        trace_list.append(int(x))
    main_trace_array.append(trace_list)
    main_input_array.append(int_list)

Then comes the loop, we will do this 500 times, so we get enough traces.

We first make a random string, then we convert the string into an array of the integer representation of the characters.

Then we just connect to the instance.

We convert the string to a bytes string, then we send it, with a timeout of 3.5 to receive everything, then we close the connection.

We extract the integers with regex, and append it to the trace_list one by one, and when we are done with that, append the trace_list to the main_trace_array, basically creating an array inside an array.

When the loop is done, we just save the results into a numpy file:

1
2
numpy.save("trace_array.npy", main_trace_array)
numpy.save("textin_array.npy", main_input_array)

so, we just have to run this to gather data, and when that’s done, we can just run power.py and see if it worked:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
❯ python3 power.py      
found byte 0
found byte 1
found byte 2
found byte 3
found byte 4
found byte 5
found byte 6
found byte 7
found byte 8
found byte 9
found byte 10
found byte 11
found byte 12
found byte 13
found byte 14
found byte 15

key: d3a5641a039fd87871f417094257285f

As stated in the challenge description we have to submit the key like this: picoCTF{key}, and we are done.

This post is licensed under CC BY 4.0 by the author.