Capture the flag - Encrypted encoding

I recently came across a capture the flag exercise that was slightly more involved and pretty fun. I wanted to share my thought process for figuring it out and getting the flag.

Part of this exercise was gaining access to a machine and the ability to run a few specific commands. That was part one, which was certainly interesting, but I’m only sharing part two, below. It has more applicable blue-team knowledge and was more fun to me than the first part. Part two involved finding a file and ‘decrypting’ its content.

After gaining access to the device, I could echo or cat file contents. Attempts to run other commands, like ls, or sh to get a shell, were unsuccessful given the limited scope and access. BUT being able to pass a filename and get its contents echoed back was the key. My first recon step was to read the user’s .bash_history, basically a cat ~/.bash_history, which showed the following:

[...]
1975 ~/.encrypt.py
1976 history -d  0-$(($HISTCMD-3))
1977 exit

Since the history doesn’t show it, I presume the .encrypt.py script was uploaded via some other, undetermined method. Using the same file-read trick, I echoed the contents of the .encrypt.py that was run:

.encrypt.py

import base64
import sys
from random import randint

def main(argv):
    stringtoencode = input("Please enter the message to Encrypt:\n")
    randomxorkey = randint(1, 255)

    encodedString=[randomxorkey]
    for b in stringtoencode:
        encodedString.append(ord(b)^randomxorkey)
    with open("encrypted.txt","wb") as enc:
        enc.write((base64.b64encode(bytearray(encodedString))))

if __name__ == "__main__":
   main(sys.argv[1:])

Interesting file… it has no shebang line to say what type of script it is, but it looks like a Python script given the file extension, whitespace and imports. Based on that assumption, we can tell the with open() statement writes the script’s result to a file named encrypted.txt. So, using the same trick to read a file, is the flag what’s in encrypted.txt?

encrypted.txt

zLeYpKW/hb+Vo7m+iqCtq7GJoq+jqKWiq5Olv5Oio7iTqaKvvrW8uKWjou0=

Oh boy, that looks like simple enough base64 but no flag format. Alas, decoding it as base64 gives us… nothing useful. No readable text, no flag; just unprintable bytes. Let’s review the .encrypt.py script to better understand what it is doing, so we can try to reverse this output.
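For what it’s worth, the decode does hand back bytes; they just aren’t text. A quick check in a Python shell shows this. It is a minimal sketch, assuming the string above was copied out of encrypted.txt intact; the variable names are mine, not from the challenge.

```python
import base64

# Contents of encrypted.txt as shown above.
blob = "zLeYpKW/hb+Vo7m+iqCtq7GJoq+jqKWiq5Olv5Oio7iTqaKvvrW8uKWjou0="

raw = base64.b64decode(blob)

print(len(raw))        # 44
print(raw[:6].hex())   # ccb798a4a5bf

# Every byte is 0x80 or above, so none of it is printable ASCII.
print(all(b >= 0x80 for b in raw))  # True

# Treating the bytes as UTF-8 text fails outright.
try:
    raw.decode("utf-8")
    is_text = True
except UnicodeDecodeError:
    is_text = False
print(is_text)         # False
```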

Explain it to me

Looking at the functions, we see one primary function, main(), which takes one argument. The line if __name__ == "__main__": is pythonese for checking whether the script is being run as the main module rather than imported. If so, it executes the code block, in this case a call to main(). To main we go! It sets a few variables first, one of which is the string typed in by whoever ran the script, read via input(). That kind of interactive input doesn’t show up in .bash_history, but we know they ran the script from the command history and from the presence of the output file named in the script. The script hands sys.argv[1:], meaning every command-line argument after the script name, to main() as argv. The bash history didn’t show any arguments being passed, so that list would have been empty. Weird: the argv parameter isn’t used anywhere in the function; no idea why it’s there.

After storing the user’s input, the .encrypt.py script sets another variable, randomxorkey, to a random integer from 1 to 255 inclusive. Next, it assigns a list to a different variable, encodedString, whose first element is that randomly chosen key. Then comes a for loop that iterates over the user input string. In Python, you can iterate over each character in a string and run a block of code for each one. Strings in Python are iterable! Not just lists/tuples/dictionaries. So what is this block of code doing to every character in the input string? I already know encodedString is a list, so in this loop it is being appended to. What’s being appended is… ord() of the character, with ^ of the randomxorkey value. What is ord()? What is ^?
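Replaying the loop by hand makes those two unknowns concrete. The sketch below follows the same steps as .encrypt.py but swaps the random key for a fixed one so the result is repeatable; the key 0x2A and the sample text are made up for illustration.

```python
# Same loop as .encrypt.py, with a fixed (made-up) key instead of randint().
sample = "Hi!"
key = 0x2A

encoded = [key]                 # the key goes in first, as in the script
for ch in sample:               # a string iterates one character at a time
    code = ord(ch)              # character -> integer code point
    mixed = code ^ key          # bitwise XOR with the key
    # Show each step in decimal and as 8-bit binary.
    print(f"{ch!r}: {code:3d}  {code:08b} ^ {key:08b} = {mixed:08b}  ({mixed})")
    encoded.append(mixed)

print(encoded)  # [42, 98, 67, 11]
```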

The Python ord() function converts a single character into the integer Unicode code point of that character. ^ is the bitwise XOR operator. It compares the binary representations of two integers bit by bit, and each result bit is 1 if and only if exactly one of the two input bits at that position is 1. In this script, it takes the integer value ord() of a character from the input string and XORs it with randomxorkey. The resulting integer is appended to the encodedString list. What this shows is that the encrypted.txt file isn’t actually encrypted, it’s encoded! Encoding transforms data into another format using a known, reversible scheme so that it can be easily transferred or stored. Encryption is for keeping data confidential, so the ability to reverse the cipher is limited to those who hold the key. We should be able to get plaintext from this without needing a secret ‘key’; we just need to figure out the random XOR value that was used. I COULD brute-force that given the small range, but let’s do it a better way.
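For completeness, the brute-force route would look roughly like this. It is only a sketch: the payload is a made-up stand-in ("flag{demo}" XORed with 0x5A) rather than the challenge data, and brute_force() and the printable-ASCII filter are my own choices, not part of the exercise.

```python
import string

# Hypothetical payload: "flag{demo}" XORed with a made-up key (0x5A),
# standing in for the bytes that would follow the key in a real file.
SECRET_KEY = 0x5A
payload = bytes(ord(c) ^ SECRET_KEY for c in "flag{demo}")

PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")

def brute_force(data):
    """Try every key from 1 to 255; keep results that are all printable ASCII."""
    candidates = {}
    for key in range(1, 256):
        text = "".join(chr(b ^ key) for b in data)
        if all(ch in PRINTABLE for ch in text):
            candidates[key] = text
    return candidates

candidates = brute_force(payload)
for key, text in candidates.items():
    print(f"{key:3d}: {text}")
```

More than one key can pass a filter like this, so someone still has to eyeball the candidates or match them against a known flag format.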

Next, the script opens a file handle using the with open() statement and writes the encodedString list to the file. Because of the with statement, Python closes the file automatically, with no explicit close() needed. But it’s not just writing the encodedString list to the file! It converts the list to a bytearray and base64 encodes that first, then writes the result. So encrypted.txt is a base64 encoded string. But why didn’t our earlier attempt decode to anything readable? Because base64 is only the outer layer; underneath it, every character is still XOR’d. So how do we get the plaintext back if we don’t actually know the original XOR key? Right, the first element isn’t really part of the message. It is the XOR key itself, the integer that every character in the string was XOR’d with. We need to extract that value from the output first, then use it to decode the rest. The opposite of an XOR is also XOR. By XORing each encoded value with the key again and converting the result with chr(), we should get the plaintext and thus the input string, a.k.a. the flag.
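That plan can be checked on a throwaway message before touching the real file. The sketch below mirrors the encoding steps with a fixed key and then undoes them; encode(), decode(), the key 0x2A and the message are all made up for the demo.

```python
import base64

def encode(message, key):
    """Mirror .encrypt.py: key byte first, then each character XORed with it."""
    values = [key] + [ord(ch) ^ key for ch in message]
    return base64.b64encode(bytes(values))

def decode(blob):
    """Undo encode(): read the key from byte 0, XOR the remaining bytes with it."""
    raw = base64.b64decode(blob)
    key, payload = raw[0], raw[1:]
    return "".join(chr(b ^ key) for b in payload)

# XOR is its own inverse: applying the same key twice returns the input.
assert all((value ^ 0x2A) ^ 0x2A == value for value in range(256))

blob = encode("hello", 0x2A)   # key and message are made up for the demo
print(decode(blob))            # hello
```

Like the original script, this only works when every XORed value fits in a single byte; a character with a code point above 255 would make the bytes() conversion raise a ValueError.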

Reverse engineering

Here’s what an attempt at reversing that output via my own Python script looks like:

#!/usr/bin/python3
import argparse
import base64

parser = argparse.ArgumentParser(description="'Decode' given file")
parser.add_argument('-f', '--file', default='encrypted.txt', help="Path to file to read and decode")
args = parser.parse_args()

def main():
    # Read the base64 text that .encrypt.py wrote
    with open(args.file, "rb") as encfile:
        data = encfile.read().strip()

    b64Bytes = base64.b64decode(data)
    # The first byte (two hex digits) is the XOR key
    xorValue = int(b64Bytes.hex()[:2], 16)

    # XOR every byte after the key; including the key byte itself
    # would put a stray NUL character at the front of the output
    encStr = ''.join(chr(int(b) ^ xorValue) for b in b64Bytes[1:])

    print(encStr)

if __name__ == "__main__":
    main()

This script reads the file passed as an argument, or the default encrypted.txt, and base64 decodes it. Going by the .encrypt.py script, I expect the first byte of the decoded data to be the XOR value. base64.b64decode() hands back a bytes object rather than a plain integer, so I convert it to a hex() string, take the first two hex digits (one byte) and parse them with int(value, 16) to get an int. xorValue should now contain the value that .encrypt.py picked for the initial encoding, recovered from encrypted.txt itself, and we can use it to reverse the encoded string back to plaintext.

Next my script does a fancy Pythonese one-liner. The opposite of ord() from the initial .encrypt.py script is chr(); it takes the integer and converts it to the matching Unicode character, which in this case should be plaintext. Can’t forget the xorValue! Each byte gets XOR’d with it before the conversion, reversing what the original script did. This part of my script does that for loop over the decoded bytes in one line, then joins the resulting characters into a single string. Finally, it prints the decoded string. Python is very succinct in this manner. Let’s run my custom script and see what the result is:

Flag bearer

$ ./flagbearer.py 
{ThisIsYourFlag}Encoding_is_not_encryption!

Sweet! There’s our flag.
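Looking back, the hex round trip isn’t strictly needed: indexing a bytes object in Python 3 already yields an int. A more compact sketch of the same decode follows, with the file contents inlined rather than read from disk so the example is self-contained; decode_flag() is a name I made up for it.

```python
import base64

# Contents of encrypted.txt, inlined for the example.
ENCODED = "zLeYpKW/hb+Vo7m+iqCtq7GJoq+jqKWiq5Olv5Oio7iTqaKvvrW8uKWjou0="

def decode_flag(blob):
    raw = base64.b64decode(blob)
    key = raw[0]                       # indexing bytes gives an int directly
    # XOR every byte after the key, then read the result as ASCII text.
    return bytes(b ^ key for b in raw[1:]).decode("ascii")

print(decode_flag(ENCODED))  # {ThisIsYourFlag}Encoding_is_not_encryption!
```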

