Deobfuscate Python (ABCTF2016

One of the challenges in ABCTF 2016 last week was to deobfuscate python and obtain the flag. As this is something I’ve only done a few times, I figured it would be a fun challenge and write-up.

Deobfuscate Python – Introduction

To start things off, the original code is below.

#ORIGINAL - http://pastebin.com/8JkW5E1m

from _md5 import MD5Type as z
 
def is_flag(s):
   শ = int(s[pow(ord(s[0]), 2) - ord(ZeroDivisionError.__name__[0b101]) * 0o105 + 19:])
   �-- = list(filter(lambda q: q % (int(z.__name__[::-3]) - 0b10) == 0, [শ, শ * 2, শ * 3, শ * 4, শ * 5]))
   if len(�--) != 5:
       return False
   Ӝ = s[0b01:-0b10]
   if (len([n for n in list(Ӝ) if lambda y: i in �--])) != 0o10:
       return False
   ڪ = [(ord(x) - 0x30) for x in list(s)] # 01100110 01101001 01101100 01101100 00100000 01101001 01101110 00100000 01110100 01101000 01100101 00100000 01100010 01101100 01100001 01101110 01101011 01110011
   if s.count(chr(0b101 * 0x11)) != 3:
       return False
   if Ӝ.index("_") != Ӝ[::-1].index("_"):
       return False
   if (s[ڪ[int(chr(ord(True.__str__()[0b10]) - 0o102))]]) != chr(pow(0b101 ** 0x02, 0o02) - 0b1000011100):
       return False
   return True

Variable/value Replacement

First, I replaced the Unicode variables (শ, �--, Ӝ, and ڪ) with a, b, c, and d respectively. This made the code easier to read, and prevented it from breaking in anything that I looked t it in.

Next up was to replace some of the non-decimal numbers to get the math easier to follow. I also converted that binary comment, just to make sure it wasn’t anything super useful yet.

>>> print 0b101
5
>>> print 0o105
69
>>> print 0b10
2
>>> print 0o10
8
>>> print 0b01
1
>>> print 0o102
66
>>> print 0x02
2
>>> print 0o02
2
>>> print 0b1000011100
540
>>> print 0x11
17
>>> print 0x30
48

For those of you not following along, that brings us to (slightly) more readable code.

from _md5 import MD5Type as z
 
def is_flag(s):
   a = int(s[pow(ord(s[0]), 2) - ord(ZeroDivisionError.__name__[5]) * 69 + 19:])
   b = list(filter(lambda q: q % (int(z.__name__[::-3]) - 2) == 0, [a, a * 2, a * 3, a * 4, a * 5]))
   if len(b) != 5:
       return False
   c = s[1:-2]
   if (len([n for n in list(c) if lambda y: i in b])) != 8:
       return False
   d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
   if s.count(chr(5 * 17)) != 3:
       return False
   if c.index("_") != c[::-1].index("_"):
       return False
   if (s[d[int(chr(ord(True.__str__()[2]) - 66))]]) != chr(pow(5 ** 2, 2) - 540):
       return False
   return True

Math Updates

Next, I wanted to update some of the math, as well as the variable replacements. This would allow me to really get a handle on what was going on.

>>> print ZeroDivisionError.__name__[5]
i
>>> print ord('i')
105
>>> print 5 * 17
85
>>> print chr(85)
U
>>> print True.__str__()[2]
u
>>> print ord('u')
117
>>> print chr(117)
u
>>> print 117-66
51
>>> print chr(51)
3
>>> print int('3')
3
>>> print pow(5 ** 2, 2)
625
>>> print chr(625-540)
U

These changes get us to the code below, which is definitely starting to look like a readable program.

from _md5 import MD5Type as z
 
def is_flag(s):
   a = int(s[pow(ord(s[0]), 2) - 105 * 69 + 19:])
   b = list(filter(lambda q: q % (int(z.__name__[::-3]) - 2) == 0, [a, a * 2, a * 3, a * 4, a * 5]))
   if len(b) != 5:
       return False
   c = s[1:-2]
   if (len([n for n in list(c) if lambda y: i in b])) != 8:
       return False
   d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
   if s.count('U') != 3:
       return False
   if c.index("_") != c[::-1].index("_"):
       return False
   if (s[d[3]]) != 'U':
       return False
   return True

I wanted to point out that I did not solve the (105 * 69 + 19) formula that is on line 4 yet. The reason for this is that Python actually does math from left to write, as opposed to following a standard order of operations (except where ** is concerned). That means I will need the first number before I can solve the rest of it.

I also didn’t replace the number on line 5 yet, because that one was slightly trickier. At first glance, it would seem that z.__name__ would be ‘MD5Type’. In this case, the __name__ of MD5Type is actually ‘md5’, as you can see below.

>>> from _md5 import MD5Type as z
>>> print z.__name__
md5
>>> print int(z.__name__[::-3]) - 2
3

That quick update (as well as removing the now unused import) got us to here.

def is_flag(s):
   a = int(s[pow(ord(s[0]), 2) - 105 * 69 + 19:])
   b = list(filter(lambda q: q % 3 == 0, [a, a * 2, a * 3, a * 4, a * 5]))
   if len(b) != 5:
       return False
   c = s[1:-2]
   if (len([n for n in list(c) if lambda y: i in b])) != 8:
       return False
   d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
   if s.count('U') != 3:
       return False
   if c.index("_") != c[::-1].index("_"):
       return False
   if (s[d[3]]) != 'U':
       return False
   return True

General Cleanup

Next, I added a main method as well as some comments and debugging statements. This would let me start working on the remaining logic. I also broke out the math inside of a so that I could print out its value during my test runs.

def is_flag(s):
   index0 = pow(ord(s[0]), 2) - 105 * 69 + 19  # python OoO = left to right
   print "index0 = " + str(index0)
   a = int(s[index0:])
   print "a = " + str(a)
   b = list(filter(lambda q: q % 3 == 0, [a, a * 2, a * 3, a * 4, a * 5]))
   print "b = " + str(b)
   if len(b) != 5:
       print "#1 failure"
       return False
   else:
       print "#1 SUCCESS"
   c = s[1:-2]
   print "c = " + str(c)
   if (len([n for n in list(c) if lambda y: i in b])) != 8:
       print "#2 failure"
       return False
   else:
       print "#2 SUCCESS"
   d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
   if s.count('U') != 3:
       print "#3 failure"
       return False
   else:
       print "#3 SUCCESS"
   if c.index("_") != c[::-1].index("_"):
       print "#4 failure"
       return False
   else:
       print "#4 SUCCESS"
   if (s[d[3]]) != 'U':
       print "#5 failure"
       return False
   else:
       print "#5 SUCCESS"
   return True

def main():
    flag = "test"

    if is_flag(flag):
      print "THE FLAG IS: " + str(flag)

if __name__ == "__main__":
    main()

Deobfuscate Python – Logic Simplification

Finally, I could start wading into the actual logic and cleaning up the code a little more.

b = list(filter(lambda q: q % 3 == 0, [a, a * 2, a * 3, a * 4, a * 5]))

While this line seems a bit confusing and complex, it is actually easy for us to clean up. At its core, it is saying to make a list of the numbers that are divisible by three from the list [a, a*2, a*3, a*4, a*5]. If we combine this with the conditional two lines down (if len(b) != 5:), then we have a much simpler check to work with:

if a % 3 != 0:

The reason for this is that if a is divisible be 3, then a*2 etc. will be divisible by 3. If it isn’t divisible by 3, then b will not have a length of 5.

b = [a, a * 2, a * 3, a * 4, a * 5]

The next line I wanted to look at was line 13 (if (len([n for n in list(c) if lambda y: i in b])) != 8:). This seemed complex at first glance until I realized what it was actually doing. The second conditional (lambda y: i in b) is always going to return true, regardless of what is in b. What this means is that this is just checking to see if len(c) == 8. Based on the line above (c = s[1:-2]), we now know that s (which is our original flag) has to have a length of 11!

Before looking at my newly updated code, I also wanted to add a comment to the fourth conditional (if c.index(“_”) != c[::-1].index(“_”):). This check verifies that the first underscore is palindromic (in the same location forwards as well as backwards). For example, if there is an underscore in the second position of the string, then the string must also have an underscore in the next to last position. Note that the string in question is c though, so we have to ignore the first and last two characters of s.

With these few changes, as well as a few more notes, the updated code was as follows.

def is_flag(s):
   index0 = pow(ord(s[0]), 2) - 105 * 69 + 19  # python OoO = left to right
   print "index0 = " + str(index0)
   a = int(s[index0:])
   print "a = " + str(a)
   b = [a, a * 2, a * 3, a * 4, a * 5]
   if a % 3 != 0:
       print "#1 failure"
       return False
   else:
       print "#1 SUCCESS"
   c = s[1:-2]
   if len(c) != 8
       print "#2 failure"
       return False
   else:
       print "#2 SUCCESS"
   d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
   if s.count('U') != 3:
       print "#3 failure"
       return False
   else:
       print "#3 SUCCESS"
   if c.index("_") != c[::-1].index("_"):
       print "#4 failure"
       return False
   else:
       print "#4 SUCCESS"
   if (s[d[3]]) != 'U':
       print "#5 failure"
       return False
   else:
       print "#5 SUCCESS"
   return True

def main():
    # flag can only be 11 characters long
    # flag must have three U's
    # flag must have two _'s that are in the same place forwards as backwards (ignoring first and last two chars)

    flag = "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"

    if is_flag(flag):
      print "THE FLAG IS: " + str(flag)

if __name__ == "__main__":
    main()

Deobfuscate Python – Solving

The next piece of the logic I looked into involved the following 3 lines.

   index0 = pow(ord(s[0]), 2) - 105 * 69 + 19  # python OoO = left to right
   a = int(s[index0:])
   if a % 3 != 0:

To start with, to get a zero value for index0 we’d need a value of 85.229 for ord(s[0]). This means that any character lower than ‘U’ will give us a negative number, and anything higher will give us a positive number. Since any higher positive number will give us a result higher than 11, we know that we can throw them out. Additionally, since we need to convert the result into an integer eventually, we know that we don’t want any higher negative numbers (we know this because of the need for ‘U’s as well as ‘_’s). In this case, we know that we want s[0] = ‘U’.

Additionally, this meant I s[-1] was divisible by 3 (if you look, this character is never used again, so 3/6/9 all work here).

Now I knew the flag had to have the following values.

def main():
    # flag can only be 11 characters long
    # flag must have three U's
    # flag must have two _'s that are in the same place forwards as backwards (ignoring first and last two chars)

    flag = "U"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "3"

    if is_flag(flag):
      print "THE FLAG IS: " + str(flag)

At this point I could pass the first two checks, so I was well on my way to solving this.

Check #3 and #4 are just about the number of ‘U’s in my flag, as well as my two underscore locations, so I did not need to worry about them yet.

Moving on to check #5, this seemed like an interesting problem with a simple solution.

d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
if (s[d[3]]) != 'U':

First, I needed d[3] to return a number between 0 and 10 to prevent an index out-of-bounds error. This meant that I needed ord(s[3] – 48) (since that is how d gets created) to return a number between 0 and 10 for the same reason. Using the ASCII chart, I knew that the decimal value of a number – 48 gave that number back. With this in mind, if s[3] is a number between 0 and 10 (actually 1 and 9 since I already populated the end values), then I’d be able to pass this test.

(pseudo-code)
if s[3] == '4', then d[3] == 4
therefore
s[4] == 'U'

With that logic out-of-the-way, I was able to update my flag (and pass checks #1, #2, and #5).

    flag = "U"
    flag += "x"
    flag += "x"
    flag += "4"
    flag += "U"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "3"

The remaining two checks were now easy to pass, as I just needed to add in 2 more ‘U’s and my ‘_’s

    flag = "U"
    flag += "_"
    flag += "x"
    flag += "4"
    flag += "U"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "_"
    flag += "U"
    flag += "3"

When it was all said and done, my code was as follows.

def is_flag(s):
    index0 = pow(ord(s[0]), 2) - 105 * 69 + 19  # python OoO = left to right
    # anything lower than 85.229 will return a negative number
    # 85 means 'U', which gives index0 of -1
    print "index0 = " + str(index0)
    a = int(s[index0:])
    print "a = " + str(a)
    # a must be divisible by 3
    b = [a, a * 2, a * 3, a * 4, a * 5]
    print "b = " + str(b)
    c = s[1:-2]
    print "c = " + str(c)
    array1 = [n for n in list(c) if lambda y: i in b]
    print "array1 = (" + str(len(array1)) + ") = " + str(array1)
    if (len(array1)) != 8:
        print "#1 failure"
        return False
    else:
        print "#1 SUCCESS"
    d = [(ord(x) - 48) for x in list(s)] # fill in the blanks
    print "d = " + str(d)
    if s.count('U') != 3:
        print "#2 failure"
        return False
    else:
        print "#2 SUCCESS"
    if c.index("_") != c[::-1].index("_"):
        print "#3 failure"
        return False
    else:
        print "#3 SUCCESS"
    if (s[d[3]]) != 'U':
        print "#4 failure"
        return False
    else:
        print "#4 SUCCESS"
    return True

def main():
    # flag can only be 11 characters long
    # flag must have three U's
    # flag must have two _'s that are in the same place forwards as backwards (ignoring first and last two chars)

    flag = "U"
    flag += "_"
    flag += "x"
    flag += "4"  # this needs to be a number so that ord - 48 returns an index between 0 and 11; whatever number this is flag[x] needs to be a 'U'
    flag += "U"
    flag += "x"
    flag += "x"
    flag += "x"
    flag += "_"
    flag += "U"
    flag += "3"

    if is_flag(flag):
      print "THE FLAG IS: " + str(flag)

if __name__ == "__main__":
    main()

And, when I ran it, everything passed!

root@kali:~# python obfuscate.py 
index0 = -1
a = 3
b = [3, 6, 9, 12, 15]
c = _x4Uxxx_
array1 = (8) = ['_', 'x', '4', 'U', 'x', 'x', 'x', '_']
#1 SUCCESS
d = [37, 47, 72, 4, 37, 72, 72, 72, 47, 37, 3]
#2 SUCCESS
#3 SUCCESS
#4 SUCCESS
THE FLAG IS: U_x4Uxxx_U3

Deobfuscate Python – Conclusion

Putting the flag inside of the proper format (ABCTF{flag}) and submitting it gave me my points, and a solid sense of satisfaction.

Ray Doyle

Ray Doyle is an avid pentester/security enthusiast/beer connoisseur who has worked in IT for almost 16 years now. From building machines and the software on them, to breaking into them and tearing it all down; he’s done it all. To show for it, he has obtained an OSCE, OSCP, eCPPT, GXPN, eWPT, eWPTX, SLAE, eMAPT, Security+, ICAgile CP, ITIL v3 Foundation, and even a sabermetrics certification!

He currently serves as a Senior Staff Adversarial Engineer for Avalara, and his previous position was a Principal Penetration Testing Consultant for Secureworks.

This page contains links to products that I may receive compensation from at no additional cost to you. View my Affiliate Disclosure page here. As an Amazon Associate, I earn from qualifying purchases.

Deobfuscate Python (ABCTF2016 – Obfuscated 1)

Deobfuscate Python – Introduction

Variable/value Replacement

Math Updates

General Cleanup

Deobfuscate Python – Logic Simplification

Deobfuscate Python – Solving

Deobfuscate Python – Conclusion

Leave a ReplyCancel Reply

Deobfuscate Python – Introduction

Variable/value Replacement

Math Updates

General Cleanup

Deobfuscate Python – Logic Simplification

Deobfuscate Python – Solving

Deobfuscate Python – Conclusion

Leave a ReplyCancel Reply

Related Posts

BEST Hacking Software – Learn the Tools of the Trade

Learn Penetration Testing – How to Become an Ethical Hacker!

Cyber Security Certifications and Courses – Gotta Catch ‘Em All!