Saturday, August 24, 2013

Breaking Single Character XOR Cipher



Although this is a cryptography topic, I won't bog you down with a lot of complicated mathematics. The single-character XOR cipher is one of the simplest encoding techniques, and malware authors use it very often to obfuscate data for exfiltration. For those not familiar with the term, an exfiltration data file contains data stolen from the victim's computer; it resides on that computer until the malware eventually exports it to the command and control center. Another reason for using such a simple technique instead of an algorithm such as Blowfish or AES-256 is that a stronger algorithm might trigger an IDS or IPS. Despite being simple, single-character XOR gives a sufficient level of obfuscation.

In a single-character XOR cipher, each byte of the data to encode is XORed with a one-byte key, which yields the ciphertext. For example:
To encode the data:
DATA XOR Key = Ciphertext
5Fh XOR 54h = 0Bh
To decode the data:
Ciphertext XOR Key = DATA
0Bh XOR 54h = 5Fh
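The arithmetic above can be checked in a few lines of Python. This is only a minimal sketch; the helper name `xor_byte` is made up for illustration.

```python
def xor_byte(value, key):
    """XOR a single byte value with a single-byte key."""
    return value ^ key

data = 0x5F
key = 0x54

cipher = xor_byte(data, key)       # encode
recovered = xor_byte(cipher, key)  # decode: same operation, same key

print("cipher    = %02Xh" % cipher)     # prints "cipher    = 0Bh"
print("recovered = %02Xh" % recovered)  # prints "recovered = 5Fh"
```

Because XOR is its own inverse, encoding and decoding are the same operation, which is why recovering the key is enough to recover the data.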
But our goal is to break the ciphertext and recover the key, which in turn lets us recover the original data. The tools we will be using are:
1. A hex editor
Now, the algorithm we will use:
We simply assume a key and XOR the data with it to get a candidate plaintext. We then need some method for "scoring" a piece of English plaintext: character frequency is a good metric, and so is searching for the most frequently occurring English words. Evaluate each output and choose the one with the best score. If we find the words, then bingo, we have the key. If not, we try another key. Since the key is a single byte, ranging from 0 to 255 (FFh), this is a piece of cake for computers of this era.
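The character-frequency idea can be sketched as follows. This is a rough illustration in Python 3, not the script used later in this post: the names `COMMON`, `xor_bytes`, `score` and `best_key` are my own, the sample sentence is invented, and the scoring rule (count the bytes that are among the most common English characters) is just one simple assumption among many possible metrics.

```python
# Characters that dominate ordinary English text (space included).
COMMON = "etaoin shrdlu"

def xor_bytes(data, key):
    """XOR every byte of `data` with the one-byte `key`."""
    return bytes(b ^ key for b in data)

def score(candidate):
    """Count how many bytes are among the most common English characters."""
    return sum(1 for b in candidate.lower() if chr(b) in COMMON)

def best_key(ciphertext):
    """Try all 256 keys and return the one whose output scores highest."""
    return max(range(256), key=lambda k: score(xor_bytes(ciphertext, k)))

# Invented sample: encode a sentence with key 0x58, then recover the key.
secret = xor_bytes(b"the quick brown fox jumps over the lazy dog", 0x58)
key = best_key(secret)
print(hex(key), xor_bytes(secret, key))
```

A frequency score like this needs no word list and tolerates text without spaces around words, but on very short inputs it can pick the wrong key, so it is worth eyeballing the top few candidates.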

This is the encoded data which we will crack:



After XORing it with the key X (0x58), we get:



We can mount this attack using the 100 most commonly used English words. First we put the XOR logic in a loop that iterates the key from 0 (0x00) to 255 (0xFF). On each iteration we compare the decoded data against our list of 100 frequently occurring English words; if one of them occurs, we print the result.

The script is as follows :

import sys

# The 100 most common English words, used to recognise a plausible plaintext.
keywords=['the', 'be', 'to', 'of', 'and', 'a', 'in', 'that', 'have', 'I', 'it', 'for', 'not', 'on', 'with', 'he', 'as', 'you', 'do', 'at', 'this', 'but', 'his', 'by', 'from', 'they', 'we', 'say', 'her', 'she', 'or', 'an', 'will', 'my', 'one', 'all', 'would', 'there', 'their', 'what', 'so', 'up', 'out', 'if', 'about', 'who', 'get', 'which', 'go', 'me', 'when', 'make', 'can', 'like', 'time', 'no', 'just', 'him', 'know', 'take', 'people', 'into', 'year', 'your', 'good', 'some', 'could', 'them', 'see', 'other', 'than', 'then', 'now', 'look', 'only', 'come', 'its', 'over', 'think', 'also', 'back', 'after', 'use', 'two', 'how', 'our', 'work', 'first', 'well', 'way', 'even', 'new', 'want', 'because', 'any', 'these', 'give', 'day', 'most', 'us']

def decrypt(data, key):
    # data is a hex string; XOR each byte (two hex digits) with the one-byte key.
    ans = ""
    for i in range(0, len(data) - 1, 2):
        ans = ans + chr(int(data[i:i+2], 16) ^ key)
    return ans

def body(data):
    # Try every possible one-byte key, 0x00 through 0xFF inclusive.
    for key in range(256):
        ans = decrypt(data, key)
        for word in keywords:
            # Require a space on both sides so "he" does not match inside "the".
            if " " + word + " " in ans:
                print("[*]Encrypted data : " + data)
                print("[*]Decrypted data : " + ans)
                print("[*]key : 0x%02x" % key)
                break

if len(sys.argv) != 2:
    print("\tusage :\n\t" + sys.argv[0] + " <file name>")
    sys.exit(1)

print("[+]Starting Decryption ...")
# Each line of the input file is one hex-encoded ciphertext.
with open(sys.argv[1]) as f:
    for line in f:
        data = line.strip()
        if data:
            body(data)

print("[+] Done")





It should also be noted that this technique is useful for extracting malware from antivirus quarantines. Frequently, intrusion analysts are only called in after a first responder has already tried to fix the problem. The “fix” may include installing an antivirus program that captures and encrypts valuable evidence. If the only copy of the malware you need to analyze is located in a quarantine container, consider what method may have been used to lock it inside. For example, simply unzipping McAfee quarantine files with 7-zip and reversing the files with the XOR key j (hex 0x6A) will yield the original malware.
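As a rough illustration of that last step, here is a minimal Python 3 sketch that XORs every byte of a file with 0x6A. It assumes the quarantine container has already been unpacked with 7-zip; the function names and the command-line usage are my own invention, and nothing here parses any quarantine format.

```python
import sys

QUARANTINE_KEY = 0x6A  # ASCII "j", the key named in the text above

def xor_with_key(data, key=QUARANTINE_KEY):
    """Return `data` with every byte XORed with the one-byte `key`."""
    return bytes(b ^ key for b in data)

def xor_file(src_path, dst_path, key=QUARANTINE_KEY):
    """Write a copy of src_path to dst_path with every byte XORed with `key`."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(xor_with_key(src.read(), key))

# Hypothetical usage: python unxor.py <extracted file> <output file>
if __name__ == "__main__" and len(sys.argv) == 3:
    xor_file(sys.argv[1], sys.argv[2])
```

Handle the output as live malware: write it only inside an isolated analysis environment.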
XOR encryption is used frequently, for both legitimate and illegal purposes; it is important for analysts to know that this encryption can be broken with minimal effort and the result may be very valuable to the investigation.
