python encryption key malware rc4-cipher

Decrypt RC4 with Python knowing the key


I have a small problem (sorry for my English, it is not my mother tongue). For a malware project at school I am analysing a keylogger that creates a file and then encrypts its content with RC4. Below are two examples of the file's content.

First one

========================
î³É†"Û4ZôßaV0%í;ËüüòaÐAiúˆcdÖ&

Second one

========================

ä¡Çš3µ{²agÄ}W8%í–YœèáŒ9¯ŽW‰‚”&ÂIAžŠmá'8V¡Uñši&Ædà·…í^ê¥>´©ÍjÌ{

ŒFÔ¿%ïÆñyFÎ3x°º“A)o4´´­‡—ÆéâùyÕ p‑@­®á÷Lì-»›ys‘ñî“wõl„/•+
w×jc´E"ؾ‑øSG,Üo`ürx;¼¡T#üdÊ-‡k«G‘å”b¿+†]Ù©–f:^É( ÁË è­Pàv8¹|籟D7…ÁSgœÛÚjá"æ‡=I•Âa|âOÄÝ\Ùþ×Æçð©Ð¨Û•Øâ¿tù5¾à…y¸Æy

========================

‰Î$œk{²aΞ:Xyc¢“ðƯ¨ÉxéÁR ËÌ~Ž

========================

IéÓí­q3C¦_Å)$Ãü¥Âƒ7Ð&èIΞ:XycjõúØ·v¯ìzëà—'ì`

========================

±Öšk{ð,•Ã{V4#

========================

ü§ÎâK³[’2…×bx2|p‚³ÐæˆéXÉár ëì^ iÈòãˆK7Ï4àZG¯

‰—¥ÃpÂTÅι Âu‚HÚ±+ïÈñýj«=v¾´O'a°Äß­‰—ÆéìùyÕ ž@­®á÷Bì#»›w}‘þã“wûl„/›% yÙdmºK"ؾ‑ùPI"Üo`ürx;¼¡T#ü

========================

 

========================

õ¡‹Ñ$Ý:þ;‡›iT|É9ZlcÒCkøŠafÔ$ãB™izl*Wò¾ójÐÃÄ

========================

 

========================

By reverse engineering the binary I found the key, which is "If you want to keep a secret, you must also hide it from yourself".

My goal is to decrypt the content of the file in order to see what kind of data the keylogger writes. Here is my Python code (based on the Wikipedia page on RC4):

#!/usr/bin/env python
class WikipediaARC4:
    def __init__(self, key = None):
        self.state = list(range(256)) # initialize the permutation table
        self.x = self.y = 0 # indices x and y, instead of i and j
  
        if key is not None:
            self.init(key)
  
    # Key schedule
    def init(self, key):
        for i in range(256):
            self.x = (ord(key[i % len(key)]) + self.state[i] + self.x) & 0xFF
            self.state[i], self.state[self.x] = self.state[self.x], self.state[i]
        self.x = 0
  
    # Generator
    def crypt(self, input):
        output = [None]*len(input)
        for i in range(len(input)):
            self.x = (self.x + 1) & 0xFF
            self.y = (self.state[self.x] + self.y) & 0xFF
            self.state[self.x], self.state[self.y] = self.state[self.y], self.state[self.x]
            output[i] = chr((ord(input[i]) ^ self.state[(self.state[self.x] + self.state[self.y]) & 0xFF]))
        return ''.join(output)
  
if __name__ == '__main__':
    test_vectors = [['If you want to keep a secret, you must also hide it from yourself', 'î³É†"Û4ZôßaV0%í;ËüüòaÐAiúˆcdÖ&']]
    for i in test_vectors:
        print(i[0])
        print(WikipediaARC4(i[0]).crypt(i[1]))

When I run my code, this is the output:

If you want to keep a secret, you must also hide it from yourself
xBë⁃ìÖú½ÜûuåÐÅÙrJM©[&$5ȿÅk

So here is my question: I am not sure this is the correct result, because the decrypted content does not look meaningful. Does anyone have any thoughts?

Thanks, guys!

Edit: the two RC4-related functions in the malware, as decompiled:

void __cdecl initRC4(int param_1,int param_2)

{
  undefined uVar1;
  uint local_10;
  int local_c;
  int local_8;

                    /* 0x1530  2  initRC4 */
  local_8 = 0;
  for (local_c = 0; local_c < 0x100; local_c = local_c + 1) {
    *(char *)(param_1 + local_c) = (char)local_c;
  }
  for (local_10 = 0; (int)local_10 < 0x100; local_10 = local_10 + 1) {
    local_8 = (int)((uint)*(byte *)(param_1 + local_10) + local_8 +
                   (uint)*(byte *)(param_2 + (local_10 & 3))) % 0x100;
    uVar1 = *(undefined *)(param_1 + local_10);
    *(undefined *)(param_1 + local_10) = *(undefined *)(param_1 + local_8);
    *(undefined *)(local_8 + param_1) = uVar1;
  }
  return;
}


undefined4 __cdecl FUN_004019a3(int param_1,int param_2)

{
  HANDLE pvVar1;

  FUN_00401bb0();
  pvVar1 = CreateMutexA((LPSECURITY_ATTRIBUTES)0x0,1,"ND9487GF943GF4328FHG");
  if (pvVar1 != (HANDLE)0x0) {
    if (param_1 != 1) {
      Sleep(5000);
      DeleteFileA(*(LPCSTR *)(param_2 + 4));
    }
    FUN_004016de(s_lang32.ini_00403020);
    initRC4(0x405400,0x403040);
    FUN_0040192e();
    FUN_00401969();
  }
  return 0;
}

Solution

  • The decompiled code for the key schedule has the following element:

    (param_2 + (local_10 & 3))
    

    param_2 is the address of the key (as you have figured out already) and local_10 seems to be the loop variable (let's call it i). The & 3 reads out the least significant 2 bits of i, so when i goes from 0 to 255, i & 3 takes the values 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, ...

    That means only the first 4 bytes of the key are actually used, in other words the key is "If y".
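
To see why the masking shortens the key, here is a minimal Python sketch that mirrors the decompiled loop. The names `ksa_masked` and `ksa_standard` are made up for this illustration; only the `& 3` indexing comes from the decompiled code, and the key bytes are assumed to be plain ASCII.

```python
FULL_KEY = b"If you want to keep a secret, you must also hide it from yourself"


def ksa_masked(key):
    """Key schedule as in the decompiled initRC4: the key index is i & 3."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (s[i] + j + key[i & 3]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def ksa_standard(key):
    """Textbook RC4 key schedule: the key index is i % len(key)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


# Key indices the masked loop can ever touch.
used_indices = sorted({i & 3 for i in range(256)})
print(used_indices)    # [0, 1, 2, 3]
print(FULL_KEY[:4])    # b'If y'

# The masked schedule on the full key builds the same permutation as the
# textbook schedule on the first four bytes only.
print(ksa_masked(FULL_KEY) == ksa_standard(FULL_KEY[:4]))    # True
```

So a standard RC4 implementation can be reused unchanged, as long as it is fed the truncated key.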

    Regarding the encoding: I encrypted b"ISOLANG 32ISOLANG 32[R" with the key b"If y" and got the following result:

    HEX:    ee b3 c9 86 22 db 34 9a 5a f4 df 61 1f 56 30 25 ed bb cb fc fc f2
    chr:     î  ³  É     "  Û  4     Z  ô  ß  a     V  0  %  í  »  Ë  ü  ü  ò
    

    which is similar to your first string, with a few differences:

    When I print the string that I got, the values 0x9a and 0x1f are skipped, just like in your string. Note that these "invisible" characters in the cipher text are still important in the decryption, since skipping them desynchronizes the key stream. Knowing that there are gaps in the cipher text, I could try to guess where the gaps are in the second cipher text, in order to re-synchronize the decryption. However, you were able to decrypt the first string correctly, so I assume you have access to the raw bytes rather than an incomplete string representation, so you wouldn't have to guess any gaps.
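
The desynchronization can be reproduced with a small sketch. The helper `rc4` and the variable names are hypothetical, and the gap position is chosen to match where 0x9a sits in the listing above; this is an illustration under those assumptions, not a reconstruction of the actual log file.

```python
def rc4(key, data):
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


key = b"If y"
plain = b"ISOLANG 32ISOLANG 32[R"
cipher = rc4(key, plain)

gap = 7  # index of the byte that gets lost (assumed for the demo)

# One byte lost, e.g. through copy/paste of a printed string:
damaged = cipher[:gap] + cipher[gap + 1:]
print(rc4(key, damaged))   # correct up to the gap, shifted key stream after it

# Putting any placeholder byte back at the right position re-synchronizes the
# key stream: only the placeholder itself decrypts to a wrong byte.
patched = cipher[:gap] + b"\x00" + cipher[gap + 1:]
print(rc4(key, patched))
```

Reading the log file in binary mode (`open(path, "rb")`, with `path` standing in for wherever the file lives) avoids the problem entirely, since no bytes are dropped by text decoding or by the terminal.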

    Here is some of the code that I used for playing around with the strings:

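    def ksa(key):
        # Key-scheduling algorithm: build the initial 256-entry permutation.
        s = list(range(256))
        j = 0
        l = len(key)
        for i in range(256):
            j = (j + s[i] + key[i % l]) & 0xff
            s[i], s[j] = s[j], s[i]
        return s

    def keystream(key):
        # Pseudo-random generation: yield one key stream byte at a time.
        s = ksa(key)
        i = j = 0
        while True:
            i = (i + 1) & 0xff
            j = (j + s[i]) & 0xff
            s[i], s[j] = s[j], s[i]
            yield s[(s[i] + s[j]) & 0xff]

    def crypt(key, msg):
        # XOR each message byte with the key stream; the same call decrypts.
        return bytes(a ^ b for a, b in zip(msg, keystream(key)))

    plain_text = b"ISOLANG 32ISOLANG 32[R"
    cipher_text = crypt(b"If y", plain_text)
    print(cipher_text.hex(' '))  # hex() with a separator needs Python 3.8+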
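
Before trusting the output on the keylogger file, it may be worth checking the implementation against known vectors. The sketch below repeats the three functions so that it runs standalone; the vectors are the commonly quoted ones from the RC4 Wikipedia article, and the `vectors` list is just an illustrative name.

```python
def ksa(key):
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s


def keystream(key):
    s = ksa(key)
    i = j = 0
    while True:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) & 0xFF]


def crypt(key, msg):
    return bytes(a ^ b for a, b in zip(msg, keystream(key)))


# (key, plaintext, expected ciphertext as hex)
vectors = [
    (b"Key", b"Plaintext", "bbf316e8d940af0ad3"),
    (b"Wiki", b"pedia", "1021bf0420"),
    (b"Secret", b"Attack at dawn", "45a01f645fc35b383552544b9bf5"),
]

for key, plain, expected in vectors:
    cipher = crypt(key, plain)
    assert cipher.hex() == expected     # encryption matches the vector
    assert crypt(key, cipher) == plain  # decryption is the same operation

print("all vectors OK")
```

If these pass, any remaining garbage in the decrypted log is more likely to come from the key or from missing ciphertext bytes than from the cipher code itself.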