I am writing a program that encrypts/decrypts a message using Caesar Cipher and a shift number given by the user. I got that part down, more or less.
However, the final part of the program performs a brute force attack on an encrypted message, and we are attempting to find the shift key that was used to encrypt it. I am having trouble determining the best way to stop the program when it finds the match.
I will paste my code below, and what it comes up with running it now. I have tested with pencil and paper, and can see the key is 17 for our particular input, so I put a crap condition just to get it working. I am wondering how to stop it when it finds the correct value and/or a sentence that is actual English, so that it works for every input, not just our test case.
P.S. I apologize if this is answered elsewhere, the only posts I found on the topic were more related to stopping brute force attacks and/or having secure passwords.
char decrypt(char letter, int key);
int main(){
// Part 3 Brute-Force Attack
std::string ciphertext;
std::cout << "Enter in the ciphertext: ";
std::getline(std::cin, ciphertext);
std::cout << "\nBrute-Force Decryptions:" << std::endl;
int i = 0;
int keyCode = 0;
std::string gang = "";
int alphaArray[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24};
// The algorithm has nested for each loops, the first one keeps track of the key initialized to 0
// and the second loop goes through the string given and converts it to the proper value based
// on the shift key. The reason we want to find 't', 'h' or 'e' is because this signifies the
// begginning of an actual English sentence, and most likely our cracked message. I was able to
// determine this would begin the correct string we are looking for by bruteforcing the first // few letters using pencil and paper
for(auto key : alphaArray){
std::cout << "\nkey = " << key << ":\t";
for(auto letter : ciphertext){
letter = decrypt(letter, i);
gang += letter;
}
std::cout << gang << std::endl;
if (tolower(ciphertext[0]) - key + 26 == 't' && tolower(ciphertext[1]) - key == 'h' && tolower(ciphertext[2]) - key == 'e'){
keyCode = key;
break;
}
i++;
gang = "";
}
// Output results
std::cout << "\nPart 3 KeyCode: " << keyCode << std::endl;
std::cout << "\n\n";
return 0;
}
//////////////////////////////////////////////////////////////////////////////////////
// Function: Attempts to decrypt letter by shifting it up the alphabet 'key' times /
// Params: Letter as a char and key as an int value /
// Return: Letter shifted 'key' times /
//////////////////////////////////////////////////////////////////////////////////////
char decrypt(char letter, int key){
int position = (letter - key - 'A') % 26;
if(position < 0)
position += 26;
return (char)position + 'a';
}
Output:
Brute-Force Decryptions:
key = 0: kyvtjyzwkttzgyvitzjtrcjftbefnetrjtkyvttrvjrittzgyviftrwkvitalczljttrvjriftnyftivglkvucptljvutkyvttzgyvitkftgrjjtdvjjrxvjtkftyzjtkiffgjtulizextyzjtdzczkripttrdgrzxejtzetxrlchtefkvtkyrktkyvttrvjrittzgyvitzjtvrjzcpttirtbvutsptkyvtsilkvgwfitvtrkkrtbh
key = 1: jxusixyvjssyfxuhsyisqbiesademdsqisjxussquiqhssyfxuhesqvjuhszkbykissquiqhesmxeshufkjutboskiutsjxussyfxuhsjesfqiiscuiiqwuisjesxyisjheefistkhydwsxyiscybyjqhossqcfqywdisydswqkbgsdejusjxqjsjxussquiqhssyfxuhsyisuqiybosshqsautsrosjxusrhkjufvehsusqjjqsag
key = 2: iwtrhwxuirrxewtgrxhrpahdrzcdlcrphriwtrrpthpgrrxewtgdrpuitgryjaxjhrrpthpgdrlwdrgtejitsanrjhtsriwtrrxewtgridrephhrbthhpvthridrwxhrigddehrsjgxcvrwxhrbxaxipgnrrpbepxvchrxcrvpjafrcditriwpiriwtrrpthpgrrxewtgrxhrtphxanrrgprztsrqnriwtrqgjiteudgrtrpiiprzf
key = 3: hvsqgvwthqqwdvsfqwgqozgcqybckbqogqhvsqqosgofqqwdvsfcqothsfqxizwigqqosgofcqkvcqfsdihsrzmqigsrqhvsqqwdvsfqhcqdoggqasggousgqhcqvwgqhfccdgqrifwbuqvwgqawzwhofmqqoadowubgqwbquoizeqbchsqhvohqhvsqqosgofqqwdvsfqwgqsogwzmqqfoqysrqpmqhvsqpfihsdtcfqsqohhoqye
key = 4: gurpfuvsgppvcurepvfpnyfbpxabjapnfpgurppnrfneppvcurebpnsgrepwhyvhfppnrfnebpjubperchgrqylphfrqpgurppvcurepgbpcnffpzrffntrfpgbpuvfpgebbcfpqhevatpuvfpzvyvgnelppnzcnvtafpvaptnhydpabgrpgungpgurppnrfneppvcurepvfprnfvylppenpxrqpolpgurpoehgrcsbeprpnggnpxd
key = 5: ftqoeturfooubtqdoueomxeaowzaizomeoftqoomqemdooubtqdaomrfqdovgxugeoomqemdaoitaodqbgfqpxkogeqpoftqooubtqdofaobmeeoyqeemsqeofaotueofdaabeopgduzsotueoyuxufmdkoomybmuszeouzosmgxcozafqoftmfoftqoomqemdooubtqdoueoqmeuxkoodmowqponkoftqondgfqbradoqomffmowc
key = 6: espndstqenntaspcntdnlwdznvyzhynldnespnnlpdlcnntaspcznlqepcnufwtfdnnlpdlcznhszncpafepowjnfdponespnntaspcneznalddnxpddlrpdneznstdneczzadnofctyrnstdnxtwtelcjnnlxaltrydntynrlfwbnyzepneslenespnnlpdlcnntaspcntdnpldtwjnnclnvponmjnespnmcfepaqzcnpnleelnvb
key = 7: dromcrspdmmszrobmscmkvcymuxygxmkcmdrommkockbmmszrobymkpdobmtevsecmmkockbymgrymbozedonvimeconmdrommszrobmdymzkccmwocckqocmdymrscmdbyyzcmnebsxqmrscmwsvsdkbimmkwzksqxcmsxmqkevamxydomdrkdmdrommkockbmmszrobmscmokcsvimmbkmuonmlimdromlbedozpybmomkddkmua
key = 8: cqnlbqrocllryqnalrbljubxltwxfwljblcqnlljnbjallryqnaxljocnalsdurdblljnbjaxlfqxlanydcnmuhldbnmlcqnllryqnalcxlyjbblvnbbjpnblcxlqrblcaxxyblmdarwplqrblvrurcjahlljvyjrpwblrwlpjduzlwxcnlcqjclcqnlljnbjallryqnalrblnjbruhllajltnmlkhlcqnlkadcnyoxalnljccjltz
key = 9: bpmkapqnbkkqxpmzkqakitawksvwevkiakbpmkkimaizkkqxpmzwkinbmzkrctqcakkimaizwkepwkzmxcbmltgkcamlkbpmkkqxpmzkbwkxiaakumaaiomakbwkpqakbzwwxaklczqvokpqakuqtqbizgkkiuxiqovakqvkoictykvwbmkbpibkbpmkkimaizkkqxpmzkqakmiaqtgkkziksmlkjgkbpmkjzcbmxnwzkmkibbiksy
key = 10: aoljzopmajjpwolyjpzjhszvjruvdujhzjaoljjhlzhyjjpwolyvjhmalyjqbspbzjjhlzhyvjdovjylwbalksfjbzlkjaoljjpwolyjavjwhzzjtlzzhnlzjavjopzjayvvwzjkbypunjopzjtpspahyfjjhtwhpnuzjpujnhbsxjuvaljaohajaoljjhlzhyjjpwolyjpzjlhzpsfjjyhjrlkjifjaoljiybalwmvyjljhaahjrx
key = 11: znkiynolziiovnkxioyigryuiqtuctigyiznkiigkygxiiovnkxuiglzkxiparoayiigkygxuicnuixkvazkjreiaykjiznkiiovnkxizuivgyyiskyygmkyizuinoyizxuuvyijaxotminoyisorozgxeiigsvgomtyiotimgarwituzkizngziznkiigkygxiiovnkxioyikgyoreiixgiqkjiheiznkihxazkvluxikigzzgiqw
key = 12: ymjhxmnkyhhnumjwhnxhfqxthpstbshfxhymjhhfjxfwhhnumjwthfkyjwhozqnzxhhfjxfwthbmthwjuzyjiqdhzxjihymjhhnumjwhythufxxhrjxxfljxhythmnxhywttuxhizwnslhmnxhrnqnyfwdhhfrufnlsxhnshlfzqvhstyjhymfyhymjhhfjxfwhhnumjwhnxhjfxnqdhhwfhpjihgdhymjhgwzyjuktwhjhfyyfhpv
key = 13: xligwlmjxggmtlivgmwgepwsgorsargewgxliggeiwevggmtlivsgejxivgnypmywggeiwevsgalsgvityxihpcgywihgxliggmtlivgxsgtewwgqiwwekiwgxsglmwgxvsstwghyvmrkglmwgqmpmxevcggeqtemkrwgmrgkeypugrsxigxlexgxliggeiwevggmtlivgmwgiewmpcggvegoihgfcgxligfvyxitjsvgigexxegou
key = 14: wkhfvkliwfflskhuflvfdovrfnqrzqfdvfwkhffdhvdufflskhurfdiwhufmxolxvffdhvdurfzkrfuhsxwhgobfxvhgfwkhfflskhufwrfsdvvfphvvdjhvfwrfklvfwurrsvfgxulqjfklvfplolwdubffdpsdljqvflqfjdxotfqrwhfwkdwfwkhffdhvdufflskhuflvfhdvlobffudfnhgfebfwkhfeuxwhsirufhfdwwdfnt
key = 15: vjgeujkhveekrjgtekuecnuqempqypecuevjgeecgucteekrjgtqechvgtelwnkwueecguctqeyjqetgrwvgfnaewugfevjgeekrjgtevqercuueoguuciguevqejkuevtqqruefwtkpiejkueoknkvctaeecorckipuekpeicwnsepqvgevjcvevjgeecgucteekrjgtekuegcuknaeetcemgfedaevjgedtwvgrhqtegecvvcems
key = 16: uifdtijguddjqifsdjtdbmtpdlopxodbtduifddbftbsddjqifspdbgufsdkvmjvtddbftbspdxipdsfqvufemzdvtfeduifddjqifsdupdqbttdnfttbhftdupdijtdusppqtdevsjohdijtdnjmjubszddbnqbjhotdjodhbvmrdopufduibuduifddbftbsddjqifsdjtdfbtjmzddsbdlfedczduifdcsvufqgpsdfdbuubdlr
key = 17: thecshiftccipherciscalsocknowncasctheccaesarccipherocaftercjuliusccaesarocwhocreputedlycusedctheccipherctocpasscmessagesctochisctroopscduringchiscmilitaryccampaignscincgaulqcnotecthatctheccaesarcciphercisceasilyccrackedcbycthecbrutepforcecattackq
Part 3 KeyCode: 17
Brute-force-breaking Caesar-Cypher can be easily done by using heuristics. A very good and simple approach is, to use the average frequency of a letter in a given language. Such tables can be found on the internet for many languages.
Let us look at the English language. I found for example the following:
a: 0.0684
b: 0.0139
c: 0.0146
d: 0.0456
e: 0.1267
f: 0.0234
g: 0.0180
h: 0.0701
i: 0.0640
j: 0.0033
k: 0.0093
l: 0.0450
m: 0.0305
n: 0.0631
o: 0.0852
p: 0.0136
q: 0.0004
r: 0.0534
s: 0.0659
t: 0.0850
u: 0.0325
v: 0.0084
w: 0.0271
x: 0.0007
y: 0.0315
z: 0.0004
Later we need to look up this occurrence value for each letter of the alphabet. We will use lowercase only. We can define a compile-time array for that.
constexpr std::array<double, 26> LetterWeight{ .0684,.0139,.0146,.0456,.1267,
.0234,.0180,.0701,.0640,.0033,.0093,.0450,.0305,.0631,.0852,.0136,.0004,.0534,
.0659,.0850,.0325,.0084,.0271,.0007,.0315,.0004 };
Then, we will do the following. With the brute force approach, we will try all possible keys for decryption in a loop.
So, we will decrypt the encrypted message 25 times. We can omit the key 0, because then the message would be clear and readable.
In each loop for the 25 keys, we will additionally analyze each letter of the decrypted message (if it is the right key or not does not matter at this moment).
We will build a score for this decrypted string by adding up the above defined occurrence-value for each key based on the letter.
In the end, we will have a sum for each key / letter of the decrypted string. Additionally, we will store the key that has been used for those sums.
For this, we will define a std::array
of std::pair
. There will be 26 elements in the std::array
. One for each of the tested keys. The content will be the sum of letter-occurrence-values for each key and again the key.
std::array<std::pair<double, int>, 26> score{};
We need to store again the key in the array, because we want to sort the std::array
later to find the highest score and its corresponding key.
Regarding the encryption/decryption function, I gave a detailed answer with a long explanation here. Please, kindly check this.
Now, with all the above wisdom, we can come up with the following code cracker:
#include <iostream>
#include <array>
#include <cctype>
#include <string>
#include <algorithm>
// Some test string
const std::string test{ R"(Kyv rcxfizkyd yrj evjkvu wfi vrty cffgj, kyv wzijk fev bvvgj kirtb fw kyv bvp zezkzrczqvu kf 0 reu kyv
jvtfeu cffg xfvj kyiflxy kyv jkizex xzmve reu tfemvikj zk kf kyv gifgvi mrclv srjvu fe kyv jyzwk bvp.
Kyv ivrjfe nv nrek kf wzeu 'k', 'y' fi 'v' zj svtrljv kyzj jzxezwzvj kyv svxxzeezex fw re rtklrc Vexczjy
jvekvetv,reu dfjk czbvcp fli tirtbvu dvjjrxv.Z nrj rscv kf uvkvidzev kyzj nflcu svxze kyv tfiivtk jkizex
nv riv cffbzex wfi sp silkvwfitzex kyv wzijk wvn cvkkvij ljzex gvetzc reu grgvi.)" };
// Letter frequencies in the English Language
constexpr std::array<double, 26> LetterWeight{ .0684,.0139,.0146,.0456,.1267,.0234,.0180,.0701,.0640,.0033,.0093,.0450,.0305,.0631,.0852,.0136,.0004,.0534,.0659,.0850,.0325,.0084,.0271,.0007,.0315,.0004 };
// Simple function for Caesar encyption/decyption (decryption by simply using a negative key)
std::string caesar(const std::string& in, int key) {
std::string res(in.size(), ' ');
std::transform(in.begin(), in.end(), res.begin(), [&](char c) {return std::isalpha(c) ? (char)((((c & 31) - 1 + ((26 + (key % 26)) % 26)) % 26 + 65) | c & 32) : c; });
return res;
}
// Test code
int main() {
// We will try all possible ciphers 1..25
std::array<std::pair<double, int>, 26> score{};
for (int key = 1; key < 26; ++key) {
// Get one possible deciphered test
for (const char c : caesar(test, key)) {
// Calculate score according toLetter weight
score[key].first += (std::isalpha(c)) ? LetterWeight[(c & 31) - 1] : 0.0; // Build the score for this key, using the accumulated letter weight
score[key].second = key; // And store the key. Is necessary, becuase we will sort later and do not want to loose the key information
}
}
// Now sort for getting the index with the highes score
std::sort(score.begin(), score.end());
// And show the most probable result to the user.
std::cout << "Decrypted with key: "<< score.back().second-26 << "\n\n" << caesar(test, score.back().second) << "\n\n";
};
All the code breaking can be done with ~15 lines of code. So, rather simple.
The resulting output will be: