C++ vtables ➕➕

 ·  ☕ 7 min read  ·  👻 Ahmed Raof

Introduction

The source language a program was written in has a significant impact on the assembly. For example, C++ has several features and constructs that do not exist in C, and these can complicate analysis of the resulting assembly.

Malicious programs written in C++ create challenges for the malware analyst that make it harder to determine the purpose of assembly code. Understanding basic C++ features and how they appear in assembly language is critical to analyzing malware written in C++.

In order to understand C++ concepts as they are represented in disassembly, you must be able to:

  • Identify the classes
  • Identify relationships between classes
  • Identify the class members

Scratching the Surface

OOP

C and C++ both support procedural programming, but C++ is an object-oriented programming (OOP) language as well. This means that C++ includes additional features such as classes, objects, inheritance, and polymorphism, as well as templates, none of which are present in C.

Classes are like structs, except that they store function information in addition to data. Classes provide a way to define custom data types with their own properties (data members) and behaviors (member functions) that can be used to create objects.

In this example, the class is called SimpleClass. It has one data member, x, and a single function, HelloWorld. We create an instance of SimpleClass named myObject and call the HelloWorld function for that object.

#include <cstdio>

class SimpleClass {
public:
    int x; // data member

    // member function
    void HelloWorld() {
        printf("Hello World\n");
    }
};

int main() {
    // custom data type
    SimpleClass myObject;
    myObject.HelloWorld();
}
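
To see why a non-virtual member function leaves no trace in an object's layout, here is a minimal sketch. PlainStruct and WithVirtual are hypothetical additions for comparison only; exact sizes are implementation-defined, but the two relations asserted below hold on the mainstream compilers (GCC, Clang, MSVC).

```cpp
#include <cstdio>

struct PlainStruct {          // hypothetical C-style struct for comparison
    int x;
};

class SimpleClass {           // same shape as the SimpleClass example above
public:
    int x;
    void HelloWorld() { std::printf("Hello World\n"); }
};

class WithVirtual {           // hypothetical variant with a virtual function
public:
    int x;
    virtual void HelloWorld() { std::printf("Hello World\n"); }
};

// A non-virtual member function is ordinary code that receives the object
// as a hidden `this` argument, so it adds nothing to the object itself.
static_assert(sizeof(SimpleClass) == sizeof(PlainStruct),
              "non-virtual member functions do not enlarge the object");

// A virtual function makes the compiler store a hidden vtable pointer.
static_assert(sizeof(WithVirtual) > sizeof(SimpleClass),
              "the vtable pointer takes space in every object");
```

In a disassembly this means that a SimpleClass object looks exactly like a struct, while an object with virtual functions carries an extra pointer that leads to its vtable.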

Overloading and Mangling

Function overloading is a C++ feature that allows multiple functions to share the same name as long as they accept different parameters. When the function is called, the compiler determines which version to use based on the number and types of the arguments in the call, as shown in the C++ code and picture below.

In the C++ code, the three functions do the same thing: they add up all their parameters and print the result. In the picture below, we can see the __cdecl calling convention at work: function parameters are pushed onto the stack in reverse order, i.e., the rightmost parameter is pushed first. For the first call only one value is pushed onto the stack, for the second call two values are pushed, and so on.

#include<iostream>
using namespace std;

void sum(int num1) {
    cout << "Result: " << num1 << endl;
}

void sum(int num1, int num2) {
    cout << "Result: " << num1 + num2 << endl;
}

void sum(int num1, int num2, int num3) {
    cout << "Result: " << num1 + num2 + num3 << endl;
}

int main() {
    sum(1);
    sum(1, 2);
    sum(1, 2, 3);
    return 0;
}

C++ uses a technique called name mangling to support function overloading. In the PE file format, each function is labeled with only its name, and the function parameters are not specified in the compiled binary format. To support overloading, the names in the file format are modified so that the name includes the parameter information. For example, in the picture above we can see that the one-parameter sum function is called __Z3sumi, where the trailing i encodes a single int parameter; each additional int parameter appends another i to the name.

IDA Pro can demangle the names produced by most compilers.
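
Outside of IDA, names in this style can also be demangled by hand. The sketch below assumes a GCC or Clang toolchain, whose runtime exposes abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle is a hypothetical helper, not part of any library.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtimes only)
#include <cstdlib>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol, or the input
// unchanged if the runtime cannot demangle it.
std::string demangle(const std::string& symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);   // the returned buffer is allocated with malloc
    return result;
}
```

With this helper, demangle("_Z3sumi") gives sum(int) and demangle("_Z3sumii") gives sum(int, int). Note the single leading underscore: the extra one in __Z3sumi is assumed here to be the symbol prefix added by the platform, so it has to be stripped before demangling.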

    
 
The internal function names are visible only if there are symbols in the code you are analyzing. Malware usually has the internal symbols removed :)

Example

Tools: It’s not related to the challenge, but I would like to mention ClassInformer. I used it as a vtable finder, and I remember my first time using it in Flare-On 9 Challenge 5. It saved me a lot of time in reverse engineering.

Easy C++ Challenge [FlagYard - tables]

Static Analysis

When dealing with C++ binaries, one common challenge is the presence of mangled names. As discussed above, C++ compilers use name mangling to encode information about a function’s parameters (and, with some compilers, its return type) into its name. We can use the “Demangled names” option in IDA to display the demangled names instead.

 

Looking at the main function, we can see that it asks for the flag and stores it in the variable s. It then allocates 8 bytes of memory using operator new and assigns the pointer to the newly allocated memory to a variable named v6. memset is then used to set all the bytes of the allocated memory to 0. After the memory has been allocated and initialized, the function sub_401350 is called with v6 as its argument.

In brief, the code in sub_401350 stores the address of the virtual function table (vtable) for the class P0 in the 8-byte object that v6 points to.
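
What such a constructor-like function leaves behind can be observed from C++ itself. The sketch below is not the challenge code: Base, Derived and vptr_of are hypothetical names, and it assumes the common ABI convention (Itanium and MSVC, for simple single-inheritance classes) that the hidden vtable pointer is the first word of the object.

```cpp
#include <cstdint>
#include <cstring>

struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 0; }
};

struct Derived : Base {
    int id() const override { return 1; }
};

// Copies the first pointer-sized word out of the object. Under the assumed
// ABI convention this word is the vtable pointer written by the constructor.
std::uintptr_t vptr_of(const Base& obj) {
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof(word));
    return word;
}
```

Two objects of the same class return the same value from vptr_of, while a Base and a Derived object return different ones. That is why the address stored in an object is enough to tell which class it belongs to.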

When I scrolled down to examine the vtables, I noticed that there were a total of 44 vtables in the program, denoted as P[0-43]. This is exactly the number of iterations of the for-loop that follows.

Based on this, I guessed that there may be one vtable for each character of the flag, each performing a check or something similar.

Inspecting function sub_401390 shows that it contains a switch statement on the user input character. Depending on the value of this character, the program jumps to a specific function of the P0 class, since we pass v8, a pointer to the vtable of P0.
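
Selecting a function through a vtable is just an indexed call through an array of function pointers. The hand-rolled version below is a sketch of that idea, not a reconstruction of sub_401390; every name in it (Object, Method, kVtable, call_slot) is hypothetical.

```cpp
#include <cstddef>

struct Object;                                   // forward declaration
using Method = int (*)(const Object* self);      // every slot takes `this` explicitly

struct Object {
    const Method* vptr;   // what the compiler normally hides inside the object
    int value;
};

int get_value(const Object* self)   { return self->value; }
int get_doubled(const Object* self) { return self->value * 2; }

// The "vtable": a read-only array of function pointers shared by all objects.
const Method kVtable[] = { get_value, get_doubled };

// Load the table from the object, index it, and call the slot with the
// object as first argument. A virtual call compiles to the same three steps.
int call_slot(const Object* obj, std::size_t slot) {
    return obj->vptr[slot](obj);
}
```

For an Object initialized as {kVtable, 21}, call_slot(&obj, 0) returns 21 and call_slot(&obj, 1) returns 42. In the challenge, the switch on the input character plays the role of choosing the slot.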

  

Debugging Time

So, we already know the flag format, which is FlagY{}. Let’s set a breakpoint on the function sub_401390 in the for-loop and run the program. We input FlagY{aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa} as the flag.

We notice that if the character is correct, the return value stored in rax is a pointer to the vtable of the next P[N]; if the character is wrong, it returns a pointer to the vtable of letters.

 

After completing the loop, the final step is to verify if our input is correct. This can be achieved by checking the last value of **RAX**. If the pointer points to a **vtable of letters**, then the flag is incorrect. However, if it points to the address of P43, then the flag is correct.
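
The behaviour observed so far can be modelled as a small state machine in which every state is its own class. The toy below is only a sketch of that idea under stated assumptions: it has three P classes instead of 44, the secret is the made-up string "ok", and the names State, instance and check are hypothetical. The real binary's logic was not recovered at this level of detail.

```cpp
#include <string>

struct State {
    virtual ~State() = default;
    // Returns the state to continue with after consuming `c`.
    virtual const State* step(char c) const = 0;
};

// One shared instance per class, so comparing pointers compares states.
template <class T>
const State* instance() {
    static const T object{};
    return &object;
}

struct Letters : State {   // failure sink: once here, always here
    const State* step(char) const override { return this; }
};

struct P2 : State {        // final state: any extra character is an error
    const State* step(char) const override { return instance<Letters>(); }
};

struct P1 : State {        // expects the second character of the toy secret
    const State* step(char c) const override {
        return c == 'k' ? instance<P2>() : instance<Letters>();
    }
};

struct P0 : State {        // expects the first character of the toy secret
    const State* step(char c) const override {
        return c == 'o' ? instance<P1>() : instance<Letters>();
    }
};

// Feeds the input through the chain; it is accepted only if the walk
// ends exactly in the last P state.
bool check(const std::string& input) {
    const State* state = instance<P0>();
    for (char c : input) {
        state = state->step(c);
    }
    return state == instance<P2>();
}
```

Here check("ok") returns true, while check("oa"), check("o") and check("okk") return false. Because each state has a distinct class, watching which vtable the current object points to reveals how far a guess got, which is exactly the leak exploited below.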

To validate this approach, we can input any value and when the program reaches the checking part, the value can be edited to the address of **class P43**, which is 0x4084A0. Upon doing so, the correct message will be displayed.
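
The address 0x4084A0 can be cross-checked against the two constants hard-coded in the brute-force script further down: the vtable of P1 at 0x4045A0 and a distance of 384 bytes between consecutive vtables. The function name vtable_address is a hypothetical helper for this check.

```cpp
#include <cstdint>

constexpr std::uint64_t kP1Address = 0x4045A0;  // vtable of P1 (from the script)
constexpr std::uint64_t kStride    = 384;       // distance between two vtables

// Address of the vtable of P[n] for n in 1..43.
constexpr std::uint64_t vtable_address(unsigned n) {
    return kP1Address + static_cast<std::uint64_t>(n - 1) * kStride;
}

static_assert(vtable_address(1) == 0x4045A0, "P1 is the base address");
static_assert(vtable_address(43) == 0x4084A0, "P43 matches the patched value");
```

Since 42 * 384 = 0x3F00, the last entry is 0x4045A0 + 0x3F00 = 0x4084A0, which agrees with the address of P43 given above.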

We can collect the addresses of P1 to P43 and decide whether an input character is correct by comparing the value in rax with the array we created of addresses P[0-43]. If it points to the next P[N], the character is correct; otherwise it is wrong. We keep brute-forcing one character at a time until we get the whole flag.
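
The strategy itself is independent of GDB and can be sketched in a few lines. In the version below the debugger is replaced by a hypothetical oracle callback that answers "is the character at this index correct?"; brute_force and Oracle are illustrative names, and the padding character 'a' mirrors the dummy input used earlier.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical oracle: true if attempt[index] is the right character.
using Oracle = std::function<bool(const std::string& attempt, std::size_t index)>;

// Recovers a secret of `length` characters one position at a time.
std::string brute_force(const Oracle& position_ok,
                        const std::string& alphabet,
                        std::string known,
                        std::size_t length) {
    while (known.size() < length) {
        bool found = false;
        for (char c : alphabet) {
            std::string attempt = known + c;
            attempt.resize(length, 'a');       // pad the unknown tail
            if (position_ok(attempt, known.size())) {
                known += c;                    // keep the confirmed character
                found = true;
                break;
            }
        }
        if (!found) {
            break;                             // alphabet does not cover this position
        }
    }
    return known;
}
```

Because each position is confirmed independently, the search needs at most alphabet.size() attempts per unknown character. With the 38-character alphabet and 37 unknown positions used in the script, that is at most 38 * 37 = 1406 runs instead of 38^37.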

import gdb

FLAG_LEN = 43
known_flag = "FlagY{"
# Candidate characters taken from the switch cases (the known part [F-Y-{] is left out)
possible_char = '0123456789_abcdefghijklmnopqrstuvwxyz}'

# v-table addresses of P[1-43]: P1 is at 0x4045A0 and consecutive v-tables are 384 bytes apart
success = list(range(0x4045A0, 0x4045A0 + (43 * 384), 384))

def write_txt(val):
    with open("attempt.txt", "w") as f:
        f.write(val)

gdb.execute('b *0x4012C1')      # break where rax holds the value to check against the v-table addr
gdb.execute('set confirm off')  # suppress the confirmation message when killing the program

while len(known_flag) < FLAG_LEN:
    end = len(known_flag)  # index being brute-forced; starts at 6 => FlagY{?
    for i, char in enumerate(possible_char):  # try each possible character until one is correct
        # known part + candidate + padding; we subtract 2 for the candidate and the closing '}'
        bf = known_flag + char + 'a' * (FLAG_LEN - end - 2)
        if len(bf) < FLAG_LEN:
            bf += '}'
        write_txt(bf)

        gdb.execute("run <attempt.txt >/dev/null")
        for _ in range(end):  # skip the breakpoint hits for the characters we already know
            gdb.execute('continue')

        # first 32-bit word stored at the address held in rax
        x = int(gdb.execute('x /2wd $rax', to_string=True).split()[1])
        gdb.execute('kill')

        if x == success[end]:
            print("-------------------------------------[success]-----------------------------", char)
            known_flag += char
            print("-------------------------------------[FLAG]-----------------------------", known_flag)
            break
        print(f"-------------[LOADING]---------------------- {i} {len(known_flag)}/{FLAG_LEN}")
    else:
        print("No candidate matched at index", end)
        break

print("Done!", known_flag)


Ahmed Raof. AKA 50r4.