Commit 546c7e2

Add paper (Linux part only)

Author: hippwn
1 parent 6ae2983 commit 546c7e2

9 files changed: +236 -1 lines changed

README.md

Lines changed: 9 additions & 1 deletion
@@ -4,15 +4,23 @@ This project is a Go implementation of well-known techniques trying to detect if

Why do this in Go? Because there are many C programs already doing this, but none written in pure Go.

See the [paper](https://github.com/ShellCode33/VM-Detection/blob/master/paper/paper.pdf) for more details.

## Usage

First, download the package:
```bash
$ go get github.com/ShellCode33/VM-Detection/vmdetect
```

Then see [main.go](https://github.com/ShellCode33/VM-Detection/blob/master/main.go) to use it in your own project.

To build the paper, be sure to have Docker installed and run the following command inside the paper directory:

```bash
$ docker run
```

## GNU/Linux techniques

- Look for CPU vendor by trying out different assembly instructions ([cpuid](https://github.com/klauspost/cpuid/))

paper/00_header.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
---
title: Detection of execution in a virtualized environment
subtitle: Malware class 2020
author: ["ShellCode33", "Hippwn"]
date: \today
titlepage: true
footnotes-pretty: true
indent: true
toc: true
table-use-row-colors: true
header-includes:
- \usepackage{fvextra}
- \DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines,commandchars=\\\{\}}
...

paper/10_abstract.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
\newpage

# Abstract

This paper was written during an *Introduction to Malware* class in a
French engineering school. We focus on the runtime detection of
virtualized environments. Many malware samples today use complex techniques to
detect sandboxes and prevent their own execution, which makes their analysis
more difficult. In the following pages, we review the differences between
the various virtualisation techniques and then study the state of the art of
virtual machine detection.

paper/20_virtualisation.md

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
\newpage

# Virtualisation

The concept of virtualisation is hard to define because it is subject to many
different interpretations, or at least to different levels of application.
In fact, malware-analysis sandboxes themselves often work quite differently
from one another. Moreover, the term virtualisation is sometimes used for
something else (isolation, containerization, emulation...), which can be
misleading. Although this paper focuses on the virtualisation of processes,
it must be acknowledged that everything can be virtualized, from storage to
networks.

## Isolation

Isolation is an old concept on Linux-based systems that has more recently
appeared on Windows 10 (1803). It is not really virtualisation but rather a way
of running a process in an independent environment called a *context*. The
isolated process's accesses and system calls are filtered so that it is not
aware of the host it is running on. This is basically how containers work (LXC,
Docker), except on Windows, which also provides per-process virtualisation based
on Hyper-V, its own Type-1 virtualisation system (see below).

![Container architecture by RedHat](img/container.png)

On Linux, process isolation relies on kernel-level features such as *cgroups*
(limiting hardware resources: RAM, CPU...), *chroot* (changing the *root*
directory, **/**) and *namespaces* (partitioning of kernel resources). Similar
features exist on Windows under other names.

## Kernel in user-space

As with isolation, it is hard to call this virtualisation since it does not
virtualize the hardware. It is mostly used in kernel development, where it
allows a kernel to run on top of the host operating system like any other
program. It is rarely used in other contexts, so we will not elaborate.

## Type-2 hypervisor

![The difference between hypervisors](img/hypervisors.png)

A hypervisor is a platform that creates and runs virtual machines. We talk
about type-2 or hosted hypervisors when the software runs on top of an operating
system (*VirtualBox*, *VMware Workstation*, *Parallels Desktop*, *QEMU*...).
This is often the solution chosen by average users as it allows them to run
different operating systems on their day-to-day computer. Within this category,
we have to distinguish between hardware virtualisation and software
virtualisation, which we will call emulation.

In the first case, virtualisation is supported directly by the hardware,
thanks to specific CPU instructions. These technologies (*Intel VT* on Intel
processors and *AMD-V* on AMD processors) allow the hypervisor to delegate
memory and CPU management to the hardware itself, thus simplifying the
virtualisation software. This is only possible when the virtual machine has
the same architecture as the host: an x86_64 guest on an x86_64 host, for
instance.
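
As an illustration, the sketch below checks whether the CPU advertises one of these extensions. It assumes an x86 Linux system, where `/proc/cpuinfo` lists the `vmx` flag for Intel VT-x and the `svm` flag for AMD-V; the function name `hwVirtExtension` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hwVirtExtension scans the content of /proc/cpuinfo and returns the name of
// the hardware virtualisation extension advertised in the "flags" line, or an
// empty string if none is found.
func hwVirtExtension(cpuinfo string) string {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, value, found := strings.Cut(sc.Text(), ":")
		if !found || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, flag := range strings.Fields(value) {
			switch flag {
			case "vmx":
				return "Intel VT-x"
			case "svm":
				return "AMD-V"
			}
		}
	}
	return ""
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	if ext := hwVirtExtension(string(data)); ext != "" {
		fmt.Println("hardware virtualisation available:", ext)
	} else {
		fmt.Println("no hardware virtualisation flag found")
	}
}
```

The absence of these flags is only a hint: it can mean that the feature is disabled in the firmware, or that the program runs in a guest to which the hypervisor does not expose nested virtualisation.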

Emulation, on the other hand, is required when the CPU does not support
hardware-assisted virtualisation or when the guest uses another architecture
(such as ARM when virtualizing an Android device). In this case, the host has
to simulate the whole hardware on which the guest is supposed to be running and
thus translate each instruction the guest executes. Type-2 is already the
slower way of virtualizing a system, and emulation makes it slower still, as
translating instructions in software is inherently inefficient.

## Type-1 hypervisor

Type-1 or bare-metal hypervisors are the closest to the hardware. In this
situation, the hypervisor runs directly on the machine and serves as a
lightweight operating system (*VMware vSphere*, *Citrix XenServer*,
*Microsoft Hyper-V*...). This is more efficient than type-2 because there is
no host OS consuming resources. This kind of infrastructure is mostly used in
data centers to simplify the deployment of virtual machines and improve their
performance.

Note that this does not prevent running an operating system on the host
machine: *Hyper-V* can be enabled from Windows 10 Pro, for instance, which
provides good virtualisation performance while keeping a user-friendly OS.

paper/30_state_of_the_art.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
\newpage

# State of the art

In this part, we dive into the concept of evasion. Malware authors are
ahead of analysts, and it is a lead they have to maintain. To this end, they
deploy more and more techniques to make their code harder to analyse and
understand. Against static analysis they use obfuscation, encryption and
the like; against dynamic analysis they use evasion.

Evasion refers to all the techniques a piece of malware uses to hide
its behaviour depending on its environment. For instance, if malware detects
a sandbox (from an analyst or an antivirus), it keeps a low profile so as
not to arouse suspicion. One of the best-known examples is the Red Pill
demonstration (a reference to the film *The Matrix*) presented by
Joanna Rutkowska in 2004, two years before she presented the Blue
Pill attack, which is a type of *hyperjacking*.

Red Pill is a small piece of C code that checks the address of the
*Interrupt Descriptor Table* (IDT). The hypervisor has to relocate this table
to avoid memory conflicts with the host. Therefore, there is a correlation
between the most significant byte of this address being greater than `0xD0`
and the code being executed in a virtual machine. This technique is however
less effective on today's systems as they filter access to certain areas of
memory, such as the IDT. It still works on *QEMU* though.
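
```c
/* 32-bit x86 only: the buffer holding the machine code must be executable. */
int swallow_redpill ()
{
    /* m receives the IDTR: a 2-byte limit followed by a 4-byte base address. */
    /* rpill holds the machine code for "sidt [addr]; ret". */
    unsigned char m[2+4], rpill[] = "\x0f\x01\x0d\x00\x00\x00\x00\xc3";
    *((unsigned*)&rpill[3]) = (unsigned)m;  /* patch the sidt operand with the address of m */
    ((void(*)())&rpill)();                  /* execute the patched machine code */
    return (m[5]>0xd0) ? 1 : 0;             /* test the top byte of the IDT base address */
}
```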

## Linux techniques

### The DMI table

DMI stands for *Desktop Management Interface*. It is a standard developed in
the 1990s with the goal of standardizing the tracking of the components in a
computer and abstracting them from the software that manages them. Parsing this
table can reveal useful information about the hardware used by the operating
system and expose names specific to virtualized environments, such as *vbox*,
*virtualbox*, *oracle*, *qemu*, *kvm* and so on.
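
A minimal sketch of this check is given below. It assumes that the kernel exposes the DMI fields as text files under `/sys/class/dmi/id` and reuses the keywords listed above; the names `vmKeywords` and `matchVMKeyword` are illustrative and do not come from the `vmdetect` package.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// vmKeywords holds the vendor names mentioned in the text.
var vmKeywords = []string{"vbox", "virtualbox", "oracle", "qemu", "kvm"}

// matchVMKeyword returns the first keyword contained in value, ignoring case.
func matchVMKeyword(value string) (string, bool) {
	lower := strings.ToLower(value)
	for _, kw := range vmKeywords {
		if strings.Contains(lower, kw) {
			return kw, true
		}
	}
	return "", false
}

func main() {
	// Assumption: DMI fields are exposed by the kernel in this directory.
	files, _ := filepath.Glob("/sys/class/dmi/id/*")
	for _, f := range files {
		data, err := os.ReadFile(f)
		if err != nil {
			continue // directories and root-only entries are skipped
		}
		if kw, ok := matchVMKeyword(string(data)); ok {
			fmt.Printf("%s contains %q\n", f, kw)
		}
	}
}
```

A match is only an indication: a keyword such as *oracle* can also appear on physical hardware sold by that vendor.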

### Linux kernel's hypervisor detection

The Linux kernel comes with a hypervisor detection feature that can be used to
identify a potential hypervisor below the operating system. Based on this, we
can simply read the kernel log to see whether a hypervisor has been detected by
the kernel:

```c
static inline const struct hypervisor_x86 * __init
detect_hypervisor_vendor(void)
{
	const struct hypervisor_x86 *h = NULL, * const *p;
	uint32_t pri, max_pri = 0;

	for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
		if (unlikely(nopv) && !(*p)->ignore_nopv)
			continue;

		pri = (*p)->detect();
		if (pri > max_pri) {
			max_pri = pri;
			h = *p;
		}
	}

	if (h)
		// this line writes the hypervisor name to the kernel log (readable through `/dev/kmsg`)
		pr_info("Hypervisor detected: %s\n", h->name);

	return h;
}
```

### Checking Linux's pseudo-filesystems

Linux exposes a lot of information through special files (mostly in `/proc`)
that are generated at boot and updated at runtime. Many programs rely on this
directory, such as `ps`, `uname`, `lspci` and so on. This information is really
helpful when trying to identify whether or not you are in a virtualized
environment, like UML for instance. UML refers to the aforementioned way of
executing a Linux kernel in user-space. This can easily be verified by looking
for the string "User Mode Linux" in the file `/proc/cpuinfo`, which describes
the CPU of the machine.
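
The UML check described above can be sketched in a few lines. The function name `isUserModeLinux` is illustrative; only the searched string and the file path come from the text.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isUserModeLinux reports whether the given /proc/cpuinfo content contains
// the "User Mode Linux" string.
func isUserModeLinux(cpuinfo string) bool {
	return strings.Contains(cpuinfo, "User Mode Linux")
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("running under UML:", isUserModeLinux(string(data)))
}
```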

In the same way, many of these virtual *files* can provide information on the
environment, including but not limited to `/proc/sysinfo` (in which some
distributions expose data about virtual machines), `/proc/device-tree` (which
lists the devices on the machine), `/proc/xen` (an entry created by *Xen*) or
`/proc/modules` (which contains information about the loaded kernel modules,
some of which are used by hypervisors to optimize guests).
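
As an example for `/proc/modules`, the sketch below looks for module names that start with prefixes commonly used by guest drivers. The prefix list is an assumption chosen for illustration (VirtualBox, VMware, virtio, Hyper-V and Xen drivers) and is neither exhaustive nor taken from the `vmdetect` package.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// guestPrefixes is an illustrative, non-exhaustive list of module name
// prefixes used by guest drivers.
var guestPrefixes = []string{"vbox", "vmw", "virtio", "hv_", "xen"}

// guestModules returns the module names from /proc/modules content that
// start with one of the guest prefixes. The module name is the first field
// of each line.
func guestModules(modules string) []string {
	var found []string
	for _, line := range strings.Split(modules, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		for _, p := range guestPrefixes {
			if strings.HasPrefix(fields[0], p) {
				found = append(found, fields[0])
				break
			}
		}
	}
	return found
}

func main() {
	data, err := os.ReadFile("/proc/modules")
	if err != nil {
		fmt.Println("cannot read /proc/modules:", err)
		return
	}
	fmt.Println("guest-related modules:", guestModules(string(data)))
}
```

Drivers compiled directly into the kernel do not appear in `/proc/modules`, and some of these modules can also be loaded on a physical host, so the result should be combined with other checks.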

Like *procfs* (mounted on `/proc`), *sysfs* (mounted on `/sys`) can be useful.
Its role is to give the user access to devices and their drivers. The file
`/sys/hypervisor/type`, for instance, is sometimes used to store information
about the hypervisor Linux is running on.


## Windows

<!-- TODO -->

paper/40_sources.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
\newpage

# Sources

- https://docs.microsoft.com/fr-fr/virtualization/windowscontainers/manage-containers/hyperv-container
- https://poweruser.blog/lightweight-windows-containers-using-docker-process-isolation-in-windows-10-62519be76c8c
- https://fr.wikipedia.org/wiki/Virtualisation#Diff%C3%A9rentes_techniques
- https://en.wikipedia.org/wiki/Hypervisor
- https://fr.wikipedia.org/wiki/X64
- https://www.docker.com/resources/what-container
- https://securiteam.com/securityreviews/6z00h20bqs/
- https://arxiv.org/pdf/1811.01190.pdf
- https://daks2k3a4ib2z.cloudfront.net/5757fcb8825e8dbc6c852e3c/59ad6c357ba794000108098c_Minerva_Introduction_to_Evasive_Techniques.pdf
- https://en.wikipedia.org/wiki/Desktop_Management_Interface
- https://github.com/torvalds/linux/blob/31cc088a4f5d83481c6f5041bd6eb06115b974af/arch/x86/kernel/cpu/hypervisor.c
- https://www.ibm.com/support/knowledgecenter/en/linuxonibm/com.ibm.linux.z.lhdd/lhdd_t_sysinfo.html

paper/img/container.png

11.9 KB

paper/img/hypervisors.png

28.3 KB

paper/paper.pdf

84.2 KB
Binary file not shown.
