hacktricks/forensics/basic-forensic-methodology/specific-software-file-type.../.pyc.md

251 lines
12 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Decompile compiled python binaries (exe, elf) - Retreive from .pyc
<details>
<summary><a href="https://cloud.hacktricks.xyz/pentesting-cloud/pentesting-cloud-methodology"><strong>☁️ HackTricks Cloud ☁️</strong></a> -<a href="https://twitter.com/hacktricks_live"><strong>🐦 Twitter 🐦</strong></a> - <a href="https://www.twitch.tv/hacktricks_live/schedule"><strong>🎙️ Twitch 🎙️</strong></a> - <a href="https://www.youtube.com/@hacktricks_LIVE"><strong>🎥 Youtube 🎥</strong></a></summary>
* Do you work in a **cybersecurity company**? Do you want to see your **company advertised in HackTricks**? or do you want to have access to the **latest version of the PEASS or download HackTricks in PDF**? Check the [**SUBSCRIPTION PLANS**](https://github.com/sponsors/carlospolop)!
* Discover [**The PEASS Family**](https://opensea.io/collection/the-peass-family), our collection of exclusive [**NFTs**](https://opensea.io/collection/the-peass-family)
* Get the [**official PEASS & HackTricks swag**](https://peass.creator-spring.com)
* **Join the** [**💬**](https://emojipedia.org/speech-balloon/) [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** me on **Twitter** [**🐦**](https://github.com/carlospolop/hacktricks/tree/7af18b62b3bdc423e11444677a6a73d4043511e9/\[https:/emojipedia.org/bird/README.md)[**@carlospolopm**](https://twitter.com/hacktricks\_live)**.**
* **Share your hacking tricks by submitting PRs to the** [**hacktricks repo**](https://github.com/carlospolop/hacktricks) **and** [**hacktricks-cloud repo**](https://github.com/carlospolop/hacktricks-cloud).
</details>
<img src="../../../.gitbook/assets/image (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1).png" alt="" data-size="original">
If you are interested in **hacking career** and hack the unhackable - **we are hiring!** (_fluent polish written and spoken required_).
{% embed url="https://www.stmcyber.com/careers" %}
## From Compiled Binary to .pyc
From an **ELF** compiled binary you can **get the .pyc** with:
```bash
pyi-archive_viewer <binary>
# The list of python modules will be given here:
[(0, 230, 311, 1, 'm', 'struct'),
(230, 1061, 1792, 1, 'm', 'pyimod01_os_path'),
(1291, 4071, 8907, 1, 'm', 'pyimod02_archive'),
(5362, 5609, 13152, 1, 'm', 'pyimod03_importers'),
(10971, 1473, 3468, 1, 'm', 'pyimod04_ctypes'),
(12444, 816, 1372, 1, 's', 'pyiboot01_bootstrap'),
(13260, 696, 1053, 1, 's', 'pyi_rth_pkgutil'),
(13956, 1134, 2075, 1, 's', 'pyi_rth_multiprocessing'),
(15090, 445, 672, 1, 's', 'pyi_rth_inspect'),
(15535, 2514, 4421, 1, 's', 'binary_name'),
...
? X binary_name
to filename? /tmp/binary.pyc
```
In a **python exe binary** compiled you can **get the .pyc** by running:
```bash
python pyinstxtractor.py executable.exe
```
## From .pyc to python code
For the **.pyc** data ("compiled" python) you should start trying to **extract** the **original** **python** **code**:
```bash
uncompyle6 binary.pyc > decompiled.py
```
**Be sure** that the binary has the **extension** "**.pyc**" (if not, uncompyle6 is not going to work)
While executing **uncompyle6** you might find the **following errors**:
### Error: Unknown magic number 227
```bash
/kali/.local/bin/uncompyle6 /tmp/binary.pyc
Unknown magic number 227 in /tmp/binary.pyc
```
To fix this you need to **add the correct magic number** at the beginning of the generated file.
**Magic numbers vary with the python version**, to get the magic number of **python 3.8** you will need to **open a python 3.8** terminal and execute:
```
>> import imp
>> imp.get_magic().hex()
'550d0d0a'
```
The **magic number** in this case for python3.8 is **`0x550d0d0a`**, then, to fix this error you will need to **add** at the **beginning** of the **.pyc file** the following bytes: `0x0d550a0d000000000000000000000000`
**Once** you have **added** that magic header, the **error should be fixed.**
This is how a correctly added **.pyc python3.8 magic header** will look like:
```bash
hexdump 'binary.pyc' | head
0000000 0d55 0a0d 0000 0000 0000 0000 0000 0000
0000010 00e3 0000 0000 0000 0000 0000 0000 0000
0000020 0700 0000 4000 0000 7300 0132 0000 0064
0000030 0164 006c 005a 0064 0164 016c 015a 0064
```
### Error: Decompiling generic errors
**Other errors** like: `class 'AssertionError'>; co_code should be one of the types (<class 'str'>, <class 'bytes'>, <class 'list'>, <class 'tuple'>); is type <class 'NoneType'>` may appear.
This probably means that you **haven't added correctly** the magic number or that you haven't **used** the **correct magic number**, so make **sure you use the correct one** (or try a new one).
Check the previous error documentation.
## Automatic Tool
The tool [https://github.com/countercept/python-exe-unpacker](https://github.com/countercept/python-exe-unpacker) glues together several tools available to the community that **help researchers to unpack and decompile executable** written in python (py2exe and pyinstaller).
Several YARA rules are available to determine if the executable is written in python (This script also confirms if the executable is created with either py2exe or pyinstaller).
### ImportError: File name: 'unpacked/malware\_3.exe/**pycache**/archive.cpython-35.pyc' doesn't exist
Currently, with unpy2exe or pyinstxtractor the Python bytecode file we get might not be complete and in turn, it **cant be recognized by uncompyle6 to get the plain Python source code**. This is caused by a missing Python **bytecode version number**. Therefore we included a prepend option; this will include a Python bytecode version number into it and help to ease the process of decompiling. When we try to use uncompyle6 to decompile the .pyc file it returns an error. However, **once we use the prepend option we can see that the Python source code has been decompiled successfully**.
```
test@test: uncompyle6 unpacked/malware_3.exe/archive.py
Traceback (most recent call last):
……………………….
ImportError: File name: 'unpacked/malware_3.exe/__pycache__/archive.cpython-35.pyc' doesn't exist
```
```
test@test:python python_exe_unpack.py -p unpacked/malware_3.exe/archive
[*] On Python 2.7
[+] Magic bytes are already appended.
# Successfully decompiled file
[+] Successfully decompiled.
```
## Analyzing python assembly
If you weren't able to extract the python "original" code following the previous steps, then you can try to **extract** the **assembly** (but i**t isn't very descriptive**, so **try** to extract **again** the original code).In [here](https://bits.theorem.co/protecting-a-python-codebase/) I found a very simple code to **disassemble** the _.pyc_ binary (good luck understanding the code flow). If the _.pyc_ is from python2, use python2:
```bash
>>> import dis
>>> import marshal
>>> import struct
>>> import imp
>>>
>>> with open('hello.pyc', 'r') as f: # Read the binary file
... magic = f.read(4)
... timestamp = f.read(4)
... code = f.read()
...
>>>
>>> # Unpack the structured content and un-marshal the code
>>> magic = struct.unpack('<H', magic[:2])
>>> timestamp = struct.unpack('<I', timestamp)
>>> code = marshal.loads(code)
>>> magic, timestamp, code
((62211,), (1425911959,), <code object <module> at 0x7fd54f90d5b0, file "hello.py", line 1>)
>>>
>>> # Verify if the magic number corresponds with the current python version
>>> struct.unpack('<H', imp.get_magic()[:2]) == magic
True
>>>
>>> # Disassemble the code object
>>> dis.disassemble(code)
1 0 LOAD_CONST 0 (<code object hello_world at 0x7f31b7240eb0, file "hello.py", line 1>)
3 MAKE_FUNCTION 0
6 STORE_NAME 0 (hello_world)
9 LOAD_CONST 1 (None)
12 RETURN_VALUE
>>>
>>> # Also disassemble that const being loaded (our function)
>>> dis.disassemble(code.co_consts[0])
2 0 LOAD_CONST 1 ('Hello {0}')
3 LOAD_ATTR 0 (format)
6 LOAD_FAST 0 (name)
9 CALL_FUNCTION 1
12 PRINT_ITEM
13 PRINT_NEWLINE
14 LOAD_CONST 0 (None)
17 RETURN_VALUE
```
## Python to Executable
To start, were going to show you how payloads can be compiled in py2exe and PyInstaller.
### To create a payload using py2exe:
1. Install the py2exe package from [http://www.py2exe.org/](http://www.py2exe.org)
2. For the payload (in this case, we will name it hello.py), use a script like the one in Figure 1. The option “bundle\_files” with the value of 1 will bundle everything including the Python interpreter into one exe.
3. Once the script is ready, we will issue the command “python setup.py py2exe”. This will create the executable, just like in Figure 2.
```
from distutils.core import setup
import py2exe, sys, os
sys.argv.append('py2exe')
setup(
options = {'py2exe': {'bundle_files': 1}},
#windows = [{'script': "hello.py"}],
console = [{'script': "hello.py"}],
zipfile = None,
)
```
```
C:\Users\test\Desktop\test>python setup.py py2exe
running py2exe
*** searching for required modules ***
*** parsing results ***
*** finding dlls needed ***
*** create binaries ***
*** byte compile python files ***
*** copy extensions ***
*** copy dlls ***
copying C:\Python27\lib\site-packages\py2exe\run.exe -> C:\Users\test\Desktop\test\dist\hello.exe
Adding python27.dll as resource to C:\Users\test\Desktop\test\dist\hello.exe
```
### To create a payload using PyInstaller:
1. Install PyInstaller using pip (pip install pyinstaller).
2. After that, we will issue the command “pyinstaller onefile hello.py” (a reminder that hello.py is our payload). This will bundle everything into one executable.
```
C:\Users\test\Desktop\test>pyinstaller --onefile hello.py
108 INFO: PyInstaller: 3.3.1
108 INFO: Python: 2.7.14
108 INFO: Platform: Windows-10-10.0.16299
………………………………
5967 INFO: checking EXE
5967 INFO: Building EXE because out00-EXE.toc is non existent
5982 INFO: Building EXE from out00-EXE.toc
5982 INFO: Appending archive to EXE C:\Users\test\Desktop\test\dist\hello.exe
6325 INFO: Building EXE from out00-EXE.toc completed successfully.
```
## References
* [https://blog.f-secure.com/how-to-decompile-any-python-binary/](https://blog.f-secure.com/how-to-decompile-any-python-binary/)
<img src="../../../.gitbook/assets/image (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1).png" alt="" data-size="original">
If you are interested in **hacking career** and hack the unhackable - **we are hiring!** (_fluent polish written and spoken required_).
{% embed url="https://www.stmcyber.com/careers" %}
<details>
<summary><a href="https://cloud.hacktricks.xyz/pentesting-cloud/pentesting-cloud-methodology"><strong>☁️ HackTricks Cloud ☁️</strong></a> -<a href="https://twitter.com/hacktricks_live"><strong>🐦 Twitter 🐦</strong></a> - <a href="https://www.twitch.tv/hacktricks_live/schedule"><strong>🎙️ Twitch 🎙️</strong></a> - <a href="https://www.youtube.com/@hacktricks_LIVE"><strong>🎥 Youtube 🎥</strong></a></summary>
* Do you work in a **cybersecurity company**? Do you want to see your **company advertised in HackTricks**? or do you want to have access to the **latest version of the PEASS or download HackTricks in PDF**? Check the [**SUBSCRIPTION PLANS**](https://github.com/sponsors/carlospolop)!
* Discover [**The PEASS Family**](https://opensea.io/collection/the-peass-family), our collection of exclusive [**NFTs**](https://opensea.io/collection/the-peass-family)
* Get the [**official PEASS & HackTricks swag**](https://peass.creator-spring.com)
* **Join the** [**💬**](https://emojipedia.org/speech-balloon/) [**Discord group**](https://discord.gg/hRep4RUj7f) or the [**telegram group**](https://t.me/peass) or **follow** me on **Twitter** [**🐦**](https://github.com/carlospolop/hacktricks/tree/7af18b62b3bdc423e11444677a6a73d4043511e9/\[https:/emojipedia.org/bird/README.md)[**@carlospolopm**](https://twitter.com/hacktricks\_live)**.**
* **Share your hacking tricks by submitting PRs to the** [**hacktricks repo**](https://github.com/carlospolop/hacktricks) **and** [**hacktricks-cloud repo**](https://github.com/carlospolop/hacktricks-cloud).
</details>