Developing PE file packer step-by-step. Step 12 – bugfixes

Thanks to the guy from commentaries in previous posts about the packer, one amazing bug in the code was discovered, which I tried to fix quickly. The commentator, without analyzing the packer operation in detail, stated that the code packed with it will not be able to work with SEH if DEP is turned on. Under such conditions the code worked well (because all unpacked code in memory is located within one single PE file section marked as executable. UPX has same operation logic.). However, suddenly the following bug was discovered: if the program is built in MSVC++, uses SEH and has relocations, it will likely crash on first exception (more precisely, if the file was loaded to the address other than base). DEP, of course, has nothing to do with this. The thing is in disastrous IMAGE_LOAD_CONFIG_DIRECTORY directory. It is created by Visual Studio linker. Of useful information it contains the address table (RVA) of SE handlers and a pointer to CRT internal variable __security_cookie. As it turned out, this directory is necessary not only for CRT internals (although it, as it seems, actually doesn’t care about this structure), but also for the system loader (at least, in Win7. WinXP, it seems, ignores this directory too). The packer, which I developed, moves this directory to other section (see here). Thus the issue can be fixed by adding several records to relocations table, which is created by the packer. These records will fix the addresses pointing to security cookie and SE handlers table, to let it read necessary information from this directory at loading stage.

Except correcting this bug I updated the packer code to make it buildable with latest version (1.0.0) of PE library (PE Bliss). It is always available to download here.

By the way, about PE Bliss library. Currently in my free time I develop new version, which will have the following features (the list is exemplary and can be changed):
– high-level work with additional types of PE file resources;
– detailed .NET binaries parsing (metadata, signatures, resources);
– library wrapper in C++/CLI, which allows .NET developers to use library functionality comfortably in C# or Visual Basic .NET software.

Download packer sources: packer source
Download binary: packer binary

PS. In Windows 8 and 8.1 image load configuraton directory has been exanded (to support Control Flow Guard), so the packer will be unable to pack newest binaries from that operation systems, if IMAGE_LOAD_CONFIG_DIRECTORY is present.

UPDATE 24.05.2016: relocations generation has been updated. In some rare cases (when TLS data was big enough, and load config directory was present, relocation table addresses could overflow, which resulted in corrupted packed binary).

Developing PE file packer step-by-step. Step 9. Delay-loaded DLLs and Image Config

Previous step is here.

Today we will do that little things, which I’ve put aside during my old packer development. Our new packer can do everything already, but we have a couple of small nuances and it will be good to finish with them. The first one is delay-loaded import. It allows to load required PE file libraries when they are really needed, thus saving time on loading image to memory. This mechanism is implemented by compilers/linkers only and it is not related to the loader. However, there is IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT directory in PE header, which points to delayed import data. I don’t know whether this is used by linker and built program, but the loader ignores it, but we better leave this directory and don’t zero it. Let’s remove the line

Continue reading “Developing PE file packer step-by-step. Step 9. Delay-loaded DLLs and Image Config”

Developing PE file packer step-by-step. Step 8. DLL’s and exports

Previous step is here.

Our packer can do almost everything already, except one thing – packing binaries with exports. This, in particular, is an absolute majority of DLL files and OCX components. Some EXE files also have exports. Our packer should rebuild export table and place it to available space, to make it accessible for the loader.

We can relax a bit for now – we have to add just a small amount of code to the packer (generally, the same for the unpacker, but it will be in assembler).

Continue reading “Developing PE file packer step-by-step. Step 8. DLL’s and exports”

Developing PE file packer step-by-step. Step 7. Relocations

Previous step is here. By the way, there was a bug in the code, I fixed it. It appeared when PE file had more than one callback.

Let’s turn to the next important part of many PE files – relocations. They are used, when it is impossible to load an image to the base address indicated in its header. Mainly this is a typical behavior for DLL files (basically they can’t work properly without relocations). Imagine that EXE file is being loaded to 0x400000 address. This EXE file loads the DLL, which is also loaded to this address. The addresses are the same, and the loader will look for DLL file relocations, because the DLL is loaded after the EXE. If there will be no relocations, loading will fail.

The relocations themselves are just the set of the tables with pointers to DWORDs, which should be calculated by the loader, if the image is loaded by an address other than base. There are many types of relocations, but actually only two are used in x86 (PE): IMAGE_REL_BASED_HIGHLOW = 3 and IMAGE_REL_BASED_ABSOLUTE = 0, and the second one does nothing, it is required only for relocation tables alignment.

I will just say, that loader loads EXE files to base address almost always, without using relocation. Our packer is unable to pack DLL yet, so to test relocation packing we should create an EXE file with incorrect base address, and then the loader will have to change it and apply relocations. I will not provide the test project source code here, you will find it in solution at the end of the article. I set 0x7F000000 as a base address (Linker – Advanced – Base Address).

We have to process relocations manually, like everything else, after unpacking a file. We should notify the loader in advance, that the file has relocations. Also, we need to know a new address, to which the loader moved the file.

We don’t have to do anything to notify the loader, that our file has relocations – we still have necessary flags set in PE file headers left from original file. However, we need to know at which address the file was loaded.

Continue reading “Developing PE file packer step-by-step. Step 7. Relocations”

Developing PE file packer step-by-step. Step 6. TLS

Previous step is here.

It’s time to manage such important thing as Thread Local Storage (TLS). What is it? It is a small structure, which tells PE loader where it has to place the data which should be allocated for each thread. The loader also calls TlsAlloc, and the return value is stored at the address specified in this structure (this is called index). Besides that, this structure may contain address of an array storing a set of callbacks (function addresses), which are called by loader when the file is loaded into memory or when a new thread in process is created.

To be honest, working with TLS will be somewhat more hardcore, than it was with other things, so get prepared and strain your brain. My old packer I mentioned one or two steps ago don’t support TLS callbacks, it notifies cowardly that they exist but they are not processed. Basically, this is a reasonable behaviour, as TLS callbacks are contained mainly in rather weird files, which use them as anti-debugging trick. There is no regular linker like Borland or Microsoft linker, with TLS callback creation support. However, we will add their support to make our packer cool.

Continue reading “Developing PE file packer step-by-step. Step 6. TLS”

Developing PE file packer step-by-step. Step 5. Resources

Previous step is here.

It’s time to improve our packer. It can pack and run simple binaries already, which have only import table. Binaries with exports, resources, TLS, DLL with relocations are still beyond its powers. We have to work on this. Let’s implement processing of second important thing after imports – of resource directory.

Continue reading “Developing PE file packer step-by-step. Step 5. Resources”

Developing PE file packer step-by-step. Step 4. Running

Previous step is here.

So, from previous steps we have working packer and basic unpacker, which does nothing yet. At this step we will make run simple packed programs (which have nothing, except import table and possibly relocations). First thing to do in addition to data uncompressing is to fix original file import table. Usually the loader does that, but for now we play its role for compressed file.

Continue reading “Developing PE file packer step-by-step. Step 4. Running”

Developing PE file packer step-by-step. Step 3. Unpacking

Previous step is here.

Let’s continue! It’s time to write an unpacker, this is what we are going to do during this step. We will not process import table for now, because we have some other things to do at this lesson.

We will begin from the following thing. To operate, the unpacker definitely needs two WinAPI functions: LoadLibraryA and GetProcAddress. In my old packer (that I’ve written once) I developed unpacker stub in MASM32 without creating import table at all. I looked for these function addresses in kernel, which is rather complicated and hardcore, besides that, this may cause serious antivirus suspicions. This time, let’s create import table and make loader to tell us these function addresses. Of course, set of these two functions in import table is as suspicious as their total absence, but nothing prevents us from adding more random imports from different .dll files in future. Where will the loader store these two function addresses? It’s time to expand our packed_file_info structure!

Continue reading “Developing PE file packer step-by-step. Step 3. Unpacking”

Developing PE file packer step-by-step. Step 2. Packing

Previous step is here.

Straight off I want to say that as I write this series of articles I fix some things and update my PE library (Note, that this step is for 0.1.x versions, too).

And we continue to develop our own packer. At this step it is time to turn directly to PE file packing. I shared a simple packer long time ago, which was ineffective by two reasons: firstly, it uses standard Windows functions for data packing and unpacking, which are rather slow and have low compression rate, secondly, all PE file sections were packed individually, which is not very optimal. This time I will do this differently. We are going to read data of all sections at once, assemble them into one block and pack. So, the resulting file will have only one section (actually two, I will explain this later), we can place all the resources, the packer code and helper tables into it. This will provide some benefits, because we don’t need to spend space for file alignment, besides that, LZO algorithm is much more effective than RtlCompressBuffer in all respects.

Continue reading “Developing PE file packer step-by-step. Step 2. Packing”

Developing PE file packer step-by-step. Step 1

Since I completed portable executable C++ library development, it would be totally wrong not to use it in any more or less serious project. Thus I am going to develop a packer with step-by-step explanations of what I am doing, and C++ library will make our life easier. So, where do we start the development? Maybe, from choosing some free simple compression algorithm. After short search I found such one: LZO. It supports lots of compression modes, and LZO1Z999 is the most effective by compression ratio of all available. Of course, it is not like ZIP, but its performance is close: 550 Kb file was compressed to 174 Kb with zip with maximum compression level, at the same time LZO compressed this file to 185 Kb. However, LZO has much more fast unpacker. It is also base-independent, that means, it can be placed at any virtual address and it will work without any address corrections. This algorithm will be right for us.

Continue reading “Developing PE file packer step-by-step. Step 1”