Why do you need NOPs?

I’ve read a lot of things and taken both the OSCP and OSCE courses, yet I’ve never seen anyone really break out why we use NOPsleds. There are instances where they are used to line up shellcode to a particular offset, which is self-explanatory. However, there are cases where an exploit won’t work without them. Why is that?

Looking at the TRUN command in Vulnserver, it’s a relatively easy exploit. We begin crafting our overflow string, this time with no NOPs. We already established the EIP offset is 2003, and we’re using reverse shell shellcode generated by msfvenom. The only bad character identified is \x00, we we encode with the default x86/shikata_ga_nai encoder and specify the bad character. POC code for the exploit is available here.

No NOPs this time

In this instance, we are using the JMP ESP instruction located at 0x62501205 to jump to our shellcode. Before sending the exploit, ensure a breakpoint is set at this address. We send the exploit and step through it in the debugger. Take the jump to ESP and we see, we are properly lined up.

Sitting at ESP…

If we move forward one instruction at a time, the program crashes and shellcode does not execute. If we’re paying attention, we will see that part of our shellcode was overwritten with some weird instructions.

RET instruction is gone, shellcode overwritten…

So, back to the drawing board. Modify the string to add the nops in, and let’s look at execution.

NOPsled in place…

At this point we can see our shellcode at the end of the NOPs. If we continue execution, we will receive a shell. But at this point it’s important to step back and understand the process here.

Shellcode begins…

Starting at 0x00b7fa24 we see our shellcode, but what we actually see if the msfvenom decoding stub. The important part to note is the instruction at 0x00b7fa2b: FSTENV (28-Byte) PTR SS:[ESP-C]. Floating point instructions are used for placing EIP on the stack, which is useful for all sorts of reasons but for our purposes here in order to perform relative calculations for the decoder to work. If we advance execution to 0x00b7fa2f (but don’t execute this instruction) we can see this play out.

Did not crash this time…

So we use FXCH to manipulate the floating point registers and put the value of EIP into an FPU register. Then we execute the FSTENV (28-BYTE) PTR SS:[ESP-c] instruction, which dumps the floating point environment into memory. It dumps 28 bytes of data starting at ESP-C, which in this case would be 0x00b7fa00. If we look at the stack, the memory address that FXCH instruction was performed at is now at the top of the stack. The next instruction will pop this value into EBX, and now EBX will be used by the decoder for relative calculations. At this point, if we press F9 to continue execution (in Immunity…)

It works!

I find this very interesting to see in action. I happened on information about how these instructions worked when dealing with a stack alignment problem in another exploit. Understanding how the decoder worked could have saved me some time.

Vulnserver: LTER SEH Buffer Overflow

Vulnserver is an intentionally vulnerable application used for training exploit development. It consists of several commands, some vulnerable and some not, and the the user is intended to find and exploit these vulnerabilities. For many specific vulnerabilities, there are several ways to exploit them. In my preparation for the OSCE exam, I was able to find and exploit each command in turn. However, it wasn’t until reviewing the infamous HP NNM 7.5.1 exploit that I was able to exploit the LTER command in Vulnserver by overflowing the SEH address.

To start we find the crash. I used boofuzz for this, using a template found out on this blog site. The crash should occur fairly quickly.

boofuzz template for vulnserver

After fuzzing we replicate the crash manually by sending a metasploit pattern to identify the offset. We get the “LTER /.:/” prepend string from the fuzz results.

Sending a pattern of 5000 bytes

Ensure the application is open and attached to a debugger on the target machine. The application has crashed and we can see the MSF pattern overwriting several locations in memory. Inspecting the SEH chain shows us the SEH pointer is overwritten. Querying these values in the pattern_offset utility in MSF returns an offset of 3495 for NSEH, 3499 for SEH. Note, the offset for NSEH on Windows 2003 will be 3491, on Windows Vista it will be 3515. By sending a shorter buffer, you can overwrite EIP directly instead of overwriting the SEH pointer. This can be a useful exercise for dealing with character restrictions in a simpler problem.

SEH pointer is overwritten

Further testing shows that we have 28 bytes following SEH to test bad characters manually. While testing, we send the below string:

testing bad characters…

And we find some strange results:

Everything after 7F is mangled…

It wasn’t until doing the OSCE course that I really recognized what was happening here and some of the additional options I have here. The characters after 7F are being mangled, but testing reveals that they are being mangled in a predictable way, basically subtracting 7F from any value greater than 7F. The end result is that we are left with a more or less alpha-numeric character set to work with.

The next step is to identify a “pop pop ret” pointer in the essfunc.dll module that consists of allowed characters. We can use mona or findjmp.exe in order to find this address, or just search immunity for “pop r32 pop r32 ret” but this is the least readable of the three options. We add this address into our attack string at the SEH location.

pop pop ret found at 0x6250120b in essfunc.dll

We need more space to work with, 28 bytes is not enough to function with even without character restrictions. We can take a backwards jump, with a couple of modifications. For a normal backwards jump, we would use the opcodes eb XX, where XX is equal to the number of bytes we want to jump, minus 1, subtracted from 255 and converted to hex. So if we want to jump back 64 bytes we would use c0, 128 bytes would be 80. We can’t use either of these but if we use FF it will be converted to 80 when vulnserver.exe does it’s alpha-numeric conversion. Instead of using eb for a short jump, we can use 77 for a conditional short jump. This jump relies on the zero flag and the carry flag being unset. We can ensure they are set to zero by putting an operation in front of them, such as \x42 which translates to INC EDX. It would take a very unlikely set of circumstances for INC EDX to lead to the zero flag and carry flag being set.

Backwards alphanumeric jump

We set nseh equal to “\x42\x77\xff\x42” and add it to the attack string. If we send this string and follow it in the debugger, then take the jump, we arrive at about 127 bytes of code we can use.

Jump back into the buffer, giving us some breathing room…

Now that we have more space, we can sub-encode values. Sub encoding uses alpha-numeric values and SUB instructions to put specific values we need on the stack. As an example, we will take the value 0xe7ffe775 and sub encode it. First, subtract the value from 0xffffffff and then add 1. We get the value 1800188b. We can then break the bytes out into a table, as seen below. We must ensure that the values we pick add up to our bytes in the left column, but also that they are not in the list of bad characters for the application.

Table for manual sub encoding

To break this down, if we first zero out EAX, we can then perform these instructions (SUB EAX, 15521542; SUB EAX,015e0208; SUB EAX, 01500141) and then EAX will contain the value e7ffe775. Push this value on the stack, and now we have our decoded instructions on the stack. There are calculators online for doing this kind of encoding but it’s a good exercise to do it manually and learn how it is done.

My objective here was to jump all the way back to the beginning of the buffer so that I have 3000+ bytes to work with. The first thing we need to do is align the stack in our current buffer area. To do this we increase ESP by 1188 in order to place ESP right at the end of our current buffer. So whatever we push onto the stack will get executed. To do this, we push the value of ESP onto the stack, we pop it into EAX, we adjust it using sub encoding, and then we push the value of EAX onto the stack and pop it into ESP. Now, our stack is located at the end of our current buffer segment.

Sub encoded stack adjustment

From here, we want to jump backwards from the location of our stack to the beginning of our buffer. We do the math and see that we need to jump back 0xdb9, we would normally perform a near jump to FFFFF264, opcodes would translate to E9 64F2FFFF due to endianess. We will have to write this as two instructions on the stack to account for the uneven number of bytes, since we can only write 4 bytes at a time. We must zero out the EAX register before sub encoding instructions. We want to sub encode the values 64F2FFFF and E9414141 and push them on the stack in that order.

Sub encoding a near jump back to the beginning of our buffer

In this instance, I am using the same method muts used in the NNM exploit to zero out EAX. There are other methods, the one I normally like is pushing 41414141 onto the stack, popping it into EAX, and then XORing EAX with 41414141 again. If we execute these instruction, we see our jump get revealed at our stack location.

We’re jumping back and…

And now we are back at the beginning of the buffer and have a ton of buffer space for code execution. We’re actually 8 bytes into our buffer here, which is important to note for keeping our attack string aligned.

Now we’re at 00b7f23c

Now we have plenty of space. Msfvenom can generate alpha-numeric shellcode, so at this point we’re home free. Or almost. There are two other factors to consider. The first is that msfvenom-generated alpha-numeric shellcode is prepended by a non-alpha-numeric stub which sets the shellcode location in order to to operations relative to the start of the shellcode. This is a problem in this case. Offsec is nice enough to document the BufferRegister feature of msfvenom, which sets your buffer to a particular register. In order for this to work, we need to line up a register with our shellcode. So we adjust ESP using the same method as before and we can see below that our stack now lines up with our buffer, so the BufferRegister setting in msfvenom will work, our shellcode is waiting for us about 900 bytes and change away, and we should have a shell.

Lined up buffer.

You’ll note that this doesn’t work, and this is due to stack alignment. The stack must be aligned to a DWORD in x86 processors, which means that the memory address must be divisible by 4. Since ESP has been set to a place not evenly divisible by 4, instructions get confused and execution just gets messed up. So we adjust the stack to a location divisible by 4 and ensure we adjust all of our padding as well, and that’s it.

Adjusted shellcode

And…

It works.

I have the working code posted here. It should be noted that I chose ESP, but you can use any register, I believe