Down the Rabbit Hole - A Deep Dive into an attack on an RPC interface
June 07, 2018 | Fritz Sands
Recently, the ZDI received and purchased a trove of vulnerability reports from an anonymous researcher who targeted the Advantech WebAccess product. The vulnerabilities all involved C string handling functions (strcpy and sprintf, for the most part) and were reached through two RPC interfaces. The proofs of concept for the vulnerability submissions are very detailed and show how to write a custom RPC client to set up attacks. This blog post walks through one of the attacks, showing all of the details. Additionally, we are releasing the source code for this attack in parallel with this post. You can download it here.
The source code and the techniques can provide a template for further research in other RPC interfaces.
Advantech WebAccess is a SCADA Human Machine Interface (HMI) system. Installation and setup open ports 4592 and 14592 for TCP traffic. These ports are serviced by processes (webvrpcs.exe and datacore.exe) that run in the context of a local administrator. Both ports use Remote Procedure Call (RPC) protocols to communicate with clients, and both of the RPC interfaces can be called from remote unauthenticated clients.
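Before going further, it can help to confirm that the two ports are actually reachable on a lab installation. The following is a minimal sketch using only the Python standard library. The function name is an assumption made for illustration and is not part of the released proof of concept.

```python
import socket

# The two TCP ports named in the post.
WEBACCESS_RPC_PORTS = (4592, 14592)


def check_tcp_ports(host, ports=WEBACCESS_RPC_PORTS, timeout=2.0):
    """Return {port: bool}, True when a plain TCP connect to the port succeeds.

    This only shows that something is listening. It does not speak RPC.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling `check_tcp_ports("192.0.2.10")`, where the address is a placeholder for a test machine you own, returns a dictionary keyed by port number.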
RPC! OMG! What do I do with that?
While a number of reports of vulnerabilities accessed through RPC interfaces have come through the ZDI program, they really have been a small percentage of the total cases we see. This is despite the fact that there are a LOT of software packages that deploy their services using RPC, and the fact that a lot of the code has not been revisited in years. I think it is likely that the learning curve and initial investment in analysis that is necessary to probe RPC services for vulnerable code intimidate a lot of security researchers. This is an area of security research that is well worth exploring. With proper tools, an RPC interface can be explored efficiently.
There are several RPC models. The initial specification for RPC was defined by IEEE in 1991. The Microsoft RPC implementation is compatible with the Open Group’s Distributed Computing Environment (DCE) and is interoperable with other DCE-based RPC systems.
Microsoft has a great set of documentation on its implementation of RPC here. The core of an RPC interface is the definition of the functions that are invoked by a client and executed on the server. Typically, that interface definition is contained in an Interface Definition Language (IDL) text file. The Microsoft IDL (MIDL) compiler generates stub program files for the client and the server from the IDL file. Those C files are compiled and then linked into the client and server binaries.
For some great base knowledge of reverse engineering RPC interfaces, I recommend the following resource from the Last Stage of Delirium. This presentation is old, but still quite good. The techniques in that presentation can be used to reverse engineer RPC interface definitions from the server binary file back into an IDL file. The presentation provides some basic roadmaps for research and exploitation as well as a detailed description of the protocol sequences used by Microsoft RPC.
Additionally, IDL reversing will provide you with the address in the server code of the interface method stub, which you can then use with a static analysis package to trace data from the client on its way through the service.
Let’s Take A Walk Through A Proof of Concept
In the following, please refer to the source code files that accompany this blog post.
The vulnerability report I am using for this blog is CVE-2016-0856 (ZDI-16-048). This covers a vulnerability in Advantech WebAccess version 8.0 and was fixed in version 8.2 of the product. The vulnerability is a stack buffer overflow caused by a sprintf call and was given a CVSS v3 severity rating of 9.8. Here’s the ICS-CERT advisory.
Our first step is to take a look at the IDL that the researcher created (of course, just saying that ignores a huge amount of reverse engineering). The Last Stage of Delirium presentation referenced above gives tips and techniques for creating an IDL from the RPC stub information that compilation and linking bakes into a server binary.
Webcore.idl contains the interface we are interested in for this proof of concept. Almost all of the methods contain a “connection ID” as an input to the method. For example:
Here is the IDA Pro pseudocode of that function:
This is a C stub function generated by the MIDL compiler from information in the IDL file. The actual work of the call is done in the call to sub_402B50. Everything before and after that call is data marshaling and un-marshaling. One thing to note is how stylized the code looks. This formalism allows tools (discussed in the Last Stage of Delirium presentation referenced above) to reverse the MIDL compilation and build an IDL file from the generated routines in the binary.
One method in the IDL has a different signature from the other methods. Instead of taking connId as an input parameter, it outputs a connId from the server.
This method creates a connId on the server and passes it out to the client. This connId (more generally known as a context handle) is then used to identify the context of all of the further calls to the interface. A context handle allows the server to store state for a sequence of calls from a client. This is a standard pattern for RPC interfaces.
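The context-handle pattern can be modeled in a few lines. This is a toy, in-process sketch and not the WebAccess server or real RPC: the class and method names are invented for illustration, and the only idea borrowed from the post is that one call hands out a connId and every later call presents it.

```python
import itertools


class ToyRpcServer:
    """In-process stand-in for a server that hands out connection IDs."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._sessions = {}  # connId -> per-client state kept by the server

    def open_connection(self):
        """Plays the role of the one method that outputs a connId."""
        conn_id = next(self._next_id)
        self._sessions[conn_id] = {"calls": 0}
        return conn_id

    def call(self, conn_id, ioctl, data=b""):
        """Plays the role of every other method: the connId selects the state."""
        session = self._sessions.get(conn_id)
        if session is None:
            raise KeyError("unknown connId")
        session["calls"] += 1
        return session["calls"]

    def close_connection(self, conn_id):
        """Drop the state associated with a connId."""
        self._sessions.pop(conn_id, None)
```

Two clients that each call `open_connection()` get different IDs, and the call counter for one is unaffected by calls made with the other, which is the state separation a context handle provides.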
Here again is the prototype for sub_401000:
Taking a look at the parameters to this function, we see that the handle hBinding is created by a call to RpcBindingFromStringBinding. More generally, hBinding is the first parameter to all calls to the interface on the client side so that RPC knows how to access the server.
connId is the context handle that was created by an earlier call to the sub_4017C0 RPC method.
ioctl is a function-selection parameter. This interface is constructed with a parameter that selects the functionality that should be performed by the service. It is parallel to the use of the IOCTL parameter in the Windows DeviceIoControl API, so the finder named it ioctl. In my experience, this is not a particularly common way to construct an RPC interface; more frequently, each function has its own method in an interface. That approach in this product would lead to a very large IDL file due to the number of methods in the service that are called through this interface. The developers chose instead to have a small number of methods in the interface, with each method using a switch statement (branching on the ioctl parameter) to execute the appropriate function.
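The string that RpcBindingFromStringBinding consumes is plain text of the form protocol-sequence:network-address[endpoint]. The sketch below composes one for a TCP endpoint. It assumes the ncacn_ip_tcp protocol sequence, which fits the TCP ports described earlier but is not confirmed here against the released client. The helper name and the example address are hypothetical.

```python
def compose_string_binding(host, port, protseq="ncacn_ip_tcp"):
    """Build 'protseq:host[endpoint]', the textual form of an RPC binding.

    protseq defaults to ncacn_ip_tcp (an assumption for this target).
    """
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return f"{protseq}:{host}[{port}]"


# Example with a documentation-range placeholder address:
# compose_string_binding("192.0.2.10", 4592)
```

On Windows, a string of this shape is what a C client would pass to RpcBindingFromStringBinding to obtain the binding handle used as the first argument of each call.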
To prepare for testing the proof of concept, use your favorite C development environment to build bwconn.dll from the provided source files. After building the DLL, move a copy of bwconn.dll to the directory containing the Python proof of concept files.
Set up your target machine – I used Windows Server 2008 R2 x64 as my base OS. You can download WebAccess 8.0 from the Advantech website. After software installation on your target, you may want to create a test project in the Advantech system, add and download a SCADA node to the project, then start the Advantech kernel. This is important if you are investigating functions in the interface within the datacore service (which only starts when you start the Advantech kernel) but is not necessary for most functions in the interface in webvrpcs (which is started automatically). The proof of concept I am exploring here calls the webvrpcs interface.
Attach a debugger to webvrpcs.exe on your target machine, then run the proof of concept Python code on your attacker machine. You are rewarded with an immediate crash in webvrpcs.exe:
Stack is smashed. EIP controlled. Life is very, very good.
One note – there is a heartbeat process (webvkeep) that keeps tabs on webvrpcs. If you crash webvrpcs, the webvkeep process will restart it in a few minutes, which can be handy when you want to restart for another run. If you want to keep webvkeep from crashing webvrpcs while you are stepping through, you should put webvkeep to sleep until you need it.
The stack smash wiped out the most recent part of the stack trace, so some debugging and stepping through is in order to locate the exact call. When you do, you will find that the vulnerable code is in BwOpcSvc.dll.
“WindowName” is a fixed-size character array on the stack and is 0x80 bytes long. You know where this is going…
On the other end of the call stack, the call originates in a server stub function:
sub_402B50 branches based on the IOCTL to load the correct subsystem DLL and process the specific call. Here is part of the pseudocode of the routine (arglist is the IOCTL):
The breakdown in this function of the IOCTL DWORD into ranges that call methods in different DLLs is paralleled in the proof of concept file bwconn.py that lays out different test cases for each IOCTL value.
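The range-based routing can be sketched as a small lookup table. The range boundaries and two of the three DLL names below are invented for illustration, because the real table lives in the server binary and in bwconn.py. Two things are taken from the post: the IOCTL 0x1388B (80011 in decimal) is the one used by the proof of concept, and the vulnerable code it reaches is in BwOpcSvc.dll.

```python
# (low, high, target) tuples. The boundaries are HYPOTHETICAL, and so are
# the names subsystem_a.dll and subsystem_b.dll.
HYPOTHETICAL_RANGES = [
    (10000, 19999, "subsystem_a.dll"),
    (20000, 29999, "subsystem_b.dll"),
    (80000, 89999, "BwOpcSvc.dll"),
]


def route_ioctl(ioctl):
    """Return the name of the DLL that would handle this IOCTL, or None."""
    for low, high, target in HYPOTHETICAL_RANGES:
        if low <= ioctl <= high:
            return target
    return None
```

Keeping the table as data rather than a chain of if statements makes it easy to enumerate every range when laying out one test case per IOCTL value.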
The Python proof of concept in vuln.000.py is pretty straightforward:
RpcWebClientConnect creates the binding handle from the communication parameters, then calls the RPC service to get a context handle from the server. The script then creates a packet with an overflow and sends it to the function with an IOCTL value of 0x1388B.
Next, let’s look at the network traffic. Use your favorite packet sniffer.
The first communication is the RPC request that returns the context handle from the server. The server response is:
The returned context handle is 0x02680ce8.
This context handle will be used in all further RPC requests.
There are some further requests and responses to set up the conversation, and then the attack packet is sent:
Yellow is the context handle.
Magenta is the IOCTL.
Blue is the input string length.
Green is the return string length.
These fields are followed by the string that will be copied into the WindowName field and will produce the buffer overrun.
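The field order described above can be sketched with the struct module. This is a simplified model and not the actual wire format: real DCE/RPC stub data is NDR-encoded and carries extra items such as array conformance counts and alignment padding, which are omitted here. The 32-bit little-endian field widths and the function names are assumptions.

```python
import struct

HEADER = "<IIII"  # connId, ioctl, input length, output length (assumed widths)
HEADER_SIZE = struct.calcsize(HEADER)


def build_request_fields(conn_id, ioctl, in_data, out_len):
    """Pack the four fields in the order described, then append the input bytes."""
    return struct.pack(HEADER, conn_id, ioctl, len(in_data), out_len) + in_data


def parse_request_fields(blob):
    """Inverse of build_request_fields: (conn_id, ioctl, out_len, in_data)."""
    conn_id, ioctl, in_len, out_len = struct.unpack_from(HEADER, blob, 0)
    return conn_id, ioctl, out_len, blob[HEADER_SIZE:HEADER_SIZE + in_len]


# Example: the context handle and IOCTL quoted in the post, with an input
# string longer than the 0x80-byte WindowName buffer. The out_len value
# is arbitrary.
example = build_request_fields(0x02680CE8, 0x1388B, b"A" * 0x100, 0x80)
```

Parsing a captured buffer back into named fields with `parse_request_fields` is a convenient way to check a color-coded hex dump against what the client intended to send.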
If you look at stack layout in the targeted function, the return address from the function is stored just after the WindowName field, since the application does not use stack cookies. This lack of mitigations makes exploitation quite simple on a stack buffer overrun.
This vulnerability was fixed in version 8.2 of Advantech WebAccess by replacing the call to sprintf with a call to snprintf. This prevents the buffer overflow.
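The difference between the two calls can be illustrated with a toy stack frame: a 0x80-byte buffer followed directly by a 4-byte return address, as the post describes. This is a model in Python and not the product's code. The return address value is made up, and the bounded copy mimics C99 snprintf semantics (truncate and always NUL-terminate), which is an assumption about the fixed code.

```python
BUF_SIZE = 0x80  # size of WindowName given in the post
RET_SIZE = 4     # 32-bit return address (the post mentions EIP)


def make_frame(return_address=0x00401234):
    """Toy frame: the buffer, then the saved return address (hypothetical value)."""
    return bytearray(BUF_SIZE) + bytearray(return_address.to_bytes(RET_SIZE, "little"))


def unbounded_copy(frame, data):
    """Mimics sprintf/strcpy: writes data plus NUL with no regard for BUF_SIZE."""
    payload = data + b"\x00"
    n = min(len(payload), len(frame))  # clamp only so the toy frame keeps its size
    frame[:n] = payload[:n]
    return frame


def bounded_copy(frame, data):
    """Mimics snprintf(buf, sizeof buf, ...): at most BUF_SIZE - 1 bytes plus NUL."""
    kept = data[:BUF_SIZE - 1] + b"\x00"
    frame[:len(kept)] = kept
    return frame


def saved_return(frame):
    """Read back the return address stored after the buffer."""
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + RET_SIZE], "little")
```

With an input of 0x80 filler bytes followed by `b"BBBB"`, `unbounded_copy` leaves `saved_return` equal to 0x42424242, while `bounded_copy` leaves the original value untouched no matter how long the input is.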
How to get from the Vulnerability to the Interface
We know that strcpy and sprintf are dangerous functions when the source is tainted data (e.g. from the user). This is why both functions are in the Microsoft “Banned API” list. You can use your favorite disassembler to look through the target code for uses of those functions (and in the codebase in question, there are many uses of those functions), and then walk the call tree back to stub functions of the RPC interface looking for the source of the parameters. You can also go the other direction – start at the stub functions you found and then drill down through subroutines following tainted data until you get to vulnerable calls. Why, yes, I am glossing over a lot of hard reverse engineering work in this paragraph.
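As a first triage step before opening a disassembler, one can check which binaries even reference these functions by name. The sketch below is a crude byte-level heuristic and not a substitute for the call-tree analysis described above: it looks for NUL-terminated ASCII names of a few banned functions, and it will miss statically linked or inlined copies. The function names and the short list of APIs are choices made for this example.

```python
import re

# A few entries from the banned list; extend as needed.
BANNED = (b"strcpy", b"strcat", b"sprintf", b"vsprintf")


def find_banned_names(blob, names=BANNED):
    """Return {name: [offsets]} for NUL-terminated occurrences of each name.

    The lookbehind rejects matches that are the tail of a longer identifier,
    so 'vsprintf' is not also counted as 'sprintf'.
    """
    hits = {}
    for name in names:
        pattern = re.compile(rb"(?<![A-Za-z0-9_])" + re.escape(name) + rb"\x00")
        offsets = [m.start() for m in pattern.finditer(blob)]
        if offsets:
            hits[name.decode()] = offsets
    return hits


def scan_file(path):
    """Run find_banned_names over the raw bytes of a file on disk."""
    with open(path, "rb") as handle:
        return find_banned_names(handle.read())
```

Running `scan_file` over each DLL in the install directory gives a rough ordering of where to start looking. The disassembler is still needed to see whether any given call receives tainted data.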
Once you find a path from a vulnerable API using tainted data up to an RPC stub, you can construct an attack packet and use the RPC framework in this blog post to test the method.
The Advantech WebAccess version 8.0 software package is an excellent product for you to test your understanding of RPC interfaces and hone your techniques. The code contains many exploitable vulnerabilities. Once you understand your process, you can use the techniques you honed to test the current Advantech product and use the framework provided in this blog post to create tests for other products that contain RPC services.
Good hunting! And remember that the Zero Day Initiative is interested in the product of your research! Until then, you can find me on Twitter at @FritzSands, and follow the team for the latest in exploit techniques and security patches.