Following yesterday’s Microsoft VPN vulnerability, today we’re presenting CVE-2022-23270, another Windows VPN Use after Free (UaF) vulnerability that was discovered through reverse engineering and fuzzing the raspptp.sys kernel driver. This presents attackers with another chance to perform denial of service and potentially even achieve remote code execution against a target server.
Affected Versions
The vulnerability affects most versions of Windows Server and Windows Desktop since Windows Server 2008 and Windows 7 respectively. To see a full list of affected Windows versions, check the official disclosure post on MSRC:
The vulnerability affects both server and client use cases of the raspptp.sys driver and can potentially be triggered in both cases. This blog post will focus on triggering the vulnerability against a server target.
Introduction
To be successfully triggered, CVE-2022-23270 depends heavily on the implementation of the Winsock Kernel (WSK) layer in raspptp.sys. If you want to learn more about the internals of raspptp.sys and how it interacts with WSK, we suggest you read our write up for CVE-2022-21972 before continuing:
CVE-2022-23270 is a Use after Free (UaF) resulting in Double Free that occurs as the result of a race condition. It resides in the implementation of PPTP Calls in the raspptp.sys driver.
PPTP implements two sockets: a TCP control connection and a GRE data connection. Calls are set up and managed by the control connection and are used to identify individual data streams handled by the GRE connection. The call functionality makes it easy for PPTP to multiplex multiple different streams of VPN data over one connection.
Now that we know in simple terms what PPTP calls are, let’s see how they can be broken!
The Vulnerability
This section explores the underlying vulnerability. We will then move on to triggering the vulnerable code on the target.
PPTP Call Context Objects
PPTP calls can be created through an IncomingCallRequest or an OutgoingCallRequest control message. The raspptp.sys driver creates a call context structure when either of these call requests is initiated by a connected PPTP client. The call context structures are designed to be used for tracking information and buffering GRE data for a call connection. For this vulnerability, how raspptp.sys constructs these objects is unimportant; we instead care about how they are accessed.
Accessing the Call Context
There are two ways in which handling a PPTP control message can retrieve a call context structure. Both methods require the client to know the associated call ID for the call context structure. This ID is randomly generated by the server and sent to the client within the reply to the Incoming or Outgoing call request. The client then uses that ID in all subsequent control messages sent to the server that relate to that specific call. See the PPTP RFC (https://datatracker.ietf.org/doc/html/rfc2637) for more information on how this is handled.
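To make the role of the call ID concrete, here is a minimal sketch of parsing an Outgoing-Call-Reply control message using the field layout from RFC 2637. The function name and the returned dictionary keys are our own choices for illustration; nothing here is taken from raspptp.sys.

```python
import struct

PPTP_MAGIC_COOKIE = 0x1A2B3C4D   # fixed value defined by RFC 2637
CONTROL_MESSAGE = 1              # PPTP Message Type for control messages
OUTGOING_CALL_REPLY = 8          # Control Message Type for Outgoing-Call-Reply
OUTGOING_CALL_REPLY_LEN = 32     # total message length in bytes


def parse_outgoing_call_reply(data: bytes) -> dict:
    """Parse an Outgoing-Call-Reply (hypothetical helper, RFC 2637 layout)."""
    if len(data) < OUTGOING_CALL_REPLY_LEN:
        raise ValueError("message too short")
    # Common header: Length, PPTP Message Type, Magic Cookie,
    # Control Message Type, Reserved0. All fields are big-endian.
    length, msg_type, cookie, ctrl_type, _reserved0 = struct.unpack_from(
        "!HHIHH", data, 0
    )
    if cookie != PPTP_MAGIC_COOKIE:
        raise ValueError("bad magic cookie")
    if msg_type != CONTROL_MESSAGE or ctrl_type != OUTGOING_CALL_REPLY:
        raise ValueError("not an Outgoing-Call-Reply")
    if length != OUTGOING_CALL_REPLY_LEN:
        raise ValueError("unexpected length field")
    # Body starts at offset 12: Call ID, Peer's Call ID, Result Code, Error Code.
    call_id, peer_call_id, result_code, error_code = struct.unpack_from(
        "!HHBB", data, 12
    )
    return {
        "call_id": call_id,
        "peer_call_id": peer_call_id,
        "result_code": result_code,
        "error_code": error_code,
    }
```

The two call ID fields are what tie a control message to a particular call context, which is why the lookup path taken for them matters so much in the rest of this post.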
raspptp.sys uses two methods to access the call context structures when parsing control messages:
- A globally accessible array indexed by call ID.
- A linked list stored in the PPTP control connection context.
The difference between these two access methods is scope. The global array can retrieve any call allocated by any control connection, but the linked list only contains calls relating to the control connection containing it.
Let’s go a bit deeper into these access methods and see if they play nicely together…
Linked List Access
The linked list access method is performed through two functions within raspptp.sys: EnumListEntry, which is used to iterate through each member of the control connection call linked list, and EnumComplete, which is used to end the current loop and reset state.
The ListIterator variable is used to store the current linked list entry that has been reached in the list so that the loop can continue from this point on the next call to EnumListEntry. EnumComplete simply resets the ListIterator variable once it’s done with. The way in which this code appears in the raspptp.sys driver can change around slightly but the overall method is the same: call EnumListEntry repeatedly until it returns null and then call EnumComplete to tidy up the iterator.
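The iteration pattern described above can be modelled in a few lines of Python. This is only a sketch of the calling convention: the real driver walks a kernel linked list, and the class names and fields used here (CallContext, ControlConnection, position) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CallContext:
    call_id: int


@dataclass
class ControlConnection:
    # Stand-in for the per-connection linked list of call contexts.
    calls: List[CallContext] = field(default_factory=list)


@dataclass
class ListIterator:
    # Remembers how far the walk has progressed between calls.
    position: int = 0


def enum_list_entry(conn: ControlConnection, it: ListIterator) -> Optional[CallContext]:
    """Return the next call on the connection, or None at the end of the list."""
    if it.position >= len(conn.calls):
        return None
    entry = conn.calls[it.position]
    it.position += 1
    return entry


def enum_complete(it: ListIterator) -> None:
    """Reset the iterator once the walk is finished."""
    it.position = 0


def walk_calls(conn: ControlConnection) -> List[int]:
    """Call enum_list_entry until it returns None, then tidy up."""
    it = ListIterator()
    seen = []
    call = enum_list_entry(conn, it)
    while call is not None:
        seen.append(call.call_id)
        call = enum_list_entry(conn, it)
    enum_complete(it)
    return seen
```

The key property is scope: a walk like this can only ever reach calls that belong to the connection being iterated.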
Global Call Array
The global array access method is handled through a function called CallGetCall:
This function effectively just retrieves the array slot that the call context structure should be stored in based on the provided call ID. It then returns the structure at that entry provided that it matches the specified ID and is in fact a valid entry.
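As a rough model of that lookup logic (locking deliberately left out here), the sketch below checks a slot and validates the entry before returning it. How the driver actually maps a call ID to a slot is not covered in this post, so the modulo used here is purely an assumption, as are the `valid` field and the Python names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallContext:
    call_id: int
    valid: bool = True  # hypothetical "is this entry usable" flag


def call_get_call(
    call_array: List[Optional[CallContext]], call_id: int
) -> Optional[CallContext]:
    """Return the call stored for call_id, or None if the slot does not match."""
    # Assumption: the slot index is derived from the call ID with a modulo.
    slot = call_array[call_id % len(call_array)]
    if slot is None or not slot.valid or slot.call_id != call_id:
        return None
    return slot
```

Note that the lookup takes no notice of which control connection is asking: any caller that knows a call ID gets the structure back.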
So, what’s the issue? Both of these access methods look pretty harmless, right? There is one subtle and simple issue in the way these access methods are used. Locking!
Cross Thread Access?
CallGetCall is intended to be able to retrieve any call allocated by any currently connected control connection. Since a control connection doesn’t care about calls owned by other control connections, the control connection state machine should have no use for CallGetCall, or at least, according to the PPTP RFC, it shouldn’t. However, this isn’t the case: there are several control connection methods in raspptp.sys that use CallGetCall instead of referencing the internal control connection linked list!
If CallGetCall lets us access other control connections’ call context structures and certain parts of the PPTP handling can occur concurrently, then we can theoretically access the same call context structure in two different threads at the same time! This is starting to sound like a recipe for some racy memory corruption conditions.
Lock and Roll
Both the linked list access method and the CallGetCall function reference a PptpAdapterSpinLock variable on a global context structure. This is a globally accessible kernel spin lock that is used to prevent concurrent access to things which can be accessed globally. Using this should make any concurrent use of either call context list access method safe, right?
This isn’t the case at all. Looking at the above pseudo code, the lock in CallGetCall is only actually held while we are searching through the list, which is great for the lookup, but it’s not held once the call structure is returned. Unless the caller re-acquires the global lock before using the context structure (spoiler alert, it does not), we have a potential window for unsafe concurrent access.
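The gap between "lookup under the lock" and "use after the lock is released" can be shown with a small Python model. Everything here is illustrative: the CallTable class, its method names and the reference count are assumptions, and the referenced lookup is one conventional way of closing such a window, not a description of how the driver was patched.

```python
import threading
from typing import Dict, Optional


class CallContext:
    def __init__(self, call_id: int) -> None:
        self.call_id = call_id
        self.ref_count = 1   # the reference owned by the table
        self.freed = False   # stands in for the memory being released


class CallTable:
    def __init__(self) -> None:
        self._lock = threading.Lock()  # plays the role of PptpAdapterSpinLock
        self._calls: Dict[int, CallContext] = {}

    def add(self, call: CallContext) -> None:
        with self._lock:
            self._calls[call.call_id] = call

    def lookup_unsafe(self, call_id: int) -> Optional[CallContext]:
        # The lock covers only the search; the caller gets a bare reference.
        with self._lock:
            return self._calls.get(call_id)

    def lookup_referenced(self, call_id: int) -> Optional[CallContext]:
        # Take a reference while the lock is still held.
        with self._lock:
            call = self._calls.get(call_id)
            if call is not None:
                call.ref_count += 1
            return call

    def release(self, call: CallContext) -> None:
        with self._lock:
            call.ref_count -= 1
            if call.ref_count == 0:
                call.freed = True

    def remove(self, call_id: int) -> None:
        # Detach from the table, then drop the table's reference.
        with self._lock:
            call = self._calls.pop(call_id, None)
        if call is not None:
            self.release(call)
```

If one thread calls `lookup_unsafe(1)` and another calls `remove(1)` before the first thread touches the result, the first thread is left holding a context whose `freed` flag is already set. With `lookup_referenced`, the same interleaving leaves the context alive until the holder calls `release`.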
Concurrent access doesn’t necessarily mean we have a vulnerability. To prove that we have a vulnerability, we need two code locations that could cause a further issue when running with access to the object at the same time. For example, any form of free operation performed on the structure in this scenario could be a good source of an exploitable issue.
Getting Memory Corruption
Within the raspptp.sys driver there are many places where the kind of access we’re looking for can occur and cause different kinds of issues. Going over all of them is probably an entire series worth of blog posts that we can’t imagine anyone really wants. The one we ended up using for the Proof of Concept (PoC) involves the following two operations:
- Closing a control connection
  - When a control connection is closed, the control connection’s call linked list is walked and each call context structure is appropriately de-initialised and freed. This operation is performed by a familiar function, CtlpCleanup.
- Sending an OutgoingCallReply control message with an error code set
  - If an OutgoingCallReply message is sent with an error set, the call structure that it relates to is freed. The CallGetCall function is used for looking up the call context structure in this control message handling, which means we can use it to perform the free while the control connection close routine is running in a separate thread.
These two conditions create a scenario where, if both were to happen concurrently, a call context structure is freed twice, causing a Use after Free/Double Free issue!
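A toy model helps show why two lookup paths to the same object are dangerous. The sketch below deliberately leaves out the detach step and models the moment where both paths have already fetched their pointer to the same call. ToyHeap, DoubleFree and both function names are invented for this illustration.

```python
from typing import Dict, List


class DoubleFree(Exception):
    """Raised by the toy heap when a block is released twice."""


class ToyHeap:
    def __init__(self) -> None:
        self._live = set()
        self._next = 1

    def alloc(self) -> int:
        block = self._next
        self._next += 1
        self._live.add(block)
        return block

    def free(self, block: int) -> None:
        if block not in self._live:
            raise DoubleFree(f"block {block} was already freed")
        self._live.remove(block)


def close_connection(heap: ToyHeap, connection_calls: List[int]) -> None:
    # Models the connection close path: walk the connection's own list.
    for block in connection_calls:
        heap.free(block)


def handle_error_reply(heap: ToyHeap, global_calls: Dict[int, int], call_id: int) -> None:
    # Models the error reply path: find the call through the global lookup.
    heap.free(global_calls[call_id])


def run_both_paths() -> bool:
    """Return True if the second path hits an already freed block."""
    heap = ToyHeap()
    block = heap.alloc()
    connection_calls = [block]   # reachable from the owning connection
    global_calls = {7: block}    # reachable by call ID from anywhere
    handle_error_reply(heap, global_calls, 7)
    try:
        close_connection(heap, connection_calls)
    except DoubleFree:
        return True
    return False
```

In the toy heap the second free is caught cleanly. In a real kernel pool the outcome depends on what, if anything, has reused that memory in the meantime.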
Race Against the Machine!
To trigger the race we need to take the following high level steps:
- Create two control connections and initialise them so we can create calls.
- On the first connection, we create the maximum number of calls the server will allow.
- We then concurrently close the first connection and start sending OutgoingCallReply messages for the allocated call IDs.
  - This realistically needs to be done in separate threads bound to separate CPU cores to guarantee true concurrency.
- Then we sit back and wait for the race to be won?
In practice, reliably implementing these steps is a lot more difficult than it would initially seem. The window for reliably triggering the race condition and the amount of time we have to do something useful once the initial free occurs is incredibly small, even in the best case scenario.
However, this does not mean that it cannot be achieved. With a significant amount of effort it is possible to greatly increase the reliability of triggering the vulnerability. There are many different factors that can be played with to build a path towards successful exploitation.
One Lock, Two Lock, Three Lock, Four!
Let’s take a look at the two bits of code we’re hoping to get perfectly aligned and see just how tricky this race condition is actually going to be.
The CtlpCleanup Linked List Iteration
We can see here that the loop is fairly small. The main part that we are interested in is the call to CallCleanup that is performed on each call structure in the control context linked list. Unfortunately, this function is not as simple as we would like. It contains a large number of different execution paths and could potentially have a variety of ways that make our race condition harder or easier to exploit. The section that is most interesting for us in our PoC is the following pseudo code snippet.
Here, a set of detach operations are performed to remove the call structure from the lists it’s stored in and appropriately decrease its internal reference count. A side effect of this detach phase is that the call context structure is removed from both the linked list and global array. This means that if one thread gets too far through processing a call context structure free before the other one retrieves it from the respective list, the race will already be lost. This further adds to the difficulty in getting these two sections of code lined up.
Ultimately, the final call to DereferenceRefCount causes the release of the underlying memory, which in our scenario it does by calling the call context structure’s internal free function pointer, which points to the CallFree function. Before we go over what CallFree does, let’s look at the other half of the race condition.
OutgoingCallReply Handling
The preceding excerpt of pseudo code is the bit of the OutgoingCallReply handling that we will be using to access the call context structures from a separate thread. Let’s take a look at the logic in this function which will also free the call context object!
This small code snippet from CallEventCallOutReply represents the code that is relevant for our PoC. Effectively, if the status field of the OutgoingCallReply message is set, then a call to CallCleanup happens and again will eventually result in CallFree being hit.
CallFree
The call free function releases resources for multiple sub objects stored in the call context as well as the call context itself:
In CallFree, none of the sub-objects have their pointers nulled out by raspptp.sys. This means that any one of these objects can cause a double free condition, giving us a few different locations where we can expect an issue to occur when triggering the vulnerability.
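To illustrate why leaving stale pointers behind matters, the following sketch compares a free routine that keeps its sub-object pointers with one that clears them. The sub-object names (rx_buffer, timer) and both function names are hypothetical, and the free itself is modelled as an entry in a log rather than a real deallocation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SUB_OBJECTS = ("rx_buffer", "timer")  # hypothetical sub-object fields


@dataclass
class CallContext:
    rx_buffer: Optional[str] = "rx_buffer"
    timer: Optional[str] = "timer"
    free_log: List[str] = field(default_factory=list)  # records every "free"


def call_free_keep_pointers(call: CallContext) -> None:
    # Frees each sub-object but leaves the stale pointer in place.
    for name in SUB_OBJECTS:
        call.free_log.append(getattr(call, name))


def call_free_null_pointers(call: CallContext) -> None:
    # Frees each sub-object once, then clears the pointer.
    for name in SUB_OBJECTS:
        obj = getattr(call, name)
        if obj is not None:
            call.free_log.append(obj)
            setattr(call, name, None)
```

Running the first routine twice on the same context records each sub-object twice; running the second routine twice records each only once. Clearing pointers is only a hardening measure in this model: it does nothing for the call context itself, and without proper locking two threads could still read a pointer before either clears it.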
Something that you may notice looking at the code snippets for this vulnerability is that there are large portions of overlapping locks. These will in effect cause each thread not to be able to enter certain sections of the cleanup and freeing process at the same time, which makes the race condition harder to predict. However, it does not prevent it from being possible.
We have knowingly not included many of the other hazards and caveats for triggering this vulnerability, as there are just too many different factors to go over, and in actuality a lot of them are self-correcting (luckily for us). The main reason we can ignore a lot of these hazards is that none of them truly stop the two threads from entering the vulnerable condition!
Proof of Concept
We will not yet be publishing our PoC for this vulnerability to allow time for patches to be fully adopted. This unfortunately makes it hard to show the exact process we took to trigger the vulnerability, but we will release the PoC script at a later date! For now, here is a little sneak peek at the outputs:
A Wild Crash Appeared!
The first step in PoC development is achieving a successful trigger of a vulnerability and usually for kernel vulnerabilities this means causing a crash! Here it is. A successful trigger of our race condition causing the target server to show us the iconic Blue Screen of Death (BSOD):
Now this crash has the following bug check analysis, and it’s pretty conclusive that we’ve caused one of the intended double free scenarios.
It turns out that the double free here is causing a kernel assertion to be raised on a linked list. The cause of this is one of those sub-objects on the call context structure we mentioned earlier. Now, while crashes are great for PoCs, they are not great for exploits, so what do we need to do next if we want to look at further exploitation more seriously?
Exploitation – Next Steps
The main way in which this particular double free scenario can be exploited would be to spray objects into the kernel heap so that our second free incorrectly frees one of them instead of causing the above kernel bug check.
The first object that might make a good contender is the call context structure itself. If we were to spray a new call context into the freed memory between the two frees being run then we would have a freed call context structure still connected to a valid and accessible control connection. This new call context structure would be comprised of mostly freed sections of memory that can then be used to cause further memory corruption and potentially achieve kernel RCE against a target server!
Conclusion
Race conditions are a particularly tricky set of vulnerabilities, especially when it comes to getting reliable exploitation. In this scenario we have a remarkably small window of opportunity to do something potentially dangerous. Exploit development, however, is the art of taking advantage of small opportunities. Achieving RCE with this vulnerability might seem like an unlikely event but it is certainly possible! RCE is also not the only use of this vulnerability: with local access to a target machine, it doubles as an opportunity for Local Privilege Escalation (LPE). All this makes CVE-2022-23270 something that in the right hands could be very dangerous.
Timeline
- Vulnerability Reported To Microsoft – 29 October 2021
- Vulnerability Acknowledged – 29 October 2021
- Vulnerability Confirmed – 11 November 2021
- Patch Release Date Confirmed – 12 January 2022
- Patch Release – 10 May 2022