
Can Machine Learning Find Path Traversal Vulnerabilities in Go? Snyk Code Can!


Who doesn’t like a good security challenge? I know I do, but do you think a machine-learning algorithm is up to the challenge of detecting path traversal vulnerabilities? Well, let’s put that to the test!

Over the past couple of weeks, Zeyad AbuLaban has shared a series of posts about secure code and security vulnerabilities. In his “Secure Code Quest” series, he presents vulnerabilities of different varieties and in different languages.

When Zeyad posted his Secure Code Quest #5, instead of reaching straight for the code in the picture to review the insecure patterns, I had another thought creeping up - “if I paste this Go code into my IDE and scan it there with my Snyk VS Code extension, will it be able to detect the vulnerability?”

Vulnerable code to Path Traversal in Golang - a Security challenge.

Is This Golang Code Vulnerable or Not?

Let’s begin by putting that Go program code into a nicely formatted layout to inspect it more thoroughly in a proper code review process.

This Golang program deals with files, and as such, it includes a downloadHandler() function that acts as an HTTP controller. It expects a filename in a query parameter, then performs security sanitization and path clean-up on it. Next, it constructs a file path inside a storage folder on disk (baseDir := "public/files/") by joining the filename onto that base directory.

package main

import (
    // ... imports ...
)

func main() {
    // ... main function code ...
}

func downloadHandler(w http.ResponseWriter, r *http.Request) {
    filename := r.URL.Query().Get("filename")
    if filename == "" {
        http.Error(w, "Missing filename parameter", http.StatusBadRequest)
        return
    }

    // Preventing Path Traversal
    sanitizedFilename := strings.Replace(filename, "../", "", -1)
    cleanedFilename := filepath.Clean(sanitizedFilename)

    baseDir := "public/files/"
    fullPath := filepath.Join(baseDir, cleanedFilename)

    // ... code to handle file serving ...
}

How does this Go program implement security controls? It uses string replace logic to exchange all the ../ instances in a given filename with an empty string, essentially removing them entirely from the filename provided as user input.

Let’s imagine the following HTTP request:

curl "http://localhost:8080/download?filename=../../app/flag.txt"

This HTTP request results in every ../ sequence being stripped from the filename variable, which leaves only app/flag.txt. That value is then joined with the baseDir path, so the full path becomes public/files/app/flag.txt.

So how would you bypass it?

How would you craft the HTTP request so that you can grab the capture-the-flag style /app/flag.txt flag and beat the challenge to get the prize?

Before I reveal the payload, let me show you what happens when you paste this Golang program into the IDE.

Securing code with Machine Learning

First, I’ve added this logic as part of a new route in my web application that uses Gin, which is why you see a slightly different route declaration here. Next, you can spot how I’ve completed the code after the fullPath variable to perform a sanity file check and then finally send the file back to the client.

Vulnerable Golang program

Now imagine that you have the Snyk extension installed in the IDE.

All you have to do is save the file, and seconds later (yes, truly just seconds), you get a code security audit that analyzes your source code (including dependencies, Dockerfile, Terraform, etc.).

Beyond giving you a security report, the Snyk IDE extension also highlights the vulnerable line of code, linter style, to draw your attention to it, and provides more tools for digging into the security issue at hand:

  • The security vulnerability name and details.

  • The source-to-sink data flow, which shows how unsanitized data reaches sensitive APIs.

  • Learning material about this path traversal vulnerability, in case you’re hearing about it for the first time.

  • Fix code examples you can apply if you’re unsure how to fix this vulnerability.

In some cases, Snyk will also offer its DeepCode AI Fix to automatically resolve the security issue by refactoring the code to remove the vulnerability while keeping your logic.

 Snyk IDE extension detects Path Traversal vulnerability in Golang program code.

What does machine learning have to do with this? Well, everything.

Snyk’s SAST (static application security testing) analyzes your source code and then runs it through Snyk’s proprietary machine learning engine. This security engine is driven by a symbolic AI algorithm that was trained on large code bases to recognize insecure code patterns, which security experts classified and labeled in a fine-tuning process.

This whole process runs in seconds, straight in my IDE, and detects security vulnerabilities in my code with speed and accuracy. The engine was also trained on what “good looks like”: it analyzed real security fixes in open source code repositories so that it can suggest code diffs of fixed examples, which I can apply if I’m not confident about what a security fix should look like.

The combination of generative machine learning algorithms and symbolic AI contributes to Snyk’s incredibly productive experience detecting code vulnerabilities. Learn more about how Snyk ensures the safe adoption of AI.

Exploiting Golang for Path Traversal

Oh, right. Back to our security challenge. How do we exploit the path traversal in the demonstrated Golang program and capture the flag?

As a reminder, here is the path traversal prevention logic in the Golang program:

    // Preventing Path Traversal
    sanitizedFilename := strings.Replace(filename, "../", "", -1)
    cleanedFilename := filepath.Clean(sanitizedFilename)

The string replacement logic performs an exact substring match on the text ../. This means that a classic path traversal sequence in the filename, such as “../”, gets removed.

However, what happens if we modify the path traversal text so that it bypasses the exact matching logic? Consider the following HTTP payload with the modified filename input:

curl "http://localhost:8080/download?filename=....//....//app/flag.txt"

When the value of the filename query parameter is ....//....//app/flag.txt, each occurrence of “../” found inside the string is matched and removed, leaving behind the leading “..” and the trailing “/” of every ....// group. Because strings.Replace makes a single pass and does not re-scan its own output, these leftovers recombine into fresh ../ sequences. The result is a sanitizedFilename value of ../../app/flag.txt, which traverses outside the base directory (public/files/), so the server sends the flag.txt file to the client.
