Google CTF 2023 Totally Not Brute Force

tl;dr bpftrace side channel to reduce brute force search space of guessed flag. Solved by Corwin & myself 🤝

Setup

Quick summary of files of interest in the handout:

powproxy/*
src/server/main.go: “Flag server” code; has implementation of CheckFlag and Ping functions
src/proxy/main.go: Handles HTTP requests endpoints
src/proto/flag.proto: Definitions for FlagService - namely interface for CheckFlag and Ping functions

What's running it's qemu? Corwin apparently.

We’re given a nice run-qemu script to set up the challenge locally.

I was able to get this working on Windows 11 + WSL 2 with the following and running sudo ./run-qemu:

# Install virt-manager
sudo apt install -y virt-manager
 
# Add youself to kvm and libvirt group
sudo usermod --append --groups kvm,libvirt "${USER}"
 
# Fix-up permission to avoid "Could not access KVM kernel module: Permission denied" error
sudo chown root:kvm /dev/kvm
sudo chmod 660 /dev/kvm
 
# Stat required services
sudo libvirtd &
sudo virtlogd &
 
# Launch virt-manager
virt-manager &

Endpoints

The server has three HTTP request endpoints we need to care about:

Flag Check: /?flag=...
Profile: /profile?t=...
Status: /status

The first endpoint gives the option to submit a guessed flag via the flag param (/?flag=...). The guessed flag is split into 10 prefixes, each kicked over to a flag server. Each flag server then itself performs a constant-time comparison on each prefix of the string it’s given. If all of these pass, then we get a 200 status code back. Otherwise, we get a 403.

Diagram of flag check setup

Here is the code that handles the flag checking on the flag servers - src/server/main.go

  


func (s *server) CheckFlag(ctx context.Context, req *proto.CheckFlagRequest) (*proto.CheckFlagResponse, error) {
	f := req.GetFlag()
	if f == "" {
		return &proto.CheckFlagResponse{Ok: false}, nil
	}

	prefix := []byte(cycled(f, s.length*(s.shard+1)))
	result := subtle.ConstantTimeCompare(prefix, []byte(s.flagPrefix))
	return &proto.CheckFlagResponse{Ok: result == 1}, nil
}

func cycled(s string, l int) string {
	var result strings.Builder
	for i := 0; i < l; i++ {
		result.WriteByte(s[i%len(s)])
	}

	return result.String()
}

The second endpoint takes an optional t param (default value is 5), which is used to construct a bpftrace command string and then execute it. We’ll touch a bit more on what bpftrace is and what this doing later.

src/proxy/main.go

  


		s := "interval:s:" + timeout + " { exit() } profile:hz:99 /pid == $1/ { @[ustack] = count() }\n"
		f.WriteString(s)
		f.Close()

		log.Println("executing probe:", s)
		c, cancel := context.WithTimeout(r.Context(), 10*time.Minute)
		defer cancel()
		out, err := exec.CommandContext(
			c,
			"bpftrace",
			f.Name(),
			strconv.Itoa(os.Getpid()),
		).CombinedOutput()
		if err != nil {
			log.Println("error while tracing:", err, "output:", string(out))
			w.WriteHeader(http.StatusInternalServerError)
			return
		}

The third and last endpoint just gives an ok back. 🙂

Brute Force??? But not too much?

The main issue is the only feedback we get on flag correctness is whether at least one of the prefixes is incorrect (HTTP 403) or the whole flag is correct (HTTP 200). Because of this, we would have to guess the whole flag at once - 2^160 guesses for a 20-character flag (minus some constraints like CTF{...} and the characters in flagPattern). 🥲

If we could get feedback on when a guess prefix was correct, then we would be able to brute force prefixes two bytes at a time for a total of 2^16 * 10 guesses (again minus some known constraints). Much more reasonable to brute force.

As mentioned earlier, we initially missed the use of subtle.ConstantTimeCompare and thought there was some timing attack we were supposed to do.

So at this point, the thing that stands out here is the whole setup of the /profile endpoint. bpftrace is a high-level tracing language for Linux enchanced Berkeley Parcket Filter (eBPF) available in recently Linux kernels. But why bpftrace???

Let’s break down this bpftrace command. The GitHub repo has an excellent bpftrace reference guide: "interval:s:" + timeout + " { exit() } profile:hz:99 /pid == $1/ { @[ustack] = count() }\n"

interval:s:<timeout> { exit () }: Operate for <timeout> seconds before exiting.
profile:hz:99 /pid == $1/ {@[ustack] = count() }: Take a profile every 99 syscalls counting user stacks of the process with the given pid (our pid!).

The first thought was that submitting a correct flag prefix vs an incorrect flag prefix would result in some substantially different results. After playing around with this for a bit, this did not seem to be the case.

Andddd there’s an injection in the bpftrace string. The t param in the HTTP request gets added into the bpftrace command without any additional validation or filtering. So what funky things can we add in there?

The first thing we tried was using system(char *fmt)*. Except it complains that we can’t use unsafe functions.

bpftrace system()

Note this is an unsafe function. To use it, bpftrace must be run with –unsafe.

After skimming the reference guide a bit more, we noted that the watchpoint() built-in seemed promising. watchpoint() takes an address, length, and mode, and then when it is hit, you can perform certain actions, such as incrementing a count or exiting. We can place a watchpoint in an area indicating success of a particular prefix. Then we can get per-prefix success feedback!

Some Go Reversing

Let’s take a look at what the proxy binary looks like. The actual version used is in initramfs.cpio.

Opening in Ghidra, we can run a few scripts to do some additional processing if needed.

  


		result := true
		for _, client := range clients {
			req := &proto.CheckFlagRequest{Flag: f[0]}
			res, err := client.CheckFlag(r.Context(), req)
			if err != nil {
				log.Println("error while checking flag:", err, "request:", req)
				w.WriteHeader(http.StatusInternalServerError)
				return
			}

			result = result && res.GetOk()   // <=== SET WATCHPOINT HERE
		}

We can set a watchpoint somewhere at that line to examine the result of the prefix check.

Looking around the string structure for "error while checking flag:"

The main loop: Main loop of flag check code

On lines 80-91 we have the equivalent of res = result && res.GetOk() - if result is false, short-circuit and finish. Otherwise, the next two checks are the result of res.GetOk().

We could set a execution watchpoint on line 82 (address 0x0088b6d7) to increment a counter whenever the flag server has encountered an incorrect prefix. We put a watchpoint here instead of line 86, as that it will only fire once before we start short circuiting at line 82. The bpftrace probe for this looks like watchpoint:0x0088b6d7:8:x { @curr++ }

To dump this information, we set an execution watchpoint on the calls to w.WriteHeader. At this point, we know the flag server has finished checking the prefix and can dump the flag. watchpoint:0x0088b6ef:8:x { printf("%d\\n", @cur); @cur = 0 }

Lastly, we can get our results by exiting bpftrace. We can hit the unused /status endpoint to trigger this. watchpoint:0x0088b262:8:x { exit() }

Putting it All Together

Putting it all together

We have our three breakpoints:

Found an incorrect prefix in some prior loop. Increment the count.
Flag server is done checking its flag shard. Print the count.
Hit the /status endpoint. Stop bpftrace.

If all N pairs of characters are correct for this flag shard, we would expect to see a count of 0. If only the first M (M<N) pairs of characters are correct for this flag shard, we would expect to see a count of N-M.

Pseudocode of strategy:

Known start of flag: "CTF{"
Known flag length: 20 characters (10 pairs of characters)

1. Start with known prefix
2. Hit /profile to start bpftrace with injected watchpoints
3. Hit /flag=... to submit a flag prefix + guess for next 2 bytes
4. Hit /status after some time to stop bpftrace
5. (For flag prefix + guess of N byte-pairs long)
   If counter is N
   Else (counter is )

We got our strategy working with small flag prefixes (e.g. CTF{). Interestingly, when scaled up, we ended up getting a number of false positives locally, but had it consistently working on remote. Sooooo… yeet on prod? 🙂

"prod is bae"

I think our script took about 40-60 mins to run.

CTF{W4s_1T_Ev3NtFu1}

Google CTF 2023 | Totally Not Brute Force

Setup

Endpoints

Brute Force??? But not too much?

Some Go Reversing

Putting it All Together

Further Reading

The BonziBuddy History Lesson Nobody Asked For

Google CTF 2022 | Mixed

Real World CTF 2018 | Magic Tunnel