Security Engineer Interview Questions & Answers

2025/05/22

(Un)-fortunately, there’s no standardized Leetcode-esque interview process for security engineers¹. There are a number of online resources for security engineer interview questions, but I found them to be too high level (explain encoding vs encryption vs hashing) compared to interviews I’ve been in. A lot didn’t have answers. I spent some time prepping when I switched jobs earlier this year, and this is my attempt to collate some interesting questions + answers all in one place!

Existing resources

These are some other resources I’ve found helpful:

Questions

Cryptography

What’s the difference between symmetric and asymmetric encryption?

Symmetric encryption is like having a shared safe combination - both Alice and Bob need to know the exact same secret to encrypt and decrypt messages. Simple, but there’s a catch: how do they securely share that secret in the first place?

Asymmetric encryption solves this with public/private key pairs. Alice can encrypt with Bob’s public key (which can be shared freely), and only Bob can decrypt it with his matching private key. When Bob wants to reply, he uses Alice’s public key, and only she can read it with her private key.

Why pick one over the other? Symmetric encryption is faster, but requires solving the shared-secret problem first. Asymmetric is perfect for one-way communication (like sending someone an encrypted email) since you only need their public key. This is why most real-world systems (like TLS) use both - asymmetric to exchange a symmetric key, then symmetric for the actual data.
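
A minimal sketch of that hybrid pattern in TypeScript with Node’s built-in crypto module (RSA-OAEP stands in for the key exchange here; real TLS uses ephemeral Diffie-Hellman):

```ts
import {
  generateKeyPairSync, publicEncrypt, privateDecrypt,
  randomBytes, createCipheriv, createDecipheriv, constants,
} from "node:crypto";

// Bob's long-lived key pair - only the public half is ever shared.
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });

// Alice: encrypt the bulk data with a fresh symmetric key (fast)...
const dataKey = randomBytes(32);
const iv = randomBytes(12);
const cipher = createCipheriv("aes-256-gcm", dataKey, iv);
const ciphertext = Buffer.concat([cipher.update("hello bob", "utf8"), cipher.final()]);
const tag = cipher.getAuthTag();

// ...then encrypt just the small symmetric key with Bob's public key (slow, but tiny input).
const oaep = { key: publicKey, padding: constants.RSA_PKCS1_OAEP_PADDING };
const wrappedKey = publicEncrypt(oaep, dataKey);

// Bob: unwrap the symmetric key with his private key, then decrypt the data.
const recovered = privateDecrypt({ key: privateKey, padding: constants.RSA_PKCS1_OAEP_PADDING }, wrappedKey);
const decipher = createDecipheriv("aes-256-gcm", recovered, iv);
decipher.setAuthTag(tag);
console.log(Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8"));
```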

Does TLS use asymmetric or symmetric encryption?

Both! TLS starts with an asymmetric handshake to securely exchange keys (solving the hard problem of how to share a secret), then switches to symmetric encryption for the actual data because it’s way faster.

When a client connects, it tells the server what cipher suites it supports. In TLS 1.2, these are strings like TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 that specify both the key exchange (ECDHE-RSA) and the symmetric part (AES-256-GCM for bulk encryption). TLS 1.3 slimmed the format down to strings like TLS_AES_256_GCM_SHA384 that only name the AEAD cipher and hash, with key exchange negotiated separately. A server might support multiple suites, but modern TLS prefers AEAD ciphers like AES-GCM.
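
You can inspect the result of that negotiation yourself with Node’s tls module:

```ts
import { connect } from "node:tls";

// Handshake with a server, then ask which protocol and suite were negotiated.
const socket = connect({ host: "example.com", port: 443, servername: "example.com" }, () => {
  console.log(socket.getProtocol()); // e.g. "TLSv1.3"
  console.log(socket.getCipher());   // e.g. { name: "TLS_AES_256_GCM_SHA384", ... }
  socket.end();
});
```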

Explain envelope encryption

Data at rest needs encryption, but encrypting everything with one key is asking for trouble. Instead, use envelope encryption: generate data encryption keys (DEKs) for your actual data, then wrap those DEKs with a key encryption key (KEK).

This is nice for a few reasons: it limits how much data gets encrypted with any single key (important for key wear-out), makes key rotation way easier (just re-wrap the DEKs with a new KEK), and contains the blast radius if a key is compromised.

Tink and the AWS Encryption SDK are two libraries that support this pattern. Slack wrote a great piece on their engineering blog about building their Enterprise Key Management system on these principles.
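
A minimal sketch of the pattern with node:crypto - in a real system the KEK would live in a KMS or HSM and the wrap/unwrap calls would go over the network:

```ts
import { createCipheriv, randomBytes } from "node:crypto";

// Encrypt with AES-256-GCM, returning everything needed to decrypt later.
function encrypt(key: Buffer, plaintext: Buffer) {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Hypothetical KEK - in practice, kept inside (or fetched from) a KMS.
const kek = randomBytes(32);

// A fresh DEK per record: bulk data is encrypted under the DEK...
const dek = randomBytes(32);
const record = encrypt(dek, Buffer.from("customer data"));

// ...and only the tiny DEK is wrapped with the KEK. Store the wrapped DEK
// next to the record; rotating the KEK just means re-wrapping DEKs.
const wrappedDek = encrypt(kek, dek);
```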

What are the differences between AES-GCM and AES-CBC? Why might you prefer one over the other?

GCM gives you encryption and integrity in one package (it’s an AEAD cipher), while CBC only handles encryption. You need to add a MAC to CBC yourself, and getting that wrong - the wrong composition order, non-constant-time padding checks - is a common source of vulnerabilities (Lucky 13).

IV reuse is interesting here - CBC degrades somewhat gracefully (identical plaintext prefixes produce identical ciphertext prefixes), while GCM completely falls apart: reusing a nonce leaks the authentication key and lets an attacker forge valid ciphertexts. AES-GCM-SIV is a similar option that’s slightly slower but way more forgiving of nonce reuse.

GCM can decrypt blocks in parallel and supports random access, which seems nice, but acting on plaintext before the authentication tag is verified throws the integrity guarantees away. This has bitten real implementations that tried to be “clever” with random access before.
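
A quick way to feel the difference between the two: tamper with a ciphertext and see what decryption does. With CBC (no MAC), it silently “succeeds”; with GCM, final() throws. A sketch with node:crypto:

```ts
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const key = randomBytes(32);
const iv = randomBytes(16);

const cipher = createCipheriv("aes-256-cbc", key, iv);
const ct = Buffer.concat([cipher.update("transfer $100 to alice", "utf8"), cipher.final()]);

// Flip one ciphertext bit: this garbles block 1 and flips the same bit in block 2...
ct[0] ^= 0x01;

// ...yet CBC decryption completes without any error. With AES-GCM, the same
// tampering would make decipher.final() throw an authentication failure.
const decipher = createDecipheriv("aes-256-cbc", key, iv);
console.log(Buffer.concat([decipher.update(ct), decipher.final()]).toString("latin1"));
```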

Is H(key | message) a secure way to build a MAC?

Nope! This falls to a length extension attack. If you know H(k | m), you can continue hashing from there without knowing k. The attacker reconstructs the hash’s internal state and can append whatever they want to the message.

This bites a lot of hash functions - SHA-1, SHA-2, and MD5 all use the Merkle-Damgård construction, which makes them vulnerable. SHA-3 dodges this by using a sponge construction. Interestingly, truncated variants like SHA-512/256 and SHA-384 are safe too, since attackers can’t reconstruct the full internal state from the truncated output.

Want to do this right? Use HMAC. It’s specifically designed to avoid these pitfalls and it’s available in every major crypto library. H(m | k) isn’t vulnerable to length extension, but it’s still better to just use HMAC.
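
The safe version is a few lines with node:crypto:

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

const key = Buffer.from("server-side-secret"); // illustrative - use a random key

function sign(message: string): Buffer {
  return createHmac("sha256", key).update(message).digest();
}

function verify(message: string, mac: Buffer): boolean {
  const expected = sign(message);
  // Constant-time compare so attackers can't learn how many bytes matched.
  return mac.length === expected.length && timingSafeEqual(mac, expected);
}

const mac = sign("user=alice&role=user");
console.log(verify("user=alice&role=user", mac));  // true
console.log(verify("user=alice&role=admin", mac)); // false
```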

Explain certificate transparency (CT) logs

This post by Emily Stark is a great overview of CT logs. The cliff notes:

Certificate transparency (CT) logs are an auditable record of issued TLS certificates. CT logs enforce transparency for CAs and create a record of what certificates are issued for a domain. CT logs have caught CA compromises before!

Before issuing a certificate, a certificate authority (CA) submits a precertificate - marked with a poison extension indicating it must not be used as a real certificate - to the CT log. A signed certificate timestamp (SCT) is returned to the CA, which can then issue the certificate with the SCT attached.

Safari and Chrome both require that certificates trusted by the browser be recorded in a CT log. Firefox interestingly does not.

Under the hood, CT logs use a Merkle tree as an append-only ledger. Because the log is append-only, corruption can’t be repaired in place - when bit-flips happen, the whole CT log has to be thrown out and replaced.

A side-effect of certificates being recorded in CT logs is that information about subdomains becomes public. You can use crt.sh to search for certificates.
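
For example, a quick subdomain enumeration against crt.sh’s (unofficial) JSON output - the %25 is a URL-encoded % wildcard:

```ts
// Every certificate ever logged for *.example.com, courtesy of CT.
const res = await fetch("https://crt.sh/?q=%25.example.com&output=json");
const certs: { name_value: string }[] = await res.json();

// name_value can hold several newline-separated names per certificate.
const subdomains = new Set(certs.flatMap((c) => c.name_value.split("\n")));
console.log([...subdomains].sort());
```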

What’s the difference between hashing and signing?

Anyone can compute the hash of something. Hashes are collision resistant, meaning it’s computationally infeasible to find two different inputs that hash to the same value.

Checking a message’s hash verifies its integrity – it proves the message hasn’t been tampered with. But since anyone can compute a new hash for a modified message, hashing alone doesn’t prove who created or modified it.

A digital signature provides both authenticity (proving who signed it) and integrity (proving it hasn’t been modified). Signing requires the private key, while anyone can verify the signature. Modern signature schemes like EdDSA, RSA, and ECDSA sign a hash of the message instead of the raw message. This has a couple benefits:

- the expensive signing operation runs over a short, fixed-size digest instead of an arbitrarily large message
- schemes like RSA can only sign inputs smaller than the key itself, so hashing first removes any message-size limit

This also means that the security of a signature scheme relies on the hash - a second input that hashes to the same value would also verify successfully!
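
A sketch of that asymmetry with Ed25519 in node:crypto:

```ts
import { generateKeyPairSync, sign, verify } from "node:crypto";

const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const message = Buffer.from("v1.2.3 release artifact");

// Only the private-key holder can produce this signature...
const signature = sign(null, message, privateKey);

// ...but anyone holding the public key can check it.
console.log(verify(null, message, publicKey, signature));                 // true
console.log(verify(null, Buffer.from("tampered"), publicKey, signature)); // false
```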

Web security

How can you protect against SSRF?

Input validation is the first layer here – allowlist URL schemes and domains for inputs that come from the client. There are tons of ways to bypass input validation, so it’s not sufficient on its own. TOCTOU attacks like DNS rebinding can also happen, where a hostname resolves to a safe address during validation but a private one when the request is made. CVE-2022-41924, an RCE in Tailscale on Windows that exploited this, is a great read.

In Node, you can use an http.Agent to filter requests to private ranges. Network-level protections should also be in place: block access to RFC 1918 ranges using iptables.
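
A sketch of that filtering approach: hook DNS resolution so the check runs on the address the socket actually connects to, which also closes the rebinding window. The private-range check here is illustrative and not exhaustive (no IPv6 ranges, no IPv4-mapped addresses):

```ts
import { lookup } from "node:dns";
import { get } from "node:http";

// Rough check for loopback, link-local, and RFC 1918 ranges.
function isPrivate(ip: string): boolean {
  return /^(127\.|10\.|169\.254\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/.test(ip) || ip === "::1";
}

// Drop-in replacement for dns.lookup that rejects private addresses.
function safeLookup(hostname: string, options: any, callback: any): void {
  if (typeof options === "function") [options, callback] = [{}, options];
  lookup(hostname, options, (err, address, family) => {
    if (!err && isPrivate(String(address))) {
      return callback(new Error(`blocked private address: ${address}`));
    }
    callback(err, address, family);
  });
}

get("http://example.com/", { lookup: safeLookup }, (res) => console.log(res.statusCode));
```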

Cloud metadata endpoints (169.254.169.254) are a juicy target for SSRF - AWS IMDSv1 was particularly bad since a single GET request could grab the instance’s IAM credentials. Block these at the network level and enforce IMDSv2, which fixes this by requiring a session token fetched via a PUT request with a custom header. The hop limit on token responses can also be tuned - keeping it at 1 stops the token from being forwarded past the instance itself (e.g. into containers).

What is CORS? What does it protect against?

CORS (cross-origin resource sharing) is a security mechanism that allows sites to relax the same-origin policy (SOP) and indicate what other origins the browser should allow loading resources from.

CORS protects against cross-origin attacks by requiring servers to explicitly opt in to letting other origins read responses. For a complex cross-origin request (like a PUT or DELETE, or one with custom headers), the browser first sends a pre-flight request. Servers specify which origins, methods, and headers are allowed through Access-Control-Allow-* headers in the pre-flight response.

CORS does not stop cross-origin requests from being made - it only stops their responses from being read. For state-changing operations, CSRF tokens should be used.
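
What the server’s opt-in looks like, sketched as a bare Node http server (the origin allowlist is illustrative):

```ts
import { createServer } from "node:http";

const ALLOWED_ORIGINS = new Set(["https://app.example.com"]);

createServer((req, res) => {
  const origin = req.headers.origin ?? "";
  if (ALLOWED_ORIGINS.has(origin)) {
    // Explicit opt-in: echo back only origins we trust, never "*" with credentials.
    res.setHeader("Access-Control-Allow-Origin", origin);
    res.setHeader("Vary", "Origin");
  }
  if (req.method === "OPTIONS") {
    // Pre-flight response: which methods and headers the real request may use.
    res.setHeader("Access-Control-Allow-Methods", "GET, PUT, DELETE");
    res.setHeader("Access-Control-Allow-Headers", "Content-Type");
    res.writeHead(204);
    return res.end();
  }
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ ok: true }));
}).listen(8080);
```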

You’re building a markdown editor that lets users edit HTML, how can you protect against XSS?

Markdown parsers are XSS nightmares because they’re designed to allow HTML - that’s part of the spec! First, decide if you really need raw HTML support. If you don’t, use a strict markdown-only parser and strip all HTML tags.

If you do need HTML, there are two main options:

- sanitize the rendered output against a strict allowlist using a well-maintained library like DOMPurify (never roll your own sanitizer)
- isolate the rendered HTML entirely, e.g. in a sandboxed iframe served from a separate origin, so anything that slips through can’t touch the main app

One thing to watch out for is injecting dynamic content after sanitization. If the editor lets users inject variables or templates into the markdown (like ${user.name}), you need to sanitize at render time, not just at parse time.
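
One illustrative combination (marked to render, DOMPurify to sanitize - the specific libraries are my example, not a prescription). The key detail is sanitizing the rendered output, not the markdown source, since the renderer can reintroduce HTML after a source-level filter runs:

```ts
import { marked } from "marked";
import DOMPurify from "isomorphic-dompurify";

// Render first - the output may contain raw, attacker-controlled HTML.
const dirty = marked.parse('# hi\n\n<img src=x onerror="alert(1)">') as string;

// Then sanitize the *output* against an allowlist.
const clean = DOMPurify.sanitize(dirty, {
  ALLOWED_TAGS: ["h1", "p", "a", "em", "strong", "code", "pre", "img"],
  ALLOWED_ATTR: ["href", "src", "alt"],
});
```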

Cloud security

You’ve found an SSRF vulnerability in a service running on an EKS instance, how can you exploit this?

Since this is running on AWS, the natural thing to check is if you can exploit the AWS Metadata Endpoint at 169.254.169.254. Assuming you can, you can retrieve the IAM credentials associated with the underlying EC2 instance.
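
From the attacker’s side, the IMDSv1 flow is just two GETs, which is exactly what a basic SSRF gives you:

```ts
// IMDSv1: list the role, then grab its credentials - no special verbs or headers.
const base = "http://169.254.169.254/latest/meta-data/iam/security-credentials/";
const role = await fetch(base).then((r) => r.text());
const creds = await fetch(base + role).then((r) => r.json()); // AccessKeyId, SecretAccessKey, Token

// IMDSv2 breaks this: credentials require a session token that can only be
// fetched with a PUT and a custom header - out of reach for a GET-only SSRF.
const token = await fetch("http://169.254.169.254/latest/api/token", {
  method: "PUT",
  headers: { "X-aws-ec2-metadata-token-ttl-seconds": "21600" },
}).then((r) => r.text());
```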

The default role for EKS nodes is attached to these policies:

- AmazonEKSWorkerNodePolicy
- AmazonEKS_CNI_Policy
- AmazonEC2ContainerRegistryReadOnly

Christophe Tafani-Dereeper has a great post that goes more in-depth on this topic. Some bad things you can do from their post:

- enumerate EC2 instances, subnets, and security groups across the whole AWS account
- pull and inspect any container image in the account’s ECR registries, which frequently have secrets baked in
- abuse the CNI policy’s permissions over network interfaces

Again, all of this is AWS account-wide, comes by default when spinning up an EKS cluster, and is reachable after compromising just a single, underprivileged pod in the cluster.

Ouch!

If you’re able to read files with the SSRF, you can check /var/run/secrets/kubernetes.io/serviceaccount/token for service account tokens associated with the cluster which often have way more permissions than needed. Found one that can list pods? Cool, now you can see all the environment variables they’re injecting, including other services’ credentials.

Application security

A secret key was committed to our GitHub repository, what steps need to be taken?

This happens often - it’s very easy to do (0.1% of all GitHub pushes contain secrets!) - and ideally it should be prevented through better tooling. Visit HowToRotate and follow the steps there.

Trufflehog has great research on all the ways secrets can stick around. Basically, once someone runs git push, it’s impossible to scrub. If the push is to a public repository, the key will likely be automatically revoked through GitHub Secret Scanner Auto Remediator - hopefully before an attacker grabs it, which usually takes about 10 minutes. If the key is in a private repository, you have a little more time.

How can you prevent secrets from getting logged in an application?

This is actually a pretty tricky problem - secrets have a nasty habit of sneaking into logs through error messages, debug traces, and HTTP traces. In 2018, Twitter logged plaintext passwords.

Getting this right requires a lot of trial and error, and playing Whack-A-Mole as new log sites get added. At a high level, making sure secrets are scoped to areas where they’re needed and not getting passed around all over the application makes a big difference.

Languages like Python and JavaScript let you override __str__ or toJSON methods on sensitive objects. secret-value is a TypeScript library that does this for you by wrapping secrets in an object. If your application uses a logging library like Winston or Log4j, you can modify the log stream to redact information.
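
A minimal version of that wrapper pattern (in the spirit of secret-value, not its exact API):

```ts
import { inspect } from "node:util";

class Secret<T> {
  constructor(private readonly value: T) {}
  expose(): T {
    return this.value; // the single deliberate way to get the secret out
  }
  toString(): string {
    return "[REDACTED]";
  }
  toJSON(): string {
    return "[REDACTED]";
  }
  [inspect.custom](): string {
    return "[REDACTED]"; // covers console.log / util.inspect in Node
  }
}

const apiKey = new Secret(process.env.API_KEY ?? "");
console.log(`key: ${apiKey}`);           // key: [REDACTED]
console.log(JSON.stringify({ apiKey })); // {"apiKey":"[REDACTED]"}
```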

An area that I’m not particularly familiar with, but is also worth knowing about is crash dumps. Storm-0558 hacked into US Cabinet emails through a Microsoft crash dump that didn’t have the signing keys scrubbed.

Another approach is to inject canary values and alert if they appear in logs; then you can track down where in the application they’re coming from. The core of Datadog’s sensitive data scanner library, which is designed to detect secrets in logs, is open source on GitHub.

Why are JWTs hard to rotate and what common mitigations exist?

JWTs are designed to be stateless - the token contains all the state needed to check its validity, and it often carries user session information to minimize database lookups. Since there’s no database session row to invalidate, JWTs are hard to revoke before they expire.

There’s a few ways around this, each of which comes with tradeoffs (see the sketch below):

- keep lifetimes short (minutes, not days) and pair access tokens with longer-lived refresh tokens, so a stolen JWT ages out quickly
- keep a server-side denylist of revoked token IDs (the jti claim) - effective, but it reintroduces the per-request state lookup JWTs were meant to avoid
- rotate the signing key itself, which invalidates every outstanding token at once and logs all users out
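
A sketch of the first two combined, using the jsonwebtoken package (the in-memory Set stands in for a shared store like Redis):

```ts
import jwt from "jsonwebtoken";
import { randomUUID } from "node:crypto";

const SECRET = process.env.JWT_SECRET ?? "dev-only"; // illustrative
const revoked = new Set<string>(); // denylist of revoked jti values

// Short-lived token with a unique ID we can later revoke by.
const token = jwt.sign({ sub: "alice" }, SECRET, { expiresIn: "15m", jwtid: randomUUID() });

function check(token: string): jwt.JwtPayload {
  const payload = jwt.verify(token, SECRET) as jwt.JwtPayload; // throws if expired or forged
  if (payload.jti && revoked.has(payload.jti)) throw new Error("token revoked");
  return payload;
}

// "Log this session out now" = denylist its jti until the token would expire anyway.
revoked.add(check(token).jti!);
```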

Design

How would you design login for a web app?

This question covers:

- credential storage (a slow, salted password hash like bcrypt, scrypt, or Argon2)
- session management after login (cookie flags, expiry, logout)
- brute-force and credential-stuffing defenses (rate limiting, MFA)
- recovery flows like password reset, which are often the weakest link

My first answer is “don’t - instead use OAuth with GitHub/Google”, but that makes the question much less interesting!
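
If you do build it yourself, credential storage is usually the first deep-dive. A sketch with the bcrypt package (Argon2 via the argon2 package is an equally reasonable choice):

```ts
import bcrypt from "bcrypt";

const COST = 12; // work factor - tune so hashing takes ~100ms on your hardware

// Registration: store only the slow, salted hash (bcrypt embeds the salt).
const stored = await bcrypt.hash("correct horse battery staple", COST);

// Login: compare the candidate password against the stored hash.
console.log(await bcrypt.compare("correct horse battery staple", stored)); // true
console.log(await bcrypt.compare("hunter2", stored));                      // false
```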

We want to design a system to run user code, how can we sandbox it?

The right approach depends a lot on the use-case, some potential questions to consider:

- how untrusted is the code - internal teammates, paying customers, or anonymous internet users?
- what does it need access to (network, filesystem, packages)?
- how long do runs last, and how sensitive are you to startup latency?
- what’s the blast radius if someone breaks out?

Some easier options, that don’t provide perfect security:

- in-process, language-level sandboxes (restricted interpreters, Node’s vm module) - these have a long history of escapes
- plain Linux containers with seccomp/AppArmor profiles, dropped capabilities, and resource limits - better, but everything still shares the host kernel

Some better options:

- gVisor, which intercepts syscalls in a user-space kernel
- microVMs like Firecracker, which give each workload its own kernel while still booting quickly
- WebAssembly runtimes, which sandbox at the runtime level with a small attack surface

Whatever you use, there are a lot more considerations than just security. How do you handle networking? Package installation? What about memory limits and CPU quotas? How much does it cost to run?

I’m planning on writing an in-depth dive comparing different sandboxing technologies and who uses what. More to come on that later!


  1. Say what you want about Leetcode, but it is easily grindable and possible to get better at with effort. There’s a huge mini-industry around interview prep and practice for software engineers that doesn’t exist in the same way for security engineers. ↩︎