By: JJ Ferman
Published: 9 Nov 2022
Last Updated: 9 Nov 2022
In this article I want to talk about some networking quirks that AWS has within their cloud infrastructure. I'll be covering topics such as how Security Groups work alongside Network ACLs, how to grant your Lambda functions access to private VPC resources, and the difference between public and private subnets.
By default, a Lambda function will run in what's called the Lambda Service VPC. This is a different VPC than your own account VPC — it is transparent to you and configured in AWS's private accounts. With this default configuration, your Lambda function will not have access to your private VPC resources (even though it's "your" Lambda).
There is an alternative networking configuration for Lambda functions, but for many developers this can be tricky to understand. By turning on "VPC mode" for the Lambda function, the runtime is able to connect to and access resources in a VPC that you own.
Under the hood, this creates an Elastic Network Interface (ENI) for the Lambda function, which allows it to connect directly to your VPC. For Lambda functions, this interface is called a Hyperplane ENI.
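As a rough illustration, "VPC mode" comes down to giving the function a list of subnets and security groups. The sketch below assumes a boto3 Lambda client is passed in and uses its `update_function_configuration` call; the helper name `attach_lambda_to_vpc` and the IDs in the usage comment are made up for this example.

```python
def attach_lambda_to_vpc(lambda_client, function_name, subnet_ids, security_group_ids):
    """Switch a Lambda function to VPC mode (hypothetical helper).

    `lambda_client` is assumed to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    if not subnet_ids or not security_group_ids:
        raise ValueError("VPC mode needs at least one subnet and one security group")

    # VpcConfig tells Lambda which subnets to place its network
    # interfaces in and which security groups to attach to them.
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )

# Example call (placeholder IDs, requires boto3 and AWS credentials):
#   import boto3
#   attach_lambda_to_vpc(boto3.client("lambda"), "my-function",
#                        ["subnet-0123example"], ["sg-0123example"])
```

Passing the client in rather than creating it inside the helper makes it easy to try the function against a stub before pointing it at a real account.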
Now that you've got your Lambda function running in your VPC, you've solved one problem but potentially caused another - what if you need a function that can both connect to private VPC resources and still have outbound internet access?
Once you've connected your Lambda to a Hyperplane ENI, its network traffic starts to go through your account VPC. If you don't route outbound internet traffic through a NAT gateway (set up in a public subnet), then your requests won't make it to their intended destination.
If you intend for your Lambda to have outbound internet access with the above configuration, or you want your Lambda to have a static or known public IP address, then a NAT gateway is the way to go.
Essentially, a NAT (Network Address Translation) gateway lets resources in your VPC make outbound connections while preventing anything outside from initiating inbound connections into your VPC (if you want your VPC to be publicly accessible, see Internet Gateway, but note that this does not work for Lambda functions directly).
Setting up a NAT gateway in a public subnet and routing traffic through it from your private subnet (where your Lambda is running) allows outbound traffic to be sent through that NAT gateway. But remember: this is not needed if you aren't connecting your Lambda to your VPC and don't need a known or static public IP address. A NAT gateway created in a public subnet must have an Elastic IP assigned to it at creation. And because the NAT gateway replaces the source IP with its own, all outbound traffic through it has a known public IP (that is, the Elastic IP attached to it).
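To make the routing concrete, here is a small toy model in plain Python, not an AWS API call. It assumes a private subnet whose default route points at a NAT gateway and a public subnet whose default route points at an internet gateway; the CIDR blocks, target names, and the Elastic IP are all invented for the example.

```python
import ipaddress

# Hypothetical route tables: destination CIDR -> target.
PRIVATE_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}
PUBLIC_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
NAT_ELASTIC_IP = "203.0.113.10"  # made-up address from a documentation range


def next_hop(routes, destination):
    """Pick the most specific route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None  # no route: the packet goes nowhere
    return max(candidates, key=lambda item: item[0].prefixlen)[1]


def source_seen_by_destination(private_ip, destination):
    """Return the source IP the destination would see, or None if unreachable."""
    hop = next_hop(PRIVATE_ROUTES, destination)
    if hop == "local":
        return private_ip  # stays inside the VPC, no translation
    if hop == "nat-gateway" and next_hop(PUBLIC_ROUTES, destination) == "internet-gateway":
        return NAT_ELASTIC_IP  # NAT swaps the source for its Elastic IP
    return None


print(source_seen_by_destination("10.0.1.25", "93.184.216.34"))
print(source_seen_by_destination("10.0.1.25", "10.0.2.7"))
```

In this model, traffic to another address in the VPC keeps its private source IP, while anything headed for the internet shows up with the NAT gateway's Elastic IP.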
There are countless ways that your traffic could potentially not reach its target destination, which can make it difficult or frustrating to troubleshoot.
A Network ACL is a way of defining subnet-level policies for your inbound and outbound network traffic. A network ACL consists of ordered allow and deny rules for both inbound and outbound traffic, evaluated from the lowest rule number up, with the first matching rule applied. For a round-trip request to succeed, the ACL has to allow both the traffic inbound to the target resource and the traffic outbound from it.
Network ACLs are stateless, which means that responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa).
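The following is a simplified model of that behavior, not AWS code. It assumes rules match on port only (real rules also match protocol and CIDR), and the names `AclRule`, `acl_decision`, and `round_trip_allowed` are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    number: int      # lower numbers are evaluated first
    action: str      # "allow" or "deny"
    port_from: int
    port_to: int


def acl_decision(rules, port):
    """First matching rule wins; anything unmatched is denied."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"


def round_trip_allowed(inbound, outbound, dest_port, client_port):
    """Stateless check: the request and its response are judged separately."""
    request_ok = acl_decision(inbound, dest_port) == "allow"
    # The response leaves the subnet toward the client's ephemeral port.
    response_ok = acl_decision(outbound, client_port) == "allow"
    return request_ok and response_ok


inbound = [AclRule(100, "allow", 443, 443)]
outbound_too_narrow = [AclRule(100, "allow", 443, 443)]
outbound_ephemeral = [AclRule(100, "allow", 1024, 65535)]

print(round_trip_allowed(inbound, outbound_too_narrow, 443, 50000))
print(round_trip_allowed(inbound, outbound_ephemeral, 443, 50000))
```

The first call fails even though the inbound rule allows port 443, because nothing on the outbound side lets the response back out to the client's port.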
Security Groups are kind of like virtual firewalls. They allow you to create rules around types of traffic that you want to allow, inbound or outbound. These groups can be connected to one or more AWS resources.
Notice how I didn't say deny traffic. In contrast to network ACLs, security groups can only allow traffic. There's no designation for denying traffic with a security group. By default, however, when you create a security group, all inbound traffic is "denied" because you need to explicitly allow traffic.
Security groups are stateful - if you send a request from your instance, the response traffic for that request is allowed to flow in regardless of inbound security group rules.
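Here is a toy sketch of what "stateful" and "allow only" mean in practice. It is a deliberate simplification: it tracks connections by remote IP and port only, and the `SecurityGroup` class is hypothetical, not part of any AWS SDK.

```python
class SecurityGroup:
    """Toy model: allow rules only, plus tracking of outbound connections."""

    def __init__(self, inbound_ports=(), outbound_ports=()):
        self.inbound_ports = set(inbound_ports)    # there are no deny rules
        self.outbound_ports = set(outbound_ports)
        self._tracked = set()                      # connections we initiated

    def send_request(self, remote_ip, remote_port):
        """Outbound traffic must match an outbound allow rule."""
        if remote_port not in self.outbound_ports:
            return False
        self._tracked.add((remote_ip, remote_port))
        return True

    def accept_inbound(self, remote_ip, remote_port, local_port):
        """Responses to tracked connections bypass the inbound rules."""
        if (remote_ip, remote_port) in self._tracked:
            return True
        return local_port in self.inbound_ports


sg = SecurityGroup(inbound_ports=[], outbound_ports=[443])
print(sg.accept_inbound("198.51.100.7", 443, 51000))  # unsolicited traffic
sg.send_request("198.51.100.7", 443)
print(sg.accept_inbound("198.51.100.7", 443, 51000))  # response to our request
```

With no inbound rules at all, the unsolicited packet is rejected, but the same packet is accepted once it is a response to a request the instance sent.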
Network ACLs and Security Groups don't override each other but instead work together to govern network traffic through your subnets and resources. This means you could have a security group connected to a resource that allows outbound traffic but then a network ACL on the same subnet that resource is connected to that blocks outbound traffic. This might cause some confusion as to why your traffic isn't making it to where you intended.
Fortunately, AWS has a handy tool to help with that called the VPC Reachability Analyzer. With this tool, you can quickly check whether the network traffic you're intending to send or receive (or block) is configured correctly.
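The analyzer can also be driven from code. The sketch below assumes a boto3 EC2 client is passed in and uses its `create_network_insights_path`, `start_network_insights_analysis`, and `describe_network_insights_analyses` calls. The helper name, polling defaults, and resource IDs are assumptions for illustration, so check the response fields against the current boto3 documentation before relying on it.

```python
import time


def check_reachability(ec2, source_id, destination_id, port,
                       protocol="tcp", poll_seconds=2.0, max_polls=30):
    """Run a Reachability Analyzer check and return True if a path was found.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    path = ec2.create_network_insights_path(
        Source=source_id,
        Destination=destination_id,
        Protocol=protocol,
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    analysis = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = analysis["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # The analysis runs asynchronously, so poll until it leaves "running".
    for _ in range(max_polls):
        response = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )
        result = response["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            return result["Status"] == "succeeded" and bool(result.get("NetworkPathFound"))
        time.sleep(poll_seconds)
    raise TimeoutError("reachability analysis did not finish in time")

# Example call (placeholder IDs, requires boto3 and AWS credentials):
#   import boto3
#   check_reachability(boto3.client("ec2"), "eni-0123example", "eni-0456example", 443)
```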
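A subnet is a range of IP addresses in your VPC. What makes a subnet public or private is simply whether those IPs are directly accessible from the outside internet or not. In AWS terms, a public subnet is one whose route table has a route to an internet gateway.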
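One way to tell the two apart is to look at the subnet's route table. The sketch below assumes route entries shaped like the `Routes` list that boto3's `describe_route_tables` returns (dicts with keys such as `DestinationCidrBlock`, `GatewayId`, and `NatGatewayId`); the IDs are placeholders and `is_public_subnet` is a made-up helper.

```python
def is_public_subnet(routes):
    """A subnet counts as public if any of its routes targets an internet gateway."""
    return any(route.get("GatewayId", "").startswith("igw-") for route in routes)


# Hypothetical route tables for two subnets in the same VPC.
public_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123example"},
]
private_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123example"},
]

print(is_public_subnet(public_routes))
print(is_public_subnet(private_routes))
```

The private subnet can still reach the internet through its NAT gateway route, but nothing on the internet can open a connection to it directly.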
In general, it's good practice to make use of public and private subnets. That way any sensitive data can be behind a protected network and you won't have any directly accessible port open for some bad actor to bang against.
There are lots of different ways these subnets could be configured, but I've created a simple example below to illustrate how it could look.
In summary, there's a lot to know about AWS networking and its intricacies. One way to quickly get familiar is to just try something out and see how it works.