Ludwin Fuchs, the authority on Host Identity Protocol (HIP), covers identities, trust, the networking data plane, and intelligent multi-homing in Tempered's Airwall Solution.
Tempered Host Identity Protocol (HIP): Identities and Trust and Achieving Zero Trust Security

 

Other Tempered presentations from Security Field Day 3

Tempered's use of Cloud, Virtualization, Containers, and APIs
What is Airwall - a technical introduction and demo
Who is Tempered - an introduction by Jeff Hussey, CEO and Founder

Transcript

Ludwin Fuchs: [00:00:08] So good afternoon, everybody. My name is Ludwin Fuchs. I've been a Tempered engineer for eight years now. And although my day-to-day work here is usually focused more on the control plane side, what I'm going to do during the next 30 minutes is dive a little deeper into the Tempered data plane. At the core of the Tempered data plane is the Host Identity Protocol, and I don't assume everybody is familiar with that. So I'm going to go a little bit into that. I'm going to talk about what the motivation behind it was and what goals the authors had when they worked on the protocol. I'm going to talk a little bit about the core key exchange mechanism of the Host Identity Protocol, which is called the HIP base exchange. It's the central piece of the work, where all of the security posture derives from. I'm going to spend a little bit of time on that and on the security assertions that you can make following out of that protocol. I'm going to follow that up by talking a little more about how this all fits together with the orchestration environment that Bryan has shown, how the pieces work together. And then I'm going to move along to some of the architecture pieces that our technology consists of: things like the relay, how we do port isolation, and how we deal with things like mobility and multi-homing.

 

Ludwin Fuchs: [00:02:02] All right.

 

Ludwin Fuchs: [00:02:04] So the Host Identity Protocol is essentially an authenticated key exchange. It's what is called a SIGMA-compliant authenticated key exchange, which is basically saying that it has perfect forward secrecy. So it's hardened against a lot of typical cryptographic network attacks and things like that. What's nice about HIP is that it also comes with a lot of network infrastructure answers. Things like mobility, multi-homing, and NAT traversal are all addressed either directly in the core standards documents or in related RFCs. So it has a really rich infrastructure environment, which makes it really attractive. What we also like about HIP is that even though it does not rely on a public key infrastructure, certificates, and all those kinds of things, it does play nicely along with them. So if you would like to have those kinds of things, you can, and HIP will work with them. All right, so a few words about the actual protocol standard. The work on HIP started to get into gear for real around 2003. There is RFC 5201, which is the first standards document of what is today called HIP v1.

 

Ludwin Fuchs: [00:03:51] This was followed up in 2015 by RFC 7401, which covers version two of the protocol standard. The major changes between the two versions are essentially all around cryptographic agility. A few of the crypto algorithms were kicked out, so they gave the boot to things like SHA-1. They added various mandatory transforms; there are a few elliptic curve related transforms that were added. They also allowed various things to be negotiated that version one didn't allow, in particular around the Diffie–Hellman key exchange. And I think there were also several changes related to packet processing in the version two standard. So those are the core documents. There are other ones that are really interesting to look at. In particular, there's RFC 5202 talking about integration with ESP, which, as you know, is what we are doing. There is RFC 5207, which is about NAT traversal and firewalls; we have RFC 8005, which is a DNS extension; and there are RFC 8046 and 8047 talking about host mobility and multi-homing.

 

Ludwin Fuchs: [00:05:13] All right.

 

Ludwin Fuchs: [00:05:20] All right.

 

Ludwin Fuchs: [00:05:21] So the core premise when the authors of HIP started working on this protocol was that in IP networking, the address namespace is kind of overstretched. Right. Basically, IP addresses as used today have this dual role as locators and as identifiers, and IP networking doesn't really deliver on both of those roles. So when we type, you know, KeyBank.com into our browser or something like that, we expect, of course, that the browser knows how the packets go from the browser to wherever KeyBank is and find their way back. But what we also expect is that at the end of that path is, in fact, KeyBank. We wouldn't want anybody else to get those packets. And so essentially, that's what this whole dichotomy is about. Of course, the Internet, IP networking in general, has been really impressive at dealing with the first aspect. We've got billions of nodes on the Internet nowadays.

 

Ludwin Fuchs: [00:06:43] And, you know, by and large, it works just really great. But, of course, the identity aspect of the whole situation has been a miserable failure, which is sort of the reason why we have a network security industry nowadays. So the people behind HIP thought they could fix this situation. And the idea they had was nothing short of: all right, we're basically going to change the network stack. We're going to come along with this protocol and we're going to have this layer in between the network and the transport layer. That's going to be our host identity layer, and it's going to deal with all of these problems. And this is going to turn untrusted traditional networking into trusted, secure networking. Well, of course, nobody wants to change their networking stack, and the good news is you don't have to, because with our overlay routed solution you just don't have to do it. So a little bit about the HIP protocol here.

 

Ludwin Fuchs: [00:08:03] So first of all, what is the host identity?

 

Ludwin Fuchs: [00:08:08] So that's the first thing to understand with HIP. In the case of HIP, the host identity is simply the public part of an asymmetric key pair. That seems like a good approximation for representing identity. However, public keys are rather large, and depending on the crypto algorithm, the keys can also vary in size. So they're unwieldy. You can't really use them in places where you would want to have identifiers for those identities. You wouldn't want to type them in on a console or something like that; that's just too unwieldy for these kinds of things. So the people in HIP created what is called a host identity tag, which is basically a 128-bit condensation of the entire public key.

 

Ludwin Fuchs: [00:09:06] They're called HITs, and they are essentially what is called an ORCHID, which stands for Overlay Routable Cryptographic Hash Identifier. That is basically a fancy way of saying they look like IPv6 addresses and they can be used like IPv6 addresses. An example is here on the slide. HITs in particular are built up by reserving 28 bits for a global IANA-assigned routing prefix. Then there are four bits reserved, which essentially encode the encryption and hashing algorithm used, or the hashing and truncation methods used, in order to obtain the HIT. That leaves you with 96 bits in that identifier for the actual hash to be put in. And that's great.
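
To make the 28 + 4 + 96 bit layout concrete, here is a minimal Python sketch. It is a simplification under stated assumptions, not the exact ORCHID generation algorithm: `2001:20::/28` is the ORCHIDv2 prefix, but the hash input here is just the raw public key bytes run through SHA-256 and cut to the first 96 bits, and `make_hit` and the `oga_id` value are hypothetical names chosen for illustration.

```python
import hashlib
import ipaddress

# Top 28 bits of 2001:20::/28 (the ORCHIDv2 prefix).
ORCHID_PREFIX_28 = 0x2001002


def make_hit(public_key: bytes, oga_id: int = 1) -> ipaddress.IPv6Address:
    """Build a HIT-shaped IPv6 address: 28-bit prefix | 4-bit algorithm ID | 96-bit hash.

    Assumption: a plain truncated SHA-256 of the key bytes stands in for the
    real ORCHID hash input and truncation rules.
    """
    if not 0 <= oga_id <= 0xF:
        raise ValueError("oga_id must fit in 4 bits")
    digest = hashlib.sha256(public_key).digest()
    hash96 = int.from_bytes(digest[:12], "big")  # 12 bytes = 96 bits
    value = (ORCHID_PREFIX_28 << 100) | (oga_id << 96) | hash96
    return ipaddress.IPv6Address(value)


hit = make_hit(b"example public key bytes")
print(hit)  # looks like an ordinary IPv6 address inside 2001:20::/28
```

Because the result is an ordinary 128-bit value, it can be placed anywhere an IPv6 address fits, which is the property the talk relies on.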

 

Ludwin Fuchs: [00:10:01] So we have these identifiers now and you can put them into all kinds of places. You can stick them into packet headers. And if you've got your routing set up, you can actually use them as IP addresses on your overlay network, which is exactly what we are doing at Tempered.

 

Ludwin Fuchs: [00:10:20] So with this, we can start talking about the actual HIP base exchange. The base exchange has two parties: there's an initiator and a receiver, and there are four messages being passed between those. They're called I1, R1, I2, R2. And if you've gone through the whole messaging sequence, what you've got is a HIP association. I'm not going to go into all the nitty-gritty details here, but I will point out what the roles of all of these messages are, what the security implications are, and why those messages exist. So you've got the first message, the I1. It will simply contain the HITs of both of the parties. The receiver's HIT is actually not mandatory, so implementations may allow you to omit it. You could eventually build up things such as a discovery system where the initiator of the communication doesn't actually know the identity of the receiver. But a typical exchange would have both. Now, the receiver, after receiving this trigger packet, will send the reply, the R1, to the initiator. This reply will contain what is like a little cryptographic puzzle. And the point of this puzzle is that the receiver wants to make sure that the initiator is sincere about the communication.

 

Ludwin Fuchs: [00:12:05] You know, because he has to spend CPU cycles in order to solve the puzzle. Now the initiator goes off and works on the puzzle solution. Once he has found the solution, he can send the I2 message back to the receiver. Then the receiver goes along and checks the solution. If all goes well and the solution is right, the receiver will send an R2, which concludes the whole base exchange. And at the end of this, you'll have the HIP association. So what are the security-relevant roles of this? First of all, steps one, two and three essentially consist of an authenticated Diffie–Hellman key exchange. In the I1 packet, the initiator will also include his preferred list of Diffie–Hellman group IDs. Then the receiver will send his own group IDs in R1, and he will also already pick a group ID and add his own Diffie–Hellman parameters. Then in the third message, the initiator will just send his own Diffie–Hellman parameters to the receiver. At that point, both parties have shared cryptographic material that they can draw session keys from. So that's great. The other point is that steps three and four essentially constitute the actual identity check.
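
The puzzle step can be sketched as a small proof-of-work loop. In HIP the initiator searches for a value J such that the lowest K bits of a hash over the puzzle value I, both HITs, and J are all zero. The sketch below assumes SHA-256 as the hash and an 8-byte counter for J, and the function names are hypothetical; packet encoding and the rest of the exchange are left out.

```python
import hashlib


def _puzzle_digest(i: bytes, hit_i: bytes, hit_r: bytes, j: bytes) -> bytes:
    # Hash over puzzle value I, initiator HIT, receiver HIT, and candidate J.
    return hashlib.sha256(i + hit_i + hit_r + j).digest()


def _low_bits_zero(digest: bytes, k: int) -> bool:
    # True when the k lowest-order bits of the digest are all zero.
    return (int.from_bytes(digest, "big") & ((1 << k) - 1)) == 0


def solve_puzzle(i: bytes, hit_i: bytes, hit_r: bytes, k: int) -> bytes:
    """Initiator side: brute-force J. Expected cost is about 2**k hashes."""
    counter = 0
    while True:
        j = counter.to_bytes(8, "big")
        if _low_bits_zero(_puzzle_digest(i, hit_i, hit_r, j), k):
            return j
        counter += 1


def verify_solution(i: bytes, hit_i: bytes, hit_r: bytes, k: int, j: bytes) -> bool:
    """Receiver side: a single hash is enough to check the answer."""
    return _low_bits_zero(_puzzle_digest(i, hit_i, hit_r, j), k)


puzzle_i = bytes(8)                      # placeholder puzzle value from R1
hit_i, hit_r = b"\x01" * 16, b"\x02" * 16
j = solve_puzzle(puzzle_i, hit_i, hit_r, k=10)
print(verify_solution(puzzle_i, hit_i, hit_r, 10, j))
```

The asymmetry is the point: solving costs many hash evaluations, checking costs one, so the receiver can raise K under load without doing more work itself.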

 

Ludwin Fuchs: [00:13:40] Right. So an important thing to understand with the first and the second messages is that even though the second message is signed by the receiver, it is actually not possible to verify identity at that point. It would basically be prone to a replay attack, because there is nothing in that signature that is really verifiable by the initiator in terms of identity. But during the third and fourth messages, both participants have shared key material, so they can produce things like HMACs and signatures that allow them to actually verify the identity. They are also exchanging their public keys in R1 and I2. So they have a strong cryptographic means to verify each other's identity. Now, I've already said that the puzzle is effectively a defense against denial of service. The reason is that during this whole exchange, up to and including the third message, all of the effort is on the initiator side. The receiver can answer that first initiator message with the puzzle, which can, for example, be pre-created and simply be reused. The signed part of that message also contains only parts that never change, so the receiver can also pre-create that part of the R1 message. So there's really very little effort that the receiver has to spend in order to deal with these messages.

 

Ludwin Fuchs: [00:15:23] And the receiver also does not need to create any state about that ongoing conversation until he has received the I2 message. So that is an effective way to deal with denial of service. Also, what about replay attacks? Well, that is essentially prevented also by the third and fourth messages, because, again, at that point, we have shared keys on both sides, so we can add things like HMACs. So even though the first two messages would be, in theory, replayable by an attacker, the attacker would not be able to bring the whole conversation of the base exchange across the finish line if he didn't actually have access to the shared key. Finally, the HIP protocol also talks about preventing downgrade attacks, and that is achieved by prescribing how the first two messages need to be handled. In the first two messages, the participants exchange their preferred Diffie–Hellman parameters and things like that, so protocol downgrade is prevented by how that's supposed to be handled. All right. So do we have any questions at this point?
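
The replay argument can be illustrated with a short Python sketch. It assumes a toy key derivation (a single SHA-256 over the Diffie–Hellman secret and both HITs), which is a stand-in and not HIP's actual keying material derivation; `derive_session_key`, `protect`, and `accept` are hypothetical names.

```python
import hashlib
import hmac


def derive_session_key(dh_shared_secret: bytes, hit_i: bytes, hit_r: bytes) -> bytes:
    # Illustrative KDF only; the protocol defines its own derivation.
    return hashlib.sha256(dh_shared_secret + hit_i + hit_r).digest()


def protect(key: bytes, packet: bytes) -> bytes:
    # HMAC over the packet body using the session key.
    return hmac.new(key, packet, hashlib.sha256).digest()


def accept(key: bytes, packet: bytes, mac: bytes) -> bool:
    # Constant-time comparison of the expected and received HMAC.
    return hmac.compare_digest(protect(key, packet), mac)


hit_i, hit_r = b"I" * 16, b"R" * 16
old_key = derive_session_key(b"secret from an earlier exchange", hit_i, hit_r)
new_key = derive_session_key(b"secret from the current exchange", hit_i, hit_r)

i2_body = b"I2 packet body"
captured_mac = protect(old_key, i2_body)  # what an eavesdropper could record

print(accept(old_key, i2_body, captured_mac))  # original session: accepted
print(accept(new_key, i2_body, captured_mac))  # replay into a new session: rejected
```

An attacker can resend recorded I1 and R1 packets, but a recorded I2 carries an HMAC bound to the old shared secret, so it fails verification once the receiver is working from fresh Diffie–Hellman material.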

 

Delegate: [00:16:56] Diffie–Hellman groups are supported?

 

Ludwin Fuchs: [00:16:59] Yes, well. So it's one of the things that was added. Essentially, preferred Diffie–Hellman groups are supported in HIP v2. The initial version of the protocol did not allow that.

 

Ludwin Fuchs: [00:17:17] All right. So just a little bit on how the Tempered solution has a bit of a slant on this. One of the things I mentioned earlier is that we do actually include certificates in the base exchange. All the Tempered Airwalls get provisioned with certificates, and the Conductor will distribute these certificates to the Airwalls. So as part of the key exchange, all of the Airwalls will check the certificates, not just the public keys that come from the other side. So we've added that. In terms of where we are with v1 and v2 of the base exchange, we're sort of in between. We are supporting all of the new crypto algorithms. We have AES-256, and all the mandated crypto algorithms are supported. But we do not have all of the features of v2 in place at this time.

 

Ludwin Fuchs: [00:18:23] All right, so.

 

Ludwin Fuchs: [00:18:26] Now, after the base exchange has completed, we have a shared HIP association between the two parties. So what's happening now? One of the things that was also going on during the last two messages of the base exchange is that the parties exchange ESP parameters. Basically, the security parameter index and also the preferred transform are exchanged as part of the exchange. So at the end of that, both parties are essentially ready to start sending ESP packets to one another. And just to come back, I know we had this question during Bryan's session earlier on: what are the kinds of packets that you see on the data plane? Well, you basically see two kinds of packets, both of which are UDP encapsulated. In the first case, you see actual HIP packets. HIP has its own protocol ID, so you see HIP control packets. But the vast majority of packets that you would see on your network would be UDP-encapsulated ESP packets. And that's where the actual data traffic is being carried along.

 

Delegate: [00:19:45] That raises the question that I kind of have about this overlay/underlay setup that you've got. What reachability is required in the underlay network to be able to set up this overlay network?

 

Ludwin Fuchs: [00:19:58] So you have two options, right? You know, we have the relay architecture. Without a relay, you have to have port 10500 reachability to one of the peers in order to build a tunnel with another peer. So you can traverse NATs easily that way. As long as you have the HIP port reachable from one peer to another peer, you can build a tunnel.

 

Delegate: [00:20:28] Right, because one of the use cases I think we discussed earlier was around operational technology networks and SCADA networks and so on, where part of the protection we had was that it wasn't actually connected to these networks before. But now we have to have some kind of IP reachability into a network that previously didn't actually speak IP. So I'm just trying to understand how we do this in a safe way, so we're not actually increasing the attack surface while we try to put this overlay network on top?

 

Ludwin Fuchs: [00:20:57] Right. Yeah. And, you know, as we mentioned, with the relay in the loop, you actually need no direct path between the two peers, because you can go through the relay.

 

Ludwin Fuchs: [00:21:12] All right. So, OK, a few more things to mention here. Also as part of the HIP protocol there are additional sorts of packets; the most important one is called the HIP update packet. It's also a packet that provides strong assurance. It has an HMAC and it has the signature and all of that. It allows the two peers to do things like rekeying the session key or notifying each other that the underlay IP addresses have changed. All those kinds of things are carried over HIP control packets such as update.

 

Ludwin Fuchs: [00:21:55] Also, I'd like to say a little bit about our ESP again. Tempered uses ESP in tunnel mode.

 

Ludwin Fuchs: [00:22:05] So, we tunnel everything, but actually we tunnel more than the standard ESP tunnel does.

 

Ludwin Fuchs: [00:22:13] We also include the Ethernet header in the encryption. This leaves us with roughly 90 bytes of packet overhead on the data plane. And now that you look at the architecture here: we've got tunnels set up and we have this whole identity framework. So what does that actually mean? Well, it's a little bit different between our agent platforms and our gateway platforms. For our agents, the identity really only covers the agent itself, so trust does not go beyond the device that actually runs the agent software. This is different from our gateways, where the gateway identity is really sort of a placeholder for multiple devices behind that gateway, or it could be an entire network. And what we also add with the port isolation is that we allow you not just to do north-south segmenting of your network; we also allow you to go east-west on a single gateway. And it's all controlled by the Conductor. So this is really the place where the Conductor orchestration comes in, because host-to-host identity is only half of the picture. What comes on top of that is the entire policy that gets pushed out from the Conductor. Essentially, for any packet that hits the HIP daemon, we can tell by looking at the source and the destination IPs whether this packet is actually going to go into the tunnel or out of the tunnel. And it's only going to do that if we have policy for that.
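
The policy check described here, where a packet is forwarded only if its source and destination match pushed-down policy, can be sketched as a default-deny lookup. This is an illustration under assumptions, not the Airwall implementation: the `Policy` type, the `is_allowed` function, and the example subnets are all hypothetical.

```python
import ipaddress
from typing import Iterable, NamedTuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class Policy(NamedTuple):
    """One allowed (source network, destination network) pair."""
    src: Network
    dst: Network


def is_allowed(policies: Iterable[Policy], src_ip: str, dst_ip: str) -> bool:
    """Default deny: forward only when some policy covers both addresses."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(src in p.src and dst in p.dst for p in policies)


# Hypothetical policy set, as it might look after being pushed to a gateway.
policies = [
    Policy(ipaddress.ip_network("10.1.0.0/24"), ipaddress.ip_network("10.2.0.0/24")),
]

print(is_allowed(policies, "10.1.0.5", "10.2.0.9"))   # covered by policy
print(is_allowed(policies, "10.1.0.5", "10.3.0.9"))   # no policy, dropped
```

With an empty policy list nothing is forwarded, which matches the statement that a packet only enters or leaves the tunnel if there is policy for it.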

 

Delegate: [00:24:09] Just a quick question regarding scalability. How many of these sessions can you have in parallel? What is memory or what is the limiting factor and how large can you scale?

 

Ludwin Fuchs: [00:24:23] So, you're talking about like how many tunnels from a single Airwall?

 

Delegate: [00:24:28] Exactly.

 

Ludwin Fuchs: [00:24:29] So we can scale up. Actually, I have the expert here in terms of performance questions. So, I'm going to bring in Dustin here. Dustin is the person that can answer this.

 

Dustin Lundquist: [00:24:42] Yeah. So we can scale depending on platform and memory footprint; the footprint per tunnel is relatively small. We ship with a 1K (1000) tunnel limit and that's configurable, and then we'll bump it up. We've bumped it up to 4K (4000) plus. But most of our customers aren't looking for that, though. We're usually seeing several hundred as the upper limit there.

 

Ludwin Fuchs: [00:25:17] All right, thank you.

 

Ludwin Fuchs: [00:25:21] Ok. Any more questions? I think I'm going to have to start flying along here. So. A few things here that I'd like to bring up.

 

Ludwin Fuchs: [00:25:40] We've talked about identity. So how does the identity actually get on the Airwall? That's one of the things we have a process for. Essentially, it goes like this. As the customer fires up the Airwall, the Airwall will go along and create a key pair. The customer will go to the Airwall and pair it with a Conductor, and as soon as the Airwall connects to the Conductor, it will send a certificate signing request to that Conductor. What we have here at Tempered also is an identity server. It's essentially a provisioning server that we expose on the Internet. As the Airwall shows up in the Conductor UI, the customer first of all has to actually allow it to go through. Then the Conductor will communicate with the Tempered identity server and get a signed certificate from Tempered. And that's what's going to end up on the Airwall. So from that point on, the Airwall is provisioned.

 

Ludwin Fuchs: [00:26:57] A few things about, you know, what's inside the box. So this is essentially a very, very, you know, high level view on the gateway architecture, on the Airwall gateway architecture.

 

Ludwin Fuchs: [00:27:08] So we're running OpenWrt as the base OS, essentially. It's a firmware, and we are pretty happy with OpenWrt; it's a pretty nice ecosystem. You can boil it down very small, and it already has all of the network management features and all the kinds of things that you'd like to have on your base OS. So we've been happy with it. On the architecture side, on the left you have the overlay side and on the right you have the underlay side. It's just a schematic here. We have what's called port groups. Those are essentially Linux namespaces that allow us to put the entire interfaces of those ports into a separate namespace, so they're isolated from one another. That makes it easy to have port isolation going. We're basically supporting two types of ports there. One is what is called a hybrid layer 2 / layer 3 port. Essentially, if you wanted to have an overlay network that spans a LAN on either side of the tunnel, then that would be the kind of port group configuration that you would use.

 

Ludwin Fuchs: [00:28:30] And we have routed-only mode. Basically, if you wanted to have separate subnets behind your Airwalls, then that is the preferred way of configuring those. Now, to talk with the actual HIP daemon on the Airwall there are tap interfaces. They're bridged in the case of the hybrid ports and unbridged in the other case. Inside of HIP at that point, it's really all pushing packets around, encrypting packets coming in from one side and pushing them out on the other side. The other important part of the architecture is that we have a control daemon. The most important role of that is to connect to the Conductor and get all of the Conductor policy configuration. The control daemon also runs things such as the monitoring framework and the link managing framework, and it's also responsible for all of the OS-level configuration jobs.

 

Ludwin Fuchs: [00:29:47] A few things about the relay.

 

Ludwin Fuchs: [00:29:50] So the relay is Tempered's answer to defeat double NAT. Essentially, the way this works is that the Conductor provides what's called relay routes. Any Airwall can be told to have various types of relay connections to peer Airwalls. So from the perspective of a single Airwall, you end up with multiple relays that could be used to talk to a peer Airwall, and we have to figure out which relay would be the best to use for a base exchange. We have two approaches to do that. One approach is what we call the shotgun approach. Using that approach, you have all the Airwalls essentially continuously try to build HIP tunnels, and while they do that, they send I1 packets to each of the relays. The relays really aren't involved in any of the base exchange, but what they do have to do is learn the mapping of the identities to the IPs and ports of each of the participants. So the relays know which Airwall has which IP address and all of that. The other approach is what we call relay probes. Relay probes are a more lightweight approach, where each Airwall sends a special I1 packet to each of the relays that it has, and the relays just respond with an R1 packet. The Airwall uses that to time the exchange and keeps an exponential average. Then it's just going to try the fastest relay first in order to do the base exchange.
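
The relay-probe idea, timing each probe and keeping an exponential average per relay, can be sketched as follows. The smoothing factor, the class name `RelaySelector`, and the relay names are assumptions for illustration; the talk does not give the actual parameters.

```python
from typing import Dict, List


class RelaySelector:
    """Tracks an exponentially weighted average round-trip time per relay."""

    def __init__(self, alpha: float = 0.2) -> None:
        # alpha is the weight of the newest sample (assumed value).
        self.alpha = alpha
        self.avg_rtt_ms: Dict[str, float] = {}

    def record_probe(self, relay: str, rtt_ms: float) -> None:
        """Fold one probe timing (I1 sent, R1 received) into the average."""
        previous = self.avg_rtt_ms.get(relay)
        if previous is None:
            self.avg_rtt_ms[relay] = rtt_ms
        else:
            self.avg_rtt_ms[relay] = self.alpha * rtt_ms + (1 - self.alpha) * previous

    def ranked(self) -> List[str]:
        """Relays ordered fastest first; try the base exchange in this order."""
        return sorted(self.avg_rtt_ms, key=lambda relay: self.avg_rtt_ms[relay])


selector = RelaySelector()
selector.record_probe("relay-a", 100.0)
selector.record_probe("relay-b", 40.0)
selector.record_probe("relay-a", 20.0)   # one fast sample only nudges the average
print(selector.ranked())
```

The exponential average smooths out single outliers, so one unusually fast or slow probe does not immediately reorder the relays.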

 

Ludwin Fuchs: [00:31:39] All right. I think I'm at the end of the time here. How am I doing? Can I get five minutes? OK, so the last thing to talk about here is our link managing framework. One of the nice things about our Airwalls is, for example, that we have models that allow you to equip them with dual SIMs. So you could basically have a sort of remote HA configuration for those kinds of Airwalls, where you have one SIM configured with one carrier and a fallback SIM to talk to a different carrier. We have software on the Airwall that will monitor the network and provide failover. And the way we allow you to do that is with what are called failover groups.

 

Ludwin Fuchs: [00:32:41] So basically failover groups can be used to put in different types of links. They don't have to be cellular. You can have any type of underlay link you would like. You can group them and you can assign them priorities. And you can also assign a traffic class, as maybe you want different failover groups for different types of traffic. For example, you could have one group for the Conductor connection and another group for your data plane connectivity. The link manager will run and cooperate with the monitoring framework and do things like run pings and find out which one of the links actually works and which is the one that we should be using now. So this is giving you fast failover, and that is just a must-have in many industrial application scenarios. We've had customers who had factory applications where the Airwall was part of moving robots that had to be wireless, but then all of a sudden they have to be wired. So failover in those kinds of situations is a really nice feature.
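
A failover group as described, with links, priorities, and a traffic class, can be sketched as a small data structure plus a selection rule. Everything named here (`Link`, `FailoverGroup`, the convention that a lower number means higher priority, the link names) is a hypothetical illustration rather than the Airwall configuration model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    name: str
    priority: int          # assumption: lower number is preferred
    healthy: bool = True   # would be updated by the monitoring framework (e.g. pings)


@dataclass
class FailoverGroup:
    traffic_class: str     # e.g. "conductor" or "data-plane"
    links: List[Link] = field(default_factory=list)

    def active_link(self) -> Optional[Link]:
        """Pick the preferred link among those currently passing health checks."""
        candidates = [link for link in self.links if link.healthy]
        return min(candidates, key=lambda link: link.priority, default=None)


group = FailoverGroup(
    traffic_class="data-plane",
    links=[Link("sim-carrier-1", priority=1), Link("sim-carrier-2", priority=2)],
)
print(group.active_link().name)   # preferred link while it is healthy

group.links[0].healthy = False    # monitoring reports the first link as down
print(group.active_link().name)   # selection moves to the next priority
```

Note that nothing is brought up or torn down in this sketch; all links stay in the group and only the selection changes.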

 

Delegate: [00:33:58] But this is pure, active-passive, right?

 

Ludwin Fuchs: [00:34:01] No, it's actually active-active. So both of the links are up at all times, or however many links you have. It's basically just a matter of the link manager telling the control plane to use this interface instead of that one. But it's not like, you know, oh, the link needs to be wound up.

 

Delegate: [00:34:27] And do you have kind of a load sharing as well? So that some of the connections use link one and the others link two, to optimize bandwidth?

 

Ludwin Fuchs: [00:34:37] That's a good point. No, I don't think we have that at this time.

 

Delegate: [00:34:43] Ok.

 

Delegate: [00:34:44] It was mentioned earlier about the smaller, almost throwaway gateways. Are any of those PoE enabled or PoE pass-through?

 

Ludwin Fuchs: [00:34:54] I think we have, yes. I think we have PoE-enabled gateways. I'm not too familiar with the details. Does anyone here in the room know about it? Yeah, I hope so. All right. So I've got one of these sitting right there.

 

Ludwin Fuchs: [00:35:09] So this is one of our newest models. And, yes, it is PoE enabled.