Background
The adoption of Zero Trust security is gaining momentum, particularly in the U.S. government, driven by the impending deadline set by the Office of Management and Budget Memorandum M-22-09. This memorandum requires federal agencies to meet specific cybersecurity standards and objectives by the end of fiscal year 2024.
At the heart of those standards is Special Publication (SP) 800-207, a foundational paper on Zero Trust by the National Institute of Standards and Technology (NIST), which establishes the groundwork for Zero Trust Architecture. NIST has recently released a draft of a companion publication, SP 800-207A, co-authored by Tetrate founding engineer and Istio steering committee alum Zack Butcher. This publication defines Zero Trust access control standards for cloud-native applications in multi-cloud environments.
In this blog post, we highlight the key points from the draft publication and provide additional resources to help you gain a deeper understanding of this emerging standard.
Identity-Based Segmentation
The core concept of SP 800-207A, according to NIST, is referred to as “identity-based segmentation.” Traditionally, security policy has been defined using network-based primitives such as IP addresses and subnets. However, in dynamic systems like Kubernetes and other cloud-native deployment models, it becomes challenging to create flexible policies based on those network primitives.
Instead, identity-based segmentation relies on tamper-proof, cryptographically verifiable identities that are associated with services, users and devices, rather than network parameters like IP addresses and subnets. This approach enables the definition of policies using higher-order, application-level objects in addition to traditional, network-based controls. By leveraging these cryptographic identities, organizations can establish policies that align with the specific needs of their applications and users, ensuring a more granular and secure approach to access control and segmentation.
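To make the contrast concrete, here is a minimal Python sketch. It is an illustration only and is not taken from SP 800-207A: the `Peer` type, the IP addresses, and the `spiffe://example.org/...` identity are all hypothetical, and the identity is assumed to have been cryptographically verified before the check runs. The sketch shows that a rule keyed on identity keeps working when a workload is rescheduled onto a new IP address, while a rule keyed on the IP address does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    ip: str        # network location; changes whenever the workload is rescheduled
    identity: str  # assumed to be already verified cryptographically (e.g. from a certificate)


# Hypothetical rules expressing the same intent: "frontend may call this service".
ALLOWED_IPS = {"10.0.1.17"}
ALLOWED_IDENTITIES = {"spiffe://example.org/ns/shop/sa/frontend"}


def allowed_by_ip(peer: Peer) -> bool:
    """Network-based check: tied to where the caller happens to run."""
    return peer.ip in ALLOWED_IPS


def allowed_by_identity(peer: Peer) -> bool:
    """Identity-based check: tied to who the caller is."""
    return peer.identity in ALLOWED_IDENTITIES


frontend = "spiffe://example.org/ns/shop/sa/frontend"
before_reschedule = Peer(ip="10.0.1.17", identity=frontend)
after_reschedule = Peer(ip="10.0.4.92", identity=frontend)

print(allowed_by_ip(after_reschedule))        # False: the IP rule is now stale
print(allowed_by_identity(after_reschedule))  # True: the identity rule still holds
```

In practice the two kinds of rule are combined rather than swapped, which is the multi-tier approach discussed later in this post.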
Five Policy Checks for a Zero Trust Architecture
At the core of a Zero Trust Architecture that implements identity-based segmentation are five policy checks. They are designed to mitigate attacks by bounding access in space and time and by preventing intruders from pivoting to other parts of the system:
1: Encrypted communication between service endpoints. Various service endpoints may be located across a range of subnets, availability zones, or regions within a cloud provider environment, multiple clouds, or on-premises. Regardless of their location, it is crucial that communication between any two service endpoints is encrypted to safeguard against eavesdropping and to ensure message authenticity.
2: Service authentication based on cryptographically verifiable runtime identities. Every service should present a short-lived, cryptographically verifiable identity that is authenticated for each connection and regularly re-authenticated. Ideally, this authentication should happen for each service request, but that may not be feasible for high-volume traffic. As an alternative, a mutual TLS (mTLS) channel that authenticates both parties during the initial connection handshake can be used. These connections should be short-lived, typically only as long as the TTL of the service’s identity certificate or as short as 15-30 minutes, depending on the configuration.
3: Service-to-service authorization. Granular access policies between services based on their identities should be enforced on every request, with verdicts rendered either locally or in conjunction with external authorization services.
4: End-user identity & authentication. Since all application requests are triggered by user actions, there must be a robust user identity management system in place. That system should issue cryptographically verifiable runtime tokens that can be used to authenticate the end user throughout the system. End-user credentials should be checked at every hop within the system, but calling out to a global identity management system at every hop is impractical at scale. As an alternative, short-lived, external end-user credentials such as OAuth 2.0 tokens should be presented on the way into the system and exchanged for a local credential—like a JSON Web Token (JWT)—that can be efficiently authenticated locally at every hop.
5: End-user-to-resource authorization. As with service-to-service authorization, end-user access policies should be enforced in front of every resource against the authenticated end-user principal. This can be performed either by the application itself or by locally checking against a set of claims in a JWT, for example. Alternatively, authorization can be verified against a policy decision point (PDP) in an external authorization system.
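The five checks can be read as an ordered series of gates that every request must pass. The Python sketch below is one hedged way to picture that order; it is not an implementation from the NIST draft. The `Request` fields, the policy tables, and the scope names are hypothetical, and the sketch assumes that certificate and token signatures have already been verified elsewhere, so it only inspects the resulting attributes.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy tables; a real deployment would load these from a policy store.
SERVICE_POLICY = {("frontend", "orders")}              # (caller service, target service)
RESOURCE_SCOPES = {("GET", "/orders"): "orders:read"}  # (action, resource) -> required scope


@dataclass(frozen=True)
class Request:
    encrypted: bool                # check 1: channel is TLS or mTLS
    service_id: str                # check 2: caller identity, assumed verified in the handshake
    service_cert_expiry: datetime  # check 2: identity must be short-lived and unexpired
    target_service: str            # check 3: service-to-service authorization
    user_claims: Optional[dict]    # check 4: claims from a token assumed to be signature-verified
    action: str                    # check 5: end-user-to-resource authorization
    resource: str


def evaluate(req: Request, now: datetime) -> str:
    """Run the five checks in order and return the first failure, or 'allow'."""
    if not req.encrypted:
        return "deny: channel not encrypted"
    if req.service_cert_expiry <= now:
        return "deny: service identity expired"
    if (req.service_id, req.target_service) not in SERVICE_POLICY:
        return "deny: service not authorized"
    if req.user_claims is None or req.user_claims.get("exp", 0) <= now.timestamp():
        return "deny: end user not authenticated"
    required = RESOURCE_SCOPES.get((req.action, req.resource))
    if required is None or required not in req.user_claims.get("scopes", []):
        return "deny: end user not authorized"
    return "allow"


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ok = Request(
    encrypted=True,
    service_id="frontend",
    service_cert_expiry=now + timedelta(minutes=15),
    target_service="orders",
    user_claims={"sub": "alice", "exp": (now + timedelta(minutes=5)).timestamp(),
                 "scopes": ["orders:read"]},
    action="GET",
    resource="/orders",
)
print(evaluate(ok, now))                                 # allow
print(evaluate(replace(ok, service_id="reports"), now))  # deny: service not authorized
```

Each gate denies by default: a resource with no entry in `RESOURCE_SCOPES` is refused rather than allowed.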
The Enterprise Cloud-Native Platform: Orchestrator, Service Mesh and a Global Management Plane
In its SP 800-204 series of security standards for microservices applications, NIST establishes a reference platform consisting of Kubernetes for orchestration and resource management, with the Istio service mesh providing core security features, including:
- Service identities and service discovery
- Traffic routing and resiliency functions such as retries, timeouts, blue-green deployments, and circuit breaking
- Application integrity and confidentiality through service-to-service and user-to-resource authentication and authorization
- Integration with external policy-based authorization engines (e.g., Next Generation Access Control (NGAC), Attribute-based Access Control (ABAC), and Open Policy Agent (OPA))
How Istio Implements NIST’s Five Zero Trust Policy Checks
1: Encryption in transit is available out of the box via mTLS between services in the mesh and either mTLS or TLS to external services.
2: Service identity & authentication is provided via SPIFFE identities for workloads.
3: Service-to-service authorization based on strong workload identities and built-in policies is enforced at the Envoy sidecar proxies in the service mesh data plane, which act as policy enforcement points (PEPs). More mature implementations should leverage the dedicated authorization infrastructure mentioned above for richer policy decisions.
4: End-user identity & authentication is provided by a trusted identity provider or IDaaS via Envoy’s external authentication integration capabilities; thereafter, the Envoy data plane enforces the verdict throughout the system.
5: End-user-to-resource authorization verdicts are also rendered by external authorization systems and enforced by Envoy throughout the system.
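As a small illustration of the second and third checks, Istio workload identities take the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The Python sketch below parses that form and applies a namespace-level rule. It only illustrates the identity format: the helper names and the `ALLOWED_NAMESPACES` table are hypothetical, this is not how Envoy or Istio evaluates policy internally, and the identity is assumed to come from an already validated mTLS peer certificate.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_istio_spiffe_id(uri: str) -> WorkloadIdentity:
    """Parse spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    parts = urlsplit(uri)
    segments = parts.path.split("/")  # expected: ['', 'ns', <namespace>, 'sa', <service-account>]
    well_formed = (
        parts.scheme == "spiffe"
        and parts.netloc != ""
        and len(segments) == 5
        and segments[0] == ""
        and segments[1] == "ns"
        and segments[3] == "sa"
        and segments[2] != ""
        and segments[4] != ""
    )
    if not well_formed:
        raise ValueError(f"not an Istio-style SPIFFE ID: {uri!r}")
    return WorkloadIdentity(parts.netloc, segments[2], segments[4])


# Hypothetical rule: any workload in these (trust domain, namespace) pairs may call the service.
ALLOWED_NAMESPACES = {("cluster.local", "frontend")}


def caller_allowed(spiffe_id: str) -> bool:
    identity = parse_istio_spiffe_id(spiffe_id)
    return (identity.trust_domain, identity.namespace) in ALLOWED_NAMESPACES


print(parse_istio_spiffe_id("spiffe://cluster.local/ns/frontend/sa/web"))
print(caller_allowed("spiffe://cluster.local/ns/batch/sa/reporter"))  # False
```

Because the identity encodes the namespace and service account, a policy can be written against those application-level names instead of pod IP addresses.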
Managing a Consistent Set of Policies for an Enterprise Zero Trust Architecture
Across an enterprise, the orchestrator and service mesh platform can be found in both on-premises data centers and various cloud service locations, typically with a dedicated service mesh instance in each Kubernetes cluster.
Each service mesh instance has two main logical components: a control plane to facilitate APIs for defining configuration and policies; and a data plane that enforces those policies at runtime. While this works well in a single cluster context, maintaining a uniform set of policies to govern access between any two microservices across the entire enterprise, regardless of their location or which service mesh instance they belong to, can be a challenge.
Implementing a global management plane hosted within a global control plane can bridge this gap by establishing a consistent set of policies applicable to all services operating within the enterprise and applying them to the individual service mesh instances.
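The relationship between a management plane and the per-cluster meshes can be pictured with a short Python sketch. This is a toy model under stated assumptions, not a description of any particular product: the `ManagementPlane`, `MeshInstance`, and `Policy` classes and the cluster names are hypothetical, and "applying" a policy is reduced to copying a set, where a real system would translate it into each mesh's own configuration API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    source: str       # calling service identity
    destination: str  # target service identity


@dataclass
class MeshInstance:
    """Stands in for one cluster's service mesh control plane."""
    cluster: str
    policies: frozenset = frozenset()

    def apply(self, policies) -> None:
        self.policies = frozenset(policies)

    def allows(self, source: str, destination: str) -> bool:
        return Policy(source, destination) in self.policies


@dataclass
class ManagementPlane:
    """Holds the single enterprise-wide policy set and pushes it to every mesh."""
    meshes: list = field(default_factory=list)
    policies: set = field(default_factory=set)

    def register(self, mesh: MeshInstance) -> None:
        self.meshes.append(mesh)

    def set_policy(self, policy: Policy) -> None:
        self.policies.add(policy)

    def sync(self) -> None:
        for mesh in self.meshes:
            mesh.apply(self.policies)


plane = ManagementPlane()
on_prem = MeshInstance("on-prem-dc1")
cloud = MeshInstance("cloud-east")
plane.register(on_prem)
plane.register(cloud)
plane.set_policy(Policy("frontend", "orders"))
plane.sync()

print(on_prem.allows("frontend", "orders"), cloud.allows("frontend", "orders"))  # True True
```

The point of the model is that policy is authored once and every mesh instance ends up enforcing the same set, wherever it runs.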
Multi-Tier Policies
Identity-based policy alone is not a magic bullet. In fact, neither traditional network-level policies nor identity-based policies alone are sufficient. Network-level policies tend to be brittle and hard to manage for highly dynamic cloud environments. It’s hard to keep policy up to date using traditional tools, especially at the workload level.
Identity-based policy alone can also be difficult to administer. Since we need to manifest those policies across heterogeneous infrastructure, we need a way to maintain a consistent identity for services across the identity domains of multiple clusters, clouds, and on-premises environments.
Also, while there is a move towards identity-based policy and a reduced emphasis on network perimeter controls, network-oriented policies cannot be completely eliminated given the current regulatory compliance landscape.
Instead, a multi-tier approach where application-level identity-based policy is layered onto traditional network policy can provide the best of both worlds. Coarse-grained network-level policy can be relaxed somewhat and remain relatively static where it is augmented by fine-grained, application-level identity-based policy.
Best of Both Worlds: Use Transit Gateways to Enforce Network and Identity-Based Policy
For an example of this multi-tier approach, take the case where two applications need to communicate with each other between cloud infrastructure and a data center through a DMZ with firewalls on both sides (Figure 1).
Using only network-based policy to allow this, we need to update firewall rules on both sides, which is brittle, hard to manage and operationally slow. Asking the firewall team to update rules takes time, from days to weeks or more, before the change is approved. These policies, defined in terms of subnets, ports and IP addresses, may be recorded in a spreadsheet somewhere that documents, to varying degrees of quality, what each policy is for and how it’s implemented. Policy at this level is hard to maintain, hard to reason about, hard to keep consistent and slow to change.
A solution that layers dynamic, identity-based policy on top of more static, network-based policy is to route application traffic through an Envoy-based transit gateway (Figure 2).
In this model, a static set of L3/L4 firewall rules allows the Envoy transit gateways on either side to communicate with each other across the gap. The Envoy gateway instances, meanwhile, are configured to enforce dynamic, fine-grained identity-based policy that describes who and what is allowed to communicate across the tunnel.
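The two tiers in this model can be sketched in a few lines of Python. The sketch is illustrative only: the IP addresses (taken from documentation ranges), the port, the identities, and the function names are all hypothetical, and real enforcement would happen in the firewalls and in Envoy, not in application code. It shows a static L3/L4 rule that only lets the two gateways reach each other, plus a separate identity-based table that decides which callers may use that path.

```python
import ipaddress

# Tier 1: static L3/L4 firewall rule (source network, destination network, destination port).
# Only the cloud-side gateway may reach the data-center-side gateway.
FIREWALL_RULES = [
    (ipaddress.ip_network("203.0.113.10/32"), ipaddress.ip_network("198.51.100.20/32"), 443),
]

# Tier 2: dynamic identity-based policy held by the transit gateway.
# (verified caller identity, destination service) pairs that may cross.
GATEWAY_POLICY = {
    ("spiffe://cloud.example/ns/shop/sa/checkout", "billing.dc.example"),
}


def firewall_allows(src_ip: str, dst_ip: str, dst_port: int) -> bool:
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(src in s and dst in d and dst_port == p for s, d, p in FIREWALL_RULES)


def gateway_allows(caller_identity: str, destination: str) -> bool:
    return (caller_identity, destination) in GATEWAY_POLICY


def can_cross(src_ip: str, dst_ip: str, dst_port: int,
              caller_identity: str, destination: str) -> bool:
    """Traffic crosses only if both the network tier and the identity tier agree."""
    return firewall_allows(src_ip, dst_ip, dst_port) and gateway_allows(caller_identity, destination)


checkout = "spiffe://cloud.example/ns/shop/sa/checkout"
print(can_cross("203.0.113.10", "198.51.100.20", 443, checkout, "billing.dc.example"))  # True
print(can_cross("203.0.113.10", "198.51.100.20", 443,
                "spiffe://cloud.example/ns/shop/sa/search", "billing.dc.example"))      # False
```

Granting a new service access then means adding an entry to the gateway's identity policy, while the firewall rule stays unchanged.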
This approach offers the best of both worlds: application traffic can easily traverse the underlying network, and it does so transparently, with a service mesh and a global management and control plane handling consistent policy and identity across domains. At the same time, although we’ve relaxed the network-based policies, we still have inbound firewall rules, now augmented by dynamic, identity-based policies in the transit gateways.
Conclusion
There are clear drivers for organizations to adopt Zero Trust security principles, especially in federally regulated environments. How to realize Zero Trust across the enterprise has not always been clear—especially as standards and best practices continue to evolve. NIST’s new Zero Trust standards for cloud-native applications are providing some much needed clarity: don’t throw away existing network-based controls. Instead, use a dedicated infrastructure layer like a service mesh to incrementally layer identity-based L7 controls on top of existing L3/L4 controls—and bridge multi-site deployments with an enterprise infrastructure consisting of a global control and management plane that configures site-local service meshes for unified and consistent policy enforcement across environments.
Learn More
For a deep dive from the source, watch this presentation by the co-authors of the new NIST standard—Ramaswamy Chandramouli of NIST and Zack Butcher of Tetrate—given at the 2023 CloudNativeSecurityCon.