Policy Management at Cloud-Scale with Anders Eknert from Styra (video + interview)

Jun 07th, 2024 | 23 min read

Key Takeaways

This interview is part of the simplyblock’s Cloud Commute Podcast, available on Youtube , Spotify , iTunes/Apple Podcasts , Pandora , Samsung Podcasts, and our show site.

In this installment of podcast, we’re joined by Anders Eknert ( Twitter/X , Personal Blog ), a Developer Advocate for Styra, who talks about the functionality of OPA, the Open Policy Agent project at Styra, from a developer’s perspective, explaining how it integrates with services to enforce policies. The discussion touches on the broader benefits of a unified policy management system and how OPA and Styra DAS (Declarative Authorization Service) facilitate this at scale, ensuring consistency and control across complex environments. See more information on what the Open Policy Agent project is, what ‘Policy as Code’ is and what tools are available as well as how OPA can help make simplyblock more secure. Also see interview transcript section at the end.

EP15: Policy Management at Cloud-Scale with Anders Eknert from Styra

Key Learnings

What is the Open Policy Agent (OPA) Project?

The Open Policy Agent (OPA) is a framework designed for defining and running policies as code, decoupled from applications, for use cases like authorization, or policy for infrastructure. It allows organizations to maintain a unified approach to policy management across their entire technology stack. Styra, the company behind OPA, enhances its capabilities with two key products: Styra DAS and an enterprise distribution of OPA. Styra DAS is a commercial control plane for managing OPA at scale, handling the entire policy lifecycle. The enterprise distribution of OPA features a different runtime that consumes less memory, evaluates faster, and can connect to various data sources, providing more efficient and scalable policy management solutions.

What is Policy as Code?

Policy as code is a practice where policies and rules are defined, managed, and executed using code rather than through manual processes. This approach allows policies to be versioned, tested, and automated, similar to software development practices. By treating policies as code, organizations can ensure consistency, repeatability, and transparency in their policy enforcement, making it easier to manage and audit policies across complex environments. Tools like Open Policy Agent (OPA) (see above) facilitate policy as code by providing a framework to write, manage, and enforce policies programmatically.

What are the available Policy as Code Tools?

Several tools are available for implementing Policy as Code. Some of the prominent ones include:

Open Policy Agent (OPA) : An open-source framework, to easily write, manage, test, and enforce policies for infrastructure modifications, service communication, and access permissions. See our podcast episode with Anders Eknert from Styra .
HashiCorp Sentinel : A policy as code framework deeply integrated with HashiCorp products like Terraform, Vault, and Consul.
Kubernetes Policy Controller (Kyverno) : A Kubernetes-native policy management tool that allows you to define, validate, and enforce policies for Kubernetes resources.
Azure Policy : A service in Microsoft Azure for enforcing organizational standards and assessing compliance.

These tools help ensure that policies are codified, version-controlled, and easily integrated into CI/CD pipelines, providing greater consistency and efficiency in policy management.

How will OPA help to Make Simplyblock even more Secure?

Integrating Open Policy Agent (OPA) with simplyblock and Kuberntes can enhance security in several ways: Centralized Policy Management: OPA allows defining and enforcing policies centrally, ensuring consistent security policies across all services and environments. Fine-Grained Access Control: OPA provides detailed control over who can access what, reducing the risk of unauthorized access. Policies can, for example, be used to limit access to simplyblock block devices or prevent unauthorized write mounts. Compliance and Auditing: OPA’s policies can be versioned and audited, helping simplyblock to meet your compliance requirements. Using simplyblock and OPA, you have proof of who was authorized to access your data storage at any point in time. Dynamic Policy Enforcement: OPA can enforce policies in real-time, responding to changes quickly and preventing security breaches.

Transcript

Chris Engelbert: Hello everyone, welcome back to this week’s episode of simplyblock’s Cloud Commute Podcast. Today I have a guest with me that I actually never met in person as far as I know. I don’t think we have. No. Maybe just say a few words about you, where you’re from, who you are, and what you’re working with.

Anders Eknert: Sure, I’m Anders. I live here and work in Stockholm, Sweden. I work as a developer advocate, or a DevRel lead even, for Styra, the company behind the Open Policy Agent (OPA) project.

I’ve been here for, I think it’s three and a half years or so. Before that, I was at another company where I got involved in the OPA project. We had a need for a solution to do access control or authorization across a very diverse and complex environment. We had development teams in 12 different countries, seven different programming languages in our cluster, and it was just a big mess. Our challenge was how to do authorization in that kind of environment without having to go out to all of these teams and try to coordinate development work with each change we needed to do.

So that’s basically how I got involved in the OPA project. OPA emerged as a good solution to our problem at that time and, yeah, all these years later I’m still here and I’m having a lot of fun.

Chris Engelbert: All right, cool. So you mentioned Styra, I always thought it was Styra [Steera] to be honest, but okay, fair enough.

Anders Eknert: Yeah, no, the Swedish pronunciation would be ‘Steera’. So you’re definitely right. It is a Swedish word, which means to steer or to navigate.

Chris Engelbert: Oh, okay, yeah.

Anders Eknert: So you’re absolutely right. I’m just using the Americanized, the bastardized pronunciation.

Chris Engelbert: That’s fair, probably because I’m German that would be my initial thought. And it kind of makes sense. So in German it would probably be “steuern” or something.

All right, so tell us a little bit about Styra. You already mentioned the OPA project. I guess we’re coming back to that in a second, but maybe a little bit about the company itself.

Anders Eknert: Yeah, sure. Styra was founded by the creators of OPA and the idea, I think, is like the main thing. Everything at Styra revolves around OPA and I think it always has and I’m pretty sure it always will to some extent.

So what Styra does is we created and maintain the OPA project. We created and maintain a lot of the things you’ll find in the ecosystem around OPA and Styra. And also of course we’re a commercial company. So there are two products that are both based around OPA. One is Styra DAS, which is a commercial control plane, which allows you to manage OPA at scale. So like from the whole kind of policy lifecycle. And then there’s an enterprise distribution of OPA as well, which has basically a whole different runtime, which allows it to consume much less memory, evaluate faster, connect to various data sources and so on. So basically both the distributed component and the centralized component.

Chris Engelbert: Right, okay. You mentioned OPA a few times, I think you already mentioned what it really means, but maybe we need to dig into that a little bit deeper. So I think OPA is the Open Policy Agent. And if I’m not mistaken, it’s a framework to actually build policy as we call it policy as code.

Anders Eknert: That’s right, that’s right. So yeah, the idea behind OPA is basically that you define your policies as code, but not just code as like any other code running or which is kind of coupled to your applications, but rather that you try and decouple that part of your code and move it outside of your application so you can work with that in isolation.

And some common use cases could be things like authorization. And I mentioned before this need where you have like a complex environment, you have a whole bunch of services and you need to control authorization. How do we do authorization here? How do we make changes to this at runtime? How do we know what authorization decisions got logged or what people did in our systems? So how do we do auditing of this? So that is one type of policy and it’s a very common one.

But it doesn’t stop there. Basically anywhere you can define rules, you can define policy. So other common use cases are policy for infrastructure where you want to say like, I don’t want to allow pods to run in my Kubernetes cluster unless they have a well-defined security context or if they don’t allow mounts of certain types and so on. So you basically define the rules for your infrastructure. And this could be things like Terraform plans, Kubernetes resource manifests, or simply just JSON and YAML files on disk. So there are many ways to, and many places where you might want to enforce policy. And the whole idea behind OPA is that you have one way of doing it and it’s a unified way of doing it. So there are many policy engines out there and most of them do this for one particular use case. So there might be a policy engine that does authorization and many others that do infrastructure and so on. But that all means that you’re still going to end up with this problem where policy is scattered all over the place, it looks different, it logs different and so on. While with OPA, you have one unified way of doing this and to work with policy across your whole stack and organization. So that is kind of the idea behind OPA.

Chris Engelbert: So that means if I’m thinking about something like simplyblock being a cloud native block storage, I could prevent services from mounting our block devices through the policies, right? So something like, okay, cool.

Anders Eknert: Right

Chris Engelbert: You mentioned authorization, I guess that is probably the most common thing when people think about policy management in general. What I kind of find interesting is, in the past, when you did those things, there was also often the actual policies or the rules for permission configuration or something. It was already like a configuration file, but with OPA, you kind of made this like the first first-class spot. Like it shouldn’t be in your code. Here’s the framework that you can just drop into or drop before your application, I think, right? It’s not even in the application itself.

Anders Eknert: No, I guess it depends, but most commonly you’ll have like a separate policy repo where that goes. And of course, a benefit of that is like, we’re not giving up on code. Like we still want to treat policy as code. We want to be able to test it. We want to be able to review it. We want to work with all of these things like lint it or what not. We want to work with all these good tools and processes that we kind of established for any development. We want to kind of piggyback on that for policy just as we do for anything else. So if you want to change something in a policy, the way you do that is you submit a pull request. It’s not like you need to call a manager or you need to submit a form or something. That is how it used to be, right? But we want to, as developers, we want to work with these kinds of things like we work with any other type of code.

Chris Engelbert: Right. So how does it look like from a developer’s point of view? I mean, you can use it to, I think automatically create credentials for something like Postgres. Or is that the DAS tool? Do you need one of the enterprise tools for that?

Anders Eknert: No, yeah, creating credentials, I guess, you could definitely use OPA for that. But I think in most cases, what you use OPA for is basically to make decisions that are either most commonly they’re yes or no. ‘So should we allow these credentials?’ would be probably a better use case for OPA. ‘No, we should not allow them because they’re not sufficiently secure’ or what have you. But yeah, you can use OPA and Rego, the policy language, for a whole lot of things and a whole lot of things that we might not have decided for initially. So as an example, like there’s this linter for Rego, which is called Regal that I have been working on for the past year or so. And that linter itself is written mostly in Rego. So we kind of use Rego to define the rules of what you can do in Rego.

Chris Engelbert: Like a small exception.

Anders Eknert: Yeah, yeah. There’s a lot of that.

Chris Engelbert: All right. I mean, you know that your language is good when you can build your own stuff in your own language, right?

Anders Eknert: Exactly.

Chris Engelbert: So coming back to the original question, like what does it look like from a developer’s point of view if I want to access, for example, a Postgres database?

Anders Eknert: Right. So the way OPA works, it basically acts as a layer in between. So you probably have a service between your database and your user or another service. So rather than having that user or service go right to the database, they’d query that service for access. And in that service, you’d have an integration with OPA, either with OPA running as another service or running embedded inside of that service. And that OPA would determine whether access should be allowed or not based on policy and data that it has been provided.

Chris Engelbert: Right. Okay, got it, got it. I actually thought that, maybe I’m wrong because I’m thinking one of the enterprise features or enterprise products, I thought it was its own service that handles all of that automatically, but maybe I misunderstood to be honest. So there are, as you said, there’s OPA enterprise and there is DAS, the declarative authorization service.

Anders Eknert: Yeah, yeah, that’s right. You got it right. I remembered right.

Chris Engelbert: So maybe tell us a little bit about those. Maybe I’m mixing things up here.

Anders Eknert: Sure. So I talked a bit about OPA and OPA access to distributed component or the decision point. So that’s where the decisions are made. So OPA is going to tell the user or another service, should we allow this or not. And once you start to have tens or twenties or hundreds or thousands of these OPAs running in your cluster, and if you have a distributed environment and you want to do like zero trust, microservice authorization or whatnot, you’re going to have hundreds or thousands of OPAs. So the problem that Styra DAS solves is essentially like, how do we manage this at scale? How do I know what version or which policy is deployed in all these environments? How do I manage policy changes between like dev, test, prod, and so on? But it kind of handles the whole policy lifecycle. We talked about testing before. We talked about things like auditing. How are these things logged? How can I search these logs? Can I use these logs to replay a decision and see, like, if I did change this, would it have an impact on the outcome and so on?

So it’s basically the centralized component. If OPA is the distributed component, Styra DAS provides a centralized component which allows things like a security team or even a policy team to kind of gain this level of control that would previously be missing when you just let any developer team handle this on their own.

Chris Engelbert: So it’s a little bit like fleet management for your policies.

Anders Eknert: Yes, that is right.

Chris Engelbert: Okay, that makes sense. And the DAS specifically, that is the management control or the management tool?

Anders Eknert: Yeah, that it is.

Chris Engelbert: Okay.

Anders Eknert: And then the enterprise OPA is a drop-in replacement for OPA adding a whole bunch of features on top of it, like reduced memory usage, direct integrations with data sources, things like Kafka streaming data from Kafka and so on and so forth. So we provide commercial solutions both for the centralized part and the distributed part.

Chris Engelbert: Right, okay. I think now I remember where my confusion comes from. I think I saw OPA Enterprise and saw all the services which are basically source connectors. So I think you already mentioned Kubernetes before, but how does that work in the Kubernetes environment? I think you can, as you said, deploy it as its own service or run it embedded in microservices. How would that apply together somehow? I mean, we’re a cloud podcast.

Anders Eknert: Yeah, of course, of course. So in the context of Kubernetes, there’s basically two use cases. Like the first one we kind of covered, it’s authorization in the level, like inside of the workloads. Our applications need to know that the user trying to do something is authorized to do so. In that context, you’d normally have OPA running as a sidecar or in a gateway or as part of like an envoy proxy or something like that. So it basically provides a layer on top or before any request is hitting an actual application.

Chris Engelbert: In the sense of user operated.

Anders Eknert: Yeah, exactly. So on the next content or the next use case for OPA and Kubernetes is commonly like admission control, where Kubernetes itself or the Kubernetes API is protected by OPA. So whenever you try and make a modification to Kubernetes or the database etcd, the Kubernetes API reaches out to OPA to ask, like should this be allowed or not? So if you try and deploy a pod or a deployment or I don’t know, what have you, what kind of resources, OPA will be provided at resource. Again, it’s just JSON or YAML. So anything that’s JSON or YAML is basically what OPA has to work with. It doesn’t even know, like OPA doesn’t know what a Kubernetes resource is. It just seems like here’s a YAML document or here’s a JSON document. Is this or that property that I expect, is it in this JSON blob? And does it have the values that I need? If it doesn’t, it’s not approved. So we’re going to deny that. So basically just tells the Kubernetes API, no, this should not be allowed and the Kubernetes API will enforce that. So the user will see this was denied because this or that reason.

Chris Engelbert: So that means I can also use it in between any Kubernetes services, everything or anything deployed into Kubernetes, I guess, not just the Kubernetes API.

Anders Eknert: Yeah, anything you try and deploy, like for modifications, is going to have to pass through the Kubernetes API.

Chris Engelbert: That’s a really interesting thing. So I guess going back to the simplyblock use case, that would probably be where our authorization layer or approval layer would sit, basically either approving or denying the CSI deployment.

Anders Eknert: Yeah.

Chris Engelbert: Okay, that makes sense. So because we’re already running out of time, do you think that, or well, I think the answer is yes, but maybe you can elaborate a little bit on that. Do you think that authorization policies or policies in general became more important with the move to cloud? Probably more people have access to services because they have to, something like that.

Anders Eknert: Yeah, I’d say like they were probably just as important back in the days. What really changed with like the invent of cloud and this kind of automation is the level of scale that any individual engineer can work with. Like in the past, you’d have an infra engineer would perhaps manage like 20 machines or something like that. While today they could manage thousands of machines or virtual machines in cloud instances or whatnot.

And once you reach that level of scale, there’s basically no way that you can do policy like manually, that you have a PDF document somewhere where it says like, you cannot deploy things unless these conditions are met. And then have engineers sit and try and make an inventory of what do we have here? And are we all compliant? That doesn’t work.

So that is basically the difference today from how policy was handled in the past. We need to automate every kind of policy check as well just as we automated infrastructure and so with cloud.

Chris Engelbert: Yeah, that makes sense. I think the scale is a good point about that. It was not something I thought about it. I thought in the sense or my thought was more in the sense of like you have probably much bigger teams than you had in the past, which also makes it much more complicated to manage policies or make sure that just like the right people have access. And many times have to have this like access because somebody else is on vacation and it will never be removed again. We all know how it worked in the past.

Anders Eknert: Yeah, yeah. Now, and another difference I think like today compared to 20 years ago is like, at least when I started working in tech, it was like, if you got to any larger company, they’re like, ‘Hey, we’re doing Java here or we’re doing like .NET.’ But if you go to those companies today, they’re like, ‘There’s going to be Python. There’s going to be Erlang. There’s going to be some closure running somewhere. There’s going to be like so many different things.’

This idea of team autonomy and like teams and deciding for themselves what the best solution for any given problem is. And that is, I love that. It’s like, it makes it so much more interesting to work in tech, but it also provides like a huge challenge for anything that is security related because in anything anywhere where you need to kind of centralize or have some form of control, it’s really, really hard. How do you audit something if it’s like in eight different programming languages? Like I can barely understand two of them. Like how would I do that?

Chris Engelbert: How to make sure that all the policies are implemented? If policy change happens, yeah, you’re right. You have to implement it in multiple languages. The descriptor language for the rules isn’t the same. Yeah, that’s a good point. That’s a very good point actually. And just because time, I think I would have like a million more questions, but there’s one thing that I always have to ask. What do you think is like the next big thing in terms of cloud, in your case, authorization policies, but also in the broader scheme of everything?

Anders Eknert: Yeah, sure. So I’d say like, first of all, I think both identity and access control, they are kind of slow moving and for good reasons. There’s not like there’s going to be a revolutionary thing or disruptive event that turns everything around. I think that’s basically where we have to be. We can rely on things to not change or to update too frequently or too dramatically.

So yeah, what would the next big thing is, I still think like this area where we decoupled policy and we worked with it consistently across like large organizations and so on, it’s still the next revolutionary thing. It’s like, there’s definitely a lot of adopters already, but we’re just at a start of this. And again, that’s probably like organizations don’t just swap out their like how they do authorization or identity that could be like a decade or so. So I still think this policy as code while it’s starting to be like an established concept, that it is still the next big thing. And that’s why it’s also so exciting to work with in this space.

Chris Engelbert: All right, fair enough. At least you didn’t say automatic AI generation.

Anders Eknert: No, God no.

Chris Engelbert: That would have been really the next big thing. Now we’re talking. No, seriously. Thank you very much. That was very informative. I loved that. Yeah, thank you for being here.

Anders Eknert: Thanks for having me.

Chris Engelbert: And for the audience, next week, same time, same podcast channel, whatever you want to call that. Hope to hear you again or you hear me again. And thank you very much.

Key Takeaways

In this episode of simplyblock’s Cloud Commute Podcast, host Chris Engelbert welcomes Anders Eknert, a developer advocate and DevRel lead at Styra, the company behind the Open Policy Agent (OPA) project. The conversation dives into Anders’ background, Styra’s mission, and the significance of OPA in managing policies at scale.

Anders Eknert works as a Developer Advocate/DevRel at Styra, the company responsible for the Open Policy Agent (OPA) Project. He’s been with the company for 3.5 years and was previously involved in the OPA project at another company.

Styra created and maintains the OPA project with 2 key products around OPA; 1) Styra DAS, a commercial control plane for managing OPA at scale, handling the entire policy lifecycle and 2) an enterprise distribution of OPA, which has a different runtime and allows it to consume less memory, evaluate faster, connect to various data sources etc. If OPA is the distributed component, Styra DAS is a centralized component.

OPA is a framework to build and run policies – a project for defining policies as code, decoupled from applications, for use cases like authorization, or policy for infrastructure. The idea behind OPA is that it allows a unified way of working with policy across your whole stack and organization.

In the context of Kubernetes, there are 2 key use cases: 1) authorization inside of the workloads where OPA can be deployed as a sidecar or in a gateway or as part of an envoy proxy; 2) admission control where Kubernetes API is protected by OPA.

Anders also talks about the advent of the cloud and how policy management and automation has become essential due to the scale at which engineers today operate. He also discusses the use of diverse programming environments today and team autonomy, both of which necessitate a unified approach to policy management, making tools like OPA crucial.

Anders predicts that policy as code will continue to gain traction, offering a consistent and automated way to manage policies across organizations.

Topics

Simplyblock

Supported Environments

Use Cases

Policy Management at Cloud-Scale with Anders Eknert from Styra (video + interview)

Table Of Contents