IAMIDENTITY.AI
Microsoft Entra ID Okta IBM Verify

Keep Your IDP. We'll Secure the Agent and the Last Mile.

Bring Microsoft Entra, Okta, or any standards-based provider. IBM Verify makes the per-call decision and HashiCorp Vault secures the last mile with a credential that lives five minutes, so the agent and the database hop are both covered without changing how anyone signs in.

Robert Graham 12 June 2026 ~14 min read
Use case: an AI agent secured by Microsoft Entra ID, IBM Verify, and HashiCorp Vault. The customer-support rep signs in to Entra, the agent forwards the Entra token to the MCP server, the MCP server runs an RFC 8693 token exchange against IBM Verify with an RFC 9396 rich authorization request, and HashiCorp Vault mints a five-minute PostgreSQL credential per call.

One of the first questions that comes up in these agentic AI conversations is simple. If you want real per-call authorization in front of an AI agent, do you have to replace the identity provider you already use?

No. You should not have to change how people sign in just to secure what happens after they do.

In most environments the problem is not the existing sign-in flow. The problem is the long-lived credential sitting behind the agent, or the absence of a policy decision tied to the exact action the agent is trying to take. One pattern where Verify really shines is to keep your current identity provider for authentication, then add a policy layer that evaluates each request and issues short-lived access only when that specific action is allowed. That holds whether you are securing a workforce of a few thousand or a consumer identity estate in the tens of millions.

To make the point concrete rather than theoretical, I built two end to end integrations. One uses Microsoft Entra ID as the identity provider. One uses Okta. In both, the user signs in exactly the way they already do, and IBM Verify still makes every per-call authorization decision, still fires step-up MFA on the actions that warrant it, and still hands HashiCorp Vault the cryptographic proof it needs to mint a credential that lives five minutes and then does not exist. The same pattern works for any standards-based provider, Ping Identity and Transmit Security included.

Your IDP owns who the user is. IBM Verify owns what the agent is allowed to do, on this call, right now. Those are two different jobs, and you do not have to merge them.

This is the pattern I have been demoing at the Gartner Identity & Access Management Summit, at IBM THINK, at RSAC, and at customer workshops, and that I will be walking through at Identiverse, along with a range of other integrations that show the flexibility of Verify and Vault.

→ Two demos
Bring your own IDP, keep the Verify chain
Entra variant · Okta variant · same MCP server · same Vault role · same audit chain
This post is the architecture behind both. What federates where, which RFC does what, and the two places you can choose to enforce the policy. The hosting and the IDP are your call. The security posture does not change.

01 The reframe: Verify is a policy engine, not an identity store

Here is the mental model that makes the rest of this make sense. In a classic single-sign-on deployment you think of your identity provider as the thing that owns identity, full stop. Login, MFA, sessions, and authorization. For the human sign-in, that is exactly right. Your identity provider authenticates your users beautifully, whether that is a few thousand employees or millions of consumers. 

But an AI agent calling a tool is a different question. The question is not "who is this user." Your IDP, whether Entra or Okta, already answered that. The question is "is the agent allowed to do this specific thing, with this data, for this user, at this amount, right now, and can you show proof." That is an authorization decision, it is per-call, and it is exactly the decision IBM Verify is built to make.

The bridge between the two is a standard. The user signs in to Entra or Okta and gets an access token. That token becomes the subject_token in an RFC 8693 OAuth 2.0 Token Exchange call to IBM Verify. Verify validates the incoming token's signature against the IDP's published JWKS endpoint, decides the policy, and mints a brand-new on-behalf-of token that carries the RFC 9396 Rich Authorization Request (RAR) data. From that point on, the chain is identical no matter which IDP you started from.

The federation, in one sentence

Verify trusts Entra or Okta the same way any relying party trusts an issuer: it fetches the issuer's JWKS, validates the token signature, and reads the claims. No shared secret with the IDP, no proprietary connector, no migration. If your IDP can mint a standard, validatable JWT, Verify can exchange it.

That one move is what lets you keep your existing sign-in untouched. Verify never sees the user's password, never owns their MFA enrollment for sign-in, never becomes the system of record for who your users are. It sits one layer in, as the policy decision point for what the agent does. That is the role it is best at, and it is a role most identity providers were never designed to fill at the per-call granularity an agent needs.

The objection I hear most

"We already use Okta or Ping for everything. Can we just extend their API security features to handle our new AI agents?"

How I answer it

You can absolutely use them to authenticate the initial connection. That is what they are built for, and it is the part you should keep. But traditional IAM checks identity at the door. It is not designed to evaluate what an agent does once it is inside. An assistive agent does not make one call and stop. It chains tasks together, and a platform that authorized the session is not designed to tell the difference between a single ordinary API call and an agent stringing together a multi-step workflow nobody approved. This pattern evaluates the intent and context of every single action in real time. Each step is an RFC 9396 authorization request that IBM Verify decides on per call, which is what stops an agent from walking a sequence of individually plausible calls all the way to data exfiltration.

02 Three standards and one Vault plugin

I want to slow down here because these are the load-bearing pieces, and they are all open standards. Nothing in this chain is proprietary glue.

RFC 8693, Token Exchange

This is the OAuth 2.0 grant type that lets one token be swapped for another. The MCP server (or the agent's wrapper, more on that in section 05) presents the user's Entra or Okta token as the subject_token, names the type of token it is, and asks Verify for an access token back. The grant type is the giveaway in the wire format:

# The token exchange the chain runs on every tool call
POST /oauth2/token

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
client_id=<the STS client Verify issued you>
client_secret=<held by whoever runs the exchange>
subject_token=<the Entra or Okta access token>
subject_token_type=urn:entra:token-type:user-jwt
scope=refund:write
authorization_details=<the RFC 9396 RAR, see below>

RFC 9396, Rich Authorization Requests

This is the part that turns a coarse OAuth scope into a precise statement of intent. Instead of asking for "x access" in the abstract, the caller attaches an authorization_details object that says exactly what is about to happen. In the refund demos it looks like this:

[
  {
    "type": "urn:smt:agent:refund",
    "operationDetails": {
      "action": "process_refund",
      "instructedAmount": { "currency": "USD", "amount": 8000 },
      "creditorAccount": { "identification": "R-884-2233" }
    }
  }
]

That amount of 8000 cents is not decoration. IBM Verify's access policy reads it, compares it against the user's per-user refund_limit attribute, and decides whether this call sails through or needs a human to approve it. The authorization is bound to the content of the request, not to a role the user happens to hold.
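To make that comparison concrete, here is a minimal Python sketch of the decision. It is an illustration only: the real rule is configured in the IBM Verify access policy, not written in Python. The function name `decide_refund` and the outcome labels `permit`, `step_up`, and `deny` are hypothetical. Only the RAR shape and the `refund_limit` attribute come from the demo.

```python
from typing import Any

REFUND_TYPE = "urn:smt:agent:refund"


def decide_refund(authorization_details: list[dict[str, Any]], refund_limit: int) -> str:
    """Illustrative decision: compare the requested amount with the user's limit.

    Amounts are in cents, matching the RAR in the demo. The outcome labels
    are hypothetical and are not Verify policy action names.
    """
    for detail in authorization_details:
        if detail.get("type") != REFUND_TYPE:
            continue
        amount = detail["operationDetails"]["instructedAmount"]["amount"]
        # Within the limit the call goes through; above it a human must approve.
        return "permit" if amount <= refund_limit else "step_up"
    # Assumption: a request with no refund detail at all is refused.
    return "deny"


rar = [{
    "type": REFUND_TYPE,
    "operationDetails": {
        "action": "process_refund",
        "instructedAmount": {"currency": "USD", "amount": 8000},
        "creditorAccount": {"identification": "R-884-2233"},
    },
}]
```

With this RAR, a rep whose `refund_limit` is 10000 is permitted outright, while a rep whose limit is 5000 is sent to step-up. The role the user holds never enters the comparison.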

The on-behalf-of token, and why Vault trusts it

When Verify approves, it signs the approved authorization_details back into the access token it returns. That signed token is the on-behalf-of token, an RFC 9068 JWT. It is the cryptographic spine of the whole chain, and it is what makes the last hop safe.

HashiCorp Vault, running the verify-rar plugin, takes that on-behalf-of token and does three things in order: it validates the JWT signature against IBM Verify's JWKS, it reads the rich authorization request that Verify signed in, and it matches that request against the role's mappings before it mints anything. Only then does it create a fresh PostgreSQL credential with a five-minute lease.
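The second and third of those steps can be sketched in a few lines of Python. This is not the verify-rar plugin's code or its configuration format. The `ROLE` dictionary, its keys, and the function `lease_ttl_for` are hypothetical, and the sketch assumes the JWT signature has already been verified against IBM Verify's JWKS before the claims reach it.

```python
from typing import Any, Optional

# Hypothetical role mapping. The real verify-rar role format may differ.
ROLE: dict[str, Any] = {
    "rar_type": "urn:smt:agent:refund",
    "allowed_actions": {"process_refund"},
    "lease_ttl_seconds": 300,  # five minutes
}


def lease_ttl_for(claims: dict[str, Any], role: dict[str, Any] = ROLE) -> Optional[int]:
    """Return the lease TTL if a signed RAR entry matches the role, else None.

    Assumes the claims come from a token whose signature was already checked.
    """
    for detail in claims.get("authorization_details", []):
        action = detail.get("operationDetails", {}).get("action")
        if detail.get("type") == role["rar_type"] and action in role["allowed_actions"]:
            return role["lease_ttl_seconds"]
    # No matching entry: nothing is minted.
    return None
```

The point of the shape is the ordering: a credential lifetime is only ever derived from a request that Verify signed and that the role explicitly maps.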

Why this is the load-bearing piece

Vault does not trust the authorization request because the MCP server says so. It trusts it because IBM Verify signed a JWT that contains it, and Vault checks that signature before it acts. If a compromised MCP server tried to forge a wider request and hand it straight to Vault, the signature check fails. The credential follows the request, and the request is attested by Verify. 

We can call this the Verify plus Vault chain, and the framing is three lines: the model runs on the agent's infrastructure, the secrets live in your Vault, and the authorization decision happens in IBM Verify. Bringing your own IDP just adds a fourth line at the top. The sign-in stays in Entra, Okta, or whichever IDP you already run. Nothing downstream cares.

03 Microsoft Entra at the front door

Let me show what actually changes when Entra is the IDP, because it is a small, specific list and one item on it will cost you an afternoon if nobody warns you.

On the Verify side you create two objects. A Custom Token Type declares the trust anchor: Entra's v2.0 issuer and its JWKS URI, plus the validation rules. An STS Client registers the OAuth client the exchange runs as and binds it to the step-up access policy. That is the federation. Verify now accepts Entra-issued JWTs as a valid subject_token.

The Entra piece: make the access token validatable by IBM Verify. By default a chat UI requests a Microsoft Graph scope, which produces a Graph-only token Verify cannot validate. The fix is to Expose an API, define a custom scope, request that scope so the audience becomes your own API, and Verify can then validate the standard v2.0 JWT against Entra's JWKS.
Figure 1. The single hardest blocker in the Entra variant, and the one-step fix. Request your own API scope, not a Graph scope.
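A quick way to catch this blocker before it reaches Verify is to look at the decoded claims of the access token. The Python sketch below assumes you already have the claims as a dictionary. The function name `entra_token_problems` and the example audience value are hypothetical. The Microsoft Graph application ID and the `ver` claim are standard parts of Entra tokens.

```python
# Audiences that identify a Microsoft Graph token.
GRAPH_AUDIENCES = {
    "00000003-0000-0000-c000-000000000000",  # Microsoft Graph application ID
    "https://graph.microsoft.com",
}


def entra_token_problems(claims: dict, expected_audience: str) -> list[str]:
    """List the reasons a token would not be usable as a subject_token.

    expected_audience is whatever audience your exposed API produces.
    """
    problems = []
    aud = claims.get("aud")
    if aud in GRAPH_AUDIENCES:
        problems.append("audience is Microsoft Graph; request your own API scope")
    elif aud != expected_audience:
        problems.append(f"unexpected audience: {aud!r}")
    if claims.get("ver") != "2.0":
        problems.append("token is not a v2.0 token")
    return problems


# Hypothetical audience for the exposed API, used for illustration only.
MY_API = "11111111-2222-3333-4444-555555555555"
```

An empty list means the token has the audience and version the exchange expects. A Graph audience is the signal to go back and request your own scope.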

There are a few more Entra-specific settings, and each maps to a specific error code if you miss it.

The payoff: once those are set, the chain is byte-for-byte identical to the Verify-direct version. Same RAR, same Vault role, same five-minute PostgreSQL credential, same refund SQL. The one user-visible difference is step-up MFA. For a transient user, when the policy fires ACTION_MFA_ALWAYS, Verify delivers a six-digit one-time code to the email address or mobile number taken from the claim in the Entra token. No enrollment, no friction, and the user never leaves the chat. If needed, we could also make an API call to the host IDP.

04 Okta at the front door, same shape

The Okta demo is the same architecture with Okta-specific plumbing. You register a public PKCE single-page app in Okta, you point a Verify Custom Token Type at Okta's issuer and JWKS URI, and you wire up the STS Client exactly as before.

Use case: an AI agent secured by Okta, IBM Verify, and HashiCorp Vault. The customer signs in to Okta with authorization code plus PKCE, the agent forwards the Okta token to the MCP server, the MCP server runs an RFC 8693 token exchange against IBM Verify with an RFC 9396 rich authorization request, and HashiCorp Vault mints a five-minute PostgreSQL credential per call.
Figure 2. Swap Entra for Okta and the diagram barely changes. Everything below the token exchange is vendor-neutral.

The Okta trap is different from the Entra one but just as easy to trip over. Okta has two authorization servers, and they are not interchangeable. The Org Authorization Server at /oauth2/v1 issues tokens with OIDC-standard claims only and no custom claims. The Custom Authorization Server at /oauth2/default is the one that can carry a custom claim like refund_limit. If you point Verify at the wrong issuer, sign-in still succeeds, which is what makes it sneaky, but the custom claims your policy needs are simply not there. Decode a real token, read the iss, and make sure it is the /oauth2/default form.
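The "decode a real token and read the iss" check takes a few lines of Python with the standard library. This sketch decodes the payload without verifying the signature, so it is for inspection only and never for making an access decision. The function names and the example issuer hosts are hypothetical.

```python
import base64
import json


def unverified_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying it. For inspection only."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64url padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def issued_by_default_custom_server(claims: dict) -> bool:
    """True when the issuer has the /oauth2/default form described above."""
    return str(claims.get("iss", "")).rstrip("/").endswith("/oauth2/default")
```

Run it against a token from your own tenant. If the second function returns False, sign-in will still work, but the `refund_limit` claim the policy needs will be missing.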

Step-up MFA on the Okta path works the same way as Entra: a transient one-time code by email, or SMS if you want it, delivered to the contact on the Okta token. No Verify-side factor enrollment required, because Verify is not pretending to own the user. It is borrowing an identity claim long enough to make a policy decision.

The thesis, stated plainly

Entra, Okta, or Verify-direct, the on-behalf-of token Verify mints looks identical and the Vault chain below it is fully agnostic to which IDP you started from. You preserve your workforce-identity investment and still get per-call RFC 9396 authorization, step-up MFA, and tenant-wide enforcement. Without migrating a single user.

05 Two places to enforce: the MCP server, or the agent

This is the part I most want practitioners to internalize, because it is the difference between "nice demo" and "I can actually roll this out across the agents I already have." There are two security topologies, both ship in both demos, and they produce an identical audit chain. The only thing that moves is where the token exchange runs, and therefore where the trust boundary lives.

Before we place that boundary, it helps to see the whole loop in one picture, including the one actor the diagrams so far have left implicit: the model itself. The agent does not act on its own. It asks the model which tool to call, the model names a tool, the agent makes the call, the result goes back to the model, and the model decides whether it is finished or needs another tool. That loop can run several times for a single request. The thing to notice is what the model never touches. It picks the next tool by name, and that is the entire extent of its power. The user's bearer, the on-behalf-of token, and the database credential never enter the model's context. They live below the line, inside the trust boundary, which is exactly why the model being clever, or being fooled, cannot widen what the agent is allowed to do.

One tool call, the whole loop. In the reasoning loop the user asks the agent, the agent asks the LLM which tool to call, and the LLM names the tool with no token attached. The agent then forwards the tool call plus the user bearer into the trust boundary, where the MCP server runs an RFC 8693 token exchange with an RFC 9396 rich authorization request to IBM Verify, presents the resulting on-behalf-of token to HashiCorp Vault for a five-minute credential, runs one statement against the database and revokes the lease, and returns the result to the agent. The LLM sits outside the trust boundary and never sees the bearer, the on-behalf-of token, or the credential.
Figure 3. The model reasons and picks the tool. The agent transports. Verify decides, Vault mints, and the credential lives five minutes. The model sits outside the trust boundary the whole time.
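The loop is easy to sketch in Python, and the sketch makes the separation visible. The model is given text only, and the bearer is passed to the MCP call on a separate path. Everything here is hypothetical scaffolding: `run_agent`, `pick_tool`, and `call_mcp` are illustrative names and do not come from the Strands or LangChain agents in the demos.

```python
from typing import Callable, Optional


def run_agent(
    user_request: str,
    bearer: str,
    pick_tool: Callable[[list[str]], Optional[str]],
    call_mcp: Callable[[str, str], str],
    max_steps: int = 10,
) -> list[str]:
    """Run the reasoning loop and return everything the model was shown."""
    model_context = [user_request]
    for _ in range(max_steps):
        tool = pick_tool(model_context)   # the model sees text only
        if tool is None:                  # the model says it is finished
            break
        result = call_mcp(tool, bearer)   # the bearer goes to the MCP server, not the model
        model_context.append(f"{tool} -> {result}")
    return model_context
```

Because `bearer` is never appended to `model_context`, nothing the model reads or writes can contain it. The on-behalf-of token and the database credential sit one layer further down, inside `call_mcp`.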
Two security patterns, one identity chain. Pattern A, MCP-perimeter: the agent forwards the bearer verbatim, the MCP server runs the token exchange and talks to Vault. Pattern B, wrapper-secured: an agent wrapper runs the token exchange and threads the on-behalf-of token to the MCP server, which only validates the JWT. Both share the same RAR shape, the same Verify policy, the same Vault role, the same five-minute Postgres lease, and the same on-behalf-of jti audit key.
Figure 4. Same policy, same Vault role, same Postgres lease. The trust boundary lives in different places.

Pattern A, secure at the MCP server

The agent is a thin transport. It forwards the user's bearer token to the MCP server verbatim, about five lines of code, and it does nothing else. The MCP server runs the token exchange, handles the step-up MFA dance, and talks to Vault. In the demos the Strands agent defaults to this, and the MCP server runs with MCP_AUTH_MODE=te.

Here is when this is the right call. You have a fleet of agents already running, maybe thousands of them, maybe across teams that do not share a single auth library. You are not going to go re-instrument every one of them. So you do not. You put the enforcement in the MCP servers those agents call, and the agents themselves can stay exactly as they are. The agent's framework, host, or programming language becomes irrelevant to your security posture. The MCP server is the perimeter, and the agents just call it.

Pattern B, secure at the agent or wrapper

The security work moves up into a shared wrapper library that the agent loads. The wrapper runs the token exchange itself, threads the resulting on-behalf-of token downstream, and the MCP server's job narrows to validating that JWT: signature against the Verify JWKS, audience, and the rich authorization request shape. In the demos the LangChain agent defaults to this, and the MCP server runs with MCP_AUTH_MODE=validate.
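Here is a hedged sketch of what the narrowed MCP-side check might look like once the signature has been verified. It covers only the audience and the shape of the rich authorization request. The function name, the error messages, and the rule that exactly one entry is expected are assumptions for illustration, not the demo's actual validation code.

```python
from typing import Any


def validate_obo_claims(claims: dict[str, Any], expected_audience: str) -> dict[str, Any]:
    """Check audience and RAR shape on already signature-verified claims.

    Returns the operationDetails of the single RAR entry or raises ValueError.
    """
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        raise ValueError("token was not issued for this MCP server")

    details = claims.get("authorization_details")
    # Assumption: one tool call carries exactly one RAR entry.
    if not isinstance(details, list) or len(details) != 1:
        raise ValueError("expected exactly one authorization_details entry")

    entry = details[0]
    if not isinstance(entry, dict) or not isinstance(entry.get("type"), str):
        raise ValueError("malformed authorization_details entry")
    op = entry.get("operationDetails")
    if not isinstance(op, dict) or not isinstance(op.get("action"), str):
        raise ValueError("operationDetails.action is required")

    amount = op.get("instructedAmount")
    if (not isinstance(amount, dict)
            or not isinstance(amount.get("amount"), int)
            or not isinstance(amount.get("currency"), str)):
        raise ValueError("instructedAmount needs an integer amount and a currency")
    return op
```

The MCP server then runs the tool using only the values returned from the signed token, rather than whatever arguments arrived alongside it.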

And here is when this is the right call. The MCP server is supplied by a vendor and you cannot modify it. You do not own that code, so you cannot put the enforcement there. Fine. You put it in your own wrapper, before the call ever reaches the vendor's MCP server. Same outcome, different boundary. The same logic applies when you have many agent frameworks and you want one shared auth library doing token exchange uniformly across all of them, instead of N copies of the same logic living in N different MCP servers.

The wrapper is small. This is the heart of it in the LangChain variant, a single function the agent calls per tool call:

import json

# client_id, client_secret, _subject_token_type, _form_post and MfaRequired
# are defined elsewhere in the wrapper module.

def exchange_token(*, subject_token, scope, authorization_details):
    # Leg 1 of RFC 8693 Token Exchange with the RAR attached.
    body = {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "client_id": client_id,
        "client_secret": client_secret,
        "subject_token": subject_token,  # the Entra/Okta token
        "subject_token_type": _subject_token_type(),
        "scope": scope,
    }
    if authorization_details:
        body["authorization_details"] = json.dumps(authorization_details)
    data = _form_post("/oauth2/token", body)
    if data.get("scope") == "mfa_challenge":
        raise MfaRequired(data["access_token"])  # step-up, then re-send the RAR
    return data["access_token"]  # the on-behalf-of token

The one rule

Both patterns carry the same RFC 9396 rich authorization request. That is the non-negotiable. Whether the token exchange runs in the MCP server or in the agent's wrapper, the request describes the exact operation, Verify attests it, and Vault checks the attestation. Move the boundary to fit your reality. Do not move the model.

One operational note worth saying out loud: in Pattern A the token-exchange client secret lives in the MCP server, in Pattern B it lives in the agent's wrapper. That is a real shift in where a sensitive credential sits, and it belongs in your runbooks. The MCP server in Pattern B does not even need that secret. It only needs the public JWKS URL to validate what it receives.

06 The kill switch reaches all the way back to your IDP

Here is the capability that tends to land hardest in a room. Everything above is the happy path. This is the unhappy path, and it is where Shared Signals earns its keep.

When something goes wrong (a customer-service rep denies three step-up pushes in a row, or taps "mark as suspicious" once, or a single request blows past the tenant hard cap), the MCP server emits a CAEP session-revoked event into a local IBM Antenna container. Antenna signs it as an RFC 8417 Security Event Token, the receiver polls for it, and an action handler calls IBM Verify's session-revocation admin API. The user's sessions across every app federated to that Verify tenant die within roughly 30 to 75 seconds.
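The trigger and the event can be sketched in Python. This builds the unsigned claims only, since signing is Antenna's job in the chain above. The function names, the issuer and audience values, and the exact claim layout are assumptions based on RFC 8417 and the OpenID CAEP specification. Earlier CAEP drafts place the subject differently, so check the version your receiver implements.

```python
import time
import uuid

SESSION_REVOKED = "https://schemas.openid.net/secevent/caep/event-type/session-revoked"


def should_revoke(consecutive_denials: int, marked_suspicious: bool,
                  amount: int, tenant_hard_cap: int) -> bool:
    """The three triggers described above; any one of them is enough."""
    return consecutive_denials >= 3 or marked_suspicious or amount > tenant_hard_cap


def session_revoked_claims(issuer: str, audience: str, user_email: str, reason: str) -> dict:
    """Unsigned Security Event Token claims for a CAEP session-revoked event."""
    now = int(time.time())
    return {
        "iss": issuer,
        "aud": audience,
        "iat": now,
        "jti": uuid.uuid4().hex,  # unique ID for this event
        "sub_id": {"format": "email", "email": user_email},
        "events": {
            SESSION_REVOKED: {
                "event_timestamp": now,
                "reason_admin": {"en": reason},
            }
        },
    }
```

Note what the payload does not contain: no admin credential and no revocation call. The emitter only describes what happened.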

Cross-IDP tenant kill switch via Shared Signals. A single policy decision, such as a refund over the tenant hard cap, tears down sessions in seconds across every federated IDP. The MCP server emits a CAEP event, IBM Antenna signs it as a Security Event Token and an action handler runs, and the handler calls both IBM Verify session-DELETE and Microsoft Entra revokeSignInSessions so the user is signed out everywhere.
Figure 5. One decision, two providers. Verify kills its session immediately; the action handler also reaches into the external IDP.

But the part people do not expect is the cross-IDP reach. The Antenna action handler does not stop at Verify. It also calls the external IDP's own revocation API. For Entra that is Microsoft Graph's revokeSignInSessions. For Okta it is DELETE /api/v1/users/{userId}/sessions. So a single bad request does not just kill the agent's session. It reaches back to the identity provider you kept, and signs the user out there too.
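As a rough sketch, the two upstream calls look like this in Python. The code only constructs the requests and does not send them. The function names are hypothetical, and the Okta `SSWS` API-token header is an assumption about how the handler authenticates. An OAuth bearer token is another option, and the demo's handler may be built differently.

```python
import urllib.request
from urllib.parse import quote


def entra_revoke_request(user_id: str, admin_token: str) -> urllib.request.Request:
    """Build (not send) the Microsoft Graph revokeSignInSessions call."""
    url = f"https://graph.microsoft.com/v1.0/users/{quote(user_id, safe='')}/revokeSignInSessions"
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": f"Bearer {admin_token}"}
    )


def okta_revoke_request(okta_domain: str, user_id: str, api_token: str) -> urllib.request.Request:
    """Build (not send) the Okta user-sessions DELETE call."""
    url = f"https://{okta_domain}/api/v1/users/{quote(user_id, safe='')}/sessions"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"SSWS {api_token}"}
    )
```

Both credentials belong to the Antenna action handler. Neither should ever be present in the MCP server's environment.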

Be honest about timing

The Verify-side kill is immediate. The user's /oauth2/userinfo starts returning 401 right away, which is what the chat UI's poller detects. The external-IDP revocation behaves differently, and it is worth being precise about. Microsoft Graph's revokeSignInSessions invalidates the user's refresh tokens and browser session cookies, and Microsoft notes it can take a few minutes to propagate. It does not instantly tear down every access token already in flight. Treat the upstream call as an operational control that closes the sign-in loop, not as a guaranteed instant cutoff for tokens already issued. If you need tight upstream cutoff, the lever is short access-token lifetimes on the IDP side, so a revoked session cannot ride a long-lived token.

The trust separation here is deliberate and worth calling out. The MCP server only emits events. It never holds the admin credential and never calls Verify's or the IDP's revocation API directly. The Antenna receiver holds those credentials and is the only thing that calls the kill APIs. If the MCP server is compromised, it can emit a fraudulent event, but it cannot forge the admin token or revoke sessions on its own. The signal and the enforcement are separated on purpose.

The Okta variant of this kill chain looks the same on the wire, just with Okta's revocation endpoint in the handler instead of Graph's.

The same cross-IDP kill switch with Okta. A single policy decision tears down sessions across every federated IDP. The MCP server emits a CAEP event, IBM Antenna signs it as a Security Event Token, and the action handler calls both IBM Verify session-DELETE and Okta's user-sessions DELETE so the user is signed out everywhere.
Figure 6. Same chain, Okta in the handler. Open standards make the IDP a swappable parameter.

07 What a CISO should make their team prove

If you are the person who has to sign off on an agentic deployment, here is the short list I would hold a team to. Each of these is a property you can verify by reading the code or watching a trace, not a vendor promise.

That last point is the one I would push hardest on. A lot of "secure agent" demos prove the agent can call the tool. Almost none prove the agent cannot call it when it should not, and even fewer prove you can revoke the user everywhere in seconds. That is the test that actually matters, and it is the one these demos are built to pass.

08 Run it, or let us stand it up with you

Both demos are real, both run end to end, and both are built so anyone can audit the chain and understand the flow. The MCP server, the Vault verify-rar role, the Postgres lease, and the audit join key are identical across them. The only thing that changes is whose login page the user sees first.

→ Want this on your own stack?
We will stand up a live demo with you, on-prem or in your cloud
Your IDP · your tenant · your data path · the full chain, running
The fastest way to evaluate this is not to read about it. It is to watch your own Entra or Okta token drive a per-call IBM Verify decision and a five-minute Vault credential, against your environment. I run these as hands-on workshops. If you want one, reach out on LinkedIn and we will get it scheduled. We can integrate the pattern into the platform you already operate.

You do not have to choose between the identity provider you already trust and the per-call control an AI agent needs. Keep the front door. Add the policy engine behind it.

Your identity provider keeps doing what it is already in place to do. IBM Verify makes the per-call decision. HashiCorp Vault mints the credential that lives five minutes and then does not exist. And when something goes wrong, the signal reaches back to the provider the user signed in with.