Bring Microsoft Entra, Okta, or any standards-based provider. IBM Verify makes the per-call decision and HashiCorp Vault secures the last mile with a credential that lives five minutes, so the agent and the database hop are both covered without changing how anyone signs in.
One of the first questions that comes up in these agentic AI conversations is simple. If you want real per-call authorization in front of an AI agent, do you have to replace the identity provider you already use?
No. You should not have to change how people sign in just to secure what happens after they do.
In most environments the problem is not the existing sign-in flow. The problem is the long-lived credential sitting behind the agent, or the absence of a policy decision tied to the exact action the agent is trying to take. One pattern where Verify really shines is to keep your current identity provider for authentication, then add a policy layer that evaluates each request and issues short-lived access only when that specific action is allowed. That holds whether you are securing a workforce of a few thousand or a consumer identity estate in the tens of millions.
To make the point concrete rather than theoretical, I built two end-to-end integrations. One uses Microsoft Entra ID as the identity provider. One uses Okta. In both, the user signs in exactly the way they already do, and IBM Verify still makes every per-call authorization decision, still fires step-up MFA on the actions that warrant it, and still hands HashiCorp Vault the cryptographic proof it needs to mint a credential that lives five minutes and then does not exist. The same pattern works for any standards-based provider, Ping Identity and Transmit Security included.
Your IDP owns who the user is. IBM Verify owns what the agent is allowed to do, on this call, right now. Those are two different jobs, and you do not have to merge them.
This is the pattern I have been demoing at the Gartner Identity & Access Management Summit, IBM THINK, RSAC, and customer workshops, and that I will be walking through at Identiverse, along with a range of integrations that show the flexibility of Verify and Vault.
Here is the mental model that makes the rest of this make sense. In a classic single-sign-on deployment you think of your identity provider as the thing that owns identity, full stop. Login, MFA, sessions, and authorization. For the human sign-in, that is exactly right. Your identity provider authenticates your users beautifully, whether that is a few thousand employees or millions of consumers.
But an AI agent calling a tool is a different question. The question is not "who is this user." Your IDP, whether Entra or Okta, already answered that. The question is "is the agent allowed to do this specific thing, with this data, for this user, at this amount, right now, and can you show PROOF." That is an authorization decision, it is per-call, and it is exactly the decision IBM Verify is built to make.
The bridge between the two is a standard. The user signs in to Entra or Okta and gets an access token. That token becomes the subject_token in an RFC 8693 OAuth 2.0 Token Exchange call to IBM Verify. Verify validates the incoming token's signature against the IDP's published JWKS endpoint, decides the policy, and mints a brand-new on-behalf-of token that carries the Rich Authorization Requests (RAR, RFC 9396) data. From that point on, the chain is identical no matter which IDP you started from.
Verify trusts Entra or Okta the same way any relying party trusts an issuer: it fetches the issuer's JWKS, validates the token signature, and reads the claims. No shared secret with the IDP, no proprietary connector, no migration. If your IDP can mint a standard, validatable JWT, Verify can exchange it.
That one move is what lets you keep your existing sign-in untouched. Verify never sees the user's password, never owns their MFA enrollment for sign-in, never becomes the system of record for who your users are. It sits one layer in, as the policy decision point for what the agent does. That is the role it is best at, and it is a role most identity providers were never designed to fill at the per-call granularity an agent needs.
"We already use Okta or Ping for everything. Can we just extend their API security features to handle our new AI agents?"
You can absolutely use them to authenticate the initial connection. That is what they are built for, and it is the part you should keep. But traditional IAM checks identity at the door. It is not designed to evaluate what an agent does once it is inside. An assistive agent does not make one call and stop. It chains tasks together, and a platform that authorized the session is not designed to tell the difference between a single ordinary API call and an agent stringing together a multi-step workflow nobody approved. This pattern evaluates the intent and context of every single action in real time. Each step is an RFC 9396 authorization request that IBM Verify decides on per call, which is what stops an agent from walking a sequence of individually plausible calls all the way to data exfiltration.
I want to slow down here because these are the load-bearing pieces, and they are all open standards. Nothing in this chain is proprietary glue.
This is the OAuth 2.0 grant type that lets one token be swapped for another. The MCP server (or the agent's wrapper, more on that in section 05) presents the user's Entra or Okta token as the subject_token, names the type of token it is, and asks Verify for an access token back. The grant type is the giveaway in the wire format:
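As a stand-in for the wire format, here is a minimal sketch of the form fields an RFC 8693 exchange carries. The grant type and token-type URNs come from the RFC itself. The choice of the generic JWT subject token type and the placeholder token value are my assumptions, and client authentication is left out, so treat this as an illustration rather than the demo's exact request.

```python
from urllib.parse import urlencode

# Grant-type and token-type identifiers defined by RFC 8693.
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_form(idp_token: str) -> dict:
    """Return the form fields for a token exchange (client authentication omitted)."""
    return {
        "grant_type": GRANT_TYPE,
        "subject_token": idp_token,
        # Assumption: the IDP token is declared as a generic JWT.
        "subject_token_type": JWT_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
    }


form = build_exchange_form("IDP_ACCESS_TOKEN_PLACEHOLDER")
body = urlencode(form)  # application/x-www-form-urlencoded request body
print(body)
```

The body is what gets POSTed to the token endpoint. The only field that tells you this is a token exchange rather than any other OAuth grant is grant_type.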
This is the part that turns a coarse OAuth scope into a precise statement of intent. Instead of asking for "x access" in the abstract, the caller attaches an authorization_details object that says exactly what is about to happen. In the refund demos it looks like this:
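A hedged sketch of what such an object might contain follows. RFC 9396 requires a type field on every entry and lets the API define the rest. The type name, action name, and currency field here are hypothetical and chosen for illustration. Only the amount of 8000 cents comes from the refund scenario itself.

```python
import json

# Illustrative RFC 9396 authorization_details: a JSON array of typed objects.
authorization_details = [
    {
        "type": "refund",       # hypothetical type identifier
        "actions": ["create"],  # hypothetical action name
        "amount": 8000,         # minor units: 8000 cents
        "currency": "USD",      # hypothetical field
    }
]

# The parameter travels as a JSON-encoded string in the token request.
encoded = json.dumps(authorization_details)
print(encoded)
```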
That amount of 8000 cents is not decoration. IBM Verify's access policy reads it, compares it against the user's per-user refund_limit attribute, and decides whether this call sails through or needs a human to approve it. The authorization is bound to the content of the request, not to a role the user happens to hold.
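The comparison itself is simple enough to sketch. This is not IBM Verify's policy language, just plain Python that mirrors the logic described above, with function and outcome names that I made up for the example.

```python
def decide_refund(amount_cents: int, refund_limit_cents: int) -> str:
    """Mirror the described decision: within the user's limit passes, above it needs approval."""
    if amount_cents <= refund_limit_cents:
        return "permit"
    return "step_up"  # a human has to approve via step-up MFA


# The 8000-cent request against two hypothetical per-user limits.
print(decide_refund(8000, 10000))
print(decide_refund(8000, 5000))
```

The point of the sketch is the input: the decision takes the amount from the request, not a role name.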
When Verify approves, it signs the approved authorization_details back into the access token it returns. That signed token is the on-behalf-of token, an RFC 9068 JWT. It is the cryptographic spine of the whole chain, and it is what makes the last hop safe.
HashiCorp Vault, running the verify-rar plugin, takes that on-behalf-of token and does three things in order: it validates the JWT signature against IBM Verify's JWKS, it reads the rich authorization request that Verify signed in, and it matches that request against the role's mappings before it mints anything. Only then does it create a fresh PostgreSQL credential with a five-minute lease.
Vault does not trust the authorization request because the MCP server says so. It trusts it because IBM Verify signed a JWT that contains it, and Vault checks that signature before it acts. If a compromised MCP server tried to forge a wider request and hand it straight to Vault, the signature check fails. The credential follows the request, and the request is attested by Verify.
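To make the ordering concrete, here is a rough Python sketch of the three checks. It is not the verify-rar plugin's code. The mapping shape, the role name, and the injected verify_jwt callable (standing in for real JWKS-based signature validation) are all assumptions for illustration.

```python
from typing import Callable

LEASE_SECONDS = 300  # five minutes, as described above


def authorize_lease(token: str,
                    verify_jwt: Callable[[str], dict],
                    role_mappings: list) -> dict:
    # 1. Signature check comes first; verify_jwt must raise on failure.
    claims = verify_jwt(token)
    # 2. Read the authorization request that was signed into the token.
    details = claims.get("authorization_details", [])
    # 3. Match the request against the role's mappings before minting anything.
    for detail in details:
        actions = set(detail.get("actions", []))
        for mapping in role_mappings:
            if (detail.get("type") == mapping["type"]
                    and actions and actions <= set(mapping["actions"])):
                return {"db_role": mapping["db_role"], "ttl": LEASE_SECONDS}
    raise PermissionError("no role mapping matches the signed request")


def fake_verify(token: str) -> dict:
    # Stand-in for JWKS validation: only one token value "verifies".
    if token != "signed-by-verify":
        raise ValueError("bad signature")
    return {"authorization_details": [{"type": "refund", "actions": ["create"]}]}


mappings = [{"type": "refund", "actions": ["create"], "db_role": "refund_writer"}]
lease = authorize_lease("signed-by-verify", fake_verify, mappings)
print(lease)
```

A forged token never reaches step 2, because the verifier raises before any claim is read.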
We can call this the Verify plus Vault chain, and the framing is three lines: the model runs on the agent's infrastructure, the secrets live in your Vault, and the authorization decision happens in IBM Verify. Bringing your own IDP just adds a fourth line at the top. The sign-in stays in Entra, Okta, or whichever IDP you already use. Nothing downstream cares.
Let me show what actually changes when Entra is the IDP, because it is a small, specific list and one item on it will cost you an afternoon if nobody warns you.
On the Verify side you create two objects. A Custom Token Type declares the trust anchor: Entra's v2.0 issuer and its JWKS URI, plus the validation rules. An STS Client registers the OAuth client the exchange runs as and binds it to the step-up access policy. That is the federation. Verify now accepts Entra-issued JWTs as a valid subject_token.
There are a few more Entra-specific settings that each map to a specific error code if you miss them.
CSIAQ5216E, the "exp is too far in the future" error.
jti validation enabled causes exchanges to fail with CSIAQ5205E because the validator expects jti (RFC 7519 registers it as the JWT ID), but Entra provides uti instead.
preferred_username, not upn. Entra omits upn for some account types even when you configure it. preferred_username is reliably present on v2 tokens.
given_name and family_name as access-token claims. Verify just-in-time provisions a shadow Cloud Directory user from the token, and Cloud Directory needs first and last name to create it. Miss them and the very first exchange fails with CSIAQ5215E. This is not a full account but rather a "stub" account.
The payoff: once those are set, the chain is byte-for-byte identical to the Verify-direct version. Same RAR, same Vault role, same five-minute PostgreSQL credential, same refund SQL. The one user-visible difference is step-up MFA. For a transient user, when the policy fires ACTION_MFA_ALWAYS, Verify delivers a six-digit one-time code to the email or mobile from the claim in the Entra token. No enrollment, no friction, and the user never leaves the chat. Note that we could also make an API call to the host IDP if needed.
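The claim requirements above are easy to check before the first exchange. A small preflight sketch, assuming you already have the decoded claims of an Entra access token as a dict. The function name and sample values are mine, and the check only covers the claims named above.

```python
# Claims the exchange depends on, per the settings described above.
REQUIRED_CLAIMS = ("preferred_username", "given_name", "family_name")


def missing_entra_claims(claims: dict) -> list:
    """Return the required claims that are absent or empty in a decoded token."""
    return [name for name in REQUIRED_CLAIMS if not claims.get(name)]


# Hypothetical decoded claims: Entra supplies uti, not jti.
sample = {
    "preferred_username": "rep@example.com",
    "given_name": "Sam",
    "uti": "hypothetical-token-id",
}
print(missing_entra_claims(sample))
print("jti" in sample)
```

An empty list means the just-in-time provisioning step has what it needs. A missing jti is expected on Entra tokens and is the reason jti validation has to stay off.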
The Okta demo is the same architecture with Okta-specific plumbing. You register a public PKCE single-page app in Okta, you point a Verify Custom Token Type at Okta's issuer and JWKS URI, and you wire up the STS Client exactly as before.
The Okta trap is different from the Entra one but just as easy to trip over. Okta has two authorization servers, and they are not interchangeable. The Org Authorization Server at /oauth2/v1 issues tokens with OIDC-standard claims only and no custom claims. The Custom Authorization Server at /oauth2/default is the one that can carry a custom claim like refund_limit. If you point Verify at the wrong issuer, sign-in still succeeds, which is what makes it sneaky, but the custom claims your policy needs are simply not there. Decode a real token, read the iss, and make sure it is the /oauth2/default form.
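Checking the issuer takes a few lines. This sketch decodes the token payload without verifying the signature, so it is for inspection only and never a substitute for validation. The example.okta.com domain and the unsigned demo tokens are placeholders I built for the example.

```python
import base64
import json


def unverified_claims(jwt: str) -> dict:
    """Decode the payload segment of a JWT. Inspection only: no signature check."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def issuer_is_default_custom_server(jwt: str) -> bool:
    iss = unverified_claims(jwt).get("iss", "")
    return iss.rstrip("/").endswith("/oauth2/default")


def _segment(obj: dict) -> str:
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Two unsigned demo tokens with placeholder issuers.
custom_token = ".".join([_segment({"alg": "none"}),
                         _segment({"iss": "https://example.okta.com/oauth2/default"}), ""])
org_token = ".".join([_segment({"alg": "none"}),
                      _segment({"iss": "https://example.okta.com"}), ""])

print(issuer_is_default_custom_server(custom_token))
print(issuer_is_default_custom_server(org_token))
```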
Step-up MFA on the Okta path works the same way as Entra: a transient one-time code by email, or SMS if you want it, delivered to the contact on the Okta token. No Verify-side factor enrollment required, because Verify is not pretending to own the user. It is borrowing an identity claim long enough to make a policy decision.
Entra, Okta, or Verify-direct, the on-behalf-of token Verify mints looks identical and the Vault chain below it is fully agnostic to which IDP you started from. You preserve your workforce-identity investment and still get per-call RFC 9396 authorization, step-up MFA, and tenant-wide enforcement, all without migrating a single user.
This is the part I most want practitioners to internalize, because it is the difference between "nice demo" and "I can actually roll this out across the agents I already have." There are two security topologies, both ship in both demos, and they produce an identical audit chain. The only thing that moves is where the token exchange runs, and therefore where the trust boundary lives.
Before we place that boundary, it helps to see the whole loop in one picture, including the one actor the diagrams so far have left implicit: the model itself. The agent does not act on its own. It asks the model which tool to call, the model names a tool, the agent makes the call, the result goes back to the model, and the model decides whether it is finished or needs another tool. That loop can run several times for a single request. The thing to notice is what the model never touches. It picks the next tool by name, and that is the entire extent of its power. The user's bearer, the on-behalf-of token, and the database credential never enter the model's context. They live below the line, inside the trust boundary, which is exactly why the model being clever, or being fooled, cannot widen what the agent is allowed to do.
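The loop is small enough to sketch. This is a toy version with a scripted stand-in for the model and a fake tool, not any framework's API, and every name in it is hypothetical. What it shows is the separation: the model sees the transcript, and the bearer token only travels on the tool call.

```python
def run_agent(user_request, user_bearer, pick_tool, call_tool, max_steps=5):
    """Ask the model for a tool, call it, feed the result back, repeat until done."""
    transcript = [user_request]             # everything the model is allowed to see
    for _ in range(max_steps):
        tool = pick_tool(transcript)        # the model only names the next tool
        if tool is None:                    # the model says it is finished
            break
        result = call_tool(tool, user_bearer)  # credential stays below the line
        transcript.append(f"{tool} -> {result}")
    return transcript


def scripted_model(transcript):
    # Stand-in for a real model: one tool call, then stop.
    return "lookup_order" if len(transcript) == 1 else None


def fake_tool(tool, bearer):
    assert bearer  # the tool layer has the credential; the model never does
    return "ok"


transcript = run_agent("refund order 1001", "SECRET_BEARER", scripted_model, fake_tool)
print(transcript)
```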
The agent is a thin transport. It forwards the user's bearer token to the MCP server verbatim, about five lines of code, and it does nothing else. The MCP server runs the token exchange, handles the step-up MFA dance, and talks to Vault. In the demos the Strands agent defaults to this, and the MCP server runs with MCP_AUTH_MODE=te.
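Those five lines look roughly like this. It is a sketch rather than the Strands demo's code, and the function name is mine: the agent copies the incoming Authorization header onto the outbound MCP call and refuses to proceed without one.

```python
def forward_headers(incoming_headers: dict) -> dict:
    """Pass the user's bearer token through to the MCP server unchanged."""
    auth = incoming_headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise PermissionError("missing user bearer token")
    return {"Authorization": auth}


outbound = forward_headers({"Authorization": "Bearer idp-token", "X-Other": "dropped"})
print(outbound)
```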
Here is when this is the right call. You have a fleet of agents already running, maybe thousands of them, maybe across teams that do not share a single auth library. You are not going to go re-instrument every one of them. So you do not. You put the enforcement in the MCP servers those agents call, and the agents themselves can stay exactly as they are. The agent's framework, host, or programming language becomes irrelevant to your security posture. The MCP server is the perimeter, and the agents just call it.
The security work moves up into a shared wrapper library that the agent loads. The wrapper runs the token exchange itself, threads the resulting on-behalf-of token downstream, and the MCP server's job narrows to validating that JWT: signature against the Verify JWKS, audience, and the rich authorization request shape. In the demos the LangChain agent defaults to this, and the MCP server runs with MCP_AUTH_MODE=validate.
And here is when this is the right call. The MCP server is supplied by a vendor and you cannot modify it. You do not own that code, so you cannot put the enforcement there. Fine. You put it in your own wrapper, before the call ever reaches the vendor's MCP server. Same outcome, different boundary. The same logic applies when you have many agent frameworks and you want one shared auth library doing token exchange uniformly across all of them, instead of N copies of the same logic living in N different MCP servers.
The wrapper is small. This is the heart of it in the LangChain variant, a single function the agent calls per tool call:
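What follows is a minimal sketch of what such a function could look like, not the demo's actual code. It assumes the HTTP call is injected as a post callable (so the sketch runs without a network), client_secret_post authentication, and a placeholder token URL. Only the grant type, the subject_token, and the authorization_details parameter come from the standards named earlier.

```python
import json
from typing import Callable
from urllib.parse import parse_qs, urlencode

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def exchange_for_obo(user_token: str, authorization_details: list,
                     post: Callable[[str, bytes], dict],
                     token_url: str, client_id: str, client_secret: str) -> str:
    """Swap the user's IDP token for an on-behalf-of token scoped to one tool call."""
    form = {
        "grant_type": GRANT_TYPE,
        "subject_token": user_token,
        "subject_token_type": JWT_TYPE,  # assumption: generic JWT type
        "authorization_details": json.dumps(authorization_details),
        "client_id": client_id,          # assumption: client_secret_post auth
        "client_secret": client_secret,
    }
    response = post(token_url, urlencode(form).encode())
    return response["access_token"]


sent = {}


def fake_post(url: str, body: bytes) -> dict:
    # Stand-in transport: record the form and return a canned token.
    sent.update(parse_qs(body.decode()))
    return {"access_token": "obo-token"}


obo = exchange_for_obo("idp-token", [{"type": "refund", "amount": 8000}],
                       fake_post, "https://tenant.example/token", "client", "secret")
print(obo)
```

In a real wrapper the post callable would be an HTTPS client, and the returned token is what gets threaded to the MCP server on the tool call.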
Both patterns carry the same RFC 9396 rich authorization request. That is the non-negotiable. Whether the token exchange runs in the MCP server or in the agent's wrapper, the request describes the exact operation, Verify attests it, and Vault checks the attestation. Move the boundary to fit your reality. Do not move the model.
One operational note worth saying out loud: in Pattern A the token-exchange client secret lives in the MCP server, in Pattern B it lives in the agent's wrapper. That is a real shift in where a sensitive credential sits, and it belongs in your runbooks. The MCP server in Pattern B does not even need that secret. It only needs the public JWKS URL to validate what it receives.
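One way to keep that honest in a runbook is a startup check that knows which settings each mode needs. A sketch, assuming configuration arrives as environment-style keys. MCP_AUTH_MODE is the name used in the demos, and the other setting names are hypothetical.

```python
def required_settings(mode: str) -> set:
    """Settings the MCP server needs for each auth mode (names are illustrative)."""
    if mode == "te":        # Pattern A: the server runs the token exchange
        return {"VERIFY_TOKEN_URL", "STS_CLIENT_ID", "STS_CLIENT_SECRET"}
    if mode == "validate":  # Pattern B: the server only validates the JWT
        return {"VERIFY_JWKS_URL"}
    raise ValueError(f"unknown MCP_AUTH_MODE: {mode!r}")


def missing_settings(env: dict) -> set:
    mode = env.get("MCP_AUTH_MODE", "")
    return required_settings(mode) - set(env)


print(missing_settings({"MCP_AUTH_MODE": "validate", "VERIFY_JWKS_URL": "https://tenant.example/jwks"}))
print(sorted(missing_settings({"MCP_AUTH_MODE": "te"})))
```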
Here is the capability that tends to land hardest in a room. Everything above is the happy path. This is the unhappy path, and it is where Shared Signals earns its keep.
When something goes wrong, such as a customer-service rep denying three step-up pushes in a row, one explicit "mark as suspicious" tap, or a single request that blows past the tenant hard cap, the MCP server emits a CAEP session-revoked event into a local IBM Antenna container. Antenna signs it as an RFC 8417 Security Event Token, the receiver polls for it, and an action handler calls IBM Verify's session-revocation admin API. The user's sessions across every app federated to that Verify tenant die within roughly 30 to 75 seconds.
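For orientation, here is a sketch of the claims such a token carries before it is signed. The top-level claims follow RFC 8417 and the event type URI is the CAEP session-revoked identifier. The issuer, audience, subject format, and reason text are placeholders, and where the subject sits varies between versions of the Shared Signals specs, so do not read this as Antenna's exact payload.

```python
import time
import uuid

# CAEP event type identifier for a revoked session.
SESSION_REVOKED = "https://schemas.openid.net/secevent/caep/event-type/session-revoked"


def session_revoked_claims(issuer: str, audience: str, user_email: str, reason: str) -> dict:
    """Build unsigned Security Event Token claims for a session-revoked event."""
    now = int(time.time())
    return {
        "iss": issuer,
        "aud": audience,
        "iat": now,
        "jti": uuid.uuid4().hex,  # unique ID for this event token
        "events": {
            SESSION_REVOKED: {
                # Assumption: subject carried inside the event, email format.
                "subject": {"format": "email", "email": user_email},
                "event_timestamp": now,
                "reason_admin": {"en": reason},
            }
        },
    }


claims = session_revoked_claims("https://mcp.example", "https://receiver.example",
                                "rep@example.com", "three denied step-up prompts")
print(sorted(claims))
```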
But the part people do not expect is the cross-IDP reach. The Antenna action handler does not stop at Verify. It also calls the external IDP's own revocation API. For Entra that is Microsoft Graph's revokeSignInSessions. For Okta it is DELETE /api/v1/users/{userId}/sessions. So a single bad request does not just kill the agent's session. It reaches back to the identity provider you kept, and signs the user out there too.
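The two upstream calls differ only in method and URL. A sketch of how an action handler might pick between them, with authentication and the HTTP client left out. The Graph v1.0 base URL and the Okta path are the public API forms, while the function name and example identifiers are mine.

```python
def revocation_request(idp: str, user_id: str, okta_domain: str = "") -> tuple:
    """Return the (method, url) for the external IDP's session revocation call."""
    if idp == "entra":
        return ("POST",
                f"https://graph.microsoft.com/v1.0/users/{user_id}/revokeSignInSessions")
    if idp == "okta":
        if not okta_domain:
            raise ValueError("okta_domain is required for Okta")
        return ("DELETE", f"https://{okta_domain}/api/v1/users/{user_id}/sessions")
    raise ValueError(f"unsupported IDP: {idp!r}")


print(revocation_request("entra", "user-object-id"))
print(revocation_request("okta", "00u123", okta_domain="example.okta.com"))
```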
The Verify-side kill is immediate. The user's /oauth2/userinfo starts returning 401 right away, which is what the chat UI's poller detects. The external-IDP revocation behaves differently, and it is worth being precise about. Microsoft Graph's revokeSignInSessions invalidates the user's refresh tokens and browser session cookies, and Microsoft notes it can take a few minutes to propagate. It does not instantly tear down every access token already in flight. Treat the upstream call as an operational control that closes the sign-in loop, not as a guaranteed instant cutoff for tokens already issued. If you need tight upstream cutoff, the lever is short access-token lifetimes on the IDP side, so a revoked session cannot ride a long-lived token.
The trust separation here is deliberate and worth calling out. The MCP server only emits events. It never holds the admin credential and never calls Verify's or the IDP's revocation API directly. The Antenna receiver holds those credentials and is the only thing that calls the kill APIs. If the MCP server is compromised, it can emit a fraudulent event, but it cannot forge the admin token or revoke sessions on its own. The signal and the enforcement are separated on purpose.
The Okta variant of this kill chain looks the same on the wire, just with Okta's revocation endpoint in the handler instead of Graph's.
If you are the person who has to sign off on an agentic deployment, here is the short list I would hold a team to. Each of these is a property you can verify by reading the code or watching a trace, not a vendor promise.
That last point is the one I would push hardest on. A lot of "secure agent" demos prove the agent can call the tool. Almost none prove the agent cannot call it when it should not, and even fewer prove you can revoke the user everywhere in seconds. That is the test that actually matters, and it is the one these demos are built to pass.
Both demos are real, both run end to end, and both are built so anyone can audit the chain and understand the flow. The MCP server, the Vault verify-rar role, the Postgres lease, and the audit join key are identical across them. The only thing that changes is whose login page the user sees first.
You do not have to choose between the identity provider you already trust and the per-call control an AI agent needs. Keep the front door. Add the policy engine behind it.
Your identity provider keeps doing what it is already in place to do. IBM Verify makes the per-call decision. HashiCorp Vault mints the credential that lives five minutes and then does not exist. And when something goes wrong, the signal reaches back to the provider the user signed in with.