Keeping Your AI Workloads Sovereign on AWS

Table of Contents

An Australian lawyer asked me last week whether he could run “frontier Claude” on AWS Bedrock in Sydney and keep his client data in Australia. Simple enough question. The answer is one of those ones where you open your mouth ready to say “yes, obviously” and then close it again, because as of April 2026 the honest answer is “not the way you think, and here is the bit that will bite you.”

That conversation is the trigger for this post, but the problem is a lot bigger than one lawyer with a privacy concern. Any team running Bedrock from Sydney and telling a risk committee, a regulator, or a customer that “our data stays in region” needs to actually understand how Bedrock is routing the request. In 2025 and 2026 the routing has changed in ways that most Bedrock tutorials have not caught up to.

If you are an AWS architect building AI workloads with a data-residency constraint, or a compliance reviewer trying to work out whether the console is telling you the whole story about where your prompts are going, this post is for you. I want to walk through the three Bedrock inference invocation types, explain what “APAC” actually means when AWS says it, show you the Sydney model matrix across Claude, Nova, Titan and the other families, and hand you an SCP you can adapt for a defensible Australian posture.

I covered some of this ground earlier in AWS Bedrock open-weight models and Sydney sovereignty. This is the deeper cut on inference profiles specifically, because they are where the sovereignty story quietly falls apart for most customers.

The three ways a Bedrock request gets executed

Bedrock has three invocation modes. You already know the first one. The other two are where the interesting things happen.

On-demand in-region. You call InvokeModel or Converse with a base model ID like anthropic.claude-3-haiku-20240307-v1:0. The request is processed strictly in the region your endpoint points to. No rerouting, no fallback, no surprises. Your CloudTrail log sits in that region. aws:RequestedRegion equals that region. This is the cleanest residency story AWS offers.

Geographic cross-region inference (geo-CRIS). You call an inference profile ARN whose ID carries a geography prefix: us., eu., apac., au., jp., us-gov.. AWS then routes your request to one of the regions inside that geography, based on available capacity. The prompts and outputs may move between those regions during processing. Only the source-region storage stays put.

Global cross-region inference (global-CRIS). You call a profile ARN with the global. prefix. AWS routes to any supported commercial AWS region worldwide, picking whichever one has capacity at that moment. The AWS documentation is refreshingly direct here: “Organizations with data residency or compliance requirements should assess whether Global cross-Region inference fits their compliance framework, since requests may be processed in other supported AWS commercial Regions.” Translation: if residency matters, do not use this.

Here is the first thing that will catch you out. On-demand used to be the default. In 2025 and 2026, AWS has been launching new Claude models as inference-profile-only in several regions. There is no base model ID to call. If you want Claude Sonnet 4.5 or Opus 4.6 in Sydney today, you are using a profile whether you meant to or not.

What “APAC” actually means

The word “APAC” triggers a mental model for most security reviewers that looks something like: Tokyo, Seoul, Singapore, Sydney, maybe Mumbai. A tidy little ring around East Asia and Oceania. When the Bedrock console says apac.*, that is probably what you picture.

That mental model is wrong.

There is a subtlety in how AWS lists regions in an inference profile that almost everyone misses on a first read. The profile support table is asymmetric. It lists a set of source regions, and for each source region it shows the destination regions that source can route to. The destinations are not the same for every source.

For apac.anthropic.claude-sonnet-4-20250514-v1:0, when ap-southeast-2 (Sydney) is the source, the destination set is:

ap-northeast-1 (Tokyo)
ap-northeast-2 (Seoul)
ap-northeast-3 (Osaka)
ap-south-1 (Mumbai)
ap-south-2 (Hyderabad)
ap-southeast-1 (Singapore)
ap-southeast-2 (Sydney)
ap-southeast-4 (Melbourne)

Six of those eight destinations are outside Australia. A prompt that originates in Sydney and is routed through the apac.* profile can land in Tokyo, Seoul, Osaka, Mumbai, Hyderabad, or Singapore for processing. Only two of the eight destinations (Sydney and Melbourne) keep the data on Australian soil, and you have no control over which one Bedrock picks. That is not a guess, it is what the AWS inference-profile support table says verbatim for the ap-southeast-2 row.

If you look at the profile as a whole, the region membership is even wider. The profile lists ap-east-2 (Taiwan), ap-southeast-3 (Jakarta), ap-southeast-5 (Malaysia), ap-southeast-7 (Thailand), and me-central-1 (UAE) as source regions as well, each with their own destination sets. A request originating in Dubai, for example, CAN be routed to Sydney under this profile. But the reverse is not true. A Sydney-origin request cannot be sent to Dubai or Taipei. The routing asymmetry is the entire point, and it is easy to miss if you read the table as a flat list of “regions that appear anywhere in the profile” rather than a per-source-region routing map.

If you have been answering “yes, we use Bedrock in Sydney” on vendor questionnaires without checking which profile your SDK actually uses, you have a problem. Whatever regulatory framework applies to your workload (Privacy Act, GDPR, HIPAA, CPS 234, PCI DSS, your own customer contracts) the question “where did my prompt go” has a concrete answer that lives in inference-profiles-support.html, and for a lot of customers the answer is not what they tell their auditors.

The Sydney model matrix, April 2026

Most of the Bedrock sovereignty conversation is about Claude, because Claude is where the frontier is and where the legal and compliance teams have noticed. But the picture is very different from family to family, and if you are evaluating Bedrock for a real workload you need to look at the whole menu. Here is how each model family looks from ap-southeast-2 as I write this.

Anthropic Claude

Claude Version	On-Demand in Sydney	`au.*`	`apac.*`	`global.*`
Claude 3 Haiku	Yes	—	Yes	—
Claude 3 Sonnet	Yes	—	Yes	—
Claude 3.5 Sonnet (v1, v2)	No	—	Yes	—
Claude 3.7 Sonnet	No	—	Yes	—
Claude Sonnet 4	No	—	Yes	Yes
Claude Sonnet 4.5	No	Yes	—	Yes
Claude Haiku 4.5	No	Yes	—	Yes
Claude Opus 4.5	No	—	—	Yes only
Claude Opus 4.6	No	Yes	—	Yes
Claude Sonnet 4.6	No	Yes	—	Yes

Two things jump out. Claude 3 Haiku and Claude 3 Sonnet are the only Claude models you can call on-demand in Sydney per the AWS models-regions.html table. Every newer Claude needs an inference profile to work from ap-southeast-2. And Claude Opus 4.5 is reachable from Sydney only via global.*, meaning worldwide routing. If you need Opus 4.5’s reasoning headroom AND Australian residency, today’s answer is “you cannot have both”. Opus 4.6 is the first Opus-class model you can actually pin to Australia.

Amazon Nova

Nova is where the sovereignty story is quietly best. Nova Lite, Nova Micro, Nova Pro, Nova Sonic, Nova Canvas, and Nova Reel are all available on-demand natively in ap-southeast-2. No inference profile required. Nova 2 Lite is available too, but via global.amazon.nova-2-lite-v1:0 only, so it falls into the worldwide-routing bucket.

There is no au.* profile for any Nova model. You do not need one. The on-demand endpoint in Sydney is the residency-clean path, and the apac.amazon.nova-* profiles (with Sydney and Singapore as source regions only) exist mostly for throughput and failover within APAC rather than as the primary access method.

If your workload fits Nova’s capability envelope, you have the simplest sovereignty story of any Bedrock customer: call the base model ID in ap-southeast-2, done, no profile conversations required. Amazon has clearly prioritised in-region parity for their own model family in a way Anthropic has not, and it is worth letting that shape your shortlist.

Amazon Titan

Titan Text Large, Titan Text Embeddings V2, Titan Text/Image Embeddings, Titan Embeddings G1 Text, and Titan Multimodal Embeddings G1 are all on-demand in ap-southeast-2. Titan Image Generator G1 v2 is not. Titan sits in the same “in-region by default” category as Nova. For the embedding workloads that underpin most RAG pipelines, you do not need to think about CRIS at all.

Meta Llama, Mistral, Writer, DeepSeek

This is the bucket that gets ignored in most sovereignty discussions. None of these model families are available in ap-southeast-2, on-demand or via CRIS, at the time of writing. Meta Llama (3.1, 3.2, 3.3, Llama 4 Maverick and Scout) has us.* and limited eu.* profiles. Mistral Pixtral Large has us.* and eu.*. DeepSeek R1 is US-only. Writer Palmyra X4 and X5 are US-only.

If your architecture needs any of these models from an Australian account, your options today are “use an Anthropic or Amazon alternative” or “accept that the request goes to the United States regardless of profile wrapping”. There is no Sydney path at all, sovereign or otherwise. This is worth flagging because a lot of open-source-leaning teams default to Llama on Bedrock without realising the sovereignty picture is actually worse for Llama than it is for Claude.

Cohere

Cohere Embed v4 is available from Sydney as a source region via global.cohere.embed-v4:0 only. Worldwide routing, destinations in any commercial AWS region. Cohere has no apac.*, no au.*, and no native ap-southeast-2 on-demand endpoint. For an embedding model that is usually called in high volume, global routing is a non-trivial data-movement story.

TwelveLabs (video understanding)

TwelveLabs Pegasus v1.2 and Marengo Embed v2.7 appear in apac.* profiles with Sydney listed as a destination but not as a source (the only listed source is ap-northeast-2 Seoul). If your video pipeline runs from Sydney, TwelveLabs does not currently have a path in. This is a small corner but worth knowing if you are building multimodal workloads.

Summary of the family-level picture

Family	Sydney story (April 2026)
Amazon Nova	On-demand in ap-southeast-2, full range. Best sovereignty story.
Amazon Titan (text, embeddings)	On-demand in ap-southeast-2. Clean.
Anthropic Claude 3.x	Mixed: Haiku 3 and Sonnet 3 on-demand. Newer 3.5/3.7 via `apac.*`.
Anthropic Claude 4.x	`au.` for Sonnet 4.5, Haiku 4.5, Opus 4.6, Sonnet 4.6. `global.`-only for Opus 4.5.
Meta Llama	Not available in Sydney at all. US or EU only.
Mistral	Not available in Sydney at all. US or EU only.
Cohere Embed v4	`global.*` only from Sydney. Worldwide routing.
DeepSeek, Writer	Not available in Sydney at all. US only.
TwelveLabs	Sydney is a destination in `apac.*` profiles, not a source.

The headline pattern: if your sovereignty story depends on the simplest possible posture, shortlist Nova, Titan, and the au.*-profile Claude 4.x generation before anything else. Everything else on the Bedrock menu either requires CRIS routing you have to reason about, or is not in Sydney at all.

The one profile that keeps your data in Australia

The au.* profile was introduced late in 2025 with Claude Sonnet 4.5 and expanded through the Haiku 4.5, Opus 4.6, and Sonnet 4.6 launches. Its source and destination region set is exactly two regions:

ap-southeast-2 (Sydney)
ap-southeast-4 (Melbourne)

That is it. A request routed through au.anthropic.claude-opus-4-6-v1 originating in Sydney cannot land anywhere other than Sydney or Melbourne. Your data stays on Australian soil. Your CloudTrail source-region entry captures the additionalEventData.inferenceRegion field so you can audit exactly which of the two regions processed each call.

This is the profile you want for regulated Australian workloads. It is also the profile that most Bedrock tutorials and quickstart guides will not mention, because they were written with us.* or global.* in mind.

The SCP that actually enforces it

AWS publishes a reference SCP in Securing Amazon Bedrock cross-Region inference: Geographic and global that denies all non-US geographic CRIS. The Australian version is the same policy with one ARN pattern swapped. Here it is, drop it into a Control Tower OU or attach it to the account you want pinned to Australian Claude:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyNonAustraliaGeographicCRIS",
      "Effect": "Deny",
      "Action": "bedrock:*",
      "Resource": "*",
      "Condition": {
        "Null": {
          "bedrock:InferenceProfileArn": "false"
        },
        "ArnNotLike": {
          "bedrock:InferenceProfileArn": [
            "arn:aws:bedrock:*:*:inference-profile/au.*"
          ]
        }
      }
    }
  ]
}

A few notes on how this works, because the logic is more subtle than it looks.

The Null condition fires the Deny only when an inference profile is actually in play (bedrock:InferenceProfileArn is not null). Native on-demand calls, such as a base anthropic.claude-3-haiku-20240307-v1:0 invocation or any Nova or Titan call, are unaffected. The ArnNotLike clause then denies any inference profile that is not an au.* profile. In one statement, apac.*, us.*, eu.*, jp.*, and global.* are all blocked, because none of them match au.*.

Pair this with a region allowlist SCP that denies bedrock:* where aws:RequestedRegion is not ap-southeast-2. That gives you two independent locks: one on where you can call from, one on which inference profile families are permitted.

One subtlety for the global-CRIS case. The AWS security blog clarifies (and its own reference SCPs confirm) that aws:RequestedRegion takes the literal value "unspecified" for global cross-region inference calls, not "global". If you want a separate belt-and-braces statement that explicitly denies global CRIS on top of the ArnNotLike rule above, AWS’s verbatim pattern is:

{
  "Effect": "Deny",
  "Action": "bedrock:*",
  "Resource": "*",
  "Condition": {
    "StringLike": {
      "aws:RequestedRegion": ["unspecified"]
    },
    "ArnLike": {
      "bedrock:InferenceProfileArn": "arn:aws:bedrock:*:*:inference-profile/global.*"
    }
  }
}

This is the exact JSON from the AWS security blog, shipped as a standalone deny. It is redundant with the ArnNotLike au.* rule above (both catch global.*), but if you are operating under AWS Control Tower you will find that this two-statement shape is the one the official AWS documentation tells Control Tower customers to use via Customizations for AWS Control Tower, so it may be easier to justify to an auditor.

Test in a non-production account first. The Australian variant above is a one-ARN-pattern swap on AWS’s published US-geographic example, but you should still run it through a sandbox OU and verify that (a) your Nova and Titan on-demand calls still work, (b) your au.* Claude calls still work, and (c) an apac.* or global.* call from the same account is denied.

Auditing what actually happened

CloudTrail logs every Bedrock invocation in the source region. The field that matters for sovereignty is additionalEventData.inferenceRegion. If it is present, the call went through an inference profile and that field tells you which destination region actually served the request. If it is absent, the call was native on-demand and ran in the awsRegion shown in the same event. The absence of inferenceRegion is itself the diagnostic. It is the clean-in-region signal you want on every event for a sovereign workload.

Here is a real CloudTrail event from my own AWS account in Sydney, running Nova Pro via the Converse API. I pulled this from the account that hosts Viking Context Service, which uses Nova Pro on-demand for its compilation layer:

{
  "eventVersion": "1.11",
  "eventTime": "2026-04-12T12:36:37Z",
  "eventSource": "bedrock.amazonaws.com",
  "eventName": "Converse",
  "awsRegion": "ap-southeast-2",
  "requestParameters": {
    "modelId": "amazon.nova-pro-v1:0",
    "inferenceConfig": { "maxTokens": 8192 }
  },
  "additionalEventData": {
    "inputTokens": 2051,
    "outputTokens": 812
  },
  "tlsDetails": {
    "clientProvidedHostHeader": "bedrock-runtime.ap-southeast-2.amazonaws.com"
  }
}

Two things to notice. First, additionalEventData contains token counts but no inferenceRegion, because the call never left Sydney. Nova Pro is on-demand in ap-southeast-2, so no profile is involved. Second, tlsDetails.clientProvidedHostHeader shows the endpoint the SDK actually connected to, which is a handy secondary signal for auditors who want to see the region in the wire-level metadata, not just in AWS’s own routing fields.

For comparison, here is what the AWS security blog shows as a CloudTrail entry when a global.* call is made from ap-southeast-2 and Bedrock routes it to ap-southeast-4:

{
  "eventVersion": "1.11",
  "eventTime": "2025-10-02T01:55:04Z",
  "eventSource": "bedrock.amazonaws.com",
  "eventName": "InvokeModel",
  "awsRegion": "ap-southeast-2",
  "requestParameters": {
    "modelId": "global.anthropic.claude-sonnet-4-5-20250929-v1:0"
  },
  "additionalEventData": {
    "inferenceRegion": "ap-southeast-4"
  }
}

Notice the difference. The model ID is prefixed with global., the caller is still in Sydney, and additionalEventData.inferenceRegion has materialised with a destination. In this example the happy path held and Bedrock routed to Melbourne, still in Australia. But that is a coincidence, not a guarantee. The same profile the next minute could just as easily route to Tokyo or Dublin.

Build a CloudWatch metric filter or CloudTrail Lake query on additionalEventData.inferenceRegion. Alert on anything that is not ap-southeast-2 or ap-southeast-4, and alert even more loudly on any model ID prefixed with global. that you did not explicitly approve. Then you have a detective control that backs up your preventative SCP, and a story to tell the auditor beyond “trust me, the policy blocks it”.

Three decisions your architecture needs to make

Decision 1: Do you actually need frontier Claude, or will a smaller model do? Most summarisation, extraction, classification, and drafting work runs fine on Haiku 4.5 or Nova Pro. Restricting yourself to those is not a regression, it is a deliberate capability ceiling in exchange for a cleaner residency story and a lower bill. A lot of the workloads I see reaching for Opus are doing so because someone on the team read a benchmark blog, not because the task needs it. Start with the smallest model that passes your evals and only climb the ladder when you have to.

Decision 2: Which model family matches your sovereignty posture before you pick the “best” model? Nova and Titan are on-demand in Sydney with no profile involved. au.*-profile Claude is on-demand-grade residency via CRIS. apac.* leaks out of Australia. global.* leaks worldwide. Meta Llama, Mistral, DeepSeek, and Writer have no Sydney path at all. Shortlist the family first, then pick the model inside it, not the other way round.

Decision 3: Are you prepared for the day a new model launches that is not yet in au.*? This has already happened with Claude Opus 4.5. It will happen again. Build a runbook now for the “new model just dropped and it is global.*-only” scenario, so the decision about whether to enable it, pin it out of your SCP, or wait, is made deliberately rather than by a developer hitting a 403 at 2am the day after launch.

My take

Bedrock in Sydney is not the clean sovereign-AI story AWS marketing would like it to be. It is a patchwork. Amazon’s own models are in the best shape: Nova and Titan are on-demand in ap-southeast-2 with no profile involved. Anthropic’s current generation has a genuine Australian pin via au.*, but only for a specific subset of models, and new releases routinely ship months before they get an au.* profile. Meta, Mistral, DeepSeek, and Writer are not in Sydney at all. Cohere is worldwide-only. The practical consequence is that model shortlisting and sovereignty posture are not separate decisions, they are the same decision.

The good news is that once you know this, the controls are straightforward. The bedrock:InferenceProfileArn condition key is precise, testable, and composable with Control Tower. The au.* profile is a genuinely strong residency primitive for the models it covers. Native on-demand calls for Nova, Titan, and Claude 3 Haiku are a clean “in-region” story with no profile machinery at all. And additionalEventData.inferenceRegion in CloudTrail gives you the forensic hook to prove after the fact that your traffic went where you said it would.

The real question worth asking is whether your team picks the model first and the sovereignty posture second, or the other way round. Most teams pick the model first, get six months into the build, and then discover on the day of their first audit that the profile they have been using routes to Tokyo. The cheaper fix is to pick the posture first and let it constrain the shortlist.

If you are running Bedrock from Sydney today and you cannot answer the question “which profile does our SDK actually use”, that is the first thing to find out on Monday morning.