
AI Connections

AI Connections represent connections to specific AI providers with their credentials and capacity configuration. This guide covers everything you need to know about creating and managing AI Connections.

What is an AI Connection?

An AI Connection encapsulates:

  • Provider: One of the seven supported providers — OpenAI, Anthropic, Google Gemini, Groq, Perplexity, AWS Bedrock (Converse), AWS Bedrock-Invoke (Anthropic on AWS).
  • Credentials: Encrypted API key or AWS IAM role.
  • Capacity: Custom capacity limits (e.g., 100 RPM, 100,000 TPM).
  • Discovered Capacity: Automatically discovered rate limits from the provider.

See the LLM Providers index for a side-by-side capability matrix and per-provider deep dives.

Creating an AI Connection

  1. Navigate to AI Connections in the UI
  2. Click Create Connection
  3. Fill in the connection details:
    • Name: A descriptive name for the connection
    • Description: Optional description
    • Provider: Select the AI provider
    • Configuration: Provider-specific configuration (API keys, region, etc.)
    • Capacity: Define capacity limits (optional)
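Putting those fields together, a complete connection definition might look like the following sketch. The provider, config, and capacity shapes match the examples later in this guide; the top-level name and description key names are assumed from the field list above.

```json
{
  "name": "openai-prod",
  "description": "Production OpenAI account",
  "provider": "openai",
  "config": {
    "apiKey": "sk-..."
  },
  "capacity": [
    {
      "period": "minute",
      "requests": 100,
      "tokens": 100000
    }
  ]
}
```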

You can also use the quick add feature for faster connection setup:

AI Connection Quick Add

Provider-Specific Configuration

OpenAI

{
  "provider": "openai",
  "config": {
    "apiKey": "sk-..."
  }
}

Anthropic

{
  "provider": "anthropic",
  "config": {
    "apiKey": "sk-ant-..."
  }
}

Google Gemini

The provider id is gemini (not google).

{
  "provider": "gemini",
  "config": {
    "apiKey": "AIza..."
  }
}

Groq

{
  "provider": "groq",
  "config": {
    "apiKey": "gsk_..."
  }
}

Perplexity

{
  "provider": "perplexity",
  "config": {
    "apiKey": "pplx-..."
  }
}

AWS Bedrock (Converse)

The unified Bedrock Converse API supports every Bedrock foundation model (Claude, Llama, Mistral, Nova, and more), at the cost of losing Anthropic-only features in conversion. AWS Bedrock uses IAM role assumption rather than an API key:

{
  "provider": "aws-bedrock",
  "config": {
    "region": "us-east-1",
    "iamRoleArn": "arn:aws:iam::123456789012:role/vm-x-ai-bedrock-role",
    "performanceConfig": {
      "latency": "optimized"
    },
    "guardrailConfig": {
      "guardrailIdentifier": "abc123guardrail",
      "guardrailVersion": "DRAFT",
      "trace": "enabled"
    }
  }
}

Both performanceConfig and guardrailConfig are optional. performanceConfig.latency accepts 'standard' or 'optimized' and applies to every Converse call made through the connection. guardrailConfig attaches a Bedrock Guardrail (by ID or full ARN) to every inference call; trace defaults to 'enabled' so that guardrail assessments are recorded in the audit trail.

AWS Bedrock-Invoke (Anthropic on AWS)

Same IAM-role auth as aws-bedrock — including the optional performanceConfig and guardrailConfig blocks shown above — but the wire shape is the full Anthropic Messages API via Bedrock's InvokeModel. Use this when running Claude on AWS and you need Anthropic-only features (cache_control, extended thinking, server tools) preserved end-to-end. See the AWS Bedrock-Invoke provider page for the full feature matrix.

{
  "provider": "aws-bedrock-invoke",
  "config": {
    "region": "us-east-1",
    "iamRoleArn": "arn:aws:iam::123456789012:role/vm-x-ai-bedrock-role"
  }
}

IAM Role Setup (Bedrock providers)

Both Bedrock providers share the same IAM-role setup:

  1. Create an IAM role in your AWS account with Bedrock permissions.
  2. Configure the role's trust policy to allow VM-X AI to assume it.
  3. Use the role ARN in the connection configuration.

A CloudFormation template is available in the repository at packages/api/assets/aws/cfn/bedrock-iam-role.yaml; the same template works for both providers.
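For orientation, a trust policy that lets another party assume the role follows the standard sts:AssumeRole shape sketched below. The principal ARN is a placeholder; the CloudFormation template in the repository is the authoritative source for the exact principal and any conditions.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<VM-X-AI-ACCOUNT-ID>:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```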

Capacity Configuration

Capacity limits control how many requests and tokens can be used within a time period.

Capacity Periods

Supported periods (all values lowercase on the wire):

  • minute: Requests/tokens per minute
  • hour: Requests/tokens per hour
  • day: Requests/tokens per day
  • week: Requests/tokens per week
  • month: Requests/tokens per month
  • lifetime: Cumulative cap with no rolling window

Each capacity entry can also carry an optional enabled flag and a dimension (currently only source-ip) so the limit applies per-source-IP instead of globally.

Example Configuration

{
  "capacity": [
    {
      "period": "minute",
      "requests": 100,
      "tokens": 100000
    },
    {
      "period": "hour",
      "requests": 5000,
      "tokens": 5000000
    },
    {
      "period": "day",
      "requests": 100000,
      "tokens": 100000000
    }
  ]
}
```
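A capacity entry can also use the optional enabled flag and source-ip dimension described above, so the limit applies per source IP rather than globally. A sketch, with the key names assumed from that description:

```json
{
  "capacity": [
    {
      "period": "minute",
      "requests": 10,
      "tokens": 20000,
      "enabled": true,
      "dimension": "source-ip"
    }
  ]
}
```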

Capacity Enforcement

Capacity is enforced at the connection level. When a request exceeds capacity:

  • The request is rejected with a 429 Too Many Requests status
  • An error message indicates which limit was exceeded
  • The client should retry after the rate limit window resets
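A client can implement the retry behavior above by honoring a Retry-After header when one is present and falling back to capped exponential backoff otherwise. This is a client-side sketch, not part of the VM-X AI API; the function name and header handling are illustrative.

```typescript
// Compute how long to wait before retrying after a 429 response.
// Prefers a Retry-After header (in seconds) when the server sends one;
// otherwise uses exponential backoff: 1s, 2s, 4s, ... capped at 60s.
function retryDelayMs(attempt: number, retryAfterHeader?: string): number {
  if (retryAfterHeader !== undefined) {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
  }
  return Math.min(1000 * 2 ** attempt, 60_000);
}
```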

Discovered Capacity

VM-X AI automatically discovers rate limits from provider responses:

  • X-RateLimit-Limit-Requests: Maximum requests per window
  • X-RateLimit-Limit-Tokens: Maximum tokens per window
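Reading these two headers from a provider response amounts to a case-insensitive lookup and a numeric parse. A minimal sketch (the interface and function names are illustrative, not VM-X AI internals):

```typescript
// Discovered limits from a provider response; undefined means the
// provider did not report that limit.
interface DiscoveredCapacity {
  requestsPerWindow?: number;
  tokensPerWindow?: number;
}

// Parse X-RateLimit-Limit-Requests / X-RateLimit-Limit-Tokens from a
// header map whose keys are lowercase (as most HTTP clients normalize).
function parseDiscoveredCapacity(headers: Record<string, string>): DiscoveredCapacity {
  const read = (name: string): number | undefined => {
    const raw = headers[name.toLowerCase()];
    const value = Number(raw);
    return raw !== undefined && Number.isFinite(value) ? value : undefined;
  };
  return {
    requestsPerWindow: read("X-RateLimit-Limit-Requests"),
    tokensPerWindow: read("X-RateLimit-Limit-Tokens"),
  };
}
```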

Discovered capacity is stored in the connection and can be viewed in the UI. This helps you:

  • Understand actual provider limits
  • Optimize your capacity configuration
  • Monitor provider rate limit changes

Credential Security

Encryption

Credentials are encrypted at rest using:

  • AWS KMS: For production environments (recommended)
  • Libsodium: For local development and small deployments

Credential Storage

  • Credentials are stored encrypted in PostgreSQL
  • Decryption happens in-memory only
  • Credentials are never exposed in:
    • API responses
    • Logs
    • Error messages

Credential Rotation

To rotate credentials:

  1. Update the connection configuration with the new credentials
  2. The stored credentials are replaced immediately
  3. No downtime is required: in-flight requests complete with the old credentials, and all new requests use the new ones

Best Practices

1. One Connection Per Provider Account

Create separate connections for:

  • Different provider accounts
  • Different regions (for AWS Bedrock)
  • Different environments (development, staging, production)

2. Set Realistic Capacity

Base capacity limits on:

  • Provider quotas
  • Your usage patterns
  • Cost considerations

Monitor discovered capacity to understand actual provider limits.

3. Monitor Usage

Regularly review:

  • Capacity utilization
  • Discovered capacity changes
  • Error rates

4. Secure Credentials

  • Use AWS KMS for production
  • Rotate credentials regularly
  • Never commit credentials to version control
  • Use least-privilege access for AWS KMS keys

Updating an AI Connection

  1. Navigate to the connection
  2. Click Edit
  3. Update the desired fields
  4. Click Save

Viewing Connection Details

Navigate to AI Connections and click on a connection to view:

  • Connection details
  • Capacity configuration
  • Discovered capacity
  • Usage statistics

Troubleshooting

Connection Not Working

  1. Verify Credentials: Ensure API keys are correct and valid
  2. Check Provider Status: Verify the provider service is operational
  3. Review Logs: Check API logs for error messages
  4. Test Connection: Use the provider's API directly to verify credentials

Capacity Limits Too Restrictive

  1. Review Capacity Configuration: Check if limits are too low
  2. Monitor Usage: Review actual usage patterns
  3. Adjust Limits: Increase capacity limits as needed
  4. Consider Prioritization: Use prioritization to allocate capacity fairly

Discovered Capacity Not Updating

  1. Make Requests: Discovered capacity is updated when requests are made
  2. Check Provider Headers: Verify provider returns rate limit headers
  3. Review Logs: Check for errors in capacity discovery