Limited-time free access for early AI teams

Serverless Inference

Access production-ready models through a clean API. Start in minutes with usage-based pricing and global low-latency routing.

Designed for teams shipping AI agents, copilots, and workflow automation.

API Usage

Call the Altus API in just two steps

1. Obtain API Key

Create a key in your console.

2. Chat API Call

Send your first inference request.

Request Example

curl --location 'https://api.altuscloud.ai/v1/chat/completions' \
  --header 'Authorization: Bearer your-api-key' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "altus-chat-v3",
    "messages": [
      {
        "role": "user",
        "content": "Hello, Altuscloud!"
      }
    ]
  }'
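The same call can be made from application code. Below is a minimal Python sketch that builds the request shown above; the endpoint, model name, and payload shape are taken directly from the curl example, and the commented send step uses only the standard library (the response shape is not assumed beyond being JSON):

```python
import json

# Endpoint and model from the request example above.
API_URL = "https://api.altuscloud.ai/v1/chat/completions"

def build_chat_request(api_key: str, prompt: str, model: str = "altus-chat-v3"):
    """Build the headers and JSON body for a chat completion request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, json.dumps(body)

# To actually send it (network call, shown commented out):
# import urllib.request
# headers, payload = build_chat_request("your-api-key", "Hello, Altuscloud!")
# req = urllib.request.Request(API_URL, data=payload.encode(), headers=headers)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Separating request construction from sending keeps the payload easy to inspect and test before any network traffic occurs.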

Advantages

Enterprise-grade AI inference with predictable performance

Full-Fledged API

Access chat, image, and embedding endpoints through one consistent API.

One-Click Access

Deploy models in minutes with defaults tuned for production reliability.

Low Latency

A low-latency global network serves requests near your users.

API Pricing

Transparent pricing with no hidden infrastructure fees.

Momentum

High-throughput serverless inference for growing teams.

Input: $0.45 / 1M tokens

Output: $1.20 / 1M tokens

  • Autoscale to zero
  • Batch + streaming support
  • Regional failover
  • Usage analytics
  • Community support
Start Free

Pinnacle

Max performance tier with dedicated capacity and SLAs.

Most Powerful

Input: $0.75 / 1M tokens

Output: $2.10 / 1M tokens

  • Priority GPU pools
  • Dedicated routing lanes
  • Enterprise SLAs
  • Advanced observability
  • Designated success team
Start Free
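As a rough illustration of how usage-based pricing adds up, per-request cost is simply tokens divided by one million, times the per-million rate. The sketch below uses the Momentum and Pinnacle rates listed above; the token counts in the example are hypothetical:

```python
# Per-million-token rates (USD) from the pricing tiers above.
RATES = {
    "momentum": {"input": 0.45, "output": 1.20},
    "pinnacle": {"input": 0.75, "output": 2.10},
}

def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request on the given tier."""
    r = RATES[tier]
    cost = (input_tokens / 1_000_000) * r["input"] \
         + (output_tokens / 1_000_000) * r["output"]
    return round(cost, 6)

# e.g. a request with 2,000 input and 500 output tokens on Momentum:
# estimate_cost("momentum", 2000, 500) → 0.0015
```

At these rates, a million such requests on Momentum would cost about $1,500, which is the kind of back-of-envelope math the per-token pricing is meant to make easy.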