Two ways to run ARK.
Pick how you buy, not how it runs.

ARK Cloud is self-serve, token-based, and public. Sign up with an email, get free credits, ship today. ARK Tailored and ARK Core are per-GPU licenses deployed on your own infrastructure — consultative, scoped to your stack, quote-based. Same runtime underneath. Different commercial surface.

ARK RUNTIME POST /v1/chat Bearer sk- ARK CLOUD · API PAY PER TOKEN $1 = 1M credits YOUR GPU RACK H100 A100 L40S 3090 MI300 PER-GPU LICENSE Quote on request EITHER
☁ ARK CLOUD · SERVERLESS

Building or experimenting.

Fastest way to run frontier open-source models behind an OpenAI-compatible API. Hosted in the EU. Sign up with an email, get free credits, ship today.

Pricing model
Pay per token
Commit
None — credits never expire
Starting at
Free credits, no card
Sales contact
Not required
  • OpenAI v1 / Anthropic-compatible API
  • 10+ frontier open models (text, vision, code, reasoning)
  • EU-only data residency, zero retention by default
  • Credits never expire, no auto-renew, no hidden fees
  • Best-effort 99% availability on shared endpoints
⚙ ARK TAILORED & ARK CORE · YOUR INFRASTRUCTURE

Deploying on your own GPUs.

The same ARK runtime, licensed per GPU and deployed into your environment — cloud, on-prem, or hybrid. Full platform (Tailored) or core runtime only (Core), your call. Data never leaves your infrastructure.

Pricing model
Per-GPU license + add-ons
Billing
Monthly or annual · multi-year available
Starting at
Quote on request
Sales contact
Required today
  • Per-GPU license — Tailored (full platform) or Core (runtime only)
  • Any GPU mix, any generation — heterogeneous by design
  • Standard Ethernet — no NVLink or InfiniBand required
  • Add-ons for modalities, extended model catalogue, installation
  • Software Support SLA, tier scales with GPU license count
A 60-second decision helper.

Three quick questions. The honest answer points to the right tier.

1

Does the data need to stay on your infrastructure?

Yes — regulated industry, customer data, GDPR, internal IP. Your infrastructure, your rules, your control.

No — public-facing or non-sensitive workloads, EU-hosted is enough.

Yes → Enterprise
No → Cloud
2

Do you already own or lease GPUs at meaningful volume?

Yes — GPU cost is a balance-sheet item. The economics flip toward a per-GPU license and running the engine yourself.

No — you're paying per API call. Cloud is cheaper until you're running inference at scale.

Yes → Enterprise
No → Cloud
3

Who is going to operate the deployment day-to-day?

Our platform / ML / infra team — mature IAM, logging, monitoring already in place. Bolt the ARK runtime into what you run. ARK Core.

We'd rather ARK deliver the stack — configured, deployed, and handed over turnkey. ARK Tailored.

Core → runtime only
Tailored → full platform
🚢 On the roadmap: ARK Core as a signed self-service installer — no ARK engineers in the loop. Until then, self-deploy with documentation or an optional guidance package.
Pricing — the honest answers.
Can I start on Cloud and move to our own GPUs later?
Yes. It's the same runtime. Same OpenAI v1-compatible API. The model catalogue overlaps heavily. Most of our enterprise customers started with a Cloud trial, validated performance, then moved the production workload to their own infrastructure under ARK Tailored or ARK Core.
Why don't you publish a per-GPU price for Tailored and Core?
Because it's not a transaction. Deployments differ by modality mix, model catalogue size, support tier, integration work, and regional context. Publishing a single number would either underquote one customer or overquote another. We'd rather have a 30-minute discovery call and give you a real number.
What counts as "a GPU" in the per-GPU license?
Any physical accelerator attached to a Compute Node and used to serve inference under ARK. We don't meter per vGPU, per MIG partition, or per tensor core — just the physical units you put under ARK's control. Mixed generations (3060 to H100) count as one GPU each.
Do the per-GPU numbers include modalities beyond text?
Text generation is always included. Additional modalities (image, vision, embeddings, speech) are modular add-ons at a fixed per-GPU per-month rate. See the Enterprise pricing page for the framework.
Is installation and configuration included in the license?
The license covers the software. ARK Core can be self-deployed from documentation at no additional fee, with an optional Install Guidance package if you want a light touch. ARK Tailored is delivered by ARK engineers under a Standard or Full Managed Deployment package. You only pay for the level of hand-holding you actually need.
Will ARK Core ever be fully self-service?
Yes — that's the roadmap. ARK Core is moving toward a signed installer that provisions the runtime, registers a license key, and activates against a per-GPU count with no ARK engineers in the loop. Until it ships, Core is available today with documentation-only self-deploy or an optional guidance package.
What does the SLA cover?
On Cloud, ARK operates the entire stack and is accountable for availability and latency on shared endpoints. On Tailored and Core, ARK's SLA covers software defect response times and resolution on ARK components. Infrastructure uptime is the customer's (or their infrastructure partner's) responsibility. Support tier is determined by GPU license count and is the same software, same update cycle — no feature gating.