Solutions Overview

One stack for hosting, chat, and billing.

Use the hosted Lightning server for cloud generation, the subscription API for day-to-day chatting, and the P2G API when you need stable, credit-metered production integrations.

The services behind the InferencePort AI stack.

Each part of the platform has a specific job: hosting serves the backend, the generation API powers the default chat experience, the P2G API handles metered production workloads, and the console ties billing together.

Generation API
Subscription chat
Best for high-volume, low-token chatting. This is the default chat path, but it uses strict plan quotas and abuse controls.
  • Ideal for everyday chatbot traffic
  • Subscription-backed limits
  • OpenAI-style chat endpoint
P2G API
Credit-billed production API
Best for enterprise integrations and stable production services. Requests charge credits from your wallet instead of consuming plan quotas.
  • Predictable wallet balance
  • Separate credit ledger
  • Managed through the console
Chat Services
Local + cloud chat
Use local chat for offline work and the cloud chat stack when you need hosted access, sync, or shared usage across devices.
  • Local chat for private work
  • Cloud chat for hosted access
  • Optional sync and media storage

Subscription generation for chatty workloads, P2G for production.

Use case Best fit Why
Regular chat UI, assistant, or day-to-day usage Generation API Default chat path, optimized for regular chatting and low token usage, but bounded by plan limits.
Enterprise integration or customer-facing backend P2G API Credit-metered calls with a separate wallet and ledger, which is easier to treat as a production service.
Need account management, balances, or API keys Console Dashboard Use the console to view billing, buy packs, create API keys, and review usage in one place.

Two billing models. Two different jobs.

The subscription generation API is bundled with plan quotas, while P2G is charged against credits in your wallet. The console always shows the live pricing and usage values.

P2G (Credits)
Best for production APIs
Credit pack billing
Current default server rates are credit-based and visible in the console. The repository defaults are 0.75 credits per million text tokens, 0.02 per image, 0.01 per video second, and 0.01 per audio second.
  • Separate wallet and ledger
  • Recharge by purchasing credit packs
  • Recommended for enterprise and production use

Start with the right path, then scale from there.

Step 1
Sign in and inspect your account
Open the console to view your wallet, current plan, usage history, and API keys.
Step 2
Choose the API that matches your workload
Use the subscription Generation API for day-to-day chatting and the P2G API for stable production services.
Step 3
Ship with the right billing model
Keep chat traffic on subscription quotas and move production or enterprise traffic to the credit-based P2G API.