Solutions Overview

One stack for hosting, chat, and billing.

Use the hosted Lightning server for cloud generation, the subscription API for day-to-day chatting, and the P2G API when you need stable, credit-metered production integrations.

Open Console Read Docs

What We Run

The services behind the InferencePort AI stack.

Each part of the platform has a specific job: hosting serves the backend, the generation API powers the default chat experience, the P2G API handles metered production workloads, and the console ties billing together.

Server Hosting

Lightning backend

Hosted on sharktide-lightning.hf.space. It serves the public API, wallet logic, subscription resolution, and the Stripe reconciliation flow.

Public config and model discovery
Subscription and usage endpoints
Stripe webhook and reconciliation support

Generation API

Subscription chat

Best for high-volume, low-token chatting. This is the default chat path, but it uses strict plan quotas and abuse controls.

Ideal for everyday chatbot traffic
Subscription-backed limits
OpenAI-style chat endpoint

P2G API

Credit-billed production API

Best for enterprise integrations and stable production services. Requests charge credits from your wallet instead of consuming plan quotas.

Predictable wallet balance
Separate credit ledger
Managed through the console

Chat Services

Local + cloud chat

Use local chat for offline work and the cloud chat stack when you need hosted access, sync, or shared usage across devices.

Local chat for private work
Cloud chat for hosted access
Optional sync and media storage

Pick The Right API

Subscription generation for chatty workloads, P2G for production.

Use case	Best fit	Why
Regular chat UI, assistant, or day-to-day usage	Generation API	Default chat path, optimized for regular chatting and low token usage, but bounded by plan limits.
Enterprise integration or customer-facing backend	P2G API	Credit-metered calls with a separate wallet and ledger, which is easier to treat as a production service.
Need account management, balances, or API keys	Console Dashboard	Use the console to view billing, buy packs, create API keys, and review usage in one place.

Pricing

Two billing models. Two different jobs.

The subscription generation API is bundled with plan quotas, while P2G is charged against credits in your wallet. The console always shows the live pricing and usage values.

Generation (Subscription)

Best for regular chatting

Plan quotas

Included with your subscription tier. Good for high-volume, low-token chat traffic, but bounded by strict abuse and token limits.

Daily cloud-chat quota by plan
Additional image, video, and audio limits
Default chat experience in the app

P2G (Credits)

Best for production APIs

Credit pack billing

Current default server rates are credit-based and visible in the console. The repository defaults are 0.75 credits per million text tokens, 0.02 per image, 0.01 per video second, and 0.01 per audio second.

Separate wallet and ledger
Recharge by purchasing credit packs
Recommended for enterprise and production use

Getting Started

Start with the right path, then scale from there.

Step 1

Open the console to view your wallet, current plan, usage history, and API keys.

Step 2

Choose the API that matches your workload

Use the subscription Generation API for day-to-day chatting and the P2G API for stable production services.

Step 3

Ship with the right billing model

Keep chat traffic on subscription quotas and move production or enterprise traffic to the credit-based P2G API.