
Delegated inference

Delegate inference to a remote peer out of the box via the Holepunch stack, so that devices can share compute resources.

Overview

Delegated inference lets a consumer send inference requests to a remote provider peer-to-peer over the Hyperswarm DHT. Use it when an inference requires more resources than the local device can provide.

Connectivity is direct: the consumer opens a dht.connect(providerPublicKey) connection straight to the provider — there is no topic or discovery phase. The provider exposes itself on the DHT via its keypair, and consumers connect by public key.

Delegation is configured at model-load time by passing a delegate object to loadModel(). The provider is started separately using startQVACProvider().

Functions

Provider:

  1. startQVACProvider() — bind the DHT server on the provider's keypair and start accepting delegated requests
  2. stopQVACProvider() — stop accepting requests

Consumer:

  1. loadModel() — with delegate option
  2. completion() / transcribe() / translate() / etc. — same as local
  3. unloadModel()

For how to use each function, see SDK — API reference.

Provider

Binds the DHT server on its keypair and serves delegated requests. It publishes its public key; consumers use that key to connect directly.

Consumer

Creates a delegated model via loadModel({ delegate: ... }). The main delegate options are:

  • providerPublicKey: provider public key (required)
  • timeout: request timeout in ms (optional). Use a generous value (e.g. 60_000) on the first call — cold-DHT bootstrap can take 15–45s. Subsequent calls reuse the open socket and are sub-second.
  • fallbackToLocal: if true, run locally when delegation fails (optional)
  • forceNewConnection: if true, do not reuse cached connections (optional)
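
The options above can be assembled and sanity-checked before they are handed to loadModel(). The following is a minimal sketch, not SDK code: buildDelegateOptions is a hypothetical helper, and the check that a provider public key is a 64-character hex string is an assumption made for this sketch. Only the four option names come from the list above.

```javascript
// Hypothetical helper (not part of @qvac/sdk): builds the `delegate` object
// passed to loadModel(). Only the option names come from the list above.

// Assumption: a provider public key is a 64-character hex string.
const HEX_KEY = /^[0-9a-f]{64}$/i;

// Generous default for the first call on a cold DHT (see `timeout` above).
const COLD_START_TIMEOUT_MS = 60_000;

function buildDelegateOptions(providerPublicKey, overrides = {}) {
  if (typeof providerPublicKey !== "string" || !HEX_KEY.test(providerPublicKey)) {
    throw new TypeError("providerPublicKey must be a 64-character hex string");
  }

  const { timeout = COLD_START_TIMEOUT_MS, fallbackToLocal, forceNewConnection } = overrides;
  if (!Number.isInteger(timeout) || timeout <= 0) {
    throw new RangeError("timeout must be a positive integer (milliseconds)");
  }

  // Optional flags are only included when the caller sets them, so the
  // SDK's own defaults apply otherwise.
  const delegate = { providerPublicKey, timeout };
  if (fallbackToLocal !== undefined) delegate.fallbackToLocal = Boolean(fallbackToLocal);
  if (forceNewConnection !== undefined) delegate.forceNewConnection = Boolean(forceNewConnection);
  return delegate;
}
```

The result can then be passed as the delegate property, for example loadModel({ modelSrc, modelType: "llm", delegate: buildDelegateOptions(providerPublicKey, { fallbackToLocal: true }) }).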

Examples

Consumer

The following script shows an example of a consumer that delegates completion() requests to a provider:

delegated-inference-consumer.js
import { completion, LLAMA_3_2_1B_INST_Q4_0, loadModel, close, } from "@qvac/sdk";
const providerPublicKey = process.argv[2];
if (!providerPublicKey) {
    console.error("❌ Provider public key is required. Usage: node delegated-inference-consumer.js <provider-public-key> [consumer-seed]");
    process.exit(1);
}
try {
    // Optional: Consumer seed for deterministic consumer identity (for firewall testing)
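    const consumerSeed = process.argv[3];
    if (consumerSeed) {
        // Only set the variable when a seed was given: assigning undefined to
        // process.env would store the literal string "undefined".
        process.env["QVAC_HYPERSWARM_SEED"] = consumerSeed;
    }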
    console.log(`🚀 Testing delegated inference`);
    console.log(`🔑 Provider: ${providerPublicKey}`);
    if (consumerSeed) {
        console.log(`🔑 Consumer seed: ${consumerSeed.substring(0, 16)}... (deterministic identity)`);
    }
    else {
        console.log(`🎲 No consumer seed provided (random identity)`);
    }
    const modelId = await loadModel({
        modelSrc: LLAMA_3_2_1B_INST_Q4_0,
        modelType: "llm",
        delegate: {
            providerPublicKey,
            // Generous timeout for the first call on a cold DHT: bootstrapping
            // hyperdht and looking up the provider's key can take 15–45s on the
            // very first run. Subsequent connections in the same process are
            // sub-second because the DHT is already warm.
            timeout: 60_000,
            fallbackToLocal: true, // Optional: Fall back to local inference if delegation fails
            // forceNewConnection: true, // Optional: Force a new connection instead of reusing cached one
        },
        onProgress: (progress) => {
            console.log(`📊 Download progress: ${progress.percentage.toFixed(1)}% (${progress.downloaded}/${progress.total} bytes)`);
        },
    });
    console.log(`✅ Delegated model registered: ${modelId}`);
    const response = completion({
        modelId,
        history: [{ role: "user", content: "Hello!" }],
        stream: true,
    });
    for await (const token of response.tokenStream) {
        console.log(`📨 Response: ${token}`);
    }
    console.log("🔍 Stats:", await response.stats);
    console.log("\n🎯 Delegation infrastructure working! Server correctly detected and routed the delegated request.");
    void close();
}
catch (error) {
    console.error("❌ Error:", error);
    process.exit(1);
}

Provider

The following script shows an example of starting a provider and printing its publicKey for consumers:

delegated-inference-provider.js
import { startQVACProvider } from "@qvac/sdk";
// Optional: Seed for deterministic provider identity (64-character hex string)
const seed = process.argv[2];
if (seed) {
    process.env["QVAC_HYPERSWARM_SEED"] = seed;
}
// Optional: Consumer public key for firewall (allow only this consumer)
const allowedConsumerPublicKey = process.argv[3];
console.log(`🚀 Starting provider service...`);
try {
    if (allowedConsumerPublicKey) {
        console.log(`🔒 Firewall enabled: only allowing consumer ${allowedConsumerPublicKey}`);
    }
    const response = await startQVACProvider({
        firewall: allowedConsumerPublicKey
            ? {
                mode: "allow",
                publicKeys: [allowedConsumerPublicKey],
            }
            : undefined,
    });
    console.log("✅ Provider service started successfully!");
    console.log("🔗 Provider is now available for delegated inference requests");
    console.log("");
    console.log("📋 Connection Details:");
    console.log(`   🆔 Provider Public Key: ${response.publicKey}`);
    console.log("");
    console.log("💡 Consumer command:");
    console.log(`   node delegated-inference-consumer.js ${response.publicKey}`);
    console.log("");
    console.log("💡 To reproduce this provider identity:");
    console.log(`   node delegated-inference-provider.js ${seed || "<random-seed>"}`);
    if (!seed) {
        console.log("   (Note: seed was random this time, set one for reproducible identity)");
    }
    console.log("");
    console.log("🔒 For firewall testing:");
    console.log("   1. Generate a consumer seed (64-char hex)");
    console.log("   2. Get consumer public key: getConsumerPublicKey(consumerSeed)");
    console.log("   3. Restart provider with consumer public key as 2nd argument");
    console.log(`   4. Run consumer with: node delegated-inference-consumer.js ${response.publicKey} <consumer-seed>`);
    console.log("📡 Provider is running... Press Ctrl+C to stop");
    process.on("SIGINT", () => {
        console.log("\n🛑 Provider service stopped");
        process.exit(0);
    });
    process.stdin.resume();
}
catch (error) {
    console.error("❌ Error:", error);
    process.exit(1);
}

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.

Notes

  • Consumers do not handle reconnection automatically yet. If the provider restarts, restart the consumer.
  • To stop a running provider, call stopQVACProvider().
  • When starting the provider, you can optionally set a firewall rule to allow/deny specific consumer public keys.
  • Cold-start DHT bootstrap on the first connect can take 15–45s; subsequent connections in the same process are sub-second.
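
For transient failures such as a cold-start timeout, the load step can be wrapped in a small retry helper. The sketch below is a generic fixed-delay retry. withRetry and its options are hypothetical names, not part of @qvac/sdk. Whether retrying is enough to recover after a provider restart is not confirmed by this page, so restarting the consumer remains the documented approach in that case.

```javascript
// Hypothetical helper (not part of @qvac/sdk): retry an async operation a
// fixed number of times, waiting `delayMs` between attempts.
async function withRetry(operation, { attempts = 3, delayMs = 1_000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Pass the attempt number so the caller can adjust options per attempt.
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

With the consumer script above, the operation could be (attempt) => loadModel({ modelSrc, modelType: "llm", delegate: { providerPublicKey, timeout: 60_000, forceNewConnection: attempt > 1 } }), so that retries do not reuse a cached connection.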
