Quick Start

Lyumen is a free OpenAI-compatible LLM inference API. No account. No API key. Just send requests.

curl

bash
curl https://lyumen-api.okotto.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Python

python
import requests

response = requests.post(
    "https://lyumen-api.okotto.workers.dev/v1/chat/completions",
    json={
        "model": "gemini-3-flash",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)
print(response.json())

JavaScript

javascript
fetch("https://lyumen-api.okotto.workers.dev/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gemini-3-flash",
    messages: [{ role: "user", content: "Hello!" }]
  })
}).then(res => res.json()).then(console.log);

How It Works

Requests are proxied to inference backends, and every request and response is logged to a database. Logged data may be sold as AI training data. Transparent, no surprises.

POST /v1/chat/completions

Create a chat completion for the given messages.

Parameter   Type     Required  Description
model       string   yes       Model ID from /v1/models
messages    array    yes       OpenAI-format messages array
stream      boolean  no        Enable SSE streaming (default: false)
max_tokens  integer  no        Maximum number of tokens to generate
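With stream set to true, the response arrives as Server-Sent Events. As a minimal sketch, assuming Lyumen follows the standard OpenAI chunk format (each data: line carries a choices[0].delta fragment, and the stream ends with data: [DONE]):

```python
import json
import requests

def parse_sse_line(line: str):
    """Return the text fragment from one OpenAI-style SSE line, or None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(model: str, messages: list) -> None:
    """Print a streamed completion fragment by fragment."""
    resp = requests.post(
        "https://lyumen-api.okotto.workers.dev/v1/chat/completions",
        json={"model": model, "messages": messages, "stream": True},
        stream=True,
    )
    for line in resp.iter_lines(decode_unicode=True):
        text = parse_sse_line(line or "")
        if text:
            print(text, end="", flush=True)
```

The exact chunk shape is an assumption based on the OpenAI spec; if Lyumen's backends deviate from it, adjust parse_sse_line accordingly.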

Example Request

JSON
{
  "model": "gemini-3-flash",
  "messages": [
    {"role": "user", "content": "Hi"}
  ]
}

Example Response

JSON
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gemini-3-flash",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help?"
    },
    "finish_reason": "stop",
    "index": 0
  }]
}
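The assistant's text lives at choices[0].message.content. A small helper (illustrative, not part of any SDK) pulls it out of the parsed response shown above:

```python
def extract_reply(completion: dict) -> str:
    """Return the assistant message text from a chat.completion object."""
    return completion["choices"][0]["message"]["content"]

# The example response from the documentation, trimmed to the used fields.
example = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "choices": [{
        "message": {"role": "assistant", "content": "Hello! How can I help?"},
        "finish_reason": "stop",
        "index": 0,
    }],
}
print(extract_reply(example))  # Hello! How can I help?
```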

GET /v1/models

Retrieve a list of available models.

Response

JSON
{
  "data": [
    { "id": "gemini-3-flash", "object": "model" },
    { "id": "minimax-m2.7", "object": "model" }
  ]
}
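To pick a model programmatically, GET this endpoint and read the id fields. The helper below is a sketch that operates on the response shape shown above (the sample payload mirrors the documented response; fetch the live list with requests.get on the same URL):

```python
def model_ids(payload: dict) -> list:
    """Extract model IDs from a /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]

# Sample payload copied from the documented response.
sample = {
    "data": [
        {"id": "gemini-3-flash", "object": "model"},
        {"id": "minimax-m2.7", "object": "model"},
    ]
}
print(model_ids(sample))  # ['gemini-3-flash', 'minimax-m2.7']
```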

GET /v1/logs/count

Get the total number of requests processed by Lyumen.

Response

JSON
{ "count": 42 }

curl

Standard curl examples for Lyumen.

Non-streaming

bash
curl https://lyumen-api.okotto.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Streaming

bash
curl https://lyumen-api.okotto.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Python

You can use Lyumen with the standard requests library or the official OpenAI Python SDK.

Using Requests

python
import requests

url = "https://lyumen-api.okotto.workers.dev/v1/chat/completions"
data = {
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "How are you?"}]
}
response = requests.post(url, json=data)
print(response.json())

Using OpenAI SDK

python
from openai import OpenAI

client = OpenAI(
    base_url="https://lyumen-api.okotto.workers.dev/v1",
    api_key="lyumen" # Any string works
)

completion = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(completion.choices[0].message.content)

RooCode

Step-by-step configuration for RooCode:

  • Open RooCode settings
  • Set API Provider to OpenAI Compatible
  • Base URL: https://lyumen-api.okotto.workers.dev
  • API Key: any string, e.g. lyumen
  • Model: gemini-3-flash

VS Code / Continue

Add the following to your config.json for the Continue extension:

config.json
{
  "models": [
    {
      "title": "Lyumen Gemini",
      "provider": "openai",
      "model": "gemini-3-flash",
      "apiKey": "lyumen",
      "apiBase": "https://lyumen-api.okotto.workers.dev/v1"
    }
  ]
}

OpenWebUI

To use Lyumen with OpenWebUI:

  • Go to Admin Panel → Settings
  • Select Connections → OpenAI
  • Set the API Base URL to https://lyumen-api.okotto.workers.dev/v1
  • Set a dummy API Key (e.g., lyumen)