Haufe.ai | Overview

Copilot via API gives you programmatic access to a specialized AI agent. You choose a copilot (for example, Tax or HR) by providing an assistant_id, then interact with it in one of two modes:

Mode	Description
Threads	Create a thread, exchange messages, and let the platform manage conversation history.
Chat Completions	Send a request, get a response. No server-side state.

Both modes invoke the same underlying agent with the same capabilities. The only difference is how conversation state is managed.

Choosing a copilot

Every request requires an assistant_id that determines which copilot handles your query. Each copilot is scoped to a specific domain and configured with the relevant tools and knowledge sources.

Your available copilots and their assistant_id values are provided during onboarding.
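One way to keep these values out of application code is to read them from configuration. The sketch below assumes environment variables named HAUFE_ASSISTANT_ID_<DOMAIN>; both that naming scheme and the assistant_id_for helper are illustrative conventions, not part of the API.

```python
import os
from typing import Mapping


def assistant_id_for(domain: str, env: Mapping[str, str] = os.environ) -> str:
    """Look up the assistant_id for a copilot domain such as "tax" or "hr".

    The HAUFE_ASSISTANT_ID_<DOMAIN> variable names are a hypothetical
    convention chosen for this example.
    """
    key = f"HAUFE_ASSISTANT_ID_{domain.strip().upper()}"
    value = env.get(key, "").strip()
    if not value:
        # Fail early: every request needs an assistant_id.
        raise KeyError(f"no assistant_id configured for {domain!r}; set {key}")
    return value
```

Passing a plain dict as env makes the lookup easy to exercise in tests without touching the real environment.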

Threads (endpoints)

Use this mode when you want to build a multi-turn conversation where context accumulates over time. The platform stores the conversation history for you.

Create a thread once (step 1), then repeat steps 2 and 3 for each turn of the conversation:

1. Create a thread

curl --request POST \
  --url https://api.haufe.ai/agents/v1/threads \
  --header 'content-type: application/json' \
  --header 'api-key: <API_KEY>' \
  --data '{
    "assistant_id": "<ASSISTANT_ID>",
    "user_id": "<USER_ID>",
    "title": "My conversation"
  }'

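import requests

# Create the thread once; the returned id is reused in steps 2 and 3.
response = requests.post(
    "https://api.haufe.ai/agents/v1/threads",
    headers={"api-key": "<API_KEY>"},
    json={
        "assistant_id": "<ASSISTANT_ID>",
        "user_id": "<USER_ID>",
        "title": "My conversation",
    },
)
response.raise_for_status()  # surface HTTP errors before reading the body
thread_id = response.json()["id"]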

const response = await fetch("https://api.haufe.ai/agents/v1/threads", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    "api-key": "<API_KEY>",
  },
  body: JSON.stringify({
    assistant_id: "<ASSISTANT_ID>",
    user_id: "<USER_ID>",
    title: "My conversation",
  }),
});
const { id: threadId } = await response.json();

2. Add a message

curl --request POST \
  --url https://api.haufe.ai/agents/v1/threads/<THREAD_ID>/messages \
  --header 'content-type: application/json' \
  --header 'api-key: <API_KEY>' \
  --data '{
    "role": "user",
    "content": "What are the deadlines for filing a tax return?"
  }'

response = requests.post(
    f"https://api.haufe.ai/agents/v1/threads/{thread_id}/messages",
    headers={"api-key": "<API_KEY>"},
    json={
        "role": "user",
        "content": "What are the deadlines for filing a tax return?",
    },
)

await fetch(
  `https://api.haufe.ai/agents/v1/threads/${threadId}/messages`,
  {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "api-key": "<API_KEY>",
    },
    body: JSON.stringify({
      role: "user",
      content: "What are the deadlines for filing a tax return?",
    }),
  }
);

3. Generate a response

curl --request POST \
  --url https://api.haufe.ai/agents/v1/threads/<THREAD_ID>/run \
  --header 'content-type: application/json' \
  --header 'api-key: <API_KEY>' \
  --data '{
    "meta_data": {
      "user_data": {
        "licence": "<LICENSE_ID>"
      }
    }
  }'

response = requests.post(
    f"https://api.haufe.ai/agents/v1/threads/{thread_id}/run",
    headers={"api-key": "<API_KEY>"},
    json={
        "meta_data": {
            "user_data": {
                "licence": "<LICENSE_ID>",
            }
        }
    },
)
answer = response.json()

const response = await fetch(
  `https://api.haufe.ai/agents/v1/threads/${threadId}/run`,
  {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "api-key": "<API_KEY>",
    },
    body: JSON.stringify({
      meta_data: {
        user_data: {
          licence: "<LICENSE_ID>",
        },
      },
    }),
  }
);
const answer = await response.json();

Repeat steps 2–3 to continue the conversation. The thread persists between requests, so each run has access to the full conversation history. The platform handles context truncation automatically when conversations grow long.
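The three steps can be wrapped in a small helper so that each turn is a single call. The following is a minimal sketch, not an official client: the ThreadConversation class, the injectable post function, and the 60-second timeout are assumptions made for illustration, and the response of /run is returned as-is because its schema is not described in this section.

```python
from typing import Any, Callable, Dict

BASE_URL = "https://api.haufe.ai/agents/v1"

# Transport signature: (url, headers, json_body) -> parsed JSON response.
PostFn = Callable[[str, Dict[str, str], Dict[str, Any]], Any]


def requests_post(url: str, headers: Dict[str, str], body: Dict[str, Any]) -> Any:
    """Default transport; needs the third-party `requests` package."""
    import requests  # imported lazily so a fake transport can be used in tests

    response = requests.post(url, headers=headers, json=body, timeout=60)
    response.raise_for_status()
    return response.json()


class ThreadConversation:
    """Hypothetical wrapper around the create / add message / run loop."""

    def __init__(self, api_key: str, assistant_id: str, user_id: str,
                 licence: str, post: PostFn = requests_post) -> None:
        self._headers = {"api-key": api_key}
        self._licence = licence
        self._post = post
        # Step 1: create the thread once and remember its id.
        created = post(
            f"{BASE_URL}/threads",
            self._headers,
            {"assistant_id": assistant_id, "user_id": user_id,
             "title": "My conversation"},
        )
        self.thread_id = created["id"]

    def ask(self, content: str) -> Any:
        """Run one turn: add the user message (step 2), then run (step 3)."""
        thread_url = f"{BASE_URL}/threads/{self.thread_id}"
        self._post(f"{thread_url}/messages", self._headers,
                   {"role": "user", "content": content})
        return self._post(
            f"{thread_url}/run",
            self._headers,
            {"meta_data": {"user_data": {"licence": self._licence}}},
        )
```

Typical use would be conv = ThreadConversation("<API_KEY>", "<ASSISTANT_ID>", "<USER_ID>", "<LICENSE_ID>") followed by conv.ask(...) once per user turn. Injecting post keeps the loop testable without network access.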

Allowed Licenses

When calling /run, you must provide a valid license ID in the meta_data.user_data.licence field (note the spelling "licence" in the field name). The Chat Completions examples pass the same field. Your license ID is provided by your Haufe contact person.

An invalid or mismatched license ID typically does not raise an error but may degrade response quality, so a wrong value can go unnoticed.
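Because a wrong license ID is not reported as an error, it can help to reject obviously bad values before sending a request. The sketch below is an assumption-laden local check only: build_run_body is a hypothetical helper, and it can catch an empty value or a leftover <LICENSE_ID> placeholder, not an ID that is well-formed but belongs to a different license.

```python
from typing import Any, Dict


def build_run_body(licence: str) -> Dict[str, Any]:
    """Build the request body for /run, rejecting empty or placeholder IDs.

    This is a client-side sanity check; the server remains the only
    authority on whether the license ID is actually valid.
    """
    cleaned = licence.strip()
    is_placeholder = cleaned.startswith("<") and cleaned.endswith(">")
    if not cleaned or is_placeholder:
        raise ValueError("licence must be a real license ID, not empty or a placeholder")
    # Note the field name uses the spelling "licence".
    return {"meta_data": {"user_data": {"licence": cleaned}}}
```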

Chat Completions (endpoints)

Use this mode when you need a one-off answer or want to integrate the copilot as a tool within your own application.

You provide the full list of messages in each request. Each message has a role (user, system, or assistant) and a content field. The API does not store anything between requests.

curl --request POST \
  --url https://api.haufe.ai/agents/v1/chat/completions \
  --header 'content-type: application/json' \
  --header 'api-key: <API_KEY>' \
  --data '{
    "assistant_id": "<ASSISTANT_ID>",
    "messages": [
      {
        "role": "user",
        "content": "What are the deadlines for filing a tax return?"
      }
    ],
    "meta_data": {
      "user_data": {
        "licence": "<LICENSE_ID>"
      }
    }
  }'

import requests

response = requests.post(
    "https://api.haufe.ai/agents/v1/chat/completions",
    headers={"api-key": "<API_KEY>"},
    json={
        "assistant_id": "<ASSISTANT_ID>",
        "messages": [
            {
                "role": "user",
                "content": "What are the deadlines for filing a tax return?",
            }
        ],
        "meta_data": {
            "user_data": {
                "licence": "<LICENSE_ID>",
            }
        },
    },
)
answer = response.json()

const response = await fetch(
  "https://api.haufe.ai/agents/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "api-key": "<API_KEY>",
    },
    body: JSON.stringify({
      assistant_id: "<ASSISTANT_ID>",
      messages: [
        {
          role: "user",
          content: "What are the deadlines for filing a tax return?",
        },
      ],
      meta_data: {
        user_data: {
          licence: "<LICENSE_ID>",
        },
      },
    }),
  }
);
const answer = await response.json();

You can still have multi-turn conversations with Chat Completions by including the full message history in every request. This gives you complete control over what context the copilot sees.
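Keeping the history on the client side can be sketched as follows. The helper names (build_chat_request, chat_turn) are hypothetical, send stands for whatever function POSTs the body to /chat/completions and returns the parsed JSON, and extract_text is supplied by the caller because the response schema is not described in this section.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def build_chat_request(assistant_id: str, licence: str,
                       messages: List[Message]) -> Dict[str, Any]:
    """Assemble the request body shown in the examples above."""
    return {
        "assistant_id": assistant_id,
        "messages": messages,
        "meta_data": {"user_data": {"licence": licence}},
    }


def chat_turn(history: List[Message], user_text: str, *,
              assistant_id: str, licence: str,
              send: Callable[[Dict[str, Any]], Any],
              extract_text: Callable[[Any], str]) -> str:
    """Send the full history plus one new user message; return the reply.

    The history list is only updated after the request succeeds, so a
    failed call does not leave a dangling user message behind.
    """
    messages = history + [{"role": "user", "content": user_text}]
    raw = send(build_chat_request(assistant_id, licence, messages))
    reply = extract_text(raw)
    # Commit both sides of the turn so the next request carries them.
    history[:] = messages + [{"role": "assistant", "content": reply}]
    return reply
```

Because the caller owns history, it can also trim or edit earlier messages before the next turn, which is the control over context that this mode offers.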

Which mode should I use?

Building a chat interface where users have ongoing conversations? → Threads
Calling the copilot as a tool from your own backend or workflow? → Chat Completions
Need full control over what context is sent with each request? → Chat Completions
Don't want to manage conversation history yourself? → Threads
