Tool Integrations¶
Execution surfaces¶
The harness talks to external services through plain REST APIs.
One API handles structured data: time series, domain signals, and reference information. Another handles execution in a sandboxed environment. Execution and data do not need to come from the same place. Separating them limits damage when one service has issues.
Both integrations follow the same credential pattern. The system checks an environment file first, then falls back to the OS keychain when the environment is thin (common in cron). Scripts do not fail silently. They print a detectable error string and exit nonzero so the caller knows auth failed and can act on it.
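A minimal sketch of that lookup order. The names here are illustrative, not the harness's real identifiers: `load_credential`, the `.env` path, and the `AUTH_FAILED:` marker are assumptions, and `os.environ` stands in for the keychain fallback.

```python
import os
import sys

def load_credential(name, env_file=".env"):
    """Resolve a credential: environment file first, then a fallback.

    In the real pattern the fallback would be an OS keychain call
    (e.g. `security find-generic-password` on macOS); the process
    environment stands in for it here.
    """
    # 1. Check the environment file (covers interactive shells).
    if os.path.exists(env_file):
        with open(env_file) as f:
            for line in f:
                line = line.strip()
                if line.startswith(f"{name}="):
                    return line.split("=", 1)[1]
    # 2. Fall back when the environment is thin (common in cron).
    value = os.environ.get(name)
    if value:
        return value
    # 3. Fail loudly: a detectable string plus a nonzero exit,
    #    so callers can grep for it and act.
    print(f"AUTH_FAILED: credential {name!r} not found", file=sys.stderr)
    sys.exit(1)
```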
Signal and data inputs¶
An external signal provider feeds one of the main automated workflows. The harness talks to it through its API with cookie-based authentication. Cookies need periodic refresh. That makes the provider useful but operationally fragile. It needs monitoring and re-auth when sessions expire.
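The monitoring half of that pattern can be sketched with stand-in callables, since the source names neither the provider nor the alert path. `probe_status` and `notify` are assumptions for illustration.

```python
def check_session(probe_status, notify):
    """Probe an authenticated endpoint; alert when the cookie has expired.

    probe_status: callable returning the HTTP status code of a cheap
    authenticated request against the provider.
    notify: callable that pushes an operator alert.
    Both are hypothetical stand-ins, not the harness's real interfaces.
    """
    status = probe_status()
    # 401/403 on a previously working session means the cookie lapsed.
    if status in (401, 403):
        notify(f"signal provider session expired (HTTP {status}); re-auth needed")
        return False
    return True
```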
Social media content is another input, but not a trusted one. The harness routes it through a local content filter before the rest of the system sees it. The filter runs on a sandboxed local model with no tools, so prompt injection in the content cannot escalate into actions.
Video transcripts fill a research role. The system pulls them for review work, then treats the text as external content with the same trust level as anything else from the internet.
Human interface¶
Telegram is the human-facing shell around the harness.
The system uses three bots with separate roles. One pushes one-way alerts. One carries two-way conversation with the Claude bridge. One handles gateway traffic. That split did not exist at the start. It arrived after the operator hit the limits of a single bot trying to handle alerts, chat, and agent messages at once.
A shared sender utility handles delivery for all notification paths. More than 30 scripts route through that one utility, which means formatting, credential lookup, and error handling live in one place instead of thirty.
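A sketch of what such a shared sender might look like against the Telegram Bot API. The `sendMessage` method and the 4096-character message cap are Telegram's; the function names and the split between building and sending the request are illustrative.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"

def build_send_request(token, chat_id, text):
    """Build a sendMessage request for the Telegram Bot API.

    Kept separate from delivery so formatting rules (like the
    4096-character truncation) live in one place for every caller.
    """
    payload = {"chat_id": chat_id, "text": text[:4096]}
    url = f"{TELEGRAM_API}/bot{token}/sendMessage"
    return url, json.dumps(payload).encode("utf-8")

def send(token, chat_id, text):
    """Deliver one message; callers never touch the HTTP details."""
    url, body = build_send_request(token, chat_id, text)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())
```

With this split, every one of the thirty-plus scripts calls `send()` and inherits the same truncation and error behavior for free.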
Skills system¶
The harness packages reusable capabilities as skill directories. Each skill has a definition file that describes when to use it and how to run it. Most also carry a feedback log and supporting scripts. That structure turns a one-off instruction into a reusable tool.
The pattern matters more than any single skill. Code dispatch, web research, repo evaluation, transcript search, and content filtering all use the same shape. Before a skill runs, it reads its own feedback log. Corrections from prior sessions do not vanish into chat history. They become part of the tool.
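The read-feedback-before-run shape might look like this. The file names `SKILL.md` and `FEEDBACK.md` are placeholders, not the harness's actual layout.

```python
from pathlib import Path

def load_skill(skill_dir):
    """Load a skill directory: definition plus accumulated feedback.

    The definition says when and how to run the skill; the feedback
    log carries corrections from prior sessions, read before every
    run so they become part of the tool instead of vanishing into
    chat history.
    """
    d = Path(skill_dir)
    definition = (d / "SKILL.md").read_text()
    feedback_path = d / "FEEDBACK.md"
    feedback = feedback_path.read_text() if feedback_path.exists() else ""
    return {"definition": definition, "feedback": feedback}
```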
Memory access patterns¶
The memory system follows the same high-level idea as Nate B. Jones' Open Brain: persistent state exposed through a standard access pattern instead of hidden in a live chat. The implementation here uses files instead of SQL.
That tradeoff buys simplicity. A file-based store is easy to inspect, back up, diff, and repair by hand. It avoids a database dependency and keeps the control plane easy to restart.
The obvious question is: how does the agent find the right file when there are hundreds of them?
At small scale, a short index file loaded into every session works fine. The agent reads the index, sees the list of topics, and pulls the ones that look relevant. That breaks down as the file count grows. An index of 40 files fits in a prompt. An index of 400 does not. At enterprise scale, with thousands of memory files accumulated across months of operations, the agent cannot read every file name and decide.
The harness solves this with an embedding layer. Here is how it works in plain terms:
An embedding model reads a piece of text and converts it into a list of numbers, typically 768 or more. That list is like coordinates on a map. Texts about similar topics end up near each other on the map, even if they share no words. "Position sizing rules" and "how to calculate contract count" would land close together because they appear in similar contexts in the model's training data.
When the agent needs to find relevant memory, it converts the question into the same kind of number list. Then it compares that list against the stored lists for every memory file. The comparison is a single math operation per file. The files with the highest similarity scores are the ones most likely to be relevant.
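That per-file comparison is cosine similarity. A toy sketch, with two-dimensional vectors standing in for the hundreds of dimensions a real embedding model would produce:

```python
import math

def cosine_similarity(a, b):
    """One multiply-accumulate pass: the 'single math operation' per file."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_matches(query_vec, index, k=3):
    """index maps file name -> stored embedding.

    Score every file against the query vector, then return the
    highest-scoring names. Real embeddings would come from a model;
    the vectors here are hand-made for illustration.
    """
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _score, name in scored[:k]]
```

Because each comparison is one pass over a short list of numbers, scanning a few thousand files stays well under a millisecond in compiled vector libraries, which is why the approach holds up as the store grows.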
At 40 files, this takes microseconds. At 4,000 files, it still takes milliseconds. The embedding layer scales without degrading, which matters for any business that accumulates operational memory over months or years.
The harness uses a cloud embedding model to index all memory files. The cost is negligible. The result is that an agent can ask "what do we know about credential rotation?" and find the right file even when that file is titled infra_index.md and the word "rotation" appears nowhere in its name.
This approach gives the system the simplicity of files (easy to edit, back up, diff, and debug) with retrieval quality closer to a purpose-built vector database. For teams installing this pattern into an existing business, the embedding layer is not optional. It is the piece that keeps the memory system usable as the organization's knowledge grows.