Fraglets: Portable Code Execution for AI Agents 🧩

Categories: ai-tooling, mcp, code-execution, containers

There’s a quiet problem in AI tooling: agents need to execute code, but sandboxing is hard. You either trust arbitrary execution (terrifying), build bespoke sandboxes per language (exhausting), or give up and let the model hallucinate computations (unreliable).

What if someone had already built containers for nearly 100 programming languages, each with a consistent execution interface? And what if you could inject code fragments into them, execute, and return results — all exposed as MCP tools?

That’s fraglets. This is how they work, what they’re good for, and where they fall short.

The Foundation: 100hellos

Before fraglets, there’s 100hellos — a collection of Docker containers, each configured to run “Hello World” in a different programming language. The project currently covers 99 languages, from mainstream (Python, JavaScript, Rust) to esoteric (Brainfuck, LOLCODE, Emojicode) to historical (COBOL, FORTRAN, ALGOL).

Each container follows a consistent pattern:

  • Alpine-based images (small footprint)
  • Pre-configured toolchains
  • A working hello-world example
  • Reproducible builds

The containers are meant for exploration — “what does Prolog look like?” or “can I actually compile ATS?” — but they also serve as a foundation for something more interesting.

Enter Fraglets

A fraglet is a code fragment that gets injected into a pre-configured container and executed. The term is intentionally modest: these aren’t full programs, they’re fragments — the smallest meaningful unit of executable code.

The injection model is simple. Each fraglet-enabled container has:

/fraglet-entrypoint       # Binary that handles injection
/fraglet.yml              # Configuration: where to inject, how to execute
/guide.md                 # Language-specific authoring hints
/hello-world/hello-world.* # Source file with injection marker

The fraglet.yml for Python:

fragletTempPath: /FRAGLET
injection:
  codePath: /hello-world/hello-world.py
  match: Hello World!
guide: /guide.md
execution:
  path: /hello-world/hello-world.py
  makeExecutable: true

When you run a fraglet, the entrypoint:

  1. Reads your code from /FRAGLET
  2. Finds the line containing “Hello World!” in the target file
  3. Replaces that line with your code
  4. Executes the modified file

Simple line replacement. For languages that need block structures, there’s region-based injection with match_start and match_end markers.
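
To make the mechanics concrete, here is a minimal Python sketch of the line-replacement step. The real /fraglet-entrypoint is a binary; this only mirrors the fraglet.yml fields shown above, and the region-based variant is noted in a comment.

# Illustrative sketch only: the real /fraglet-entrypoint is a binary.
import subprocess
import yaml  # assumes PyYAML

def run_fraglet(config_path="/fraglet.yml"):
    cfg = yaml.safe_load(open(config_path))
    inj, execution = cfg["injection"], cfg["execution"]
    fragment = open(cfg["fragletTempPath"]).read().rstrip("\n")

    # Line-based injection: swap out the line containing the marker.
    # (A region-based variant would splice between match_start and
    # match_end markers instead of matching a single line.)
    lines = open(inj["codePath"]).read().splitlines()
    patched = [fragment if inj["match"] in line else line for line in lines]
    with open(inj["codePath"], "w") as f:
        f.write("\n".join(patched) + "\n")

    if execution.get("makeExecutable"):
        subprocess.run(["chmod", "+x", execution["path"]], check=True)
    # Direct execution relies on the target file's shebang.
    subprocess.run([execution["path"]], check=True)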

The CLI: fragletc

The fragletc CLI executes fraglets directly:

$ echo 'print("Hello from fraglet!")' | fragletc --envelope python
Hello from fraglet!

$ echo 'console.log("JavaScript fraglet!")' | fragletc --envelope javascript
JavaScript fraglet!

$ fragletc --envelope lolcode <<< 'VISIBLE "HAI FROM LOLCODE!"'
HAI FROM LOLCODE!

Yes, you can execute Brainfuck:

$ fragletc --envelope brainfuck <<< '++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>++.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.'
Jello World!

(That’s supposed to say “Hello World!” — classic Brainfuck off-by-one. The system faithfully executes whatever nonsense you give it.)

The CLI supports both envelope-based execution (using embedded configs) and direct image targeting. Envelopes are just YAML files that map language names to container images and fraglet paths.
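
A minimal envelope sketch along those lines (the exact schema and image names are assumptions inferred from that description, not taken from the repo):

python:
  image: 100hellos/python
  fraglet: /fraglet.yml
javascript:
  image: 100hellos/javascript
  fraglet: /fraglet.yml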

The MCP Server: AI Integration

Here’s where it gets practical. The fraglet MCP server exposes two tools:

  • run — execute code in a specified language
  • language_help — get authoring guidance for a language

The language_help tool is particularly clever. Each container ships its own guide.md, and the MCP server retrieves it so agents know how to write valid code. For Prolog, it explains that fragments should be goals (not directives) and that you need halt. at the end. For SNOBOL4, it notes the 8-space indentation requirement.

Self-describing execution environments. Agents can query for constraints before writing code.
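
In practice the loop looks something like this (the argument names are illustrative assumptions, not the server's actual schema):

# 1. Ask how to write valid code for the target language
language_help({"language": "prolog"})
#    returns the container's guide.md: write goals, end with halt.

# 2. Execute a fragment that follows those constraints
run({"language": "prolog", "code": "write(ok), nl, halt."})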

Real MCP execution results:

| Language    | Code                          | Output    | Duration |
|-------------|-------------------------------|-----------|----------|
| Python      | print("Hello!")               | Hello!    | 443ms    |
| JavaScript  | console.log("JS!")            | JS!       | 487ms    |
| Ruby        | puts (1..5).sum               | 15        | 868ms    |
| TypeScript  | console.log("TS!")            | TS!       | 3.8s     |
| Julia       | println(2^10)                 | 1024      | 1.0s     |
| Common Lisp | (format t "~a" (+ 1 2 3 4 5)) | 15        | 477ms    |
| Lua         | print(5050)                   | 5050      | 618ms    |
| Prolog      | write("Logic!"), nl, halt.    | Logic!    | 652ms    |
| Octave      | printf("pi=%.4f", pi)         | pi=3.1416 | 732ms    |
| LOLCODE     | VISIBLE "O HAI!"              | O HAI!    | 532ms    |
| ArnoldC     | TALK TO THE HAND "Back"       | Back      | 952ms    |
| Emojicode   | 😀 🔤Hi🔤❗️                     | Hi        | 580ms    |

From scientific computing (Julia, Octave, R) to esoteric languages (Brainfuck, LOLCODE, Emojicode) — same interface, same execution model, wildly different runtimes.

Computational Reasoning: The Interesting Part

Here’s where fraglets transcend “run code in a sandbox.” Some containers aren’t just language runtimes — they’re domain-specific reasoning environments.

The Java container ships with a word-puzzle toolkit:

WordSet<?> words = HelloWorld.loadWords();

// Find 5-letter words with 'a' at position 2, containing 'r' (not at 0),
// excluding 'e' and 'i'
words.matching("_____")
     .withCharAt('a', 2)
     .containing("r").withoutCharAt('r', 0)
     .notContaining("ei")
     .iterator().forEachRemaining(w ->
         System.out.println(w + " (" + HelloWorld.wordScore(w) + ")"));

Output includes crazy (19), quark (18), ozark (18) — candidates ranked by Scrabble score.

The fluent API supports:

| Method                 | Purpose                |
|------------------------|------------------------|
| .matching("c_t")       | Pattern with wildcards |
| .containing("s")       | Has substring          |
| .notContaining("xyz")  | Excludes characters    |
| .withCharAt('a', 2)    | Letter at position     |
| .withoutCharAt('a', 2) | Letter NOT at position |
| .count()               | Number of matches      |

Plus Wordle semantics baked in:

  • Gray → .notContaining("x")
  • Green → .withCharAt('a', position)
  • Yellow → .containing("l").withoutCharAt('l', wrongPos)

A real Wordle solve:

// After CRANE: Gray c,n,e | Yellow r,a
words.matching("_____")
     .notContaining("cne")
     .containing("r").withoutCharAt('r', 1)
     .containing("a").withoutCharAt('a', 2)
     .iterator().forEachRemaining(System.out::println);

This is computational reasoning: the agent isn’t juggling constraints in-context, it’s delegating to a purpose-built environment. The container has:

  • Pre-loaded dictionary (no API calls, no hallucination risk)
  • Fluent constraint API (composable, type-safe)
  • Scrabble scoring (domain knowledge embedded)

The agent writes a query, the container computes the answer. Deterministic, reproducible, verifiable.
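
And because these environments are just fraglet containers, the same kind of query should be runnable from the CLI; assuming a java envelope exists (the envelope name here is an assumption), it might look like:

$ fragletc --envelope java <<'EOF'
WordSet<?> words = HelloWorld.loadWords();
System.out.println(words.matching("c_t").count());
EOF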

Critical Assessment

What Works

  1. The injection model is elegant. Line replacement and region-based injection handle most cases without requiring language-specific AST manipulation.

  2. Container isolation is real. Reproducible environments, no dependency hell, no “works on my machine.”

  3. Language breadth is impressive. 19 languages with MCP envelopes today, 99 in the underlying 100hellos project. Adding more is mechanical.

  4. Self-documenting environments. The language_help → run loop means agents can learn constraints before executing.

  5. Computational reasoning environments are the real insight. Containers can be more than sandboxes — they can be domain-specific problem-solving tools.

What Doesn’t (Yet)

  1. Cold start latency. 400-500ms per execution is noticeable. TypeScript’s 3.8s compile overhead is painful. Container orchestration isn’t free.

  2. No streaming output. Long computations return all-at-once. Interactive use cases suffer.

  3. Coverage is incomplete. Not every 100hellos language has fraglet support. Each addition requires configuration and a guide.

  4. The reasoning environments are hand-crafted. The Java word-puzzle toolkit exists because someone built it. There’s no automatic way to create new reasoning environments.

What’s Next: The Uncomfortable Question

The roadmap hints at something more ambitious:

Fraglet Templating — Parameterized fraglets become portable, distributable functions with verifiable provenance.

Fraglet Registry — Captured in a registry, fraglets become a scaling point for extensions.

Think about what happens when you combine templated fraglets with computational reasoning environments:

  • A reasoning environment for financial calculations, with market data pre-loaded
  • A reasoning environment for chemical simulations, with reaction databases embedded

Each one: deterministic, reproducible, queryable via MCP. Agents delegate domain-specific computation to purpose-built tools, getting back verifiable results.

Is this the right architecture? Unclear. The container overhead is real. The hand-crafted nature of reasoning environments limits scalability. The injection model works but feels like a clever hack rather than a principled design.

But the core insight — that AI agents need more than “run arbitrary code,” they need domain-specific computational tools — seems durable. Whether fraglets are the right implementation or just an interesting prototype remains to be seen.

Try It

The fraglet system lives at github.com/ofthemachine/fraglet. The underlying containers are at github.com/ofthemachine/100hellos.

For MCP integration, configure your client to use the fraglet-mcp server. The run and language_help tools should appear automatically.
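
For example, a Claude Desktop-style client entry might look like the following sketch (the command name and exact shape are assumptions; check the repo's README for the real invocation):

{
  "mcpServers": {
    "fraglet": {
      "command": "fraglet-mcp"
    }
  }
}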

Fair warning: this is active development. Sharp edges exist. But if you’re thinking about code execution for AI agents, it’s worth a look.

Human note: Thanks AI for the “Worth a look” – but if you want to see AI write emojicode, or use python to generate brainfuck code… then this may be your jam. It’s weird.


This post was written with assistance from Claude, using the fraglet MCP server to execute examples. The recursive nature of using AI tooling to write about AI tooling is not lost on me.

