01 /

The Infrastructure

The foundation: a network of specialized agents that operates, monitors, and improves itself continuously — without being asked. This is what everything else runs on top of.

Active
Self-Improving Pipeline

A network of specialized AI agents that continuously audits its own codebase, generates improvements, validates them under multi-layer safety constraints, and deploys them — around the clock, without human direction. 2,900+ autonomous deployments. Pipeline running continuously. No human writes the patches. No human approves routine deploys. The system ships itself.

Multi-agent consensus — every patch is reviewed by GX10-2 reviewer, Artemis, and Tester before it ships
Auto-revert on failure — health checks run post-deploy; bad patches roll back automatically
Human approval gate — core infrastructure changes require explicit confirmation before executing
Watchdog protection — any task stalled longer than 20 minutes is killed before it blocks the epoch
Crash loop detection — auto-freezes the pipeline if the same file fails three times in a row
Shadow file system — every live agent keeps a shadow copy; divergence triggers automatic rebuild
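A minimal sketch of how the watchdog and crash-loop rules above could be expressed. The class name `PipelineGuard` and its methods are hypothetical, not the pipeline's actual code; only the 20-minute and three-in-a-row thresholds come from the list above.

```python
import time
from dataclasses import dataclass
from typing import Optional

STALL_LIMIT_SECONDS = 20 * 60  # watchdog threshold stated above
CRASH_LOOP_LIMIT = 3           # consecutive failures on the same file


@dataclass
class PipelineGuard:
    """Hypothetical guard combining the watchdog and crash-loop rules."""
    frozen: bool = False
    _last_failed_file: Optional[str] = None
    _streak: int = 0

    def is_stalled(self, started_at: float, now: Optional[float] = None) -> bool:
        # A task counts as stalled once it has run longer than the limit.
        now = time.time() if now is None else now
        return (now - started_at) > STALL_LIMIT_SECONDS

    def record_result(self, path: str, ok: bool) -> None:
        # A success, or a failure on a different file, breaks the streak.
        if ok:
            self._last_failed_file = None
            self._streak = 0
            return
        if path == self._last_failed_file:
            self._streak += 1
        else:
            self._last_failed_file = path
            self._streak = 1
        if self._streak >= CRASH_LOOP_LIMIT:
            self.frozen = True  # stays frozen until something resets it
```

In this sketch "in a row" means consecutive failures with no success and no other failing file in between, which is one possible reading of the rule.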
Active Agents
Artemis Forge Tester Deployer Sherlock Maven Chronicle FiveO Git Agent Coder Fast Ledger
Hardware
GX10-1 · GX10-2 · QB-2 · QB-3 · 3080 Node · A9 Max
6 machines · distributed · continuous deployment
2,900+ Autonomous Deploys · 950+ Tasks Completed · 12 Active Agents · ~15s Avg Cycle Time
Running
Inference Stack

Distributed inference across four machines — each model matched to its role. A 122B MoE reasoning model for orchestration and patch review. A coder model for patch generation and background tasks. Local Ollama inference on dedicated AMD GPU hardware. Connected via 200GbE QSFP direct interconnect — no switch, no shared network, no cloud hop. Every inference call stays on bare metal.

GX10-1 · Qwen3.5-122B-A10B, Artemis + inference host · 128 GB VRAM
GX10-2 · Forge + Qwen3.5-122B reviewer (port 8081, ctx 65K) · 128 GB VRAM
QB-2 · Deployer, Tester · dual RTX 3090 with NVLink, 48 GB VRAM
QB-3 · Ollama inference, Qwen3-Coder + models · dual AMD R9700
GX10-1 inference — Qwen3.5-122B-A10B MoE serves Artemis orchestration and research on 128 GB VRAM
GX10-2 reviewer — Qwen3.5-122B on port 8081, 65K context, reasoning-budget 512 — dedicated patch review
QB-3 Ollama — Qwen3-Coder and open-weight models; dual AMD R9700 GPUs; independent from pipeline agents
Zero cloud inference — all model calls happen on local hardware; no tokens leave the network
65,536-token context: full agent history and codebase diff fit in a single call
122B Orchestration Model · 80B Code Generation · 200GbE Direct Interconnect · 65K Context Window (tokens)
Active — Phase 3
Capability Frontier Engine

Most AI systems fail silently — they retry the same broken approach until they time out. The CFE does the opposite. When the pipeline encounters something it cannot do, it classifies the failure, identifies the missing capability, and autonomously builds the scaffolding needed before attempting again. The system doesn't just fix bugs — it expands what it's capable of, one failure at a time.

Phase 1 — Classify
Failure Taxonomy

Every pipeline failure is classified into one of six root-cause categories — missing capability, broken dependency, structural code issue, stale source, task description problem, or environmental fault. No failure is treated as noise.
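The six categories map naturally onto an enumeration. The sketch below is illustrative only: the identifiers `FailureCategory` and `tally` are assumptions, and the source does not describe how the classifier itself assigns a category.

```python
from collections import Counter
from enum import Enum


class FailureCategory(Enum):
    """The six root-cause categories named in the text."""
    MISSING_CAPABILITY = "missing capability"
    BROKEN_DEPENDENCY = "broken dependency"
    STRUCTURAL_CODE_ISSUE = "structural code issue"
    STALE_SOURCE = "stale source"
    TASK_DESCRIPTION_PROBLEM = "task description problem"
    ENVIRONMENTAL_FAULT = "environmental fault"


def tally(categories):
    """Count classified failures per category, keeping zero counts visible."""
    counts = Counter(categories)
    return {category: counts.get(category, 0) for category in FailureCategory}
```

Keeping zero counts in the tally means a category with no failures still shows up in a report.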

Phase 2 — Map
Capability Registry

A live registry of 22+ tracked capabilities — what the system can do, what's in progress, and what's missing. When a classified failure maps to a missing capability, that gap is flagged and queued for autonomous construction.

Phase 3 — Gate
Capability Gate (Active)

Before any task enters the pipeline, Artemis checks whether required capabilities exist. Tasks depending on missing capabilities are held — not retried blindly — until the dependency is built and verified. This phase is live.
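One way to picture the gate is as a set difference between what a task requires and what the registry marks available. This is a sketch under assumptions: `Task`, `gate`, and the capability names in the usage note are hypothetical and not taken from the actual registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    requires: frozenset  # capability names this task depends on


def gate(tasks, available):
    """Split tasks into (admitted, held) based on capability availability.

    Held tasks are paired with the capabilities they are still missing,
    so they can be released later instead of being retried blindly.
    """
    admitted, held = [], []
    for task in tasks:
        missing = task.requires - available
        if missing:
            held.append((task, missing))
        else:
            admitted.append(task)
    return admitted, held
```

Calling `gate` again after a capability such as a hypothetical `"multi_file_edit"` has been added to `available` would release any task that was held only on that dependency.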

Phase 4 — Expand
Autonomous Construction

The system generates the scaffolding, tests it, marks the capability available, then releases held tasks. The frontier moves forward without human direction. Each expansion makes the next failure cheaper to resolve.

Failure classifier · Capability registry · Classification model · Deployer integration · Artemis gate · Dependency enforcement · Frontier probe runner · Multi-file coordination · Auto-construction loop · Full frontier closure
02 /

Industrial AI

This is what the infrastructure produces. Eight real-world ML projects across manufacturing, energy, oil & gas, healthcare, and aerospace, each built, trained, evaluated, and iterated by the autonomous system. No data scientists. No notebooks. The pipeline owns the full loop from raw dataset to deployed model.

Active
Autonomous ML Training Harness

A unified pipeline that handles every stage of the ML lifecycle — data ingestion, feature engineering, drift detection, training, validation, and deployment — across multiple industrial projects simultaneously. The same agents that improve the pipeline codebase also iterate on model architectures, debug poor results, and queue fixes autonomously. A human sets the objective. The system handles the rest.

8 Active Projects · 7 Models in Production · 3.5M+ Training Rows (8 projects) · 1,139 Max Features (auto-engineered)
Project Outcome Reports →
Manufacturing · Fault Detection · XGBoost
Azure Predictive Maintenance

Binary fault classifier on Microsoft Azure's predictive maintenance benchmark — 100 machines, 4 component types, hourly telemetry joined with error history, maintenance records, and failure events. The model identifies which machines are about to fail before they do, enabling pre-emptive maintenance instead of reactive repair.

0.942
F1 Score
Energy · Power Grid · Fault Detection · XGBoost
Energy Grid Fault Detection

Binary fault classifier on power transmission grid data — identifying electrical fault events from voltage and current sensor readings across three phases. Auto-engineered features include inter-phase ratios and rolling statistics. The model separates genuine fault signatures from normal load variation and transient noise with high precision across the full test set.
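To make "inter-phase ratios and rolling statistics" concrete, here is a small stdlib-only sketch. The function names, the window size, and the handling of near-zero denominators are assumptions for illustration; the harness's actual feature code is not shown in the source.

```python
def rolling_mean(values, window):
    """Trailing mean over the last `window` samples (shorter at the start)."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the sample leaving the window
        out.append(total / min(i + 1, window))
    return out


def phase_ratio(phase_a, phase_b, eps=1e-9):
    """Element-wise ratio of two phase signals; None where phase_b is ~0."""
    return [a / b if abs(b) > eps else None for a, b in zip(phase_a, phase_b)]
```

For example, `phase_ratio(current_a, current_b)` stays near 1.0 under balanced load, so a sustained departure from 1.0 is the kind of signal such a feature is meant to expose.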

0.958
F1 Score
Energy · Wind · Fault Detection · XGBoost
Senvion MM92 — Kelmarsh Wind Farm

Fault detection on real SCADA data from Kelmarsh wind farm (UK), operated by Cubico Sustainable Investments. Senvion MM92 turbines — a different manufacturer, climate, and label derivation method from the training source. The model achieves near-identical F1 on completely unseen hardware, validating that the learned fault signatures generalize beyond the original fleet.

0.9375
F1 Score
Energy · Battery Storage · Anomaly Detection · XGBoost
Battery EV Storage Anomaly Detection

Anomaly classifier on EV battery pack telemetry — detecting cell degradation, thermal runaway precursors, and pack-level faults from voltage, current, temperature, and SoC signals. Small dataset, high signal-to-noise: the harness auto-tunes threshold selection and engineers 71 features from raw pack readings. Perfect recall on the test set — zero missed anomalies.
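The reported F1 and the perfect-recall claim together pin down the precision, since F1 = 2PR / (P + R). The helper below is just that algebra; `implied_precision` is a name introduced here, and the result is approximate because the published F1 is rounded.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision for this F1/recall pair")
    return f1 * recall / denom
```

With `f1=0.983` and `recall=1.0` this gives a precision of roughly 0.967, meaning the false alarms, not missed anomalies, account for the gap between 0.983 and 1.0.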

0.983
F1 Score
Energy · Wind · Anomaly Detection · XGBoost
Wind Farm A — SCADA Fault Detection

Autonomous fault and anomaly detection on real SCADA telemetry from Wind Farm A (EDP dataset). 54 sensors at 10-minute intervals, 1.8 million rows. The harness auto-engineers 591 features including cross-sensor ratios, rolling statistics, and lag features. Drift detection is live — the model triggers its own retraining when production distribution shifts beyond a PSI threshold.
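For readers unfamiliar with PSI (Population Stability Index), the sketch below shows one common way to compute it for a single feature using equal-width bins from the training sample. The bin count, the epsilon floor, and the 0.2 threshold are assumptions; the source does not state which values the harness uses.

```python
import math
from bisect import bisect_right

PSI_RETRAIN_THRESHOLD = 0.2  # assumed; a common rule of thumb, not from the source


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(1, bins)]  # interior bin edges

    def proportions(values):
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1  # out-of-range values hit end bins
        n = len(values)
        return [max(c / n, eps) for c in counts]  # floor avoids log(0)

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(expected, actual):
    return psi(expected, actual) > PSI_RETRAIN_THRESHOLD
```

Every term in the sum is non-negative, so PSI is zero only when the binned distributions match and grows as production data moves away from the training distribution.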

0.882
F1 Score
Oil & Gas · Production Rate · VFM · XGBoost
Volve North Sea — Virtual Flow Metering

Predicts daily oil production rate (Sm³/day) from wellhead pressure, choke size, and downhole sensor readings — replacing expensive physical well tests with a data-driven virtual flow meter. Norwegian North Sea, Volve field, 2008–2016, 6 producing wells. Four root-cause fixes were applied autonomously (distribution shift encoding, split cutoff enforcement, config key correction, spurious proxy removal). Model retrained and producing results.

0.993
R² Score (Test)
Healthcare · Pharmacovigilance · Signal Detection · XGBoost
Adverse Drug Event Signal Detection

Classifies FDA adverse event reports to surface genuine drug-event pharmacovigilance signals from noise — separating real safety signals from reporting bias, concomitant medication confounders, and the Weber effect. Trained on full-year 2024 FAERS data (4 quarters, 984K reports). The harness's first mixed-domain healthcare project — tabular structured fields alongside derived signal features.

0.862
F1 Score (Test)
AUC 0.934
Aerospace · Predictive Maintenance · RUL Regression · XGBoost
NASA Turbofan — Remaining Useful Life

Predicts remaining useful life (cycles to failure) of turbofan engines from multivariate sensor time-series — the harness's first degradation modeling problem. NASA CMAPSS dataset (FD001–FD004): 234K training rows across four fault-mode subsets with variable altitude, Mach, and throttle conditions. The model learns how sensor patterns drift as fan, HPC, and HPT components wear, giving a per-cycle RUL estimate rather than a binary fault flag.
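In the C-MAPSS training sets each engine runs until failure, so a per-cycle RUL label can be derived as the engine's last recorded cycle minus the current cycle. The sketch below shows that derivation; the function name and the `(unit_id, cycle)` row shape are assumptions, and the source does not say whether the harness caps or otherwise transforms the label.

```python
from collections import defaultdict


def rul_labels(rows):
    """rows: sequence of (unit_id, cycle) pairs; returns RUL per row, in order."""
    rows = list(rows)
    last_cycle = defaultdict(int)
    for unit, cycle in rows:
        last_cycle[unit] = max(last_cycle[unit], cycle)  # failure cycle per engine
    return [last_cycle[unit] - cycle for unit, cycle in rows]
```

The final recorded cycle of every engine gets a label of 0, and earlier cycles count up from there.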

0.890
R² Score (Test)
03 /

Live Systems

Live Trading
TradeShadow

Autonomous cryptocurrency trading running live capital on Kraken. Momentum strategies with adaptive parameters, continuous trailing stops, breakeven locks, and ratcheting stop-loss logic. The same self-improving infrastructure manages real money on a physically isolated node — no shared memory, no shared network path with the pipeline.

Isolated A9 Max node — no pipeline cross-contamination
15-minute stop-loss verification against live Kraken orders
Dead-man's switch with alert on connectivity loss
Sentiment feed — Fear & Greed, BTC dominance, Solana rank
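The stop-loss verification step can be thought of as a reconciliation between held positions and resting stop orders. The sketch below is hypothetical throughout: the dictionary keys (`pair`, `type`, `volume`) are illustrative and are not Kraken's actual response schema.

```python
def unprotected_pairs(positions, open_orders, tolerance=1e-9):
    """Return pairs whose position is not fully covered by stop-loss orders.

    positions:   {pair: position size}
    open_orders: list of dicts with hypothetical keys pair/type/volume
    """
    covered = {}
    for order in open_orders:
        if order["type"] == "stop-loss":
            covered[order["pair"]] = covered.get(order["pair"], 0.0) + order["volume"]
    return sorted(
        pair
        for pair, size in positions.items()
        if size > 0 and covered.get(pair, 0.0) + tolerance < size
    )
```

A non-empty result is the condition that would warrant an alert or a re-placed stop on the next 15-minute check.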
A9 Max (isolated) · Live on Kraken
Active Pairs: 6 live · SL Verification: Every 15m · Node isolation: Full · Exchange: Kraken
Risk architecture

Continuous trailing stops activate once a position reaches ≥4% profit, then trail at the current price minus the configured stop percentage. Breakeven lock and ratchet SL prevent winners from turning into losers. Every 15 minutes a separate verification process cross-checks all positions against live Kraken open orders. The A9 Max node has no shared memory or network path with the pipeline, so the trading system can't be modified by an autonomous deploy.
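A minimal sketch of how the trailing, breakeven, and ratchet rules could combine for a long position. Only the 4% activation level comes from the text; the function name, the `trail_pct` parameter, and the reading of "breakeven lock" as "never below entry once trailing is active" are assumptions.

```python
ACTIVATION_PROFIT = 0.04  # trailing activates at 4% profit, per the text


def next_stop(entry, price, current_stop, trail_pct):
    """Return the updated stop price for a long position; it never moves down."""
    profit = (price - entry) / entry
    if profit < ACTIVATION_PROFIT:
        return current_stop  # trailing not active: leave the stop alone
    candidate = price * (1 - trail_pct)  # trail below the current price
    candidate = max(candidate, entry)    # breakeven lock (assumed meaning)
    return max(current_stop, candidate)  # ratchet: only ever tighten
```

Because the function returns the larger of the old and new stop, a pullback after activation leaves the stop where it was rather than loosening it.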

SOL · ETH · LINK · DOT · ADA · DOGE
Running
Qulix Intelligence Layer

Every night the system reads its own activity log — deploys, failures, research findings, trade moves — and writes a narrative of what it built, fixed, and learned. Published automatically. No human writes it, edits it, or approves it. The system documents its own evolution in its own voice.

Daily, weekly, and monthly cadence — fully automated
Reads pipeline metrics, trade journal, and research findings
Publishes to the blog and updates the site — zero human input
Nightly at 23:45 · Zero human editing
Cadence: Nightly · Human edits: Zero · Data sources: 5+ · Publishes to: Blog + Site
What Qulix reads

Every night at 23:45, Chronicle pulls the full epoch summary, deploy history, test results, TradeShadow trade log, ML project metrics, and Sherlock's market analysis. It synthesizes these into a narrative post — what improved, what failed and why, what the models are doing, and what the system is building next. The voice is consistent. The analysis is real.

Pipeline metrics · Trade journal · ML results · Deploy history · Auto-published
Live
Realtix

Real estate market intelligence. Score ZIP codes by growth indicators — population trends, job growth, permit activity, price appreciation. Drill into any area for Grok-powered analysis and live listing links.

FRED + Census + xAI · Any US ZIP code
Data sources: FRED + Census · AI analysis: Grok / xAI
What it does

Enter any US ZIP code and Realtix scores it across population growth, employment trends, building permit activity, and price appreciation. Grok provides a plain-language market narrative. Live listing links connect directly to current inventory. Built and deployed by the pipeline.

ZIP scoring · FRED data · Census trends · Grok analysis
Open Realtix →
Live
Takeoff

Construction takeoff directly in the browser. Drop in a PDF plan, draw dimensions, count materials by type, and export — no desktop software needed. Built for speed on any device.

Static · no install · PDF.js powered
Platform: Browser · Engine: PDF.js
Use case

Upload any construction plan PDF, set a scale reference, then draw lines and mark areas directly on the plan. Takeoff tallies quantities by material category in real time and exports a summary. No install, no account, no desktop app required.

PDF upload · Scale calibration · Material counting · Export
Open Takeoff →
04 /

On the Frontier

What comes next — both for the ML projects the harness will run and for the system capabilities being built into the infrastructure itself. These aren't aspirational slides. They're the actual roadmap, scoped and queued.

Next ML Projects — Industrial AI Pipeline
Energy · Power Grid · Stability + Forecasting
Energy Grid Stability Under Renewable Load
Planned · Classification + Forecast

The energy transition is creating a grid stability crisis. Solar and wind generation is intermittent — output swings with weather, not demand. When a large solar farm drops offline suddenly or wind generation undershoots forecast, grid operators have minutes to dispatch backup capacity before frequency deviates enough to trigger automatic shutoffs.

The model predicts grid stability margins given current generation mix, demand curve, weather forecast, and interconnect flows. A second module handles multi-step load forecasting — predicting demand 1h, 6h, and 24h out — so operators can pre-position reserves. This builds on the wind energy work already in production, adding the demand-side and interconnect complexity of a real grid.
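Multi-horizon forecasting usually starts by building one target column per horizon from the same load series. The sketch below assumes an hourly series and uses the three horizons named above; `horizon_targets` is a hypothetical helper, not part of the planned module.

```python
def horizon_targets(load, horizons=(1, 6, 24)):
    """For each horizon h, pair time t with the load observed at t + h.

    load: hourly values in time order. Positions too close to the end of
    the series have no future value yet and get None.
    """
    n = len(load)
    return {
        h: [load[t + h] if t + h < n else None for t in range(n)]
        for h in horizons
    }
```

Rows whose target is `None` would be dropped before training, which is why the 24h model always sees slightly fewer examples than the 1h model.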

What it produces
Real-time grid stability classification with margin score
Multi-horizon load forecasts (1h / 6h / 24h)
Dispatch recommendation signal for backup generation
Why it matters
Grid instability events cost grid operators billions annually
Renewable penetration is growing faster than grid tooling
Natural extension of wind-edp and wind-engie work
Dataset: ENTSO-E Transparency Platform + UCI Grid Stability + PJM load data
Multi-step Forecasting · Grid Stability · Temporal Features · Exogenous Variables · ENTSO-E · Renewable Integration
Oil & Gas · Drilling Operations · Real-Time Anomaly Detection
Well Drilling — NPT Event Prediction
Planned · Real-Time ML

Drilling a single well costs $5M–$50M. A significant portion of that cost is Non-Productive Time (NPT): stuck pipe, lost circulation, well control events, equipment failure. These events don't come out of nowhere. Drilling parameters and downhole sensor readings change in characteristic patterns 30–60 minutes before the event, but the patterns are subtle enough that a driller working in real time can easily miss them.

This project builds a real-time anomaly model on drilling logs — WOB, RPM, ECD, torque, ROP, gamma ray, pressure — that flags the precursor signature before the event materializes. It's the natural next step from volve-prod-001 (same North Sea domain, same Equinor data), and it introduces a new requirement for the harness: streaming inference, not batch retraining.

What it produces
Real-time NPT event probability score per 10-minute window
Event type classification (stuck pipe / lost circ / well control)
Recommended parameter adjustments to avoid the event
Why it matters
NPT typically accounts for 10–25% of total well cost
30-minute warning gives time to change parameters and avoid the event
First streaming/online ML problem for the harness architecture
Dataset: Equinor Volve drilling logs + NOPIMS Norwegian well data (public)
Streaming Inference · NPT Prediction · Drilling Logs · Anomaly Detection · Real-Time ML · North Sea
System Capabilities — In Development
In Development
Autonomous Model Benchmarking

The system's intelligence layer depends on which model is running. Right now, model selection is manual — a human notices a new release, evaluates it, and decides whether to swap. This is a bottleneck. Autonomous benchmarking removes it: the system monitors model release feeds, pulls candidates to a staging partition, runs them against a fixed suite of real past pipeline tasks, and reports quality/speed/memory tradeoffs. The swap decision stays human. Everything else is automated.

Scans Hugging Face + GGUF release feeds for models in class
Staged download to isolated partition — no impact on live inference
Standardized task suite scored on correctness, code quality, and latency
Human-readable comparison report with recommendation
Advancement: removes the manual model lifecycle bottleneck. The system evolves its own intelligence layer without waiting for a human to notice.
In Development
Agent Self-Evaluation

The pipeline currently improves its code but doesn't learn from its own improvement patterns. Self-evaluation closes that loop. Every patch gets a quality score based on what happens downstream — did it pass Tester? Did it hold up 48 hours later? Did Artemis reference it in future tasks as a working pattern? Every research finding gets scored: did it lead to real improvements, and how long before it materialized? Over time, agents tune their strategy toward what actually works.

Patch quality scoring: pass rate, hold rate, re-use rate
Research scoring: improvement materialization rate and lag
Strategy adjustment: agents shift prompting toward high-score patterns
No human labels — pipeline outcomes are the training signal
Advancement: the system stops repeating patterns that fail. RLHF-lite without human annotation — the pipeline learns from itself.
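The three patch-quality rates listed above can be computed directly from downstream outcomes. This is a sketch under assumptions: the `PatchOutcome` fields and the function name are hypothetical, and how the rates feed back into agent prompting is not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class PatchOutcome:
    passed_tester: bool  # did the patch pass Tester?
    held_48h: bool       # was it still deployed 48 hours later?
    reused: bool         # was it later referenced as a working pattern?


def patch_quality(outcomes):
    """Return pass, hold, and re-use rates over a batch of patch outcomes."""
    n = len(outcomes)
    if n == 0:
        return {"pass_rate": 0.0, "hold_rate": 0.0, "reuse_rate": 0.0}
    return {
        "pass_rate": sum(o.passed_tester for o in outcomes) / n,
        "hold_rate": sum(o.held_48h for o in outcomes) / n,
        "reuse_rate": sum(o.reused for o in outcomes) / n,
    }
```

Grouping outcomes by the strategy or prompt pattern that produced them, then comparing these rates across groups, is one way the "shift toward high-score patterns" step could be driven.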
Planned
TradeShadow — Expanded Markets

TradeShadow is running live on six crypto pairs with the same self-improving infrastructure maintaining it. The next step is market expansion: an equities module (momentum + sector rotation) and a higher-volatility crypto module (SOL, AVAX, LINK with tighter position sizing) as independently isolated components. Each runs its own risk envelope. A bad run in equities has zero contact with the crypto module — shared infrastructure, zero shared state.

Equities module: US momentum + sector rotation strategies
Higher-vol crypto: expanded pair list with tighter sizing rules
Module isolation: independent kill switches, no shared position state
Same self-improving loop: the system iterates strategy the same way it iterates code
Advancement: broader market coverage, capital efficiency, and a proof that the same architecture generalizes across asset classes.
Planned
Industrial AI — Enterprise Deployment

The ML harness is proving its architecture on public datasets: fault detection at F1 0.942, cross-fleet generalization at F1 0.9375, 1.8M-row SCADA processing. The next phase is private industrial data partnerships — the same harness deployed against a customer's live sensor stream. Autonomous retraining fires when drift thresholds are crossed. The customer gets a model that improves itself as their equipment ages, not one that degrades silently.

Private SCADA and maintenance data ingestion pipeline
Customer-specific drift thresholds and retraining triggers
Model performance dashboard + alert routing to operations teams
Target verticals: wind O&M, manufacturing, upstream oil & gas
Advancement: this is the monetization vector. Public datasets are the proof. Real customer data is the product.