What We Learned and Rapidly Hardened by Running 43 Autonomous Hyperliquid Agents with $43,000
We gave 43 AI agents $1,000 each and set them loose on Hyperliquid. The story of the Senpi Predators fleet, what broke, what we fixed, and the moat underneath.

Senpi gave 43 AI agents $1,000 each and set them loose on Hyperliquid. Some made money. Many lost. All of them made the platform stronger.
This is the story of the Senpi Predators fleet - how we built it, what broke, what we fixed, and what the new fleet looks like after weeks of live trading with $23M+ in volume.
But this isn't just a trading experiment. Every agent we run is a live stress test of the entire Senpi Hyperliquid agent stack - the proprietary data layer, the hardened runtime, and the execution infrastructure. Every bug we find with our own capital is a bug no user ever hits. Every lesson we learn feeds directly into the platform that thousands of agents will run on. The fleet is our R&D engine. The product is the infrastructure it hardens.
The Experiment
The idea was simple: build dozens of autonomous trading agents, each with a different thesis about how to make money on Hyperliquid, and let them compete with real capital. Every agent scans the market using Senpi's proprietary Hyperfeed - realtime tracking of the top 1,000 traders on Hyperliquid - and makes its own entry decisions. No human in the loop. 24/7.
We started with 8 agents. Within weeks we had 43. Some hunt a single asset obsessively. Some scan hundreds of markets for explosive smart money (SM) moves. Some follow quality traders. One counter-trades degens. One hunts squeezes. One bets against the trend.
The fleet evolved fast because the feedback loop was immediate: deploy an agent, watch it trade, see what breaks, fix it, deploy the next version. Every failure was a lesson. Every lesson became a rule hardcoded into the next generation - and into the runtime that every agent on the platform shares.
What We Learned
Lesson 1: Fewer Trades + Higher Conviction + Wider Stops = Better Performance
This is the single most important finding. Our top performers trade infrequently with high conviction. Our worst performers traded constantly.
Polar (ETH hunter) held positions through Phase 2 Tier 3 for +19.8% ROE after we removed the thesis exit and let the DSL (Dynamic Stop-Loss) manage the exit. Before that fix, the same agent was scratching trades at +0.35% ROE after 2 minutes.
Ghost Fox - 1,078 trades, -58.5% ROE. Same SM scanner as Polar, but configured to trade every signal. Over-trading killed it.
The pattern held across every agent category. The agents that waited for overwhelming conviction and gave positions room to breathe made money. The agents that chased every signal and cut positions quickly bled from fees and churn.
Lesson 2: The AI Model Doesn't Matter - The Infrastructure Does
Every agent has access to the same reasoning capability. The difference between a profitable agent and a catastrophic one isn't intelligence - it's infrastructure.
Phoenix had the best signal in the fleet. It found a HYPE SHORT at 54x divergence ratio - SM profit velocity surging while price was flat. That trade peaked at +50% ROE. But Phoenix lost 40.6% because its trade counter broke and it opened 24 positions in one day instead of 6. The signal was great. The infrastructure failed.
Nine agents lost a combined $3,000+ from the same root cause: DSL state files missing wallet address fields. The position ran unprotected while the agent reported "everything is healthy."
The thesis exit - where the scanner re-evaluates whether the market conditions that triggered the entry still hold - killed more winners than any bad signal. Wolverine v1.1 lost 23.4% because the scanner killed 25 of 27 trades. The one trade it let run (+29.9% ROE) was worth more than all other winners combined.
Lesson 3: Exits Are Infrastructure, Not Intelligence
Entries require intelligence - reading the market, synthesizing signals, making probabilistic decisions. This is what LLMs are good at.
Exits require reliability - track the price, compute the trailing stop, enforce the floor, close when breached. Every 30 seconds, without fail. No hallucinations. No lost state. No silent failures.
We were asking agents to do both. That was the core mistake. The answer was obvious once we saw it: make exits infrastructure.
This led to the Senpi Hyperliquid Agents Runtime - the layer between the agent and the chain. The DSL (Dynamic Stop-Loss) plugin detects positions onchain, manages trailing stops, and executes exits uniformly across every agent. The scanner finds the trade. The runtime protects it.
Lesson 4: Stalker Is Dead
We ran an extensive A/B/C/D/E/F/G test: Stalker mode (detecting slow SM accumulation over 3+ scans) vs Striker mode (detecting violent FIRST_JUMP rank explosions). Orca v1.3 was the definitive test: 58 Stalker trades at a 43% win rate, net negative. The one Striker trade it took was positive.
Stalker catches chop, not trends. Across every agent, Striker-only consistently outperformed dual-mode scanners. Every new agent in the fleet is Striker-only or uses a fundamentally different entry thesis.
Lesson 5: The Data Layer Is the Edge
Senpi's Hyperfeed tracks the top 1,000 traders on Hyperliquid in realtime - contribution velocity, momentum events, trader quality scores (TCS/TRP classifications), concentration analysis, market positioning. This is proprietary signal that feeds every agent.
The Gen-2 agents layer multiple Hyperfeed signals for higher-conviction entries. Momentum events confirm that quality traders are driving a move. Trader convergence confirms that multiple proven traders independently arrived at the same trade. Hot streak following confirms that a specific quality trader is on a $5.5M+ run.
The data layer doesn't just find trades - it filters out the noise that killed our first-generation agents.
The New Fleet
After weeks of live trading, bug fixing, and rebuilding, here's the current Senpi Predators fleet. 43 agents, $1,000 each, every agent running on the Senpi Runtime with the DSL plugin. No thesis exits. Trade counters that work. Each agent is a distinct experiment testing a different hypothesis about how to extract alpha from Hyperliquid.
Single-Asset Lifecycle Hunters
These agents stare at one asset and nothing else. Every signal source available - smart money positioning, funding rate, open interest, trend structure, volume - feeds into a single thesis. They wait for overwhelming conviction, enter once, and let the DSL trail.
Polar - ETH Alpha Hunter: The patience benchmark. When SM commits 80%+ to ETH in one direction with 4H trend confirmation, enter and let the DSL trail. Three-mode lifecycle: HUNT (scan for entry) -> RIDE (DSL trails, scanner stays silent) -> re-HUNT. The proof that single-asset patience wins.
Kodiak - SOL Alpha Hunter: Same lifecycle architecture as Polar, adapted for SOL's volatility profile. 7x leverage with wider DSL retrace to accommodate SOL's larger swings.
Grizzly v3.0 - BTC Alpha Hunter (Tightened): Every fleet lesson applied to BTC. Leverage capped at 7x (was 10x in earlier versions). Retrace widened to 8% ROE (was 3% - too tight for BTC). Hard timeout extended to 360 minutes (BTC trends are slower than ETH/SOL). Thesis exit permanently removed. 2 entries per day maximum. 180-minute cooldown between entries.
Grizzly Horribilis - BTC Conviction-Scaled Leverage Hunter: Same BTC thesis as Grizzly v3.0 - different risk profile. Leverage scales with conviction: 7x at score 8-9 (standard), 15x at score 10-11, 25x at score 12-13, and up to 40x at score 14+ when SM consensus is overwhelming. Margin scales inversely so dollar risk stays controlled. Always protected by DSL - at 40x, an 8% ROE retrace is only a 0.2% BTC price move, so the DSL cuts in seconds if it doesn't work. The experiment: does extreme leverage on the highest-conviction setups produce outsized returns?
Cheetah - HYPE Predator: Hunts HYPE exclusively using SM commitment as the primary signal. Unlike Wolverine (which requires full timeframe alignment), Cheetah fires when SM commitment exceeds 80%. BTC trend is a conviction booster, not a hard gate. 14-point scoring system with HYPE-specific wide DSL tiers that give volatile HYPE positions room to breathe.
Wolverine v2.0 - HYPE Alpha Hunter: The conservative HYPE hunter. Requires strict 4H AND 1H alignment before entering - all timeframes must agree. When the market is choppy (no alignment), Wolverine sits at zero trades. That's correct behavior. Head-to-head comparison with Cheetah will show whether SM commitment alone (Cheetah) or full timeframe alignment (Wolverine) produces better results on HYPE.
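The conviction-scaled leverage described for Grizzly Horribilis can be written down in a few lines. This is a minimal sketch, not the agent's actual code: the function names are hypothetical, the "no trade below score 8" branch is an assumption, and "margin scales inversely" is read here as holding notional exposure constant, which is only one possible interpretation.

```python
def leverage_for_score(score: int) -> int:
    """Leverage tiers as quoted for Grizzly Horribilis."""
    if score >= 14:
        return 40
    if score >= 12:
        return 25
    if score >= 10:
        return 15
    if score >= 8:
        return 7
    return 0  # assumption: scores below 8 do not trade


def price_move_pct_for_roe(roe_pct: float, leverage: int) -> float:
    """Underlying price move (in %) that produces a given ROE move (in %)."""
    return roe_pct / leverage


def margin_for_notional(target_notional: float, leverage: int) -> float:
    """Assumed reading of 'margin scales inversely': keep notional fixed."""
    return target_notional / leverage


if __name__ == "__main__":
    lev = leverage_for_score(14)
    # At 40x, an 8% ROE retrace corresponds to a 0.2% price move.
    print(lev, price_move_pct_for_roe(8.0, lev))
    # Same notional exposure at 7x and at 40x needs very different margin.
    print(margin_for_notional(7000.0, 7), margin_for_notional(7000.0, 40))
```

The point of the arithmetic is that the stop distance in price terms shrinks as leverage grows, so the exit has to be enforced by something that ticks reliably rather than by the agent itself.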
Multi-Asset Scanners (Striker Architecture)
These agents scan 50-250+ markets every 90 seconds looking for violent SM rank explosions - the FIRST_JUMP signal. An asset that rockets from #40 to #15 on the SM leaderboard in a single scan, confirmed by volume, is a high-conviction entry.
Roach - Striker-Only Experiment: The control group. Pure Striker detection - violent FIRST_JUMP rank explosions confirmed by volume. Score 9+ required. Fast-cycling DSL (30-minute timeout, 15-minute weak peak cut, 10-minute dead weight cut). The baseline for comparing every other Striker variant.
Roach-B - Striker-Only Variant B: Identical to Roach, deployed on a separate wallet. Head-to-head comparison to validate that Striker performance is consistent across deployments, not luck.
Jaguar v2.0 - Striker with Mega-Jump Override: Same Striker logic as Roach, plus one differentiator: when an asset jumps 30+ ranks from #50+ in a single scan, the minimum score drops from 9 to 7. This catches the most violent SM explosions that lack secondary confirmation but are too explosive to ignore. The experiment: do mega-jump signals with lower overall conviction produce profits?
Mantis v4.0 - Striker-Only SM Explosion Scanner: The latest Mantis evolution. Stalker permanently removed after Orca v1.3 proved it loses money. Striker-only with the battle-tested Mantis market parser and scan history. Fast-cycling DSL. 6 entries per day maximum.
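The difference between Roach's fixed score gate and Jaguar's mega-jump override comes down to one conditional. The sketch below is illustrative only: `RankJump` and its field names are hypothetical, and volume confirmation and the scoring itself are left out.

```python
from dataclasses import dataclass

BASE_MIN_SCORE = 9      # Roach: score 9+ required
MEGA_MIN_SCORE = 7      # Jaguar: relaxed gate for mega-jumps
MEGA_JUMP_RANKS = 30    # 30+ ranks gained in a single scan
MEGA_START_RANK = 50    # starting from rank #50 or lower


@dataclass(frozen=True)
class RankJump:
    asset: str
    prev_rank: int   # SM leaderboard rank on the previous scan
    curr_rank: int   # SM leaderboard rank on this scan
    score: int       # conviction score computed elsewhere

    @property
    def ranks_gained(self) -> int:
        return self.prev_rank - self.curr_rank


def min_score(jump: RankJump, mega_override: bool) -> int:
    """Minimum score needed to enter, with or without the mega-jump override."""
    is_mega = (
        jump.ranks_gained >= MEGA_JUMP_RANKS
        and jump.prev_rank >= MEGA_START_RANK
    )
    return MEGA_MIN_SCORE if (mega_override and is_mega) else BASE_MIN_SCORE


def should_enter(jump: RankJump, mega_override: bool = False) -> bool:
    return jump.score >= min_score(jump, mega_override)


if __name__ == "__main__":
    jump = RankJump("EXAMPLE", prev_rank=60, curr_rank=20, score=7)
    print(should_enter(jump))                      # Roach-style gate
    print(should_enter(jump, mega_override=True))  # Jaguar-style gate
```

Keeping the override behind a flag is what makes the head-to-head comparison clean: the two agents differ in exactly one branch.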
Gen-2 Intelligence (Layered Hyperfeed Signals)
These agents go beyond the basic SM leaderboard. They layer multiple proprietary Hyperfeed signals - momentum events, trader quality classifications, contribution velocity - for higher-conviction entries.
Orca v2.0 - Gen-2 Striker with Quality Confirmation: The best Striker we can build. Same FIRST_JUMP explosion detection as Roach and Mantis, but enhanced with Gen-2 Hyperfeed signals. When Orca detects a rank explosion, it cross-references Tier 2 momentum events ($5.5M+ threshold) to check whether quality traders (TCS ELITE or RELIABLE) are driving the move. Quality confirmation is a score booster, not a hard gate - strong Striker signals still fire without it. This filters out pump-and-dumps driven by CHOPPY/DEGEN traders that vanilla Striker can't distinguish.
Sentinel v2.0 - Quality Trader Convergence Scanner: Inverted pipeline: instead of starting with an asset, start with the BEST TRADERS and find where they converge. Queries the top 100 historically-profitable traders (ELITE/RELIABLE classification with open positions), aggregates their holdings by asset and direction, and signals when 5+ quality traders independently converge on the same trade. Cross-confirmed with SM leaderboard concentration. The thesis: when multiple proven traders arrive at the same position independently, that's informed consensus, not coincidence.
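Sentinel's inverted pipeline is essentially a group-and-count over trader positions. A minimal sketch, assuming positions arrive as `(trader_id, asset, direction)` tuples already filtered to ELITE/RELIABLE traders; the tuple shape and function name are hypothetical, and the SM leaderboard cross-confirmation is omitted.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Position = Tuple[str, str, str]  # (trader_id, asset, direction)


def find_convergence(
    positions: Iterable[Position], min_traders: int = 5
) -> Dict[Tuple[str, str], int]:
    """Return {(asset, direction): trader_count} for crowded quality trades."""
    holders = defaultdict(set)
    for trader_id, asset, direction in positions:
        # A set ensures one trader with several fills is counted once.
        holders[(asset, direction)].add(trader_id)
    return {
        key: len(traders)
        for key, traders in holders.items()
        if len(traders) >= min_traders
    }


if __name__ == "__main__":
    sample = [(f"trader{i}", "ETH", "LONG") for i in range(5)]
    sample += [("trader0", "ETH", "LONG"), ("trader9", "BTC", "SHORT")]
    print(find_convergence(sample))
```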
Raptor v2.0 - Hot Streak Follower: When a quality trader crosses $5.5M+ in delta PnL (Tier 2 momentum event), they're on a hot streak. Raptor identifies their strongest position - the one with the highest delta PnL - and follows them into it, confirmed by SM leaderboard alignment. Event deduplication prevents chasing the same trader twice within 4 hours. Concentration gate ensures the trader is focused, not spread thin. Different from Orca (momentum as confirmation) and Sentinel (multi-trader convergence) - Raptor follows individual hot traders into their best trade.
Phoenix v2.0 - Contribution Velocity Scanner (Hardened): Scans for assets where SM profit velocity (contribution_pct_change_4h) is surging while price hasn't caught up yet. When SM profit contribution is growing 5x-50x faster than price, SM knows something the market doesn't. The same signal that found a HYPE SHORT at 54x divergence ratio peaking at +50% ROE. v2.0 keeps the identical battle-tested scoring (velocity tiers, SM magnitude, rank sweet spot, price lag detection, divergence ratio) but hardens the infrastructure - trade counter increments before output, stale dates auto-reset, daily cap of 4 entries. The signal was never the problem. The infrastructure was.
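The Phoenix hardening (increment before output, auto-reset stale dates, daily cap of 4) can be illustrated with a small counter. This is a sketch under assumptions, not Phoenix's actual code: the class and method names are hypothetical, and persistence of the counter between runs is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyTradeCounter:
    cap: int = 4                 # daily cap quoted for Phoenix v2.0
    day: Optional[date] = None   # the day the current count belongs to
    count: int = 0

    def try_reserve(self, today: date) -> bool:
        """Reserve one entry slot. Call this BEFORE emitting a signal."""
        if self.day != today:
            # Stale date: yesterday's count must never gate today's trades.
            self.day, self.count = today, 0
        if self.count >= self.cap:
            return False
        # Increment first, so a crash after this line can only under-trade.
        self.count += 1
        return True


if __name__ == "__main__":
    counter = DailyTradeCounter()
    today = date(2025, 1, 1)
    print([counter.try_reserve(today) for _ in range(6)])
    print(counter.try_reserve(date(2025, 1, 2)))
```

Reserving the slot before the signal goes out means the failure mode is a skipped trade rather than an uncounted one, which is the opposite of the bug that opened 24 positions in a day.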
Multi-Asset Thesis Pickers
Condor v2.0 - Multi-Asset Alpha Hunter: Evaluates BTC, ETH, SOL, and HYPE on every scan and picks the strongest thesis. Conviction-scaled margin (25/35/45% based on score). The multi-asset approach means Condor always has the best opportunity across the four major assets, rather than being locked into one.
Specialized Strategies
Fox v3.0 - Contra-Trend Striker: Every other Striker follows the trend. Fox bets against it. When SM violently enters OPPOSITE to the 4H price direction, they're front-running a reversal before price catches up. Tighter gates than normal Striker (rank jump 20+ instead of 15, score 10+ instead of 9, SM traders 20+ minimum) because contra-trend is inherently riskier. The highest-risk, highest-reward experiment in the fleet.
Bison v2.0 - Macro Conviction Holder: The opposite end of the spectrum from every other agent. While the fleet trades intraday moves, Bison waits for overwhelming macro consensus (SM 10%+, 80+ traders, strong 4H AND 1H alignment) on BTC, ETH, or SOL, and holds for up to 24 hours. 5x leverage. 30% max loss. 15% retrace. Tier 1 at +10% ROE locks NOTHING - the position breathes through any intraday noise. 1 entry per day maximum. Scans every 15 minutes, not every 90 seconds. The experiment: is the real alpha in catching multi-day macro trends and having the patience to hold through the drawdowns?
Bald Eagle v2.0 - XYZ Alpha Hunter: The only agent covering non-crypto markets. Trades all 54 XYZ assets on Hyperliquid - commodities (BRENTOIL, CL, SILVER), indices (SP500), equities, and currencies - with a spread gate that rejects illiquid assets (>0.1% spread = rejected). Uses SM concentration from the leaderboard to identify which XYZ assets have strong smart money positioning.
Lemon - Degen Fader: Finds DEGEN and CHOPPY traders bleeding at 10x+ leverage and -10%+ ROE, then counter-trades them. The thesis: when a high-leverage degen is deep underwater, the probability of liquidation or forced exit is high. Lemon rides the cascade. Scans 100 traders per cycle, scores them on leverage depth, bleed severity, cluster formation (multiple degens bleeding on the same asset), and SM opposition.
Hydra - Multi-Source Squeeze Scanner: Hunts for funding rate squeezes - when crowd funding is extreme in one direction but SM is positioned opposite, and price is starting to move against the crowd. $20M daily volume gate ensures the asset is liquid enough to trade. Squeezes are episodic - Hydra sits quiet for days and fires when conditions align.
Owl - Contrarian Crowding-Unwind: Contrarian play on crowded positions. Looks for assets where positioning is extremely one-sided and shows signs of unwinding. Still gathering data.
The Runtime: What Changed Everything
The single biggest improvement wasn't a better scanner or a smarter model - it was the Senpi Hyperliquid Agents Runtime, which we launched earlier this week.
Before the runtime: agents managed their own exits via Python crons writing JSON state files. Crons died silently. State files corrupted. Agents hallucinated exits. Nine agents lost their wallet address fields. Positions ran unprotected for hours.
After the runtime: exits are infrastructure. The DSL plugin detects every position onchain automatically, monitors every 30 seconds, trails with a 2-phase system (Phase 1 protects from entry, Phase 2 locks profits as they grow), and places exchange stop-loss orders as a safety net against flash crashes. One YAML config. One install command.
The results were immediate. Polar went from scratching trades at +0.35% ROE after 2 minutes (thesis exit) to holding positions through Phase 2 Tier 3 at +19.8% ROE. Same agent. Same signal. Same market. The only change was who controlled the exit.
The runtime also ships auto-upgrades (every agent gets enhancements automatically) and health checks (honest answers about whether the infrastructure is working, not agent self-reports).
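To make the two-phase idea concrete, here is a minimal sketch of a trailing stop expressed in ROE terms. It is an illustration under stated assumptions, not the DSL plugin's logic: the tier table is invented for the example, the way tier locks combine with the retrace is a guess, and the 30% max loss and 8% retrace defaults are simply borrowed from figures quoted for individual agents above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrailingStop:
    max_loss_roe: float = 30.0   # Phase 1 floor: exit at -30% ROE
    retrace_roe: float = 8.0     # Phase 2: allowed give-back from the peak
    # Hypothetical tiers: (peak ROE that arms the tier, ROE it locks in).
    tiers: Tuple[Tuple[float, float], ...] = (
        (10.0, 0.0),
        (20.0, 10.0),
        (40.0, 25.0),
    )
    peak_roe: float = 0.0

    def floor(self) -> float:
        """Current exit level in ROE %, given the best ROE seen so far."""
        locks = [lock for trigger, lock in self.tiers if self.peak_roe >= trigger]
        if not locks:
            return -self.max_loss_roe  # Phase 1: protect from entry
        # Phase 2: never below the locked tier, and trail the peak.
        return max(max(locks), self.peak_roe - self.retrace_roe)

    def on_tick(self, roe: float) -> bool:
        """Feed the latest ROE (e.g. every 30 seconds). True means close."""
        self.peak_roe = max(self.peak_roe, roe)
        return roe <= self.floor()


if __name__ == "__main__":
    stop = TrailingStop()
    for roe in (5.0, 12.0, 25.0, 16.0):
        print(roe, stop.floor(), stop.on_tick(roe))
```

Nothing in this loop needs judgment, which is the argument for running it as infrastructure: it is a pure function of the price history and a fixed config.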
The Architecture
┌──────────────────────────────────────────────────────────────────┐
│ SENPI PLATFORM │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ SENPI BACKEND (proprietary) │ │
│ │ │ │
│ │ ┌───────────────────────┐ ┌───────────────────────────┐ │ │
│ │ │ HYPERFEED │ │ EXECUTION SERVICES │ │ │
│ │ │ (proprietary data) │ │ │ │ │
│ │ │ │ │ Position management │ │ │
│ │ │ SM leaderboard │ │ Order routing │ │ │
│ │ │ Momentum events │ │ Clearinghouse state │ │ │
│ │ │ Trader quality tags │ │ Market prices │ │ │
│ │ │ Contribution velocity│ │ Account management │ │ │
│ │ │ Concentration scoring│ │ Strategy/vault CRUD │ │ │
│ │ │ Top 1K traders │ │ │ │ │
│ │ │ Real-time signals │ │ │ │ │
│ │ └───────────────────────┘ └───────────────────────────┘ │ │
│ │ │ │
│ └──────────────────────────┬─────────────────────────────────┘ │
│ │ │
│ SENPI MCP │
│ (transport layer) │
│ │ │
│ Exposes backend as 48 MCP tools: │
│ market data, leaderboard, positions, │
│ orders, strategies, prices ... │
│ │ │
│ ┌───────────────────────────┴──────────────────────────────────┐ │
│ │ SENPI TRADING RUNTIME │ │
│ │ (can run on OpenClaw, Hermes, etc) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ SCANNER SUBSYSTEM 9 built-in scanner types │ │ │
│ │ │ Weighted confluence scoring · dependency resolution │ │ │
│ │ │ Consumes Hyperfeed via MCP provider layer │ │ │
│ │ └────────────────────────────┬────────────────────────────┘ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ EXECUTION ENGINE Serial FIFO queue │ │ │
│ │ │ No LLM in execution path · Zod schema validation │ │ │
│ │ │ Risk guard · Idempotent orders · Atomic state creation │ │ │
│ │ └────────────────────────────┬────────────────────────────┘ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ DSL EXIT ENGINE In-process 30s ticks │ │ │
│ │ │ Two-phase trailing stop · Exchange reconciliation │ │ │
│ │ │ Config immutability · State recovery on restart │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Every failure becomes a schema rule. │ │
│ │ Every bug becomes a structural impossibility. │ │
│ │ The runtime compounds with every agent that runs on it. │ │
│ │ │ │
│ └──────────────────────────────┬────────────────────────────────┘ │
│ │ │
│ ┌──────────────────────────────┴───────────────────────────────┐ │
│ │ LEARNING AGENTS (the swarm) │ │
│ │ │ │
│ │ Read data on every trade · Identify failure patterns │ │
│ │ Propose runtime improvements · Write fixes │ │
│ │ Launch new strategy experiments · Feed back into runtime │ │
│ │ │ │
│ │ Today: human-mediated learning cycle │ │
│ │ Next: autonomous swarm — observe, learn, harden, repeat │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┴──────────────────────────────────┐ │
│ │ SKILLS + PLUGINS (open source) │ │
│ │ github.com/Senpi-ai/senpi-skills │ │
│ │ │ │
│ │ Skills: 18+ trading strategies (Orca, Fox, Polar, etc.) │ │
│ │ YAML config + Python scanners │ │
│ │ │ │
│ │ Plugins: DSL exit engine, risk guard, health monitoring │ │
│ │ Modular runtime components, registry-based │ │
│ │ │ │
│ │ Forkable, copyable — skills + plugins are commodities │ │
│ │ The runtime that orchestrates them is the moat │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ LLM LAYER (commodity) │ │
│ │ │ │
│ │ Gemini · Claude · GPT · any model │ │
│ │ │ │
│ │ Strategy reasoning + signal interpretation ONLY │ │
│ │ Never touches execution (margin, size, leverage) │ │
│ │ Swappable — the model is not the moat │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────┐
│ HYPERLIQUID │
│ (~$7B daily) │
└───────────────────┘
Every agent in the fleet follows the same pattern:
Scanner -> finds the trade. Each scanner embodies a unique thesis - a specific hypothesis about how to extract alpha from Hyperliquid. The scanner outputs a signal with an asset, direction, score, and entry parameters. It does NOT manage exits.
Runtime -> protects the trade. The Senpi Runtime's DSL plugin detects positions onchain, trails stop-losses automatically, and executes exits. The scanner doesn't know when positions close. The runtime handles it.
Hyperfeed -> feeds the scanner. Senpi's proprietary data layer tracks the top 1,000 traders on Hyperliquid in realtime. Contribution velocity, momentum events, trader quality scores, concentration analysis - this is the signal that every scanner consumes.
The model is a commodity. The skills are open source. The proprietary data layer that feeds the agents and the hardened runtime that executes for them - that's what compounds. That's the moat.
Why We Run 43 Agents with Our Own Capital
This isn't just an experiment. It's our R&D engine.
Every agent in the fleet is a live stress test of the entire Senpi stack - the data layer, the runtime, and the execution infrastructure. When Phoenix's trade counter breaks and opens 24 positions in a day, that's a bug we find and fix before any user hits it. When the position tracker drops DSL state on a transient API error, we discover it across 3 agents simultaneously and ship a fix that protects every agent on the platform. When Kodiak enters the same exhausted SOL SHORT 5 times because the cooldown doesn't fire, that's a feature request for the runtime that every future agent benefits from.
We've now documented and fixed six major categories of infrastructure failure - margin hallucination, missing DSL state, wrong cron commands, static price floors, field naming mismatches, and agent self-modification. Each one was discovered by running real agents with real capital. Each fix is now baked into the runtime so no user ever encounters it.
The fleet also teaches us what works in the market. Across 43 agents running different strategies, we now know: single-asset lifecycle hunters outperform multi-asset scanners. Striker-only outperforms dual-mode. Patience outperforms frequency. Quality trader confirmation improves win rates. Contra-trend signals are rare but potentially high-alpha. These aren't hypotheses - they're conclusions from thousands of live trades.
This is the flywheel: the more agents we run, the more edge cases we discover, the more reliable the runtime becomes, the better the data layer gets, the more strategies users can configure, and the more valuable the platform becomes for everyone.
The goal isn't 43 trading skills. The goal is tens of thousands of trading variations - built by users and their AI assistants - each configured with a unique thesis, each protected by the same hardened runtime, each fed by the same proprietary data layer. The skills are open source. The entry gates are declarative YAML. Any developer can build a new strategy in hours. The infrastructure handles the rest.
We're laying the foundation now so that when the ecosystem scales, every agent - whether it's one of ours or one built by a user on the other side of the world - runs on battle-tested infrastructure that was hardened by real capital, real bugs, and real lessons.
What's Next
The fleet is live and trading. Here's what we're building:
Position tracker hardening. The tracker needs to tolerate transient API errors without dropping DSL state. Fix: require 3 consecutive polls confirming a position is gone before archiving.
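The "3 consecutive polls" rule can be sketched as a small debounce. This is an illustration, not the tracker's code: the class name is hypothetical, and treating a failed poll as "no information" (neither counting toward nor resetting the streak) is an assumption.

```python
from typing import Dict, Optional


class PositionGoneDebouncer:
    """Archive a position only after N consecutive polls confirm it is gone."""

    def __init__(self, required_misses: int = 3) -> None:
        self.required_misses = required_misses
        self._misses: Dict[str, int] = {}

    def observe(self, position_id: str, seen: Optional[bool]) -> bool:
        """Record one poll result and return True if it is safe to archive.

        seen=True  -> position is present on the exchange
        seen=False -> poll succeeded and the position was absent
        seen=None  -> poll failed (transient API error): no information
        """
        if seen is None:
            return False  # assumption: errors never count as confirmation
        if seen:
            self._misses.pop(position_id, None)  # reappeared: reset streak
            return False
        misses = self._misses.get(position_id, 0) + 1
        if misses >= self.required_misses:
            self._misses.pop(position_id, None)
            return True
        self._misses[position_id] = misses
        return False


if __name__ == "__main__":
    tracker = PositionGoneDebouncer()
    polls = [False, None, False, False]
    print([tracker.observe("ETH-LONG", p) for p in polls])
```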
Per-asset cooldown in the runtime. When the DSL closes a position, enforce a configurable cooldown before the scanner can re-enter that asset. Prevents the churn pattern where a scanner re-enters an exhausted thesis 5 times in a row.
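A per-asset cooldown is little more than a timestamp per asset. A minimal sketch, with hypothetical names and the clock passed in explicitly so the behaviour is easy to test; the 180-minute value is borrowed from Grizzly v3.0's config for illustration.

```python
from typing import Dict


class AssetCooldown:
    """Block re-entry into an asset for a fixed period after a close."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._closed_at: Dict[str, float] = {}

    def record_close(self, asset: str, now: float) -> None:
        """Call when the exit engine closes a position in `asset`."""
        self._closed_at[asset] = now

    def can_enter(self, asset: str, now: float) -> bool:
        """True if the asset was never closed or its cooldown has elapsed."""
        closed_at = self._closed_at.get(asset)
        return closed_at is None or now - closed_at >= self.cooldown_seconds


if __name__ == "__main__":
    cooldown = AssetCooldown(cooldown_seconds=180 * 60)
    cooldown.record_close("SOL", now=0.0)
    print(cooldown.can_enter("SOL", now=60.0))        # still cooling down
    print(cooldown.can_enter("SOL", now=180 * 60.0))  # cooldown elapsed
    print(cooldown.can_enter("ETH", now=60.0))        # other assets unaffected
```

Enforcing this in the runtime rather than in each scanner means a scanner bug can no longer produce the repeated re-entry pattern, because the scanner is not the component holding the timer.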
Risk enforcement module. Portfolio-level risk rules that no individual agent can override. Daily loss limits, max exposure per asset, max leverage. The runtime blocks violations before they reach the chain.
Declarative entry gates. Instead of hardcoding trend alignment logic in Python, agents declare entry requirements in YAML. Any developer can add N gates to a scanner by writing configuration, not code.
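One way to picture declarative gates: each gate is a row of data, and a single generic evaluator applies all of them. In the sketch below a Python list of dicts stands in for the parsed YAML, and the gate schema (`field`, `op`, `value`) and the snapshot field names are hypothetical, not Senpi's actual format.

```python
import operator
from typing import Any, Dict, List

OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}


def failed_gates(gates: List[Dict[str, Any]], snapshot: Dict[str, Any]) -> List[str]:
    """Return the fields whose gate did not pass; an empty list means enter."""
    failures = []
    for gate in gates:
        actual = snapshot.get(gate["field"])
        # Missing data fails closed: no value, no entry.
        if actual is None or not OPS[gate["op"]](actual, gate["value"]):
            failures.append(gate["field"])
    return failures


if __name__ == "__main__":
    # Stand-in for gates loaded from a YAML config (schema is illustrative).
    gates = [
        {"field": "sm_commitment_pct", "op": ">=", "value": 80},
        {"field": "trend_4h", "op": "==", "value": "UP"},
    ]
    print(failed_gates(gates, {"sm_commitment_pct": 85, "trend_4h": "UP"}))
    print(failed_gates(gates, {"sm_commitment_pct": 70, "trend_4h": "UP"}))
```

Adding a gate then means adding a row, and the evaluator that every agent shares only has to be tested once.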
Scaling to thousands of strategies. Every lesson from the fleet - from DSL tuning to cooldown logic to conviction scaling - feeds back into the runtime and the open-source skills library. The next team building an agent on Senpi starts from where we are now, not from zero.
43 agents. $23M+ in volume. Every one of them running on the Senpi Runtime.
AI models are a commodity. Skills are open source. Everyone will have agents. The proprietary data layer that feeds them and the hardened runtime that executes for them - that's what compounds. That's the moat.
Live now at senpi.ai.
