ToolRadarHQ
// 28 picks · 90 days

Explore the radar’s memory

Every pick we’ve ever shipped, plotted as a living constellation. Picks cluster by topic; threads between clusters reveal cross-topic connections. Zoom in to expand a cluster into its picks — drag, filter, click through to the brief.

// Browse

All picks

Archive results

  • Dong90/oh-my-taiyiforge

    An AI workflow automation plugin wiring Claude and Codex into code generation pipelines. Aimed at developers who want to chain model calls inside an existing editor or automation setup without building the scaffolding from scratch.

  • kennss/SiliconScope

    A native SwiftUI system monitor for Apple Silicon that surfaces ANE usage, Media Engine activity, and memory bandwidth alongside the usual CPU and GPU metrics — all without requiring elevated permissions. For developers who actually want to see what their ML workloads are doing on-device.

  • sums001/Windows-Copilot-API

    Reverse-engineers Windows Copilot to expose GPT-4 and GPT-5 through an OpenAI-compatible REST endpoint with no API key and no billing. Point an existing OpenAI client at it and you get free model access until Microsoft patches it or you get banned, whichever comes first.

  • basionwang-bot/HermesPet

    A zero-dependency macOS desktop AI companion that lives in the Dynamic Island notch area. Supports multiple LLM backends in parallel, built in Swift 6 and SwiftUI, targeting macOS 14 and above.

  • intellicia-public/parastore

    Draw a store layout, spin up LLM-powered consumer personas, and watch them make purchasing decisions in a live isometric 3D simulation. A sandbox for synthetic market research that skips the survey panel and runs on your own data assumptions.

  • superloglabs/superlog

    An open-source observability layer that routes logs through AI agents to diagnose and attempt automated fixes on software failures. Targets small teams that want incident triage without a dedicated SRE.

  • Purewhiter/mobilegym

    A browser-hosted Android simulator built for training and evaluating mobile GUI agents at scale, with verifiable RL rollouts. Aimed at researchers and engineers who need reproducible, parallelized mobile-environment runs without physical device farms.

  • VibeBench/VibeSearchBench

    A benchmark for evaluating search and retrieval agents on 200 long-horizon tasks with vague, multi-turn, persona-driven queries scored against a knowledge-graph ground truth using triplet F1. Built for teams who suspect their RAG or search agent is fooling easy evals.

  • 2aronS/Duel-Agents

    CLI, SDK, and IDE plugins for a Duel Agents framework, targeting developers who want to run two agents in opposition or collaboration on a task. The core concept is interesting but the repo is early and the use case needs more concrete grounding before it earns a Saturday.

  • gi-dellav/zerostack

    A minimal coding agent written in Rust, built around low memory footprint and raw performance rather than feature breadth. Early-stage and sparse on documentation, but the Rust angle is a real differentiator for teams hitting memory limits with Python-based agent runtimes.

  • study8677/awesome-architecture

    A curated repo of 26 bilingual system-design tutorials, 25 architecture templates, and 6 end-to-end case studies covering distributed systems, RAG pipelines, and coding agents. Rare to find this level of coverage in one place, with bilingual content making it useful across more teams.

  • chiennv2000/orthrus

    Lossless LLM inference via dual-view diffusion decoding, claiming meaningfully faster throughput without sacrificing output quality. If the benchmarks hold up under real workloads, this is the kind of architectural bet worth watching before it gets absorbed into a mainstream inference stack.

  • chorus-codes/chorus

    Runs your code decisions past two to four LLMs simultaneously before you ship, acting as a peer-review layer on top of whatever CLI you already use. If you have ever shipped something that Claude approved but GPT would have flagged, this is the obvious next step.

  • agentic-in/elephant-agent

    A self-evolving AI agent that treats a personal model as the primary interface, updating its own behavior over time based on usage. Interesting architecture thesis, but the self-evolution claim needs scrutiny before anyone hands it production responsibility.

  • raindrop-ai/workshop

    A framework for giving coding agents the ability to write and run their own evals. If your team is shipping agent pipelines and tired of manually eyeballing outputs, this is the missing loop that lets the agent measure itself.

  • jmerelnyc/Photo-agents

    A framework for autonomous computer-operating agents with vision-grounded memory and self-written skill accumulation. Aimed at AI engineers who want to build agents that learn new desktop tasks over time rather than starting from a fixed skill set. Early-stage and experimental.

  • YGYOOO/WorldX

    Drop a single sentence into this tool and it builds a living world complete with a map, characters, and self-directed storylines that unfold without any further input from you. AI engineers and indie hackers curious about persistent multi-agent coordination will find the architecture worth reverse-engineering for their own autonomous agent projects.

  • AIScientists-Dev/WorldSeed

    A multi-agent simulation engine where AI agents interact, compete, and form alliances inside a constructed world. Research-leaning sandbox for emergent behavior experiments. Interesting if you are studying agent social dynamics or building a game world that needs autonomous NPC behavior.

  • walkinglabs/hands-on-modern-rl

    A runnable open-source curriculum that takes you from foundational reinforcement learning through RLHF, RLVR, and agentic system design in one coherent sequence. AI engineers and technical PMs building agent products will find its careful coverage of the middle ground, the part most resources skip entirely, worth reading in full.

  • raiyanyahya/how-to-train-your-gpt

    A fully annotated transformer built from the ground up for developers who ship AI products but still feel shaky on what happens inside the model. Every tokenization choice, attention mechanism, and training loop gets a plain-English explanation dense enough to finally give you real opinions in architecture discussions.

  • lightseekorg/tokenspeed

    A self-hosted LLM inference engine claiming genuine throughput gains over established runtimes like vLLM, aimed at AI engineers and SaaS teams where every extra token-per-second translates directly to infrastructure cost savings. The benchmarks are worth pressure-testing yourself; the full write-up covers exactly what to look for.

  • kessler/gemma-gem

    Runs Google's Gemma 4 model entirely in-browser via WebGPU — no API key, no server, no data leaving the machine. Worth a look for indie hackers building privacy-first tools or demos that need a local model without a Python backend.

  • amitshekhariitbhu/llm-internals

    A structured guide for AI engineers and technical PMs who ship LLM-powered features without fully understanding what is happening beneath the surface. It walks through the real internals (tokenization, attention, inference optimization) at a depth that closes the gap between calling an API and actually knowing why your model behaves the way it does.

  • future-agi/future-agi

    A self-hostable observability and eval stack built for teams shipping LLM-powered products who are tired of stitching together three separate tools to get from raw traces to regression tests. The simulation layer, which replays agent runs under altered conditions rather than just logging what happened, is what makes this worth a serious look over the usual suspects.

  • VectifyAI/OpenKB

    An open-source LLM knowledge base you can self-host. Targets teams that want RAG over internal documents without giving data to a SaaS vendor. Worth a look if OpenAI or Notion AI is off the table for compliance reasons.
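
The VibeBench/VibeSearchBench pick above mentions scoring against a knowledge-graph ground truth using triplet F1. For readers unfamiliar with the metric, here is a minimal sketch of how a triplet F1 is commonly computed. It assumes set-based, exact-match comparison of (subject, relation, object) triplets; the benchmark's actual matching rules are not described here and may differ. The function name `triplet_f1` and the sample triplets are hypothetical, made up for illustration only.

```python
from typing import Iterable, Tuple

# Assumed shape of a knowledge-graph fact: (subject, relation, object).
Triplet = Tuple[str, str, str]


def triplet_f1(predicted: Iterable[Triplet], gold: Iterable[Triplet]) -> float:
    """Set-based F1 over exact-match triplets (an assumed matching rule)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    # No overlap (or an empty side) means precision or recall is zero.
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)  # share of predictions that are correct
    recall = true_positives / len(gold_set)     # share of gold facts that were recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: three gold facts, two predictions, one of them correct.
gold = [
    ("laptop", "has_brand", "Acme"),
    ("laptop", "price_under", "1000"),
    ("laptop", "used_for", "travel"),
]
predicted = [
    ("laptop", "has_brand", "Acme"),
    ("laptop", "used_for", "gaming"),
]

score = triplet_f1(predicted, gold)
print(f"triplet F1 = {score:.2f}")
```

With one correct triplet out of two predictions and three gold facts, precision is 1/2 and recall is 1/3, so F1 comes out to 0.4. An agent that returns a few plausible facts but misses most of the ground truth scores low even when nothing it says looks obviously wrong, which is what separates a metric like this from easier evals.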