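2026 flagship builds
Projects
The portfolio now centers on AI-heavy, backend-heavy projects that better reflect where I am heading professionally: model building, agent orchestration, safe autonomy, and workflow engineering.
2026 · Decoder-only language model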
Phoenix 125M
A LLaMA-style 125M-parameter model trained from scratch on a single RTX 3080 Ti, with a custom tokenizer, data pipeline, and training loop.
The project is an end-to-end exercise in model building: corpus curation, tokenization, training stability, benchmarking, and open-source packaging.
✓ ~2B tokens processed, Apache 2.0 release, WinoGrande 0.507.
2026 · Multilingual language models
Sweta-Hi and Sweta-Kn
Hindi and Kannada pretraining work on a LLaMA-style architecture, with custom tokenizers and an end-to-end multilingual data pipeline.
This work is focused on underrepresented language coverage, practical training throughput, and evaluation quality ahead of release.
✓ Custom tokenizers, async data loading, released on Hugging Face.
2026 · Fine-tuning · Text-to-SQL
SQLForge: Mistral 7B QLoRA
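A 4-bit QLoRA fine-tune that turns Mistral 7B v0.3 into a reliable text-to-SQL model. It runs on the same 12 GB GPU that trained Phoenix 125M, with 3.75 GB of VRAM headroom; after the first WikiSQL run exposed distorted metrics, I rebuilt the evaluation pipeline to be schema-aware.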
The project is a focused engineering exercise in capability lift on consumer hardware: model selection by VRAM math, LoRA rank tuning, instruction-template-correct loss masking, and an evaluation harness that does true execution-accuracy comparison against table rows rather than string match.
✓ +77.8 percentage point exact-match lift, 97.4 percent valid SQL, ~8.25 GB peak VRAM on a 12 GB card.
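To make the difference between string match and execution accuracy concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `players` table, both queries, and the `execution_match` helper are hypothetical illustrations, not the actual SQLForge harness.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # SQL that fails to run counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    # Counter ignores row order but still respects duplicate rows.
    return Counter(predicted_rows) == Counter(gold_rows)


# Hypothetical table standing in for a WikiSQL-style schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Asha", "Blue", 31), ("Ravi", "Red", 18), ("Meera", "Blue", 24)],
)

gold = "SELECT name FROM players WHERE team = 'Blue'"
pred = "select name from players where team='Blue'"

string_match = pred == gold                     # formatting differs, so this fails
exec_match = execution_match(conn, pred, gold)  # same rows, so this passes
```

Comparing rows as a multiset is one reasonable choice here; a query with `ORDER BY` would need an order-sensitive comparison instead.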
2026 · Agentic content pipeline
LinkedIn Post Swarm
A multi-agent publishing workflow that uses Claude, Ollama, Playwright, and Telegram for drafting, review, approval, and scheduled posting.
The system includes critic-revision loops, source aggregation, state management, retries, and escalation paths so autonomy stays controllable.
✓ Human-in-the-loop approvals, resilient retries, scheduled output.
2026 · Autonomous AI security orchestrator
Rudra
A multi-agent offensive security architecture built around strict scope guardrails, sandboxed execution, and auditable event-driven workflows.
The emphasis is on safe autonomy: typed validation, retry budgets, isolation boundaries, and guardrails that make the system usable for serious testing.
✓ Recon + Analyst agents complete. Scope validation, sandbox, and audit trail fully designed.
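As a rough illustration of what a scope guardrail can look like, here is a minimal sketch built on the standard-library `ipaddress` module. The `Scope` class, its `permits` method, and the example network are hypothetical and are not Rudra's actual design.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical engagement scope: the only networks an agent may touch."""

    allowed_networks: tuple

    def permits(self, target: str) -> bool:
        try:
            address = ipaddress.ip_address(target)
        except ValueError:
            return False  # anything that is not a literal IP is rejected
        return any(address in network for network in self.allowed_networks)


scope = Scope(allowed_networks=(ipaddress.ip_network("10.0.5.0/24"),))

in_scope = scope.permits("10.0.5.17")     # inside the authorised range
out_of_scope = scope.permits("10.0.6.1")  # neighbouring subnet, refused
```

The sketch fails closed: unparseable targets and hostnames are refused rather than resolved, which keeps the check independent of DNS.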
2026 · AI-driven lead generation workflow
LocalLeads
An end-to-end backend system for business discovery, AI content generation, website assembly, deployment, and personalized outreach.
Operational controls include SQLite state tracking, Telegram approvals, deployment automation, and delivery flows aimed at production-style reliability.
✓ ~25 businesses contacted, live deployment automation, Telegram approval gates.
2026 · Autonomous trading intelligence system
ATIS
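A six-layer autonomous system that ingests research papers and filings, builds a causal knowledge graph, backtests theses with walk-forward and Monte Carlo validation, and generates daily ranked swing-trading signals for 600 NSE/BSE stocks.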
The system is built on three principles: every signal traces to a validated thesis in the knowledge graph, every LLM reasoning step is verified against Neo4j facts, and the architecture self-improves through Elo-based thesis lifecycle management and agent decision auditing.
✓ 59 agents built, Rust hot path implemented, 85/100 system effectiveness score on free data alone.
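For readers unfamiliar with Elo ratings, here is a minimal sketch of the standard Elo update. How ATIS pairs theses against each other is not described above, so the pairing, the starting rating of 1500, and the K-factor of 32 are assumptions for illustration only.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A outperforms B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated ratings; score_a is 1 for a win, 0.5 for a draw, 0 for a loss."""
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


# Hypothetical: two theses start level, and thesis A's signal outperforms B's.
thesis_a, thesis_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

With equal starting ratings the expectation is 0.5, so the winner gains half the K-factor and the loser drops by the same amount.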
2026 · Model Context Protocol server
Dhan MCP Server
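A Model Context Protocol server that exposes my live DhanHQ brokerage account and Indian market data to any MCP client, so Claude or Copilot can directly answer questions about holdings, positions, and option chains.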
Built on the official DhanHQ SDK v2 with 11 live tools across portfolio, market data, options, and instrument discovery, two transports, and order placement implemented but disabled so it can never touch real money unintentionally.
✓ 11 live MCP tools, official DhanHQ SDK v2, order placement disabled by default for safety.
2026 · Code comprehension tool
CodeAtlas
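A code comprehension tool that converts C source code into per-function Mermaid flowcharts, with two interchangeable engines, a deterministic AST path and an LLM path, served through a FastAPI backend and a React canvas.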
The same backend exposes two diagram modes selected by an environment flag. AST mode parses C with ast-grep and emits Mermaid directly, fast and deterministic with no model in the loop. LLM mode sends each function to a model through LiteLLM, so it works against local Ollama, GitHub Models, OpenAI, or Anthropic without code changes.
✓ Two diagram engines (AST and LLM), four-service Docker stack, default to local Ollama so it runs free and offline.
2026 · GraphRAG knowledge graph
GraphMind
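A GraphRAG project that ingests trading and research data into a Neo4j knowledge graph and a ChromaDB vector store, then lets a LangGraph ReAct agent choose among graph traversal, dense retrieval, and hybrid retrieval to answer.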
Retrieval is three-pronged: Neo4j Cypher for multi-hop relationship questions (which analysts covered an instrument, how funds connect to trades to sectors), dense vectors for semantic similarity, and BM25 for exact terms like tickers. Dense and sparse results are merged with Reciprocal Rank Fusion so meaning and exact-match both count.
✓ Neo4j + ChromaDB + BM25 fused via Reciprocal Rank Fusion, exposed as tools to a LangGraph ReAct agent.
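Reciprocal Rank Fusion is simple enough to show in a few lines. This is a minimal sketch, not GraphMind's code: the document IDs are made up, and the smoothing constant `k=60` is the commonly used default rather than a value stated above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs; each list adds 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the dense and BM25 retrievers.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense, sparse])
```

Because RRF only looks at ranks, it needs no score normalisation between the dense and sparse retrievers, and documents found by both rise to the top.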
2026 · GraphRAG with an LLM firewall
PaperGraph
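GraphRAG question answering over a curated corpus of research papers, fronted by a four-layer LLM firewall, with offline RAGAS evaluation and self-hosted Langfuse tracing, packaged for GCP Cloud Run.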
PaperGraph fuses a Neo4j knowledge graph with vector retrieval, then guards the entire path with an LLM firewall built to survive untrusted input. It is the flagship for production RAG, AI safety, and LLM observability.
✓ 100% injection-block on a 20-prompt adversarial set, offline RAGAS + Langfuse, 45 tests passing.
2026 · Demand forecasting + inventory copilot
KiranaIQ
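A demand forecasting and inventory copilot for small Indian kirana grocery stores: snap a photo of a bill and get per-SKU forecasts, plain-language explanations, reorder quantities, and price experiments.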
KiranaIQ bundles five capabilities small retailers have no analytics for today: reading a paper bill, forecasting demand per item, explaining the forecast, recommending what and how much to reorder, and testing price changes safely. It is the flagship for classical and forecasting ML, explainability, and applied product engineering.
✓ Measured WAPE 35.8% vs 62.1% seasonal-naive on synthetic retail data, 64 tests passing.
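For context on the metric, WAPE is the sum of absolute errors divided by the sum of absolute actuals, and a seasonal-naive baseline repeats the value from one season earlier. The sketch below assumes that standard definition; the sales numbers and function names are hypothetical and unrelated to the figures above.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


def seasonal_naive(history, season_length, horizon):
    """Forecast each step with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


# Hypothetical daily unit sales for one SKU.
actual = [10, 0, 5, 5]
forecast = [8, 1, 5, 8]
error = wape(actual, forecast)  # (2 + 1 + 0 + 3) / 20

baseline = seasonal_naive([3, 4, 5, 6], season_length=2, horizon=3)
```

Unlike MAPE, WAPE stays defined when individual days have zero sales, which is common for slow-moving items.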
2026 · Learning-to-rank retrieval benchmark
hybrid-search-bench
An honest hybrid retrieval benchmark: BM25, SPLADE, and dense retrieval are fused, then reranked by a LambdaMART learning-to-rank model and measured on public BEIR datasets.
Three retrieval legs are evaluated on the same qrels, fused with reciprocal-rank fusion, and then reordered by a learning-to-rank reranker trained on the train split. It is the flagship for search, ranking, and the classical learning-to-rank skills that pure-LLM portfolios usually miss.
✓ BEIR SciFact nDCG@10 0.778 vs 0.728 RRF (+6.9%); BM25 0.686 matches the published figure.
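As a reference for how the headline metric works, here is a minimal sketch of nDCG@k with linear gain. It is an illustration rather than the benchmark's own evaluation code, and some toolkits use an exponential gain instead. The qrels and ranking are hypothetical; the last lines recompute the relative lift from the two scores quoted above.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for one query; qrels maps doc ID to a graded relevance."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal_dcg = dcg_at_k(sorted(qrels.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical query: two relevant documents, one ranked first and one third.
qrels = {"d1": 1, "d2": 1}
score = ndcg_at_k(["d1", "x", "d2"], qrels)

# Relative lift of the reranker over plain RRF, from the reported scores.
relative_lift_pct = round((0.778 - 0.728) / 0.728 * 100, 1)
```

The per-query scores are averaged over all queries to get the dataset-level figure.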
Earlier work
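Before the current generation of work, I used smaller ML and web projects to build habits that still matter today: running experiments, debugging, and shipping complete systems.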
2023
Semantic Search Engine
An earlier information retrieval build that combined semantic search ideas with enterprise documentation use cases and set up later work in retrieval-heavy AI systems.
2022
Super Resolution
An image enhancement project built to understand GAN-based vision pipelines and experiment rigor in visual ML work.
2022
Photo to Monet-style art
A CycleGAN style-transfer exploration that taught me a lot about training instability, qualitative evaluation, and visual debugging.
2022
Library Management
A MERN-stack build that sharpened my full-stack fundamentals around CRUD, search, and practical product structure.
On the roadmap
What I plan to build next is listed here ahead of building it in public.
Open-source and public work
Explore models, code, and experiments
All flagship projects are documented on GitHub. Model weights and cards are published on Hugging Face.