Data infrastructure · HackDuke Best Use of Solana
DataCrawl
Won HackDuke Code for Good 2026 Best Use of Solana with a prompt-to-dataset agent pipeline for validated financial data acquisition.

Overview
DataCrawl automates financial dataset acquisition from plain-English requests. Instead of starting with brittle one-off scripts, it treats acquisition as an orchestrated pipeline from prompt to validated file output.
My contribution
Gemini orchestrator design, LangGraph/FastAPI pipeline wiring, and validation subagent coordination.
Problem
Useful financial data is often spread across sources with inconsistent structures. Manual collection does not scale, and shallow crawlers break down as soon as the output has to pass validation.
Approach
- Built a Gemini orchestrator coordinating 5+ subagents for crawling, normalization, and validation.
- Connected LangGraph and FastAPI so the pipeline could move from prompt to schema-accurate output files.
- Designed the flow around repeatable execution rather than one-off scraping sessions.
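The crawl, normalize, and validate stages above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual LangGraph or Gemini code. The schema, the stage functions, the canned records, and every name below are hypothetical stand-ins chosen to show the shape of a repeatable prompt-to-validated-rows flow.

```python
from dataclasses import dataclass, field

# Hypothetical target schema (an assumption): column name -> required Python type.
SCHEMA = {"ticker": str, "date": str, "close": float}


@dataclass
class PipelineState:
    """State passed from stage to stage, similar in spirit to a graph state object."""
    prompt: str
    rows: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def crawl(state: PipelineState) -> PipelineState:
    # Stand-in for a crawling subagent: returns canned raw records
    # instead of fetching anything from a real source.
    state.rows = [
        {"Ticker": "abc", "Date": "2026-01-02", "Close": "101.50"},
        {"Ticker": "xyz", "Date": "2026-01-02", "Close": "n/a"},
    ]
    return state


def normalize(state: PipelineState) -> PipelineState:
    # Lowercase the keys, uppercase the ticker, and coerce the price to float.
    normalized = []
    for raw in state.rows:
        row = {key.lower(): value for key, value in raw.items()}
        row["ticker"] = str(row.get("ticker", "")).upper()
        try:
            row["close"] = float(row["close"])
        except (KeyError, ValueError):
            pass  # leave the bad value in place so validation can flag it
        normalized.append(row)
    state.rows = normalized
    return state


def validate(state: PipelineState) -> PipelineState:
    # Keep only rows whose fields match the schema; record the rest as errors.
    valid = []
    for index, row in enumerate(state.rows):
        problems = [
            name for name, expected in SCHEMA.items()
            if not isinstance(row.get(name), expected)
        ]
        if problems:
            state.errors.append(f"row {index}: bad or missing fields {problems}")
        else:
            valid.append(row)
    state.rows = valid
    return state


# Fixed stage order, so every run repeats the same steps.
STAGES = [crawl, normalize, validate]


def run_pipeline(prompt: str) -> PipelineState:
    state = PipelineState(prompt=prompt)
    for stage in STAGES:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline("daily closing prices for two tickers")
    print(result.rows)
    print(result.errors)
```

In this sketch the orchestrator is just a loop over a fixed stage list. In a graph framework each function would presumably become a node with the same state-in, state-out contract, which is what makes a run repeatable rather than a one-off scraping session.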
Result
The full pipeline ran end to end, from a plain-English request to validated output files, and the project won Best Use of Solana at HackDuke Code for Good 2026.