I'm Rajeev, a data engineering team lead in Kathmandu. I spend most of my weekdays building data pipelines, and most of my weekends running, watching F1, or getting disappointed by my favourite football team.
I lead a small data team at 108 Capital. We ship pipelines that index multi-TB blockchain history, scrape the strange corners of social media, and turn audio into searchable text.
A short feed of what's currently consuming my brain — newest at the top.
Pulling from jolpica and openf1 into one store, then a downstream service that compares two drivers in any session — lap by lap, sector by sector. Currently at v0.3, ingestion's stable, the comparator's getting there.
Five-hour goal. The point this time is to finish, not to chase a number. Most runs are after dark, ~18:30 start, when the city cools down.
New regs, weirder than expected. Lights-out is sacred; everything else routes around it.
A novel between systems books — turns out a plot is restful.
Lead a team of engineers shipping production data infrastructure. Owned architecture, deployment, and monitoring across five products in our first year together — from on-chain indexers to NLP-on-warehouse sentiment pipelines.
Built the on-chain indexer from scratch as sole owner — a ClickHouse warehouse with dbt + Prefect ETL keeping EVM data under one-minute latency at multi-TB scale. Then built the Twitter pipeline and started the Spark migration.
Built a multiplayer mobile Ludo game in Flutter and owned the real-time communication layer over WebRTC. Different stack, same taste for low-latency systems.
Four production systems shipped at 108 Capital, plus two personal open-source projects. Each one has a stack story — happy to tell it over coffee.
Multi-chain blockchain data platform covering Ethereum Mainnet, Base, BSC, Arbitrum, and Optimism. Sub-one-minute end-to-end latency using Kafka stream processing and ClickHouse UDFs for real-time ABI decoding.
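For a sense of what ABI decoding involves, here is a minimal sketch in plain Python. It is not the ClickHouse UDF used in the platform; it only illustrates the general idea for one well-known event, the ERC-20 `Transfer(address,address,uint256)` log. The function name `decode_transfer` and the sample log values are made up for illustration, and the log is assumed to arrive as a JSON-RPC style dict with hex-string `topics` and `data`.

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event signature hash.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(log: dict) -> Optional[dict]:
    """Decode an ERC-20 Transfer log, or return None if the log is not one.

    Hypothetical helper: assumes `topics` is a list of 0x-prefixed 32-byte hex
    strings and `data` is a 0x-prefixed hex string holding one uint256.
    """
    topics = log.get("topics", [])
    # ERC-20 Transfer has the signature topic plus two indexed addresses.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    return {
        # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # The non-indexed uint256 value lives in the data field.
        "value": int(log["data"], 16),
    }


# Illustrative log with made-up addresses and amount.
sample_log = {
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(1_000_000, "064x"),
}
decoded = decode_transfer(sample_log)
```

A real decoder would look up the event layout from the contract's ABI rather than hard-coding one signature, and would handle dynamic types such as strings and arrays.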
Social pipeline covering account management and ETL orchestration with Airflow + ClickHouse. GPU-enabled NLP UDFs run on a distributed cluster for in-warehouse sentiment analysis and entity extraction.
Media transcription pipeline that tracks new YouTube channels and podcast feeds, downloads content, and transcribes with AI. Orchestrated with Plomberry; Celery handles the async background work so the queue never blocks the front end.
Multi-platform newsletter and article scraper covering Substack and Seeking Alpha, including paywalled content via authenticated browser sessions. Ingested into SurrealDB through Browserless workers.
Reverse-engineered the Nepal Stock Exchange's undocumented internal API to extract live market data — prices, order books, and trade history — that NEPSE never officially exposed. Built a clean Python client on top so anyone can pull structured stock data without screen-scraping.
F1 data warehouse that ingests from Jolpica and OpenF1, normalises laps, telemetry, results, and standings, then loads everything into ClickHouse. One SQL query away from any race stat — built as a foundation so others can ship dashboards, models, or analyses without worrying about the plumbing.
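As an illustration of the kind of analysis a normalised lap store makes easy, here is a rough sketch of a lap-by-lap, sector-by-sector comparison between two drivers. The `Lap` record, the `compare_laps` function, and the sample times are all hypothetical; the actual warehouse schema may look quite different.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Lap:
    """Hypothetical normalised lap record: lap number plus three sector times in seconds."""
    lap: int
    s1: float
    s2: float
    s3: float

    @property
    def total(self) -> float:
        return self.s1 + self.s2 + self.s3


def compare_laps(a: List[Lap], b: List[Lap]) -> List[Dict[str, float]]:
    """Return per-lap deltas (driver A minus driver B) for laps both drivers completed."""
    b_by_lap = {lap.lap: lap for lap in b}
    rows = []
    for la in a:
        lb = b_by_lap.get(la.lap)
        if lb is None:
            continue  # skip laps that only one driver has
        rows.append({
            "lap": la.lap,
            "s1": round(la.s1 - lb.s1, 3),
            "s2": round(la.s2 - lb.s2, 3),
            "s3": round(la.s3 - lb.s3, 3),
            "total": round(la.total - lb.total, 3),
        })
    return rows


# Made-up sample data: driver B has no lap 2, so only lap 1 is compared.
driver_a = [Lap(1, 30.1, 40.2, 25.3), Lap(2, 29.9, 40.0, 25.1)]
driver_b = [Lap(1, 30.0, 40.5, 25.3)]
deltas = compare_laps(driver_a, driver_b)
```

Negative deltas mean driver A was faster in that sector or lap. In a warehouse setting the same join would more likely be written as a SQL query over the lap table.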
A non-exhaustive list of obsessions that crowd out engineering on the weekend. None of them are productive. All of them are the point.
Started after my wrist gave up on too much typing. Now somewhere between "stubborn jogger" and "slow marathoner". Targeting sub-4 in October.
Race day starts before lights out. I keep an embarrassingly detailed spreadsheet of pit-stop deltas — old habits.
The bikes lean past 60° and pretend gravity is optional.
I watch midfielders like other people watch the ball. La Liga and Premier League weekends are sacred.
Systems books, distributed-systems papers, the occasional novel when my brain needs a hard reset.
Kathmandu's hills are the best debugging environment I've found. One hour out, one bug fewer.
Two feeds that update on their own. The runs come from my Strava; the standings come from the current F1 season. No editorialising.
Engineering postmortems, stray opinions, the occasional running journal. New entries at the top.
I'm on Instagram and email. Most replies within a day.