About
Hey! I'm Brian, a CS grad from UofT. I'm interested in ML inference and LLM optimization.
My focus ranges from inference optimization to large-scale ML infrastructure, with a strong interest in understanding how technical performance impacts AI system reliability and user experience.
Technical Skills
Languages
- Python (OOP, multithreading)
- C / C++
- Java
- JavaScript
- SQL
- HTML / CSS
Frameworks & Tools
- React, Next.js
- Express, Node.js
- Git, Docker
- REST APIs
- Linux, XML
Specialties
- Deep Learning Inference
- Latency Optimization
- Distributed Systems
- Linear Programming
- Network Security
Featured Projects
EdgeRAG
Interactive Demo Available
Edge-optimized RAG system for financial analysis, focusing on retrieval accuracy, latency, and response grounding.
- Iterating on chunking heuristics and metadata-aware retrieval for financial text.
- Comparing standalone LLM outputs vs retrieval-augmented responses for factual consistency.
- Exploring memory usage, model size trade-offs, and inference performance at the edge.
- Investigating real-time data integration, hybrid dense–sparse retrieval, and prompt orchestration.
Performance Bid-Ask Order Book
Interactive Demo Available
Sub-500ns price-time priority matching engine in C++17 with lock-free order ingestion and cache-line aligned structs.
- Profiled via Google Benchmark and perf to eliminate L1 cache bottlenecks on the critical match path.
- Dynamic order book with fixed-size array + linked list for efficient maintenance and retrieval.
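For readers curious about the array-plus-linked-list layout, here is a minimal sketch of one way it could look. It is an illustration under stated assumptions, not the project's actual code: every name (OrderBook, Level, submit, resting_qty, kNumTicks) is hypothetical, the 64-byte cache line and 256-tick price range are assumed values, and std::list stands in for whatever allocation-free list a latency-sensitive engine would likely use. Lock-free ingestion is out of scope here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical names throughout; a sketch, not the project's implementation.
constexpr std::size_t kNumTicks = 256;  // assumed fixed price range, in ticks

enum class Side { Buy, Sell };

struct Order {
    std::uint64_t id;
    std::uint32_t qty;
};

// One price level. alignas(64) assumes a 64-byte cache line so that
// neighbouring levels never share a line.
struct alignas(64) Level {
    std::list<Order> fifo;  // arrival order within a level = time priority
};

class OrderBook {
public:
    // Matches a limit order against the opposite side in price-time
    // priority, rests any remainder, and returns the quantity filled.
    std::uint32_t submit(Side side, std::size_t tick, std::uint64_t id,
                         std::uint32_t qty) {
        assert(tick < kNumTicks);
        std::uint32_t filled = 0;
        if (side == Side::Buy) {
            // Cheapest asks first, up to and including the limit price.
            for (std::size_t t = 0; t <= tick && qty > 0; ++t)
                filled += take(asks_[t], qty);
        } else {
            // Highest bids first, down to and including the limit price.
            for (std::size_t t = kNumTicks; t-- > tick && qty > 0;)
                filled += take(bids_[t], qty);
        }
        if (qty > 0) {
            auto& book = (side == Side::Buy) ? bids_ : asks_;
            book[tick].fifo.push_back({id, qty});
        }
        return filled;
    }

    // Total resting quantity on one side at one price level.
    std::uint32_t resting_qty(Side side, std::size_t tick) const {
        const auto& book = (side == Side::Buy) ? bids_ : asks_;
        std::uint32_t total = 0;
        for (const Order& o : book[tick].fifo) total += o.qty;
        return total;
    }

private:
    // Fills against the front of a level's queue until the incoming
    // quantity or the level is exhausted; reduces qty in place.
    static std::uint32_t take(Level& level, std::uint32_t& qty) {
        std::uint32_t filled = 0;
        while (qty > 0 && !level.fifo.empty()) {
            Order& resting = level.fifo.front();
            const std::uint32_t traded = std::min(resting.qty, qty);
            resting.qty -= traded;
            qty -= traded;
            filled += traded;
            if (resting.qty == 0) level.fifo.pop_front();
        }
        return filled;
    }

    std::array<Level, kNumTicks> bids_{};  // indexed by price tick
    std::array<Level, kNumTicks> asks_{};
};
```

Indexing levels by tick makes price lookup a constant-time array access, while the per-level list preserves arrival order. This sketch scans levels linearly for simplicity; caching the best bid and ask would be a natural refinement.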
Experience
Built a fine-tuning and RLHF pipeline, establishing evals and regression tests for agentic workflows.
University of Toronto
Architected end-to-end ML pipeline observability using Prometheus and Grafana for a platform serving 18k+ concurrent users.
RBC Capital Markets
Developed latency optimization tools for trading infrastructure using stochastic modeling and data-driven analytics.
IBM Watson
Enhanced integration testing platform for the DB2 team, improving reliability and coverage of automated test pipelines.
- CSC373: Algorithm Design, Analysis & Complexity
- CSC343: Database Management Systems
- CSC2209: Networking Systems (Graduate Network-ML)
- Network Topology, Operating Systems
Education
Bachelor of Science in Computer Science
University of Toronto · Class of 2025
Specialization: ML & Systems
Role: Teaching Assistant — Network Topology, Operating Systems, and more