Brian Won

About

Hey! I'm Brian, a CS grad from UofT. I'm interested in ML inference and LLM optimization.

My focus ranges from inference optimization to large-scale ML infrastructure, with a strong interest in how technical performance impacts AI system reliability and user experience.

Technical Skills

Languages

  • Python (OOP, multithreading)
  • C / C++
  • Java
  • JavaScript
  • SQL
  • HTML / CSS

Frameworks & Tools

  • React, Next.js
  • Express, Node.js
  • Git, Docker
  • REST APIs
  • Linux, XML

Specialties

  • Deep Learning Inference
  • Latency Optimization
  • Distributed Systems
  • Linear Programming
  • Network Security

Featured Projects

EdgeRAG

Interactive Demo Available
  • Edge-optimized RAG system for financial analysis, focusing on retrieval accuracy, latency, and response grounding.
  • Iterating on chunking heuristics and metadata-aware retrieval for financial text.
  • Comparing standalone LLM outputs vs retrieval-augmented responses for factual consistency.
  • Exploring memory usage, model size trade-offs, and inference performance at the edge.
  • Investigating real-time data integration, hybrid dense–sparse retrieval, and prompt orchestration.
Python · Vector DB · RAG · FastAPI · React

Performance Bid-Ask Order Book

Interactive Demo Available
  • Sub-500ns price-time priority matching engine in C++17 with lock-free order ingestion and cache-line aligned structs.
  • Profiled via Google Benchmark and perf to eliminate L1 cache bottlenecks on the critical match path.
  • Dynamic order book with fixed-size array + linked list for efficient maintenance and retrieval.
C++ · Prop Trading · L2 Order Book · WebSockets · Market Signals
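
For readers who want a concrete picture of the "fixed-size array + linked list" layout with price-time priority, here is a minimal sketch. It is not the project's code: the names (`OrderBook`, `Order`, `Fill`, `kMaxTicks`), the integer tick prices, and the 64-byte alignment are all assumptions made for illustration. It is single-threaded, uses `std::list` where a real engine would more likely use a preallocated or intrusive list, omits lock-free ingestion entirely, and makes no latency claims.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <list>
#include <vector>

// Hypothetical size: prices are integer ticks in [0, kMaxTicks).
constexpr int kMaxTicks = 1024;

// 64 bytes is a common cache-line size; assumed here, not taken from the project.
struct alignas(64) Order {
    std::uint64_t id;
    int price;    // limit price in ticks
    int qty;      // remaining quantity
    bool is_buy;
};

struct Fill {
    std::uint64_t resting_id;
    std::uint64_t incoming_id;
    int price;    // trades execute at the resting order's price
    int qty;
};

class OrderBook {
public:
    // Match the incoming order against the opposite side, then rest any remainder.
    std::vector<Fill> submit(Order in) {
        std::vector<Fill> fills;
        if (in.price < 0 || in.price >= kMaxTicks || in.qty <= 0) return fills;
        if (in.is_buy) {
            // Price priority: walk asks from the lowest price up to the limit.
            for (int p = best_ask_; p <= in.price && in.qty > 0; ++p)
                match_level(asks_[p], in, p, fills);
        } else {
            // Price priority: walk bids from the highest price down to the limit.
            for (int p = best_bid_; p >= in.price && in.qty > 0; --p)
                match_level(bids_[p], in, p, fills);
        }
        if (in.qty > 0) (in.is_buy ? bids_ : asks_)[in.price].push_back(in);
        refresh_best();
        return fills;
    }

    int best_bid() const { return best_bid_; }   // -1 when there are no bids
    int best_ask() const { return best_ask_; }   // kMaxTicks when there are no asks

private:
    // Time priority: the front of each list is the oldest order at that price.
    static void match_level(std::list<Order>& level, Order& in, int price,
                            std::vector<Fill>& fills) {
        while (in.qty > 0 && !level.empty()) {
            Order& resting = level.front();
            const int traded = std::min(in.qty, resting.qty);
            fills.push_back({resting.id, in.id, price, traded});
            in.qty -= traded;
            resting.qty -= traded;
            if (resting.qty == 0) level.pop_front();
        }
    }

    // Linear rescan keeps the sketch short; a tuned engine would update incrementally.
    void refresh_best() {
        best_ask_ = 0;
        while (best_ask_ < kMaxTicks && asks_[best_ask_].empty()) ++best_ask_;
        best_bid_ = kMaxTicks - 1;
        while (best_bid_ >= 0 && bids_[best_bid_].empty()) --best_bid_;
    }

    std::array<std::list<Order>, kMaxTicks> bids_;   // one FIFO list per price tick
    std::array<std::list<Order>, kMaxTicks> asks_;
    int best_bid_ = -1;
    int best_ask_ = kMaxTicks;
};
```

Indexing levels by tick makes finding a price level a direct array access, and the per-level list preserves arrival order, so the two together give price-time priority without any sorting on the match path.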

Experience

2026

LinkedIn

Built a fine-tuning and RLHF pipeline, establishing evals and regression tests for agentic workflows.

2025

University of Toronto

Developed a contest platform supporting concurrent users for nationwide programming challenges, improving platform scalability.

2023

RBC Capital Markets

Developed latency optimization tools for trading infrastructure using stochastic modeling and data-driven analytics.

2019–2020

IBM Watson

Enhanced integration testing platform for the DB2 team, improving reliability and coverage of automated test pipelines.

Teaching Assistant
University of Toronto
Multiple Terms
  • CSC373: Algorithm Design, Analysis & Complexity
  • CSC343: Database Management Systems
  • CSC2209: Networking Systems (Graduate Network-ML)
  • Network Topology, Operating Systems

Education

Bachelor of Science in Computer Science

University of Toronto · Class of 2025

Specialization: ML & Systems

Role: Teaching Assistant — Network Topology, Operating Systems, and more