Hyoungseo Son

12 projects · 2026

Projects

Selected work across research tooling, hackathon prototypes, and real-time systems.

Hackathon · 2026 · MuseHub Track, Music Technology Hackathon, Berklee · 1st Place

UNREALTIME

Live AI Music Instruments

Live AI music instrument workflow built for the Music Technology Hackathon at Berklee College of Music. Prompt a sound, play it through MIDI or a computer keyboard, record rough performances into a loop station, clone the current loop, and keep playing the cloned vibe as a new instrument.

  • Google DeepMind Magenta RealTime 2 with MLX on Apple Silicon
  • FastAPI backend with a static browser UI and Web Audio worklet for low-latency playback
  • Prompt, weighted node-mix, and loop-clone generation modes
  • Loop station with track recording, playback, save/load state, and current-loop cloning
  • Replaced the live spoken pitch with a Suno-generated presentation song
  • Won 1st place in the MuseHub Track at the Music Technology Hackathon, Berklee
Magenta RealTime 2 · MLX · FastAPI · Python 3.12 · JavaScript · Web Audio · uv
On-stage pitch using a Suno-generated presentation song
Hackathon · 2026 · HackPrinceton Spring 2026 · 1st Place ($500)

Flanner

Knot API track

A mirror on your delivery habits. Pulls six months of real DoorDash / Uber Eats orders via Knot API, decomposes each dish into ingredients with K2 Think V2, generates a weekly home-cooked plan respecting calendar and dietary constraints, and pushes the exact ingredients to a real Amazon Fresh cart in one tap.

  • Next.js 15 + React 19 on Vercel; FastAPI on Cloud Run; MongoDB Atlas
  • Knot API in production mode: TransactionLink for delivery history, AgenticShopping for real Amazon Fresh cart push (checkout intentionally stubbed)
  • K2 Think V2 for weekly plan reasoning; Gemma 4 31B for photo recognition and check-in classification
  • Photon spectrum-ts bridge for live iMessage orchestration during the demo
Next.js 15 · React 19 · FastAPI · Python · MongoDB · K2 Think V2 · Gemma · Knot API · Cloud Run · Vercel
Hackathon · 2026 · Precision Neuroscience BCI Hackathon · 1st Place ($1,000)

BCI Neural Visualization System

Real-time neural activity visualization and BCI array placement guidance system. Processes live data from a 1024-channel micro-ECoG array (32×32 grid) to help neurosurgeons optimize BCI array placement during surgery.

  • CNN-based denoising (ResNet U-Net with CBAM attention)
  • Kalman filter and EMA-based position tracking
  • Real-time heatmap smoothing with configurable web UI
  • 500 Hz sampling, ~20–50 ms end-to-end latency
PyTorch · ResNet U-Net · CBAM · Kalman Filter · Real-time · ECoG
Hackathon · 2026 · Remy Hackathon 2026 · 2nd Place ($1,000)
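The position-tracking bullet can be illustrated with a minimal sketch. This is not the project's code: the function names, the smoothing factor `alpha`, and the noise variances are hypothetical, and the filter uses a simple constant-position model on one coordinate. It only shows how an EMA and a scalar Kalman filter each damp jitter in a stream of (row, col) estimates on the 32×32 grid.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def ema_track(points: Iterable[Point], alpha: float = 0.3) -> List[Point]:
    """Exponential moving average over (row, col) positions.

    alpha is a hypothetical smoothing factor: higher values follow new
    measurements faster, lower values smooth more.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[Point] = []
    for r, c in points:
        if not smoothed:
            smoothed.append((r, c))  # seed with the first measurement
            continue
        pr, pc = smoothed[-1]
        smoothed.append((pr + alpha * (r - pr), pc + alpha * (c - pc)))
    return smoothed


def kalman_1d(measurements: Iterable[float],
              process_var: float = 1e-2,
              meas_var: float = 1.0) -> List[float]:
    """Scalar Kalman filter with a constant-position (random walk) model."""
    estimates: List[float] = []
    x = None   # state estimate
    p = 1.0    # estimate variance (assumed initial value)
    for z in measurements:
        if x is None:
            x = float(z)               # seed with the first measurement
        else:
            p += process_var           # predict: uncertainty grows
            k = p / (p + meas_var)     # Kalman gain
            x += k * (z - x)           # correct with the innovation
            p *= (1.0 - k)             # posterior variance shrinks
        estimates.append(x)
    return estimates
```

Running `kalman_1d` once per axis gives a 2D tracker under the same assumptions. A real system would more likely use a state that includes velocity.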

Trippo

Replaces the group chat for the duration of a trip. A single trip room captures plans, photos, and tickets live; chat messages drive plan updates in real time, EXIF + GPS clusters photos by place and minute, and one tap turns the trip into a 1, 3, or 5-minute vertical recap reel for TikTok / Reels / Shorts.

  • TypeScript backend on @mindstudio-ai/agent for DB + AI primitives; React + Vite mobile-first frontend
  • Chat is the planner: Claude Sonnet extracts plan updates from messages in real time, Haiku polishes captions
  • Photo clustering by EXIF + GPS: photos taken at the same place within the same minute are grouped into a split-screen, and distinct stops become animated map pins
  • One tap renders a vertical 9:16 recap reel via Creatomate for TikTok / Reels / Shorts
TypeScript · React · Vite · MindStudio Agent · Claude Sonnet · Claude Haiku · Creatomate · EXIF
MindStudio PM reacts to the demo
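As a rough illustration of clustering photos by place and minute, the sketch below buckets photos by a coarse GPS grid cell and by capture minute. It is an assumption about how such grouping could work, not Trippo's implementation: the dictionary keys (`id`, `lat`, `lon`, `taken_at`), the ISO timestamp format, and the `grid_deg` cell size are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

ClusterKey = Tuple[Tuple[int, int], datetime]


def cluster_photos(photos: List[dict],
                   grid_deg: float = 0.001) -> Dict[ClusterKey, List[str]]:
    """Group photo ids that share a GPS grid cell and a capture minute.

    grid_deg is a hypothetical cell size in degrees (0.001 degrees of
    latitude is roughly 111 m).
    """
    clusters: Dict[ClusterKey, List[str]] = defaultdict(list)
    for photo in photos:
        taken = datetime.fromisoformat(photo["taken_at"])
        minute = taken.replace(second=0, microsecond=0)  # truncate to the minute
        cell = (round(photo["lat"] / grid_deg),
                round(photo["lon"] / grid_deg))
        clusters[(cell, minute)].append(photo["id"])
    return dict(clusters)


photos = [
    {"id": "a", "lat": 40.35000, "lon": -74.65000, "taken_at": "2026-03-28T12:00:05"},
    {"id": "b", "lat": 40.35002, "lon": -74.65001, "taken_at": "2026-03-28T12:00:40"},
    {"id": "c", "lat": 40.36000, "lon": -74.66000, "taken_at": "2026-03-28T12:30:00"},
]
groups = cluster_photos(photos)
```

Fixed grid cells are simple but can split two nearby photos that fall on opposite sides of a cell edge. A distance-based clustering step would avoid that at the cost of more code.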
Ideathon · 2025 · LikeLion US 2025 Ideathon · Winner

CoHabitAI

AI Manager for Roommate Living

An Ideathon concept for shared-living harmony. CoHabitAI proposes an AI manager that defuses the everyday friction of roommate life — chore turns, unpaid rent, clashing schedules — by distributing tasks fairly, syncing everyone's calendars, tracking shared supplies with auto-reorder alerts, and sending smart reminders so nothing slips. Presented as the team's pitch for the LikeLion US 2025 Ideathon.

  • Concept: an AI agent that distributes chores, coordinates schedules, and mediates roommate disputes fairly
  • Proposed features — calendar integration, a shared-supplies tracker with auto-reorder alerts, smart reminders, and a household onboarding flow
  • Pitch site built with React + Vite
React · Vite · JavaScript
Hackathon · 2026 · YHack 2026 · Finalist

Booky

Social reading platform that transforms solitary reading into a connected experience. Readers highlight passages, discuss with friends, make story-branching choices (Detroit: Become Human style), and explore an AI-generated solar system visualizing reading compatibility.

  • Next.js 16, React 19, Three.js for 3D planet visualization
  • FastAPI backend with Firebase Firestore and ChromaDB
  • K2 Think V2 for AI content, Gemini 2.0 Flash for voice Q&A, Vertex AI Imagen for illustrations
Next.js 16 · React 19 · Three.js · FastAPI · Firestore · ChromaDB · Vertex AI
Hackathon · 2026 · Next-Gen Hacks Beta · Spring 2026 · Submitted

MotZip

Bilingual (EN/KO) restaurant discovery built around two ideas: voice-first filtering on a 3D map, and an AI phone agent that calls restaurants for you. Speak a query and the map sinks non-matching food into the ground. Then pick questions (reservations, vegetarian, wheelchair access...), click "Call N selected," and watch real Twilio calls return ✓/✗/? per question.

  • Next.js 16 + React 19 + Tailwind 4 frontend; MapLibre GL + Three.js for the 3D scene
  • FastAPI on Cloud Run; Google Places API (New) for restaurant data, deduplicated across 7 cuisine groups
  • Voice search: Google Cloud STT → Gemini 2.0 Flash filter extraction → in-process filtering → Cloud TTS reply
  • Batch calls: Twilio chained <Gather> per question, per-turn STT + Gemini parsing, streamed back as a checklist
  • Graceful degradation: ElevenLabs Scribe/Turbo as STT/TTS fallback, keyword heuristics if Gemini JSON parse fails
Next.js 16 · React 19 · Tailwind 4 · MapLibre GL · Three.js · FastAPI · Twilio Voice · Gemini 2.0 · Google Places · TRELLIS
Tool · 2026
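The graceful-degradation bullet describes falling back to keyword heuristics when the model's JSON cannot be parsed. A minimal sketch of that pattern follows. The `KEYWORDS` table, the filter names, and the `extract_filters` function are hypothetical, and the model reply is passed in as a plain string so that no Gemini call is assumed.

```python
import json

# Hypothetical keyword -> (filter name, value) table for the fallback path.
KEYWORDS = {
    "vegetarian": ("dietary", "vegetarian"),
    "vegan": ("dietary", "vegan"),
    "korean": ("cuisine", "korean"),
    "cheap": ("price", "low"),
}


def extract_filters(query: str, model_reply: str) -> dict:
    """Prefer the model's JSON; fall back to keyword matching on the query."""
    try:
        parsed = json.loads(model_reply)
        if isinstance(parsed, dict):
            return {"source": "model", "filters": parsed}
    except json.JSONDecodeError:
        pass  # malformed reply: use the heuristic path below

    lowered = query.lower()
    filters = {}
    for word, (name, value) in KEYWORDS.items():
        if word in lowered:
            filters[name] = value
    return {"source": "heuristic", "filters": filters}
```

Returning the `source` alongside the filters lets the caller log or display when the degraded path was used.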

TEKKAL

AI-Friendly Slide Authoring Tool

Local-first, AI-agent-driven slide platform. Visual editor backed by a JSON scene graph, where every drag-and-drop action maps to structured code.

  • React 19 + TypeScript, Vite, Tailwind CSS v4, Zustand
  • Monaco Editor for JSON editing, KaTeX for math, TikZ support (server-side & WASM)
  • File System Access API for local project management without a backend
React 19 · TypeScript · Vite · Tailwind v4 · Zustand · Monaco · KaTeX · TikZ
Hackathon · 2026 · IBM Bob Hackathon · Submitted

AsyncPair

Asynchronous pair programming for teammates split across time zones. A post-commit git hook captures development context — commits, diffs, and developer notes — at the moment of every commit. The web app turns each handoff into AI-generated scenarios that predict what your teammate will hit next, plus a contextual AI chat scoped to the full git history. Author a scenario, hand it off, and review AI-generated code changes in a side-by-side diff. The entire project was built inside IBM Bob, IBM's AI-native IDE — the premise of the hackathon.

  • Built end-to-end in IBM Bob, IBM's AI-native IDE — the hackathon's core theme
  • Next.js 14 (App Router) + TypeScript web app; CLI tool built with Commander, installed via npm
  • `asyncpair init` installs a post-commit git hook that runs `asyncpair capture`, extracting commits, diffs and notes through simple-git
  • Gemini generates handoff scenarios and powers a contextual AI chat scoped to the full git history
  • Author → Handoff → Pairing workflow; review and approve AI-generated code changes in a side-by-side diff view
IBM Bob IDE · Next.js 14 · React 18 · TypeScript · Tailwind CSS · Commander · simple-git · Gemini · Jest
Datathon · 2026 · Zerve × ODSC AI Datathon

Zerve × ODSC AI Datathon

End-to-end MLOps pipeline as a 32-block, 42-edge parallel-converge DAG inside the Zerve canvas: schema/leakage validation → EDA + 15-stage funnel → a 5-candidate AutoML pool → drift detection → champion picked for serving → weekly retraining loop. A Next.js frontend renders the DAG and calls a deployed FastAPI for live inference, reading canvas variables in real time.

  • 5-candidate model pool: calibrated XGBoost + RF + HGB soft-vote ensemble, PyTorch tab-MLP, sklearn GBM — isotonic calibration throughout
  • PR-AUC 0.2645, ROC-AUC 0.812 on 3.5M rows / 17,541 users (1.84% upgrade rate, ~53:1 imbalance); precision in the top 5% of scores is ~9× the base rate
  • Tree SHAP + segment-level diagnostics map top features to marketing actions
  • Weekly drift watch (PSI thresholds 0.10/0.25 plus a KS test) with an append-only event store, label-stable retraining gates, and a would_promote_new_model flag
  • Next.js 14 + ReactFlow frontend draws the canvas DAG and queries the deployed FastAPI for live PNGs and inference
Python · XGBoost · PyTorch · scikit-learn · SHAP · FastAPI · Next.js 14 · ReactFlow · Zerve
Hackathon · 2026 · Video Understanding AI Hackathon
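For readers unfamiliar with the drift metric, here is a minimal sketch of the Population Stability Index over pre-binned proportions. It reads the 0.10 and 0.25 figures as warn and alert thresholds, which is a common convention but an assumption here. The function names and the `eps` floor are hypothetical and not taken from the pipeline.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions.

    Both inputs are per-bin proportions that each sum to about 1.
    eps floors empty bins so the logarithm stays defined.
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def drift_level(value: float, warn: float = 0.10, alert: float = 0.25) -> str:
    """Map a PSI value to a label using the assumed warn/alert thresholds."""
    if value >= alert:
        return "alert"
    if value >= warn:
        return "warn"
    return "stable"
```

Each term `(a - e) * log(a / e)` is non-negative, so PSI is zero only when the two distributions match bin for bin.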

SegRec

Segment-Level Video Recommendation Engine

A segment-level video recommendation engine. Paste a YouTube URL and SegRec downloads the video with yt-dlp, indexes it through the Twelve Labs API, and auto-generates a clickable chapter timeline. Clicking a chapter finds and ranks similar moments across every indexed video and jumps playback straight to them. Built for the Video Understanding AI Hackathon at Northeastern around Twelve Labs' video foundation models.

  • FastAPI + Python 3.11 backend; Vite + React + TypeScript + Tailwind CSS frontend with react-player playback
  • Twelve Labs API for video understanding — Marengo embeddings for segment similarity, Pegasus for chapter generation
  • Paste a YouTube URL → yt-dlp downloads it → auto-generated, clickable chapter timeline
  • Click a chapter to surface ranked similar segments across the whole library, jumping playback straight to each match
Python · FastAPI · React · TypeScript · Vite · Tailwind CSS · Twelve Labs API · yt-dlp
Hackathon · 2025 · Dream AI Hackathon
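Ranking similar segments from embeddings can be sketched as a nearest-neighbour search. The example below assumes cosine similarity over precomputed vectors. The source does not state which metric SegRec uses, and the segment fields (`video_id`, `start`, `embedding`) and function names are hypothetical. No Twelve Labs API call is shown.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query: Sequence[float], segments: List[dict],
                 top_k: int = 5) -> List[dict]:
    """Return the top_k segments most similar to the query embedding."""
    scored = [
        {"video_id": s["video_id"], "start": s["start"],
         "score": cosine(query, s["embedding"])}
        for s in segments
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]


library = [
    {"video_id": "a", "start": 0.0, "embedding": [1.0, 0.0]},
    {"video_id": "b", "start": 42.0, "embedding": [0.0, 1.0]},
    {"video_id": "c", "start": 90.0, "embedding": [1.0, 1.0]},
]
matches = rank_similar([1.0, 0.0], library, top_k=3)
```

Each result carries a `start` offset, which is what a player would need to jump straight to the matching moment.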

EchoBoard

Context-Aware Ad Virtual Camera

An AI-powered virtual camera for video calls. EchoBoard transcribes the live conversation with speech-to-text, uses an LLM to detect the current topic, and composites a matching advertisement or branded background behind the speaker with real-time face detection — so what's on screen always echoes what's being discussed. Built as a 2025 Dream AI Hackathon project.

  • Python 3.11; pyvirtualcam streams the composited feed into Zoom as a virtual camera, set up through OBS Studio
  • Live speech-to-text plus Google's Gemini detect the conversation topic in real time
  • OpenCV face detection overlays a topic-matched advertisement or branded background behind the speaker
Python · OpenCV · pyvirtualcam · Gemini · Speech-to-Text · OBS Studio