tzujohsu/README.md

👋 Hi there, I'm Jocelyn (Tzu-Jo) Hsu

I have 2+ years of experience in data and developer roles across the marketing and finance sectors through internships and co-ops. These experiences pushed me to build robust, scalable solutions that bridge data, product thinking, and engineering. I constantly seek new challenges and opportunities to expand my expertise.

🔍 Portfolio Overview

This portfolio showcases selected projects across various domains: LLMs, NLP, CV, forecasting, analytics, and backend development.

🚀 Project Highlights

| Tag | Project | Description |
| --- | --- | --- |
| LLM | Fast Inference of LLMs via Speculative Decoding | Implemented Speculative Decoding and BiLD to accelerate Transformer inference. Benchmarked across models. |
| LLM | GPT-2 from Scratch (ongoing) | Reproducing GPT-2 124M model from scratch to understand Transformer internals and training loops. |
| LLM / NLP | ChronoEvents: News Timeline Builder | RAG system that transforms podcast news transcripts into summarized event timelines. |
| LLM / Backend | MCP Server with Claude | Server tool for real-time document retrieval via Claude and the Serper API. |
| NLP / ML | Multi-Label Content Categorization System | Multi-label news categorization using NLP and ML - classify news articles into 17 categories with a live demo and REST API. |
| Backend (JavaScript) | Subscription Tracker | Built a backend system with JWT authentication, DB modeling, API architecture, security and automated workflows |
| Audio / CV | Audio Deepfake Detection with LCNN | Built a deepfake audio detection system using LCNN with self-attentive pooling. |
| CV | Scene Text Recognition for Jersey Numbers | Two-stage OCR pipeline to extract jersey numbers from blurry, occluded sports footage. |
| ML | Rohlik Orders Forecasting | Multi-warehouse time series model to forecast 60-day order volume for e-grocery operations. |
| ML | Rossmann Sales Data Prediction | Forecasted daily retail sales across 1000+ stores using time series regression models. |

Pinned

  1. soccernet-jersey-number-recognition Public

    Forked from oliveraw/soccernet-jersey-number-recognition

    Recognizes jersey numbers in video frames, handling motion blur and player obstruction

    Python

  2. audio-deepfake-detection Public

    Trained an LCNN to perform audio deepfake classification on ASVspoof 2019

    Python 1

  3. LLM_speculative_decoding_evaluation Public

    Accelerating large language model inference with the SPS and BiLD algorithms

    Jupyter Notebook

  4. mcp-server-claude Public

    Sets up an MCP server tool to search the latest documents via the Serper API

    Python

  5. subscription-tracker Public

    Subscription management system with JWT authentication, database modeling, API architecture, security, and automated workflows

    JavaScript