2022-2024

Commercial data tooling

Resilient data collection workflows

Collection and debugging workflows for external web systems where behavior changes and failures must be diagnosable.

Focus: Diagnostics
Stack: HTTP + browser runtime
Output: Reusable logic
Debuggable collection

Changing targets need diagnostics, not just parsers.

The goal is to make failures explainable enough to fix.

1. Request and browser behavior inspected together
2. Failure cases turned into reusable checks
3. Parsers designed around real target behavior
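
To illustrate how a failure case can become a reusable check, here is a minimal sketch, not the project's actual code. The `FailureKind` categories, the `Observation` shape, and the status-code rules are all assumptions chosen for the example; a real target would need its own rules.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    """Hypothetical failure categories for one collection attempt."""
    OK = "ok"
    BLOCKED = "blocked"                # target refused or rate-limited the client
    TRANSIENT = "transient"            # server-side error, likely worth a retry
    LAYOUT_CHANGED = "layout_changed"  # page loaded, but expected markup is gone
    UNKNOWN = "unknown"                # anything the rules below do not cover


@dataclass(frozen=True)
class Observation:
    """What a single attempt saw (assumed shape, for illustration only)."""
    status: int
    body: str
    required_markers: tuple[str, ...]  # substrings the parser depends on


def classify(obs: Observation) -> FailureKind:
    """Map an observation to a category using simple, ordered rules."""
    if obs.status in (403, 429):
        return FailureKind.BLOCKED
    if obs.status >= 500:
        return FailureKind.TRANSIENT
    if obs.status == 200:
        missing = [m for m in obs.required_markers if m not in obs.body]
        return FailureKind.LAYOUT_CHANGED if missing else FailureKind.OK
    return FailureKind.UNKNOWN


if __name__ == "__main__":
    page = Observation(200, "<div>no data</div>", ("data-price",))
    print(classify(page).value)
```

Each newly understood failure can be added as one more rule, so the next occurrence is labeled instead of surfacing as a bare parser error.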
Role

Python Data / Backend Developer

Stack
Python, Web scraping, Reverse engineering, Playwright, ClickHouse
Problem

External targets changed often, and failures were hard to reproduce from a simple error message.

Solution

Combined request tracing, browser automation, parsers, and diagnostics into reusable collection logic.

Impact

Failures became easier to classify, reproduce, and fix without starting from zero each time.

What I did
  • Analyzed HTTP and JavaScript behavior for changing external systems.
  • Built and adjusted collection logic around real target behavior.
  • Improved diagnostics so failures were easier to reproduce.
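
One way to make a failure reproducible is to store a small snapshot of the request and response next to the error. The sketch below is an assumption about what such a record could contain, using only the standard library; `Snapshot`, `make_snapshot`, and the list of redacted headers are hypothetical names, not the project's real diagnostics.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

# Headers that should never be written to logs (assumed list).
SENSITIVE_HEADERS = {"authorization", "cookie"}


@dataclass
class Snapshot:
    """Minimal record of one failed attempt (hypothetical shape)."""
    url: str
    method: str
    status: int
    request_headers: dict
    body_sha256: str   # lets two failures be compared without storing full bodies
    body_preview: str  # first characters of the response, for quick inspection


def make_snapshot(url, method, status, request_headers, body, preview_chars=200):
    """Build a snapshot, redacting sensitive headers and truncating the body."""
    safe_headers = {
        name: ("<redacted>" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in request_headers.items()
    }
    return Snapshot(
        url=url,
        method=method,
        status=status,
        request_headers=safe_headers,
        body_sha256=hashlib.sha256(body.encode("utf-8")).hexdigest(),
        body_preview=body[:preview_chars],
    )


def to_json(snapshot):
    """Serialize with sorted keys so stored snapshots diff cleanly."""
    return json.dumps(asdict(snapshot), sort_keys=True)


if __name__ == "__main__":
    snap = make_snapshot(
        "https://example.com/items", "GET", 403,
        {"User-Agent": "collector/1.0", "Cookie": "session=abc"},
        "Access denied",
    )
    print(to_json(snap))
```

Hashing the body keeps the record small while still showing whether two failures returned the same page.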
What it shows
  • A parser is only useful if the failure path is visible.
  • Data collection work rewards patience with edge cases.
Related work

More projects

2025-2026

nnzen model catalog

Solo

A live LLM catalog that collects model data, normalizes it, and makes model comparison faster.

Python, FastAPI, LLM APIs, RAG, Vector DB / pgvector
2025

MCP core for an LLM assistant

Solo

Backend core for an LLM assistant with plugin execution, hot reload, tool chains, and explicit context handoff.

Python, FastAPI, MCP, Tool calling, LLM APIs