Production-grade web scraping & AI research, written by someone who ships it.
I'm Aleksei. I build Apify actors and write code-heavy tutorials about web scraping, data extraction, and AI workflows. Every article includes runnable Python and real benchmarks. I currently have 31 published actors (78 total in my portfolio) with real users.
Recent posts
-
Conditional GET in production scrapers: what I learned wiring it into 3 actors
Real numbers from 2,190 lifetime runs: 304 Not Modified saved 32-71% of bandwidth across Trustpilot, exchange-rate and npm-package actors. Code, failure modes, and when to skip it.
-
Three memory-leak patterns in long-running scrapers (and how I caught them after 968 Trustpilot runs)
Production scraping memory leaks: BeautifulSoup retention, growing URL queues, and connection-pool exhaustion. Real fixes with measured before/after from 968+ live Trustpilot scraper runs.
-
Token Economics of Agent-Driven Scraping: When LLM Agents Cost 50× More Than a Cron Job
Six months of production scrapers (970 runs on a single actor) showed LLM agent loops cost 30-80× more than deterministic crawlers above ~50 pages. Real token math, two narrow agent-win cases, and the fallback-only pattern.
-
5 Apify dataset deduplication patterns that stop double-billing your customers
Five production patterns to prevent silent dataset duplication on Apify: uniqueKey, content hashing, KV-store guards, and SQL-backed dedup. Real numbers from 968 Trustpilot runs.
Need a custom scraper or research solution?
Pilot pricing: $100 for 1 article or $150 for a 3-article series. Email [email protected] with the topic and I'll reply within 24 hours.
