Forem

DEVOPS_ECR

Cool X — Tue, 19 May 2026 07:03:38 +0000

Deploying with AWS ECR + GitHub Actions
A complete CI/CD pipeline: Build → Push to ECR → Deploy to EC2

What is ECR?
ECR (Elastic Container Registry) is AWS's private Docker image registry. Think of it as a private Docker Hub that lives inside AWS: instead of pushing your images to Docker Hub, you push them to ECR.

Why ECR over Docker Hub?
• Lives inside AWS: pulls from EC2 in the same region are fast and incur no data transfer charge
• Private by default, with no public exposure
• Integrates natively with IAM, so there are no separate registry credentials to manage
• Can scan images for known vulnerabilities automatically on push (when Scan on Push is enabled)

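The Big Picture
Here is how the pieces connect: GitHub Actions builds the images and pushes them to ECR, then connects to EC2 over SSH, where the instance pulls the new images from ECR and restarts the containers.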

ECR Repositories
How many repos do you need? Create ECR repos only for your own custom images. Official public images (MariaDB, Adminer) come directly from Docker Hub, so they need no ECR repo.

Service	Image Source	ECR Repo Needed?
Frontend	Your custom code	YES
Backend	Your custom code	YES
MariaDB	Docker Hub (mariadb:11)	NO
Adminer	Docker Hub (adminer:latest)	NO

ECR Repository URL Structure
After creating a repo, AWS gives you a URI like this:

123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp-frontend

123456789012 = your AWS account ID
dkr.ecr = ECR service
ap-south-1 = your region
amazonaws.com = AWS domain
/myapp-frontend = your repository name
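
To make the breakdown above concrete, here is a minimal Python sketch that splits a repository URI into the same parts. The `EcrUri` type, the `parse_ecr_uri` helper, and the regular expression are illustrative assumptions, not part of any AWS SDK, and the pattern only covers the standard `amazonaws.com` form shown here.

```python
import re
from typing import NamedTuple


class EcrUri(NamedTuple):
    account_id: str
    region: str
    repository: str

    @property
    def registry(self) -> str:
        # The registry host is everything before the first slash.
        return f"{self.account_id}.dkr.ecr.{self.region}.amazonaws.com"


# Hypothetical pattern: 12-digit account ID, region, then the repository name.
_ECR_URI = re.compile(
    r"^(?P<account_id>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)"
    r"\.amazonaws\.com/(?P<repository>.+)$"
)


def parse_ecr_uri(uri: str) -> EcrUri:
    match = _ECR_URI.match(uri)
    if match is None:
        raise ValueError(f"not a recognised ECR repository URI: {uri!r}")
    return EcrUri(**match.groupdict())


parts = parse_ecr_uri("123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp-frontend")
print(parts.account_id, parts.region, parts.repository)
```

The `registry` property returns the host part on its own, which is the value `docker login` expects.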

How to Create ECR Repos
AWS Console → ECR → Create Repository → do this twice:

Settings that matter:
• Visibility: Private (always)
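• Tag Immutability: Disabled for this setup. The workflow below re-pushes the latest tag on every deploy, and immutable tags would reject that overwrite. Enable it only if you deploy by unique tags such as the commit SHA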
• Scan on Push: Enabled — auto scans for CVEs on every push
• Everything else: leave as default

Create two repos named:
myapp-frontend
myapp-backend

IAM: Users and Roles

IAM User vs IAM Role

Aspect	IAM User	IAM Role
What it is	Permanent identity with fixed keys	Temporary identity, auto-assumed
Has permanent keys?	Yes	No
Who uses it	External things (GitHub Actions)	AWS services (EC2 → ECR)
Credentials	Access Key + Secret Key	Temporary token (auto-rotated)

IAM User — for GitHub Actions
GitHub Actions runs on a machine outside AWS. It needs real credentials to prove its identity to AWS.

Steps to create:

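1. AWS Console → IAM → Users → Create User
2. Name it: github-actions-ecr
3. Attach policy: AmazonEC2ContainerRegistryFullAccess (the narrower AmazonEC2ContainerRegistryPowerUser is enough if the user only needs to push and pull images)
4. Security Credentials tab → Create Access Key → choose CLI
5. IMPORTANT: Copy both keys immediately. The secret key is shown only once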

IAM Role — for EC2
EC2 lives inside AWS. Attach a role to the instance instead of putting keys on the server (keys on server = security risk).

Steps to create:

1. AWS Console → IAM → Roles → Create Role
2. Trusted entity type: AWS Service → EC2
3. Attach policy: AmazonEC2ContainerRegistryReadOnly
4. Name it: ec2-ecr-readonly
5. EC2 → Instances → your instance → Actions → Security → Modify IAM Role → attach it

Verify it worked by SSHing into EC2 and running:
aws sts get-caller-identity
If it returns your account ID and role name, the role is working.

GitHub Secrets
Go to: GitHub repo → Settings → Secrets and Variables → Actions → New Repository Secret

Secret Name	Value
AWS_ACCESS_KEY_ID	Your IAM user access key
AWS_SECRET_ACCESS_KEY	Your IAM user secret key
AWS_REGION	e.g. ap-south-1
EC2_HOST	Your EC2 public IP address
EC2_SSH_KEY	Full contents of your .pem file (including header/footer)

For EC2_SSH_KEY — open your .pem file, copy everything including the -----BEGIN RSA PRIVATE KEY----- header and -----END RSA PRIVATE KEY----- footer, paste as the secret value.

ECR Authentication Explained
ECR does not use a fixed username and password the way Docker Hub does. It uses temporary tokens that expire after 12 hours. The command to get and apply the token is:

aws ecr get-login-password --region ap-south-1 \
| docker login \
--username AWS \
--password-stdin \
123456789012.dkr.ecr.ap-south-1.amazonaws.com

• aws ecr get-login-password — asks AWS for a temporary password (valid 12 hours)
• | — pipes that password into the next command
• docker login --username AWS — logs Docker into ECR using the temp password
• --username AWS — ECR always uses the literal string AWS as username, not your IAM username

In GitHub Actions, the aws-actions/amazon-ecr-login@v1 action runs this automatically. You never write it manually in the workflow.
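
Why the username is always the literal string AWS becomes clearer when you look at the raw token. The underlying ECR authorization token is a base64 string that decodes to `AWS:<password>`; `get-login-password` hands you only the password half. The sketch below decodes a fabricated token to show that shape. The `split_ecr_token` helper and the token value are made up for illustration, and a real token would come from the ECR API.

```python
import base64


def split_ecr_token(authorization_token: str) -> tuple[str, str]:
    """Decode an ECR-style authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Fabricated token for illustration only; never hardcode a real one.
fake_token = base64.b64encode(b"AWS:example-temporary-password").decode("ascii")

username, password = split_ecr_token(fake_token)
print(username)
```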

The GitHub Actions Workflow

steps.login-ecr.outputs.registry Explained
This is GitHub Actions syntax for referencing the output of a previous step:

${{ steps.login-ecr.outputs.registry }}

steps = look in previous steps
login-ecr = the id you gave the step (id: login-ecr)
outputs = this step exposes output values
registry = the specific output: your ECR base URL

Resolves to: 123456789012.dkr.ecr.ap-south-1.amazonaws.com

The amazon-ecr-login action works out your account ID and region from your credentials and exposes the resulting registry URL as the registry output. Use it instead of hardcoding your account ID.
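
The registry output is only the host part. A full image reference adds the repository name and a tag. As a rough sketch of that composition (the `image_refs` helper and the short commit SHA are illustrative assumptions):

```python
def image_refs(registry: str, repository: str, sha: str) -> list[str]:
    """Return two references for one build: the commit SHA tag and latest."""
    base = f"{registry}/{repository}"
    return [f"{base}:{sha}", f"{base}:latest"]


refs = image_refs(
    "123456789012.dkr.ecr.ap-south-1.amazonaws.com",
    "myapp-frontend",
    "0a1b2c3",  # hypothetical commit SHA
)
print(refs)
```

Tagging each build with the commit SHA as well as latest keeps a pullable reference to every past build, which is useful for rollbacks.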

Complete Workflow File
Save this as .github/workflows/deploy.yml in your repo:

name: Build, Push to ECR, Deploy to EC2

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Code
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and Push Frontend
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/myapp-frontend:$IMAGE_TAG ./frontend
          docker tag $ECR_REGISTRY/myapp-frontend:$IMAGE_TAG $ECR_REGISTRY/myapp-frontend:latest
          docker push $ECR_REGISTRY/myapp-frontend:$IMAGE_TAG
          docker push $ECR_REGISTRY/myapp-frontend:latest

      - name: Build and Push Backend
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/myapp-backend:$IMAGE_TAG ./backend
          docker tag $ECR_REGISTRY/myapp-backend:$IMAGE_TAG $ECR_REGISTRY/myapp-backend:latest
          docker push $ECR_REGISTRY/myapp-backend:$IMAGE_TAG
          docker push $ECR_REGISTRY/myapp-backend:latest

      - name: Deploy to EC2
        uses: appleboy/ssh-action@v0.1.10
        with:
          host: ${{ secrets.EC2_HOST }}
          username: ubuntu
          key: ${{ secrets.EC2_SSH_KEY }}
          # Pass only the region: EC2 authenticates to ECR through its IAM role, not the CI keys
          envs: AWS_REGION
          script: |
            aws ecr get-login-password --region $AWS_REGION \
              | docker login \
                --username AWS \
                --password-stdin \
                123456789012.dkr.ecr.ap-south-1.amazonaws.com
            cd /home/ubuntu/myapp
            docker compose pull frontend backend
            docker compose up -d --no-build
            docker image prune -f

Docker Compose File
This file lives both in your GitHub repo and on your EC2 instance at /home/ubuntu/myapp/docker-compose.yml

services:
  frontend:
    image: 123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp-frontend:latest
    ports:
      - "3000:3000"
    depends_on:
      - backend
    restart: always

  backend:
    image: 123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp-backend:latest
    ports:
      - "8000:8000"
    depends_on:
      - db
    environment:
      DB_HOST: db
      DB_PORT: 3306
      DB_NAME: myapp
      DB_USER: myuser
      DB_PASS: mypassword
    restart: always

  db:
    image: mariadb:11
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: myapp
      MYSQL_USER: myuser
      MYSQL_PASSWORD: mypassword
    volumes:
      - db_data:/var/lib/mysql
    restart: always

  adminer:
    image: adminer:latest
    ports:
      - "8080:8080"
    restart: always

volumes:
  db_data:

Things to Change for Your Project
File	What to Change	Change To
docker-compose.yml	ECR URI in image: fields (x2)	Your actual ECR repo URIs from AWS console
deploy.yml	./frontend in docker build	Actual path to your frontend folder in repo
deploy.yml	./backend in docker build	Actual path to your backend folder in repo
deploy.yml	ECR URL in docker login script	Your actual ECR registry base URL
deploy.yml	username: ubuntu	ec2-user if using Amazon Linux, ubuntu if Ubuntu
deploy.yml	/home/ubuntu/myapp	Path where docker-compose.yml lives on EC2
deploy.yml	myapp-frontend / myapp-backend	Your actual ECR repo names if different

EC2 Instance Setup
SSH into your EC2 and make sure these are installed:

# Check Docker
docker --version

# Check Docker Compose
docker compose version

# Install AWS CLI if missing
sudo apt update
sudo apt install awscli -y

# Verify AWS CLI
aws --version

# Create app directory
mkdir -p /home/ubuntu/myapp

Copy your docker-compose.yml to EC2 (run this from your local machine):

scp -i your-key.pem docker-compose.yml ubuntu@your-ec2-ip:/home/ubuntu/myapp/

Complete Setup Checklist
Do these in order before your first deploy:

• Create ECR repo: myapp-frontend
• Create ECR repo: myapp-backend
• Create IAM User (github-actions-ecr) with ECR full access
• Generate access keys for IAM user
• Add AWS_ACCESS_KEY_ID to GitHub Secrets
• Add AWS_SECRET_ACCESS_KEY to GitHub Secrets
• Add AWS_REGION to GitHub Secrets
• Add EC2_HOST to GitHub Secrets
• Add EC2_SSH_KEY to GitHub Secrets
• Create IAM Role (ec2-ecr-readonly) with ECR read-only access
• Attach IAM Role to EC2 instance
• Install AWS CLI on EC2
• Create /home/ubuntu/myapp directory on EC2
• Copy docker-compose.yml to EC2
• Create .github/workflows/deploy.yml in your repo
• Push to main branch and watch GitHub Actions tab

What is browser, DOM, Real DOM, Virtual DOM

subash — Tue, 19 May 2026 07:02:59 +0000

1. What is a browser?

A browser is software used to access and display websites on the internet.

2. What is the DOM?

DOM stands for Document Object Model.
The browser parses HTML and converts it into a tree-like structure called the DOM.

Example HTML:

<body>
  <h1>Hello</h1>
  <p>Welcome</p>
</body>

DOM Structure

Body
 ├── h1
 │    └── Hello
 └── p
      └── Welcome
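
To see the HTML-to-tree conversion in action outside a browser, here is a small Python sketch using the standard library's `html.parser`. It is only a simplified model: the `Node`, `TreeBuilder`, and `outline` names are illustrative, and a real browser's DOM also handles attributes, malformed markup, and much more.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, parent=None):
        self.name = name          # tag name, or the text itself for text nodes
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simplified tree of elements and text from HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node       # descend into the new element

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back up

    def handle_data(self, data):
        text = data.strip()
        if text:                  # skip whitespace-only text between tags
            self.current.children.append(Node(text, parent=self.current))


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<body><h1>Hello</h1><p>Welcome</p></body>")
tree_lines = outline(builder.root)
print("\n".join(tree_lines))
```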

What is the Real DOM?

The Real DOM is the actual DOM that the browser builds and keeps in memory.

It directly represents the webpage shown to the user.

The DOM is created by the browser, not by JavaScript.

When the browser reads HTML:

<h1>Hello</h1>

the browser automatically converts it into a DOM structure internally.

Document
 └── h1
      └── "Hello"

JavaScript does not create the DOM from scratch.
JavaScript only:

* accesses the DOM
* modifies the DOM
* updates the DOM

Problems with the Real DOM

Updating the Real DOM can be slow because each change may force the browser to:

* Recalculate layout
* Repaint the UI
* Re-render the affected elements

If many updates happen frequently, performance suffers.

This matters most in:

* Large applications
* Dynamic websites
* Real-time updates

Virtual DOM

The Virtual DOM is a lightweight in-memory representation of the Real DOM, built from plain JavaScript objects.
Libraries like React use a Virtual DOM to improve performance.

Instead of updating the Real DOM directly:
1. React creates a Virtual DOM copy
2. Changes are made in the Virtual DOM first
3. React compares the old and new Virtual DOM
4. Only the changed parts are updated in the Real DOM

The comparison step is called diffing.

The overall process of working out the differences and applying only the necessary changes to the Real DOM is called reconciliation.

Virtual DOM Working Flow

User Action
↓
State Changes
↓
New Virtual DOM Created
↓
Compare with Old Virtual DOM
↓
Find Differences
↓
Update Only Changed Elements in Real DOM.
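
The compare-and-patch idea in this flow can be sketched in a few lines of Python. This is a toy model, not React's algorithm: the node shape, the `h` and `diff` helpers, and the patch tuples are all assumptions made for illustration, and real libraries also use keys to match list items that have moved.

```python
def h(tag, text="", children=None):
    """Create a virtual node as a plain dict (hypothetical shape)."""
    return {"tag": tag, "text": text, "children": children or []}


def diff(old, new, path="root"):
    """Compare two virtual trees and return a list of patch operations."""
    if old is None:
        return [("create", path, new["tag"])]
    if new is None:
        return [("remove", path)]
    if old["tag"] != new["tag"]:
        return [("replace", path, new["tag"])]

    patches = []
    if old["text"] != new["text"]:
        patches.append(("set_text", path, new["text"]))

    # Walk children by position and collect patches from each subtree.
    for i in range(max(len(old["children"]), len(new["children"]))):
        old_child = old["children"][i] if i < len(old["children"]) else None
        new_child = new["children"][i] if i < len(new["children"]) else None
        patches.extend(diff(old_child, new_child, f"{path}/{i}"))
    return patches


old_tree = h("body", children=[h("h1", "Hello"), h("p", "Welcome")])
new_tree = h("body", children=[h("h1", "Hello"), h("p", "Welcome back")])

patches = diff(old_tree, new_tree)
print(patches)
```

Only the paragraph's text changed, so the result is a single patch for that one node. The heading and the body element are left alone.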

What is the New Virtual DOM?

When data changes in React:

1. React creates another, updated Virtual DOM
2. This updated version is called the New Virtual DOM

React then compares the Old Virtual DOM with the New Virtual DOM.

Why React Uses Virtual DOM

1. Faster rendering
2. Better performance
3. Efficient updates
4. Smooth user experience
5. Easier UI management

Real DOM vs Virtual DOM

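Feature	Real DOM	Virtual DOM
Speed	Slower when updated often	Faster for frequent updates
Updates	Applied directly, each one can trigger layout and repaint	Diffed first, only changed parts applied
Performance	Less efficient	More efficient
Node cost	Heavier browser objects	Lightweight JavaScript objects
User Experience	Can become slow	Smooth UI updates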

The browser creates the Real DOM from HTML.
Directly updating the Real DOM can be slow.

To address this, React uses a Virtual DOM, which acts as a lightweight copy of the Real DOM.

Whenever data changes:

1. A New Virtual DOM is created
2. React compares the old and new Virtual DOMs
3. Only the necessary changes are applied to the Real DOM

Why Neleto Exists

Martin — Tue, 19 May 2026 07:00:00 +0000

every time. And yet, the tools we rely on to manage websites often feel like they’re stuck in the previous decade.

That tension is exactly why Neleto exists.

The friction we kept running into

As a digitalization partner working with mid-sized companies, we saw the same patterns again and again. Traditional CMS platforms were slow, bloated, and increasingly painful to maintain. Page speed suffered, security required constant vigilance, and every new feature seemed to demand another plugin. Clients loved the end result of a custom site but hated the editing experience — or worse, they accidentally broke things.

Headless and “modern” alternatives solved some problems but created others. They were powerful for developers, yet they often pushed complexity onto clients or required expensive frontend frameworks and ongoing specialist work. Pricing frequently scaled with seats or traffic in ways that punished growing businesses. And data residency? For European companies, that was rarely a first-class concern.

Then AI changed everything.

Tools like Cursor, Claude, and Windsurf started letting developers move dramatically faster. But the content layer — the actual website content that real businesses live and die by — remained disconnected from these new workflows. AI could help write code or generate text, but it couldn’t safely and natively update the live site without fragile custom integrations.

We didn’t want to keep patching around these limitations. We wanted to remove them.

What we decided to build

Neleto was born from a simple conviction: the best CMS should feel invisible to clients and empowering to developers — while being ready for the AI-native future that’s already here.

That conviction led to several non-negotiable decisions:

Performance as a foundation, not a feature. We built the backend in Rust. The result is websites that are dramatically faster than traditional PHP or Node.js solutions — often 10-50× quicker in real-world scenarios. Faster sites rank better, convert better, and cost less to run. For agencies and freelancers delivering client work, that speed advantage compounds every single day.
Two worlds, one system. Developers get full control: direct HTML access, a clean plugin API, and complete freedom to build exactly what they need. Non-technical editors and clients get a genuinely pleasant admin interface where they can manage pages, blog posts, events, translations, and files without training or fear of breaking the site. Role-based permissions keep everything safe and organized.
AI that actually belongs in a CMS. Neleto includes a native MCP (Model Context Protocol) server — the only one we’re aware of built into a CMS from the ground up. This means AI agents can securely read and write content directly, following the same permissions and workflows humans use. It’s not a bolted-on chatbot or a future roadmap item. It’s there today, ready for the way developers and teams are already working.
European pragmatism. We host in regions you choose, with strong GDPR compliance when you select EU/Germany servers. Your data stays where you want it. Pricing is transparent and predictable. There’s no vendor lock-in — export your content whenever you like. And migration from WordPress is built in, because we know many great sites still live there.

Built by people who ship websites every day

Neleto isn’t a theoretical product designed in a vacuum. It grew out of real client work at Triple-A Soft. We kept feeling the same friction points and eventually decided the best way to solve them for our clients (and ourselves) was to build the tool we actually wanted to use.

We made it affordable enough for freelancers and small agencies while powerful enough for teams. We included the content types and features most sites need out of the box so you spend less time configuring and more time delivering value. And we designed it to get better as AI capabilities advance, rather than having to be retrofitted later.

The future we’re building toward

Neleto exists because we believe the next era of the web belongs to teams that can move fast without sacrificing control, performance, or simplicity. Developers should leverage AI as a true collaborator. Editors should feel confident managing their own content. Businesses should own their data and their speed.

That’s the CMS we wanted. So we built it.

If you’ve ever felt the gap between how fast you can develop and how painful it is to hand a site over to a client…

If you’ve ever wished your content tools kept pace with your AI-assisted workflow…

If you care about performance, compliance, and not getting locked into expensive or bloated platforms…

…then Neleto was built for you.

We’re just getting started. Try it for free, explore the documentation, or reach out if you’d like to talk about how it fits your workflow. We’re building Neleto in public with real users, and we’d love to have you along for the ride.

Fast websites. Easy content. AI native.

That’s not just our tagline. It’s why we exist.
Visit neleto.io

Cortex Search vs Hybrid SQLite RAG — A Cost and Latency Teardown

soy — Tue, 19 May 2026 06:54:50 +0000

There are two competent ways to build a retrieval system in 2026, and they sit at opposite ends of the build-versus-buy spectrum.

On one end, Snowflake's Cortex Search wraps the entire RAG pipeline — chunking, embedding, indexing, incremental sync — into a single SQL statement. On the other, a 200-line Python script using SQLite's FTS5 module and the sqlite-vec extension can deliver hybrid keyword-plus-semantic retrieval on a laptop, with no servers and no monthly bill.

The marketing case for Cortex is well-rehearsed. The marketing case for SQLite is almost nonexistent because nobody is paid to make it. This post is the teardown — what each one charges for, where the latency actually lives, and the decision criteria for picking one over the other.

For the broader strategic context on why Snowflake's RAG offering exists at all, see the hub piece Why Snowflake's Bet on Streamlit Just Works. This article is the head-to-head.

What Cortex Search actually charges for

The headline pitch is that you create a search service in one statement:

CREATE OR REPLACE CORTEX SEARCH SERVICE my_rag_service
  ON search_text_column
  ATTRIBUTES product_category
  WAREHOUSE = my_warehouse
  TARGET_LAG = '1 hour'
  AS SELECT * FROM my_table;

That is genuinely the entire pipeline. But "no pipeline" does not mean "no cost." Cortex Search bills along several axes that are worth understanding before you commit:

Indexing compute. When you create the service, Snowflake reads the source table, chunks the text, generates embeddings, and builds the index. That work runs on the warehouse you specified. For a corpus of a few hundred thousand documents, this is a one-time charge in the single-digit-credits range. For tens of millions of documents, it is meaningfully more.

Refresh compute, driven by TARGET_LAG. This is the parameter most people underestimate. TARGET_LAG = '1 hour' means Snowflake will wake the warehouse at least once an hour to check for new or changed rows and update the index. Set it to '1 minute' and you are paying for sixty refreshes an hour, even if no data changed. The warehouse auto-suspends, but the "wake, check, sleep" pattern adds up on chatty data.

Serving compute. Each query against the service consumes warehouse seconds. The warehouse needs to be running (or to wake on demand, which adds latency) to serve queries.

Storage. The embeddings and index live on Snowflake storage. For a 500 GB document corpus, the index can add another 50–100 GB depending on chunking and embedding dimensionality.

Egress, if you serve from outside Snowflake. Calling the service from a Streamlit app running in Snowflake is free. Calling it from a FastAPI service running in a different cloud means egress fees on the responses.

None of these are surprises if you read the docs. But the practical effect is that "build a RAG over your warehouse" is a real monthly bill, and that bill scales with how fresh you want the index and how often you query it.

What the SQLite stack charges for

Zero, structurally.

The components — SQLite, the FTS5 module, sqlite-vec, Python — are all free and open-source. They run on whatever hardware you already have. A laptop is fine. A $5 VPS is fine. A Raspberry Pi handles a million-row corpus comfortably.

What you pay instead is:

Engineering time, up front. Setting up the schema, writing the chunking logic, picking an embedding model, building the indexing script, writing the hybrid query, tuning the BM25 parameters. This is a one-time cost measured in days, not months.

Engineering time, ongoing. When the corpus grows, when you change embedding models, when you want to add metadata filters, you write the code yourself. Snowflake does this for you.

Embedding API calls, if you don't self-host the model. OpenAI's text-embedding-3-small runs about $0.02 per million tokens. A million 512-token documents embeds for around $10 one-time, plus pennies per month on queries. Self-host a small embedding model and even that goes to zero.
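
The embedding figure above is easy to re-derive for your own corpus. A minimal sketch of the arithmetic, using the price quoted in the paragraph (the function name is illustrative, and prices change, so treat the constant as an assumption to update):

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, the text-embedding-3-small figure quoted above


def embedding_cost(num_documents: int, tokens_per_document: int) -> float:
    """One-time cost in USD to embed a corpus."""
    total_tokens = num_documents * tokens_per_document
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


cost = embedding_cost(num_documents=1_000_000, tokens_per_document=512)
print(f"${cost:.2f}")
```

A million documents of 512 tokens is 512 million tokens, which at this price comes to $10.24.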

The economic shape is the inverse of Cortex's. Cortex is "low fixed cost, real variable cost." Self-host is "real fixed cost, near-zero variable cost." Where the lines cross depends entirely on how much traffic you have and how much engineering bandwidth you can spare.

Latency, measured honestly

For a corpus of about a million chunks, here is what each stack looks like on the wire:

Layer	Cortex Search	SQLite + FTS5 + sqlite-vec
Network hop to retrieval	30–80 ms (cloud round-trip)	0 ms (in-process)
Warehouse wake (if cold)	1–5 seconds	n/a
Keyword retrieval	~50 ms	1–5 ms
Vector retrieval	~50 ms	5–20 ms
Rerank / fusion	bundled	1–2 ms (RRF in SQL)
Warm-path total	~100–150 ms	~10–30 ms
Cold-path total	~1–5 seconds	unchanged (~10–30 ms)

The warm-path comparison is closer than people assume. Cortex is genuinely fast once the warehouse is hot. The cold-path comparison is where it bites — if your traffic is sparse enough that the warehouse keeps suspending, your users see multi-second waits on the first query of a session. You can pay to keep the warehouse warm, which puts you back in the variable-cost discussion.

The SQLite path has no cold start because there is no process to wake — the database is just a file, opened on demand. For low-traffic or latency-sensitive applications, this is a real advantage and not a marginal one.

Retrieval quality, measured carefully

This is where the conversation usually goes sideways. Vendor RAGs are assumed to be more accurate than handcrafted ones, mostly because of brand effect. The reality is more interesting.

Cortex Search uses a hybrid retrieval approach internally — BM25-style keyword search combined with vector similarity, with a reranker on top. The reranker is the part you cannot replicate trivially at home; it is a learned model and Snowflake does not expose its weights.

A well-built SQLite hybrid (BM25 trigram via FTS5 + dense embeddings via sqlite-vec, combined with Reciprocal Rank Fusion at k=60) reaches around 90–95% of Cortex-style quality on typical retrieval benchmarks. The gap is the reranker. For a lot of use cases — document Q&A, internal search, even most customer-facing applications — that gap is not the bottleneck. The LLM generating the final answer dominates the quality signal.
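
Reciprocal Rank Fusion is simple enough to show in full. Each document scores 1 / (k + rank) in every list where it appears, and the scores are summed. The sketch below does this in plain Python rather than SQL. The document ids and the two input lists are invented for illustration, standing in for the results of an FTS5 query and a sqlite-vec query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids; each list is ordered best-first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # stand-in for a BM25 keyword result
vector_hits = ["doc_c", "doc_a", "doc_d"]   # stand-in for a vector similarity result

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, and no score normalization between BM25 and vector distance is needed because only ranks are used.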

If you genuinely need reranker-level precision, you can bolt on a cross-encoder reranker yourself (bge-reranker-v2-m3, runs locally on CPU, free) and close most of the remaining gap. It costs you 50 ms of latency per query.

Governance is where Cortex earns its keep

The argument for Cortex that does not dissolve under engineering scrutiny is governance.

If your corpus is medical records, financial filings, or anything else subject to data residency law, the question "can your retrieval system promise that no document text or embedding ever leaves the warehouse boundary" has a one-word answer in Cortex (yes) and a long, careful answer in any self-hosted setup. Snowflake's RBAC, masking policies, and audit logs apply to Cortex retrieval automatically. The data does not move; the search lives next to the storage.

Replicating that property in a self-hosted stack is possible — you can run SQLite plus your embedding model entirely inside your own private network — but you are now the auditor, the access-control implementer, and the compliance team. For a regulated enterprise, that is not engineering effort, it is regulatory risk. For a solo builder or a non-regulated startup, it is just the normal cost of running your own infrastructure.

When to pick which

The decision is more about your constraints than about technical superiority.

Pick Cortex Search when: your data already lives in Snowflake; your compliance posture requires data residency; you have warehouse credits to spend and not enough engineers to spend; query volume is steady enough that warehouse cost is predictable; you want SQL to be the only language anyone touches.

Pick the SQLite hybrid when: you own the hardware and the data; your traffic is bursty or low; latency matters more than absolute quality; you have at least one engineer who is comfortable with the stack; the marginal cost of a query needs to be zero.

Pick a mix when: the bulk of your corpus is non-sensitive and goes on SQLite for cost; a smaller regulated subset goes in Snowflake for governance; a thin orchestration layer routes queries based on what each corpus contains. This is the actual answer for most mid-sized companies once they look closely.

Where the conversation usually ends

Most vendor-versus-self-host comparisons end with "it depends," which is true but useless. The more honest version is that Cortex Search is a specific, well-designed product solving a specific, expensive problem (the last-mile RAG for an enterprise warehouse), and the SQLite hybrid is a specific, well-tested set of components solving a specific, different problem (cheap retrieval over data you already own).

They are not really competitors. They are answers to different questions, and the cost-and-latency numbers above are mostly useful for figuring out which question you are actually asking.

For the implementation walkthrough of the SQLite stack itself, see Building a Hybrid RAG in 200 Lines. For the strategic context on why Cortex exists at all, see Why Snowflake's Bet on Streamlit Just Works.

Inside Streamlit's Re-Run Model — Why Hot Reload Feels Instant

soy — Tue, 19 May 2026 06:54:17 +0000

The first time you save a Streamlit file in your editor and watch the browser update before your hand leaves the keyboard, you assume it is some clever diffing magic. It is not. The mechanism underneath is closer to a confession than an algorithm: Streamlit just re-runs your entire Python script, top to bottom, every time anything changes.

Once you understand that, the whole framework stops feeling magical and starts feeling honest. This post is about why that decision was the right one, what makes it fast, and the small handful of concepts you actually need to internalize to use it well.

If you want the strategic context for why this matters — Snowflake's acquisition, Cortex Search, the Community Cloud economics — that's in the companion piece Why Snowflake's Bet on Streamlit Just Works. This article is the engineering deep-dive.

What "hot reload" usually costs

In a normal Python web stack — FastAPI plus uvicorn, Flask plus gunicorn, Django plus anything — the server process is long-running. It holds the routes, the app state, the database connection pool, the imported modules. When you change a file, the dev server has to:

1. Detect the file change.
2. Tear down the existing process (or at least invalidate its module cache).
3. Re-import everything from scratch.
4. Rebind routes and middleware.
5. Open new sockets and resume accepting connections.

The fastest dev servers in this style (uvicorn with --reload, Flask's debugger, Django's runserver) manage this in a second or two on a small app and noticeably longer on a real one. You save a file, you tab over to the browser, you refresh, you wait. A loop of a second or two sounds fine on paper, but repeated dozens of times it turns "tweak the padding" into a five-minute task.

The cost is structural. As long as there is a server process to restart, restart time has a floor.

What Streamlit does instead

Streamlit's mental model removes the server-process-as-stateful-thing entirely. The server is still there — it accepts HTTP, it serves WebSocket frames — but your app code is not living inside it across requests. Your script is a script. It runs from the first line to the last line, draws the UI as a side effect, and exits.

When something changes — you saved the file, the user clicked a button, a slider moved — the runner just runs your script again. From scratch. Top to bottom. As if you had typed python app.py at a terminal.

import streamlit as st

st.title("A small demo")
n = st.slider("Pick a number", 1, 100, 50)
st.write(f"You picked {n}.")

When the slider moves, the entire file executes again. st.title runs again. st.slider runs again (and returns the new value). st.write runs again. The browser sees the new state.

The reason this is fast is that there is no restart cost. A Python script of a few hundred lines takes single-digit milliseconds to execute if you've avoided heavy work at module scope. The runner simply executes your script again and ships the resulting UI tree over a socket.
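
The re-run idea can be modeled without Streamlit at all. In this plain-Python sketch, `app` stands in for your script and `rerun` for the runner. Both names, and the way widget values are passed in, are assumptions made for illustration rather than Streamlit internals.

```python
def app(widget_values, ui):
    """Stand-in for a Streamlit script: reads widget values, emits UI elements."""
    ui.append(("title", "A small demo"))
    n = widget_values.get("slider", 50)  # default mirrors the demo above
    ui.append(("text", f"You picked {n}."))


def rerun(widget_values):
    """Run the whole 'script' from the top and return the UI it produced."""
    ui = []
    app(widget_values, ui)
    return ui


first = rerun({})               # initial page load
second = rerun({"slider": 73})  # the user moved the slider
print(first)
print(second)
```

Nothing is patched in place: each interaction produces a complete new UI description from a fresh run, and the only input that differs is the widget value.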

The WebSocket pipe

The other half of the trick is on the wire. A traditional web app communicates with the browser through HTTP request/response cycles — the browser asks for a page, the server returns one, repeat. Hot reload in that world means the browser has to either poll or re-request.

Streamlit holds a persistent WebSocket connection between the browser tab and the server for the entire session. The server runs your script, builds the UI tree, diffs it against what the browser is currently showing, and pushes only the changed nodes through the socket. No page refresh, no F5, no re-fetch.

This is what closes the loop between "I saved a file" and "the screen updated." A file system watcher inside the Streamlit dev server picks up the change, triggers a re-run of your script, the new UI tree gets diffed against the last one, and the delta lands in the browser through the open socket — all within the time it takes you to glance at the browser window.

In production on Streamlit Community Cloud or Streamlit in Snowflake, the file watcher is gone (the code isn't changing), but the rest of the machinery is identical. Every user interaction triggers a script re-run, and the WebSocket pushes the diff back.

The three concepts you actually have to learn

The re-run model has one obvious problem. If your script runs from scratch every time, how do you keep anything across reruns? How do you avoid re-loading a 4 GB model on every click?

Streamlit's answer is three explicit escape hatches. That is the entire API surface for state and persistence. Learn these three, and you can build almost anything.

1. `st.session_state` for per-session memory

A dictionary scoped to the current browser session. Survives reruns. Does not survive the user closing the tab or the app sleeping.

import streamlit as st

if "count" not in st.session_state:
    st.session_state.count = 0

if st.button("Increment"):
    st.session_state.count += 1

st.write(f"Clicked {st.session_state.count} times.")

Without session_state, that counter would reset to zero on every click, because the script re-runs from scratch and count = 0 would execute again.
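
The same behaviour can be reproduced in plain Python to show why the state has to live outside the script. Here `counter_script` is a hypothetical stand-in for the counter script, and an ordinary dict plays the role of the session state.

```python
def counter_script(session_state, clicked):
    """Stand-in for the counter script: runs top to bottom on every interaction."""
    if "count" not in session_state:
        session_state["count"] = 0
    if clicked:
        session_state["count"] += 1
    return f"Clicked {session_state['count']} times."


session = {}  # lives outside the script, one per browser session
outputs = [counter_script(session, clicked) for clicked in (False, True, True)]

# Without persistent state, a fresh dict per run forgets earlier clicks.
stateless = [counter_script({}, clicked) for clicked in (False, True, True)]
print(outputs[-1], "|", stateless[-1])
```

After one page load and two clicks, the persistent version reports two clicks while the stateless version is stuck at one.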

2. `@st.cache_data` for expensive data

Decorator for pure-ish functions that return data. Streamlit hashes the arguments, executes the function once, and returns the cached result on subsequent calls with the same inputs. Cache survives reruns and (optionally) reboots.

import streamlit as st
import pandas as pd

@st.cache_data(ttl=3600)  # cache for an hour
def load_sales():
    return pd.read_parquet("sales.parquet")

df = load_sales()
st.dataframe(df)

Without the decorator, that Parquet file would be read from disk on every script re-run — every slider move, every button click.
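The "hash the arguments, run once, reuse" behavior can be sketched with the standard library. This is a toy memoizer under stated assumptions, not Streamlit's code: cache_data_sketch is a hypothetical name, the real decorator uses its own hashing and storage, and the region argument is added only so there is something to hash.

```python
import hashlib
import pickle
import time
from functools import wraps


def cache_data_sketch(ttl: float):
    """Toy memoizer: hash the arguments, run once, reuse until the TTL expires."""
    def decorator(func):
        store = {}  # key -> (expiry time, pickled result)

        @wraps(func)
        def wrapper(*args, **kwargs):
            raw_key = pickle.dumps((args, sorted(kwargs.items())))
            key = hashlib.sha256(raw_key).hexdigest()
            now = time.monotonic()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return pickle.loads(hit[1])  # cache hit: hand back a fresh copy
            result = func(*args, **kwargs)   # cache miss: do the expensive work
            store[key] = (now + ttl, pickle.dumps(result))
            return result
        return wrapper
    return decorator


calls = []  # records how often the "expensive" body actually runs


@cache_data_sketch(ttl=3600)
def load_sales(region):
    calls.append(region)
    return [{"region": region, "revenue": 100}]


first = load_sales("emea")
second = load_sales("emea")  # same arguments: served from the cache
third = load_sales("apac")   # new arguments: the function body runs again
```

Two calls with the same argument run the body once; a different argument produces a different key and a second run.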

3. `@st.cache_resource` for expensive objects

Same idea as cache_data, but for things you do not want serialized — database connections, ML models, anything where the object identity matters or where pickling would be wasteful.

import streamlit as st
from sentence_transformers import SentenceTransformer

@st.cache_resource
def get_model():
    return SentenceTransformer("intfloat/multilingual-e5-base")

model = get_model()

Without this, your 400 MB embedding model would be re-loaded into VRAM on every interaction. With it, the model lives for the lifetime of the server process and is shared across all sessions.
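The practical difference between the two caches is copy versus identity. A minimal sketch, assuming two hypothetical helpers (cached_data and cached_resource) and a fake model class in place of a real one:

```python
import copy

_data_cache = {}
_resource_cache = {}


def cached_data(key, loader):
    """Data cache: every caller receives its own copy of the stored value."""
    if key not in _data_cache:
        _data_cache[key] = loader()
    return copy.deepcopy(_data_cache[key])


def cached_resource(key, factory):
    """Resource cache: every caller receives the one shared object."""
    if key not in _resource_cache:
        _resource_cache[key] = factory()
    return _resource_cache[key]


class FakeModel:
    """Stand-in for an expensive object such as a loaded embedding model."""
    loads = 0

    def __init__(self):
        FakeModel.loads += 1  # count how many times the "model" is built


rows_a = cached_data("sales", lambda: [1, 2, 3])
rows_b = cached_data("sales", lambda: [1, 2, 3])
rows_a.append(4)  # mutating one copy does not leak into the other

model_a = cached_resource("model", FakeModel)
model_b = cached_resource("model", FakeModel)  # same object, built once
```

Copies make cached data safe to mutate per caller. Sharing is what you want for a model or a connection pool, with the usual caveat that a shared object is also shared state.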

That is the entire mental model. session_state for "remember this for this user." cache_data for "remember this value." cache_resource for "remember this object." Compared to learning the React component lifecycle or FastAPI's dependency injection system, this is genuinely a few hours of reading.

Where the re-run model gets in your way

It is not free, and pretending otherwise would be dishonest.

Side effects at module scope are dangerous. If you write requests.get(...) at the top level of your script, that HTTP call fires on every re-run. Wrap any I/O in @st.cache_data, or put it in a function that is only called conditionally.

Long-running operations block the UI. A re-run is synchronous from the user's point of view. If a click triggers a function that takes ten seconds, the UI freezes for ten seconds. Stream output, show progress with st.status, or push the work to a background process.

Mutable globals do not behave the way you expect. Whether a mutation to a module-level list is visible on the next re-run depends on where the list lives: names defined in the main script are rebuilt on every run, while objects inside imported modules stay in Python's module cache (sys.modules) and persist. Use session_state for anything that needs to mutate.

Forms exist for a reason. Without st.form, every widget interaction triggers an immediate re-run. For multi-field inputs where you want one submission, wrap them in a form so the re-run fires only on submit.

None of these are deal-breakers, but they are real, and they reward writing your Streamlit code more like a pure function than like a stateful class.

Why the design holds up

The re-run model is the architectural decision that most defines Streamlit. It is also the one most likely to make a senior backend engineer wince on first contact. "You re-run the whole script every time? That's absurd."

It works because two things are true simultaneously. Python is fast enough at executing a few hundred lines that re-running in milliseconds is achievable. And the three escape hatches — session, data cache, resource cache — give you the exit valves you actually need without inventing a state-management framework.

The result is that the simple case is genuinely simple — five lines of Python and you have a working app — and the complex case is still tractable, you just have to be honest about where your state lives.

For a UI library aimed at data and ML practitioners who do not want to learn web frameworks, this is the right trade. The fact that it also produces the fastest "save file, see change" loop in Python is a free side effect of the architecture, and it is the thing that keeps the framework feeling lightweight even as the apps you build on it grow.

If you want to see this architecture stitched together with Snowflake's strategic bet and Community Cloud's deployment story, the hub article is Why Snowflake's Bet on Streamlit Just Works — And Where Solo Builders Still Win.

Why Snowflake's Bet on Streamlit Just Works — And Where Solo Builders Still Win

soy — Tue, 19 May 2026 06:53:44 +0000

Last night I finished a Streamlit app at 3 AM. It is an electronic whiteboard for a factory floor — monthly schedule, dispatch board, safety announcements, partner-company tallies, attendance heatmap, handwritten notes with PDF support, all on one screen. I thought I was done at 11 PM. The last four hours were the usual: that final 0.5% of padding, alignment, and "why is this one cell two pixels off" that consumes half the project.

Somewhere around 2 AM I started thinking about why Streamlit lets me move this fast, and the answer pulled me into a longer thread about Snowflake's strategy, the economics of free developer tools, and where solo builders like me still have an edge over the enterprise stack.

Here's the take.

The acquisition that quietly made sense

In 2022, Snowflake bought Streamlit for around $800 million. At the time, plenty of people called it strange. Snowflake is a data warehouse company. Streamlit is a Python UI library. What's the connection?

The connection is that Snowflake had a problem most B2B data platforms have: once a customer's data lives inside your warehouse, the most expensive friction is the last mile — building the application that actually surfaces that data to a human. You can charge them for storage, for compute, for query credits, but if every customer has to spin up a separate frontend team to ship a dashboard or a search interface, your platform becomes a tax instead of a product.

Buying Streamlit solved that elegantly. Now the pitch is: keep your data in Snowflake, write a Python script, and you have an internal app. No frontend hire. No deployment pipeline. No infrastructure team. The "last mile" becomes a function call.

Giving Streamlit away for free, including the open-source library and Streamlit Community Cloud, is not a charity move. It is the cheapest enterprise marketing channel ever invented. Every Python developer who builds a side project on Streamlit becomes a potential advocate inside a company that is evaluating Snowflake. The cost to Snowflake is real but bounded — Community Cloud apps run on spare capacity from their massive compute fleet, sleeping when idle, sharing resources tightly. The acquisition pays for itself the moment one of those developers brings Snowflake into a procurement conversation.

This is not a criticism. It is one of the cleanest examples of a developer-tools acquisition strategy I have seen.

Cortex Search: SQL is all you need

The real payoff of the strategy shows up in something like Cortex Search. The whole "build a RAG pipeline" ceremony — load the documents, chunk them, embed them with an OpenAI key, store the vectors in pgvector or Pinecone or Weaviate, wire up retrieval, keep the index in sync — collapses into one SQL statement:

CREATE OR REPLACE CORTEX SEARCH SERVICE my_rag_service
  ON search_text_column
  ATTRIBUTES product_category
  WAREHOUSE = my_warehouse
  TARGET_LAG = '1 hour'
  AS SELECT * FROM my_table;

That is the entire pipeline. Embedding, indexing, incremental sync, the whole thing. Hand this to an enterprise with 500 GB of internal documents and they can stand up a searchable RAG app in an afternoon, ship it on Streamlit in Snowflake, and never move the data outside their security boundary.

For companies that cannot legally let their data leave the warehouse — financial services, healthcare, anything with strict residency requirements — this is not a convenience. It is the only sane architecture. Role-based access control, masking policies, audit logs all carry over from the warehouse layer into the RAG layer automatically. You are not bolting governance onto an AI pipeline; you are inheriting it.

The number of vendors who can match this in 2026 is small.

Four design decisions that make this work

When you stand back from the marketing and look at why this ecosystem holds together, it comes down to four design decisions that are unusually disciplined for a stack this large:

Separation of concerns. Snowflake owns the data and the compute. Streamlit owns the presentation. The boundary between them is a SQL query. There is no ORM layer trying to be clever, no middleware tier to babysit. Each side does exactly one thing.
Progressive complexity. You can start on Streamlit Community Cloud with a public repo and zero credentials, graduate to Streamlit in Snowflake when you need enterprise governance, and self-host when you need full control. The same code runs in all three. Few stacks let you slide along that axis without a rewrite.
Security by default. Secrets live in secrets.toml locally and in the platform's secret manager in production — you never paste a connection string into your source code. RBAC, masking, and audit logs come from Snowflake, not from your app code. The defaults are the right defaults.
Developer ergonomics. Connecting to a Snowflake warehouse and rendering a queryable dataframe is, end to end, this:

import streamlit as st

conn = st.connection("snowflake")
df = conn.query("SELECT * FROM my_table")
st.dataframe(df)

Four lines of code. Connection pooling, credential management, and query caching are all handled behind st.connection. The simple case is genuinely simple, and the complex case is still possible.

These four together are why "build a data app on this stack" stops being a project and becomes an afternoon.

Streamlit's architectural honesty

The other thing worth appreciating is the way Streamlit itself is built.

Most web frameworks try to look like web frameworks. There is a server process running in the background. You define routes, controllers, state. When the code changes, the server has to reload, which takes a few seconds and breaks any in-progress sessions.

Streamlit does something almost insultingly simple: it re-runs the entire Python script, top to bottom, every time something changes. Save a file. Click a button. Slide a slider. The script runs again like you typed python app.py at the terminal. Browser state? A WebSocket connection carries the diffs. The server does not restart. There is no reload step. There is no controller layer. It is just a Python script being executed in a loop.

This sounds wasteful until you use it. The hot reload is instant because there is no server process to restart — there is only a script to re-execute. The WebSocket pipe pushes UI diffs to the browser without you ever touching fetch or setState. You save the file in your editor and the screen updates before your finger leaves the keyboard.

The cost is that you have to learn st.session_state for anything that needs to persist across reruns, and @st.cache_data / @st.cache_resource for anything expensive. But those are two concepts. That is the entire mental model. Compared to React's lifecycle methods or FastAPI's dependency injection, this is a rounding error.

A live showcase: Streamlit AI Assistant

If you want to see all of this stitched together in one place, Streamlit's own team runs a small, underrated demo at demo-ai-assistant.streamlit.app. It is a chatbot that answers questions about Streamlit and Snowflake by retrieving from their official documentation. Free, no signup, works on mobile.

What makes it worth a visit is not the chat interface — it is what the demo is, structurally. The retrieval layer is Cortex Search over the documentation corpus. The frontend is Streamlit. The hosting is Community Cloud. Every layer this article has talked about so far is sitting in that one URL, in production, serving real traffic. It is the cleanest end-to-end showcase of the ecosystem I have found.

It is also a useful tool in its own right. Ask it a specific Streamlit API question — caching behavior, secrets management, deployment limits — and you get accurate answers with source links into the docs. For day-to-day Streamlit work it is genuinely faster than searching the docs by hand.

The one thing worth noticing as a developer: the answers stay tightly inside the documentation. Ask it to compare Snowflake to a competitor, or to weigh costs against an alternative architecture, and it will politely organize what the docs say and stop there. That is not a limitation, exactly — it is the correct behavior for a vendor-run documentation RAG. The same property that makes it trustworthy on API details also makes it unsuitable for architectural debates. Worth knowing when you use it.

💡 Column — Streamlit Community Cloud as a speed multiplier

If you are prototyping anything in Python and you have not tried this loop yet, do it once. The workflow is genuinely this short:

Write your Streamlit app locally.

Push to GitHub (public or private — both work).

Connect the repo to Community Cloud. You get a public URL in about thirty seconds.

Paste the URL into Slack. Your stakeholders are already using it.

The part that surprises people the first time is how seamless the sharing half is, not just the deploy half. The recipient does not install anything. They do not sign up for an account. They do not need to be on your VPN. They click the link and the app is in their browser — on a laptop, on a phone, on a tablet on a factory floor. There is no "let me schedule a demo" step. Concept and audience meet at the URL.

From that point on, git push is your deploy command. No Docker, no Cloud Run config, no Vercel project, no CI step. Edit locally, push, and the same URL serves the new version within seconds. Everyone who has the link is now looking at the latest build — the "which version are you on?" problem just stops existing. Feedback comes back in minutes, you push a fix, they refresh. The loop is so tight that prototypes start to feel like conversations.

Apps go to sleep after a while of no traffic and wake in a few seconds on the next request, which is fine for internal tools and demos and almost everything that is not customer-facing production. The resource limits are tight (a small slice of CPU and about 1 GB of memory per app), so cache aggressively (@st.cache_data(ttl=3600) for I/O, @st.cache_resource for models and DB connections) and you are usually within budget. Secrets go in the app's settings panel, not in the repo.

The reason this is free is the same reason the whole stack works: Snowflake is running these apps on idle compute they already own. Use it. It is the fastest "idea to public URL to feedback loop" in Python right now.

Where solo builders still win

So if Snowflake plus Streamlit is this good, why am I not building everything on it?

Because Snowflake is a high-end car. You pay to skip the assembly. For companies that cannot or will not assemble their own stack, that is a great trade. For solo builders and small teams who already know how to put the pieces together, the same architecture can be replicated for nearly zero variable cost.

Here is the stack I actually use for personal RAG projects:

SQLite with FTS5 for full-text search, using BM25 ranking and the trigram tokenizer. Hundreds of millions of rows on a single file, sub-millisecond queries, zero servers.
sqlite-vec for vector search in the same database. The same file now does keyword and semantic retrieval.
Hybrid retrieval with Reciprocal Rank Fusion. Run FTS5 and vector search in parallel, combine the rankings with score = Σ 1/(k + rank_i) (k around 60), and you get most of the accuracy of a commercial reranker for the cost of a tiny SQL view.
Cloudflare Tunnel for exposing the local server to the internet without opening ports or buying a static IP.
uv for environment management. The old "set up a venv, activate it, pip install" dance is gone. uv run app.py creates a disposable environment in milliseconds and tears it down when you are done. Astral's tools just got acquired by OpenAI in March 2026, but the MIT license means the worst case is a community fork — not a tool disappearing.
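The Reciprocal Rank Fusion step from the list above fits in a few lines. This is a minimal pure-Python sketch of the formula score = Σ 1/(k + rank_i); the function name and the document IDs are made up for illustration, and in practice the two input lists would come from the FTS5 and sqlite-vec queries.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs into one list of (doc_id, score)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from full-text search
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from vector search

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Documents that appear in both lists rise to the top even when neither retriever ranked them first, and because only ranks are used, the BM25 and vector scores never have to be put on a common scale.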

This stack costs me nothing per month. It runs on a laptop or a small server. The data never leaves my hardware. The latency on retrieval is lower than any cloud RAG I have benchmarked, because there is no network hop at all.

The trade is real engineering effort. You have to know how FTS5 tokenizers work. You have to understand why WAL mode matters for concurrent reads. You have to debug your own embedding pipeline. Snowflake hides all of that. I do not want it hidden.

Two roads, both right

Snowflake's strategy is sound. Streamlit's design is honest. Cortex Search is a real product, not a marketing demo. If you are inside an enterprise where data governance is non-negotiable and engineering hours are the scarce resource, the answer is not even close — you ship on this stack and move on.

But if you are a solo builder, or a small team that enjoys assembling pieces, the same problem space — fast UIs, searchable text, semantic retrieval, public deploys — is solvable with uv, SQLite, sqlite-vec, Streamlit Community Cloud, and a Cloudflare tunnel. The total cost is your time and a domain name.

The factory whiteboard I shipped at 3 AM runs on the second stack. It will probably never need the first. But I am glad both exist, and I am glad one of them is paying for the other to be free.

Instruction systems capability ladder: harness leveling

Gábor Mészáros — Tue, 19 May 2026 06:53:07 +0000

submission for the Hermes Agent Challenge

A few months ago I drew a maturity ladder for CLAUDE.md files — does the file exist, are constraints explicit, do skills load on demand. Useful for self-locating, and the ladder generalizes past Claude — CLAUDE.md, AGENTS.md, .cursorrules, Copilot instructions all live on the same rungs.

After a lot (lot lot lot) more time spent with these setups, I now think the ladder rests on a different, broader axis than the one I first drew it on: the channel each rung runs on.

The new ladder

Level	Name	What's added	Channel
L0	System	System prompt only	attention
L1	Primer	One instruction file (`CLAUDE.md`, `AGENTS.md`, `.cursorrules`)	attention
L2	Composite	Multiple files — user defaults, project overrides	attention
L3	Scoped	Path-scoped rules (`.claude/rules/*.md`)	attention
L4	Delegated	Skills — procedures invoked on demand	attention
L5	Abstracted	Sub-agents — child contexts called by the parent	attention (interface)
L6	Governed	Hooks, MCP gates, deny-permissions	enforcement
L7	Adaptive	Self-improving skills written by the agent	self-writing

Two cuts split the ladder: one between L5 and L6 (soft to hard), one between L6 and L7 (read to write). I'll use attention and soft channel interchangeably from here, and the same for enforcement / hard channel.

A quick tour

L0 (System) is the cold start: the model with whatever the vendor injected, nothing else.

L1 (Primer) is your single root file — the entry every model sees first.

L2 (Composite) is the moment you split user-level config from project-level: ~/.claude/CLAUDE.md vs ./CLAUDE.md, or your global Cursor settings vs a project .cursorrules.

L3 (Scoped) introduces path scoping — the rule about Python tests only loads when the agent touches tests/*.py.

L4 (Delegated) is skills, which let you ship procedures the agent can pull on demand instead of dumping every workflow into the root file.

L5 (Abstracted) is sub-agents — child processes with their own context, called by the parent for a focused subtask. The child's reasoning runs in its own context window, separate from the parent's. What flows back is the result, which re-enters the parent as a new source. The parent–child interface is on the soft channel; the child's internal work runs on its own soft channel, not the parent's.

L0 through L4 share one context — they all compete for the same finite slot against the user's prompt and the recent diff. L5 spawns a second context but couples to the parent on attention. Together that's the soft channel — attention dynamics, by another name.

L6 (Governed) is where the ladder changes in kind. Hooks are not in the model's context window. A PreToolUse hook that blocks git push on a non-zero pytest exit doesn't get downweighted by a long task. An MCP server that requires authentication before reading a file doesn't depend on the model remembering your auth rule. Deny-permissions in .claude/settings.json for .env and .pem files don't compete with the rest of the spec. L6 is enforcement: outside the context dynamics, deterministic, not subject to load or context rot.

L7 (Adaptive) is different again. The agent writes its own instructions — not because the user said "remember this," but because the agent finished a task and decided some part of the trajectory was worth saving for next time. At read time the artifact lands in the attention channel like anything else. What's different is the writer: the model wrote the file, the trigger was task completion, and the user never saw the prompt that produced it. L7 is self-writing.

That's the ladder.

Two cuts under the ladder: attention channel covers L0–L5, enforcement is L6 alone, self-writing is L7 alone. Cut 1 at L5/L6 marked soft to hard. Cut 2 at L6/L7 marked read to write.

The first cut: L5 / L6

The load-bearing observation in this reframe is the cut between L5 and L6.

L0 through L5 all run on the soft channel — either directly on the parent's field (L0–L4) or on a child's that couples back to the parent through prompts and results (L5). They compete. They decay with load. The model can downweight any of them, lose track of any of them, prioritize the user's prompt over any of them. You can tell a sub-agent "always check tests before reporting done" and it'll do it 80% of the time, or 95%, or 60% — you don't know without measurement. The same instruction in a CLAUDE.md and the same instruction passed to a sub-agent are running on the same physics, just on different fields.

L6 is outside that physics entirely.

Generic example. Suppose your CLAUDE.md says "never push without running tests." That's L1. The model reads it, integrates it into context, weights it against everything else loaded — your other rules, the recent diff, the user's prompt. If you have four thousand tokens of instructions and the model is mid-task, that line is competing with everything else for attention. Sometimes it follows. Sometimes it doesn't.

Now suppose you have a PreToolUse hook on Bash that exits non-zero if pytest fails. That's L6. The model can decide to push or not push. It doesn't matter. The push fails before the model's intent reaches the network.
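A sketch of the decision such a hook makes, in Python. It assumes the hook is handed the pending tool call as JSON with tool_name and tool_input.command fields, and that exit code 2 means "block this call". Those details are assumptions modeled on Claude Code's hook interface, so check the documentation for your agent. The function names are hypothetical.

```python
import subprocess
import sys

ALLOW = 0
BLOCK = 2  # assumed "block this tool call" exit code; verify for your agent


def tests_pass() -> bool:
    """Run the test suite; True only if pytest exits with status 0."""
    return subprocess.run([sys.executable, "-m", "pytest", "-q"]).returncode == 0


def decide(payload: dict, run_tests=tests_pass) -> int:
    """Return the exit code the hook should use for one tool-call payload."""
    command = payload.get("tool_input", {}).get("command", "")
    if payload.get("tool_name") != "Bash" or "git push" not in command:
        return ALLOW  # not a push: stay out of the way
    return ALLOW if run_tests() else BLOCK


# Example payloads, with the test run stubbed out so the sketch is self-contained.
push = {"tool_name": "Bash", "tool_input": {"command": "git push origin main"}}
status = {"tool_name": "Bash", "tool_input": {"command": "git status"}}

push_when_red = decide(push, run_tests=lambda: False)
push_when_green = decide(push, run_tests=lambda: True)
status_when_red = decide(status, run_tests=lambda: False)
```

A real hook script would read the payload with json.load(sys.stdin) and finish with sys.exit(decide(payload)). The substring match on "git push" is deliberately naive; a tighter matcher is exactly the kind of fix a hard constraint wants.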

Same constraint, two channels, two failure modes. Soft channel fails probabilistically. Hard channel fails deterministically. They take different fixes — soft constraints want better content and ordering (the Pink Elephant piece is about that fight), hard constraints want a better hook script, a tighter PreToolUse matcher, or a stricter permission rule.

Calling these the same thing because they're both "in your .claude/ directory" hides the architectural difference.

The second cut: L7 writes itself

L7 (Adaptive) isn't exactly a third channel: at read time, what L7 wrote lands in the same context as everything else. The cut is at write time. The agent writes its own instructions.

Most "memory" features in shipping agents aren't L7 by this cut. Claude Code's saved memory writes when the user signals "remember this" or accepts a prompt to save. Cursor's notepads, Copilot's pinned context, Gemini's saved facts: same pattern. The agent keeps the artifact, but the user authored it. That's persistent context, not self-writing. Call it L6.5 if you want a name for it.

The clearest L7 in print today is Hermes Agent, released by Nous Research. The mechanism is documented: when the agent identifies a saveable trajectory — after a successful task with five or more tool calls, after recovering from errors and finding a working path, after the user corrects its approach, or after discovering a non-trivial workflow — it invokes its skill_manage tool to extract a SKILL.md (markdown with YAML frontmatter) into ~/.hermes/skills/. Future sessions load the skill automatically and it becomes available as a slash command. The user didn't ask for it. The agent decided the trajectory was worth saving.

Three of the four triggers are what make this clearly L7 and not L6.5. Error recovery, user correction, novel-workflow discovery — these are cases where only the agent knows the saveable moment happened. A user-driven memory feature can capture "this task was useful enough to want it remembered" by asking the user after the fact. It can't capture "I tried three approaches and the third worked" unless the agent volunteers it. The artifact format matters too: an auto-extracted SKILL.md lands in ~/.hermes/skills/ in the same format human-written skills use. Next session, the agent loads it and can't tell who the author was. That symmetry is what makes the loop close — every successful trajectory can shape the next one.
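The four triggers can be written down as a plain predicate to make the distinction concrete. This is an illustration only: in Hermes the agent makes these judgments itself, so the boolean fields below are stand-ins for model decisions, and none of the names come from Hermes.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical summary of one finished task; not a Hermes data structure."""
    succeeded: bool
    tool_calls: int
    recovered_from_error: bool = False
    user_corrected_approach: bool = False
    novel_workflow: bool = False


def save_reasons(t: Trajectory) -> list:
    """Return which of the four documented triggers this trajectory hits."""
    reasons = []
    if t.succeeded and t.tool_calls >= 5:
        reasons.append("long successful task")
    if t.recovered_from_error:
        reasons.append("error recovery")
    if t.user_corrected_approach:
        reasons.append("user correction")
    if t.novel_workflow:
        reasons.append("novel workflow")
    return reasons


quick_fix = Trajectory(succeeded=True, tool_calls=2)
debug_session = Trajectory(succeeded=True, tool_calls=7, recovered_from_error=True)

quick_fix_reasons = save_reasons(quick_fix)
debug_session_reasons = save_reasons(debug_session)
```

Only the first condition is observable from outside the agent. The other three depend on information the agent holds, which is why a user-driven memory feature cannot reproduce them.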

Concretely, here's what an auto-extracted skill might look like — illustrative, in the shape Hermes's documented SKILL.md schema specifies, fitting the second trigger (the agent worked through a pytest debugging session, found the working path, and saved the lesson):

---
name: debug-pytest-import-errors
description: When pytest reports ModuleNotFoundError despite a successful editable install, check src-layout configuration before chasing PYTHONPATH.
version: 1.0.0
platforms: [macos, linux]
metadata:
  hermes:
    tags: [python, testing]
    category: dev-workflow
---
# Debug pytest ImportError on src-layout projects

## When to Use
pytest fails with `ModuleNotFoundError` after a fresh clone, even though `pip install -e .` ran and the import works in a Python REPL.

## Procedure
1. Check `pyproject.toml` for `where = ["src"]` under `[tool.setuptools.packages.find]` (assuming a setuptools build backend).
2. Confirm `pythonpath = ["src"]` is set in `[tool.pytest.ini_options]`.
3. Re-run `pip install -e .`; confirm `.egg-info` lands at the package root, not inside `src/`.

## Pitfalls
- `PYTHONPATH=src` as an env var works locally but doesn't survive CI.

## Verification
`uv run pytest` runs without `ModuleNotFoundError`.

The frontmatter is functional — tags and category route the skill in Hermes's index; platforms gates it by OS. The body's When to Use / Procedure / Pitfalls / Verification is the schema's recommended shape. Notice what the agent saved: not the original failing command, not the dead-ends, just the working path plus the trap that would have lured a next session into chasing PYTHONPATH. That's curation, not transcription.

This is why L7 is safe to leave unsupervised in Hermes and risky almost everywhere else. The SKILL.md schema enforces the moves a well-coupled instruction needs: imperative voice, directive ordering, named constructs, the warning placed after the working path rather than before it. A free-form memory feature has no such structural prior; the agent writes whatever feels worth saving, and the writes degrade as the agent's writing discipline does.

Schema is the cheap version of supervision.
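Here is roughly what that cheap supervision could look like as a check. The validator below is a hypothetical sketch, not part of Hermes: it treats the recommended sections as required, assumes name and description are the minimum frontmatter, and flags a warning section placed before the working path.

```python
import re

REQUIRED_FRONTMATTER = ["name", "description"]  # assumed minimum, for illustration
REQUIRED_SECTIONS = ["When to Use", "Procedure", "Pitfalls", "Verification"]


def check_skill(text: str) -> list:
    """Return a list of problems found in a SKILL.md-style document."""
    match = re.match(r"---\n(.*?)\n---\n(.*)", text, flags=re.DOTALL)
    if match is None:
        return ["missing YAML frontmatter"]
    frontmatter, body = match.groups()

    problems = []
    # Top-level keys only: indented lines belong to nested mappings.
    keys = {
        line.split(":", 1)[0]
        for line in frontmatter.splitlines()
        if ":" in line and not line.startswith(" ")
    }
    for key in REQUIRED_FRONTMATTER:
        if key not in keys:
            problems.append(f"frontmatter lacks '{key}'")

    headings = re.findall(r"^## (.+)$", body, flags=re.MULTILINE)
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section '{section}'")
    if "Procedure" in headings and "Pitfalls" in headings:
        if headings.index("Pitfalls") < headings.index("Procedure"):
            problems.append("'Pitfalls' comes before 'Procedure'")
    return problems


good = (
    "---\nname: demo\ndescription: example skill\n---\n# Demo\n\n"
    "## When to Use\nx\n## Procedure\n1. y\n## Pitfalls\n- z\n## Verification\nok\n"
)
bad = "---\nname: demo\n---\n## Pitfalls\n- z\n## Procedure\n1. y\n"

good_problems = check_skill(good)
bad_problems = check_skill(bad)
```

A check like this says nothing about whether the saved lesson is correct. It only guarantees the shape, which is the part a schema can enforce without a human in the loop.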

The new failure mode is the self-writing layer running unsupervised. An auto-extracted skill that overfits to one project. A trajectory summary written under a stale assumption that surfaces six weeks later as a phantom instruction. There's no rule file the user authored to grep for the source — the rule is in a markdown file the agent wrote and the user never read, sitting in ~/.hermes/skills/ or its equivalent.

L7 doesn't replace L0–L6. It runs alongside, with its own writes and its own decay. Most agent setups don't have it because most agents don't expose it. The ones that ship a memory feature mostly do L6.5 and call it L7.

When to climb

The dominant pattern I see in real repos is L1 with a thin L6: a CLAUDE.md, maybe a few rule files at L3, deny-permissions for .env. L4 (skills) is rare — most authors haven't built any. L5 (sub-agents) is rarer — most use cases haven't surfaced. L7 is mostly absent — most agents don't expose a self-writing surface, and the few setups that do have one running treat it as opt-in defaults nobody reviewed.

Across 28,721 public repositories with AI configs, 89.9% don't name specific constructs in their instructions — no backticks, no file paths, no function names. That's most of the soft channel running at low coupling: easily downweighted, easily lost. The hard channel is thinner. The adaptive channel is mostly absent.
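For a feel of what "naming a specific construct" means, here is one rough heuristic. It is a hypothetical sketch and not the methodology behind the 89.9% figure: three regular expressions that look for backtick spans, file paths, and function calls in a single instruction.

```python
import re

# Rough, hypothetical patterns; real instructions will need better ones.
PATTERNS = {
    "backticks": re.compile(r"`[^`]+`"),
    "file path": re.compile(r"(?:[\w.-]+/)+[\w-]+\.\w+"),
    "function call": re.compile(r"\b[A-Za-z_]\w*\(\)"),
}


def named_constructs(instruction: str) -> list:
    """Return which kinds of specific construct an instruction mentions."""
    return [kind for kind, pattern in PATTERNS.items() if pattern.search(instruction)]


vague = "Always write clean, well-tested code."
specific = "Run `pytest` before editing src/app/main.py, and never call reset_db() in tests."

vague_hits = named_constructs(vague)
specific_hits = named_constructs(specific)
```

The first instruction gives the model nothing to anchor on; the second names a command, a file, and a function.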

Large spec, small contract, no adaptive layer. That's the asymmetry — but it's not always a bug. Each rung exists because the rung below it fails in a specific way. The trigger is the failure, not a feature wishlist.

From	To	Symptom that triggers the climb
L0	L1	Re-explaining the same project context every session
L1	L2	One file got long enough that important rules get ignored
L2	L3	Path-irrelevant rules pollute every task
L3	L4	The same procedure gets described inline across multiple rules
L4	L5	A procedure pollutes the parent's context with reasoning chains the parent doesn't need
L5	L6	A constraint must hold 100% of the time, not 95%
L6	L7	You keep correcting the same preference across sessions
L6	L7	You keep watching the agent re-derive the same workaround

The mistake is climbing without the symptom. A repo with three rules in one file doesn't need L3. A solo developer's CLAUDE.md doesn't need a sub-agent. Premature climbs cost context budget for no return; you've added structure the model has to navigate without solving a problem you actually had.

The opposite mistake is more common: under-building the higher rungs because the symptoms feel like model failures rather than rung failures. "The agent didn't run tests before pushing" reads like a prompt-engineering problem; it's a missing L6. "The agent forgot we use Cloudflare Workers" reads like context drift; it's a missing L7. "The agent keeps describing the deploy process every time I ask" reads like verbosity; it's a missing L4.

Climb when the rung below stops working.

Three questions for your repo

Not a recipe. A diagnostic. For any rule in your setup, ask:

Does this fail loudly when violated, or silently? Loudly is L6. Silently is L0–L5.
Does the model see this, or does the runtime enforce it? Sees is the soft channel. Runtime is the hard channel.
Does it get worse when you add unrelated rules to the same file? Yes is L0–L5. No is L6. "Sometimes" is probably L7.

Most rules answer silently / sees / yes. That tells you which channel you're in. The interesting question is whether anything in your setup is on the other channels at all.

A note on related taxonomies

There are other progressive ladders for AI agent setups in print. Vellum's L0–L5 is an autonomy axis — how much the agent decides on its own. Blake Crosley's 4-tier is a concurrent-decomposition axis — how many agents run in parallel. Anthropic's 5-layer ADK frame for Claude Code is a content-boundary axis — what kind of content goes where. Zylon's 5-architectural and GitHub's 3-tier carve different cuts again, mostly around how the agent is wired into a product surface.

The ladder above is on a different axis from any of those. It sorts by the channel each mechanism runs on — soft attention, hard enforcement, self-writing memory — and progresses through the named constructs an agent exposes (CLAUDE.md, scoped rule files, skills, sub-agents, hooks, auto-memory). The two cuts (L5/L6 and L6/L7) are the load-bearing claim; the autonomy and concurrency taxonomies don't draw those cuts because they're sorting on different things.

Different axis, different cuts, different diagnostic. Use whichever maps onto the question you're actually asking.

Terminal output: ails check on a .claude directory. LADDER reads 8 rungs across 3 channels. SETUP reads L1 + L3 + L6 (Primer, Scoped, Governed). Channel/Levels/Count table shows attention L0–L5 count 5, enforcement L6 count 1, self-writing L7 count 1. Cuts named at L5/L6 soft→hard and L6/L7 read→write.

*Previously: CLAUDE.md Best Practices: From Basic to Adaptive, where I drew the ladder the first way. See The State of AI Instruction Quality for additional data.

I'm building Reporails, measurement for the attention channel. npx @reporails/cli check runs locally, no account needed.*

Armorer v0.1.19: building the local ops layer for AI agents

Armorer Labs — Tue, 19 May 2026 06:52:44 +0000

Armorer v0.1.19

We have been building Armorer as an experimental local control plane for AI agents.

Getting one agent demo working is usually not the hard part. The harder part is everything right after that: provider configuration drift, Docker or Colima state, partial installs, failed runs, and figuring out what actually changed between attempts.

So Armorer is not another agent framework. It is our attempt at a local ops layer for agents: install them, configure them, run them, supervise jobs, and recover when setup or runtime goes sideways.

What changed in v0.1.19

supervised setup flows instead of silent magic
live workstream visibility during install and runtime
clearer local state around jobs, providers, and failures
local management for NanoClaw and OpenClaw style workflows

Repo: https://github.com/ArmorerLabs/Armorer

It is still experimental, so we care a lot more about honest feedback from people already running local or self-hosted agent workflows than about pretending the product is finished.

I built ChatMandu - a WhatsApp-focused web app from Nepal

Dev — Tue, 19 May 2026 06:52:15 +0000

Hi everyone

I’m a software developer from Nepal with 2+ years of experience in Laravel and Node.js.

Recently, I built a web app called ChatMandu.

ChatMandu is a lightweight web platform built around improving how users interact with chat and communication workflows. The goal was to keep it simple, fast, and practical - not overloaded with unnecessary features.

I built this project while experimenting with real-world product development, focusing on:

clean UX
performance
solving communication-related use cases in a simple way

This is still an early version, and I’m actively improving it based on real user feedback.

I need feedback from you

If you try ChatMandu, I’d really appreciate honest feedback on:

What feels confusing
What feels useful
What should be improved or removed
Would you actually use this

Try it here

https://chatmandu.tech

Thanks for checking it out

Any feedback, even critical, is very welcome.

If anyone is interested in partnership, you can reach me at:
dev20581114@gmail.com

GitHub Copilot CLI as a PR-triage co-pilot: how I keep up with 40+ upstream orgs

Mukunda Rao Katta — Tue, 19 May 2026 06:51:23 +0000

Drafted for the GitHub Copilot Challenge (opens May 21). Will add the official devchallenge tag once the challenge announcement is live.

For the last 18 months I have been running a small one-person open-source program: meaningful PRs across Anthropic, OpenAI, Google, Microsoft, NVIDIA, AWS repos, plus 20-something smaller projects in the MCP and LLM tooling space. The math gets bad fast. You cannot keep 40 repos warm in your head; the cost of context-switching is what kills throughput, not the typing.

GitHub Copilot CLI is the one tool that has actually moved that number for me. Not for writing code: I write most of the code by hand. For navigating code I have never seen before in repos I have just cloned. Below is the workflow that survived two iterations and the prompts I keep coming back to.

The triage loop

When a triage candidate comes in (an issue I tagged earlier, a thread I bookmarked, a TODO I left in a fork), I run roughly this sequence:

# 1. Fast skim: what is this repo, where is the meat?
gh copilot suggest "explain the architecture of this repo from the top-level dirs"

# 2. Locate the file the issue is about, without grepping for an hour
gh copilot suggest "where is the streaming response handler in this repo?"

# 3. Once the file is in front of me, ask copilot to make sense of the
# function I am staring at, not in general but specifically
gh copilot explain "this function" -- src/streaming/handler.py:412-510

# 4. Stage a small, surgical patch and have copilot sanity-check it
gh copilot suggest "review this diff for correctness and side effects"

Three prompts and a diff review is what 80% of my PRs look like in practice. The remaining 20% are the ones where Copilot is wrong (or I am) and I have to slow down. Those are the PRs that ship the most value.

What Copilot CLI is genuinely good at

Mapping a repo I have never read. I ask "what does this repo do" and get a 6-line summary that is correct often enough to be load-bearing. Saves the 20 minutes of skimming I used to do.

Pointing at the right file by description. "Where is the rate limiter implemented?" gets me a path in seconds. The path is right 9 times out of 10. The one time it is wrong, the wrong path is at least adjacent, and that adjacency is itself a clue.

Translating between languages I do not have in working memory. I ship in Python, TypeScript, and Rust regularly. I can write all three fluently but I context-switch slowly. gh copilot suggest "what is the TypeScript equivalent of this Rust pattern" lets me carry an idea between languages without re-reading the semantics of the ? operator for the seventh time.

Generating the boring 80% of a CI workflow. GitHub Actions YAML is one of the worst per-keystroke languages I know. Copilot CLI gives me a YAML that is right enough to commit and tweak. The first version is rarely the final version, but it is closer than mine would have been from a blank file.

What it is not good at

Anything that needs to reason about cross-file state. Copilot CLI sees one snippet at a time. If your refactor touches three files and the question is "what breaks downstream," ask a human or a tool with broader context.

Telling you which of two patches is better. I asked Copilot to evaluate two patches I had written against the same issue. It picked the worse one, because the worse one looked tidier in the diff. Aesthetic correctness, not behavioral correctness. Copilot is great for shape, bad for taste.

Replacing your understanding of the codebase. This is the trap. The first month I used Copilot CLI for triage, I shipped a PR that touched a part of the codebase I had not actually read. The review caught it. I have not made that mistake since, and I will not get away with it again. Use Copilot to find the code; do not use Copilot to avoid reading the code.

Concrete win: a 47-second triage

The fastest triage I have had was an open issue on a popular Python MCP SDK. Repo new to me. Issue: a streaming handler dropped final tokens occasionally.

gh repo clone foo/bar && cd bar
gh copilot suggest "where is the streaming response chunked"
# -> src/forem/streaming.py
gh copilot explain "the early-return condition in chunk_iter()" -- src/forem/streaming.py:204-244
# -> "Returns when chunk size is 0, but the producer also emits empty
#     keepalive chunks; the early return ends the stream prematurely."
# Fix:
sed -i 's/if not chunk:/if chunk is None:/' src/forem/streaming.py
gh copilot suggest "write a regression test for keepalive empty-chunk handling"
# -> generates a test that I keep and edit
git checkout -b fix/keepalive-chunks
git commit -am "Don't end stream on empty keepalive chunks"
gh pr create

The PR took 47 seconds to draft. The review took two days. The fix was right.
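The bug class is worth seeing in isolation. Below is a JavaScript analogue of the Python change (a falsy check swapped for an explicit null check). The generator names and the sample stream are hypothetical stand-ins, not code from the SDK in question.

```javascript
// Buggy version: '' is falsy, so an empty keepalive chunk ends the stream early.
function* chunkIterBuggy(chunks) {
  for (const chunk of chunks) {
    if (!chunk) return;
    yield chunk;
  }
}

// Fixed version: only the explicit end marker (null) stops the stream.
function* chunkIterFixed(chunks) {
  for (const chunk of chunks) {
    if (chunk === null) return;
    yield chunk;
  }
}

// A stream with one empty keepalive in the middle and a null end marker.
const stream = ['Hel', '', 'lo', null];
console.log([...chunkIterBuggy(stream)].join('')); // Hel
console.log([...chunkIterFixed(stream)].join('')); // Hello
```

A regression test for this kind of fix only needs the keepalive case: it fails against the buggy iterator and passes against the fixed one.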

This is the workflow that the CLI unlocks. Not "write my code for me." It is "tell me where to look so I can spend my brain on the thing only I can do."

Three habits that took me three months to learn

Always confirm the path before reading. gh copilot suggest "where is X" is fast and confident, but it can be wrong. Type the path it suggests into your editor and check the file actually contains what you expect. Two-second sanity check.
Quote real code into the prompt. "Explain this function" is mediocre. "Explain the early-return at line 204" is targeted. The narrower the prompt, the more useful the answer. Copy-paste the line of code into the prompt; do not summarize it.
Treat the first answer as a hypothesis. Copilot will hand you something confident-sounding. The right move is to verify, not to trust. The fastest verifier is the test that should fail before your fix and pass after.
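The first two habits can be combined into one check: take the exact line you plan to quote and confirm it really appears in the file Copilot pointed at. A minimal sketch, with a hypothetical helper name and a made-up sample file:

```javascript
// Hypothetical helper: returns the 1-based line numbers where `snippet` appears.
// An empty result means the suggested file does not contain what you expected.
function findSnippet(sourceText, snippet) {
  const needle = snippet.trim();
  if (needle === '') return [];
  return sourceText
    .split('\n')
    .map((line, index) => (line.includes(needle) ? index + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}

// Made-up file contents for illustration.
const source = [
  'def chunk_iter(stream):',
  '    for chunk in stream:',
  '        if not chunk:',
  '            return',
  '        yield chunk',
].join('\n');

console.log(findSnippet(source, 'if not chunk:'));    // found on line 3
console.log(findSnippet(source, 'class RateLimiter')); // no match: wrong file
```

In practice you would feed it the file contents, for example from fs.readFileSync(path, 'utf8'), before opening the file in your editor.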

What I still want

A "show me the three places this function is called" command. I know I can gh copilot suggest for it, but a first-class command for cross-file context is the gap between "useful triage tool" and "real refactor partner." If GitHub ships that, I will retire grep -R for half my workflow.

If you are doing OSS contributions across many repos and have not tried gh copilot suggest for repo-mapping, install it once and run it once. The gh CLI itself is one apt install away on Ubuntu or one brew install on Mac; the Copilot extension is one more command on top.

gh extension install github/gh-copilot
gh copilot suggest "what is this repo about"

If it sticks, the rest of this post is the playbook.

Happy triaging.

gemma4-safe-agent: a tool-using research agent on Gemma 4 e2b

Mukunda Rao Katta — Tue, 19 May 2026 06:50:55 +0000

Submission for the Gemma 4 DEV Challenge, Build track. Companion to my Write-track post on the five libs behind it.

What it is

A tool-using research agent that runs locally on Gemma 4 e2b via Ollama, in around 200 lines of Node.

You give it a question. It picks between two tools, reads a Wikipedia page, then returns a structured JSON answer with sources. No API key. No rate limit. Two GB of RAM and an Ollama instance is the whole stack.

ollama pull gemma4:e2b
git clone https://github.com/MukundaKatta/gemma4-safe-agent
cd gemma4-safe-agent && npm install
npm run demo -- "What is RLHF?"

{
  "final": "RLHF is a technique that uses human preferences as a reward signal to fine-tune language models.",
  "sources": ["https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback"],
  "steps": 2
}

Repo: github.com/MukundaKatta/gemma4-safe-agent

Why Gemma 4 e2b specifically

Gemma 4 ships in four sizes: e2b and e4b for edge and mobile, a 26B Mixture-of-Experts model, and a 31B dense model for servers. I picked e2b on purpose.

Reasons:

Runs anywhere. Two GB of RAM, no network, no key. The agent works on a CI runner, a Raspberry Pi, an old MacBook. The bigger sizes do not.
Hardest reliability case. A 2B-class model makes more parse mistakes and more arg mistakes than a 26B. If the scaffolding holds at the 2B level, the bigger ones are a drop-in via GEMMA_MODEL=gemma4:e4b.
Real product surface. Cheap, fast, local agents are where on-device AI is going. e2b is the right target for the kind of agent you'd actually ship in a desktop app, a mobile shell, or a browser extension.

The same agent runs against any of the four Gemma 4 variants with one env var change.
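For readers who want to see what that one env var touches, here is a minimal sketch of a chat call against a local Ollama server. It uses Ollama's /api/chat endpoint with streaming turned off. How the repo actually reads GEMMA_MODEL is an assumption, and buildChatRequest and this ollamaChat are illustrative, not the repo's code.

```javascript
// Assumption: the model tag comes from GEMMA_MODEL and falls back to e2b.
function buildChatRequest(messages, env = process.env) {
  return {
    model: env.GEMMA_MODEL || 'gemma4:e2b',
    messages,
    stream: false, // ask for one JSON response instead of a stream of chunks
  };
}

// fetchImpl is injectable so the function can be exercised without a server.
async function ollamaChat(messages, { fetchImpl = fetch, host = 'http://localhost:11434' } = {}) {
  const res = await fetchImpl(`${host}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatRequest(messages)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = await res.json();
  return data.message.content; // the assistant's text
}

// Swapping variants changes only the model field of the request.
console.log(buildChatRequest([{ role: 'user', content: 'What is RLHF?' }], { GEMMA_MODEL: 'gemma4:e4b' }));
```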

How it works

The whole agent is a small loop:

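// messages, MAX_STEPS, TOOLS, llm and validate are assumed to be defined earlier
for (let step = 0; step < MAX_STEPS; step++) {
  // Trim the history to the context budget, keeping the system prompt and the last two messages
  const fitted = fit(messages, { maxTokens: 4096, preserveSystem: true, preserveLastN: 2 });
  const raw = await ollamaChat(fitted.messages);
  const action = parseAction(raw);

  if (action.kind === 'tool') {
    // Run the requested tool, then feed its result back as the next user turn
    const result = await TOOLS[action.tool].fn(action.args);
    messages.push({ role: 'assistant', content: raw });
    messages.push({ role: 'user', content: `tool_result: ${result}` });
    continue;
  }

  // Not a tool call: let the cast loop restate the answer as validated JSON
  return cast({ llm, validate, prompt: 'Restate as JSON: ...' });
}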

The whole run is wrapped in an agentguard.firewall block. Each tool is wrapped with agentvet.vet and agentsnap.traceTool. That gives me:

Context budget management so Gemma 4 e2b never blows its small window
Network egress allowlist so a prompt injection cannot redirect the agent to fetch an attacker URL
Tool-arg validation so a hallucinated fetch_url({ url: 12345 }) never runs
Trace snapshots so swapping models or tweaking prompts shows up as a CI diff, not a production surprise
Final-answer JSON enforcement with a validate-and-retry loop, which is the load-bearing piece for getting clean JSON out of a 2B model
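To make the second and third items concrete, here is a minimal sketch of a combined argument check and egress allowlist for a fetch_url tool. It is not the agentvet or agentguard API; the helper name, the allowlist contents, and the https-only rule are all assumptions for illustration.

```javascript
// Hypothetical allowlist: the only host the agent may fetch from.
const ALLOWED_HOSTS = new Set(['en.wikipedia.org']);

// Returns { ok: true, url } or { ok: false, reason } without ever fetching.
function vetFetchUrlArgs(args) {
  if (typeof args?.url !== 'string') {
    return { ok: false, reason: 'url must be a string' };
  }
  let parsed;
  try {
    parsed = new URL(args.url);
  } catch {
    return { ok: false, reason: 'url is not a valid URL' };
  }
  if (parsed.protocol !== 'https:') {
    return { ok: false, reason: 'only https is allowed' };
  }
  if (!ALLOWED_HOSTS.has(parsed.hostname)) {
    return { ok: false, reason: `host ${parsed.hostname} is not on the allowlist` };
  }
  return { ok: true, url: parsed.href };
}

console.log(vetFetchUrlArgs({ url: 12345 }));                               // rejected: wrong type
console.log(vetFetchUrlArgs({ url: 'https://evil.example/steal' }));        // rejected: host
console.log(vetFetchUrlArgs({ url: 'https://en.wikipedia.org/wiki/RLHF' })); // accepted
```

Returning a reason instead of throwing lets the loop hand the rejection back to the model as a tool_result, so it can try again with better arguments.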

I wrote about the scaffolding in detail in the Write-track companion post. Here the focus is the agent and the demo.

What you can run

The repo ships three entry points:

npm run demo -- "...": real run against your local Gemma 4 e2b
npm run demo:mock: same agent, with fetch_url returning canned pages (no internet needed)
AGENT_MOCK=1 node examples/run-stub.js: deterministic stub LLM in place of Gemma 4, so the whole pipeline runs in CI without any model at all

The third one is the one I use for snapshot regression tests. It proves the agent's tool-use behavior is stable even with an LLM swapped out.

What surprised me

Two things.

Gemma 4 e2b picks the right tool more often than I expected. The model is small but the tool-selection task is well-bounded ("you have these two tools, here's the schema, return one JSON"). When the surrounding scaffolding catches arg mistakes and JSON glitches, the model's reasoning is the part that doesn't need help.
The final-answer step is where the model really needs the cast loop. Asking for "JSON only, no prose" still produced Sure here you go: {...} enough of the time that I would not trust the agent without agentcast wrapping that step. With it, the post-condition becomes a guarantee.
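The shape of that validate-and-retry step is small enough to sketch. This is not the agentcast API: extractJson, castJson, the retry message, and the stub model below are all hypothetical, and llm stands for any async function that takes a prompt and returns text.

```javascript
// Pull the outermost {...} out of a chatty reply; null if nothing parses.
function extractJson(text) {
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }
}

// Ask, validate, and re-ask with feedback; throw once attempts run out.
async function castJson({ llm, validate, prompt, maxAttempts = 3 }) {
  let feedback = '';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await llm(prompt + feedback);
    const value = extractJson(raw);
    if (value !== null && validate(value)) return value;
    feedback = '\nYour last reply was not valid. Return one JSON object and nothing else.';
  }
  throw new Error(`no valid JSON after ${maxAttempts} attempts`);
}

// Stub model: first reply is prose only, second wraps the JSON in chatter.
const replies = ['Sure here you go: not json', 'Sure here you go: {"final":"ok","sources":[]}'];
const stubLlm = async () => replies.shift();

castJson({ llm: stubLlm, validate: (v) => typeof v.final === 'string', prompt: 'Restate as JSON: ...' })
  .then((answer) => console.log(answer));
```

The caller either gets an object that passed validate or an exception, never a half-parsed string.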

Try it

Repo: github.com/MukundaKatta/gemma4-safe-agent (MIT)

Issues and PRs welcome. The five scaffolding libs are all on npm under @mukundakatta/* and are zero-dep, so you can pull them into your own Gemma 4 projects one at a time.

If you build something on top of this, drop me a link.

Have fun with Gemma 4.

I Built a 3D Solar System in 300 Lines of React (No Game Engine)

Devanshu Biswas — Tue, 19 May 2026 06:50:30 +0000

Pull up a browser. Drag your mouse. Watch eight planets orbit the Sun, axes tilted, Saturn's rings catching the light.

That's not a game engine. That's not Unity. That's 300 lines of React.

If your mental model of "3D programming" is "scary C++ matrices and a 600-page OpenGL textbook," you're a decade out of date. WebGL has shipped in every browser since 2014. Three.js wraps the boring math. React Three Fiber lets you write the scene as components, the same way you write HTML. The whole pipeline is <mesh>, <sphereGeometry>, <meshStandardMaterial> — three tags, you've made a planet.

Today I'll show you the whole thing.

The core insight: a scene is a tree

Every 3D scene — every Pixar movie, every video game, every product configurator — is the same shape: a tree of objects, where each node has a position, a rotation, a scale, and zero or more children. That's it. The Sun is the root. Earth is a child positioned 9 units to the right. The Moon is a child of Earth, positioned 1 unit further right. Rotate Earth and the Moon comes along for the ride, because it's a child.

In Three.js you build this tree imperatively:

const sun = new THREE.Mesh(geom, mat);
const earth = new THREE.Mesh(geom2, mat2);
earth.position.x = 9;
sun.add(earth);
scene.add(sun);

In React Three Fiber, the tree IS your component tree:

<mesh>           {/* sun */}
  <sphereGeometry />
  <meshBasicMaterial color="yellow" />
  <mesh position={[9, 0, 0]}>    {/* earth, child of sun */}
    <sphereGeometry args={[0.5]} />
    <meshStandardMaterial color="blue" />
  </mesh>
</mesh>

That's the whole conceptual leap. Once you see "the React tree is the Three.js scene graph," the rest is naming things.
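If you want to see why children come along for the ride, here is a minimal scene graph in plain JavaScript with no Three.js. The node shape and the function name are hypothetical, and only rotation around the Y axis is modelled, which is all the orbits need.

```javascript
// World position = parent's world position + the local offset rotated by
// every ancestor's Y rotation. A node's own rotation only affects its children.
function worldPositions(node, parentPos = [0, 0, 0], parentAngle = 0, out = {}) {
  const [x, y, z] = node.position;
  const c = Math.cos(parentAngle);
  const s = Math.sin(parentAngle);
  const pos = [
    parentPos[0] + x * c + z * s,
    parentPos[1] + y,
    parentPos[2] - x * s + z * c,
  ];
  out[node.name] = pos;
  for (const child of node.children ?? []) {
    worldPositions(child, pos, parentAngle + (node.rotationY ?? 0), out);
  }
  return out;
}

// Sun at the root, Earth 9 units out, Moon 1 unit beyond Earth.
const scene = {
  name: 'sun', position: [0, 0, 0], rotationY: Math.PI / 2,
  children: [{
    name: 'earth', position: [9, 0, 0],
    children: [{ name: 'moon', position: [1, 0, 0] }],
  }],
};

// Rotating only the Sun a quarter turn swings Earth and the Moon with it.
console.log(worldPositions(scene));
```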

The trick that makes orbits cheap

Naïve orbit code looks like this:

useFrame((_, dt) => {
  angle += speed * dt;
  earth.position.x = Math.cos(angle) * 9;
  earth.position.z = Math.sin(angle) * 9;
});

That works, but you're doing two trig calls per planet per frame in JavaScript. 60 fps × 8 planets = 960 sin/cos per second in slow JS.

There's a better way. Put the planet inside a pivot group at the origin. Place the planet at (distance, 0, 0). Rotate the group, not the planet.

<group ref={orbitRef}>                    {/* this group spins → orbit */}
  <mesh position={[distance, 0, 0]}>      {/* planet stays put in local space */}
    <sphereGeometry args={[radius]} />
    <meshStandardMaterial color={color} />
  </mesh>
</group>

useFrame((_, dt) => {
  orbitRef.current.rotation.y += speed * dt;   // ONE addition
});

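Now your own code does one addition per planet per frame and zero trig. Three.js is itself JavaScript, not compiled C++: when it updates the scene graph it rebuilds the group's matrix from that rotation once per frame, and the GPU then applies the matrix to every vertex in the vertex shader. The math still happens, but the library owns it, and everything nested inside the group inherits the orbit for free.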
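For the curious, this is roughly the math a Y rotation performs on a planet parked at (distance, 0, 0), written in plain JavaScript rather than Three.js. The function name is hypothetical. Note the sign on z: a positive rotation.y traces the same circle as the naïve code, just in the opposite direction.

```javascript
// Rotation about the Y axis applied to the local point (distance, 0, 0).
function pivotPosition(distance, angle) {
  return { x: distance * Math.cos(angle), z: -distance * Math.sin(angle) };
}

// One second of the pivot loop at 60 fps: a single addition per frame.
let angle = 0;
const speed = 1;   // radians per second
const dt = 1 / 60; // seconds per frame
for (let frame = 0; frame < 60; frame++) {
  angle += speed * dt;
}

const planet = pivotPosition(9, angle);
// The planet never leaves its orbit: the distance from the origin stays 9
// (up to floating-point error).
console.log(planet, Math.hypot(planet.x, planet.z));
```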

Same trick for axial rotation: a child group inside the planet rotates on its own Y axis. Tilt the wrapper group on the X axis and Uranus is suddenly tipped 98° like real Uranus. The whole solar system is six nested groups doing addition.

Lighting: three lines, instantly 3D

If you skip lighting, every planet looks flat — like a coloured paper disc. Add one <pointLight> at the Sun's position and use meshStandardMaterial for the planets:

<pointLight position={[0, 0, 0]} intensity={2.5} distance={120} />

<mesh>
  <sphereGeometry args={[0.5]} />
  <meshStandardMaterial color="blue" roughness={0.7} />
</mesh>

meshStandardMaterial is physically-based — it reads the light, bounces it off the surface based on roughness and metalness, and shades the half facing the light bright while the half facing away goes dark. Three lines. Instant 3D.

Pro tip: don't use meshStandardMaterial for the Sun itself. The Sun emits light, it doesn't receive it. Use meshBasicMaterial, which ignores all lights and shows the colour you set, flat. Otherwise you'll have a yellow sphere with a dark side, which looks wrong.

OrbitControls: 80% of the polish for free

Drei (the R3F helper library) ships an <OrbitControls /> component. Drop it in your <Canvas> and you get:

Drag to rotate the camera around the scene
Scroll to zoom
Pinch to zoom on touch screens
One-finger drag to rotate on touch screens

<OrbitControls enablePan={false} minDistance={6} maxDistance={80} />

One line, and all of "drag to look around" is done. This is the kind of thing that takes a junior developer two weeks in raw WebGL and 30 seconds in R3F. Use the helpers.

HTML overlays beat in-canvas UI

The temptation when you're new to 3D is to put every UI element inside the 3D scene — billboards, sprites, text geometry. Don't. Mount your <Canvas> full-bleed and stack regular HTML on top with position: absolute.

<div className="shell">
  <header className="hero">...</header>
  <Canvas>...</Canvas>
  <aside className="info-panel">...</aside>
</div>

The info panel that slides in when you click a planet is just a styled <aside>. The speed slider is <input type="range">. Your CSS skills transfer 1:1. The 3D part stays focused on 3D.

What I learned actually building this

Real takeaways from an afternoon of this:

1. Three.js is huge but the surface you need is small. The full Three.js bundle is ~600KB. You will use maybe 12 of its 400+ classes. Scene, Mesh, SphereGeometry, MeshStandardMaterial, PointLight, PerspectiveCamera, OrbitControls. That's most of it.

2. Real scale is the enemy. The Sun is 109× the radius of Earth. Neptune orbits ~30× further than Earth. If you use real ratios, the Sun fills the screen and Neptune is a single pixel. Cheat the visuals. Show real numbers in the info panel.

3. useFrame runs every frame, so don't allocate. That callback fires 60 times per second or more, depending on the display. If you call new Vector3() inside it, you're creating garbage on every one of those frames. Either mutate refs you already have, or hoist allocations outside.

4. delta is your friend. R3F's useFrame((_, delta) => ...) gives you the seconds elapsed since the last frame. Multiply your speed by delta and your animation runs at the same pace on a 60Hz laptop and a 144Hz gaming monitor. Without delta, a fixed per-frame step makes your planets orbit 2.4× faster on the 144Hz display.

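5. dpr={[1, 2]} is the mobile performance switch. Some phones report a device pixel ratio of 3, which means shading 9 pixels for every CSS pixel and tanking the FPS. Capping at 2 cuts that to 4, about 2.25× fewer pixels, and looks nearly identical to the eye. How much frame rate you get back depends on how fill-bound the scene is.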
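Points 4 and 5 are both arithmetic, so they are easy to check outside the browser. The sketch below is plain JavaScript with no R3F involved. The function names are hypothetical, and the 390×844 viewport is just an example phone size.

```javascript
// How far an orbit advances over `seconds` at a given refresh rate.
// With delta, the step scales with frame time; without it, the step is
// tuned for 60 fps and simply repeats more often on faster displays.
function simulate(fps, seconds, speed, useDelta) {
  const dt = 1 / fps;
  let angle = 0;
  for (let frame = 0; frame < fps * seconds; frame++) {
    angle += useDelta ? speed * dt : speed / 60;
  }
  return angle;
}

// Pixels the GPU has to shade for a CSS-sized canvas at a given pixel ratio.
function pixelsToShade(cssWidth, cssHeight, dpr) {
  return cssWidth * dpr * cssHeight * dpr;
}

console.log(simulate(60, 1, 1, true), simulate(144, 1, 1, true));   // both about 1 radian
console.log(simulate(60, 1, 1, false), simulate(144, 1, 1, false)); // about 1 vs about 2.4
console.log(pixelsToShade(390, 844, 3) / pixelsToShade(390, 844, 2)); // 2.25
```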

Why this matters

3D in the browser used to be a specialty — game studios, big agencies, NASA visualizations. It's not specialty anymore. Product configurators, real estate walkthroughs, data visualizations, NFT galleries, classroom physics demos — every web product is starting to have a 3D moment.

R3F is the lever that makes 3D approachable for people who already write React. You don't have to learn imperative scene-graph plumbing. You already know how trees of components work — you're doing 3D, you just have a different leaf type.

So go play. Open the live demo, click each planet, scroll out and look at the layout from the side. Then clone the repo and change a number. Make the Sun blue. Add a moon to Earth: nest a small sphere inside the group that holds Earth, position it at (1.2, 0, 0), and watch it follow. That's the entire mental model. You'll be making your own scenes within an hour.

Try it / fork it

🌐 Live: https://threejs-from-zero.vercel.app
🐙 Code: https://github.com/dev48v/threejs-from-zero

This is Day 36 of TechFromZero — a 50-day series where I build one tech from scratch every day with step-by-step commits you can read like a textbook. Yesterday was a voice AI tutor (Web Speech → Gemini → TTS). Tomorrow we're building a multi-agent AI orchestration that has agents argue with each other.

🌐 See all days: https://dev48v.infy.uk/techfromzero.php

Talk to you tomorrow.