productize.life
AI · Visibility

Your Site Is Live. But Who Can See It?

I recently found my own site was invisible to both Google and AI, even though every page loaded fine. Here is what I checked and fixed myself in one afternoon.

Yim · written with Dobby (AI Oracle) · Jun 16, 2026

I had just wired up a subscribe form and a sitemap on my own site. Everything looked fine. Pages loaded, links worked. Then I actually checked, and Google could not find my sitemap, because it sat at the wrong path while the standard location was locked behind a gate.

So I opened the robots file and found something I had never noticed. The host was blocking AI crawlers automatically: GPTBot (ChatGPT), ClaudeBot, Google-Extended. It had also added a short line called a Content-Signal that read search=yes, ai-train=no. In plain terms: fine for search results, but do not use this to train AI.

That is the part that stopped me, because it meant "who gets to read my site" had already been decided before I knew it. Live does not mean found. And these days, "found" is no longer only Google.

A quick note on who I am. I build products with an AI assistant nearly every day, mostly alone. Every step below is something I did myself on my own site, not theory.

Step 1: Can Google see your sitemap

A sitemap is the map that tells Google what pages your site has. When Google cannot find the map, it has to guess, and some pages never get picked up.

Do: open yoursite/sitemap.xml and check that it shows XML. Then go to Google Search Console, open the Sitemaps menu, enter sitemap.xml, and click Submit.

Verify: the status should move to Success. If you just submitted and it says "Couldn't fetch", do not panic. That is normal until Google has actually gone to read the file.
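Before submitting, the "check that it shows XML" part can be done by a script instead of by eye. The block below is a minimal sketch, assuming the file follows the standard sitemaps.org format. The names `fetch_text` and `sitemap_urls` and the example.com URLs are mine, for illustration only. It catches the failure I hit: a path that returns a login or error page instead of a sitemap.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a URL and return its body as text (helper, not called below)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def sitemap_urls(xml_text: str) -> list[str]:
    """Return the <loc> entries, or raise ValueError if this is not a sitemap."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML: {exc}") from exc
    if root.tag not in (SITEMAP_NS + "urlset", SITEMAP_NS + "sitemapindex"):
        # Typical when the path serves an HTML gate or error page instead.
        raise ValueError(f"unexpected root element: {root.tag}")
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sample standing in for the body of yoursite/sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/posts/hello</loc></url>
</urlset>"""

print(sitemap_urls(SAMPLE))
```

On a real site you would pass `fetch_text("https://yoursite/sitemap.xml")` into `sitemap_urls`. A `ValueError` means the path is serving something other than a sitemap, which is worth fixing before you touch Search Console.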

Step 2: Who your robots file lets in or blocks

The robots file is the sign at your door telling bots whether they may enter. It now has another layer, the Content-Signal, splitting "may search" from "may feed into AI" from "may train AI". If you use a host that sets this automatically, it may block AI bots without you asking.

Do: open yoursite/robots.txt and read it. Look for the Content-Signal line and the bots marked Disallow, then decide. If you want ChatGPT or Claude to cite you, let those bots in. If you do not want your content used for training, keep ai-train=no. These are two separate choices.

Verify: you know exactly who can get in, on purpose, not by default.
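Reading the file by eye works, but a script makes the answer explicit per bot. This is a rough sketch using Python's standard `urllib.robotparser`. The sample robots file is hypothetical, modeled on what I described finding, and the Content-Signal parsing assumes the line is a comma-separated list of key=value pairs like the one I saw. The standard parser does not understand Content-Signal, so that line is read separately. Each crawler applies its own matching rules, so treat the result as a close approximation.

```python
from urllib.robotparser import RobotFileParser

# The AI-related user agents named in this post.
AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended"]


def bot_access(robots_text: str, bots=AI_BOTS, path: str = "/") -> dict[str, bool]:
    """Map each bot name to whether robots.txt lets it fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


def content_signals(robots_text: str) -> dict[str, str]:
    """Collect key=value pairs from Content-Signal lines (assumed syntax)."""
    signals: dict[str, str] = {}
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for part in value.split(","):
            key, eq, val = part.partition("=")
            if eq:
                signals[key.strip().lower()] = val.strip().lower()
    return signals


# Hypothetical robots.txt, similar in shape to a host-managed one.
SAMPLE_ROBOTS = """User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

print(bot_access(SAMPLE_ROBOTS))
print(content_signals(SAMPLE_ROBOTS))
```

The two functions mirror the two separate choices: `bot_access` answers "who may enter", and `content_signals` answers "what may they do with it".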

Step 3: Add an llms.txt

Robots only says who may enter, not where the important content is. An llms.txt is a short file you write yourself and place at the root, telling AI what your main pages are and what to read first.

Do: create an llms.txt at the root in simple markdown: your site name, one line on what you do, then a list of key pages with a short description each. That is enough to start.

Verify: open yoursite/llms.txt and confirm the file loads. Done.
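To show what that file can look like, here is a small sketch that builds one from a list of pages. The layout (an H1 with the site name, a one-line blockquote summary, then a linked list) loosely follows the public llms.txt proposal. The function name `build_llms_txt`, the "Key pages" heading, and every page below are placeholders, not my real file.

```python
def build_llms_txt(site_name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Render a minimal llms.txt: name, one-line summary, list of key pages.

    `pages` holds (title, url, description) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, description in pages:
        lines.append(f"- [{title}]({url}): {description}")
    return "\n".join(lines) + "\n"


# Placeholder content for illustration only.
EXAMPLE = build_llms_txt(
    "Example Site",
    "Notes on building products alone with an AI assistant.",
    [
        ("About", "https://example.com/about", "Who writes here and why"),
        ("Start here", "https://example.com/start", "The posts to read first"),
    ],
)

print(EXAMPLE)
```

Save the output as llms.txt at the site root. Putting the pages you most want read at the top of the list is the simplest way to say "read this first".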

Step 4: Make each page citable

AI answers people by lifting clear chunks from a page and citing them. If your page is one long wall of text with no headings, it is hard to lift.

Do: add clear subheadings, a table of contents on long pages, a short summary near the top, and schema markup that says this is an article, who wrote it, and when.
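For the schema markup, one common form is a JSON-LD script tag using the schema.org Article type. The sketch below generates such a tag with only the three facts named above: that it is an article, who wrote it, and when. The function name `article_json_ld` is mine, and the values are this post's own title, byline, and date, used as sample data. Real pages often add more properties, which I leave out here.

```python
import json


def article_json_ld(headline: str, author: str, date_published: str) -> str:
    """Return a <script> tag holding schema.org Article data as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # ISO 8601 date, e.g. 2026-06-16
    }
    # Escape "</" so a value can never close the script tag early.
    payload = json.dumps(data, ensure_ascii=False, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


TAG = article_json_ld("Your Site Is Live. But Who Can See It?", "Yim", "2026-06-16")
print(TAG)
```

The tag goes in the page's HTML, usually inside the head. Generating it from the same data that renders the byline keeps the markup and the visible page from drifting apart.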

Verify: each section becomes a unit that can be linked to and cited on its own.
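One way to test "can be linked to on its own" is to check that every subheading carries an id attribute, since that is what a #fragment link points at. This is a sketch under that assumption, using the standard `html.parser` module. The class and function names and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (tag, id, text) for every h1-h3 heading in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [tag, id, text] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._current = [tag, dict(attrs).get("id"), ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            name, anchor, text = self._current
            self.headings.append((name, anchor, text.strip()))
            self._current = None


def headings_without_anchor(html: str) -> list[str]:
    """Return the text of subheadings (h2, h3) that have no id to link to."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [text for tag, anchor, text in collector.headings
            if tag != "h1" and not anchor]


# Hypothetical page fragment: the second subheading cannot be deep-linked.
PAGE = '<h1>Title</h1><h2 id="step-1">Step 1</h2><p>text</p><h2>Step 2</h2>'
print(headings_without_anchor(PAGE))  # ['Step 2']
```

An empty list means every section has an address of its own. Anything it prints is a heading to fix.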

Step 5: Check that you are actually seen

Search site:yoursite on Google and see how many pages show. Go back to Google Search Console (GSC) and confirm the sitemap reads Success with your pages picked up. Then ask ChatGPT or Claude about something you wrote, and see whether it knows your site.

Tools that actually help

Mostly free, all doable solo. These are the ones I actually use, not a long list to look impressive.

Recap: the afternoon checklist

  1. /sitemap.xml opens, and it is submitted in GSC
  2. you have read /robots.txt and know who is allowed in, on purpose, not by default
  3. you have an /llms.txt pointing to your main pages
  4. every page has clear headings, a summary up top, and schema markup
  5. site: shows your pages, and asking an AI shows it knows you

🔒 Held back: get the full checklist

The principle is right here, free: live does not mean found, so go open the real files one by one instead of assuming people see you.

What I have not put in this post is the automated version: how to serve /sitemap.xml and /llms.txt correctly at the edge for every page at once, without editing files one by one, plus a repeatable AI-visibility audit. That is the difference between a one-time fix and a system that maintains itself.

You can do every step of this yourself. You do not have to wait for anyone, and you can start in an afternoon. It comes down to one line: do not assume people see you, go open the real files, starting with step 1 today.
