The text file that runs the internet

From The Verge, Feb. 14, 2024, by David Pierce

For three decades, a tiny text file has kept the internet from chaos. This text file has no particular legal or technical authority, and it’s not even particularly complicated. It represents a handshake deal between some of the earliest pioneers of the internet to respect each other’s wishes and build the internet in a way that benefitted everybody. It’s a mini constitution for the internet, written in code. 

It’s called robots.txt and is usually located at yourwebsite.com/robots.txt. That file allows anyone who runs a website — big or small, cooking blog or multinational corporation — to tell the web who’s allowed in and who isn’t. Which search engines can index your site? What archival projects can grab a version of your page and save it? Can competitors keep tabs on your pages for their own files? You get to decide and declare that to the web.

continue reading

Previous
Previous

Five Myths About Generative AI That Leaders Should Know

Next
Next

AI is Reinventing the Legal Industry