AI companies have become data-hungry as their models require ever-larger training datasets. To meet that need, many AI startups scrape data aggressively and ignore long-standing internet conventions such as robots.txt files, which signal to automated crawlers which parts of a website are off-limits. This has forced websites to restrict access to their data and, in some cases, strike licensing deals with AI companies. Fitness and social running company Strava is making a move in this direction by restricting its website and introducing fees for developer access.
To stop scraping, the company is increasing security


