How do I tell Google not to crawl certain pages?

I have scenarios where I'd like Google not to crawl certain pages, simply to avoid wasting crawl budget. The pages all have a canonical tag pointing to another page, which Google respects (but only after crawling the page). I'd like to tell Google not to crawl the page in the first place.

Scenario #1:

Main URL: /first-slug
Duplicate URL (with canonical set to Main URL): /first-slug/second-slug

Note that first-slug and second-slug have thousands of dynamic combinations, so putting each combination manually in robots.txt is not an option.
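
For illustration, this is the kind of single wildcard rule I'm hoping exists. It's only a sketch, and it assumes all main URLs have exactly one path segment and that every two-segment path is a duplicate:

```
# Hypothetical robots.txt rule: block any URL with a second path segment.
# Assumes NO legitimate pages live at two-segment paths.
User-agent: Googlebot
Disallow: /*/
```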

Scenario #2:

Main URL: /some-slug
Duplicate URL (with canonical set to Main URL): /some-slug?page=x

That's just your basic pagination, where x can be any page number.
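
Again just for illustration, a parameter wildcard like this is what I have in mind; it assumes page= appears either as the first query parameter or after another one:

```
# Hypothetical robots.txt rules: block any URL carrying a page= parameter.
User-agent: Googlebot
# page= as the first query parameter
Disallow: /*?page=
# page= appearing after another parameter
Disallow: /*&page=
```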

Is it somehow possible to do this with robots.txt (without specifying thousands of entries)? Or is there a rel attribute I can use on the internal links to those pages that has no negative effect other than Google not crawling them?
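
For context, this is roughly the markup I'm imagining if such a rel attribute exists. As far as I know, rel="nofollow" is the closest thing, but Google treats it as a hint rather than a guaranteed crawl block:

```html
<!-- Hypothetical internal links marked so Google won't follow them.
     Note: Google treats nofollow as a hint, not a directive. -->
<a href="/first-slug/second-slug" rel="nofollow">Second slug</a>
<a href="/some-slug?page=2" rel="nofollow">Page 2</a>
```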
