Hi,
We have a webpage on our website that I have marked in the settings to prevent Google and other search engines from crawling it. The webpage has to be on the site, just not searchable by a web crawler. I have also asked Google to remove it from its index, which can only be done temporarily. This is the webpage: https://www.nafo.int/Libr...rking-Papers/STACFAD
I also included a noindex meta tag in the page header.
I have also coded in the config files/robots to disallow these pages:
Disallow: /portals/0/Images/Secretariat/
Disallow: /Working-Papers/STACFAD
Disallow: /stacfadwp21-05.pdf
Why do the images from these pages still show up in Google, especially if I use the search term "NAFO LOGO"? I do not want them to show up. I thought I had set everything up in these areas to prevent this from happening.
Any suggestions are welcome. There are old images showing up as well that people seem to access, and I would prefer that they were not accessible via the internet.
Alexis
First, it sounds like you're doing all of the correct things.
In my personal, anecdotal experience, these things should still always be done when it makes sense. However, there are nuances to this too. I've found that Google will still crawl and keep a record of everything it can, but "indexing" is treated differently from "knowing the content is there."
The case of the images, specifically, is very interesting. It looks like you are, again, doing the right things. I'd recommend going into Google Search Console to try to remove them from the index and search results.
In our case there were JPG thumbnails of copyrighted PDFs. The PDFs were protected behind a registration sign-up, and the copyright holder did not want anything indexed by Google (not even the thumbnail images), but we still had to show the thumbnails on the site to encourage people to sign up.
Posted By RichardHowells on 2/9/2024 5:32 PM
I think that robots.txt ONLY requests "Please don't crawl this area". In my mind that's not "Please don't crawl this area AND delete anything you already have." So I assume that if those pages have EVER been crawled then, in principle, Google has them.
It's never been entirely clear to me *why* we would block a crawler. The pages/images presumably are not secret/confidential. If they were then I'd expect they'd be behind a password challenge. Why not just let the crawlers crawl?
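To make the "robots.txt only answers 'may I fetch this path?'" point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It rests on a few assumptions: the `User-agent: *` line is added because the rules quoted in the question do not show one, the helper name `is_crawl_allowed` is made up for this example, and the test paths (`logo.png`, `/Some-Section/...`) are hypothetical, since the real URL in the question is truncated. It also only shows how Python's parser reads the rules; Google's own matching may differ in details such as wildcards.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in the question. The "User-agent: *" line is an assumption;
# the original post does not show which crawler the rules were written for.
ROBOTS_TXT = """\
User-agent: *
Disallow: /portals/0/Images/Secretariat/
Disallow: /Working-Papers/STACFAD
Disallow: /stacfadwp21-05.pdf
"""


def is_crawl_allowed(path, user_agent="Googlebot"):
    """Return True if these rules allow user_agent to fetch path.

    Hypothetical helper for illustration only. It says nothing about
    whether a URL is already in a search index.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in (
        "/portals/0/Images/Secretariat/logo.png",  # hypothetical file name
        "/Portals/0/Images/Secretariat/logo.png",  # same path, different case
        "/Working-Papers/STACFAD",
        "/Some-Section/Working-Papers/STACFAD",    # hypothetical parent folder
        "/stacfadwp21-05.pdf",
    ):
        print(path, "->", "allowed" if is_crawl_allowed(path) else "blocked")
```

Two things stand out when running this. The rules are matched as case-sensitive prefixes from the site root, so a rule only covers a page or image if the real path begins with exactly that text. And nothing here removes anything: a blocked path is simply not fetched again, which also means a crawler that obeys the rule never gets to read a noindex tag on that page.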
You certainly want to check what's indexed; you can type "site:yourdomain.com" into Google search and it will show the pages Google has indexed from the specified domain. I use this a lot whenever I have to double-check what's public and what's not.
Hi everybody! I just found this article that will help you delete an image from the Google index:
https://www.searchenginej...-search-index/508458