Google and your website: a blind alliance
Suppose you have a website, “onlineshopper.com”, and you google it with the keywords “website for online shoppers”. You will see your own site's pages in the results, along with other websites related to those keywords. That is what every site owner wants: we all push Google to find and index our websites, and this is especially common for e-commerce sites. It also means that three relationships are in play:
A. Your website “onlineshopper.com” is directly allied with Google.
B. Your website and your web server (where all your usernames and passwords are saved) are directly allied with each other.
C. Alarmingly, that makes Google indirectly allied with your web server.
You may be convinced that this is normal, and you may never expect anyone to use Google to retrieve information from your web server. Now think again: instead of searching Google for “website for online shoppers”, what if I search for “online shoppers website usernames and passwords”? Will Google be able to provide the list of usernames and passwords for the online shoppers' website? As a security consultant, my answer is “MAYBE, SOMETIMES!” But if you use Google dorks (carefully chosen search keywords and operators), the answer is a big “YES!” whenever your website is missing the right security settings.
Google Dorks can be intimidating.
Google looks like a watchdog on duty until you see its other side. Google may have answers to all your queries, but you need to frame your questions correctly, and that is where Google dorks come in. A Google dork is not complicated software that you install, run, and then wait on for results; it is a combination of search operators (intitle, inurl, site, intext, allinurl, and so on) that you type into Google to get exactly what you are looking for.
For example, suppose your goal is to download Java-related PDF documents. A normal Google search would be “free download of pdf documents in java” (“free” being the mandatory keyword without which no Google search is complete). With Google dorks, your search becomes “filetype:pdf intext:java”. Note that there is no space between an operator, its colon, and its value. With these operators, Google understands what you are looking for far better than it did with the previous search, and you get more accurate results. That looks promising for effective Google searching.
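To make the operator syntax concrete, here is a minimal Python sketch that assembles a dork from operator/value pairs and turns it into a Google search URL. The helper names build_dork and search_url are hypothetical and exist only for this illustration; the URL simply uses Google's ordinary "q" search parameter.

```python
from urllib.parse import urlencode


def build_dork(terms="", **operators):
    """Join operator:value pairs (no spaces around the colon) with free-text terms."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL with the query URL-encoded in the 'q' parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(filetype="pdf", intext="java")
print(query)              # filetype:pdf intext:java
print(search_url(query))  # https://www.google.com/search?q=filetype%3Apdf+intext%3Ajava
```

The order of operators and plain keywords in the query does not need to match any particular pattern, so the sketch simply places the free-text terms last.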
However, attackers can use the same operators for a very different purpose: extracting information from your website or server. Suppose I want usernames and passwords that Google has indexed and cached from servers. I can use a simple query such as “filetype:xls passwords site:in”, which returns cached spreadsheets from websites under the .in (India) domain that have usernames and passwords saved in them. It is as simple as that. For the online shoppers' website, a query such as “filetype:xls passwords inurl:onlineshopper.com” could return results that would alarm anyone. In simple terms, your private or confidential information ends up available on the Internet, not because someone hacked it, but because Google was able to retrieve it at no cost.
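A site owner can look for the same kind of files before a search engine does. The following is only a rough sketch, assuming you have local access to the directory your web server publishes; the function name find_risky_files and the extension and keyword lists are hypothetical choices for illustration, not a complete audit.

```python
import tempfile
from pathlib import Path

# Hypothetical lists for illustration; tune them to your own site.
RISKY_EXTENSIONS = {".xls", ".xlsx", ".csv", ".sql", ".bak"}
RISKY_WORDS = ("password", "passwd", "user", "credential")


def find_risky_files(web_root):
    """Return files under web_root whose extension and name both look sensitive."""
    hits = []
    for path in Path(web_root).rglob("*"):
        if not path.is_file():
            continue
        name = path.name.lower()
        if path.suffix.lower() in RISKY_EXTENSIONS and any(w in name for w in RISKY_WORDS):
            hits.append(path)
    return sorted(hits)


# Demo on a throwaway directory standing in for a real web root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_text("<h1>shop</h1>")
    (root / "passwords.xls").write_text("placeholder")
    for hit in find_risky_files(root):
        print(hit.name)  # passwords.xls
```

Anything this reports is a candidate for removal from the public directory, since a file that is not served at all cannot be indexed.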
How to prevent this?
Web robots (also called wanderers, crawlers, or spiders) are programs that traverse the web automatically. Search engines such as Google, Bing, and Yahoo use these robots to scan websites and extract information.
robots.txt is a file on your website that tells those robots what they may and may not access. It gives you a measure of control over search engines, although only well-behaved crawlers honor it. Setting up robots.txt is not rocket science; you just need to know which information should and should not be visible to search engines. A sample robots.txt configuration looks like this.
User-agent: *
Allow: /website-content
Disallow: /user-details
Disallow: /admin-details
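One way to check such rules before deploying them is Python's standard urllib.robotparser module. The sketch below is an illustration under assumptions: it writes the sample rules above in standard robots.txt syntax (a User-agent line plus Allow and Disallow directives), and the paths and the helper name allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The sample rules in standard robots.txt syntax (paths are illustrative).
ROBOTS_TXT = """\
User-agent: *
Allow: /website-content
Disallow: /user-details
Disallow: /admin-details
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if the rules let the given crawler fetch the path."""
    return parser.can_fetch(agent, path)


print(allowed("/website-content/index.html"))  # True
print(allowed("/user-details/accounts.xls"))   # False
print(allowed("/admin-details/"))              # False
```

To test a live site instead, the same parser can load the published file with set_url() and read() rather than parsing a string.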
Unfortunately, website designers often overlook these robots.txt settings or configure them incorrectly. Surprisingly, many Indian government and university websites are prone to this attack because they expose sensitive information about their sites. With malware, remote attacks, botnets, and other high-level threats flooding the Internet, Google dorks can be even more threatening, because all they require is a working Internet connection on any device. Nor does it end with retrieving sensitive information: using Google dorks, anyone can find vulnerable CCTV cameras and modems, email usernames, passwords, and online order details just by searching Google.