Google Safety and You
DISCLAIMER: As with all things in this blog, information can be misused. Please keep in mind that running too many Google Hack commands will trigger a Google verification CAPTCHA.
Because the search engine Google has become practically
synonymous with the Internet these days, to be safe online really is to be safe
through Google. Not only are jobs and relationships gained and lost based on Google, but
online stalking has become an ever-increasing danger. We all need to be mindful that we may be potential phishing targets. But how,
you might ask, can one be safe through a tool whose very nature is to expose?
The answer is as simple as it is complex: understand the underlying
structures supporting the search engine, and redirect its purpose to better
inform and protect yourself.
So together let’s take a moment to talk about:
· Google: a nest of spiders
· From basic to advanced: Google search commands
· That’s not me: Image theft and fake accounts
· What others are Googling: Trending searches, and who is looking
· Taking control of the Spiders: Google alerts and automated searches
· Controlled content: Setting up restricted search engines within Google
Google: a nest of spiders
So what exactly IS Google, and what makes it different
from the other search engines? What sets Google apart is something called
PageRank. PageRank is a proprietary algorithm that assigns a rank to websites
as they are discovered by programs known as Google spiders (Googlebots), which
crawl the web by following links. As they crawl, they build up the Google
cache, i.e. a collection of known sites with known characteristics. PageRank
helps determine what shows up first when you do a Google search. Other search
engines have based their ranking on the number of links leading to a page.
Google, on the other hand, takes into consideration not only the number of
links but also the importance of the pages those links come from. The
importance of a page, in turn, is determined by the links pointing to it.
Money paid to Google buys ad placements, which are labeled and shown alongside
the ranked results. There are of course other, more controversial factors in
page ranking (for example, your current location affects search results), but
this is the basic idea.
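To make the "importance of the linking pages" idea concrete, here is a minimal Python sketch of the simplified, textbook form of PageRank. It is not Google's production ranking, which uses many more signals. The four-page web, the damping value of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page name to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "home", nothing links to "orphan".
toy_web = {
    "home": ["blog", "shop"],
    "blog": ["home"],
    "shop": ["home"],
    "orphan": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, "home" ends up ranked highest because every other page links to it, while "orphan" ranks lowest because nothing links to it.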
Websites can also ‘talk’ to these spiders through a
robots.txt file, which asks the spider not to crawl certain pages or
file extensions. (Try Googling inurl:robots.txt
filetype:txt for some examples.) You might have heard about something
called the ‘Deep Web’ (often confused with the ‘Dark Web’). These are simply
sites that the spiders fail to crawl or cannot reach (a DNS error, a server
error, or just an incorrect robots.txt file can cause this too).
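If you are curious how a well-behaved spider reads those rules, Python's standard library includes a robots.txt parser. The sketch below feeds it a made-up robots.txt (the rules and the example.com URLs are hypothetical) and asks whether a given page may be crawled. Note that the standard-library parser matches path prefixes only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so nothing is fetched from the network.
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow this user agent to crawl the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/index.html"))
print(is_crawlable("https://example.com/private/notes.html"))
```

Keep in mind that robots.txt is a polite request, not a lock. Anyone can read the file, and a badly behaved crawler can simply ignore it.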
From basic to advanced: Google search commands
In the last section, we referenced Google commands that
you might not be familiar with: inurl and filetype. There are many more
of these commands, known as Google Hacks. We care about them because the bad
guys use them to look at us online. We need to understand what they see, so
that we can better protect ourselves against online vulnerabilities.
Let’s do a basic Google search to show you that concept in
practice. We will do this manually, but keep in mind that hackers, stalkers,
hobbyists, and intelligence/law enforcement use programs that send these
commands en masse.
I simply Google myself, Olivia Terrell. Nothing much
comes up, except for some random jewelry company that I have nothing to do
with, and pictures that look nothing like me, even on a good day.
Looking at the results, we see that pages with just
Olivia or just Terrell are being returned. So next I refine that search a
little bit to include only cases where Olivia and Terrell appear together. I
can do this one of two ways: with a ‘+’ sign, or with quotation marks. The ‘+’
means that Olivia and Terrell both have to appear somewhere in a document, but
not necessarily together. Googling “Olivia Terrell” brings back only the intact phrase.
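The difference between "both words somewhere" and "the intact phrase" is easy to see on a few sample sentences. The following Python sketch only imitates the two matching rules locally. It does not query Google and is not how Google implements them, and the sample sentences are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all_words(text, words):
    """True if every word appears somewhere in the text, in any order."""
    tokens = set(tokenize(text))
    return all(word.lower() in tokens for word in words)


def contains_phrase(text, phrase):
    """True if the words of the phrase appear next to each other, in order."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))


# Invented sample documents.
docs = [
    "Olivia Terrell spoke at the meeting.",
    "Olivia Smith and James Terrell run a jewelry company.",
    "Terrell County news",
]

for doc in docs:
    print(
        contains_all_words(doc, ["Olivia", "Terrell"]),
        contains_phrase(doc, "Olivia Terrell"),
        doc,
    )
```

The second sentence is the interesting one: both names are present, so the "all words" rule matches it, but the phrase rule does not.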
You will probably see a lot of your information on sites
where you never even signed up (PeopleSearch, Spokeo, Ancestry, etc.). They got the
information from lists that are bought and sold online from a variety of
sources. Some are public records available under CORA, the Colorado Open
Records Act (like when you register to vote), and some come from loans or
other transactions, or from signing up for “free” social media accounts. Some
of your information ends up online because of well-meaning relatives trying to
do genealogy research.
Remember to also Google your email addresses, phone
number, address and usernames to see where your data is being used. It is amazing how long data hangs around. For
example, if I Google oeterrel, all the tech forums and even an old PowerPoint
out of college turn up. You may also see your email being sold in spam lists.
Suppose I cared not just about keywords linked to a
page, but about the contents of the page. That’s where the intext command
comes in. With it, the number of hits that are actually me climbs. That
command would look something like this:
intext:"Olivia Terrell"
Now, let’s suppose I only want to see PDFs or documents
with my information in them. Why? Because you’d be amazed how many documents
you turn up in online that you don’t remember or know about, and how much about
you they can reveal. To do this, I add the filetype command:
intext:"Olivia Terrell" filetype:pdf
If I wanted to see PDFs or docs, I would use the pipe
character, which Google understands as ‘or’.
intext:"Olivia Terrell" filetype:pdf | filetype:doc
Notice that Google commands are additive. We can just
keep adding to our string. Order does not matter either. Now suppose I want
only items with my name in them, with a file type of pdf or doc, but only on
websites that are .edu. That is where the inurl command comes in. The syntax
would look something like this:
intext:"Olivia Terrell" filetype:pdf | filetype:doc inurl:edu
Or, you could simply narrow your search to a particular
website:
intext:"Olivia Terrell" site:www.denvergov.org
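Because the commands are additive, they are easy to assemble in a small helper. The Python sketch below builds the same kind of search string shown above and turns it into a search URL you could open in a browser. The function names are made up for this post, and the URL form (google.com/search with a q parameter) is an assumption based on what the address bar shows after a normal search. The sketch only builds the string and sends nothing, which matters because firing off many automated queries is exactly what triggers the CAPTCHA mentioned in the disclaimer.

```python
from urllib.parse import urlencode


def build_query(phrase, filetypes=(), inurl=None, site=None):
    """Combine the commands from this section into one search string."""
    parts = [f'intext:"{phrase}"']
    if filetypes:
        # The pipe is read as 'or' between the file types.
        parts.append(" | ".join(f"filetype:{ft}" for ft in filetypes))
    if inurl:
        parts.append(f"inurl:{inurl}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query):
    """Return a search URL for the query (assumed URL form, see note above)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("Olivia Terrell", filetypes=["pdf", "doc"], inurl="edu")
print(query)
print(search_url(query))

site_query = build_query("Olivia Terrell", site="www.denvergov.org")
print(site_query)
```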
This list is by no means exhaustive; there are many more
commands. But just from these very simple Google hacks, a complete stranger,
knowing only my name, could quickly figure out:
· The names of my family
· Where I live
· My voting status (if you find your voting status online, there is a link at the bottom to delete it)
· My phone number(s) and email addresses
· Where I work
· Any groups/forums/committees or interests
All of this would be more than enough to spoof an email
to me at home or work, pretending to know my family or to have met me through
work somewhere, in the hopes of conning me out of information or money.
There are other nefarious uses as well. Consider, for
example, a person searching for part of a login screen URL, part of a video
path on an unsecured video networking system (e.g. something like
inurl:/view/viewer_index.shtml ), an index.html page, or even just xlsx files
that contain the word pwd. This actually happens all the time.
That’s not me: Image theft and fake accounts
If someone steals your written information and creates
accounts, you can easily find them via these Google Hacks. But what if all they
steal is your image? Most of us are familiar with this happening on Facebook
(if you are not, check out https://www.nbcnews.com/business/consumer/fake-facebook-profiles-cause-heartbreak-families-colleagues-n895091
for an example story), but few of us realize that dating sites often buy entire
databases of images to populate fake accounts and attract members.
To see if someone has stolen your profile image, go to https://images.google.com/
(it will look like an ordinary search screen). Drag and drop your profile image
from your computer straight into the box where you would usually type your search.
While there are many places you might find a fake
account, in Facebook, the steps to report an account may be found here:
If you find your information on some strange dating site,
find a help link to their tech support and complain. Most companies would
rather quietly delete an issue to make it go away than face unwanted
publicity.
What others are Googling: Trending searches, and who is looking
There is another feature of Google worth noting: Google
Trends ( https://trends.google.com/ ).
Google Trends allows you to see how popular other searches are. This doesn’t
matter as much for online security, but it is super useful for tracking the
progress of things like the flu. Most people do not even think of the flu (or
any disease) until they have symptoms; then the first stop is usually Google.
Searching “flu symptoms” in Google Trends over the past 90 days suggests that
the virus is spreading at a pretty even rate, and that Kentucky is the hot spot.
If we localize to Colorado, Grand Junction is the
epicenter of Google flu searches so far.
Another reason to watch Trends is that phishers and
spammers do. Whenever there is a significant uptick in a particular search,
such as stories about a natural disaster or a political candidate, the phishers
and spammers may use this information in their attacks.
Taking control of the Spiders: Google alerts and automated searches
OK, so we’ve learned there are a lot of Google queries out
there we can run, but who has time for that? Never fear: you can tell Google to
do all that for you. Simply go to Google Alerts at https://www.google.com/alerts
and paste in some of the Google searches we talked about earlier
in this document. Remember you can string commands together or use the ‘|’ (logical
or) to make a bunch of queries into one.
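As a rough illustration of rolling several searches into one alert, the Python sketch below quotes each thing you want to watch (name, username, and so on) and joins them with the pipe. The function name is hypothetical, and you still paste the resulting string into the Google Alerts box by hand.

```python
def combine_for_alert(identifiers):
    """Quote each identifier and join them with '|' so any one of them matches."""
    quoted = [f'"{item.strip()}"' for item in identifiers if item.strip()]
    return " | ".join(quoted)


# The name and username used earlier in this post.
alert_query = combine_for_alert(["Olivia Terrell", "oeterrel"])
print(alert_query)
```

Quoting each item keeps a multi-word name together as a phrase, so the alert does not fire on every page that merely mentions one of the words.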
If you expand ‘Show options’, you will be able to
control how often the Google query is performed and sent to you. By default,
Google sends results at most once a day.
Controlled content: Setting up restricted search engines within Google
Another nifty tool that Google provides is the ability to
customize the search to only a specific set of sites. This is useful when you
have younger children, have a specific topic you want to research, or just
have elderly parents who don’t care for all the clutter online. This tool is
called Google Custom Search. To use it, go to https://cse.google.com/cse/ and
log in with a Gmail account.
Click the Add button:
Fill in the sites you’d like to restrict the search to.
Once you get a custom Google set up the way you’d like,
you can also make it the default homepage.
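To picture what a restricted search does, think of it as an allow list: only results from approved sites get through. The Python sketch below shows that idea on a handful of URLs. It is only an illustration of the concept, not how Google Custom Search works internally, and the site list and sample results are invented.

```python
from urllib.parse import urlparse

# Hypothetical list of approved sites.
ALLOWED_SITES = {"www.denvergov.org", "example.edu"}


def is_allowed(url, allowed=ALLOWED_SITES):
    """True if the URL's host is an approved site or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == site or host.endswith("." + site) for site in allowed)


def restrict(results, allowed=ALLOWED_SITES):
    """Keep only the results that come from approved sites."""
    return [url for url in results if is_allowed(url, allowed)]


# Invented search results.
sample_results = [
    "https://www.denvergov.org/services",
    "https://spam.example.com/win-a-prize",
    "https://library.example.edu/books",
]

for url in restrict(sample_results):
    print(url)
```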
I hope this very brief tutorial has piqued your interest
in just how powerful the search engine Google is, and inspired you to research
further into Google hacks, trends, alerts, searches and page rankings. With
endless open information, sadly, comes open risk. Be safe, and happy Googling.