Huge number of non-existent web pages in Google Analytics

We are getting hundreds of hits to non-existent pages showing up in Google Analytics. I'm wondering if anyone has any insight into how site visitors are even getting these URLs. I can't believe they are all manual mistypes, although most source/medium values are (direct)/(none) and referral paths are (not set). Here's a small sample:


/about-the-bay/state-of-the-bay-report/so /tb-about-the-indicators.html

/about-the-bay/state-of-the-bay-report/so tb-about-the=indicators.html

/about-the-bay/state-of-the-bay-report/so tb-about-the-indicators.html

/about-the-bay/state-of-the-bay-report/sobt-about-the-indicators

/about-thebay/state-of-the-bay-report/sobt-about-the-indicators.hmtl

/about-the-bay/state-of-the-bayreport/sobt-about-the-indicators.html

/about-the-bay/state-ofthe-bay-report/sotb-about-the-indactors.html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indactors.html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indcators.html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indicaors.html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indicaters.html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indicators&html

/about-the-bay/state-of-the-bay-report/sotb-about-the-indicators.hmtl

/about-the-bay/state-of-the-bay-report/sotb-about-the-indicators.htlm

/issues/chemical/contamination/

/issues/chemical_contaminants

/issues/chemical-containments

/issues/chemicalcontamiants

/issues/chemical-contaminants/

/issues/chemical-contamination/ gjhgdtg

/issues/chemical-contamination/

/issues/chemical-contaminations

/issues/chemical-contaminats/

/site/pageserver?pagename=lrn_sub_students_center

/site/pageserver?pagename=lrn_sub_teachers_professional_immersion

/site/pageserver?pagename=lrn_sub_teachers_professional_immersion_june

/site/pageserver?pagename=resources_facts_deadzone

/site/pageserver?pagename=resources_facts_nonnative_oysters

/site/pageserver?pagename=resources_facts_oysters


Any thoughts are appreciated.


Kim


Comments

  • Via StackExchange:

    That is likely due to referral spam. A spammer may be using your Google Analytics tracking ID so that activity on their own website gets recorded in your Analytics account. To check, add "hostname" as a secondary dimension when viewing your data and you will see which website is actually the culprit.

    To get rid of your spam data and ensure that only actual visits to your website get recorded, create a hostname filter:

    1. Go to "Admin".
    2. Click on "Filters".
    3. Click on "New Filter".
    4. Tick "Create New Filter".
    5. Specify your "Filter Name" (ex. Spam Remover).
    6. Select "Custom".
    7. Select "Include Only" for the "Filter Type".
    8. Select "Traffic to the Hostname" for the "Source or Destination".
    9. Select "That Contain" for the "Expression".
    10. And then under "Hostname", enter your domain name with a \ before each period (ex. if your site is http://www.mysite.com.au, then it will be mysite\.com\.au).

    And then that's it. Google Analytics will then show you only visits that really ended up on pages with your domain name.
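
    To see why the periods are escaped, here is a minimal sketch in Python. It assumes the filter pattern behaves like an ordinary regular expression doing a "contains" match; Python's re module stands in for whatever Google Analytics uses internally, and the sample hostnames and paths are made up for illustration.

```python
import re

# Pattern from the example in the steps above: periods escaped so they
# match a literal "." rather than any character.
HOSTNAME_PATTERN = re.compile(r"mysite\.com\.au")


def is_own_hostname(hostname: str) -> bool:
    """Return True if the hostname contains the escaped domain pattern."""
    return HOSTNAME_PATTERN.search(hostname) is not None


# Hypothetical (hostname, page path) pairs, not real Analytics data.
sample_hits = [
    ("www.mysite.com.au", "/issues/"),
    ("spam-domain.example", "/fake-page"),
    ("(not set)", "/another-fake-page"),
]

# Keep only hits whose hostname contains the domain, as an
# "Include Only ... That Contain" filter would.
kept = [(host, path) for host, path in sample_hits if is_own_hostname(host)]
print(kept)
```

    Note that a "contains" match also keeps any hostname that merely embeds your domain (for example a proxy hostname that starts with www.mysite.com.au), so it is worth reviewing the hostname report before and after applying the filter.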




    BPM

  • Brian -

    Hostname is showing up as ours: www.cbf.org. There are a few that show www.cbf.org.googleweblight.co, but those are legitimate pages.


    Kim
