Tweaking GoAccess for Analytics

GoAccess Dashboard for this server

GoAccess is an open source server-side web log analyzer. Server side means that it will process web server logs and compile the results into a real-time graphical view. It works with many web log formats: nearly all of them, such as Apache, CloudFront, and Nginx, according to the GoAccess website.

By default goaccess will provide a complete picture of the traffic to your server. A complete graphical overview can allow you to spot patterns or quirks in your stack. This includes crawlers, bots, and various HTTP requests. We can make some changes to goaccess' configuration to make it behave a bit more like a web analytics tool such as Matomo or Fathom.

“I believe tracking visitors at the client level deflates the actual number of visitors. On the other hand, server–side tracking gives you a more accurate number at the cost of not knowing for sure if the client is a human behind a browser.”
GoAccess author, explaining tracking using client vs. server

GoAccess allows command line flags and shell piping, but we'll do most of the work from a central goaccess.conf file. The default configuration file can be viewed in the goaccess code repository. Let's start off by enabling real-time HTML so that the report updates live over a WebSocket connection.

cfg
# Enable real-time HTML output.
real-time-html true

# Set output HTML path.
output /srv/http/goaccess/index.html

The backend web server is nginx, so enable the combined log format and set the access log path.

cfg
# Set log format.
log-format COMBINED

# Specify the path to the input log file.
log-file /var/log/nginx/access.log
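
The same settings can also be passed as command line flags, which is handy for a one-off run before committing anything to goaccess.conf. This is a minimal sketch, assuming the paths used in this article; run_realtime_report is a hypothetical helper name, not something goaccess ships.

```shell
# Command line equivalent of the real-time and log format settings above.
# run_realtime_report is a hypothetical helper name; the default paths are
# the ones used in this article and may differ on your system.
run_realtime_report() {
  log_file="${1:-/var/log/nginx/access.log}"
  output="${2:-/srv/http/goaccess/index.html}"
  goaccess "$log_file" \
    --log-format=COMBINED \
    --real-time-html \
    --output="$output"
}
```

Calling run_realtime_report with no arguments uses the defaults; pass a log path and an output path to override them.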

Exclude localhost so that goaccess does not count internal requests as unique visitors. We can exclude multiple public IPv4 and IPv6 addresses here as well.

cfg
# Exclude an IPv4 or IPv6 address from being counted.
exclude-ip 127.0.0.1
exclude-ip xx.xx.xx.xx
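
To decide which addresses are worth excluding, it can help to see which clients generate the most requests. This is a rough sketch, assuming the combined log format where the client address is the first whitespace separated field; top_clients is a hypothetical helper name.

```shell
# List the busiest client addresses from an access log read on stdin.
# Output is one "count address" pair per line, busiest first.
# top_clients is a hypothetical helper, not part of goaccess.
top_clients() {
  count="${1:-10}"
  awk '{ print $1 }' \
    | sort | uniq -c | sort -rn \
    | head -n "$count" \
    | awk '{ print $1, $2 }'
}
```

For example, `top_clients 5 < /var/log/nginx/access.log` would list the five busiest addresses.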

Ignore crawlers. This should make the unique visitor count more accurate.

cfg
# Ignore crawlers from being counted.
ignore-crawlers true

You can further refine the output by adding more crawlers to ignore. This can be done by setting a browsers-file. An example is provided in the repository. This file must be tab delimited.

cfg
# Include an additional delimited list of browsers/crawlers/feeds etc.
browsers-file /opt/goaccess/config/browsers.list
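
Since the file must be tab delimited, a quick check before reloading goaccess can catch entries separated by spaces. This is a sketch under the assumption that each entry is a name, one or more tabs, then a category, as in the repository's example file; check_browsers_list is a hypothetical helper name.

```shell
# Print "line number: content" for entries that are not "name<TAB(s)>type".
# Blank lines and lines starting with # are skipped.
# Exits non-zero if any malformed entry is found.
check_browsers_list() {
  awk -F'\t+' '
    /^[[:space:]]*$/ || /^#/ { next }
    NF != 2 || $1 == "" || $2 == "" { print NR ": " $0; bad = 1 }
    END { exit bad }
  ' "$1"
}
```

Running `check_browsers_list /opt/goaccess/config/browsers.list` prints nothing when every entry has the expected shape.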

Let's enable IP address anonymization. Newer versions of goaccess also let you set the level of IP address anonymization with the command line flag --anonymize-level and the configuration option anonymize-level.

cfg
# IP address anonymization
anonymize-ip true

# Pedantic IP address anonymization
anonymize-level 3
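
As I understand the goaccess documentation, anonymization zeroes the trailing part of the address, so 192.168.20.100 becomes 192.168.20.0. The sketch below only illustrates that idea for IPv4; it is not goaccess's code. mask_ipv4 is a hypothetical helper, and its octet count is a parameter of this sketch rather than a goaccess option, so check the man page for how many bits each anonymize-level hides.

```shell
# Illustration only: zero the last N octets of an IPv4 address.
# mask_ipv4 ADDRESS [OCTETS], where OCTETS defaults to 1.
mask_ipv4() {
  address="$1"
  octets="${2:-1}"
  printf '%s\n' "$address" | awk -F. -v n="$octets" '{
    for (i = 4; i > 4 - n; i--) $i = 0
    print $1 "." $2 "." $3 "." $4
  }'
}
```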

By default goaccess does not add client errors to the unique visitors count.

cfg
# Do not add 4xx client errors to the unique visitors count.
4xx-to-unique-count false

We can also exclude specific HTTP response codes from the visitor count.

cfg
# Ignore parsing and displaying one or multiple status code(s)
ignore-status 429
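
Before ignoring a status code, it may be worth checking how often it actually shows up. This is a rough sketch, assuming the combined log format and request paths without spaces, so that the status code is the ninth whitespace separated field; status_counts is a hypothetical helper name.

```shell
# Count responses per HTTP status code from an access log read on stdin.
# Output is one "count status" pair per line, most frequent first.
status_counts() {
  awk '{ print $9 }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

For example, `status_counts < /var/log/nginx/access.log` shows whether 429 responses are frequent enough to skew the numbers.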

Referrer spam inflates and skews the log data. Ignore visitors from a list of spam referrers by using ignore-referer. There is a hard limit of 64 ignored entries. To accommodate larger lists, adjust settings.h accordingly. We can use a systemd or cron timer to refresh the list periodically. My personal preference is to use Matomo's list.

cfg
# Ignore referrer from being counted.
ignore-referer www.example.com
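
A periodic refresh could boil down to turning a downloaded list into ignore-referer lines. This is a minimal sketch, assuming the list is plain text with one domain per line (which is how I understand Matomo's list to be distributed); referers_to_config is a hypothetical helper name, and the default cap of 64 mirrors the hard limit mentioned above.

```shell
# Turn a newline delimited domain list on stdin into ignore-referer lines.
# Blank lines and lines starting with # are dropped.
# referers_to_config [LIMIT], where LIMIT defaults to 64.
referers_to_config() {
  limit="${1:-64}"
  grep -v -e '^[[:space:]]*$' -e '^#' \
    | head -n "$limit" \
    | sed 's/^/ignore-referer /'
}
```

The output can be written to a separate file and appended to goaccess.conf by whatever timer job fetches the list.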

Sort the most important panels by visitor count, data, and bandwidth in descending order.

cfg
# Sort panels on initial load by visitors, data, and bandwidth.
sort-panel BROWSERS,BY_VISITORS,DESC
sort-panel CACHE_STATUS,BY_VISITORS,DESC
sort-panel GEO_LOCATION,BY_VISITORS,DESC
sort-panel HOSTS,BY_VISITORS,DESC
sort-panel KEYPHRASES,BY_VISITORS,DESC
sort-panel MIME_TYPE,BY_VISITORS,DESC
sort-panel NOT_FOUND,BY_BW,DESC
sort-panel OS,BY_VISITORS,DESC
sort-panel REFERRERS,BY_VISITORS,DESC
sort-panel REFERRING_SITES,BY_VISITORS,DESC
sort-panel REMOTE_USER,BY_VISITORS,DESC
sort-panel REQUESTS,BY_VISITORS,DESC
sort-panel REQUESTS_STATIC,BY_BW,DESC
sort-panel STATUS_CODES,BY_VISITORS,DESC
sort-panel TLS_TYPE,BY_VISITORS,DESC
sort-panel VIRTUAL_HOSTS,BY_VISITORS,DESC
sort-panel VISITORS,BY_DATA,DESC
sort-panel VISIT_TIMES,BY_DATA,DESC

Change the theme and table specifications on the page by using a JSON string. The theme is set to dark blue, with 20 results per page. The visitors and visit time graphs are set to use bar charts instead of line charts.

cfg
# Set default HTML preferences.
html-prefs {"theme":"darkBlue","perPage":20,"visitors":{"plot":{"chartType":"bar"}},"visit_time":{"plot":{"chartType":"bar"}}}

Make sure that all static files, including files with a query string, are categorized under the static files table.

cfg
# Include static files that contain a query string in the static files panel.
all-static-files true

Show statistics based on country by loading in a GeoIP database. You can install a database from your Linux distribution of choice.

cfg
# Set GeoIP database path.
geoip-database /usr/share/GeoIP/GeoLiteCity.dat

Everything runs in memory. Enable Tokyo Cabinet and store the results as a database on the file system. Check the configure options to compile with Tokyo Cabinet support, for example: ./configure --enable-utf8 --enable-geoip=legacy --enable-tcb=btree --disable-zlib --disable-bzip

cfg
### GoAccess version <= 1.3

# Persist parsed data into disk.
keep-db-files true
# Load previously stored data from disk.
load-from-disk true
# Path where the on-disk database files are stored.
db-path /tmp/

Newer versions use a different syntax and will not require setting up specific configure options.

cfg
### GoAccess version >= 1.4

# Persist parsed data into disk.
persist true
# Load previously stored data from disk.
restore true
# Path where the on-disk database files are stored.
db-path /tmp
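
If the same configuration has to work across machines running different goaccess releases, the two syntaxes above can be selected from the version number. This is a sketch only; persistence_options is a hypothetical helper name, and the version string is passed in by hand rather than parsed from goaccess's own output.

```shell
# Print the persistence settings that match a goaccess version.
# persistence_options VERSION [DB_PATH], e.g. persistence_options 1.3 /tmp
persistence_options() {
  version="$1"
  db_path="${2:-/tmp}"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 4 ]; }; then
    # Version 1.4 and newer.
    printf 'persist true\nrestore true\n'
  else
    # Version 1.3 and older, built with Tokyo Cabinet support.
    printf 'keep-db-files true\nload-from-disk true\n'
  fi
  printf 'db-path %s\n' "$db_path"
}
```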

Now stream the logs into goaccess using our souped-up config. GoAccess will process the rotated logs of nginx in addition to the current access log stipulated in goaccess.conf.

shell
zcat --force /var/log/nginx/access.log-* | goaccess --config-file=/opt/goaccess/config/goaccess.conf -
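
The --force flag matters here because rotated logs are often a mix of compressed and uncompressed files, and with it zcat passes plain files through unchanged. The sketch below wraps that step so it can be tried on a scratch directory first; stream_rotated_logs is a hypothetical helper name, and it assumes GNU gzip's zcat and rotated files named access.log-* as in the command above.

```shell
# Emit every rotated access log in a directory, compressed or not.
# With date suffixes the glob expands in chronological order.
stream_rotated_logs() {
  log_dir="${1:-/var/log/nginx}"
  zcat --force "$log_dir"/access.log-*
}
```

Its output can be piped into goaccess exactly like the zcat command above.
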
11 June 2019 — Written
18 March 2021 — Updated
Thedro Neely — Creator
tweaking-goaccess-for-analytics.md — Article
