
How to Set Up Logging with Rails on Heroku and Splunk Cloud

- Updated June 17, 2018

Heroku is a fast, simple way to deploy web applications. With the Heroku Getting Started guide, you can get an application deployed within minutes. However, sometimes what makes Heroku simple also makes it difficult to customize the deployment when there’s not an available plugin. Recently, I ran into this issue while integrating a client’s Heroku applications with Splunk Cloud, a popular central logging system. In this post, I’ll use Ruby on Rails to show how I accomplished this and what I consider the best way to get Heroku integrated with Splunk.

(By the way, this article assumes you have an instance of Splunk Cloud provisioned and set up.)

Heroku: Log Drains

First, a little on how Heroku’s framework manages logs. Heroku sorts the logging output of a service running on its platform into three categories:

  • App logs: STDOUT and STDERR streams from the deployed application.
  • System logs: Heroku platform logs. Typically, these are administrative logs about system errors, restarting a process, etc.
  • API logs: Logged events from users/developers such as deploying new code or modifying the Heroku configuration.

Each group can have a set of “log drains” that a user can configure for the Heroku application. These log drains describe where to redirect the logging output and are either an HTTP endpoint or a TCP Syslog endpoint.

In simplified terms, these log drain configurations set a URL and protocol type. Many folks may get stuck here when integrating with Splunk Cloud. The most documented approaches for getting log data into Splunk include installing other software called the Universal Forwarder, formatting the log event into a JSON document, or connecting to a generic TCP+Syslog endpoint.

All of these approaches are difficult in a Heroku environment: other software is challenging to install and configure without official Heroku support, and Heroku log drains offer no way to reformat log events. Using Splunk Cloud’s TCP+Syslog input may work, but then we would expose a public endpoint without any authentication. (Note: Splunk’s support can turn on IP whitelisting, but Heroku’s IP addresses rotate all the time, so a whitelist is hard to maintain.)

Splunk HTTP Event Collector and the RAW API

Splunk has a feature called the HTTP Event Collector, or HEC. It opens an HTTP(S) endpoint that accepts log events and stores them in Splunk’s storage systems. Administrators configure each HEC endpoint with its own rules for how to index or process the incoming data. Additionally, each HEC endpoint has its own token string that any client must present to authenticate its requests.

This seems like a good option, especially since Heroku’s log drain setup has HTTP(S) endpoints as an option. However, Splunk’s HEC endpoints traditionally require both (1) the secret token packed into an Authorization header and (2) a JSON format for the log event. Heroku log drains can’t satisfy either requirement. It seems like we would be out of luck, but Splunk Cloud recently implemented a RAW API endpoint for HEC.
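To make those two traditional requirements concrete, here is a minimal Ruby sketch of a request to HEC’s JSON event endpoint. It only builds the request and never sends it. The helper name build_hec_event_request and the sample message are hypothetical, and the host and token are placeholders for your own values.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) a request for HEC's JSON event endpoint.
# The host and token arguments are placeholders for your own values.
def build_hec_event_request(host, token, message)
  uri = URI("https://#{host}/services/collector/event")
  request = Net::HTTP::Post.new(uri)
  # HEC expects the token in an Authorization header with the "Splunk" scheme.
  request['Authorization'] = "Splunk #{token}"
  request['Content-Type'] = 'application/json'
  # The log line travels inside the "event" key of a JSON document.
  request.body = JSON.generate({ 'event' => message })
  request
end

request = build_hec_event_request(
  'http-inputs-YOURSPLUNKCLOUDNAME.splunkcloud.com',
  'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX',
  'dyno restarted'
)
puts request['Authorization']
puts request.body
```

A Heroku log drain only lets you set a URL, so it has no way to add that custom header or wrap each log line in JSON.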

The RAW event API accepts any kind of HTTP data, not just JSON documents.  It allows for HTTP Basic Authentication, something that Heroku supports. For example, you can set up a Heroku log drain for Splunk’s HTTP Event Collector with the following command:

heroku drains:add https://anystring:YOUR_SPLUNK_TOKEN@http-inputs-YOURSPLUNKCLOUDNAME.splunkcloud.com/services/collector/raw

Your version of the command will look something like this:

heroku drains:add https://x:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX@http-inputs-devopsgorilla.splunkcloud.com/services/collector/raw

You can read more about the RAW endpoint here: http://dev.splunk.com/view/event-collector/SP-CAAAE8Y
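If you want to see what the credentials in that drain URL turn into, here is a small Ruby sketch. It builds (without sending) a POST to the RAW endpoint and applies the user name and token from the URL as HTTP Basic credentials. The helper name build_raw_request and the sample log line are hypothetical, and this only approximates what a drain sends; it is not Heroku’s actual implementation.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a request resembling a drain delivery to the
# RAW endpoint. Credentials embedded in the URL become HTTP Basic auth.
def build_raw_request(drain_url, body)
  uri = URI(drain_url)
  request = Net::HTTP::Post.new(uri.request_uri)
  # The user name can be any string; the HEC token travels as the password.
  request.basic_auth(uri.user, uri.password)
  # The RAW endpoint takes the body as-is, so no JSON wrapping is needed.
  request.body = body
  request
end

request = build_raw_request(
  'https://x:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX@http-inputs-YOURSPLUNKCLOUDNAME.splunkcloud.com/services/collector/raw',
  "at=info method=GET path=\"/\" status=200\n"
)
puts request['Authorization']
```

To actually send a test event from your workstation, you could pass the request to Net::HTTP.start with use_ssl: true, using your real host and token.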

If you don’t have HTTP Event Collector enabled on your Splunk Cloud instance, you can ask for this feature by contacting Splunk Support.

What about Application Logs?

Using the RAW API endpoint works great for Heroku’s System and API log categories. However, you may quickly notice that multiline log events, like a program’s stack trace, don’t work well with this approach. Each line of a stack trace is treated as its own individual event with its own timestamp and metadata. That’s no good if you want to read the entire stack trace later in Splunk’s Web UI. Putting stack traces aside, many developers expect informative logs with lots of metadata about the running application. Stuffing that much metadata into a single line can be difficult or, in some cases, impossible.
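The stack trace problem is easy to reproduce in plain Ruby. The sketch below raises a sample error, then compares how many events a line-oriented pipeline would see with the single structured event a logging library could send instead. The sample_error helper and the JSON field names are made up for illustration and are not what any particular library emits.

```ruby
require 'json'

# Raise and capture an exception so we have a real backtrace to log.
def sample_error
  raise ArgumentError, 'bad input'
rescue ArgumentError => e
  e
end

error = sample_error
trace_text = (["#{error.class}: #{error.message}"] + error.backtrace).join("\n")

# A line-oriented pipeline turns every line of the trace into its own event.
line_events = trace_text.lines.map(&:chomp)

# A structured logger can ship the whole trace as one event instead.
single_event = JSON.generate({
  'message' => error.message,
  'exception' => error.class.name,
  'backtrace' => error.backtrace
})

puts "line-oriented events: #{line_events.size}"
puts "structured events: 1 (#{single_event.bytesize} bytes)"
```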

For Application Logs, I recommend skipping the Heroku log drains and instead sending logs directly to Splunk Cloud from the application code. Most frameworks and languages have rich logging libraries that can send events to remote logging servers, and some even have specific support for Splunk.

To demonstrate, I’ll walk through the setup with Ruby on Rails.

First, follow the official Heroku Getting Started guide for Rails 5.x (https://devcenter.heroku.com/articles/getting-started-with-rails5)

After completing the guide, you should have a deployed application on Heroku and a project directory on your local workstation. We’re going to integrate your application with your instance of Splunk Cloud.  Open your Gemfile and add the following line towards the bottom:

gem 'rails_semantic_logger'

and run bundle install && git add Gemfile Gemfile.lock && git commit -m 'Add rails_semantic_logger package' && git push heroku master

This tells Heroku to install the rails_semantic_logger package. Future deployments will have this package and its logger library available to your application.

Next, open the file config/environments/production.rb and find the block you added in the Getting Started instructions that starts with: if ENV["RAILS_LOG_TO_STDOUT"].present?

and replace this block with the following:

if ENV["RAILS_LOG_TO_STDOUT"].present?
  STDOUT.sync = true
  config.rails_semantic_logger.format = :json
  config.rails_semantic_logger.started = true
  config.rails_semantic_logger.add_file_appender = false
  config.semantic_logger.add_appender(
      appender: :splunk_http,
      url: 'https://http-inputs-YOUR_SPLUNKCLOUD_NAME.splunkcloud.com/services/collector/event',
      token: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX',
      level: config.log_level,
      application: 'YOUR_APP_NAME',
  )
end
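If you would rather not commit the token to your repository, one option is to read the settings from environment variables. Here is a minimal sketch of that idea. The helper splunk_appender_options and the variable names SPLUNK_HEC_URL, SPLUNK_HEC_TOKEN and SPLUNK_APP_NAME are my own assumptions, not part of rails_semantic_logger.

```ruby
# Hypothetical helper: assemble the appender options from environment
# variables instead of hard-coding them. The variable names are assumptions.
def splunk_appender_options(env, level)
  {
    appender: :splunk_http,
    url: env.fetch('SPLUNK_HEC_URL'),     # raises KeyError if the variable is missing
    token: env.fetch('SPLUNK_HEC_TOKEN'), # raises KeyError if the variable is missing
    level: level,
    application: env.fetch('SPLUNK_APP_NAME', 'YOUR_APP_NAME')
  }
end

# In config/environments/production.rb this might be used as:
#   config.semantic_logger.add_appender(**splunk_appender_options(ENV, config.log_level))

# Demonstration with a plain Hash standing in for ENV.
demo_env = {
  'SPLUNK_HEC_URL' => 'https://http-inputs-YOUR_SPLUNKCLOUD_NAME.splunkcloud.com/services/collector/event',
  'SPLUNK_HEC_TOKEN' => 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX'
}
options = splunk_appender_options(demo_env, :info)
puts options[:application]
```

On Heroku, you would set the variables with heroku config:set before deploying. Using fetch without a default makes the app fail at boot if the token is missing, rather than silently logging nowhere.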

and run the commands git add config/environments/production.rb && git commit -m 'Send app logs to Splunk Cloud' && git push heroku master

That is all you need to get Rails running on Heroku fully integrated with your Splunk Cloud instance.

For other frameworks and programming languages, you can use a similar approach, but the way to configure the log client will differ.

Final Notes

For integrating Heroku and Splunk Cloud, use a two-pronged approach:

  1. For system level logs, use Heroku log drains and the Splunk Cloud HEC raw API
  2. For application level logs, use your programming language’s logging system and send directly to Splunk Cloud at a different HEC endpoint

Some of these features are only available in more recent versions of Splunk, so check with Splunk support to make sure this approach works for you.
