Greetings, world. It’s been a while, hasn’t it? It’s not that I haven’t had interesting technical problems to blog about in the past few years; my previous job just kept me so busy that I didn’t have the time or energy to write up the problems I solved. But last December I started a new job, and I now have the time I need to really reflect and write up the things I’m learning.
(Expect a lot of Terraform going forward 🙂 )
Today, let’s look at a practical problem I’m having. Warning: this will include several dead ends I ran into along the way, and problems I didn’t have the technical knowledge to solve. If you came here from Google and just want the solution, you’ll want the section on Logstash, as that’s my preferred solution.
The ticket: write a POC for shipping Jenkins build logs to “somewhere searchable”. It was trivial to use the Logstash plugin to send them to an ELK stack in AWS, but one of the options on the table was New Relic Logging, which proved to be more challenging.
New Relic provides the following methods for log ingestion as of this writing:
– Forward your logs using our Infrastructure agent or our Kubernetes plugin.
– Use our plugins for well-known log forwarders, like Fluentd, Fluent Bit, and Logstash.
– Stream or ship your logs from Amazon using AWS Lambda or Kinesis.
– Send your logs data using the Logs API.
So let’s start at the top and work our way down.
Our Jenkins setup is on an EC2 instance, so it’s simple as pie to follow the included New Relic instructions to set up an infrastructure agent on the box. Installation took about a minute, and I was able to hook it up to the file /var/log/jenkins/jenkins.log, which is listed as the Jenkins master log file. Problem solved, right?
You can see the contents of this file in Jenkins: from the dashboard, click Manage Jenkins, then click System Log. And there you’ll find what I found: this log is supremely unhelpful. Well, I take that back. It’s useful for diagnosing problems with Jenkins itself, and it does list things like errors checking git repositories for changes, but the build logs it ain’t. This might be useful to have in New Relic, sure, but it’s not what I was looking for: the build logs, so we can determine what caused build failures.
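For contrast, the per-build console output is something Jenkins serves separately for each build, at that build's consoleText URL. Here is a minimal sketch of building that URL, mostly to show where the build logs actually live. The host jenkins.example.com and the job names are hypothetical, and the function name is mine.

```python
from urllib.parse import quote


def console_text_url(base_url, job_path, build="lastBuild"):
    """Build the URL of one build's plain-text console log.

    job_path uses "/" to separate folders, e.g. "team/my-job".
    build is a build number or a permalink such as "lastBuild".
    """
    # Each folder or job name becomes its own "job/<name>" path segment.
    segments = [quote(part, safe="") for part in job_path.strip("/").split("/") if part]
    job_part = "/".join(f"job/{segment}" for segment in segments)
    return f"{base_url.rstrip('/')}/{job_part}/{build}/consoleText"


if __name__ == "__main__":
    # Hypothetical host and job, for illustration only.
    print(console_text_url("https://jenkins.example.com", "team/my-job", "42"))
```

Fetching that URL (with whatever authentication your Jenkins requires) returns the same text you see on the build's Console Output page, which is the data this POC is really after.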
(To uninstall the infrastructure agent, by the way, you can run sudo yum remove newrelic-infra, or whatever the equivalent is for your package manager.)
So, on to the next idea, and the one that seemed most likely to work. Jenkins has plugins for Fluentd and Logstash (I’d never heard of Fluent Bit before today). I used the Logstash plugin to push logs to the ELK stack in that POC, so that seemed like a logical place to start.
For New Relic support, there is a plugin for Logstash. Note that we’re talking about two kinds of plugin here: the plugin for Jenkins called “jenkins logstash plugin”, and the plugin for Logstash itself called “newrelic logstash plugin”. These two things are not interchangeable and are installed into different programs: the first into Jenkins, the second into Logstash.
The name of the Logstash plugin for Jenkins is a bit of a misnomer; it started with only Logstash support, but it can also talk to Elasticsearch directly. What this means is that you can point it straight at an AWS Elasticsearch instance, set the plugin to Elasticsearch mode, and you’re good to go. However, Logstash itself is a separate thing: a program that sits in front of Elasticsearch and preprocesses the data coming into the system. Logstash is the ‘L’ in ELK stack, as it plays well with both Elasticsearch and Kibana. But when you create an Elasticsearch domain on AWS, you only get an ‘EK’ stack; you have to run Logstash separately.
In order to integrate New Relic with Logstash, you have to run Logstash somewhere, and AWS will not manage that for you. This means one of two less-than-ideal options: you can run Logstash on its own EC2 instance, incurring the costs associated with that, or you can run Logstash on the Jenkins instance. I would opt for the latter, as this is the only place we’d be using Logstash in our infrastructure, but you can make the decision that fits your own infrastructure. Once you have Logstash running, you need to install the New Relic plugin into Logstash, then set up the Logstash plugin in Jenkins to forward to Logstash, which then forwards to New Relic.
The other option is Fluentd. We are already using Fluentd in our container environment as a sidecar to our ECS instances, and using that to forward our production logs to New Relic. So this is the option I’d prefer to use for our ecosystem. However, this comes with its own challenges.
In Jenkins, the Logstash plugin was simple to set up. Furthermore, it contained a checkbox that allowed all build logs to be automatically sent through the plugin to Logstash or Elasticsearch. The Fluentd plugin, by contrast, requires every build to be edited to include a “send log through Fluentd” step. This means we would have a staggered rollout, and it’s all too likely that a build would be missed or set up incorrectly in the future, so that its logs are never shipped to New Relic. Obviously this is far from ideal.
Other than that, Fluentd is basically identical to Logstash in broad concept. You will need to run the Fluentd software somewhere: an EC2 instance, on the Jenkins node, et cetera. It’s easy enough to attach a sidecar to an ECS container, but you’ll need to keep the agent up to date wherever it’s installed for Jenkins to use. Once it is running, you will have to install the New Relic plugin and configure it to send the logs to New Relic.
At the end of the day, for this application I don’t see a real advantage to using Fluentd over Logstash. The Jenkins plugin is clunkier, and you still have to maintain the software on the Jenkins node.
A note about Fluent Bit: I’ve never used it, but from a quick Google search it claims to be less processor-intensive than either of the other two options. So it may be a good fit for installing alongside Jenkins. It works in basically the same way: run the agent somewhere, install the New Relic plugin, and set up Jenkins to send to it. I don’t see a Jenkins plugin for it, however, so I ruled it out of my search for now.
The Logs API
I’ll skip Kinesis and Lambda for now, as we don’t have Jenkins hooked up to either of those systems to try and ingest the logs. I do see a plugin to hook Jenkins up to Kinesis, but it doesn’t look, ah… quite production-ready. So that leaves one other interesting option: the Logs API.
Jenkins has a plugin that allows one to write the console log to disk at the end of the build as a build step. Couple this with a script task and you can then use curl or wget to upload the file to the Logs API.
Or so I thought. As it turns out, the Logs API is set up to take one log message at a time, rather than a whole file to be chopped up. I expect that’s why they suggest using a log parser like Logstash in the first place: it can do the slicing and dicing, then call the Logs API with each message in the log.
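To make the slicing and dicing concrete, here is a rough sketch of what a hand-rolled version might look like: split the console log into one JSON message per non-blank line, then POST each message on its own. This is not what Logstash does internally, just an illustration of the step a log parser takes care of. The endpoint URL and the Api-Key header are my assumptions about the Logs API and should be checked against New Relic's documentation; the function names and the "job" attribute are made up for the example.

```python
import json
import time
import urllib.request

# Assumption: US-region Logs API endpoint; verify against New Relic's docs.
LOG_API_URL = "https://log-api.newrelic.com/log/v1"


def split_console_log(text, attributes=None, now_ms=None):
    """Turn a whole console log into one message dict per non-blank line."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    messages = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # skip blank lines rather than shipping empty messages
        # Extra attributes go first so they cannot overwrite the core fields.
        messages.append({**(attributes or {}), "timestamp": now_ms, "message": line})
    return messages


def post_message(message, api_key, url=LOG_API_URL):
    """POST a single message as a JSON body; returns the HTTP status code.

    Assumption: the key is passed in an "Api-Key" header.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json", "Api-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Note that every line gets the same timestamp here, because the console log written at the end of a build doesn't carry per-line times unless you've enabled a timestamper. That loss of detail is one more reason to let a real log shipper handle this.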
The Bottom Line
At the end of the day, you’re going to need something to process the logs from Jenkins. I recommend installing Logstash on the same box as the Jenkins server and using the Logstash plugin in Jenkins to forward to localhost on the correct port. That way you get the benefit of an easier-to-use plugin, and an industry-standard log parsing engine to process the logs and forward them on to New Relic. But if there’s some reason to use Fluentd instead, that’d be my second choice, and honestly it’s not that much worse than Logstash.
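If you go the localhost route, it can help to check that Logstash is actually listening before you point Jenkins at it. Below is a small smoke-test sketch. It assumes you have configured Logstash with a TCP input that reads newline-delimited JSON, and the port 5000 is purely a placeholder for whatever you chose; neither is something the plugins set up for you.

```python
import json
import socket


def format_event(message, **fields):
    """Encode one event as a newline-terminated JSON line."""
    event = {"message": message, **fields}
    return (json.dumps(event) + "\n").encode("utf-8")


def send_event(payload, host="127.0.0.1", port=5000, timeout=5.0):
    """Write an already-encoded event to a TCP listener.

    Assumption: something (e.g. a Logstash TCP input) is listening on
    host:port; the default port here is a hypothetical placeholder.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


if __name__ == "__main__":
    # Raises ConnectionRefusedError if nothing is listening on the port.
    send_event(format_event("smoke test from the Jenkins box", source="manual"))
```

If the event shows up in New Relic, both hops work (script to Logstash, Logstash to New Relic), and any remaining problem is on the Jenkins plugin side.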