This blog post is part of the Mixmax 2016 Advent Calendar. The previous post on December 8th was about the Gmail PubSub API.

Tl;dr - we run our node apps so that we can profile their CPU usage on the fly.

Have you ever watched the CPU usage of your Node.js process spike on your servers? Did it feel like you were helpless to wonder why? One way to figure out what your process is doing is to use a flamegraph: a visualization of the kinds of work it's doing (think stack traces) and of how long each component of that work takes (in terms of CPU usage).

There is already a bunch of amazing material on how to build flamegraphs. However, each example only covers profiling a process locally, once the user has SSH'd into a server - there isn't much tooling for profiling servers remotely. Today we're going to give you a pair of scripts that let you do just that, provided you're running on AWS. These scripts also handle some advanced deployment scenarios and mitigate some bugs reported by other material on flamegraphs.

The first issue we faced when building these scripts was having too many processes to profile: we use Node clustering to get the most out of our instances, but there are very few examples of profiling multiple Node processes at the same time in order to build flamegraphs. To compound the issue, the JIT symbol map required to build flamegraphs grows endlessly.

## The solution

As it turns out, you can pass multiple process IDs to the `-p` flag of `perf record` as long as they are comma-separated. To mitigate the symbol map growth issue, we simply run all of our node processes with the `--perf_basic_prof_only_functions` flag. It does mean that we get some unknown entries in our flamegraphs, but it also ensures that we won't run out of memory on our servers any time soon #tradeoffs.

To simplify creating these flamegraphs, we wrote a few bash scripts that build the flamegraphs remotely on the server of interest and then extract them back to the engineer's machine. The first script builds the second, embedding how long we'd like to profile for and whether we'd like to profile all node processes on the box or only a single one. That script looks like:

```bash
#!/bin/bash
# See: for details.
#
# profile.sh $INSTANCE_ID
#
# Where:
# - $INSTANCE_ID is the EC2 identifier of the instance to profile the node
#   processes on.
# - $PID is a specific process ID to profile; by default we profile all `node`
#   processes on the remote instance.
# - $DURATION is an optional duration specified in seconds (defaults to 30).
#
# This script currently assumes that you are profiling EC2 instances inside
# a private VPC and that you have set up a bastion tier with SSH forwarding
# appropriately. If this is not the case, simply change
# s/PrivateDnsName/PublicDnsName/.
# It is also assumed that you have set up your AWS credentials so that the
# `aws` cli functions correctly.

echo "building perf script."
if [ -n "$PID" ]; then
  PID_MATCHER=$PID
else
  PID_MATCHER='pgrep node | xargs | sed -e "s/ /,/g"'
fi
sed -e "s/DURATION/$DURATION/g" perf.sh >
sed -i -e "s/PID_MATCHER/$PID_MATCHER/g"
IP=$(aws ec2 describe-instances --instance-ids $INSTANCE --query "" --output text)

# Collect a sample from the remote node
echo "running the debug script on the remote node: $INSTANCE"
```

## How big is the fire?

Coincidentally enough, the night before I sat down to write this blog post we had an incident in one of our services.
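The `PID_MATCHER` one-liner used by the scripts is worth unpacking. A minimal, self-contained sketch of the same space-to-comma join, with a fixed `printf` standing in for `pgrep node` so it runs anywhere:

```shell
# Build the comma-separated PID list that `perf record -p` expects.
# `pgrep node` prints one PID per line; `xargs` flattens them onto a
# single space-separated line, and `sed` swaps the spaces for commas.
# The fixed printf input below stands in for `pgrep node` on a real box.
printf '1234\n5678\n9012\n' | xargs | sed -e "s/ /,/g"
# → 1234,5678,9012
```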
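Putting the comma-separated list and the symbol-map flag together: a hedged sketch of the sampling step, assuming `perf` is installed on the box. The 99 Hz sampling frequency and the `server.js` entry point are illustrative assumptions, not taken from the original scripts:

```shell
# Assumed setup: started this way, V8 emits only function entries into
# /tmp/perf-<pid>.map, keeping the JIT symbol map from growing endlessly.
node --perf_basic_prof_only_functions server.js &

# Sample every node process on the box for $DURATION seconds.
DURATION=30
PIDS=$(pgrep node | xargs | sed -e "s/ /,/g")
sudo perf record -F 99 -g -p "$PIDS" -- sleep "$DURATION"
```

`-g` records call graphs (required for stack-based flamegraphs), and `-- sleep "$DURATION"` bounds how long `perf` samples.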
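The `--query` string of the `aws ec2 describe-instances` call did not survive extraction, so here is a hedged guess at an equivalent lookup. The JMESPath expression is an assumption, as is the instance ID; per the script's own comments, swap `PrivateDnsName` for `PublicDnsName` if your instances are publicly reachable:

```shell
# Hypothetical JMESPath query: pull the first instance's private DNS name.
# Swap PrivateDnsName for PublicDnsName if you are not behind a bastion.
INSTANCE=i-0123456789abcdef0   # illustrative instance ID
IP=$(aws ec2 describe-instances \
  --instance-ids "$INSTANCE" \
  --query "Reservations[0].Instances[0].PrivateDnsName" \
  --output text)
echo "$IP"
```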