Scaling system for AWS Engine Workers

kermit 2 years ago
parent
commit
6fd784933f

+ 5 - 0
engine-scaler/.gitignore

@@ -0,0 +1,5 @@
+.bundle
+tmp
+vendor/ruby
+.env.production
+

+ 1 - 0
engine-scaler/.ruby-version

@@ -0,0 +1 @@
+2.2.2

+ 11 - 0
engine-scaler/.travis.yml

@@ -0,0 +1,11 @@
+rvm:
+- 2.2.2
+services:
+- redis-server
+deploy:
+  provider: heroku
+  api_key:
+    secure:
+  app: sidekiq-cloudwatch
+  on:
+    repo: dwilkie/sidekiq-cloudwatch

+ 18 - 0
engine-scaler/Gemfile

@@ -0,0 +1,18 @@
+source 'https://rubygems.org'
+ruby "2.2.2"
+
+gem 'rake'
+gem 'sidekiq'
+gem 'aws-sdk', '~> 2'
+
+group :development do
+  gem 'foreman'
+end
+
+group :test do
+  gem 'dotenv'
+  gem 'rspec'
+  gem 'rack'
+  gem 'webmock'
+  gem 'vcr'
+end

+ 70 - 0
engine-scaler/Gemfile.lock

@@ -0,0 +1,70 @@
+GEM
+  remote: https://rubygems.org/
+  specs:
+    addressable (2.3.8)
+    aws-sdk (2.1.4)
+      aws-sdk-resources (= 2.1.4)
+    aws-sdk-core (2.1.4)
+      jmespath (~> 1.0)
+    aws-sdk-resources (2.1.4)
+      aws-sdk-core (= 2.1.4)
+    celluloid (0.16.0)
+      timers (~> 4.0.0)
+    connection_pool (2.2.0)
+    crack (0.4.2)
+      safe_yaml (~> 1.0.0)
+    diff-lcs (1.2.5)
+    dotenv (2.0.2)
+    foreman (0.78.0)
+      thor (~> 0.19.1)
+    hitimes (1.2.2)
+    jmespath (1.0.2)
+      multi_json (~> 1.0)
+    json (1.8.3)
+    multi_json (1.11.2)
+    rack (1.6.4)
+    rake (10.4.2)
+    redis (3.2.1)
+    redis-namespace (1.5.2)
+      redis (~> 3.0, >= 3.0.4)
+    rspec (3.3.0)
+      rspec-core (~> 3.3.0)
+      rspec-expectations (~> 3.3.0)
+      rspec-mocks (~> 3.3.0)
+    rspec-core (3.3.1)
+      rspec-support (~> 3.3.0)
+    rspec-expectations (3.3.0)
+      diff-lcs (>= 1.2.0, < 2.0)
+      rspec-support (~> 3.3.0)
+    rspec-mocks (3.3.1)
+      diff-lcs (>= 1.2.0, < 2.0)
+      rspec-support (~> 3.3.0)
+    rspec-support (3.3.0)
+    safe_yaml (1.0.4)
+    sidekiq (3.4.1)
+      celluloid (~> 0.16.0)
+      connection_pool (>= 2.1.1)
+      json
+      redis (>= 3.0.6)
+      redis-namespace (>= 1.3.1)
+    thor (0.19.1)
+    timers (4.0.1)
+      hitimes
+    vcr (2.9.3)
+    webmock (1.21.0)
+      addressable (>= 2.3.6)
+      crack (>= 0.3.2)
+
+PLATFORMS
+  ruby
+
+DEPENDENCIES
+  aws-sdk (~> 2)
+  dotenv
+  foreman
+  rack
+  rake
+  rspec
+  sidekiq
+  vcr
+  webmock

+ 54 - 0
engine-scaler/README.md

@@ -0,0 +1,54 @@
+# Engine Scaler
+
+Publishes Sidekiq metrics to AWS CloudWatch so that an Elastic Beanstalk Sidekiq environment can be autoscaled based on queue size.
+
+## Configuration
+
+Copy `.env` to `.env.production` and fill in `.env.production` with the real configuration values
+
+## Testing in Development
+
+```
+bundle exec foreman run -e .env.production rake metrics:update
+```
+
+Check the AWS console to see your custom metric
+
+## Deployment
+
+### Heroku
+
+You can monitor your Sidekiq instances on AWS for free using Heroku Scheduler
+
+1. Fork the repo
+2. Push your fork to Heroku
+3. Setup your config vars with `heroku config:set`
+4. Use Heroku Scheduler to trigger `rake metrics:update` or `rake metrics:schedule_update` every 10 minutes
+5. Have a beer
+
+Notes:
+
+To get finer granularity on Heroku you can run a job every minute with Heroku Scheduler by creating 10 jobs spaced 1 minute apart. `POST` to scheduler using the following command:
+
+```
+curl 'https://scheduler.heroku.com/jobs' --data 'command=rake+metrics%3Aupdate&dyno_size=11&every=10&at=9'
+```
+
+`at` is the minute at which you want the job to start and takes the values `0` to `9`. You'll also need to send your cookie with the request. To get the full `curl` command I used Chrome Developer Tools and captured a request.
+
+Scheduling an update with `rake metrics:schedule_update` also works, but the worker process will sleep on a free Heroku dyno.
+
+## Autoscaling with Elastic Beanstalk
+
+Elastic Beanstalk doesn't let you configure alarms on custom metrics for autoscaling out of the box. However, you can manually change the Autoscaling Group to add custom Scaling Policies.
+
+1. Configure an Elastic Beanstalk Environment with Autoscaling
+2. Select CPU Utilization as your Trigger measurement with sensible defaults, e.g. Upper threshold: `90`, Lower threshold: `45`
+3. After the environment has been created it will create a new Autoscaling Group. You can find it in the AWS Web Console under `EC2 -> Autoscaling -> Autoscaling Groups`
+4. Note the name of the autoscaling group and configure it in the Heroku config. e.g. `heroku config:set AWS_CLOUDWATCH_DIMENSION_VALUE=<auto-scaling-group-name>`
+5. Set up two new CloudWatch alarms for your Sidekiq metric in the AWS Web Console under `CloudWatch -> Custom Metrics`. One should be for `sidekiq-worker-overload` and the other should be for `sidekiq-worker-underload`
+6. Back in the AWS Web Console under `EC2 -> Autoscaling -> Autoscaling Groups -> Scaling Policies` add two new policies. One for `aws-eb-autoscale-up` and the other for `aws-eb-autoscale-down`. Select the appropriate alarm created in step 5 for each policy.
+7. Remove the autogenerated scaling policies created by Elastic Beanstalk.
+8. Optionally remove the autogenerated alarms created by Elastic Beanstalk or configure them to send a notification.
+
+Note: if you ever rebuild or terminate your environment, Elastic Beanstalk will create a new autoscaling group and put its autogenerated CloudWatch alarms in it. You'll need to repeat steps 3-8 above.
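
The README's Heroku note describes creating ten Scheduler jobs, one per minute offset, each repeating every 10 minutes. As a rough sketch, the form bodies for those ten requests could be generated as below. `scheduler_payloads` is a hypothetical helper that is not part of this commit; the field names are copied from the `curl` example, and the scheduler endpoint is not a documented public API, so treat the whole thing as an assumption.

```ruby
require 'uri'

# Hypothetical helper (not part of this commit): builds the form-encoded
# body for each of the Heroku Scheduler jobs described in the README.
# Field names (command, dyno_size, every, at) come from the curl example.
def scheduler_payloads(command = "rake metrics:update", every = 10, dyno_size = 11)
  (0...every).map do |minute|
    # One job per minute offset, so together they fire once a minute.
    URI.encode_www_form(
      command: command,
      dyno_size: dyno_size,
      every: every,
      at: minute
    )
  end
end

scheduler_payloads.each { |body| puts body }
```

Each printed line could be passed to `curl --data`, along with the cookie captured from a browser session as the README describes.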

+ 5 - 0
engine-scaler/Rakefile

@@ -0,0 +1,5 @@
+#!/usr/bin/env rake
+# Add your own tasks in files placed in lib/tasks ending in .rake,
+# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
+
+Dir["#{File.expand_path('../lib/tasks', __FILE__)}/*.rake"].each {|file| import file }

+ 4 - 0
engine-scaler/config/sidekiq.yml

@@ -0,0 +1,4 @@
+---
+:concurrency: 1
+:queues:
+  - sidekiq_cloudwatch_metric_update_queue

+ 85 - 0
engine-scaler/lib/sidekiq/cloudwatch/client.rb

@@ -0,0 +1,85 @@
+module Sidekiq
+  module Cloudwatch
+    class Client
+      require 'aws-sdk'
+
+      DEFAULT_NAMESPACE = "Namespace"
+      DEFAULT_METRIC_NAME = "MetricName"
+      DEFAULT_DIMENSION_NAME = "AutoScalingGroupName"
+      DEFAULT_DIMENSION_VALUE = "AutoScalingGroupName"
+
+      METRIC_UNITS = {
+        nil => "None",
+        :count => "Count"
+      }
+
+      attr_accessor :metric
+
+      def initialize(metric = nil)
+        self.metric = metric
+      end
+
+      def put
+        client.put_metric_data(put_metric_data_payload)
+      end
+
+      private
+
+      def put_metric_data_payload
+        {:namespace => namespace, :metric_data => metric_data}
+      end
+
+      def metric_data
+        [
+          {
+            :metric_name => metric_name,
+            :dimensions => dimensions,
+            :timestamp => timestamp,
+            :value => value,
+            :unit => unit
+          }
+        ]
+      end
+
+      def namespace
+        ENV["AWS_CLOUDWATCH_NAMESPACE"] || DEFAULT_NAMESPACE
+      end
+
+      def metric_name
+        ENV["AWS_CLOUDWATCH_METRIC_NAME"] || DEFAULT_METRIC_NAME
+      end
+
+      def dimension_name
+        ENV["AWS_CLOUDWATCH_DIMENSION_NAME"] || DEFAULT_DIMENSION_NAME
+      end
+
+      def dimension_value
+        ENV["AWS_CLOUDWATCH_DIMENSION_VALUE"] || DEFAULT_DIMENSION_VALUE
+      end
+
+      def dimensions
+        (ENV["AWS_CLOUDWATCH_DIMENSIONS"] && JSON.parse(ENV["AWS_CLOUDWATCH_DIMENSIONS"])) || default_dimensions
+      end
+
+      def default_dimensions
+        [{:name => dimension_name, :value => dimension_value}]
+      end
+
+      def timestamp
+        Time.now
+      end
+
+      def value
+        metric.value
+      end
+
+      def unit
+        METRIC_UNITS[metric.unit]
+      end
+
+      def client
+        @client ||= ::Aws::CloudWatch::Client.new
+      end
+    end
+  end
+end
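
For review purposes, the payload that `Client#put` passes to `put_metric_data` can be sketched without the AWS SDK. `build_payload` is a hypothetical name, and the defaults mirror the constants in `client.rb`. One deliberate difference is labeled in the code: the sketch symbolizes the keys of `AWS_CLOUDWATCH_DIMENSIONS`, whereas the committed code calls `JSON.parse` without that option and so yields string keys on that branch.

```ruby
require 'json'

# Unit mapping copied from Sidekiq::Cloudwatch::Client::METRIC_UNITS.
METRIC_UNITS = { nil => "None", :count => "Count" }

# Hypothetical standalone version of Client#put_metric_data_payload.
# env is anything that responds to [] with string keys (ENV or a Hash).
def build_payload(value, unit, env = ENV, now = Time.now)
  dimensions =
    if env["AWS_CLOUDWATCH_DIMENSIONS"]
      # Assumption: symbolize keys so both branches return the same shape.
      JSON.parse(env["AWS_CLOUDWATCH_DIMENSIONS"], symbolize_names: true)
    else
      [{
        name: env["AWS_CLOUDWATCH_DIMENSION_NAME"] || "AutoScalingGroupName",
        value: env["AWS_CLOUDWATCH_DIMENSION_VALUE"] || "AutoScalingGroupName"
      }]
    end

  {
    namespace: env["AWS_CLOUDWATCH_NAMESPACE"] || "Namespace",
    metric_data: [{
      metric_name: env["AWS_CLOUDWATCH_METRIC_NAME"] || "MetricName",
      dimensions: dimensions,
      timestamp: now,
      value: value,
      unit: METRIC_UNITS[unit]
    }]
  }
end
```

Passing a plain `Hash` as `env` makes the defaults and overrides easy to check without touching the real environment.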

+ 31 - 0
engine-scaler/lib/sidekiq/cloudwatch/metric/base.rb

@@ -0,0 +1,31 @@
+require 'sidekiq'
+require 'sidekiq/api'
+
+module Sidekiq
+  module Cloudwatch
+    module Metric
+      class Base
+        DEFAULT_VALUE = "0.0"
+        DEFAULT_UNIT = nil
+
+        def self.descendants
+          ObjectSpace.each_object(Class).select { |klass| klass < self }
+        end
+
+        def value
+          DEFAULT_VALUE
+        end
+
+        def unit
+          DEFAULT_UNIT
+        end
+
+        private
+
+        def stats
+          @stats ||= ::Sidekiq::Stats.new
+        end
+      end
+    end
+  end
+end
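
`Base.descendants` is how `MetricUpdate#run!` discovers which metrics to publish. The sketch below reproduces the lookup with stand-in classes; `BaseMetric`, `QueueSizeMetric` and `RetryCountMetric` are illustrative names, not classes from this commit.

```ruby
# Stand-in for Sidekiq::Cloudwatch::Metric::Base.
class BaseMetric
  def self.descendants
    # Class#< is true only for strict subclasses, so BaseMetric itself
    # is never returned.
    ObjectSpace.each_object(Class).select { |klass| klass < self }
  end
end

class QueueSizeMetric < BaseMetric; end
class RetryCountMetric < BaseMetric; end # hypothetical second metric

puts BaseMetric.descendants.map(&:name).sort.inspect
```

Because the lookup walks every loaded class, a metric is only picked up once its file has been required, which is presumably why `metric_update.rb` requires everything under `metric/`.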

+ 19 - 0
engine-scaler/lib/sidekiq/cloudwatch/metric/queue_size.rb

@@ -0,0 +1,19 @@
+require_relative 'base'
+
+module Sidekiq
+  module Cloudwatch
+    module Metric
+      class QueueSize < ::Sidekiq::Cloudwatch::Metric::Base
+        UNIT = :count
+
+        def value
+          stats.queues.values.inject(0, :+)
+        end
+
+        def unit
+          UNIT
+        end
+      end
+    end
+  end
+end
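
`QueueSize#value` sums the sizes of all queues reported by `Sidekiq::Stats#queues`. A minimal sketch of that reduction, using a plain hash with made-up queue names and sizes in place of the real stats object:

```ruby
# The hash stands in for Sidekiq::Stats#queues, which maps queue names
# to their current sizes. total_queue_size is a hypothetical name.
def total_queue_size(queues)
  # Starting from 0 makes the result 0 (not nil) when no queues exist.
  queues.values.inject(0, :+)
end

queues = { "default" => 12, "mailers" => 3, "low" => 0 }
puts total_queue_size(queues)
```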

+ 40 - 0
engine-scaler/lib/sidekiq/cloudwatch/metric_update.rb

@@ -0,0 +1,40 @@
+module Sidekiq
+  module Cloudwatch
+    class MetricUpdate
+      require_relative 'client'
+      require_relative 'workers/metric_update_worker'
+
+      Dir[File.dirname(__FILE__) + "/metric/**/*.rb"].each { |f| require f }
+
+      DEFAULT_METRIC_UPDATE_FREQUENCY_SECONDS = 60
+      DEFAULT_METRIC_UPDATE_SCHEDULER_FREQUENCY_SECONDS = 600
+
+      def schedule!
+        (metric_update_scheduler_frequency_seconds / metric_update_frequency_seconds).times do |schedule|
+          ::Sidekiq::Cloudwatch::MetricUpdateWorker.perform_at(Time.now + (schedule * metric_update_frequency_seconds))
+        end
+      end
+
+      def run!
+        ::Sidekiq::Cloudwatch::Metric::Base.descendants.each do |sidekiq_metric_class|
+          client.metric = sidekiq_metric_class.new
+          client.put
+        end
+      end
+
+      private
+
+      def metric_update_frequency_seconds
+        (ENV["SIDEKIQ_CLOUDWATCH_METRIC_UPDATE_FREQUENCY_SECONDS"] || DEFAULT_METRIC_UPDATE_FREQUENCY_SECONDS).to_i
+      end
+
+      def metric_update_scheduler_frequency_seconds
+        (ENV["SIDEKIQ_CLOUDWATCH_METRIC_UPDATE_SCHEDULER_FREQUENCY_SECONDS"] || DEFAULT_METRIC_UPDATE_SCHEDULER_FREQUENCY_SECONDS).to_i
+      end
+
+      def client
+        @client ||= ::Sidekiq::Cloudwatch::Client.new
+      end
+    end
+  end
+end
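
The arithmetic in `schedule!` can be checked in isolation. The sketch below uses a hypothetical helper, `schedule_offsets`, to compute the delay in seconds of each enqueued job; its default arguments are the two `DEFAULT_*` constants from this file.

```ruby
# Hypothetical helper mirroring the loop in MetricUpdate#schedule!.
# Returns the offset from "now", in seconds, of each scheduled job.
def schedule_offsets(scheduler_frequency = 600, update_frequency = 60)
  # Integer division: any remainder of the scheduler window is dropped.
  (scheduler_frequency / update_frequency).times.map do |slot|
    slot * update_frequency
  end
end

puts schedule_offsets.inspect
```

With the defaults this gives ten jobs, one per minute. If the update frequency does not divide the scheduler frequency evenly, for example 600 and 250, only two jobs are scheduled (at 0 and 250 seconds), and the rest of the window goes uncovered until the next scheduler run.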

+ 15 - 0
engine-scaler/lib/sidekiq/cloudwatch/workers/metric_update_worker.rb

@@ -0,0 +1,15 @@
+module Sidekiq
+  module Cloudwatch
+    class MetricUpdateWorker
+      require 'sidekiq'
+      require_relative '../metric_update'
+
+      include ::Sidekiq::Worker
+      sidekiq_options(:queue => :sidekiq_cloudwatch_metric_update_queue, :retry => false)
+
+      def perform
+        ::Sidekiq::Cloudwatch::MetricUpdate.new.run!
+      end
+    end
+  end
+end

+ 14 - 0
engine-scaler/lib/tasks/metrics.rake

@@ -0,0 +1,14 @@
+namespace :metrics do
+  require './lib/sidekiq/cloudwatch/metric_update'
+
+  desc "Updates the metrics immediately"
+  task :update do
+    require './lib/sidekiq/cloudwatch/metric_update'
+    Sidekiq::Cloudwatch::MetricUpdate.new.run!
+  end
+
+  desc "Schedules the metrics to be updated"
+  task :schedule_update do
+    Sidekiq::Cloudwatch::MetricUpdate.new.schedule!
+  end
+end

+ 8 - 0
engine-scaler/lib/tasks/rspec.rake

@@ -0,0 +1,8 @@
+unless ENV["RACK_ENV"] == "production"
+  require 'rspec/core/rake_task'
+
+  desc "Run all examples"
+  RSpec::Core::RakeTask.new(:spec)
+
+  task :default => [:spec]
+end

+ 49 - 0
engine-scaler/spec/fixtures/vcr_cassettes/aws_cloudwatch_put_metric_data.yml

@@ -0,0 +1,49 @@
+---
+http_interactions:
+- request:
+    method: post
+    uri: https://monitoring.us-east-1.amazonaws.com/
+    body:
+      encoding: UTF-8
+      string: Action=PutMetricData&MetricData.member.1.Dimensions.member.1.Name=AutoScalingGroupName&MetricData.member.1.Dimensions.member.1.Value=AutoScalingGroupName&MetricData.member.1.MetricName=MetricName&MetricData.member.1.Timestamp=2015-07-11T10%3A43%3A29Z&MetricData.member.1.Unit=None&MetricData.member.1.Value=0.0&Namespace=Namespace&Version=2010-08-01
+    headers:
+      Content-Type:
+      - application/x-www-form-urlencoded; charset=utf-8
+      Accept-Encoding:
+      - ''
+      User-Agent:
+      - aws-sdk-ruby2/2.1.4 ruby/2.2.2 x86_64-linux
+      X-Amz-Date:
+      - 20150711T104329Z
+      Host:
+      - monitoring.us-east-1.amazonaws.com
+      X-Amz-Content-Sha256:
+      - sha
+      Content-Length:
+      - '349'
+      Accept:
+      - "*/*"
+  response:
+    status:
+      code: 200
+      message: OK
+    headers:
+      X-Amzn-Requestid:
+      - abc251a7-13b8-99e5-7654-7a64577f4317
+      Content-Type:
+      - text/xml
+      Content-Length:
+      - '212'
+      Date:
+      - Sat, 11 Jul 2015 10:43:30 GMT
+    body:
+      encoding: UTF-8
+      string: |
+        <PutMetricDataResponse xmlns="http://monitoring.amazonaws.com/doc/2010-08-01/">
+          <ResponseMetadata>
+            <RequestId>abc251a7-13b8-99e5-7654-7a64577f4317</RequestId>
+          </ResponseMetadata>
+        </PutMetricDataResponse>
+    http_version:
+  recorded_at: Sat, 11 Jul 2015 10:43:30 GMT
+recorded_with: VCR 2.9.3

+ 59 - 0
engine-scaler/spec/sidekiq/cloudwatch/client_spec.rb

@@ -0,0 +1,59 @@
+require 'spec_helper'
+require './lib/sidekiq/cloudwatch/client'
+require './lib/sidekiq/cloudwatch/metric/base'
+require 'rack/utils'
+
+describe Sidekiq::Cloudwatch::Client do
+  let(:metric_class) { Sidekiq::Cloudwatch::Metric::Base }
+  let(:metric) { metric_class.new }
+
+  subject { described_class.new(metric) }
+
+  describe "#put" do
+    let(:request_body) { Rack::Utils.parse_query(WebMock.requests.last.body) }
+
+    before do
+      VCR.use_cassette(:aws_cloudwatch_put_metric_data) { subject.put }
+    end
+
+    it "should send put_metric_data request to AWS" do
+      expect(request_body["Action"]).to(
+        eq("PutMetricData")
+      )
+
+      expect(request_body["Namespace"]).to(
+        eq(described_class::DEFAULT_NAMESPACE)
+      )
+
+      expect(request_body["MetricData.member.1.MetricName"]).to(
+        eq(described_class::DEFAULT_METRIC_NAME)
+      )
+
+      expect(request_body["MetricData.member.1.Dimensions.member.1.Name"]).to(
+        eq(described_class::DEFAULT_DIMENSION_NAME)
+      )
+
+      expect(request_body["MetricData.member.1.Dimensions.member.1.Value"]).to(
+        eq(described_class::DEFAULT_DIMENSION_VALUE)
+      )
+
+      expect(request_body["MetricData.member.1.Dimensions.member.1.Name"]).to(
+        eq(described_class::DEFAULT_DIMENSION_NAME)
+      )
+
+      expect(request_body["MetricData.member.1.Timestamp"]).not_to be_empty
+
+      expect(request_body["MetricData.member.1.Value"]).to(
+        eq(metric_class::DEFAULT_VALUE)
+      )
+
+      expect(request_body["MetricData.member.1.Unit"]).to(
+        eq(described_class::METRIC_UNITS[metric_class::DEFAULT_UNIT])
+      )
+
+      expect(request_body["MetricData.member.1.Dimensions.member.1.Name"]).to(
+        eq(described_class::DEFAULT_DIMENSION_NAME)
+      )
+    end
+  end
+end

+ 12 - 0
engine-scaler/spec/sidekiq/cloudwatch/metric/queue_size_spec.rb

@@ -0,0 +1,12 @@
+require 'spec_helper'
+require './lib/sidekiq/cloudwatch/metric/queue_size'
+
+describe Sidekiq::Cloudwatch::Metric::QueueSize do
+  describe "#value" do
+    it { expect(subject.value).to be_a(Integer) }
+  end
+
+  describe "#unit" do
+    it { expect(subject.unit).to eq(described_class::UNIT) }
+  end
+end

+ 32 - 0
engine-scaler/spec/sidekiq/cloudwatch/metric_update_spec.rb

@@ -0,0 +1,32 @@
+require 'spec_helper'
+require 'sidekiq/testing'
+require './lib/sidekiq/cloudwatch/metric_update'
+
+describe Sidekiq::Cloudwatch::MetricUpdate do
+  describe "#run!" do
+    let(:request) { WebMock.requests }
+
+    before do
+      VCR.use_cassette(:aws_cloudwatch_put_metric_data) do
+        subject.run!
+      end
+    end
+
+    it { expect(WebMock.requests.size).to eq(Sidekiq::Cloudwatch::Metric::Base.descendants.count) }
+  end
+
+  describe "#schedule!" do
+    let(:scheduled_jobs) { ::Sidekiq::Cloudwatch::MetricUpdateWorker.jobs }
+    let(:scheduled_job) { scheduled_jobs.last }
+
+    before do
+      Sidekiq::Testing.fake!
+      Sidekiq::Worker.clear_all
+      subject.schedule!
+    end
+
+    it { expect(scheduled_jobs.size).to eq(described_class::DEFAULT_METRIC_UPDATE_SCHEDULER_FREQUENCY_SECONDS / described_class::DEFAULT_METRIC_UPDATE_FREQUENCY_SECONDS) }
+    it { expect(scheduled_job["at"]).not_to eq(nil) }
+    it { expect(scheduled_job["class"]).to eq("Sidekiq::Cloudwatch::MetricUpdateWorker") }
+  end
+end

+ 22 - 0
engine-scaler/spec/sidekiq/cloudwatch/workers/metric_update_worker_spec.rb

@@ -0,0 +1,22 @@
+require 'spec_helper'
+require './lib/sidekiq/cloudwatch/workers/metric_update_worker'
+
+describe Sidekiq::Cloudwatch::MetricUpdateWorker do
+  describe ".sidekiq_options" do
+    let(:sidekiq_options) { described_class.sidekiq_options }
+
+    it { expect(sidekiq_options["retry"]).to eq(false) }
+    it { expect(sidekiq_options["queue"]).to eq(:sidekiq_cloudwatch_metric_update_queue) }
+  end
+
+  describe "#perform" do
+    let(:metric_update) { double(Sidekiq::Cloudwatch::MetricUpdate) }
+
+    before do
+      allow(Sidekiq::Cloudwatch::MetricUpdate).to receive(:new).and_return(metric_update)
+      allow(metric_update).to receive(:run!)
+    end
+
+    it { expect(metric_update).to receive(:run!); subject.perform }
+  end
+end

+ 3 - 0
engine-scaler/spec/spec_helper.rb

@@ -0,0 +1,3 @@
+RSpec.configure do |config|
+  Dir[File.dirname(__FILE__) + "/support/**/*.rb"].each {|f| require f}
+end

+ 2 - 0
engine-scaler/spec/support/dotenv.rb

@@ -0,0 +1,2 @@
+require 'dotenv'
+Dotenv.load

+ 6 - 0
engine-scaler/spec/support/vcr.rb

@@ -0,0 +1,6 @@
+require 'vcr'
+
+VCR.configure do |c|
+  c.cassette_library_dir = File.expand_path(File.join(File.dirname(__FILE__), "..", "fixtures/vcr_cassettes"))
+  c.hook_into :webmock
+end

+ 28 - 0
engine-scaler/spec/support/web_mock.rb

@@ -0,0 +1,28 @@
+require 'webmock/rspec'
+WebMock.disable_net_connect!
+
+module LastRequest
+  def clear_requests!
+    @requests = nil
+  end
+
+  def requests
+    @requests ||= []
+  end
+
+  def last_request=(request_signature)
+    requests << request_signature
+    request_signature
+  end
+end
+
+WebMock.extend(LastRequest)
+WebMock.after_request do |request_signature, response|
+  WebMock.last_request = request_signature
+end
+
+RSpec.configure do |config|
+  config.before do
+    WebMock.clear_requests!
+  end
+end