AWS indicates that when a function is executed, some things are left over between runs. The execution environment, which AWS describes as a container model, contains everything necessary to run the function. After a function is invoked for the first time, Lambda “freezes” this environment so it can quickly be thawed for the next call, which improves the performance of subsequent runs by skipping the initialization step of getting your code into the execution environment. This container reuse, however, can be a benefit or a downside depending on how you write your function. One place it shows up is the memory usage of an execution environment.

Say you have the following code:

'use strict';

// Declared outside the handler: these persist for as long as the container is reused.
var arrayA = [];
var arrayB = [];

module.exports.test = (event, context, callback) => {
  // Declared inside the handler: these are re-created on every invocation.
  var arrayC = [];
  var arrayD = [];

  arrayA.push("A");
  arrayB.push("B");
  arrayC.push("C");
  arrayD.push("D");

  console.log("Array A:", arrayA);
  console.log("Array B:", arrayB);
  console.log("Array C:", arrayC);
  console.log("Array D:", arrayD);

  arrayA = [];
  arrayC = [];

  callback(null);
};

We initialize two empty arrays, arrayA and arrayB, outside of the handler; these remain in memory when the function is frozen. Inside of the handler we’ve got two more empty arrays, arrayC and arrayD. During execution we push a string onto each of the four arrays, and for testing we then clear arrayA and arrayC. Let’s look at what this does when AWS invokes this function three times:

serverless-freezing-demonstration % serverless invoke -f test -s dev -l
null
--------------------------------------------------------------------
START RequestId: 778bcf95-a285-11e6-aa5a Version: $LATEST
2016-11-04 07:54:40.248	778bcf95-a285-11e6-aa5a	Array A: [ 'A' ]
2016-11-04 07:54:40.284	778bcf95-a285-11e6-aa5a	Array B: [ 'B' ]
2016-11-04 07:54:40.284	778bcf95-a285-11e6-aa5a	Array C: [ 'C' ]
2016-11-04 07:54:40.284	778bcf95-a285-11e6-aa5a	Array D: [ 'D' ]
END RequestId: 778bcf95-a285-11e6-aa5a
REPORT RequestId: 778bcf95-a285-11e6-aa5a	Duration: 37.54 ms	Billed Duration: 100 ms 	Memory Size: 128 MB	Max Memory Used: 14 MB

serverless-freezing-demonstration % serverless invoke -f test -s dev -l
null
--------------------------------------------------------------------
START RequestId: 7b760204-a285-11e6-bf4a Version: $LATEST
2016-11-04 07:54:46.714 7b760204-a285-11e6-bf4a	Array A: [ 'A' ]
2016-11-04 07:54:46.714 7b760204-a285-11e6-bf4a	Array B: [ 'B', 'B' ]
2016-11-04 07:54:46.714 7b760204-a285-11e6-bf4a	Array C: [ 'C' ]
2016-11-04 07:54:46.714 7b760204-a285-11e6-bf4a	Array D: [ 'D' ]
END RequestId: 7b760204-a285-11e6-bf4a
REPORT RequestId: 7b760204-a285-11e6-bf4a	Duration: 0.68 ms	Billed Duration: 100 ms 	Memory Size: 128 MB	Max Memory Used: 14 MB

serverless-freezing-demonstration % serverless invoke -f test -s dev -l
null
--------------------------------------------------------------------
START RequestId: 7ecbbbed-a285-11e6-99ff Version: $LATEST
2016-11-04 07:54:52.288 7ecbbbed-a285-11e6-99ff	Array A: [ 'A' ]
2016-11-04 07:54:52.289 7ecbbbed-a285-11e6-99ff	Array B: [ 'B', 'B', 'B' ]
2016-11-04 07:54:52.289 7ecbbbed-a285-11e6-99ff	Array C: [ 'C' ]
2016-11-04 07:54:52.289 7ecbbbed-a285-11e6-99ff	Array D: [ 'D' ]
END RequestId: 7ecbbbed-a285-11e6-99ff
REPORT RequestId: 7ecbbbed-a285-11e6-99ff	Duration: 0.52 ms	Billed Duration: 100 ms 	Memory Size: 128 MB	Max Memory Used: 14 MB

You’ll notice that in the first run, one item has been pushed to all four arrays. In the second run, arrayB has a second item, and in the third run, a third item is added to arrayB. arrayA doesn’t exhibit this behavior because it is reinitialized to an empty array before the handler returns. arrayC and arrayD are re-created on every run because they are declared inside the handler.

With this small demo, we can see both the value and the potential headaches of container reuse when a function is written without it in mind. It can be a great strategy when we need to do something expensive, such as setting up a database connection: the connection is created outside the handler, and inside the handler we can reuse that object without going through the heavy lifting every single time.
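
Here’s a rough sketch of that pattern. The database client below is hypothetical and only stands in for whatever connection-oriented library you actually use:

'use strict';

// Hypothetical client library, used only to illustrate the pattern.
var db = require('some-db-client');

// Lives outside the handler, so it survives for as long as the container does.
var connection = null;

module.exports.handler = (event, context, callback) => {
  // Cold start: pay the connection cost once. Warm invocations reuse it.
  if (!connection) {
    connection = db.connect(process.env.DB_URL);
  }

  connection.query('SELECT 1', (err, result) => {
    callback(err, result);
  });
};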

To avoid leaking memory between invocations, we want to keep as much per-request data as possible inside the handler.

Openshift version 3, or Openshift Origin, has been a fantastic Platform as a Service to play with. The team I work on has come up with an excellent procedure for deploying applications that isn’t obviously documented, but is certainly legit.

Openshift has a model that they’ve been concentrating on where, as a developer, you ship it the location of your code and it’ll build a working docker image for you. Whether it’s just your application code or a repo with a Dockerfile inside, both are good options, and as long as openshift has an S2I (source-to-image) builder for your code, you’re good to go.
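
For reference, that model is what you get when you point openshift at a repository and let it do the build. The URL and names here are just placeholders:

# Source-to-image build: openshift picks a builder image and produces a
# deployable docker image from the code in the repo.
oc new-app https://github.com/your-org/your-app.git

# Or, if the repo carries its own Dockerfile, use the docker build strategy.
oc new-app https://github.com/your-org/your-app.git --strategy=docker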

However, at our organization we’ve already got docker in some form of production state. We also have a plethora of Jenkins boxes that run various things to spit out the artifacts that run our applications. Adding another build layer is unnecessary in our use case and would probably cause more headaches at this moment in time. So we’ve settled on a model where a service team creates a usable docker image, and we help throw it onto openshift.

The strategy is essentially the following:

  1. A developer pushes their code
  2. Our enormous Jenkins infrastructure kicks off the various pipelines (for testing and such)
  3. An artifact is created and stored
  4. A docker build job is completed
  5. A push job shoves that docker image into the openshift internal registry (see the sketch after this list)
  6. Which in turn triggers a deploy
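
The push job itself is nothing exotic. Roughly, it looks like the following; the registry hostname, project, image name, and credentials are placeholders for whatever your environment uses:

# Placeholders: substitute your exposed registry route, project and image.
REGISTRY=registry.apps.example.com
PROJECT=my-project
IMAGE=my-service

# Log in to the exposed openshift registry with a service account token.
docker login -u jenkins -p "${SA_TOKEN}" "${REGISTRY}"

# Retag the artifact built earlier in the pipeline and push it; the push is
# what kicks the deployment trigger on the openshift side.
docker tag "${IMAGE}:build-123" "${REGISTRY}/${PROJECT}/${IMAGE}:latest"
docker push "${REGISTRY}/${PROJECT}/${IMAGE}:latest"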

To utilize this strategy, one would need to configure the following:

  • The built-in registry needs to be exposed
  • An application configuration will need to have been pre-populated
    • The deployment configuration will point to the internal registry for the docker image
    • The deployment configuration should also have the appropriate triggers
  • Any jobs that push to the registry need appropriate access (a rough sketch of this setup follows the list)
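
A sketch of the one-time setup, again with placeholder names. The exact commands depend on your openshift version, so treat these as a starting point rather than gospel:

# Expose the internal docker registry so external build boxes can reach it.
oc expose service docker-registry -n default

# Service account that the Jenkins push jobs will authenticate as.
oc create serviceaccount jenkins -n my-project

# Allow that service account to push images into the project.
oc policy add-role-to-user system:image-builder -z jenkins -n my-project

# Token handed to the docker login in the push job.
oc sa get-token jenkins -n my-project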

This strategy works by utilizing the hooks that openshift has built around the docker registry. These are the same hooks that are used when openshift is the builder of the docker image. When a user kicks off a deploy, their pre-built docker image is simply pushed up to the exposed registry. After the image is pushed, the image-change trigger fires, forcing openshift to complete a deploy.

The most difficult part of this strategy is probably the configuration, which we’ve got fully automated. When we spin up a Jenkins box, our tooling looks inside openshift for the service account that was set up for access, grabs those credentials, and shoves them into jenkins. The deployment configurations for applications are also automated. When we spin up a new application, it again takes a peek into openshift to figure out where the registry endpoint is and pre-populates the image configuration for the container for us. Even though the image doesn’t exist in the registry beforehand, openshift checks the naming scheme of the project and app name, so pushing the docker image works just fine. The very first deploy will probably always fail since the image will not yet exist. You certainly cannot push until the deployment configuration is in place, otherwise the docker registry will choke. Which is good; it helps keep the trash out of that registry.

The bonuses we get out of this:

  • Images are built outside of the realm of openshift. This allows our developers to easily grab the docker image and spin it up locally if needed for any additional troubleshooting or testing.
  • This easily integrates with our existing pipelines. We have jenkins all over the place. Hudson has literally thrown up in our environment. It’s disgusting and I’d rather avoid touching what is in place as much as possible. With this strategy, it’s one additional job that takes what was already built and sends it to a new location.
  • In keeping with the purpose of openshift, it keeps devs from having to know some of the inner bits of openshift. Create a working container, ship it to openshift, and it’ll just work.

The downsides to this:

  • Images are built outside of the realm of openshift. Our devs do not have direct access to openshift, which makes it hard for them to troubleshoot. The final image needs to adhere to the security restrictions that openshift adds, which can be difficult to get past initially, especially since running the docker image locally doesn’t subject you to the same limitations. However, openshift has great documentation around this.
  • We aren’t utilizing the features openshift has built around building from source code. The project appears to be expanding on that build portion over time, and we’ll simply be missing out on whatever those features turn out to be.

The main push for us to utilize this strategy is to help integrate our existing stuff into openshift. Our environment is large and has been around for many, many years. Switching to openshift would not be feasible if every service team needed to rewrite their pipelines from scratch to work with this platform. Another reason we chose this route is that our devs can’t touch openshift directly. Which is lame, but such is life in a large organization with contractual obligations.

Plus 1 to the openshift team. It’s been a pleasure to work on and integrate into our environment.

Haproxy threw me for a loop today. It wasn’t until I did a packet capture that I discovered I had been doing things correctly all along.

When doing URL rewrites, haproxy doesn’t log the rewritten GET request; it logs the request line as it arrived from the client. While testing, something wasn’t working correctly even though the client was sending the request properly, and I was confused because I initially expected the rewritten GET request to show up in the logs, when in fact it does not. Here’s my example haproxy configuration:

backend success_backend
  # Rewrite the first two path components of the request line to /replaced,
  # e.g. GET /something1/something2/get becomes GET /replaced/get
  reqirep ^([^\ :]*)\ (/[^/]+/[^/]+)(.*) \1\ /replaced\3
  server localhost 127.0.0.1:4567

Here’s my curl:

[root ~]# curl http://localhost/something1/something2/get
reached via replaced/get
[root ~]#

The response received is what I expected.

Here’s what we see in the logs:

May 28 21:16:33 default-centos-66-x86-64 haproxy[22376]: 127.0.0.1:9856
[28/May/2015:21:16:33.831] tester14 success_backend/localhost 0/0/0/4/4 200 306
- - ---- 1/1/0/1/0 0/0 "GET /something1/something2/get HTTP/1.1"

Here’s what shows up during a packet capture:

21:16:33.831962 IP 127.0.0.1.9856 > 127.0.0.1.80: Flags [P.], seq
3445881728:3445881927, ack 2505859966, win 1024, options [nop,nop,TS val 3145623
ecr 3145623], length 199
E.....@.@.7.........&..P.c...\c~...........
./.../..GET /something1/something2/get HTTP/1.1
User-Agent: Mozilla/2.0 (compatible; MSIE 3.0; Windows 3.1)
Host: localhost
Accept: */*

21:16:33.832107 IP 127.0.0.1.42314 > 127.0.0.1.4567: Flags [P.], seq
3175948661:3175948869, ack 1519403319, win 1024, options [nop,nop,TS val 3145624
ecr 3145624], length 208
E.....@.@.]..........J...M!uZ.A7...........
./.../..GET /replaced/get HTTP/1.1
User-Agent: Mozilla/2.0 (compatible; MSIE 3.0; Windows 3.1)
Host: localhost
Accept: */*
client-ip: 127.0.0.1

The first packet is the curl going to haproxy; the second packet is haproxy reaching out to the tiny tester app. Haproxy did exactly what was intended.
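
The capture itself was nothing fancy. The original command isn’t shown here, but something along these lines will show both legs of the conversation:

# Watch loopback traffic to haproxy (port 80) and to the test backend (4567),
# printing the payload so the HTTP request lines are visible.
tcpdump -i lo -A -s 0 'tcp port 80 or tcp port 4567'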

This is just a very minor setup I use for testing acl rules and such that I thought I’d share.

Ruby’s Sinatra is a great and very quick way to spin up a tiny webservice. By default it’ll run WEBrick, which is fine for this method of testing. Here’s what my app looks like:

require 'sinatra'

# Usage: ruby test-web-app.rb <response-body> <port>
set :port, ARGV[1].to_i

# Respond to any path with the string given on the command line.
get '*' do
  ARGV[0]
end

And then I have this spun up via a very tiny barely functional init script (we
use chef, I didn’t feel like spinning up a ruby environment, so I hacked into
chef’s embedded ruby):

#!/bin/bash
# Start two copies of the test app: one that answers "success" on 4567 and
# one that answers "fail" on 4568.
/opt/chef/embedded/bin/ruby /root/test-web-app.rb success 4567 >> /dev/null 2>&1 &
/opt/chef/embedded/bin/ruby /root/test-web-app.rb fail 4568 >> /dev/null 2>&1 &
RETVAL=$?
exit $RETVAL

And then when I need to test a rule in haproxy I set it up like so:

backend success_backend
  server localhost 127.0.0.1:4567 check
backend fail_backend
  server localhost 127.0.0.1:4568 check

Then in my frontend I can create acls and send requests to either backend for quick testing.

acl successful url_beg /yay
use_backend success_backend if successful
default_backend fail_backend

And then testing with curl, we see that this works well enough.

[root~]# curl http://localhost/yay
success
[root~]# curl http://localhost/aww
fail
[root~]#

This becomes limiting when you need to start messing with things like redirects, or when you need to pay attention to headers and cookies. But for starting off, especially while learning and getting the hang of haproxy, this is ideal. Sinatra makes it incredibly easy to test rewrites.

So at the company I work for these days we use a nifty vagrant setup that utilizes virtualbox for test kitchen. Sometimes when jobs fail we get lingering vboxes that just kinda sit around eating space until we’re alerted about it. Here are the commands we use to roll through our various build users and remove things that probably shouldn’t exist:

# -print0/xargs -0 handles the space in "VirtualBox VMs" safely.
find /home/*/VirtualBox\ VMs/ -mtime +5 -print0 | xargs -0 rm -rf
# Unregister every VM VirtualBox still knows about.
for vm in $(VBoxManage list vms | awk '{print $2}' | sed 's/[{}]//g'); do VBoxManage unregistervm "$vm"; done