Saturday, June 13, 2020

Automating Sandbox Environments - Part I

This series is based on an internal project I started last year. Working on multiple projects at the same time means my time is limited, so I'm always looking to improve processes, automate tasks, and empower our team.


The Problem

Periodically I get asked to provision an AEM environment to showcase our work to existing and potential partners and clients. Of course, with any environment you provision, those who have access to it don't want you to remove it, even when the project is over, just in case they need it for 'something'.

Once word about this environment gets out, other teams, such as Analytics, UX/UI, and Marketing, may want to use it as well. What ends up happening is that no one is really 'managing' the application, and a single set of servers is trying to fulfill different needs for different teams.

Over time you find that the original owner of the environment is no longer with the company, the code and content have become stale, and the application may be starting to throw errors.


Acceptance Criteria

Analyzing the problem above, we find that there are actually several problems we need to solve.

  1. Each team should have their own environment that is specific to their use-case.
    Use-case: If the Frontend dev team wants to showcase a SPA, that work should not conflict with anything the Data Engineering team wants to show for integrating Data Layers.

  2. Each team should be able to have more than one environment.
    Use-case: The team may be showcasing a particular feature or implementation and want to give the customer limited access to test the feature.

  3. Environments should be repeatable and have a short shelf-life.
    Use-case: We don't want to have someone playing 'cleanup' at the end of every demo in order to get it prepared for the next demo. In addition, if the servers aren't being used we don't want to pay for their up-time.

  4. Provisioning an environment should be quick.
    Use-case: Sometimes we will have a request for an environment months before it is needed; other times we may be alerted only hours before a meeting that a potential partner wishes to see certain features.

  5. Our teams should be empowered.
    Use-case: While we want to be able to place strict controls over what is provisioned and how it is provisioned, our teams should be empowered to 'self-serve' a bit and dictate what that environment is used for.

  6. We should be able to provision different versions of the application.
    Use-case: While most demos will always be with the latest and greatest, sometimes a client wants to see things on the same version they have. In addition, we don't want to spend a lot of time performing upgrades.

  7. Employees should be able to log in to the application right away with their corporate IDs.
    Use-case: We don't want people sharing an account, which is a security issue, and we don't want someone to have to manually create and manage user accounts and permissions in the application.

  8. We don't want to be locked in to a platform or provider.
    Use-case: With very little work, we should be able to adapt our solution to AWS, Azure, VMware, or any other provider.

The Solution

You may already be thinking of different possibilities, such as Dockerfiles, AMIs, CloudFormation, Ansible, etc. Alone, each possible solution has its pros and cons and may not meet all of our criteria.


For our solution we will be using several technologies together. Initially we will start with provisioning to Amazon Web Services. Later we may add support to provision to Azure as well.

We will use Packer to create images, Ansible to customize the images, Terraform to provision the resources on AWS, and Make to tie it all together and make the commands friendlier. Lastly, we will use Docker to containerize our control machine, defined in a Dockerfile, so that we don't need a dedicated resource.

NOTE: This series does not aim to teach these technologies; you should already be familiar with them.

Requirements:

  • An AWS account with an IAM user that has privileges to provision EC2 instances, create AMIs, and manage other resources.
  • A GitLab or GitHub account for our project, plus access to the other application project repositories that we will be deploying.
  • Artifactory or another binary repository to stage content packages and to maintain the different application versions.
  • A VM or machine that will run our automated tools. This can be a dedicated VM; in our example we will create a Docker container for this purpose.

    The machine/vm should have the following installed:

In the next part of our series we will set up our VM workspace, create our project structure, and dive right in. Each part of this series will cover implementing a different piece of the puzzle.

While we are working on these pieces we could go ahead and get our teams thinking about what content and configurations they may want installed by default.



AEM Dispatcher - Troubleshooting Filters

If you have worked with Adobe Experience Manager long enough, eventually you will find yourself trying to figure out why a page, asset, or other call is returning a 404 at the dispatcher but working fine on Author and Publish instances.

The most common problem is a dispatcher filter blocking the call at the webserver and not allowing the traffic to make it to the Publish instance(s).  

What is a Dispatcher Filter?

Without getting too technical, dispatcher filters can be thought of as ACLs applied at the web server.

Their main purpose is to provide an extra layer of security by preventing public access to the protected areas of the Author and Publish instances. But don't these areas require credentials with proper permissions in AEM? Yes, but with properly written dispatcher filters, users are never even prompted for a password, which adds an extra layer between the public and your AEM instance(s).
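To make the ACL analogy concrete, here is a minimal Python sketch of how a rule list can be evaluated so that the last matching rule decides the outcome, which is how the dispatcher resolves overlapping filters. This is a simplified model, not the dispatcher's implementation: the FilterRule class, the deciding_rule function, and the three sample rules are hypothetical, and real filter rules can also match on path, selectors, extension, and suffix with their own pattern syntax rather than Python's fnmatch globs.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class FilterRule:
    name: str    # rule label, e.g. "0001" (hypothetical sample data)
    action: str  # "allow" or "deny"
    method: str  # glob pattern for the HTTP method
    url: str     # glob pattern for the request URL


def deciding_rule(rules: List[FilterRule], method: str, url: str) -> Optional[FilterRule]:
    """Return the last rule that matches the request, or None if no rule matches."""
    match = None
    for rule in rules:
        if fnmatchcase(method, rule.method) and fnmatchcase(url, rule.url):
            match = rule  # later matches override earlier ones
    return match


# Hypothetical rule set: deny everything, then open up only what is needed.
RULES = [
    FilterRule("0001", "deny", "*", "*"),
    FilterRule("0002", "allow", "GET", "/content/*"),
    FilterRule("0003", "deny", "*", "/content/*.json"),
]

if __name__ == "__main__":
    for method, url in [
        ("GET", "/content/site/en.html"),
        ("GET", "/content/site/en.model.json"),
        ("GET", "/libs/granite/core/content/login.html"),
    ]:
        rule = deciding_rule(RULES, method, url)
        print(method, url, "->", rule.action, "by rule", rule.name)
```

In this model, a request that only matches the opening deny-all rule is rejected by rule 0001, which is the typical situation behind an unexpected 404 at the dispatcher.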


Dispatcher Log

By default, the dispatcher log level is set to 'info', and all output is written to the dispatcher.log file, which is typically located alongside your web server log files.

When the dispatcher log level is set to 'debug', the log records that a request was rejected by a filter rule.

[Thu Jun 11 21:35:55 2020] [D] [pid 66692] Filter rejects: GET /content/dam/<project_folder>/AdobeStock_edited.jpg.transform/2x1x/image.jpg HTTP/1.1


But which one?

If we change the dispatcher log level to 'trace' (log level 4) and restart the dispatcher, the logs will tell us more about the request and name the specific filter that denied it.

[Thu Jun 11 21:35:55 2020] [T] [pid 66692] Filter rule entry /0001 blocked 'GET /content/dam/<project_folder>/AdobeStock_edited.jpg.transform/2x1x/image.jpg HTTP/1.1'


Looking at the actual request and the filter that rejected it, you should now be able to update an existing rule or create a new one to allow that call and other calls matching it.
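On a busy server the trace log grows quickly, so it can help to pull out only the blocked requests. The following Python sketch does that with a regular expression modelled on the trace line shown above. The blocked_requests function and the SAMPLE string are illustrative names, and the pattern is an assumption based on that single log line, so other dispatcher versions may word the message differently.

```python
import re
from collections import Counter

# Modelled on the trace line above; \s+ tolerates the message being wrapped onto a new line.
BLOCKED = re.compile(
    r"\[T\] \[pid \d+\]\s+Filter rule entry (?P<rule>\S+) blocked "
    r"'(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^']+)'"
)


def blocked_requests(log_text):
    """Return a (rule, method, url) tuple for each blocked request found in log_text."""
    return [(m["rule"], m["method"], m["url"]) for m in BLOCKED.finditer(log_text)]


# Sample input copied from the trace output in this post.
SAMPLE = (
    "[Thu Jun 11 21:35:55 2020] [T] [pid 66692] "
    "Filter rule entry /0001 blocked "
    "'GET /content/dam/<project_folder>/AdobeStock_edited.jpg.transform/2x1x/image.jpg HTTP/1.1'\n"
)

if __name__ == "__main__":
    hits = blocked_requests(SAMPLE)
    for rule, method, url in hits:
        print(f"{rule} blocked {method} {url}")
    # Tally how often each rule fires to spot the one causing most rejections.
    print(Counter(rule for rule, _, _ in hits))
```

To run it against a real log, read the file contents and pass them to blocked_requests in place of SAMPLE. Remember to set the log level back afterwards, since trace output is very verbose.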