The public cloud revolutionized the way we store and access data, but it also introduced new security challenges. Because resources and infrastructure are shared among multiple users, there is a risk of unauthorized access and data breaches. When we migrate our web services to the public cloud, in addition to storage-layer encryption and end-to-end encryption in transit, we implement application-layer encryption to protect sensitive customer data such as Personally Identifiable Information (PII). This article explores how the Asherah Application Encryption SDK works and how we encrypt PII data in our Ruby on Rails applications.
Application-layer encryption is the process of encrypting data by the application that received or generated the data. The data is encrypted before it is transported over a network or saved to a database, so the plaintext is accessible only within the application’s memory space. It differs from storage-layer encryption, which protects the data stored in a database when the server is powered off or the storage media is stolen. However, when the database server is running and authorized users or applications access the data, encryption at the storage layer is not sufficient to protect the data.
Asherah is an application-layer encryption SDK developed by GoDaddy that uses envelope encryption and has a hierarchical data encryption model. At the top of the hierarchy, the master key is managed by a Hardware Security Module (HSM) or Key Management Service (KMS). Below that, there are system and intermediate keys. At the lowest level, there are data row records that represent the individual encrypted rows.
The following is a brief overview of how the data and encrypted keys are stored at the data layer using a few sample data structures to illustrate the encryption pattern. Note: Go to the Asherah design and architecture page for more information.
Let’s say we have PII data that we want to encrypt, starting at the row level (or in Ruby on Rails terminology, at the model level). The Asherah SDK generates a data row key to encrypt that row data. The final payload that we need to store at the row level is named the data row record. It references its parent key, the intermediate key, which is used to encrypt the data row key:
{
  "Data": "<base64(encrypted_data)>",
  "Key": {
    "Created": 1534553138,
    "Key": "<base64(encrypted_key)>",
    "ParentKeyMeta": {
      "KeyId": "_IK_123_marketing_email",
      "Created": 1534553075
    }
  }
}
Asherah generates an intermediate key unless one already exists for the given partition. Partitions create a distinct chain of encryption keys and are a way to isolate the encrypted data and limit the blast radius. Usually, we choose the primary resource ID as the partition ID (e.g., user_id). The intermediate key envelope points to its parent key (the system key):
{
  "Id": "_IK_123_marketing_email",
  "Created": 1534553075,
  "Key": "<base64(encrypted_key)>",
  "ParentKeyMeta": {
    "KeyId": "_SK_marketing_email",
    "Created": 1534553054
  }
}
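The key IDs in the samples above appear to follow a naming pattern built from the partition, the service name, and the product ID. The following is a minimal sketch of that pattern, inferred only from the sample values shown here. The helper names are hypothetical and are not part of the Asherah API, which builds these IDs internally.

```ruby
# Hypothetical helpers that mirror the key IDs seen in the sample envelopes.
# Illustrative only: the pattern is inferred from the samples, not from Asherah's source.
def system_key_id(service_name, product_id)
  "_SK_#{service_name}_#{product_id}"
end

def intermediate_key_id(partition_id, service_name, product_id)
  "_IK_#{partition_id}_#{service_name}_#{product_id}"
end

puts system_key_id('marketing', 'email')            # _SK_marketing_email
puts intermediate_key_id(123, 'marketing', 'email') # _IK_123_marketing_email
```

Because the partition ID is part of the intermediate key ID, each partition gets its own intermediate key, while all partitions of one service and product share a system key.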
Asherah generates a system key unless a valid one already exists; if the existing key has expired, Asherah generates a new one. By default, system keys have a lifespan of 90 days. Creating a new system key also initiates the creation of new intermediate keys. The key_meta in the system key envelope specifies the parent key used to encrypt it.
{
  "Id": "_SK_marketing_email",
  "Created": 1534553054,
  "Key": "<base64(key_meta)>"
}
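The expiry rule described above can be sketched as a simple age check against the Created timestamp. This is an illustration under stated assumptions, not Asherah's implementation: the helper name and the exact comparison are hypothetical, and only the 90-day default comes from the text.

```ruby
# Default system key lifespan described above (90 days), in seconds.
SYSTEM_KEY_LIFESPAN = 90 * 24 * 60 * 60

# Returns true when a key created at `created` (Unix seconds) is older than
# `lifespan` at time `now`. Hypothetical helper, not part of the Asherah API.
def key_expired?(created, now: Time.now.to_i, lifespan: SYSTEM_KEY_LIFESPAN)
  now - created > lifespan
end

created = 1534553054 # "Created" value from the sample system key
puts key_expired?(created, now: created + 89 * 86_400) # false
puts key_expired?(created, now: created + 91 * 86_400) # true
```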
A parent key of the system key can be:
When using AWS KMS, Asherah first generates a data key with it. This data key is the master key used to encrypt the system keys. The data key is encrypted by the KMS and stored in the encryptedKek field. During a decrypt operation, the KMS first decrypts the data key, which in turn decrypts the system key. The system key then decrypts the intermediate key, and the intermediate key decrypts the data row key. The data key is encrypted with KMS keys in multiple AWS regions to support a fallback when a region is unavailable.
{
  "encryptedKey": "<base64(encrypted_key)>",
  "kmsKeks": [
    {
      "region": "<aws_region>",
      "arn": "<arn>",
      "encryptedKek": "<base64(key_encrypted_key)>"
    },
    ...
  ]
}
The default cipher that Asherah uses for encryption is AES-256-GCM.
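To make the envelope pattern concrete, here is a minimal sketch using Ruby's OpenSSL bindings with AES-256-GCM. It only illustrates the idea of a data row key wrapped by an intermediate key. The helper names and the byte layout (nonce, then ciphertext, then tag) are assumptions made for this example and do not reflect Asherah's actual wire format or key handling.

```ruby
require 'openssl'
require 'base64'

# Encrypts `plaintext` with AES-256-GCM and returns nonce + ciphertext + tag.
# The layout is an assumption for this sketch, not Asherah's format.
def gcm_encrypt(key, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').encrypt
  cipher.key = key
  nonce = cipher.random_iv # 12 bytes for GCM
  ciphertext = cipher.update(plaintext) + cipher.final
  nonce + ciphertext + cipher.auth_tag # 16-byte authentication tag
end

# Reverses gcm_encrypt; raises OpenSSL::Cipher::CipherError on a wrong key
# or tampered data.
def gcm_decrypt(key, blob)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').decrypt
  cipher.key = key
  cipher.iv = blob.byteslice(0, 12)
  cipher.auth_tag = blob.byteslice(-16, 16)
  cipher.update(blob.byteslice(12, blob.bytesize - 28)) + cipher.final
end

intermediate_key = OpenSSL::Random.random_bytes(32) # stand-in for the partition's IK
data_row_key = OpenSSL::Random.random_bytes(32)     # fresh key for this row

# The row data is encrypted with the data row key, and the data row key is
# encrypted with the intermediate key.
data_row_record = {
  'Data' => Base64.strict_encode64(gcm_encrypt(data_row_key, 'user@example.com')),
  'Key' => {
    'Created' => Time.now.to_i,
    'Key' => Base64.strict_encode64(gcm_encrypt(intermediate_key, data_row_key)),
    'ParentKeyMeta' => { 'KeyId' => '_IK_123_marketing_email', 'Created' => 1534553075 }
  }
}

# Decryption walks the chain in the opposite direction.
unwrapped_key = gcm_decrypt(intermediate_key, Base64.strict_decode64(data_row_record['Key']['Key']))
plaintext = gcm_decrypt(unwrapped_key, Base64.strict_decode64(data_row_record['Data']))
puts plaintext.force_encoding('UTF-8') # user@example.com
```

The same wrapping step repeats one level up: the intermediate key is encrypted with the system key, and the system key with the master key.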
You might wonder why we don’t use AWS KMS directly for each encrypt and decrypt operation. We can, but consider the following:
Asherah implements Secure Memory to safely generate, store, and cache encryption keys. By using a secure memory heap, it guards against keys leaking through swapping, core dumps, debugger memory scans, and CPU vulnerabilities like Spectre. A secure memory heap is not part of the language-managed memory, but it can be implemented using a few well-known native calls.
To allocate secure memory, the following steps must be performed:
To read from secure memory, the following steps must be performed:
Asherah-Ruby is a Ruby FFI wrapper around the Asherah Go implementation of the application-layer encryption SDK. The Asherah Go implementation is exposed to Ruby via the asherah-cobhan Go wrapper and compiled to a native shared library with Cgo. Asherah-Ruby currently supports Linux and Darwin operating systems on x64 and ARM64 CPU architectures.
To configure the Asherah library in a Ruby on Rails application, we must first install the Asherah gem. After installing the gem, we need to create the following migration for the encryption_key table to store the system and intermediate keys. Asherah supports MySQL and DynamoDB metastores, and can be extended to support additional adapters. For our test, we will use MySQL.
class CreateEncryptionKey < ActiveRecord::Migration[7.0]
  def up
    execute("
      CREATE TABLE encryption_key (
        id VARCHAR(255) NOT NULL,
        created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        key_record TEXT NOT NULL,
        PRIMARY KEY (id, created),
        INDEX (created)
      );
    ")
  end

  def down
    drop_table :encryption_key
  end
end
We have to create an initializer to configure Asherah. To do so, we set the service_name and product_id used for key naming. We configure the metastore and the connection_string for key storage. We need a connection_string separate from the default Active Record connection because Asherah Go manages its own connection for writing and reading the encrypted keys. Then we configure enable_session_caching for performance and specify the kms details. We use a static key in development and test environments, and in the production environment, we use the AWS KMS service. Here is the Asherah configuration:
Asherah.configure do |config|
  config.service_name = 'marketing'
  config.product_id = 'email'
  config.metastore = 'rdbms'
  config.enable_session_caching = true # default: false

  c = ActiveRecord::Base.connection_db_config.configuration_hash
  config.connection_string = "#{c[:username]}:#{c[:password]}@tcp(#{c[:host]}:#{c[:port]})/#{c[:database]}"

  if ENV['ASHERAH_KMS_ENABLED'] == 'true'
    config.kms = 'aws'
    config.preferred_region = ENV.fetch('AWS_REGION')
    config.region_map = { ENV.fetch('AWS_REGION') => ENV.fetch('KMS_KEY_ARN') }
  elsif Rails.env.development? || Rails.env.test?
    config.kms = 'static' # The static key used for encryption is `thisIsAStaticMasterKeyForTesting` (defined in Asherah Go)
  else
    raise "Asherah client not configured for: #{Rails.env}"
  end
end
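The connection string in the configuration above follows the Go MySQL driver's DSN shape (user:password@tcp(host:port)/database). If the Active Record configuration omits the host or port, the interpolation would produce an incomplete address, so one option is to build the string in a small helper. This is only a sketch: the helper name and the fallback host and port are assumptions for illustration.

```ruby
# Builds a Go MySQL driver style DSN from an Active Record style configuration
# hash. Hypothetical helper; the 'localhost' and 3306 fallbacks are assumptions
# for configurations that leave host or port unset.
def asherah_connection_string(config)
  host = config[:host] || 'localhost'
  port = config[:port] || 3306
  "#{config[:username]}:#{config[:password]}@tcp(#{host}:#{port})/#{config.fetch(:database)}"
end

db_config = { username: 'app', password: 'secret', host: 'db.internal', database: 'marketing' }
puts asherah_connection_string(db_config) # app:secret@tcp(db.internal:3306)/marketing
```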
Once we have all that set, we can call the encrypt and decrypt operations with Asherah:
partition_id = 'user_1'
data = 'user@example.com'
encrypted_data = Asherah.encrypt(partition_id, data)
decrypted_data = Asherah.decrypt(partition_id, encrypted_data)
In Ruby on Rails models, we frequently use open schema columns of type text and leverage ActiveRecord::Store with JSON serialization. That way, we store data without having to run migrations for each new column we add. We’ll start by creating the users table with a text column named params to store personally identifiable information like name and email. Let’s create the migration:
class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.text :params
      t.timestamps
    end
  end
end
Each model that implements application-layer encryption needs to include the DataEncryption module we’ll define below. This module defines the data_encryption method, which specifies the encrypted attributes (name and email) and how we reference them from the model. For the partition_id, we use the value global, but if we had a parent account model, we could partition by the account_id.
Next, we’ll define the User model:
class User < ActiveRecord::Base
  include DataEncryption

  store :params, accessors: [:enc_data], coder: JSON
  data_encryption :raw_data, :enc_data, store_name: :params, accessors: [:name, :email]

  private

  def partition_id
    'global'
  end
end
The DataEncryption module defines before_save and after_find callbacks to ensure proper encryption and decryption of data when models are saved or retrieved from the database. The models that include it must define the partition_id for the encryption session. The data_encryption method expects the following arguments:
raw_data - a virtual attribute that holds the raw data
enc_data - an attribute to store the encrypted data
store_name - the name of the store where enc_data will be stored
Next, we will define the DataEncryption module:
module DataEncryption
  extend ActiveSupport::Concern

  DataEncrypt = Struct.new(:raw_attr_name, :enc_attr_name, :store_name)

  included do
    class_attribute :data_encrypt, default: nil
    before_save :encrypt_data_callback
    after_find :decrypt_data_callback
  end

  class_methods do
    def data_encryption(raw_attr_name, enc_attr_name, store_name:, accessors: [])
      self.data_encrypt = DataEncrypt.new(raw_attr_name, enc_attr_name, store_name)
      attribute raw_attr_name, default: -> { HashWithIndifferentAccess.new }

      accessors.each do |accessor|
        define_method(accessor) do
          public_send(raw_attr_name)[accessor]
        end

        define_method("#{accessor}=") do |value|
          public_send(raw_attr_name)[accessor] = value
        end
      end
    end
  end

  private

  def encrypt_data_callback
    data = public_send(data_encrypt.raw_attr_name)
    if data.present? || public_send(data_encrypt.enc_attr_name).present?
      public_send("#{data_encrypt.enc_attr_name}=", encrypt_data(data))
    end
  end

  def decrypt_data_callback
    enc_data = public_send(data_encrypt.enc_attr_name)
    if enc_data.present?
      data = decrypt_data(enc_data)
      data = ActiveRecord::Store::IndifferentCoder.as_indifferent_hash(data)
      public_send("#{data_encrypt.raw_attr_name}=", data)
    end
  end

  def encrypt_data(data)
    Asherah.encrypt(partition_id, JSON.dump(data))
  end

  def decrypt_data(enc_data)
    JSON.parse(Asherah.decrypt(partition_id, enc_data))
  end
end
Our PII data is encrypted and stored in the database, but we can’t search for it because it is not indexed. One way to implement a search for encrypted PII data is to use a cryptographic technique called a blind index. Blind indexes are created by applying a one-way cryptographic hash function to the data, generating a fixed-length string that represents the data without revealing the actual content. To make the hashes harder to reverse by guessing likely inputs, we use a pepper, a secret value added to the input of the hashing function to create a peppered hash. Next, we’ll define the hashing function:
require 'digest'

class Hasher
  # Returns a peppered SHA-256 hex digest of the lowercased value.
  def self.hash(value)
    Digest::SHA256.hexdigest(value.downcase + ENV.fetch('HASHING_PEPPER'))
  end
end
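A common alternative to appending the pepper to the value is to use the pepper as the key of an HMAC. The sketch below swaps in HMAC-SHA256 for the plain SHA-256 digest used by the Hasher class above; it is not the approach described in this article. The function name is hypothetical, and the pepper is passed in as an argument to keep the example self-contained.

```ruby
require 'openssl'

# Alternative blind index sketch: HMAC-SHA256 keyed with the pepper.
# `blind_index` is a hypothetical name, not part of the article's Hasher class.
def blind_index(value, pepper)
  OpenSSL::HMAC.hexdigest('SHA256', pepper, value.downcase)
end

pepper = 'example-pepper-for-illustration-only'
a = blind_index('User@Example.com', pepper)
b = blind_index('user@example.com', pepper)
puts a == b   # true: lowercasing makes the lookup case-insensitive
puts a.length # 64: hex-encoded SHA-256 output
```

As with the peppered hash, only exact matches on the normalized value can be found, and changing the pepper invalidates every stored index value.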
To implement a blind index, we will add a column named hashed_email with an index to the users table. That way, we’ll be able to search for an exact match of the hashed email (though we still can’t do a full-text search or use LIKE queries). Next, we’ll add the migration:
class AddHashedEmailToUsers < ActiveRecord::Migration[7.0]
  def change
    add_column :users, :hashed_email, :string
    add_index :users, :hashed_email
  end
end
We can then add a before_validation callback to our model to hash the data for the PII columns and define helper class methods like find_by_email. Finally, we’ll extend the User model with the following code:
class User < ActiveRecord::Base
  before_validation :hash_pii_columns

  def self.find_by_email(email)
    where(hashed_email: Hasher.hash(email)).take
  end

  private

  def hash_pii_columns
    self.hashed_email = Hasher.hash(email) if email.present?
  end
end
The following are some things to consider before deploying Asherah-Ruby to production:
Rails.configuration.after_initialize do
  Asherah.encrypt('global', 'warmup')
end
Set the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI ENV var through Asherah.set_env when running in AWS Fargate containers. Go os.Getenv() does not see variables set by C.setenv(), as reported in this issue and documented in the wiki.
AWS_ECS_ENV_VAR_NAME = 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI'
Asherah.set_env(AWS_ECS_ENV_VAR_NAME => ENV.fetch(AWS_ECS_ENV_VAR_NAME)) if ENV[AWS_ECS_ENV_VAR_NAME].present?
Asherah’s cross-language support, secure memory management, and the granularity of its hierarchical key encryption model are some of the key features that help us minimize attack exposure and increase the security of our customer data. Revoking keys due to a suspected compromise is also built into the key rotation model. We have been using Asherah successfully in production for a few years now. We’ve iterated through a few different distributions of it for Ruby projects specifically: an Asherah Go sidecar, a pure Ruby implementation of Asherah, and finally Asherah-Ruby, which uses Asherah Go under the hood. Version 7 of Ruby on Rails introduced built-in Active Record Encryption for encrypting data at the application layer. It’s great to see more alternative solutions, each bringing its own features and advantages.