Building serverless APIs using single table design and test-driven development

Heeki Park
10 min read · Jan 12, 2024


During the winter break when things were relatively quiet, I decided to build an application that manages and tracks daily reading activities using a serverless architecture. The application sends a daily reading via email to a list of subscribed users and implements some basic management functionality. The architecture includes both synchronous APIs and asynchronous event-driven components.

In a series of posts, I plan to share the process I took for building the application with the aim of demonstrating some serverless development practices and helping you as you build new applications with serverless approaches.

In this first post, I walk through building the synchronous APIs. The process entails writing core CRUD functionality that interacts with DynamoDB, integrating that core logic into a Lambda function, and finally exposing those capabilities via an API Gateway endpoint.

Infrastructure stack for synchronous APIs on AWS

In a follow-up post, I plan to walk through a scheduled workflow for generating and sending out the daily reminder emails and subsequently tracking progress and achievements. In another follow-up post, I plan to cover observability practices and operating the application in production.

The code for this project has been published to this repository.

Starting with blueprints

When first ideating on the application, I knew I would start with a few APIs, so I instantiated my project using my API Gateway blueprint, which I cover in further detail in a prior post. I set up the configuration file and deployed the resources, standing up a basic hello world API behind a public custom domain name, e.g. https://api.example.com. This blueprint ensures that the infrastructure and security components are set up appropriately.

Infrastructure and security components deployed as part of the blueprints

For API Gateway, a public custom domain name is created and associated with an ACM certificate. An associated A record is created in Route 53 to point the custom domain name to the API Gateway endpoint.

For Lambda, a resource policy is configured to allow the API Gateway endpoint to invoke the function, and an execution role is created to allow the function to interact with DynamoDB.

This is a good starting point for setting up the API endpoint in AWS. Like a chef prepping his ingredients, I set this stack aside and started writing the core application logic.

Implementing basic CRUD functionality using single table design

I then set out to build the basic CRUD (create, read, update, delete) logic for the application. I chose to use DynamoDB as my backend persistence layer and opted to use a single table design, as described in this blog post. I find many developers with expertise in relational databases are unfamiliar with how to convert from normalized relational data models to denormalized sparse data models. I figured this might be a good opportunity to walk through that process.

A friend of mine had put together a relational data model that looked something like this:

Relational data model for tracking users, groups, plans, and readings.

I then implemented a denormalized version of that data model in DynamoDB by setting the hash key as the category (user, group, plan, reading) and setting the range key as the uid for the item.

Example data in a DynamoDB table using single-table design
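As a rough illustration of that layout, the sketch below builds items keyed by category and uid. The helper name new_item is hypothetical, and the non-key attributes are borrowed from the sample test output later in this post; this is a minimal sketch, not the code from the repository.

```python
import uuid

# Categories used as hash key values in the single table.
CATEGORIES = {"user", "group", "plan", "reading"}


def new_item(category: str, description: str, **attributes) -> dict:
    """Build one item for the single table (hypothetical helper)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return {
        "category": category,       # hash (partition) key
        "uid": str(uuid.uuid4()),   # range (sort) key
        "description": description,
        **attributes,               # sparse, category-specific attributes
    }


group = new_item("group", "Men", is_private=False)
user = new_item(
    "user",
    "test user",
    email="test@example.com",
    group_ids=[group["uid"]],
    plan_ids=[],
)
# With a boto3 Table resource, each item could then be persisted with
# table.put_item(Item=item).
```

Each category carries only the attributes it needs, which is what makes the table sparse.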

Using that setup, I then created a global secondary index (GSI) still using category as the hash key but now using description as the range key. This allows me to search for the uid of an item by using the description (or really the name, but name is a reserved word in DynamoDB). I also project all attributes into the index so that I can get the associated data, if needed. Because my use case will likely have no more than 100–200 users, 3 groups, 1 plan, and about 250 readings per year, this approach with the GSI is acceptable from a complexity, performance, and cost perspective.
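A lookup against such an index might look like the sketch below. It assumes the boto3 resource-level Table interface, and the index name category-description-index and both function names are hypothetical. Expression attribute name placeholders are used throughout so that no attribute can collide with a reserved word.

```python
def lookup_by_description_kwargs(category, description,
                                 index_name="category-description-index"):
    """Build query arguments for the GSI (the index name is an assumption)."""
    return {
        "IndexName": index_name,
        "KeyConditionExpression": "#c = :c AND #d = :d",
        "ExpressionAttributeNames": {"#c": "category", "#d": "description"},
        "ExpressionAttributeValues": {":c": category, ":d": description},
    }


def find_uid(table, category, description):
    """Return the uid of the first matching item, or None if nothing matches.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    response = table.query(**lookup_by_description_kwargs(category, description))
    items = response.get("Items", [])
    return items[0]["uid"] if items else None
```

Because everything is projected into the index, the returned item also carries its other attributes, so a second read against the base table is not needed.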

Building on this, I then configured queries to get specific items that I want by using filter expressions on non-key attributes. With readings, I created a filter expression to check if the sent_date attribute begins with a date string using begins_with(sent_date, "YYYY-MM-DD") to get the reading for a specific date. With users, I created a filter expression to check if group_ids or plan_ids, each of which is a list of uid strings, contains the associated uid using contains(group_ids, "uid"). This allows me to query for all users that belong to a particular group or for all users that are subscribed to a particular plan. This also allows me to have another attribute like active (not depicted) to filter on users that still have an active subscription.
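The two filters described above could be expressed roughly as follows. This is a sketch assuming the boto3 resource-level Table interface, and the function names are hypothetical. Note that DynamoDB applies a filter expression after the items matching the key condition have been read, so the query still reads the whole category; that is tolerable at the data volumes mentioned earlier.

```python
def readings_for_date_kwargs(date_prefix):
    """Query arguments for readings whose sent_date starts with YYYY-MM-DD."""
    return {
        "KeyConditionExpression": "#c = :c",
        "FilterExpression": "begins_with(#s, :d)",
        "ExpressionAttributeNames": {"#c": "category", "#s": "sent_date"},
        "ExpressionAttributeValues": {":c": "reading", ":d": date_prefix},
    }


def users_in_group_kwargs(group_id):
    """Query arguments for users whose group_ids list contains group_id."""
    return {
        "KeyConditionExpression": "#c = :c",
        "FilterExpression": "contains(#g, :g)",
        "ExpressionAttributeNames": {"#c": "category", "#g": "group_ids"},
        "ExpressionAttributeValues": {":c": "user", ":g": group_id},
    }


def query_all(table, **kwargs):
    """Collect every page of a query by following LastEvaluatedKey."""
    items = []
    while True:
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key
```

For example, query_all(table, **users_in_group_kwargs(group_id)) would return all users in a group, and swapping group_ids for plan_ids gives the equivalent query for plan subscribers.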

This setup allows me to have all the data in a single table without having to manage foreign key relationships or write complex join queries. The downside, however, is that my application is now responsible for that logic, which might have previously been optimized in a database. Trade-offs concerning application complexity and scale need to be considered when choosing this design approach.
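To make that trade-off concrete, here is a minimal sketch of the kind of join the application now has to perform itself: resolving the group_ids stored on each user into group descriptions. The function name and the shape of the result are assumptions for illustration.

```python
def users_with_groups(users, groups):
    """Join users to their groups in application code (hypothetical helper)."""
    groups_by_uid = {group["uid"]: group for group in groups}
    joined = []
    for user in users:
        # Skip ids that no longer resolve, since nothing enforces
        # referential integrity on the database side.
        names = [
            groups_by_uid[group_id]["description"]
            for group_id in user.get("group_ids", [])
            if group_id in groups_by_uid
        ]
        joined.append({"uid": user["uid"], "email": user.get("email"), "groups": names})
    return joined


groups = [{"category": "group", "uid": "g-1", "description": "Men"}]
users = [
    {"category": "user", "uid": "u-1", "email": "test@example.com",
     "group_ids": ["g-1", "g-deleted"]},
]
joined = users_with_groups(users, groups)
```

In a relational database this would be a single join with a foreign key constraint; here the dangling g-deleted reference has to be handled explicitly.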

Writing code using test-driven development

As I was building out the core functionality, I used some principles from test-driven development, as it enabled me both to validate that the core logic was implemented correctly now and to ensure that I didn’t break functionality later. I tested at three different layers, each of which served a different purpose.

Unit testing core functionality

When I was writing the Python code to implement the CRUD functionality described above, I used the unittest framework, first writing the tests and then implementing the logic to satisfy those tests. This allowed me to iterate quickly on the core business logic.

Local application testing with unittest

For example, I started by implementing the group resource. In the test, I first list the groups to get a baseline count, then create a group, and then list the groups again. This allows me to assert that the count has increased by one over the baseline.
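A test following that pattern might look like the sketch below. To keep it runnable on its own, it uses a hypothetical in-memory class, InMemoryGroups, in place of the DynamoDB-backed logic; the class and method names are assumptions, not the ones in the repository.

```python
import unittest
import uuid


class InMemoryGroups:
    """Hypothetical stand-in for the DynamoDB-backed group logic."""

    def __init__(self):
        self._items = {}

    def list_groups(self):
        return list(self._items.values())

    def create_group(self, description, is_private=False):
        uid = str(uuid.uuid4())
        self._items[uid] = {
            "category": "group",
            "uid": uid,
            "description": description,
            "is_private": is_private,
        }
        return uid

    def delete_group(self, uid):
        self._items.pop(uid, None)


class TestGroup(unittest.TestCase):
    def setUp(self):
        self.groups = InMemoryGroups()

    def test_create_then_delete(self):
        # Baseline count before creating anything.
        base_count = len(self.groups.list_groups())
        uid = self.groups.create_group("test group")
        # Creating a group should increase the count by exactly one.
        self.assertEqual(len(self.groups.list_groups()), base_count + 1)
        # Deleting it should return the count to the baseline.
        self.groups.delete_group(uid)
        self.assertEqual(len(self.groups.list_groups()), base_count)
```

Comparing against a baseline rather than a fixed number means the same test works against a table that already contains data.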

I slowly added more tests and more functionality. I added the method to get a group, to update a group, to delete a group, and to ensure that invalid inputs are handled properly. Below is an example of some of the tests that I ran, along with some of the outputs.


> python3 test/validation.py

created group with uid=4ab3c51d-7186-4b7a-b822-d0ecd35ae6ea, base_count=3, updated_count=4
{"category": "group", "uid": "4ab3c51d-7186-4b7a-b822-d0ecd35ae6ea", "description": "test group 7fd7e316aa54", "is_private": false}
updated group with uid=4ab3c51d-7186-4b7a-b822-d0ecd35ae6ea
deleted group with uid=4ab3c51d-7186-4b7a-b822-d0ecd35ae6ea, base_count=3, final_count=3
.
created user with uid=74b033a7-8a36-446d-a5e0-b932bc4de11a, base_count=1, updated_count=2
{"category": "user", "uid": "74b033a7-8a36-446d-a5e0-b932bc4de11a", "description": "test user 7ec2321b4402", "email": "test@example.com", "group_ids": ["d3aa4ef7-b938-4d0e-b936-272e32139dce"], "plan_ids": ["0369ca53-0374-4869-905d-56c204ff1048"]}
updated user with uid=74b033a7-8a36-446d-a5e0-b932bc4de11a
deleted user with uid=74b033a7-8a36-446d-a5e0-b932bc4de11a, base_count=1, final_count=1
.
group_id=4c4f6a7c-6d71-4b80-a1e7-3704682af565 user_count=0
group_id=84db3228-0174-45bf-8b67-2e9c62b5ecf7 user_count=1
group_id=8fe42e6c-ea78-4c61-b922-2e5776a87b6b user_count=0
listed users by group_id
.
plan_id=d1fb745a-01bf-4f6b-84f5-9b63fe39533f user_count=1
listed users by plan_id
.
created plan with uid=63d86b14-e89b-43c7-852b-fec1fb636dda, base_count=1, updated_count=2
{"category": "plan", "uid": "63d86b14-e89b-43c7-852b-fec1fb636dda", "description": "test plan 8e0dcf23a537", "is_private": false}
updated plan with uid=63d86b14-e89b-43c7-852b-fec1fb636dda
deleted plan with uid=63d86b14-e89b-43c7-852b-fec1fb636dda, base_count=1, final_count=1
.
created reading with uid=836f0c6c-5d80-4eef-b8fe-7e3951f0b2ab, base_count=5, updated_count=6
{"category": "reading", "uid": "836f0c6c-5d80-4eef-b8fe-7e3951f0b2ab", "description": "test reading 794368582826", "plan_id": "14dfd01b-bfbc-4a1a-aedb-b5762e17449c", "sent_date": "2024-01-11T16:11:38.271264", "sent_count": "1", "body": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."}
updated reading with uid=836f0c6c-5d80-4eef-b8fe-7e3951f0b2ab
deleted reading with uid=836f0c6c-5d80-4eef-b8fe-7e3951f0b2ab, base_count=5, final_count=5
.
.
----------------------------------------------------------------------
Ran 7 tests in 2.759s

OK

Up to this point, the development effort was focused on building business logic and interacting with the downstream DynamoDB table. The code is not specific to Lambda or API Gateway. The same code could be taken and run on other compute platforms with a little bit of shim or framework code.

Invoking locally for Lambda integration

When integrating the core functionality into Lambda, I used SAM CLI to test if I was properly wiring up the event payload with the application code.

Local invoke testing with SAM CLI

I started with sam local generate-event apigateway aws-proxy (reference) to generate a sample event payload. Once I had a sample event payload, I tested the integration with Lambda using sam local invoke, which allows me to invoke a function locally using a container image with my event payload and defined environment variables. The output of that invocation shows what would be logged to CloudWatch had the function been invoked within my AWS account.

> sam local invoke -t iac/sam/api.yaml --parameter-overrides "ParameterKey=pApiDomainName,ParameterValue=api.example.com ParameterKey=pApiBasePath,ParameterValue=reading ParameterKey=pApiStage,ParameterValue=dev ParameterKey=pFnMemory,ParameterValue=128 ParameterKey=pFnTimeout,ParameterValue=15" --env-vars etc/envvars.json -e etc/event.json FnGroup | jq -r '.body' | jq

Invoking fn.handler (python3.11)
arn:aws:lambda:us-east-1:546275881527:layer:xray-python3:3 is already cached. Skipping download
arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:38 is already cached. Skipping download
LayerLibraries is a local Layer in the template
Local image is up-to-date
Building image.............................
Using local image: samcli/lambda-python:3.11-x86_64-3dcdd13b49b56f059d4662f7c.

Mounting /github/reading/src/group as /var/task:ro,delegated, inside runtime container
START RequestId: 60536cf2-e412-4cb4-add6-badb092fa495 Version: $LATEST
{"timestamp": "2024-01-11T21:12:37Z", "level": "INFO", "message": "Found credentials in environment variables.", "logger": "botocore.credentials", "requestId": ""}
{"timestamp": "2024-01-11T21:12:37Z", "level": "INFO", "message": "successfully patched module sqlite3", "logger": "aws_xray_sdk.core.patcher", "requestId": ""}
{"timestamp": "2024-01-11T21:12:37Z", "level": "INFO", "message": "successfully patched module botocore", "logger": "aws_xray_sdk.core.patcher", "requestId": ""}
{"resource": "/user", "path": "/reading/user", "httpMethod": "GET", "headers": {"Authorization": "tbd", "content-type": "application/json", "Host": "api.example.com", "range": "bytes=0-262144", "User-Agent": "Amazon|StepFunctions|HttpInvoke|us-east-1", "X-Amzn-Trace-Id": "Root=1-659f1cbf-2966a7d96a3854f9624da9d6", "X-Forwarded-For": "34.193.90.113", "X-Forwarded-Port": "443", "X-Forwarded-Proto": "https"}, "queryStringParameters": {"plan_id": "d1fb745a-01bf-4f6b-84f5-9b63fe39533f"}, "multiValueQueryStringParameters": {"plan_id": ["d1fb745a-01bf-4f6b-84f5-9b63fe39533f"]}, "pathParameters": null, "stageVariables": null, "requestContext": {"resourceId": "vycaws", "resourcePath": "/user", "httpMethod": "GET", "extendedRequestId": "RWFt6FTloAMEo-g=", "requestTime": "10/Jan/2024:22:39:59 +0000", "path": "/reading/user", "accountId": "546275881527", "protocol": "HTTP/1.1", "stage": "dev", "domainPrefix": "api", "requestTimeEpoch": 1704926399179, "requestId": "dda0d43b-99b8-43f4-9a8b-04028369ebad", "identity": {"cognitoIdentityPoolId": null, "accountId": null, "cognitoIdentityId": null, "caller": null, "sourceIp": "34.193.90.113", "principalOrgId": null, "accessKey": null, "cognitoAuthenticationType": null, "cognitoAuthenticationProvider": null, "userArn": null, "userAgent": "Amazon|StepFunctions|HttpInvoke|us-east-1", "user": null}, "domainName": "api.example.com", "apiId": "mel8c2adu2"}, "body": null, "isBase64Encoded": false}
{"timestamp": "2024-01-11T21:12:37Z", "level": "WARNING", "message": "Subsegment dynamodb discarded due to Lambda worker still initializing", "logger": "aws_xray_sdk.core.lambda_launcher", "requestId": "fe94d3d7-9398-495a-af13-93e154466fdb"}
{"timestamp": "2024-01-11T21:12:37Z", "level": "WARNING", "message": "No subsegment to end.", "logger": "aws_xray_sdk.core.context", "requestId": "fe94d3d7-9398-495a-af13-93e154466fdb"}
END RequestId: fe94d3d7-9398-495a-af13-93e154466fdb
REPORT RequestId: fe94d3d7-9398-495a-af13-93e154466fdb Init Duration: 0.03 ms Duration: 876.01 ms Billed Duration: 877 ms Memory Size: 128 MB Max Memory Used: 128 MB

[
{
"category": "group",
"uid": "84db3228-0174-45bf-8b67-2e9c62b5ecf7",
"description": "Men",
"is_private": false
},
{
"category": "group",
"uid": "8fe42e6c-ea78-4c61-b922-2e5776a87b6b",
"description": "Women",
"is_private": false
}
]

Here the development effort was focused on ensuring that the Lambda function correctly parsed the event payload and executed the core business logic. Local testing with SAM CLI verified that the implementation was correct prior to deploying the code and resources into my AWS account.
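The wiring being tested here is roughly the following: pull the method and query string out of the API Gateway proxy event, call the core logic, and return a proxy-style response. This is a simplified sketch; list_users is a hypothetical placeholder for the core function, and its hard-coded data stands in for a DynamoDB query.

```python
import json


def list_users(plan_id=None):
    """Hypothetical core function; the real one would query DynamoDB."""
    users = [
        {"uid": "u-1", "plan_ids": ["p-1"]},
        {"uid": "u-2", "plan_ids": []},
    ]
    return [u for u in users if plan_id is None or plan_id in u["plan_ids"]]


def handler(event, context):
    """Translate an API Gateway proxy event into a call to the core logic."""
    method = event.get("httpMethod")
    # queryStringParameters is null in the event when no parameters are sent.
    params = event.get("queryStringParameters") or {}
    if method == "GET":
        status, body = 200, list_users(plan_id=params.get("plan_id"))
    else:
        status, body = 405, {"message": f"unsupported method: {method}"}
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),  # proxy integrations expect a string body
    }
```

Keeping the handler this thin is what lets the core functions stay testable with unittest alone.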

Making requests to public API Gateway endpoints

When integrating the Lambda function with API Gateway, there are two key considerations. First, I need to ensure that API Gateway has the necessary permissions to invoke the Lambda function, which was already covered by the blueprints I deployed at the beginning of this post. Second, I need to ensure that the Lambda function is processing the event payload that is generated by API Gateway, which was covered by testing locally with an API Gateway event payload. Therefore, integration with API Gateway: ✅

API endpoint testing with curl

After deploying my resources into my AWS account, the final layer of testing was then to test the actual API Gateway endpoints, against which clients submit requests. I used curl commands to submit requests and saved those in a makefile. For example, I run make test.group, and a series of requests test the following methods: POST, GET, PUT, DELETE. I also wrote a set of tests with invalid requests just to make sure that those are handled correctly too.
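I used curl for this, but the same sequence can be sketched in Python with only the standard library. Several things below are assumptions for illustration: the /group and /group/{uid} paths, the x-api-key header, and a POST response that returns the created item with its uid. The function names are hypothetical as well.

```python
import json
import urllib.request


def make_request(method, url, api_key, payload=None):
    """Build a JSON request; the x-api-key header is an assumption."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a request; urlopen raises HTTPError on 4xx and 5xx responses."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, json.loads(response.read() or b"null")


def exercise_group_api(base_url, api_key, send=send):
    """Run POST, GET, PUT, DELETE against the (assumed) group endpoints."""
    new_group = {"description": "test group", "is_private": False}
    status, created = send(make_request("POST", f"{base_url}/group", api_key, new_group))
    uid = created["uid"]  # assumes the POST response includes the new uid
    results = [("POST", status)]
    steps = (("GET", None), ("PUT", {"description": "renamed group"}), ("DELETE", None))
    for method, payload in steps:
        status, _ = send(make_request(method, f"{base_url}/group/{uid}", api_key, payload))
        results.append((method, status))
    return results
```

Calling exercise_group_api("https://api.example.com/reading", api_key) would then play the same role as make test.group.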

At this point, I have a functioning API endpoint that handles my core business logic and persists the data into a single DynamoDB table.

Conclusion

I walked through the development process with the aim of helping you see how to start building a serverless API endpoint on AWS from scratch, from designing the data model in DynamoDB to writing your code using test-driven development approaches.

I didn’t cover security of the API endpoint itself. In my simple scenario, I go with basic API keys for user authentication. If your use case requires both authentication and authorization mechanisms, make use of the appropriate security mechanism like OAuth2 on your API endpoint.

Stay tuned for the next post, where I plan to write about a scheduled workflow for sending out and tracking the daily readings.

Written by Heeki Park

Principal Solutions Architect @ AWS. Opinions are my own. https://linktr.ee/heekipark
