Dynamic Chatbot Logic


tags: AWS, chatbots, Lambda
created: Mon 26 February 2018; status: published;


I am working on a chatbot platform built on the idea that a working bot can be defined in a single line of text. It handles everything needed for a full-featured chatbot and can deploy bots on (almost) any cloud platform.

The platform (it's not named yet) automatically:

  • Provisions a phone number with Twilio
  • Connects SMS to a cloud function (AWS Lambda by default)
  • Sends structured logs to CloudWatch
  • Deploys itself with one command, in about 3 minutes
  • Stores user information and all exchanged messages in DynamoDB

This is a bot called "good-morning bot":

at 8am send "Good morning! Have a great day!"

This is "echo bot":

send "ECHO: {text}"
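I haven't published the platform's parser, but the idea behind these one-line definitions can be sketched roughly like this (all names here are hypothetical, not the platform's actual API):

```python
import re

def parse_bot_definition(line):
    """Parse a one-line bot definition into (trigger, template).

    Handles two hypothetical forms:
      at 8am send "..."   -> a scheduled message
      send "..."          -> a reply to every inbound message
    """
    scheduled = re.match(r'at (\S+) send "(.*)"', line)
    if scheduled:
        return ("schedule:" + scheduled.group(1), scheduled.group(2))
    reply = re.match(r'send "(.*)"', line)
    if reply:
        return ("on_message", reply.group(1))
    raise ValueError("unrecognized bot definition: " + line)

def handle_message(template, text):
    # {text} in the template is replaced with the inbound SMS body
    return template.format(text=text)
```

So echo bot parses to an `on_message` trigger with the template `ECHO: {text}`, and the good-morning bot to a `schedule:8am` trigger with a fixed message.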

I wrote those and built the backing platform using the Serverless framework. Once they were working, I started migrating my larger bot, TYMS, to the platform as a real use case to drive further development.

TYMS is built as a series of Steps. Each Step has two functions:

  • satisfied() - returns True if the step is currently happy and does not need to run, and False (meaning 'I am not satisfied') if it should execute.
  • run() - the logic of the step, in Python. It can inspect its protected step-specific state, global state, subscriber information, etc. - basically anything.
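In code, the Step contract looks roughly like this (a sketch with made-up names, not the actual TYMS classes):

```python
class Step:
    """Base class for a conversation step (sketch)."""
    def __init__(self, state=None):
        self.state = state or {}  # protected, step-specific state

    def satisfied(self):
        """Return True if the step is happy and does not need to run."""
        raise NotImplementedError

    def run(self, message, global_state):
        """Execute the step's logic; return a reply or None."""
        raise NotImplementedError

class AskNameStep(Step):
    """Example step: ask for the subscriber's name until we have it."""
    def satisfied(self):
        return "name" in self.state

    def run(self, message, global_state):
        if message.strip():
            self.state["name"] = message.strip()
            return "Thanks, {}!".format(self.state["name"])
        return "What's your name?"
```

Once the step has stored a name, satisfied() reports True and the handler skips it on later interactions.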

A ConversationHandler executes each step until all have had a chance to run or until one asserts 'done' for that interaction. It hands local state to each Step and saves the results. It also logs extensively: global and step-specific state is logged before and after each step, which makes debugging easier.
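The handler's loop, in sketch form (hypothetical names; the real implementation also persists state to DynamoDB and logs around each run()):

```python
class EchoStep:
    """Trivial step that always wants to run and always replies."""
    def satisfied(self):
        return False

    def run(self, message, global_state):
        return "ECHO: " + message

class ConversationHandler:
    def __init__(self, steps):
        self.steps = steps

    def handle(self, message, global_state):
        """Give every unsatisfied step a chance to run, stopping early
        when a step returns a reply ('done' for this interaction)."""
        for step in self.steps:
            if step.satisfied():
                continue
            # The real platform logs global and step-specific state
            # before and after run() to ease debugging.
            reply = step.run(message, global_state)
            if reply is not None:  # the step asserted 'done'
                return reply
        return None
```
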

This all works well. I can execute the bot locally, run tests, etc.

However, deploying is slower than I would like. Serverless has a plugin for Python requirements, but it uses Docker to rebuild the requirements on each deployment, which takes a long time. Uploading the ~5 MB package is quick, but then all the AWS resources are re-provisioned from the CloudFormation template. It ends up taking around 4 minutes before I can finally execute the bot.

So I was thinking: for fastest turnaround, I really want the bot logic to change instantly. Ideally, I want to use a text editor or a web-based admin tool to edit bot logic in the cloud and have it usable without redeploying. Ultimately, the bot platform is very stable. Why even package the bot logic into the Lambda function?

I am now considering storing the bot logic in S3 and loading it on each invocation of the bot platform. Each bot would still have a dedicated deployment, since each has many unique parameters, including DB tables, logging settings, and even the desired area code for the Twilio number at first acquisition. But the TYMS bot's platform would download its logic files from S3, import them, and run them.
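A sketch of that load-and-import path, with names I've made up for illustration (in Lambda the download would land in /tmp; here the demonstration writes a local file in place of the S3 fetch):

```python
import importlib.util
import os
import tempfile

# import boto3  # in Lambda, the logic file would be fetched like:
# boto3.client("s3").download_file(bucket, key, "/tmp/logic.py")

def load_bot_module(path, name="bot_logic"):
    """Import a Python file from an arbitrary path at runtime."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Local demonstration: write a tiny logic file, then import and run it,
# exactly as the platform would after downloading it from S3.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "logic.py")
    with open(path, "w") as f:
        f.write('def handle(text):\n    return "ECHO: " + text\n')
    bot = load_bot_module(path)
    result = bot.handle("hello")
```
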

There are security issues with importing code dynamically. But Lambda already does this, since Lambda code is stored as zip files in S3 buckets. Files with correctly restricted read-write privileges would be safe and could even be encrypted.

The advantage I see is avoiding redeployment except when the core bot platform itself is updated. Further, bots could then be edited through a simple UI with immediate effect. Bugs could be fixed or more intelligence added on the fly.

This also makes versions easier to manage on the fly. New edits could be made applicable only to certain users: add a change, test it, then publish it (or unpublish it!).
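Per-user versioning could then be as simple as choosing which S3 key to load for a given subscriber. A sketch, with hypothetical keys and function names:

```python
def logic_key_for(user_id, pilot_users, published_key, draft_key):
    """Route pilot users to the draft logic; everyone else gets the
    published version. Unpublishing is just flipping the mapping back."""
    return draft_key if user_id in pilot_users else published_key

# Example with made-up S3 keys: "alice" tests the draft, "bob" stays
# on the published version.
key = logic_key_for("alice", {"alice"},
                    "tyms/v1/logic.py", "tyms/v2-draft/logic.py")
```
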

This seems like the way to go, and it lets the platform behave more like some online bot publishing platforms: edit in a UI, see quick results, but still have a secure, stable bot platform underneath, one that can be upgraded independently of the bots themselves.

This will be the next evolution of my bot platform, to pull logic from cloud-hosted code assets.