Rate limiting is an important feature for web application and API control that we have wanted to add to open-appsec for a while. The feature allows administrators in all open-appsec deployment options to control the access rate to specific URLs and APIs in a granular way, based on source IPs, user identity and more.
We took an experimental approach to developing this feature and tasked two developers, Ned and Dan, with the same objective but had them use different techniques. Ned designed and developed the feature using traditional techniques, while Dan used AI development tools, namely the ChatGPT large language model.
As the Project Manager and coordinator of the project, I oversaw their progress and checked in with Ned and Dan every few workdays, discussing not only what they managed to accomplish, but also the issues they encountered and their overall feeling towards the development approach they used. What happened is quite fascinating and we decided to share the outcome in this blog.
Rate Limiting - basic requirements and design
Rate limiting is an important tool for safeguarding your websites and APIs. It works by setting a cap on how many requests can be made within a certain period. This is usually tracked by a specific source identifier (IP address, username, etc). The limit can either apply to all requests, or it can be targeted to requests coming from the same identifier, say, not allowing more than 10,000 requests per second from a single IP address. This way, it helps maintain a balance and ensures your digital properties aren't overwhelmed by too much traffic at once.
See the example below:
In this example, if a source matching the identifier configured for this Web Application asset creates over 10,000 requests per minute targeted at the defined asset, an event is triggered. If the Mode is set to 'Active', the open-appsec agent will send a log and block additional requests from the same source identifier.
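To make the mechanics concrete, here is a minimal, illustrative sketch of a per-identifier fixed-window counter in Python. The class and names are our own for illustration only; this is not open-appsec code.

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` per identifier."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # (identifier, window index) -> count

    def allow(self, identifier, now=None):
        now = time.time() if now is None else now
        # All timestamps inside the same window share one counter.
        key = (identifier, int(now // self.window))
        self.counts[key] += 1
        return self.counts[key] <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3)]
# first three requests pass; the fourth, in the same window, is rejected
```

In an 'Active' deployment, a `False` result would correspond to blocking the request and emitting a log.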
Both developers needed to design and implement a solution for the following software requirements:
Use a source identifier that is configured in the policy. The identifier serves as a key – it can be the source IP address, the value of a specific HTTP header such as X-Forwarded-For, etc. Requests with different key values are counted separately.
Allow an admin to configure the maximum requests in a configured timeframe by the same identifier’s value. E.g. – no more than 500 requests per 2 minutes from the same source IP address.
When the number of requests exceeds the configured maximum, an event is triggered. An admin can configure whether this event only triggers a log, or also blocks subsequent traffic until the timeframe elapses.
Requests may be handled by different agent processes, but they must be counted together.
At run time, the open-appsec agent already knows how to parse HTTP/S traffic and extract the identification keys used for rate-limit counting (a source address, a custom-defined ID from the traffic, etc.). The developers needed to add a searchable storage solution that counts the number of requests per value of the configured identifiers. Another design requirement is the ability to reset part of the count so that only requests within the relevant timeframe are considered, usually by using a rolling time window. The storage solution must support atomic reads and writes from multiple processes.
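The rolling-time-window idea can be sketched as follows. This is a deliberately single-process illustration (names are ours), so it does not address the atomic multi-process requirement; it only shows how old requests "age out" of the count.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Count only the requests that fall inside the last `window_seconds`."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = defaultdict(deque)  # identifier -> recent request times

    def allow(self, identifier, now=None):
        now = time.time() if now is None else now
        q = self.events[identifier]
        # Drop timestamps that have fallen out of the rolling window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

limiter = SlidingWindowLimiter(limit=2, window_seconds=10)
```

The prune-then-check-then-append sequence is exactly the kind of multi-step update that must be made atomic once several processes share the store.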
The developers knew about each other's different methods of work but were asked to work independently without talking to each other. Ned conducted research, design and implementation on his own. Dan used ChatGPT 3.5. In the end, both integrated the code that was created into the open-appsec code base.
Elapsed work time - ~5 hours.
Progress: Researching database alternatives and algorithms and reviewing a selected approach.
Ned looked into two key-value store options: the popular REDIS and another solution called Memcached. Memcached is praised for its efficient memory usage, especially when data is frequently added and removed. But REDIS stood out with its wealth of documentation and examples, which would be handy when setting things up. Plus, Ned knew his team was already familiar with REDIS. So, he was more inclined to pick REDIS over Memcached.
Ned initially thought of using a straightforward rate-limiting method called the fixed window algorithm. This method involves creating a counter for each unit of time. But after doing a bit more research, Ned settled on another method known as the "Token Bucket" algorithm. This algorithm is more efficient in terms of memory usage and better handles sudden bursts of traffic. However, one downside is that it requires more than one call to the key-value store within a single combined (atomic) action, so Ned would need to keep this constraint in mind while designing.
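A minimal single-process sketch of the token bucket idea (illustrative Python, not Ned's production code): each identifier only needs a token count and a last-refill timestamp, which is the memory advantage, but serving a request means reading and updating both fields, which is the multi-call drawback noted above.

```python
import time

class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`;
    each request spends one token."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Lazy refill based on elapsed time: only two values per
        # identifier (tokens, last) ever need to be stored.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)
# two immediate requests drain the bucket; a third is rejected
# until enough time passes for a refill
```

In a shared key-value store, the refill-then-decrement step must be performed atomically, which is why a naive port of this logic needs more than one store operation per request.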
Progress: Discussing with ChatGPT implementation options and reaching a library that compiles and implements basic rate limit.
Dan started a dialog with ChatGPT about Rate Limiting.
He tasked ChatGPT with creating a library that analyzes the inputs from the traffic, according to the configuration parameters, and triggers an event if a limit is exceeded. ChatGPT produced the required code within seconds and easily modified it upon some specific requests from Dan, such as avoiding the use of blocking code, which doesn’t fit the open-appsec service that uses a single thread.
The code compiled and worked. However, one of our main requirements was to allow rate counting across multiple processes running in parallel, and the code produced by ChatGPT was appropriate only for a single process. Dan asked ChatGPT for ways to modify the code to meet the requirements.
Of the four methods ChatGPT suggested for implementing a multi-process rate limit - a shared memory design, message queues, a distributed key-value store, and a purpose-built network protocol - Dan chose speed and simplicity. He picked the shared memory approach because it required no interaction with external components. ChatGPT produced the code, and it compiled.
It seemed that Dan was off to a great start. However, the new code from ChatGPT wasn’t tested yet.
Elapsed work time - ~14 hours.
Progress: researching how his development team had been using REDIS so far and finalizing the design.
Ned made sure to double-check his decision to use REDIS by consulting with his colleagues, but also kept looking for alternative rate-limiting algorithms, bearing in mind the drawback of his current selection. Meanwhile, he also started building a basic structure for a component. This component would help incorporate his new code into the existing rate limit policy configuration.
Progress: Dan had to troubleshoot because the code that ChatGPT produced wasn't counting accurately when it received input from several processes.
After testing the shared memory solution, Dan, with the help of ChatGPT, tried to iron out the bugs. The code produced by ChatGPT was quite complex, making it difficult to understand. So, Dan resorted to hands-on debugging methods of his own to identify the main problem: a clash between the fixed size of the allocated shared memory and the variable size of the elements being added to it.
Questioning ChatGPT regarding the issues resulted in an “Oh, yes, there is a problem here” answer.
ChatGPT also suggested fixing the problem by switching to a different library.
However, the new code produced segmentation faults during tests.
Could the solution be recovered using the existing design?
Elapsed work time - ~17 hours.
Progress: implementing code and continuing to look for a solution for atomic multi-process counting.
Most of Ned's time was now focused on actually writing the code. After earlier phases mostly centered around research, he was finally diving into the real action: turning the design concepts, which came out of his extensive research, into actual code. At the same time, he kept reaching out to colleagues who were familiar with REDIS, hoping to come up with an innovative solution to handle multiple processes. However, he hadn't found an answer yet.
Progress: opting for a redesign.
One more attempt at the shared memory approach resulted in code that would require expertise in the Boost library to maintain. Dan felt that the time required to learn it would erase the speed advantage of working with an AI model. Implementing a shared memory solution that uses dynamic allocation proved too complex, so Dan decided to go back to the implementation options ChatGPT had suggested initially and pick a different approach.
Will a redesigned solution encounter similar setbacks?
Elapsed work time - ~34 hours.
Progress: completing the implementation along with some fundamental tests, but still no solution for multiple processes.
Ned completed the code and had written and performed simple tests on most of it, but he hit a roadblock when it came to making his algorithm work with multiple processes. So, he took a break from coding to look for the best possible solution, and found a promising lead: REDIS can run a script as a single atomic operation that issues multiple requests to the key-value store. This could potentially be the solution he was looking for.
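A sketch of that lead, assuming the script is written in Lua and run via REDIS's EVAL command (the script, key names, and helper function here are illustrative, not Ned's actual code): the whole increment-and-check runs server-side as one atomic step, so no interleaving between processes is possible.

```python
# Fixed-window limit as an atomic server-side script: INCR the window's
# counter, set its expiry on first use, and compare against the limit.
RATE_LIMIT_SCRIPT = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
if current > tonumber(ARGV[2]) then
    return 0
end
return 1
"""

def check_rate_limit(client, identifier, window_seconds, limit):
    """True if the request is allowed. `client` is assumed to expose an
    eval(script, numkeys, *keys_and_args) method, like redis-py's."""
    return client.eval(RATE_LIMIT_SCRIPT, 1, "rate:" + identifier,
                       window_seconds, limit) == 1

class InMemoryStandIn:
    """Stand-in that mimics what the script does (minus expiry); used here
    only to demonstrate the flow without a running REDIS server."""
    def __init__(self):
        self.counts = {}

    def eval(self, script, numkeys, key, window_seconds, limit):
        current = self.counts.get(key, 0) + 1
        self.counts[key] = current
        return 0 if current > int(limit) else 1

client = InMemoryStandIn()
decisions = [check_rate_limit(client, "10.0.0.1", 60, 2) for _ in range(3)]
# first two requests allowed, third rejected within the same window
```

In production the stand-in would be replaced by a real REDIS client, and the atomicity would come from the server executing the script as one unit.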
Progress: New working code!
Dan revisited the storage options ChatGPT had originally suggested. Without talking to Ned, he selected the same popular distributed key-value store, REDIS: like Ned, he had learned from his peers that REDIS was already intended to be packaged with the open-appsec agent, which further validated the choice. He then asked ChatGPT to create a library that uses an existing REDIS as the database for counting rates.
This time, the code that ChatGPT produced compiled and worked immediately, including support for multiple processes!
ChatGPT also wrote interesting and relevant unit tests.
The rest of the time was used to integrate this rate-limit library into the open-appsec components and policy. Dan felt it was faster to integrate on his own, although he could have provided the code to ChatGPT to see if it could handle it.
It is interesting to note that Dan ultimately asked ChatGPT to implement functionality of far less complexity than the shared memory storage. The code no longer needs special handling to support input from multiple processes, because it relies on an external server that already makes the calls to it atomic.
Are we near the end of the development project?
Elapsed work time - ~40 hours.
Progress: The code now supports multiple processes, but testing revealed some edge cases that still need debugging. Moreover, unit tests and large-scale tests are still pending.
Ned found a solution for handling multiple processes by implementing the calls to the key-value store using a scripting language. After developing this, he ran tests on the system. As often happens, these tests uncovered a few edge cases where the code didn't quite work as expected. Ned was busy sorting out these issues by debugging the code.
He was confident that he was close to finishing the project and already had a basic, working proof of concept in hand.
Progress: optimizations and unit test development.
Dan successfully integrated the library, provided by ChatGPT in the previous round, into the open-appsec component. As a result, the solution was now functional from start to finish. Dan then started brainstorming with ChatGPT on how to optimize the algorithm for shorter bursts. While he wrote the unit tests, they hadn't been integrated yet, and the larger scale test was still on the to-do list.
Summary and Conclusions
In our experiment, Ned was able to create a basic proof of concept within the allotted work time, though without unit tests. During the same period, Dan and ChatGPT managed to implement and test two separate designs, eventually dropping one. Had Dan consulted his peers more during the initial decision-making process, instead of quickly selecting from a list of options, he could have halved his implementation time. Dan's ChatGPT-based implementation using REDIS, developed and integrated in approximately 40 hours, is now in our final testing phase.
We learned so much from this fascinating project. Here are our main conclusions:
1. Trying many options and failing quickly vs. Planning
Every developer naturally wants to see their code working as quickly as possible, and AI development tools certainly speed up the process. However, the fastest route isn't always the most effective. The longstanding wisdom in development has always been to start with clear requirements, consider multiple design alternatives, and involve peer reviews. These steps might seem slow but save time in the long run by avoiding potential pitfalls. Is this wisdom still true in this new era?
It is very easy to ask the LLM for multiple design or implementation alternatives and discuss with it the advantages and disadvantages of each option. But when we debriefed this project, opinions varied within our team on how to decide between the options. Should the developer decide on his own? Should he consult as usual with tech leads and peers (meaning he often needs to wait for their availability)? Given that the code is produced and can be tested so quickly, maybe it's better to just try and fail quickly?
Our conclusion is that it is an individual decision that depends on the developer's estimation of the potential "damage" of a wrong choice in terms of time lost and the potential scale of changes needed to reverse course. If the developer is in doubt, or not sure what is the potential damage, then he must consult.
2. Working with the Large Language Model
The way you structure and size your prompts is crucial. We've found that while packing all requirements into a single prompt gets us an answer, it can make subsequent adjustments to fit specific needs quite difficult. The resulting code may be hard to integrate with existing code, or might not be optimized the way we intended.
Therefore, we recommend planning your questions in advance. Break them down and add requirements incrementally, ensuring you fully comprehend how the AI's response aligns with your needs.
For future code structure requirements, we'll adopt a "stub" approach, where we write basic code with placeholders and then ask the AI model to implement the details. This ensures easier integration. This approach also lets the human side of the equation consider elements they may have overlooked initially, such as performance optimizations that weren't specified, and that the AI model might not have suggested on its own.
The ChatGPT large language model is able to produce code for complex problems, but that code can also be very complex, difficult to read, and hard to debug. You must read the code and check that it can be maintained. Compiled code doesn't mean working code, and working code doesn't mean maintainable code.
3. Learning from Experiments
This project offered us unique insights not only into the differences between traditional and AI-assisted development methods, but also into effectively leveraging a powerful large language model like ChatGPT. The results have been encouraging, and we're certainly keen to keep incorporating AI tools into our development process, even as we acknowledge there's a lot more to learn.
We've picked up useful tactics to mitigate potential issues related to our experimental, AI-centric approach. However, with AI tools advancing at breakneck speed, we must also rapidly adapt our processes. Our approach of conducting real-world experiments with practical outcomes keeps our team motivated and allows us to tackle practical problems instead of dwelling on theories.
Therefore, we plan to conduct more experiments like the one detailed in this blog post because nothing beats learning from hands-on experience. We're excited about the lessons the future holds!
Rate Limiting is coming soon!
Thanks Dan and Ned for your hard work! Dan's ChatGPT-based implementation is now in our final testing phase. We're excited to say it will be available to open-appsec users soon!
open-appsec is an open-source project that builds on machine learning to provide preemptive web app & API threat protection against OWASP Top 10 and zero-day attacks. It simplifies maintenance because, unlike many common WAF solutions, there is no threat-signature upkeep or exception handling.