What is serverless?
"Serverless" is a word used to describe Functions-as-a-Service, a function that lives on a cloud provider's server. When a specific url is hit (or event), the function runs. Once the function has completed, the process is over.
The billing model for serverless is as important as the idea itself: you only pay for the brief moments the function is actually doing something. Once it is complete (or back to "zero"), you stop paying. There is no continuous hosting charge, whether per month or by the hour. The function is always available when needed, but not running when it is not needed.
Where a traditional server requires a hosting plan that keeps it running 24/7, listening for requests, serverless functions run only when a trigger fires and shut down when that work is complete.
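To make the "runs when triggered, then stops" idea concrete, here is a minimal sketch of what such a function can look like. It follows the common `handler(event, context)` shape used by AWS Lambda's Python runtime; the `name` field in the event and the response body are made up for illustration.

```python
import json


def handler(event, context):
    """Entry point the platform calls once per trigger (HTTP request or event)."""
    # `event` carries the trigger payload; the "name" field is a hypothetical example.
    name = event.get("name", "world")
    # Returning ends the invocation; nothing keeps running afterwards.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local stand-in for the platform invoking the function a single time.
    print(handler({"name": "serverless"}, None))
```

Notice there is no server loop and no port to listen on. The platform owns all of that and simply calls the function when something happens.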
The kitchen analogy
Think of traditional servers on a hosting provider like a kitchen.
A kitchen can have a certain number of cooks, and can take in a certain number of orders. In a traditional auto-scaling model, when the kitchen gets an influx of orders, the kitchen manager calls for more cooks in the kitchen.
At some point, the kitchen is at capacity. No more cooks (resources) can fit in this kitchen (server), so a second kitchen is added with another staff of cooks. However quickly that new kitchen (server) is built, we still have to wait for it before it can take on the new demand.
This is auto-scaling on traditional servers.
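The waiting described above can be sketched as a toy simulation. This is not how any real auto-scaler is implemented; the function name, the one-server-at-a-time rule, and all the numbers are assumptions chosen only to show orders piling up while a new kitchen is being built.

```python
def simulate_autoscaling(demand, capacity_per_server, provision_delay):
    """Return the backlog of unserved requests after each tick.

    Toy model: start with one server, and when there is a backlog request
    one more server, which only comes online `provision_delay` ticks later.
    """
    servers = 1
    pending = []  # ticks at which requested servers become ready
    backlog = 0
    history = []
    for tick, incoming in enumerate(demand):
        # Bring online any servers whose provisioning has finished.
        servers += sum(1 for ready in pending if ready == tick)
        pending = [ready for ready in pending if ready > tick]

        work = backlog + incoming
        backlog = max(0, work - servers * capacity_per_server)

        # Overloaded and nothing on the way yet: ask for another server.
        if backlog > 0 and not pending:
            pending.append(tick + provision_delay)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Hypothetical traffic spike: 20 requests per tick against 10 per server.
    print(simulate_autoscaling([5, 20, 20, 20, 5], capacity_per_server=10, provision_delay=2))
    # prints [0, 10, 20, 20, 5]
```

The backlog keeps growing during the ticks it takes for the second server to arrive, which is the "food takes longer" part of the analogy.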
With auto-scaling, why do we need serverless?
Now, let's use the order / kitchen analogy again. Before, our kitchen was taking in multiple orders and adding or removing cooks, or even entire kitchens, as needed. While this mostly works, the delay in getting the second kitchen, or even additional cooks, can mean some food will take longer during these events.
With serverless, instead of every order going to a single shared kitchen, each order gets its own kitchen. Every order gets its own chef, prep cooks, and expo.
Due to their small size and optimized runtimes, serverless functions are able to scale horizontally, meaning each request (or order) is its own invocation, usually (though not always) with its own thread pool and memory.
What does this mean? It means that we can build incredibly durable and resilient applications that no longer need to be throttled through things like SQS queues, and we no longer need to debug provisioning issues during auto-scaling events. They are always available, and essentially never go "down."
It also means that it is usually much cheaper to run applications in serverless environments than to run a traditional always-on server, especially compared to the higher cost of extra instances during auto-scaling. There is also a lag between demand dropping off and the extra resources being scaled back down. So, for brief periods you can pay significantly higher compute costs without serving the traffic to justify them.
Serverless scales as needed, with each request being its own invocation. This brings cost and usage close to a true 1:1 ratio.
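The cost argument comes down to simple arithmetic, sketched below. Every price and traffic figure here is a made-up placeholder, not any provider's real rate, and the function names are hypothetical. The point is only the shape of the two billing models.

```python
def always_on_cost(hourly_rate: float, hours: float) -> float:
    """Cost of a server billed for every hour it is up, busy or idle."""
    return hourly_rate * hours


def pay_per_use_cost(invocations: int, seconds_each: float, rate_per_second: float) -> float:
    """Cost when billing covers only the seconds each invocation runs."""
    return invocations * seconds_each * rate_per_second


if __name__ == "__main__":
    # All figures are hypothetical placeholders, not real provider pricing.
    HOURS_IN_MONTH = 730
    server = always_on_cost(hourly_rate=0.10, hours=HOURS_IN_MONTH)
    functions = pay_per_use_cost(
        invocations=1_000_000, seconds_each=0.2, rate_per_second=0.00002
    )
    print(f"always-on server: ${server:.2f}/month")
    print(f"pay-per-use:      ${functions:.2f}/month")
    # With zero traffic the pay-per-use bill is zero; the server bill is not.
    print(f"idle month:       ${pay_per_use_cost(0, 0.2, 0.00002):.2f}")
```

The pay-per-use bill tracks traffic exactly, so it drops to zero in a quiet month. The flip side is that it also grows with traffic, so under heavy, sustained load the comparison can tip the other way.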
Caveats / Gotchas
You'll hear people say "serverless is not really serverless". That's true. It's just someone else's server. But, for the purposes of architecture and how Functions-as-a-Service behave, it's best to treat it as a true down-to-zero function, instead of what it really is: a function living on AWS servers.
Serverless architectures are often more complex than traditional monolithic servers. What was once a single kitchen-sink application is now an orchestra of small parts that must play together. While you gain durability and efficiency, testing is still an emerging and evolving area of serverless.
You can test offline, but tooling for simulating a true serverless ecosystem is still a growing area of development. You can stub out certain behaviors, but they are not true tests until they run in the cloud infrastructure with real environments and real permissions.
Since each function has a short lifecycle, it does not inherently have any way to manage state between invocations. This is usually solved by using tools like Redis to persist application state, but this adds to the complexity of applications.
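Here is a small sketch of the state problem and the usual workaround. `ExternalStore` is a hypothetical in-memory stand-in for something like Redis, used so the example runs without any dependencies. In a real deployment the store would live outside the function entirely.

```python
class ExternalStore:
    """Hypothetical stand-in for an external store such as Redis."""

    def __init__(self):
        self._data = {}

    def incr(self, key: str) -> int:
        # Increment a counter that outlives any single invocation.
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def invoke_stateless(event):
    # State created inside the invocation starts fresh every time.
    visits = 0
    visits += 1
    return visits


def invoke_with_store(event, store):
    # State kept in the external store survives across invocations.
    return store.incr("visits")


if __name__ == "__main__":
    store = ExternalStore()
    print([invoke_stateless({}) for _ in range(3)])          # [1, 1, 1]
    print([invoke_with_store({}, store) for _ in range(3)])  # [1, 2, 3]
```

The extra moving part is the trade-off: the function stays disposable, but now there is a store to provision, secure, and pay for.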