SQLite doesn't have MVCC! It only has a single writer! SQLite is for phones and mobile apps (and the occasional airliner)! Use a proper database like Postgres for web servers! In this article I'll explain why being embedded and having a single writer is not a drawback, but is actually what lets SQLite scale so unreasonably well.
Preface
For code examples I will use Clojure, but the concepts should be applicable to most programming languages.
The machine on which these benchmarks are run has the following characteristics:
- MacBook Pro (2021)
- Chip: Apple M1 Pro
- Memory: 16 GB
These benchmarks are not meant to be perfect or optimal. They are simply meant to demonstrate that it is relatively easy to achieve good write throughput with SQLite. Usual benchmark disclaimers apply.
Defining TPS
When I say TPS I don’t mean writes/updates per second. I’m talking about transactions per second, specifically interactive transactions that are common when building web applications. By interactive transactions I mean transactions where you execute some queries, run some application code and then execute more queries. For example:
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- some application code runs
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
COMMIT;
Transactions are useful because they let you roll back the state of your changes if a problem occurs with your application.
Benchmark harness
To simulate requests we spin up n virtual threads (green threads) that each execute a function f. This mirrors a web server's request handler and gives us similar behaviour. The thing to note is that this load is high burst: we ramp up to n concurrent requests as fast as the system can spin up virtual threads.
(defmacro tx-per-second [n & body]
  `(let [ids#   (range 0 ~n)
         start# (. System (nanoTime))]
     (->> ids#
          ;; Futures are using virtual threads so blocking is not slow
          (mapv (fn [_#] (future ~@body)))
          (run! deref))
     (int (/ ~n (/ (double (- (. System (nanoTime)) start#)) 1000000000.0)))))
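The same harness idea, sketched in Python for readers outside Clojure (the names are mine): fire n concurrent tasks, wait for them all, and divide n by the elapsed wall-clock seconds.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def tx_per_second(n, body):
    """Run `body` n times concurrently; return whole transactions per second."""
    start = time.perf_counter()
    # OS threads stand in for Clojure's virtual-thread-backed futures here.
    # The burst shape is the same: up to n requests in flight at once.
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(body) for _ in range(n)]
        for f in futures:
            f.result()  # the equivalent of deref: block until the task is done
    elapsed = time.perf_counter() - start
    return int(n / elapsed)

# Usage: each "transaction" sleeps 1ms to stand in for real work.
print(tx_per_second(100, lambda: time.sleep(0.001)) > 0)  # True
```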
For the Clojure programmers among you: future has been altered to use virtual threads, so we can spawn millions of them if needed.
;; Make futures use virtual threads
(set-agent-send-executor!
  (Executors/newVirtualThreadPerTaskExecutor))
(set-agent-send-off-executor!
  (Executors/newVirtualThreadPerTaskExecutor))
We will use Postgres as our network database (I'm using Postgres, but the same applies to MySQL etc.), with a high performance connection pool sized to our number of cores.
(defonce pg-db
  (jdbc/with-options
    (connection/->pool
      HikariDataSource
      {:dbtype          "postgres"
       :dbname          "thedb"
       :username        (System/getProperty "user.name")
       :password        ""
       :minimumIdle     8
       :maximumPoolSize 8})
    {}))
We will use SQLite with a single writer connection and reader connections equal to the number of our cores.
(defonce lite-db
  (d/init-db! "database.db"
    {:pool-size 8
     :pragma    {:cache_size   15625
                 :page_size    4096
                 :journal_mode "WAL"
                 :synchronous  "NORMAL"
                 :temp_store   "MEMORY"
                 :busy_timeout 5000}}))
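For readers outside Clojure, the same pragmas can be applied through any SQLite driver; here is a sketch using Python's built-in sqlite3 (the file name is made up). Note that cache_size is measured in pages: 15625 × 4096-byte pages ≈ 64 MB of page cache.

```python
import os
import sqlite3
import tempfile

# Applying the article's pragmas through Python's sqlite3, for illustration.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path)
for pragma in (
    "PRAGMA page_size = 4096",
    "PRAGMA cache_size = 15625",
    "PRAGMA journal_mode = WAL",     # readers never block the single writer
    "PRAGMA synchronous = NORMAL",   # fsync on checkpoint, not every commit
    "PRAGMA temp_store = MEMORY",
    "PRAGMA busy_timeout = 5000",    # wait up to 5s when the db is locked
):
    conn.execute(pragma)

mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(mode)  # wal
```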
Our database will have a simple schema:
(jdbc/execute! pg-db
  ["CREATE TABLE IF NOT EXISTS account(id INT PRIMARY KEY, balance INT)"])

(d/q (lite-db :writer)
  ["CREATE TABLE IF NOT EXISTS account(id PRIMARY KEY, balance INT)"])
And each has a billion rows:
(->> (range 0 (* 1000 1000 1000))
     (partition-all 32000)
     (run!
       (fn [batch]
         (jdbc-sql/insert-multi! pg-db :account
           (mapv (fn [id] {:id id :balance 1000000000}) batch)))))

(->> (range 0 (* 1000 1000 1000))
     (partition-all 100000)
     (run!
       (fn [batch]
         (d/with-write-tx [tx (lite-db :writer)]
           (run!
             (fn [id]
               (d/q tx
                 ["INSERT INTO account(id, balance) VALUES (?,?)" id 1000000000]))
             batch)))))
Our user distribution will follow a power law: the top X percent of users will be involved in most transactions. We have a billion users, so in practice most of them will be inactive, or only rarely active. We'll use p = 0.9995, meaning 99.95% of transactions will be performed by 0.05% of users. That still leaves around 100000 unique active users at any given time.
The reason we’re using the power law is that it’s a very common distribution for a lot of real products. If you think about credit card payment systems in the context of retail, most transactions are likely to occur with a few large retailers (Amazon, Walmart, etc.).
(defn pareto-user []
  (rand-pareto (* 1000 1000 1000) 0.9995))
rand-pareto transforms a uniform random distribution into a power law distribution.
(defn rand-pareto [r p]
  (let [a (/ (Math/log (- 1.0 p)) (Math/log p))
        x (rand)
        y (/ (- (+ (Math/pow x a) 1.0)
                (Math/pow (- 1.0 x) (/ 1.0 a)))
             2.0)]
    (long (* r y))))
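A direct Python port of rand-pareto, for readers who want to play with the distribution (the hot-band threshold below is my own illustration, not part of the benchmark):

```python
import math
import random

# Map a uniform x in [0,1) to a heavily skewed index in [0, r), so a tiny
# band of "hot" ids receives the overwhelming majority of draws.
def rand_pareto(r, p):
    a = math.log(1.0 - p) / math.log(p)
    x = random.random()
    y = ((x ** a + 1.0) - (1.0 - x) ** (1.0 / a)) / 2.0
    return int(r * y)

# Usage: with p = 0.9995 almost all draws land in the lowest 0.05% of ids.
random.seed(1)
draws = [rand_pareto(10**9, 0.9995) for _ in range(10_000)]
hot = sum(d < 500_000 for d in draws)  # ids in the hottest 0.05% band
print(hot > 9_000)  # True
```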
Network database
Let’s start with network databases.
(tx-per-second 100000
  (jdbc/with-transaction [tx pg-db]
    (jdbc/execute! tx (credit-random-account))
    (jdbc/execute! tx (debit-random-account))))
;; => 13756 TPS
A respectable 13756 TPS.
However, typically a network database will not be on the same server as our application. So let’s simulate some network latency. Let’s say there is 5ms latency between your app server and your database.
(tx-per-second 10000
  (jdbc/with-transaction [tx pg-db]
    (jdbc/execute! tx (credit-random-account))
    (Thread/sleep 5)
    (jdbc/execute! tx (debit-random-account))))
;; => 1214 TPS
Note: Sleeping a virtual thread does not block a real thread; the underlying carrier thread is freed to resume another virtual thread.
What happens if we increase that latency to 10 ms?
(tx-per-second 10000
  (jdbc/with-transaction [tx pg-db]
    (jdbc/execute! tx (credit-random-account))
    (Thread/sleep 10)
    (jdbc/execute! tx (debit-random-account))))
;; => 702 TPS
But wait, our transactions are not serializable, which we want if we're going to have consistent transaction processing (SQLite's isolation is serializable by design). We'd better fix that and handle retries.
(tx-per-second 10000
  (loop []
    (let [result
          (try
            (jdbc/with-transaction [tx pg-db {:isolation :serializable}]
              (jdbc/execute! tx (credit-random-account))
              (Thread/sleep 10)
              (jdbc/execute! tx (debit-random-account)))
            (catch Exception _ nil))]
      (when-not result (recur)))))
;; => 660 TPS
What if there is an additional query (an additional network hop) in the interactive transaction?
(tx-per-second 10000
  (loop []
    (let [result
          (try
            (jdbc/with-transaction [tx pg-db {:isolation :serializable}]
              (jdbc/execute! tx (credit-random-account))
              (Thread/sleep 10)
              (jdbc/execute! tx (debit-random-account))
              (Thread/sleep 10)
              (jdbc/execute! tx (debit-random-account)))
            (catch Exception _ nil))]
      (when-not result (recur)))))
;; => 348 TPS
348 TPS! What's going on here? Amdahl's law applies!
The improvement in overall performance achieved by optimizing any one part of a system is limited to the fraction of time that the improved part is actually used.
We are holding row locks across a network, with high contention due to the power law. The scary part is that no amount of extra money (CPU/servers/memory) will save us; this is a hard limit imposed by the network. Worse, any unexpected increase in latency compounds the problem. It also means you cannot put application servers in a different data center from your database (due to the increased latency).
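A back-of-envelope sketch of that hard limit (the numbers below are illustrative assumptions, not measurements from the benchmark): while an interactive transaction holds a row lock across network round trips, that row can serialize at most 1 / lock-hold-time transactions per second, no matter how much hardware you add.

```python
# Ceiling for a single contended row: while a transaction holds the row's
# lock across the network, nobody else can touch that row.
round_trips_in_tx = 3   # first UPDATE, app-code hop, second UPDATE/COMMIT
latency_s = 0.010       # assume 10ms per round trip
lock_held_s = round_trips_in_tx * latency_s

# One hot row can commit at most this many transactions per second,
# regardless of how many CPUs or app servers you add:
per_row_ceiling = 1 / lock_held_s
print(int(per_row_ceiling))  # 33
```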
I learned this the hard way building an emoji-based tipping bot for Discord. At the time I didn't understand why we kept hitting this hard ceiling in TPS. We sacrificed the convenience of interactive transactions and moved everything into stored procedures (meaning no locks held across the network). However, this is not possible in many domains.
Embedded means no network
Let’s see how SQLite fares.
(tx-per-second 1000000
  (d/with-write-tx [tx (lite-db :writer)]
    (d/q tx (credit-random-account))
    (d/q tx (debit-random-account))))
;; => 44096 TPS
44096 TPS! By eliminating the network, SQLite largely sidesteps Amdahl's law.
Single writer lets you batch
But we don't need to stop there. Because SQLite has a single writer, we can batch writes. sqlite4clj provides a convenient dynamic batching function: batch size grows dynamically with load, and producers don't have to block while the consumer is busy. Effectively, it optimizes itself for both latency and throughput.
(defn batch-fn [db batch]
  @(on-pool! lite-write-pool
     (d/with-write-tx [tx db]
       (run! (fn [thunk] (thunk tx)) batch))))

(defonce tx!
  (b/async-batcher-init! lite-db
    {:batch-fn #'batch-fn}))
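The dynamic batching idea can be sketched in Python (the names and structure here are mine, not sqlite4clj's): producers enqueue work without blocking, and a single consumer drains everything pending into one batch, so batches stay small under light load and grow under heavy load.

```python
import queue
import threading

work_q = queue.Queue()
batch_sizes = []

def consumer():
    while True:
        batch = [work_q.get()]        # block for at least one item
        while not work_q.empty():     # then drain everything pending
            batch.append(work_q.get())
        if batch[0] is None:          # shutdown sentinel
            return
        # One write transaction would wrap the whole batch here.
        batch_sizes.append(len(batch))
        for thunk, done in batch:
            thunk()
            done.set()                # resolve the producer's "future"

t = threading.Thread(target=consumer)
t.start()
done_events = []
for _ in range(100):                  # producers enqueue 100 work items
    done = threading.Event()
    done_events.append(done)
    work_q.put((lambda: None, done))
for done in done_events:              # wait for every item to be processed
    done.wait()
work_q.put(None)
t.join()
# All 100 items ran, grouped into (usually far fewer than 100) batches.
print(sum(batch_sizes))  # 100
```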
Note: For the Clojure/Java programmers, we use a thread pool here because SQLite work should be treated as CPU-bound, not IO-bound, so we don't want it to starve our virtual threads (IO-oriented green threads).
(tx-per-second 1000000
  @(tx!
     (fn [tx]
       (d/q tx (credit-random-account))
       (d/q tx (debit-random-account)))))
;; => 186157 TPS
But wait, I hear you cry! The trick is that we no longer have isolated transaction failures: batching sacrifices granular transactions. You're right! Let's fix that.
(tx-per-second 1000000
  @(tx!
     (fn [tx]
       (d/q tx ["SAVEPOINT inner_tx"])
       (try
         (d/q tx (credit-random-account))
         (d/q tx (debit-random-account))
         (catch Throwable _
           (d/q tx ["ROLLBACK TO inner_tx"])))
       (d/q tx ["RELEASE inner_tx"]))))
;; => 121922 TPS
SQLite supports nested transactions via SAVEPOINT. This lets us perform fine-grained transaction rollback even while batching our writes: if one transaction fails, it does not fail the whole batch. The only case where the entire batch fails is a power loss or a hard crash.
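For illustration, here is the same savepoint-inside-a-batch pattern using Python's built-in sqlite3 (the table and deltas are made up): the failing item rolls back to its savepoint while the other items in the batch commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual tx control
conn.execute("CREATE TABLE account(id INTEGER PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO account VALUES (1, 100)")

conn.execute("BEGIN")                      # the outer batch transaction
for delta in (10, None, 30):               # None simulates a failing item
    conn.execute("SAVEPOINT inner_tx")
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = 1", (delta,))
        if delta is None:
            raise ValueError("bad item")
    except Exception:
        # Undo only this item; the rest of the batch is unaffected.
        conn.execute("ROLLBACK TO inner_tx")
    conn.execute("RELEASE inner_tx")
conn.execute("COMMIT")

# The two good items committed; the bad one was rolled back: 100 + 10 + 30.
balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 140
```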
What about concurrent reads?
Typically a system has a mix of reads and writes, somewhere around 75% reads to 25% writes. So let's add some reads.
(tx-per-second 1000000
  (on-pool! lite-read-pool
    (d/q (lite-db :reader)
      ["select * from account where id = ? limit 1" (pareto-user)]))
  (on-pool! lite-read-pool
    (d/q (lite-db :reader)
      ["select * from account where id = ? limit 1" (pareto-user)]))
  (on-pool! lite-read-pool
    (d/q (lite-db :reader)
      ["select * from account where id = ? limit 1" (pareto-user)]))
  @(tx!
     (fn [tx]
       (d/q tx ["SAVEPOINT inner_tx"])
       (try
         (d/q tx (credit-random-account))
         (d/q tx (debit-random-account))
         (catch Throwable _
           (d/q tx ["ROLLBACK TO inner_tx"])))
       (d/q tx ["RELEASE inner_tx"]))))
;; => 102545 TPS
102545 TPS!
Note: For Clojure/Java programmers we are using a separate read thread pool so that reads don’t starve writes.
TPS report
| | postgres | sqlite |
|---|---|---|
| no network | 13756 | 44096 |
| 5ms | 1214 | N/A |
| 10ms | 702 | N/A |
| 10ms serializable | 660 | N/A |
| batch | N/A | 186157 |
| batch savepoint | N/A | 121922 |
| batch savepoint + reads | N/A | 102545 |
Conclusion
Hopefully this post helps illustrate the unreasonable effectiveness of SQLite, as well as Amdahl's law and the challenges you can run into with network databases like Postgres.
The full benchmark code can be found here.
Further reading:
If you want to learn more about Amdahl's law, power laws and how they interact with network databases, I highly recommend listening to this interview with Joran Dirk Greef and checking out his talk 1000x: The Power of an Interface for Performance.
If you want to see how far you can go with SQLite, check out Scaling SQLite up to 4M QPS on a single server (EC2 vs Bare Metal).
If you're thinking about running SQLite in production and wondering how to create streaming replicas, backups, and projections, check out Litestream.
If you still don't think a single machine can handle your workload, then Scalability! But at what COST? is worth reading.
Thanks to everyone on the Datastar Discord who read the draft and gave me feedback.