Implementing Santorini Bitboards with Haskell
0. Prerequisites
- Haskell: GHC version 8.10.x
- Libraries
- containers:
IntMap
for key-value pairs andSequence
for queues. - QuickCheck: For random instance generation.
- criterion: For benchmarking.
- containers:
1. Computing Player’s Moves
Problem Setting
Santorini is a turn-based strategy game for two players. The board consists of \(5 \times 5\) square spaces, and each space has a level from \(0\) to \(4\). Here is an excerpt from its basic rules.
- Each player owns two workers.
- On each turn, a player must move and then build with the selected worker.
- Any space at level \(4\) is capped; no worker can move or build there.
- A worker may move into one of the (up to) eight neighboring unoccupied spaces (like King’s movement in Western chess). A worker may:
- move up a maximum of one level higher
- move down any number of levels lower
- or move along the same level.
- A player can build a block on an unoccupied space neighboring the moved worker.
- If one of the workers moves up on top of level \(3\), the player wins.
Now, consider a program that computes of a worker’s all possible moves. For simplicity, We treat the spaces occupied by the other workers as level \(4\) so that a moving worker cannot reach any occupied spaces.
The signature of the function could be like this.
type Index = (Int, Int)
type Level = Int
getMoveTo :: (Index, [Level]) -> [Index]
getMoveTo (moveFrom, spaces) = ...
The Index
consists of a pair of row and column indices (1-indexed). Any Level
should be between \(0\) and \(4\), inclusive. The function getMoveTo()
takes the following three arguments.
moveFrom
: position of the selected worker- Example:
(1, 2)
- Example:
spaces
: current level for each space, represented as a list for total \(25\) spaces \((1,1),(1,2),\ldots,(1,5),(2,1),\ldots(5,5)\).- Example:
[0,0,2,0,0,1,2,0,4,1,3,0,0,0,1,0,0,1,3,4,0,1,1,2,2]
- Example:
And we expect a result: [(1, 1), (2, 1), (2, 3)]
Naive Implementation
Here is an example straightforward implementation using list comprehension.
getMoveToNaive :: (Index, [Level]) -> [Index]
getMoveToNaive ((x, y), spaces) =
let levels = IntMap.fromList $ zip [0 ..] spaces
in [ (xx, yy)
| dx <- [-1, 0, 1],
dy <- [-1, 0, 1],
dx /= 0 || dy /= 0,
let xx = x + dx, -- move-to candidate
let yy = y + dy,
1 <= xx && xx <= 5, -- boundary check
1 <= yy && yy <= 5,
levels ! fromIndex (xx, yy) <= 3, -- cannot move up to level 4
levels ! fromIndex (x, y) + 1 >= levels ! fromIndex (xx, yy) -- can move up at most one level
]
But can we compute faster?
Implementing Bitboard
The idea of the bitboard is to represent each space as a bit (\(0\) or \(1\)) and the entire board as a bit array. There are \(25\) spaces on the board, but it is more convenient to have extra spaces around the boundary. We employ the following bitwise representation, which fits in a 64-bit integer. For example, space \((1,1)\) maps to index \(8\), and space \((1,2)\) to index \(9\), and in general, \((x,y)\) maps to \(7x+y\).
col 1 2 3 4 5
=============================
0 1 2 3 4 5 6
row --------------------
1 7 | 8 9 10 11 12| 13
2 14 |15 16 17 18 19| 20
3 21 |22 23 24 25 26| 27
4 28 |29 30 31 32 33| 34
5 35 |36 37 38 39 40| 41
--------------------
42 43 44 45 46 47 48
Notice that valid spaces are no longer from \(0\) to \(24\). We define \(V=\{7x + y \mid 1\leq x,y \leq 5\}\) as the set of valid indices. To represent a bitboard as an integer, we define the following function: \(b(X) = \sum_{i \in X} 2^i\). For instance, \(b(V) = 2147077824256\), which we can also represent as follows. In the diagram, -
means \(0\) (bit off) and *
means \(1\) (bit on).
-------
-*****-
-*****-
-*****-
-*****-
-*****-
-------
Here I would like to introduce several operations.
Neighborhood of an index
Let \(i\) be a valid index, and \(S=\{0, 1, 2, 7, 9, 14, 15, 16\}\), which is equivalent to the following diagram.
***----
*-*----
***----
-------
-------
-------
-------
Then, the bitboard of \(i\)’s neighborhood \(N(i)\) is given by bit shift and a few simple operations: \(N(i)=b^{-1}[\left(b(S)\ll(i-8)\right) \land b(V)]\).
getNeighborhood :: BBIndex -> BitBoard
getNeighborhood i = x `shift` (i + s) .&. globalMask
where
x = 115335 -- listToBB [0, 1, 2, 7, 9, 14, 15, 16]
s = -8
This effectively checks the board boundary. The following shows an example of \(N(19)\).
x:=
b(S) b(S)<<(19-8) x & b(V)
=b(S)<<11
***---- ------- -------
*-*---- ----*** ----**-
***---- ----*-* ----*--
------- ----*** ----**-
------- ------- -------
------- ------- -------
------- ------- -------
Closed neighborhood of a bitboard
We define the closed neighborhood of a set of spaces \(X\), denoted by \(N[X]\), as follows: \(N[X]=\bigcup_{x \in X}N[x]\), where \(N[x]=N(x) \cup \{x\}\). Then, \(N[X]\) can also be obtained by \(N[X]=b^{-1}[ ((x \ll 7) \lor x \lor (x \gg 7)) \land b(V))]\), where \(x = (b(X)\ll 1) \lor b(X) \lor (b(X)\gg 1)\).
getClosedNeighborhood :: BitBoard -> BitBoard
getClosedNeighborhood bb =
let x = bb .|. (bb `shift` 1) .|. (bb `shift` (-1))
y = x .|. (x `shift` 7) .|. (x `shift` (-7))
in y .&. globalMask
For \(N[\{8, 24, 25\}]\), computation would be like this.
z:= x:= y:= y & b(V)
b({8,24,25}) (z<<1)| (x<<7)|
z|(z>>1) x|(x>>7)
------- ------- ***---- -------
-*----- ***---- ***---- -**----
------- ------- ******- -*****-
---**-- --****- --****- --****-
------- ------- --****- --****-
------- ------- ------- -------
------- ------- ------- -------
Computing Move Range with Bitboard
Now, we can efficently compute the neighborhood of the given index. But how do we compare levels to determine whether or not a worker can move up? Luckily enough, levels are at most \(4\), and we can ignore level-\(4\) spaces as workers cannot move there anyways. So, we keep track of bitboards from level \(0\) to \(3\). Let \(L_k\) be a set of spaces whose level is \(k\). Here is an example.
Levels L_0 L_1 L_2 L_3
------- ------- ------- ------- -------
-00200- -**-**- ------- ---*--- -------
-12041- ---*--- -*---*- --*---- -------
-30001- --***-- -----*- ------- -*-----
-00134- -**---- ---*--- ------- ----*--
-01122- -*----- --**--- ----**- -------
------- ------- ------- ------- -------
By definition, all \(1\)-bits in \(L_k\) are distinct. The move-to range of a worker at index \(i\), denoted by \(M(i)\), can be computed as follows.
\[M(i) = \begin{cases} N(i) \land (L_0 \lor L_1) &\text{if }\ i \in L_0\\ N(i) \land (L_0 \lor L_1 \lor L_2) &\text{if }\ i \in L_1\\ N(i) \land (L_0 \lor L_1 \lor L_2 \lor L_3) &\text{if }\ i \in L_2 \lor L_3 \end{cases}\]This should be more efficient than checking the levels of \(i\)’s all neighbors. The following is an example implementation, along with the functions that convert input and output.
convertInput :: (Index, [Level]) -> (Int, [BitBoard])
convertInput (moveFrom, spaces) =
( toBBIndex moveFrom,
foldl'
( \[x0, x1, x2, x3] (i, y) ->
let bb = singletonBB i
in case y of
0 -> [x0 + bb, x1, x2, x3]
1 -> [x0, x1 + bb, x2, x3]
2 -> [x0, x1, x2 + bb, x3]
3 -> [x0, x1, x2, x3 + bb]
_ -> [x0, x1, x2, x3]
)
[0, 0, 0, 0]
(zip validIndices spaces)
)
convertOutput :: BitBoard -> [Index]
convertOutput = map fromBBIndex . bbToList
getMoveToWithBB :: (Index, [Level]) -> [Index]
getMoveToWithBB = convertOutput . getMoveToWithBB' . convertInput
getMoveToWithBB' :: (BBIndex, [BitBoard]) -> BitBoard
getMoveToWithBB' (i, [x0, x1, x2, x3]) = case getNeighborhood i of
nbr | i `elemBB` x0 -> nbr .&. (x0 .|. x1)
nbr | i `elemBB` x1 -> nbr .&. (x0 .|. x1 .|. x2)
nbr -> nbr .&. (x0 .|. x1 .|. x2 .|. x3)
getMoveToWithBB' _ = undefined
2. Computing All Distances
The biggest advantage of using bitboards is that we can perform (most) operations with a bitboard in constant time, regardless of the number of \(1\)-bits. Naturally, we want to compute the union of move ranges, that is: \(M(X)=\bigcup_{x \in X}M(x)\), where \(X\) is a set of spaces instead of a single index. It is not hard to see the following relationship.
\[\begin{aligned} M(X) =&(N[X] \cap L_0)\\ &\cup (N[X] \cap L_1)\\ &\cup (N[X \cap (L_1 \cup L_2 \cup L_3)] \cap L_2)\\ &\cup (N[X \cap (L_2 \cup L_3)] \cap L_3) \end{aligned}\]Here is an implementation. Note that \(b(L_1) \lor b(L_2) \lor b(L_3) = b(L_1) \oplus b(L_2) \oplus b(L_3) = b(L_0) \oplus b(L_1) \oplus b(L_2) \oplus b(L_3) \oplus b(L_0)\), where \(\oplus\) denotes exclusive or, because all \(1\)-bits are distinct among all levels.
getMoveToBB :: [BitBoard] -> [BitBoard] -> [BitBoard]
getMoveToBB [v0, v1, v2, v3] [x0, x1, x2, x3] =
let xx = x0 .|. x1 .|. x2 .|. x3
y0 = getClosedNeighborhood xx .&. v0
y1 = getClosedNeighborhood xx .&. v1
y2 = getClosedNeighborhood (xx `xor` x0) .&. v2
y3 = getClosedNeighborhood (x2 .|. x3) .&. v3
in [y0, y1, y2, y3]
getMoveToBB _ _ = undefined
Unfortunately, this function is not as fast as getMoveToWithBB'
when a bitboard is singleton, that is, only one bit is on. However, if we consider the following problem, getMoveToBB
can be advantageous.
Given an index \(i\), compute distances from \(i\) to all spaces.
The signagure would be:
getDistances :: (Index, [Level]) -> IntMap Int
getDistances (i, spaces) = ...
Here i
is the starting index, and spaces
holds level information. We expect key-value pairs where the key is the index of a space, normalized to the range from \(0\) to \(24\), and the value is the distance from \(i\). The distance from \(i\) to \(i\) itself is \(0\), and if a space is unreachable from \(i\), then the distance should be \(\infty\).
Naive BFS
This is so-called the single-source shortest path problem, and we could tackle this by BFS (breadth-first search). We use Data.Sequence
for a queue because Data.List
is not performant when an element is added to the last.
getDistancesNaive :: (Index, [Level]) -> IntMap Int
getDistancesNaive (moveFrom, spaces) =
let initMap = IntMap.fromList [(i, if i == fromIndex moveFrom then 0 else distInf) | i <- [0 .. 24]]
in getDistancesNaive' spaces (Seq.fromList [moveFrom]) initMap
getDistancesNaive' :: [Level] -> Seq Index -> IntMap Int -> IntMap Int
getDistancesNaive' spaces q sofar | Seq.null q = sofar
getDistancesNaive' spaces q sofar =
let x = q `Seq.index` 0
x' = fromIndex x
nbrs = getMoveToNaive (x, spaces)
unseen = Seq.fromList [nbr | nbr <- nbrs, (sofar ! fromIndex nbr) == distInf]
d = (sofar ! x') + 1
q' = Seq.drop 1 q Seq.>< unseen
sofar' = foldl (\m u -> IntMap.insert (fromIndex u) d m) sofar unseen
in getDistancesNaive' spaces q' sofar'
BFS with Bitboards
Here is the BFS code using getMoveToBB
with bitboards.
getDistancesWithBB :: (Index, [Level]) -> IntMap Int
getDistancesWithBB i =
let (moveFrom, levels) = convertInput i
result = getDistancesWithBB' (singletonBB moveFrom) levels
result' = [((fromIndex . fromBBIndex) j, d) | (d, x) <- zip [0 ..] result, j <- bbToList x]
in IntMap.fromList $ zip [0 .. 24] (repeat distInf) ++ result'
getDistancesWithBB' :: BitBoard -> [BitBoard] -> [BitBoard]
getDistancesWithBB' moveFrom levels = (takeWhile (/= 0) . map (sum . fst)) $ iterate' f (map (.&. moveFrom) levels, moveFrom)
where
f (frontier, visited) =
let ys = getMoveToBB levels frontier
in (map (`andNotBB` visited) ys, visited .|. sum ys)
3. Benchmarking
We want to compare the performance between with and witout bitboards.
Generating Random Boards
First, we use QuickCheck to generate random instances. We add some biases to levels because we do not want too many spaces to be level \(4\).
import Test.QuickCheck (chooseInt, elements, generate, shuffle, vectorOf)
generateRandomInput :: Int -> IO [(Index, [Level])]
generateRandomInput n = generate $
vectorOf n $ do
indices <- shuffle [0 .. 24]
workerLevel <- chooseInt (0, 2)
emptyLevels <- vectorOf 21 $ elements [0, 0, 1, 1, 1, 2, 2, 2, 3, 4] -- levels at empty spaces
let spaces = (map snd . sort . zip indices) (workerLevel : [4, 4, 4] ++ emptyLevels)
let worker = toIndex $head indices
return (worker, spaces)
Benchmark Code
We use criterion for benchmarking. See the tutorial for details.
We test the getMoveTo
functions with \(10^5\) instances, and getDistances
with \(10^4\) instances. We also check if those functions return exactly the same value.
import Criterion.Main (bench, bgroup, defaultMain, nf)
verify :: (Index, [Level]) -> Bool
verify x = getMoveToNaive x == getMoveToWithBB x && getDistancesNaive x == getDistancesWithBB x
main :: IO ()
main = do
instances <- generateRandomInput 100000
let instances' = map convertInput instances
-- check correctness
guard $ all verify instances
-- benchmark
defaultMain
[ bgroup
"getMoveTo"
[ bench "Naive" $ nf (map getMoveToNaive) instances,
bench "BB" $ nf (map getMoveToWithBB) instances,
bench "BB (core)" $ nf (map getMoveToWithBB') instances'
],
bgroup
"getDistances"
[ bench "Naive" $ nf (map getDistancesNaive) (take 10000 instances),
bench "BB" $ nf (map getDistancesWithBB) (take 10000 instances)
]
]
The benchmark BB (core) measures only the core logic without input and output conversions.
Benchmark Result
As we expected, using bitboards improved the overall performance for both problems. If we compare core logic, getMoveToWithBB
is 20 times faster than the naive implementation. As for computing distances, getDistancesWithBB
is around five times faster than the naive one.
benchmarking getMoveTo/Naive
time 201.3 ms (190.9 ms .. 221.3 ms)
0.994 R² (0.969 R² .. 1.000 R²)
mean 211.2 ms (203.5 ms .. 217.3 ms)
std dev 8.909 ms (6.801 ms .. 11.81 ms)
variance introduced by outliers: 14% (moderately inflated)
benchmarking getMoveTo/BB
time 154.8 ms (144.8 ms .. 166.7 ms)
0.993 R² (0.975 R² .. 1.000 R²)
mean 149.6 ms (145.3 ms .. 154.9 ms)
std dev 7.277 ms (5.587 ms .. 8.660 ms)
variance introduced by outliers: 12% (moderately inflated)
benchmarking getMoveTo/BB (core)
time 8.066 ms (7.902 ms .. 8.234 ms)
0.996 R² (0.993 R² .. 0.998 R²)
mean 7.967 ms (7.861 ms .. 8.100 ms)
std dev 336.1 μs (251.3 μs .. 483.2 μs)
variance introduced by outliers: 19% (moderately inflated)
benchmarking getDistances/Naive
time 446.4 ms (159.1 ms .. 618.8 ms)
0.956 R² (0.851 R² .. 1.000 R²)
mean 561.1 ms (502.6 ms .. 608.0 ms)
std dev 61.83 ms (10.67 ms .. 84.72 ms)
variance introduced by outliers: 23% (moderately inflated)
benchmarking getDistances/BB
time 86.04 ms (82.47 ms .. 91.98 ms)
0.993 R² (0.983 R² .. 0.999 R²)
mean 83.13 ms (81.23 ms .. 85.34 ms)
std dev 3.896 ms (2.993 ms .. 5.197 ms)
See the full report for more details.
4. Code
- Repository: https://github.com/mogproject/santorini-bitboard-example
- Algorithms: BenchMain.hs
- BitBoard: BitBoard.hs