发布于 2026-01-06 0 阅读
0

用 Haskell 编写一个快速数独求解器

用 Haskell 编写一个快速数独求解器

数独是一种数字填空谜题。它由一个 9x9 的方格组成,需要用 1 到 9 的数字填充。方格中的一些单元格已经预先填好,玩家需要填写其余的单元格。

Haskell是一种纯函数式编程语言。鉴于数独问题的组合性质,它是解决数独问题的理想选择。本系列文章旨在用 Haskell 编写一个快速的数独求解器。我们将重点关注解决方案的实现和效率优化,并逐步深入,本文将从一个速度较慢但简单的解决方案开始。

这篇文章最初发表在我的博客上。

内容

  1. 约束满足问题
  2. 设置
  3. 修剪细胞
  4. 修剪网格
  5. 做出选择
  6. 解开谜题
  7. 结论

约束满足问题

解数独是一个约束满足问题。给定一个部分填充的网格,我们需要将其完全填充,并满足以下所有约束条件:

  1. 九行中的每一行都必须包含从 1 到 9 的所有数字。
  2. 九列中的每一列都必须包含从 1 到 9 的所有数字。
  3. 九个 3x3 子网格中的每一个都必须包含从 1 到 9 的所有数字。
+-------+-------+-------+
| . . . | . . . | . 1 . |
| 4 . . | . . . | . . . |
| . 2 . | . . . | . . . |
+-------+-------+-------+
| . . . | . 5 . | 4 . 7 |
| . . 8 | . . . | 3 . . |
| . . 1 | . 9 . | . . . |
+-------+-------+-------+
| 3 . . | 4 . . | 2 . . |
| . 5 . | 1 . . | . . . |
| . . . | 8 . 6 | . . . |
+-------+-------+-------+
    A sample puzzle

+-------+-------+-------+
| 6 9 3 | 7 8 4 | 5 1 2 |
| 4 8 7 | 5 1 2 | 9 3 6 |
| 1 2 5 | 9 6 3 | 8 7 4 |
+-------+-------+-------+
| 9 3 2 | 6 5 1 | 4 8 7 |
| 5 6 8 | 2 4 7 | 3 9 1 |
| 7 4 1 | 3 9 8 | 6 2 5 |
+-------+-------+-------+
| 3 1 9 | 4 7 5 | 2 6 8 |
| 8 5 6 | 1 2 9 | 7 4 3 |
| 2 7 4 | 8 3 6 | 1 5 9 |
+-------+-------+-------+
    and its solution

数独网格中的每个单元格都属于一行、一列和一个子网格(通常称为“块”1 )。预填充单元格中的数字对其所属的行、列和子网格施加了约束。例如,如果一个单元格包含 1 ,那么该单元格所在行、列或子网格中的其他单元格就不能包含11。基于这些约束,我们可以设计一个简单的算法来解决数独问题:

1. 每个单元格要么包含一个数字,要么包含一组可能的数字。例如,下图展示了上述示例谜题中所有未填充单元格的所有可能值:

+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] 1           [123456789] |
| 4           [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] 2           [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] 5           [123456789] | 4           [123456789] 7           |
| [123456789] [123456789] 8           | [123456789] [123456789] [123456789] | 3           [123456789] [123456789] |
| [123456789] [123456789] 1           | [123456789] 9           [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3           [123456789] [123456789] | 4           [123456789] [123456789] | 2           [123456789] [123456789] |
| [123456789] 5           [123456789] | 1           [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] [123456789] [123456789] | 8           [123456789] 6           | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+

2. 如果一个单元格包含数字,则从其所有相邻单元格的可能数字列表中移除该数字。相邻单元格是指给定单元格所在行、列和子网格中的其他单元格。例如,4从相邻单元格中移除第 2 行第 1 列单元格的固定值后,网格如下:

+-------------------------------------+-------------------------------------+-------------------------------------+
| [123 56789] [123 56789] [123 56789] | [123456789] [123456789] [123456789] | [123456789] 1           [123456789] |
| 4           [123 56789] [123 56789] | [123 56789] [123 56789] [123 56789] | [123 56789] [123 56789] [123 56789] |
| [123 56789] 2           [123 56789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123 56789] [123456789] [123456789] | [123456789] 5           [123456789] | 4           [123456789] 7           |
| [123 56789] [123456789] 8           | [123456789] [123456789] [123456789] | 3           [123456789] [123456789] |
| [123 56789] [123456789] 1           | [123456789] 9           [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3           [123456789] [123456789] | 4           [123456789] [123456789] | 2           [123456789] [123456789] |
| [123 56789] 5           [123456789] | 1           [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123 56789] [123456789] [123456789] | 8           [123456789] 6           | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+

3. 对所有已求解(或已固定)的单元格重复上一步,这些单元格可以是预先填充的,也可以是在上一次迭代求解过程中填充的。例如,从所有非固定单元格中移除所有固定值后的网格如下:

+-------------------------------------+-------------------------------------+-------------------------------------+
| [    56789] [  3  6789] [  3 567 9] | [ 23 567 9] [ 234 678 ] [ 2345 789] | [    56789] 1           [ 23456 89] |
| 4           [1 3  6789] [  3 567 9] | [ 23 567 9] [123  678 ] [123 5 789] | [    56789] [ 23 56789] [ 23 56 89] |
| [1   56789] 2           [  3 567 9] | [  3 567 9] [1 34 678 ] [1 345 789] | [    56789] [  3456789] [  3456 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 2   6  9] [  3  6  9] [ 23  6  9] | [ 23  6   ] 5           [123    8 ] | 4           [ 2   6 89] 7           |
| [ 2  567 9] [   4 67 9] 8           | [ 2   67  ] [12 4 67  ] [12 4  7  ] | 3           [ 2  56  9] [12  56  9] |
| [ 2  567  ] [  34 67  ] 1           | [ 23  67  ] 9           [ 234  78 ] | [    56 8 ] [ 2  56 8 ] [ 2  56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3           [1    6789] [     67 9] | 4           7           [    5 7 9] | 2           [    56789] [1   56 89] |
| [ 2   6789] 5           [ 2 4 67 9] | 1           [ 23   7  ] [ 23   7 9] | [     6789] [  34 6789] [  34 6 89] |
| [12    7 9] [1  4  7 9] [ 2 4  7 9] | 8           [ 23   7  ] 6           | [1   5 7 9] [  345 7 9] [1 345   9] |
+-------------------------------------+-------------------------------------+-------------------------------------+

4. 继续迭代,直到网格稳定下来,即所有单元格的取值范围不再发生变化。例如,当前迭代的稳定网格如下:

+-------------------------------------+-------------------------------------+-------------------------------------+
| [    56789] [  3  6789] [  3 567 9] | [ 23 567 9] [ 234 6 8 ] [ 2345 789] | [    56789] 1           [ 23456 89] |
| 4           [1 3  6789] [  3 567 9] | [ 23 567 9] [123  6 8 ] [123 5 789] | [    56789] [ 23 56789] [ 23 56 89] |
| [1   56789] 2           [  3 567 9] | [  3 567 9] [1 34 6 8 ] [1 345 789] | [    56789] [  3456789] [  3456 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 2   6  9] [  3  6  9] [ 23  6  9] | [ 23  6   ] 5           [123    8 ] | 4           [ 2   6 89] 7           |
| [ 2  567 9] [   4 67 9] 8           | [ 2   67  ] [12 4 6   ] [12 4  7  ] | 3           [ 2  56  9] [12  56  9] |
| [ 2  567  ] [  34 67  ] 1           | [ 23  67  ] 9           [ 234  78 ] | [    56 8 ] [ 2  56 8 ] [ 2  56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3           [1    6 89] [     6  9] | 4           7           [    5   9] | 2           [    56 89] [1   56 89] |
| [ 2   6789] 5           [ 2 4 67 9] | 1           [ 23      ] [ 23     9] | [     6789] [  34 6789] [  34 6 89] |
| [12    7 9] [1  4  7 9] [ 2 4  7 9] | 8           [ 23      ] 6           | [1   5 7 9] [  345 7 9] [1 345   9] |
+-------------------------------------+-------------------------------------+-------------------------------------+

5. 网格稳定后,按照某种策略选择一个未固定的单元格。从该单元格所有可能的数字中选择一个,并将该单元格固定(假设)为该数字。返回步骤 1 并重复。6
. 排除可能性可能会导致不一致。例如,您可能会遇到一个没有任何可能数字的单元格。在这种情况下,放弃该解法分支,并返回到您上次固定单元格的位置。选择另一个可能的数字进行固定并重复。7
. 如果网格在任何时候完全填满,您就找到了解决方案!
8. 如果您穷尽了所有解法分支,则此谜题无解。这种情况可能发生在初始单元格被错误填充时。

该算法实际上是对网格配置状态空间进行深度优先搜索。它保证要么找到一个解,要么证明该难题无解。

设置

我们首先编写类型来表示单元格和网格:

data Cell = Fixed Int | Possible [Int] deriving (Show, Eq)
type Row  = [Cell]
type Grid = [Row]

单元格要么固定为某个特定数字,要么有一组可能的数字可供选择。因此,很自然地将其表示为带有`and`构造函数的求和类型。行是单元格列表,网格是行列表。FixedPossible

我们将输入谜题视为一个由 81 个字符组成的字符串,代表从左到右、从上到下的单元格。例如:

.......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6...

这里,.表示一个未填充的单元格。让我们编写一个函数来读取此输入并将其解析为我们的Grid数据结构:

readGrid :: String -> Maybe Grid
readGrid s
  | length s == 81 = traverse (traverse readCell) . Data.List.Split.chunksOf 9 $ s
  | otherwise      = Nothing
  where
    readCell '.' = Just $ Possible [1..9]
    readCell c
      | Data.Char.isDigit c && c > '0' = Just . Fixed . Data.Char.digitToInt $ c
      | otherwise = Nothing

readGridJust grid如果输入正确,则返回 a ,否则返回Nothing。它将 a 解析.Possible包含所有可能数字的单元格,并将数字字符解析为包含该数字的单元格。让我们在REPLFixed中尝试一下

*Main> Just grid = readGrid ".......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> mapM_ print grid
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9]]
[Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Fixed 2,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 5,Possible [1,2,3,4,5,6,7,8,9],Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Fixed 7]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 8,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 3,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9],Fixed 9,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Fixed 3,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 2,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Fixed 5,Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 8,Possible [1,2,3,4,5,6,7,8,9],Fixed 6,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]

输出结果虽然不太易读,但是正确的。我们可以编写一些函数来清理它:

showGrid :: Grid -> String
showGrid = unlines . map (unwords . map showCell)
  where
    showCell (Fixed x) = show x
    showCell _ = "."

showGridWithPossibilities :: Grid -> String
showGridWithPossibilities = unlines . map (unwords . map showCell)
  where
    showCell (Fixed x)     = show x ++ "          "
    showCell (Possible xs) =
      (++ "]")
      . Data.List.foldl' (\acc x -> acc ++ if x `elem` xs then show x else " ") "["
      $ [1..9]

回到REPL

*Main> Just grid = readGrid ".......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStrLn $ showGrid grid
. . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
*Main> putStrLn $ showGridWithPossibilities grid
[123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] 1           [123456789]
4           [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] 2           [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] [123456789] [123456789] [123456789] 5           [123456789] 4           [123456789] 7
[123456789] [123456789] 8           [123456789] [123456789] [123456789] 3           [123456789] [123456789]
[123456789] [123456789] 1           [123456789] 9           [123456789] [123456789] [123456789] [123456789]
3           [123456789] [123456789] 4           [123456789] [123456789] 2           [123456789] [123456789]
[123456789] 5           [123456789] 1           [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] [123456789] [123456789] 8           [123456789] 6           [123456789] [123456789] [123456789]

现在输出结果更易读了。我们可以看到,一开始所有未填充的单元格都包含了所有可能的数字值。接下来我们会用到这些函数进行调试。现在我们可以开始解谜了。

修剪细胞

我们可以逐个移除固定单元格中相邻单元格的数字。但是,更快的方法是找到一行单元格中的所有固定数字,并将它们从该行所有非固定单元格的可能值中移除。然后,我们可以对网格的所有行(以及列和子网格!稍后会介绍如何操作)重复此修剪步骤。

pruneCells :: [Cell] -> Maybe [Cell]
pruneCells cells = traverse pruneCell cells
  where
    fixeds = [x | Fixed x <- cells]

    pruneCell (Possible xs) = case xs Data.List.\\ fixeds of
      []  -> Nothing
      [y] -> Just $ Fixed y
      ys  -> Just $ Possible ys
    pruneCell x = Just x

pruneCells如前所述,对单元格列表进行修剪。首先,找出单元格列表中的固定数字。然后,遍历每个非固定单元格,从它们的可能值中移除已找到的固定数字。有两种特殊情况:

  • 如果修剪后得到的单元格没有可能的数字,则表明该搜索分支没有解决方案,因此Nothing在这种情况下,我们返回 a。
  • 如果修剪后只剩下一个可能的数字,那么我们就将该单元格变成一个固定单元格,其中包含该数字。

我们使用该traverse函数来修剪单元格,以便Nothing修剪一个单元格的结果能够传播到整个列表。

让我们在REPL 环境中试运行一下

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStr $ showGridWithPossibilities $ [head grid] -- first row of the grid
6           [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] 1           [123456789]
*Main> putStr $ showGridWithPossibilities [fromJust $ pruneCells $ head grid] -- same row after pruning
6           [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] 1           [ 2345 789]

成功了!6并且1排除了其他细胞的可能性。现在我们准备好了……

修剪网格

对网格进行剪枝需要剪枝每一行、每一列和每一个子网格。我们先尝试在REPL中解决这个问题:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = traverse pruneCells grid
*Main> putStr $ showGridWithPossibilities grid'
6           [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] 1           [ 2345 789]
4           [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789]
[1 3456789] 2           [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789]
[123  6 89] [123  6 89] [123  6 89] [123  6 89] 5           [123  6 89] 4           [123  6 89] 7
[12 4567 9] [12 4567 9] 8           [12 4567 9] [12 4567 9] [12 4567 9] 3           [12 4567 9] [12 4567 9]
[ 2345678 ] [ 2345678 ] 1           [ 2345678 ] 9           [ 2345678 ] [ 2345678 ] [ 2345678 ] [ 2345678 ]
3           [1   56789] [1   56789] 4           [1   56789] [1   56789] 2           [1   56789] [1   56789]
[ 234 6789] 5           [ 234 6789] 1           [ 234 6789] [ 234 6789] [ 234 6789] [ 234 6789] [ 234 6789]
[12345 7 9] [12345 7 9] [12345 7 9] 8           [12345 7 9] 6           [12345 7 9] [12345 7 9] [12345 7 9]

通过使用traverse`-ing` 操作网格pruneCells,我们可以逐行进行修剪。由于修剪一行不会影响其他行,因此我们无需在每次修剪步骤之间传递结果行。也就是说,`-ing` 操作traverse对我们来说就足够了,这里不需要 `-ing` 操作foldl

现在我们如何对列进行同样的操作呢?由于我们的网格表示是行优先的,所以我们首先需要将其转换为列优先的表示。幸运的是,这正是某个函数的Data.List.transpose作用:

*Main> Just grid = readGrid "693784512487512936125963874932651487568247391741398625319475268856129743274836159"
*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ Data.List.transpose grid
6 4 1 9 5 7 3 8 2
9 8 2 3 6 4 1 5 7
3 7 5 2 8 1 9 6 4
7 5 9 6 2 3 4 1 8
8 1 6 5 4 9 7 2 3
4 2 3 1 7 8 5 9 6
5 9 8 4 3 6 2 7 1
1 3 7 8 9 2 6 4 5
2 6 4 7 1 5 8 3 9

现在删除列很容易:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = fmap Data.List.transpose . traverse pruneCells . Data.List.transpose $ grid
*Main> putStr $ showGridWithPossibilities grid'
6           [1 34 6789] [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1   56789] 1           [123456 89]
4           [1 34 6789] [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1   56789] [ 23456789] [123456 89]
[12  5 789] 2           [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1   56789] [ 23456789] [123456 89]
[12  5 789] [1 34 6789] [ 234567 9] [ 23 567 9] 5           [12345 789] 4           [ 23456789] 7
[12  5 789] [1 34 6789] 8           [ 23 567 9] [1234 678 ] [12345 789] 3           [ 23456789] [123456 89]
[12  5 789] [1 34 6789] 1           [ 23 567 9] 9           [12345 789] [1   56789] [ 23456789] [123456 89]
3           [1 34 6789] [ 234567 9] 4           [1234 678 ] [12345 789] 2           [ 23456789] [123456 89]
[12  5 789] 5           [ 234567 9] 1           [1234 678 ] [12345 789] [1   56789] [ 23456789] [123456 89]
[12  5 789] [1 34 6789] [ 234567 9] 8           [1234 678 ] 6           [1   56789] [ 23456789] [123456 89]

首先,我们调整transpose网格,将列转换为行。然后,我们通过遍历行来修剪行traverse。最后,我们再次调整网格,pruneCells将行转换回列。最后一个步骤需要进行修剪,因为返回的是一个.transposetransposefmaptraverse pruneCellsMaybe

修剪子网格稍微复杂一些。与修剪列类似,我们需要两个函数来将子网格转换为行,以及将行转换回来。让我们编写第一个函数:

subGridsToRows :: Grid -> Grid
subGridsToRows =
  concatMap (\rows -> let [r1, r2, r3] = map (Data.List.Split.chunksOf 3) rows
                      in zipWith3 (\a b c -> a ++ b ++ c) r1 r2 r3)
  . Data.List.Split.chunksOf 3

试试看:

*Main> Just grid = readGrid "693784512487512936125963874932651487568247391741398625319475268856129743274836159"
*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ subGridsToRows grid
6 9 3 4 8 7 1 2 5
7 8 4 5 1 2 9 6 3
5 1 2 9 3 6 8 7 4
9 3 2 5 6 8 7 4 1
6 5 1 2 4 7 3 9 8
4 8 7 3 9 1 6 2 5
3 1 9 8 5 6 2 7 4
4 7 5 1 2 9 8 3 6
2 6 8 7 4 3 1 5 9

您可以检查代码和输出,确保其正常运行。另外,我们发现不需要编写反向变换函数,subGridsToRows它本身就是一个反向变换函数:

*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ subGridsToRows $ subGridsToRows $ grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9

太好了!现在编写子网格剪枝函数就容易多了:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = fmap subGridsToRows . traverse pruneCells . subGridsToRows $ grid
*Main> putStr $ showGridWithPossibilities grid'
6           [1 3 5 789] [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] 1           [ 23456789]
4           [1 3 5 789] [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] [ 23456789] [ 23456789]
[1 3 5 789] 2           [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] [ 23456789] [ 23456789]
[ 234567 9] [ 234567 9] [ 234567 9] [1234 678 ] 5           [1234 678 ] 4           [12  56 89] 7
[ 234567 9] [ 234567 9] 8           [1234 678 ] [1234 678 ] [1234 678 ] 3           [12  56 89] [12  56 89]
[ 234567 9] [ 234567 9] 1           [1234 678 ] 9           [1234 678 ] [12  56 89] [12  56 89] [12  56 89]
3           [12 4 6789] [12 4 6789] 4           [ 23 5 7 9] [ 23 5 7 9] 2           [1 3456789] [1 3456789]
[12 4 6789] 5           [12 4 6789] 1           [ 23 5 7 9] [ 23 5 7 9] [1 3456789] [1 3456789] [1 3456789]
[12 4 6789] [12 4 6789] [12 4 6789] 8           [ 23 5 7 9] 6           [1 3456789] [1 3456789] [1 3456789]

效果很好。现在我们可以将这三个步骤串联起来,对整个网格进行修剪。我们还需要确保每一步的修剪结果都能传递到下一步。这样,一步中创建的固定单元格就能在后续步骤中引发更多的修剪。>>=为此,我们使用单子绑定(monadic bind())。以下是最终代码:

pruneGrid' :: Grid -> Maybe Grid
pruneGrid' grid =
  traverse pruneCells grid
  >>= fmap Data.List.transpose . traverse pruneCells . Data.List.transpose
  >>= fmap subGridsToRows . traverse pruneCells . subGridsToRows

测试内容如下:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid' grid
*Main> putStr $ showGridWithPossibilities grid'
6           [  3   789] [  3 5 7 9] [ 23 5 7 9] [ 234  78 ] [ 2345 789] [    5 789] 1           [ 2345  89]
4           [1 3   789] [  3 5 7 9] [ 23 567 9] [123  678 ] [123 5 789] [    56789] [ 23 56789] [ 23 56 89]
[1   5 789] 2           [  3 5 7 9] [  3 567 9] [1 34 678 ] [1 345 789] [    56789] [  3456789] [  3456 89]
[ 2      9] [  3  6  9] [ 23  6  9] [ 23  6   ] 5           [123    8 ] 4           [ 2   6 89] 7
[ 2  5 7 9] [   4 67 9] 8           [ 2   67  ] [12 4 67  ] [12 4  7  ] 3           [ 2  56  9] [12  56  9]
[ 2  5 7  ] [  34 67  ] 1           [ 23  67  ] 9           [ 234  78 ] [    56 8 ] [ 2  56 8 ] [ 2  56 8 ]
3           [1    6789] [     67 9] 4           7           [    5 7 9] 2           [    56789] [1   56 89]
[ 2    789] 5           [ 2 4 67 9] 1           [ 23   7  ] [ 23   7 9] [     6789] [  34 6789] [  34 6 89]
[12    7 9] [1  4  7 9] [ 2 4  7 9] 8           [ 23   7  ] 6           [1   5 7 9] [  345 7 9] [1 345   9]
*Main> putStr $ showGrid grid
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
*Main> putStr $ showGrid grid'
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 7 . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .

我们可以清楚地看到网格周围各种可能性的大幅减少。我们还可以看到7第 7 行第 5 列的单元格中出现了一个弹出框。这意味着我们可以进一步修剪网格,直到它稳定下来。如果您熟悉 Haskell,您可能会意识到这实际上是在单子上下文中寻找函数的不动点pruneGrid'。它的实现很简单:

pruneGrid :: Grid -> Maybe Grid
pruneGrid = fixM pruneGrid'
  where
    fixM f x = f x >>= \x' -> if x' == x then return x else fixM f x'

这段代码的核心在于这个fixM函数。它接受一个单子函数f和一个初始值,并递归调用自身直到返回值稳定下来。让我们在REPL中再运行一次:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid grid
*Main> putStr $ showGridWithPossibilities grid'
6           [  3   789] [  3 5 7 9] [ 23 5 7 9] [ 234   8 ] [ 2345 789] [    5 789] 1           [ 2345  89]
4           [1 3   789] [  3 5 7 9] [ 23 567 9] [123  6 8 ] [123 5 789] [    56789] [ 23 56789] [ 23 56 89]
[1   5 789] 2           [  3 5 7 9] [  3 567 9] [1 34 6 8 ] [1 345 789] [    56789] [  3456789] [  3456 89]
[ 2      9] [  3  6  9] [ 23  6  9] [ 23  6   ] 5           [123    8 ] 4           [ 2   6 89] 7
[ 2  5 7 9] [   4 67 9] 8           [ 2   67  ] [12 4 6   ] [12 4  7  ] 3           [ 2  56  9] [12  56  9]
[ 2  5 7  ] [  34 67  ] 1           [ 23  67  ] 9           [ 234  78 ] [    56 8 ] [ 2  56 8 ] [ 2  56 8 ]
3           [1    6 89] [     6  9] 4           7           [    5   9] 2           [    56 89] [1   56 89]
[ 2    789] 5           [ 2 4 67 9] 1           [ 23      ] [ 23     9] [     6789] [  34 6789] [  34 6 89]
[12    7 9] [1  4  7 9] [ 2 4  7 9] 8           [ 23      ] 6           [1   5 7 9] [  345 7 9] [1 345   9]

我们看到,7第 7 行第 5 列的单元格已从其所有相邻单元格中移除。我们无法再对网格进行修剪。现在是时候做出选择了。

做出选择

网格确定后,我们需要选择一个非固定单元格,并通过赋予其一个可能的值将其固定。这样我们就得到了两个网格,接下来是解搜索的状态空间:

  • 其中一个单元格将选定的单元格固定为选定的数字,并且,
  • 另一种情况是,选定的单元格具有除我们选择要修复的那个单元格之外的所有其他可能性。

我们将这个函数称为nextGrids

nextGrids :: Grid -> (Grid, Grid)
nextGrids grid =
  let (i, first@(Fixed _), rest) =
        fixCell
        . Data.List.minimumBy (compare `Data.Function.on` (possibilityCount . snd))
        . filter (isPossible . snd)
        . zip [0..]
        . concat
        $ grid
  in (replace2D i first grid, replace2D i rest grid)
  where
    isPossible (Possible _) = True
    isPossible _            = False

    possibilityCount (Possible xs) = length xs
    possibilityCount (Fixed _)     = 1

    fixCell (i, Possible [x, y]) = (i, Fixed x, Fixed y)
    fixCell (i, Possible (x:xs)) = (i, Fixed x, Possible xs)
    fixCell _                    = error "Impossible case"

    replace2D :: Int -> a -> [[a]] -> [[a]]
    replace2D i v =
      let (x, y) = (i `quot` 9, i `mod` 9) in replace x (replace y (const v))
    replace p f xs = [if i == p then f x else x | (x, i) <- zip xs [0..]]

我们选择可能性最少的非固定单元格作为枢轴。这种策略符合直觉,因为可能性最少的单元格意味着我们做出正确假设的概率最大。固定一个非固定单元格会导致以下两种情况之一:

a. 单元格只有两个可能的值,导致两个固定单元格;或者,
b. 单元格有两个以上可能的值,导致一个固定单元格和一个非固定单元格。

接下来,我们只需将非固定单元格替换为其固定值和固定/非固定值选项,这需要一些数学运算和列表遍历。在REPL中快速检查一下:

*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid grid
*Main> putStr $ showGridWithPossibilities grid'
6           [  3   789] [  3 5 7 9] [ 23 5 7 9] [ 234   8 ] [ 2345 789] [    5 789] 1           [ 2345  89]
4           [1 3   789] [  3 5 7 9] [ 23 567 9] [123  6 8 ] [123 5 789] [    56789] [ 23 56789] [ 23 56 89]
[1   5 789] 2           [  3 5 7 9] [  3 567 9] [1 34 6 8 ] [1 345 789] [    56789] [  3456789] [  3456 89]
[ 2      9] [  3  6  9] [ 23  6  9] [ 23  6   ] 5           [123    8 ] 4           [ 2   6 89] 7
[ 2  5 7 9] [   4 67 9] 8           [ 2   67  ] [12 4 6   ] [12 4  7  ] 3           [ 2  56  9] [12  56  9]
[ 2  5 7  ] [  34 67  ] 1           [ 23  67  ] 9           [ 234  78 ] [    56 8 ] [ 2  56 8 ] [ 2  56 8 ]
3           [1    6 89] [     6  9] 4           7           [    5   9] 2           [    56 89] [1   56 89]
[ 2    789] 5           [ 2 4 67 9] 1           [ 23      ] [ 23     9] [     6789] [  34 6789] [  34 6 89]
[12    7 9] [1  4  7 9] [ 2 4  7 9] 8           [ 23      ] 6           [1   5 7 9] [  345 7 9] [1 345   9]
*Main> -- the row-4-column-1 cell is the first cell with only two possibilities, [2, 9].
*Main> -- it is chosen as the pivot cell to find the next grids.
*Main> (grid1, grid2) = nextGrids grid'
*Main> putStr $ showGridWithPossibilities grid1
6           [  3   789] [  3 5 7 9] [ 23 5 7 9] [ 234   8 ] [ 2345 789] [    5 789] 1           [ 2345  89]
4           [1 3   789] [  3 5 7 9] [ 23 567 9] [123  6 8 ] [123 5 789] [    56789] [ 23 56789] [ 23 56 89]
[1   5 789] 2           [  3 5 7 9] [  3 567 9] [1 34 6 8 ] [1 345 789] [    56789] [  3456789] [  3456 89]
2           [  3  6  9] [ 23  6  9] [ 23  6   ] 5           [123    8 ] 4           [ 2   6 89] 7
[ 2  5 7 9] [   4 67 9] 8           [ 2   67  ] [12 4 6   ] [12 4  7  ] 3           [ 2  56  9] [12  56  9]
[ 2  5 7  ] [  34 67  ] 1           [ 23  67  ] 9           [ 234  78 ] [    56 8 ] [ 2  56 8 ] [ 2  56 8 ]
3           [1    6 89] [     6  9] 4           7           [    5   9] 2           [    56 89] [1   56 89]
[ 2    789] 5           [ 2 4 67 9] 1           [ 23      ] [ 23     9] [     6789] [  34 6789] [  34 6 89]
[12    7 9] [1  4  7 9] [ 2 4  7 9] 8           [ 23      ] 6           [1   5 7 9] [  345 7 9] [1 345   9]
*Main> putStr $ showGridWithPossibilities grid2
6           [  3   789] [  3 5 7 9] [ 23 5 7 9] [ 234   8 ] [ 2345 789] [    5 789] 1           [ 2345  89]
4           [1 3   789] [  3 5 7 9] [ 23 567 9] [123  6 8 ] [123 5 789] [    56789] [ 23 56789] [ 23 56 89]
[1   5 789] 2           [  3 5 7 9] [  3 567 9] [1 34 6 8 ] [1 345 789] [    56789] [  3456789] [  3456 89]
9           [  3  6  9] [ 23  6  9] [ 23  6   ] 5           [123    8 ] 4           [ 2   6 89] 7
[ 2  5 7 9] [   4 67 9] 8           [ 2   67  ] [12 4 6   ] [12 4  7  ] 3           [ 2  56  9] [12  56  9]
[ 2  5 7  ] [  34 67  ] 1           [ 23  67  ] 9           [ 234  78 ] [    56 8 ] [ 2  56 8 ] [ 2  56 8 ]
3           [1    6 89] [     6  9] 4           7           [    5   9] 2           [    56 89] [1   56 89]
[ 2    789] 5           [ 2 4 67 9] 1           [ 23      ] [ 23     9] [     6789] [  34 6789] [  34 6 89]
[12    7 9] [1  4  7 9] [ 2 4  7 9] 8           [ 23      ] 6           [1   5 7 9] [  345 7 9] [1 345   9]

解开谜题

到目前为止,我们已经实现了算法的部分内容。现在我们将把所有内容整合起来,解决这个难题。首先,我们需要知道我们是否已经完成,还是搞砸了:

isGridFilled :: Grid -> Bool
isGridFilled grid = null [ () | Possible _ <- concat grid ]

isGridInvalid :: Grid -> Bool
isGridInvalid grid =
  any isInvalidRow grid
  || any isInvalidRow (Data.List.transpose grid)
  || any isInvalidRow (subGridsToRows grid)
  where
    isInvalidRow row =
      let fixeds         = [x | Fixed x <- row]
          emptyPossibles = [x | Possible x <- row, null x]
      in hasDups fixeds || not (null emptyPossibles)

    hasDups l = hasDups' l []

    hasDups' [] _ = False
    hasDups' (y:ys) xs
      | y `elem` xs = True
      | otherwise   = hasDups' ys (y:xs)

isGridFilled返回网格是否已完全填充(通过检查网格中是否存在任何Possible单元格)。isGridInvalid检查网格是否无效,因为网格中是否存在重复的固定单元格(位于任何区块中)或是否存在无法填充的非固定单元格。

solve现在编写这个函数几乎是件非常简单的事情:

solve :: Grid -> Maybe Grid
solve grid = pruneGrid grid >>= solve'
  where
    solve' g
      | isGridInvalid g = Nothing
      | isGridFilled g  = Just g
      | otherwise       =
          let (grid1, grid2) = nextGrids g
          in solve grid1 <|> solve grid2

我们像之前一样对网格进行修剪,并将其传递给辅助函数solve'。如果网格无效,则solve'函数会抛出异常;Nothing如果网格已完全填充,则返回已求解的网格。否则,它会在搜索树中找到接下来的两个网格,并通过调用solve函数使用回溯递归地求解它们。这里的回溯是通过使用Alternative以下Maybe类型的实现来实现的:

instance Alternative Maybe where
  empty = Nothing
  Nothing <|> r = r
  l       <|> _ = l

如果第一个分支返回一个值,则执行计算中的第二个分支Nothing

呼!这花了我们好长时间。现在让我们进行最终测试:

*Main> Just grid =
  readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStr $ showGrid grid
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
*Main> Just grid' = solve grid
*Main> putStr $ showGrid grid'
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9

成功了!我们来快速写个main包装器solve,以便从命令行调用它:

main :: IO ()
main = do
  inputs <- lines <$> getContents
  Control.Monad.forM_ inputs $ \input ->
    case readGrid input of
      Nothing   -> putStrLn "Invalid input"
      Just grid -> case solve grid of
        Nothing    -> putStrLn "No solution found"
        Just grid' -> putStrLn $ showGrid grid'

现在,我们可以通过命令行调用它:

$ echo ".......12.5.4............3.7..6..4....1..........8....92....8.....51.7.......3..." | stack exec sudoku
3 6 4 9 7 8 5 1 2
1 5 2 4 3 6 9 7 8
8 7 9 1 2 5 6 3 4
7 3 8 6 5 1 4 2 9
6 9 1 2 4 7 3 8 5
2 4 5 3 8 9 1 6 7
9 2 3 7 6 4 8 5 1
4 8 6 5 1 2 7 9 3
5 1 7 8 9 3 2 4 6

好了,我们完成了。

如果你想尝试不同的谜题,这个文件列出了一些最难的谜题。让我们用程序运行其中的一些,看看速度如何:

$ head -n100 sudoku17.txt | time stack exec sudoku
... output omitted ...
      116.70 real       198.09 user        94.46 sys

在我的 2014 款 MacBook Pro(配备 2.2 GHz Intel Core i7 处理器和 16 GB 内存)上,完成一百道谜题大约需要 117 秒,平均每道题耗时约 1.2 秒。这速度相当慢,但我们会在后续文章中努力提升速度。

结论

在这篇略显冗长的文章中,我们一步步学习了如何用 Haskell 编写一个简单的数独求解器。在本系列的后续 文章中,我们将深入分析解决方案的性能,并探索更高效的算法和数据结构来求解数独问题。目前为止的代码可以在这里找到。

这篇文章最初发表在我的博客上。

文章来源:https://dev.to/abhin4v/fast-sudoku-solver-in-haskell-1-a-simple-solution-5ekf