Day 14: Disk Defragmentation

Today’s challenge builds off the work from Day 10, so if you haven’t completed that yet - get cracking, and then come back once you are able to calculate a “knot hash”.

Today’s challenge text packs quite a lot of detail, I’ve tried to extract the key sentences:

The disk in question consists of a 128x128 grid; each square of the grid is either free or used. On this disk, the state of the grid is tracked by the bits in a sequence of knot hashes. A total of 128 knot hashes are calculated, each corresponding to a single row in the grid; each hash contains 128 bits which correspond to individual grid squares. Each bit of a hash indicates whether that square is free (0) or used (1). The hash inputs are a key string (your puzzle input), a dash, and a number from 0 to 127 corresponding to the row.

So, we need to calculate a hash and convert it to binary 128 times. Let’s begin…

Solving Part 1

A “knot hash” was the function we created to get the second star on Day 10:

raze string "x"$(xor over) each 16 cut knot[til 256;raze 64#enlist ("j"$first read0 `:input/10.txt),17 31 73 47 23]

We need to generalise this so we can call it with any input (“sequence of lengths” in AoC speak). To do so we can replace the variable part of the function with a variable, lets call it lengths, and wrap it in a function:

{[lengths] raze string "x"$(xor over) each 16 cut knot[til 256;raze 64#enlist ("j"$lengths),17 31 73 47 23] }

Note that we still wish to cast the input to ASCII (using "j"$) and append the magic numbers 17 31 73 47 23. Let’s call our function hash:

hash:{[lengths] raze string "x"$(xor over) each 16 cut knot[til 256;raze 64#enlist ("j"$lengths),17 31 73 47 23] }

If we want to be more Q-like, we can rename the variable to x, and as the first argument of a function is implicitly x, we do not need to add it to the definition:

hash:{raze string "x"$(xor over) each 16 cut knot[til 256;raze 64#enlist ("j"$x),17 31 73 47 23]}

Armed with our new hash function, lets take a look at the example. The input to the hash function is a key, a dash and a number. Their example key is "flqrgnkx-0", the hash of this is shown below:

q)hash "flqrgnkx-0"
"d4f76bdcbf838f8416ccfa8bc6d1f9e6"

We then need to consider this string as hexadecimal representation where “0” is 0 and “f” is 15… however in our hash function we have the result as a list of bytes which we are then converting to a string for pretty-printing. We can therefore simplify our hash function:

q)hash:{"x"$(xor over) each 16 cut knot[til 256;raze 64#enlist ("j"$x),17 31 73 47 23]}
q)hash "flqrgnkx-0"
0xd4f76bdcbf838f8416ccfa8bc6d1f9e6

We then want to convert to binary. As we leant from Day 10 we can use 0b vs:

q)0b vs 0xd4 / d => 1101, 4 => 0100
11010100b

We need to use either each-both, ', or each-right, /: to apply the conversion to element of our hash:

q)0b vs'hash "flqrgnkx-0"
11010100b
11110111b
01101011b
11011100b
..
q)0b vs/:hash "flqrgnkx-0"
11010100b
11110111b
01101011b
11011100b
..

As this is performing the operation on each item in the list, we need to use raze to flatten the results:

q)raze 0b vs'hash "flqrgnkx-0"
11010100111101110110101111011100101111111000001110001111100001000001011011001100111110101000101111000110110100011111100111100110b

Let’s wrap this in a lambda for reasons that will become clear later:

q){raze 0b vs'x} hash "flqrgnkx-0"
11010100111101110110101111011100101111111000001110001111100001000001011011001100111110101000101111000110110100011111100111100110b

We need to do this another 127 times, so let’s construct a list of "flqrgnkx-0", "flqrgnkx-1", ..., "flqrgnkx-127". In order to do this we can start by creating the list of 0..127 via til:

q)til 128
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ..

We need to join (,) the key, "flqrgnkx-" with each-right number, similar to what we did on Day 8 with our registers.

q)"flqrgnkx-",/:til 128
"f" "l" "q" "r" "g" "n" "k" "x" "-" 0
"f" "l" "q" "r" "g" "n" "k" "x" "-" 1
"f" "l" "q" "r" "g" "n" "k" "x" "-" 2
"f" "l" "q" "r" "g" "n" "k" "x" "-" 3
"f" "l" "q" "r" "g" "n" "k" "x" "-" 4
"f" "l" "q" "r" "g" "n" "k" "x" "-" 5
..

This isn’t quite what we’re after - the list of numbers if a different type than the key. We therefore need to cast the numeric list to a string, and try again:

q)"flqrgnkx-",/:string til 128
"flqrgnkx-0"
"flqrgnkx-1"
"flqrgnkx-2"
"flqrgnkx-3"
"flqrgnkx-4"
"flqrgnkx-5"
..

If we apply our hash function to each:

q)hash each "flqrgnkx-",/:string til 128
0xd4f76bdcbf838f8416ccfa8bc6d1f9e6
0x55eab3c4fbfede16dcec2c66dda26464
0x0adf13fa40e8ea815376776af3b7b231
0xad3da28cd7b8fb99742c0e63672caf62
0x682fe48c55876aaaa11df2634f96d31a
..

We can then convert each entry to binary with our lambda:

q){raze 0b vs'x} each hash each "flqrgnkx-",/:string til 128
11010100111101110110101111011100101111111000001110001111100001000001011011001..
01010101111010101011001111000100111110111111111011011110000101101101110011101..
00001010110111110001001111111010010000001110100011101010100000010101001101110..
10101101001111011010001010001100110101111011100011111011100110010111010000101..
01101000001011111110010010001100010101011000011101101010101010101010000100011..
..

In order to count the number of ‘on’ bits, we can flatten the results with raze and user sum to confirm that our code gets the correct result for the example key, 8108.

q)sum raze {raze 0b vs'x} each hash each "flqrgnkx-",/:string til 128
8108i

Now we just need to replace the example key with our puzzle input, append the dash, and get today’s first star. First lets read in the puzzle input and grab the first line:

q)read0 `:input/14.txt             / looks like what we want
"hxtvlmkl"
q)count read0 `:input/14.txt       / but as we know, we've got a list of strings
1
q)first read0 `:input/14.txt       / so grab the first
"hxtvlmkl"
q)count first read0 `:input/14.txt / and sanity check
8

Appending a dash can be done a few different ways, pick your poison:

q)(first read0 `:input/14.txt),"-"   / round brackets
"hxtvlmkl-"
q),[;"-"] first read0 `:input/14.txt / as a projection
"hxtvlmkl-"
q),[first read0 `:input/14.txt;"-"]  / square brackets
"hxtvlmkl-"

And then join this key with each-right numbers 0 through 127:

q)10#,[first read0 `:input/14.txt;"-"],/:string til 128 / no round brackets
"hxtvlmkl-0"
"hxtvlmkl-1"
"hxtvlmkl-2"
"hxtvlmkl-3"
"hxtvlmkl-4"
"hxtvlmkl-5"
..
q)(first[read0 `:input/14.txt],"-"),/:string til 128 / mixing things up a bit
"hxtvlmkl-0"
"hxtvlmkl-1"
"hxtvlmkl-2"
"hxtvlmkl-3"
"hxtvlmkl-4"
"hxtvlmkl-5"
..

… and combine with our earlier code to get the first star:

q)sum raze {raze 0b vs'x} each hash each (first[read0 `:input/14.txt],"-"),/:string til 128
8214i

Bonus 1

Did it feel like you could go make a cup of tea whilst the hash function was running 128 times? Wouldn’t it be great if you could reduce the time it took do run that piece of code - without having to make any changes to the underlying function?

If you answered yes to both of these questions, you are in luck. Q has a similar operator to each, known as peach or parallel-each. Under the hood, this operator slices up the workload and runs them in parallel.

In order to get a benefit from peach you need to start up KDB with the little -s argument followed by the number of slaves you wish to spawn; you’re unlikely to get much benefit from having more slaves than you have CPU cores (assuming CPU-heavy workload), so just go with 1:1, in my case this means starting q with 4 slaves:

$ q -s 4

Switching out the each that feeds the hash function with a peach results in the computation taking approximately half the time:

q)\s / confirm 4 slaves available
4i
q)\t sum raze {raze 0b vs'x} each hash each (first[read0 `:input/14.txt],"-"),/:string til 128
366
q)\t sum raze {raze 0b vs'x} each hash peach (first[read0 `:input/14.txt],"-"),/:string til 128
209

In newer versions of KDB (3.5 and above) you can dynamically adjust the number of slaves via the \s command, which allows you to see the impact of parallelism on your function:

q)\s 1
q)\t sum raze {raze 0b vs'x} each hash peach (first[read0 `:input/14.txt],"-"),/:string til 128
410
q)\s 2
q)\t sum raze {raze 0b vs'x} each hash peach (first[read0 `:input/14.txt],"-"),/:string til 128
298
q)\s 3
q)\t sum raze {raze 0b vs'x} each hash peach (first[read0 `:input/14.txt],"-"),/:string til 128
231
q)\s 4
q)\t sum raze {raze 0b vs'x} each hash peach (first[read0 `:input/14.txt],"-"),/:string til 128
213

The speed-up from using peach is not guaranteed, there is overhead in chunking up the work and stitching back together, so always test with different size workloads to see if there is any benefit.

One of the important things to note with peach is that you cannot update global variables from a function executed via peach - only the master thread is permitted to change globals. If you try, you will encounter the 'noupdate error:

q)a:1; { a::a+x } each 1 2 3
::
::
::
q)a:1; { a::a+x } peach 1 2 3
'noupdate: `. `a
  [2]  { a::a+x }
          ^
  [0]  a:1; { a::a+x } peach 1 2 3

Solving Part 2

Now, all the defragmenter needs to know is the number of regions. A region is a group of used squares that are all adjacent, not including diagonals. Every used square is in exactly one region: lone used squares form their own isolated regions, while several adjacent squares all count as a single region.

Part 2 ups the ante. We need to determine the region of each ‘bit’ in the set of 128 hashes. There are a few different ways to approach this. The way I solved it was to iterate over every bit in the grid, recursively visiting all adjacent bits - recording them as being part of the current region.

We need 3 things:

  • the 128x128 grid of bits
  • a list of all coordinates
  • a recursive function to iterate over them

First things first; the grid… We are already building the list of 128 hashes before we raze it in order to sum up all the ‘on’ bits. Therefore we can use the code from Part 1 to give us the grid - but lets switch back to the example key:

q){raze 0b vs'x} each hash each "flqrgnkx-",/:string til 128
11010100111101110110101111011100101111111000001110001111100001000001011011001100111110101000101111000110110100011111100111100110b
01010101111010101011001111000100111110111111111011011110000101101101110011101100001011000110011011011101101000100110010001100100b
00001010110111110001001111111010010000001110100011101010100000010101001101110110011101110110101011110011101101111011001000110001b
10101101001111011010001010001100110101111011100011111011100110010111010000101100000011100110001101100111001011001010111101100010b
..

Lets save this as grid:

q)grid:{raze 0b vs'x} each hash each "flqrgnkx-",/:string til 128
q)count grid       / confirm we have 128 rows
128
q)count first grid / confirm we have 128 bits in a row
128

We can index into grid a few different ways:

q)grid[0]     / return first row of grid
11010100111101110110101111011100101111111000001110001111100001000001011011001100111110101000101111000110110100011111100111100110b
q)grid[;0]    / return first column (as seen on Day 12)
10010101110000110101010000010001001111001100100010000101110001011010100000010100111000011001101111010111010111011111010000000110b
q)grid[2;3]   / return 3rd element of 2nd row of grid (pure square brackets)
0b
q).[grid;2 3] / return 3rd element of 2nd row of grid (. with square brackets)
0b
q)grid . 2 3  / return 3rd element of 2nd row of grid (infix)
0b

The first element (top-left) is at 0 0, and the last element (bottom-right) is at 127 127. We want to generate all combinations of 0 and 127 of length 2. Also known as the Cartesian product. KDB has a built-in for this, cross.

q)"abc" cross 1 2 3 / join each of "abc" with each of 1 2 3
"a" 1
"a" 2
"a" 3
"b" 1
"b" 2
"b" 3
"c" 1
"c" 2
"c" 3

This can be mimicked by joining (,) each-right with each-left, and flattening:

q)raze "abc",/:\:1 2 3
"a" 1
"a" 2
"a" 3
"b" 1
"b" 2
"b" 3
"c" 1
"c" 2
"c" 3

We want the Cartesian product of 0..127 and 0..127, so we can feed these into cross:

q)til[128] cross til 128 / note square brackets around til due to left-of-right evaluation
0 0
0 1
0 2
0 3
0 4
0 5
..
q)count til[128] cross til 128 / there are 128x128 16384 combinations
16384

Note: Slightly faster/lower memory requirements to use join over cross:

q)\ts:100 raze til[128],/:\:til 128
86 1051136
q)\ts:100 til[128] cross til 128
103 1577472

Let’s save our combinations as combos:

q)combos:til[128] cross til 128

For the recursive function, we want to check all neighbouring cells, and if there are any that are set to 1b (aka ‘on’), we want to recurse into each of them and check their neighbours, and so on until we have exhausted all bits that are connected to that original bit.

We need to keep track of bits that we have already visited (otherwise we will never stop), and also keep track of bits that belong to the same region.

Writing this as pseudocode:

f:{[coordinate;region]
  / if grid at this coordinate is off, return immediately
  / if coordinate has already been visited, return immediately
  / set this coordinate as belonging to 'region'
  / call f on the coordinate to the left
  / call f on the coordinate to the right
  / call f on the coordinate above
  / call f on the coordinate below
  }

As mentioned above, we need a way of tracking visited coordinates as well as the region that the coordinates belong to. This can be solved using a dictionary mapping coordinates to regions. If a coordinate is in the key of the dictionary, then we must have already visited it.

We can initialise the dictionary with null values so we know to exclude them later. The keys will be 2-item lists of longs, and the values will be longs. As we need scalars to initialise dictionaries, we need to enlist the values:

q)enlist[0N 0N]!enlist 0N / looks
 |

We can call this dictionary visited:

q)visited:enlist[0N 0N]!enlist 0N
q)1 1 in key visited / KDB knows that the key is a 2-item list
0b

Let’s fill out the pseudocode with Q code:

f:{[coordinate;region]
  if[not grid . coordinate;       / if grid at this coordinate is off, return immediately
    :()
    ];
  if[coordinate in key visited;   / if coordinate has already been visited, return immediately
    :()
  ];
  visited[coordinate]:region;     / set this coordinate as belonging to 'region'
  .z.s[coordinate + 0 -1;region]; / call f on the coordinate to the left
  .z.s[coordinate + 0 1;region];  / call f on the coordinate to the right
  .z.s[coordinate + -1 0;region]; / call f on the coordinate above
  .z.s[coordinate + 1 0;region];  / call f on the coordinate below
  }

If our function f is called recursively, we want to use the same region, however if it is the first time we are calling f, we want to use a different region. We do not have to worry about overwriting existing regions as we are explicitly checking whether we’ve already visited a coordinate and exiting early if so.

There are 16384, 128*128, unique coordinates, so we can call the function f with each coordinate, and a unique number between 0 and 16383. We can generate the range 0 to 16383 via til:

q)til 16384
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32..
q)count til 16384
16384
q)last til 16384
16383

We want to call f with each of the 16384 coordinates, and each of the numbers 0..16383. We can use the each-both adverb for this:

q)f'[til[128] cross til 128;til 16384];

That should evaluate in about a second… the next step is to take a look at our visited dictionary:

q)visited
   |
0 0| 0
0 1| 0
1 1| 0
0 3| 3
1 3| 3
0 5| 5
1 5| 5
0 8| 8
0 9| 8
..

We can see there are duplicate regions - the challenge asks us to count the number of unique regions. As we learnt way back on Day 2, Q has the distinct operator to return a unique set. So let’s apply that to the value of the visited dictionary:

q)value visited
0N 0 0 0 3 3 5 5 8 8 8 8 8 8 8 8 8 8 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 13 ..
q)distinct value visited
0N 0 3 5 8 13 17 20 27 32 61 67 72 76 80 86 88 92 94 104 107 111 119 125 144 187 217 234 238 245 260 262 286 319 326 348 374 383 384 391 419 443 453 510 557 570 583 591 619 627 635 681 684 691 697 707 738 746 748 754 756 761 798 852 856 859 871 885 910 924 951 953 959 961 970 972 983 989 993 1001 ..

We can then perform count to see how many regions there are:

q)count distinct value visited
1243

The example answer should be 1242, we are off-by-one… this is because we initialised our dictionary with 0N (if you look, it’s the first result when we performed value). We therefore need to subtract 1 from this count - or alternatively, add -1:

q)(count distinct value visited)-1
1242
q)-1 + count distinct value visited / no brackets needed
1242

This looks good, lets plug in our puzzle input, reset the visited dictionary and get the second star:

q)grid:{raze 0b vs'x} each hash each (first[read0 `:input/14.txt],"-"),/:string til 128
q)visited:(enlist 0N 0N)!enlist 0N;
q)f2'[til[128] cross til 128;til 128*128];
q)-1+count distinct value visited
1093

Bonus 2

The second tip for today’s challenge involves the unique attribute.

As we iterate over each coordinate we check whether it is in the key of the visited dictionary.

The key of the dictionary is, by default, just a regular list. This means KDB has to scan through the list, element-by-element until either it finds what it was looking for, or the end of the list is reached. In Big O notation, this is known as O(N) - the complexity increases in a linear fashion with the size of the data.

KDB offers the unique attribute which (as long as the list is unique) will effectively convert the list into a hashmap - additional memory is used to allow a constant-time lookup into the list (aka O(1)). If items are added to a list that would mean that the unique attribute no longer holds, it will be lost:

q)a:`u#1 2 3 / unique attribute is assigned via `u#
q)attr a     / attributes can be checked via attr operator
`u
q)a,:4       / append 4
q)attr a     / list still has unique attribute
`u
a)a,:3       / append a 3
q)attr a     / list is no longer unique (duplicate 3s) so the attribute is lost
`

If we use \t to time our original function we can see it takes almost 2 seconds on my machine:

q)visited:(enlist 0N 0N)!enlist 0N
q)\t f'[til[128] cross til 128;til 128*128]
1829

Applying the unique attribute to the key, and re running shows a huge speed-up, it now takes less than 30ms to run!

q)visited:(`u#enlist 0N 0N)!enlist 0N
q)\t f'[til[128] cross til 128;til 128*128]
28

The combination of peach and `u# result in a big speed-up in today’s challenge; be sure to consider their functionality in future challenges.

Complete Solution To Day 14

My full solution for Day 14 is below.