Advent of Code - 2020

Back again by popular demand is my analysis of last years Advent of Code problems. As part of a continuing tradition, as much as two times is a tradition, I used Advent of Code as an opportunity to build some familiarity with a new language. This years language was Rust, and while it’s a language that I’ve been very keen on lately, it feels like it is very unsuited for competitive programming. Rust’s iterators and more functional styled features were a real highlight of my experience, and I’m very glad that I have learned them, the focus on safety was a real speed bump in trying to solve problems quickly.

Day 1

In classic Advent of Code fashion the problems begin very gently. This days problem required you to find a pair, and then a triple in part 2, from a list of numbers whose sum is 2020. The obvious solution to this is to use nested loops to try each possible pair and triple until you find one that satisfies the condition. It is possible to do this slightly more effeciently by keeping a hash-set of numbers you’ve seen so far as you iterate over the list. Then when you consider a number you check if it’s 2020 complement is in the set and break if it is. This way you can reduce the order of the algorithm by one polynomial degree, i.e part 1 to O(n) and part 2 to O(n^2). However, the naive brute force search is more than fast enough for size of the input set. This was a theme that I noticed alot this year, very few problems required you to push the performance far beyond the straight forward implementation. Even those larger problems could generally be brute forced with release mode optimisations in a few seconds.

Day 2

Day 2 was the beginning of the cruel and unusual input formats the plagued this year. It was not uncommon to see people complaining in Discord that they had spent significantly longer writing code to parse the problem input than to solve the problem. I was luckily able to afford most of this difficulty use Rust’s Regex crate, along with the discovery of named capturing groups. As a side note I would strongly recommend named capturing groups to all frequent regex users, coming from Javascript regexes they have changed my life. The problem itself was to check the validity of a set of passwords according to a set of rules. In part one the rules related to how often a certain character is allowed to occur in the password, and in part two the rules indicate where in the password the character is allowed to occur. String processing is not something that is super natural or intuitive in Rust, but luckily the extensive iterator libraries meant the solution was relatively clean regardless. In particular, match and match_indices were the MVPs, with an honourable mention to fold.

Day 3

Day 3 was the first day which took an ascii art grid as input, a time honoured tradition. At this point in time, however, I was still struggling with the best way to read input in Rust. So far I only knew how to read input line by line rather than using the much more convenient read_to_string. This means that constructing the grid was slightly more awkward than it needed to be, but all in all was only a minor set back. This days problem required us to count how many “#”s were along a path with a certain slope, that wrapped around the width of the grid. The wrapping problem can be solved very neatly using modular arithmatic, then all that’s left is to walk each line and count the “#”s. Part 2 just introduced more lines, so we only need to repeat the same proceedure more times. Although I didn’t try to implement it, you could count all lines at once by iterating over each y coordinate and calculating the corresponding xs. This solution may be more “clever” but I’m not convinced it’ll run much faster.

Day 4

This days problem was another string validation problem, but with passport fields as opposed to passwords. In particular we had to make sure certain fields were present in part 1, and that those fields had sensible values in part 2. I still hadn’t cracked read_to_string yet, so recognising all the passport fields when they were spread across multiple lines was slightly challenging, but beyond that it was just a simple matter of splitting the fields into parts and writing regexes for the different fields. This problem was pretty disappointing, requiring alot of simple, tedious checking to solve, rather than any interesting algorithmic knowledge.

Day 5

Day 5 is a return to something a bit more algorithmic than Day 4. Here you had to work out the coordinates of a seat in a plane from a “binary” search description of it’s location, for example “in the front half of the front half of the back half of the plane”. Despite immediately making this connection I didn’t realise until after that we could simply interpret the fronts and backs, and lefts and rights as binary to get the seats coordinates and instead calculated the bounds directly using a fold. I’m still pretty happy with how clean the fold turned out, using tuples to pass extra state between iterations in a fold was a trick I think I discovered doing these problems and is one I find very useful. Part 2 was just running part one multiple times and finding the missing seat, nothing too exciting here, just put all the seat numbers in a set, then iterate from the lowest seat number you’ve seen to the highest looking for the gap.

Day 6

We’re back to a string processing problem on Day 6. In this case we are given a groups of peoples answer to a set of questions, each identified by a single letter, with one persons answers on each line. The task is to determine how many questions were answered by atleast one person in part 1, and by everyone in part 2. Again I still hadn’t worked out how to read input and split it on a blank line yet, but otherwise the problem is just scanning each character and keeping a count of how many times that character has occured. Then you check the total number of characters scanned and the total number of characters that occurred the same number of times that there were people. Shoutout again to iterators and filter for making this second part very easy.

Day 7

It was at this point that I was starting to run up against my lack of knowledge in Rust. In particular implementing algorithms on trees, and graphs, basically non-linear data structures in general prove to be a point of contention as I keep working with Rust. This days problem was to analyse a tree of differently coloured bags containing some number of other differently coloured bags and work out how many of these bags ultimately contained a gold bag and then how many total bags a gold bag contained. To do the first part I flipped this tree so that the edges went from child bag to parent bag, that is a contained by relationship, and then did a BFS starting from the gold bag to work out how many bag colours contained gold, remembering to subtract one so we don’t count gold itself. The next part of the problem, however, required me to also keep track of the tree in the traditional top down fashion, along with the weights of each edge, the multiplicity of each bag type. Once I had this structure I could just calculate the total number of bags contained within a bag by recursively calculating its children. All in all my implementation wasn’t too bad, but I wish I could’ve simplified it a bit with structs and double linking for the trees instead of two big adjacency lists.

Day 8

Day 8 serves to further cement Advent of Codes tradition, for every year I’ve done it atleast, by having a problem based around some kind of virtual CPU. The first part of this problem was to detect when the CPU first loops, which only requires implementing the CPU and then simulating it while keeping track of which instructions you’ve visited. The second part is more interesting and involves finding which instruction you need to change in order to make the program terminate successfully. Again, and somewhat disappointingly, you can easily brute force this part but there is a fairly interesting linear time solution. The broad overview is to iterate over every index in the program and simulate starting from there, and mark each index you encounter as either looping or terminating depending on the result. If you encounter a known index you can mark each index up to that point as the same and continue. At the end of this we know whether a given index loops or terminates, and have only spent linear time, as each index in the program is only ever touched once. Then we can determine where to make the change by simulating the program again from the start, and at each index checking if making the change there would cause us to jump to a terminating index, again only taking linear time.

Day 9

This was the day that truly cemented my love of Rusts iterators, something which you may notice becoming a common theme throughout this write up. The problem in part one required you to look at a series of numbers and determine what number is the first one that couldn’t be written as the sum of the 25 previous numbers, not counting the first 25 numbers. This is where the window iterator comes in, which gives us each subarray of a certain size, and find_map which allows us to map a function over the sub-array and returns the first non-none result. In this case the function we’re using in find_map is the summand finding code from day 1. Then for part 2 we had to find a contiguous sub surray whose sum was the answer in part 1. I did this using a good old fashioned while loop with a sliding window. To sketch the algorithm we keep a start and end index and a running sum. While the running sum isn’t the target sum, check if the running sum is less than the target, if so add one to the end index and add the number at the new index to the running sum, if it’s less than the target subtract the number at the start index from the sum, and increment the start index. Shoutout to iterator min and max for making it easy to get the actual answer after finding the subarray from part 2.

Day 10

Day 10’s problem was about the differences in a list of numbers. The first part was about computing how many numbers in the list differed by 1 and how many differed by 3. Again this part was easily dispatched by sorting the list and using window, with a size of 2. The second part required us to calculate how many subarrays there were such that no two adjacent numbers differed by more than 3. This is a classic one dimensional dynamic programming problem. This is because the total number of sub arrays that start from a given index is equal to the sum of sub arrays starting from greater indices whose values differ by no more than 3. I solved this using an interative approach starting from the end and working backwards, but you can probably solve it just as well with a top-down recursive approach if you use memoisation.

Day 11

Did anyone say cellular automata? Because Day 11’s problem was cellular automata. Not too much to say about the implementation here, although I’ve found that as I do more of these it’s alot easier to write a general check adjacent function that takes a direction than to write however many if statements you need to check all the adjacent cells. Which as I write it, sounds like a fairly obvious course of action. Part 2 changes the adjacency rules to be based on line of sight, which fits well with the separate check adjacent function methodology.

Day 12

Day 12 was a cruisy day, not in the sense that it was easy, but in the sense that it was related to boat. The problem input was a series of compass directions and turns, part 1 just required you to simulate moving the boat according to these directions. Part 2 was where things got interesting as now some of the directions move a “waypoint” relative to the ship, and the “forward” direction move the ship to the waypoint. This would be entirely straight forward if the directions only included translations, however, there were also rotations of the waypoint about the ship to deal with. Ultimately, this only slightly complicated things, I chose to solve this by computing the transformation matrix for a rotation by 90 degrees in each direction, and then applied those multiple times to get my desired rotation. A nice benefit of doing it this way is we get to keep everything in integers, we don’t need to bust out any trig functions or floating point numbers to compute rotations on the fly.

Day 13

All aboard the math bus for Day 13, a problem about buses and modular arithmetic. The problem input is a series of bus ids indicating how often they run in minutes and some “x”s, which won’t be relevant until part 2, and a timestamp also in minutes. Part 1 tasks you with finding the first bus that runs after the given timestamp. You could probably brute force this part, but instead I took each bus id and subtracted from it the timestamp mod the bus id. This gives us how many minutes there are between the timestamp and the next bus running, and so the smallest of these gives us the answer to part 1. Part 2 is where the Chinese remainder theorem comes in, not that I knew that at the time, or really know that now as I have yet to do a deep dive into the theorem itself. In particular part 2 wants us to find a time where the sequence of buses arriving matches the input sequence, with the “x”s being wildcards. When I was doing this problem I heard alot of talk about solving the Chinese Remainder theorem using sieving, and if I had to put a name on my solution it would be sieving, so I may have just accidentally stumbled into a good algorithm. The vague idea is that I iterate over the timing sequence keeping a running product of all the bus ids encountered so far, and a current answer timestamp. For each id, I increase the answer timestamp by the running total until that id is arriving at the right point in the sequence, and then update the running product. Then once we have looped over all the bus ids our current answer timestamp is equal to the first time at which buses arrive in the correct sequence. The key piece of mathematical intuition that makes this work is that adding the running product to the current timestamp doesn’t change the relative “positions” of the previous ids, that is they’re all congruent to the same value, because the product is a multiple of all of the previous ids.

Day 14

Come down to the masquerade party with day 14, he said trying desperately to come up with any pun relating to bit masks. The days problem was to simulate a computer writing to memory using certain bitmask rules, in part 1 the bitmask affected the value being written, and in part 2 it affected the address it was written to. Calling it a bitmask is maybe a bit deceptive as values in the mask actually represent one of 3 values, 0 and 1, representing that the bit there should be overwritten with that particular value, and “X” which indicates that the bit there should be left unchanged. In order to represent this I used two bitmasks, the first was 0 everywhere except where the cannonical bit mask was 1, and the second was one everywhere except where the cannonical bitmask was 0. Then when writing a value I just bitwise or the first mask and bitwise and the second to get the resulting value. Additionally instead of simulating the whole 36 bit memory address space I used a hashmap because the full address space is likely mostly empty. For part 2 the interpretation of the bitmask changes, it now applies to the memory address, a 0 indicates the bit should be left unchanged, 1 indicates that bit should be set to 1, and ‘X’ indicates that bit is floating, which means both 0 and 1 for the purposes of writing to memory. This means we can keep one bitmask for overwriting the bits in the address, for handling floating bits I just keep a list of their locations and generate all of the concrete addresses whenever we need to write a value.

Day 15

It’s game on in day 15, memory game on. In particular the rules of the memory game are to list the terms of a well known sequence, the name of which elludes me. For part 1 you have to calculate the 2020th term and for part 2 you have to calculate the 30000000th term. I spent alot of time when first solving this problem trying to think about a more efficient way to calculate the nth term of this sequence in general, but as it turns out no such method exists in this case. So just do what the problem says on the tin, and iterate until you find the answer.

Day 16

All aboard the non-math train for day 16. This days problem was to work out what a number of unknown train ticket fields correspond to based only on the valid range of values for that field. Part 1 is simply to determine which tickets contain values which are not valid in any field and disqualify them. Luckily the input format is fairly rigid so each field only has two valid ranges of tickets, so I just went through and made one big boolean array for whether or not a number was valid. Part 2 requires us to go the full distance and use the valid tickets to work out which field is which. To first limit the possibilities I make a boolean table of potential fields for each ticket field, and then iterate over each ticket disqualifying possibilities. Then I use the resulting posibility matrix to recursively search over different orderings of fields until I find one that works. It may be the case that the first step is unnecessary and that we could brute force straight from the beginning, but that would probably be much slower. The real hope was that after finding all the possibilities we would see that each ticket field had only one option left, but unfortunately no dice.

Day 17

Day 17 was just another cellular automata, the only twist was that it was in 4 dimensions. Nothing much to say here, just bang out your octuple nested for loop to work out the adjacencies and then go to town.

Day 18

Getting toward the tail end of the competition this is where my partcipation started to peter out. Day 18 was one of the problems that I didn’t end up doing, but was also part of the set of problems which motivated me to check out Crafting Interpreters and start working on RLox. This days problem was about implementing a calculator which had different precedence rules, first with addition and multiplication having the same precedence, and then with the precedences reversed. To shoutout some fun solutions, one was to use Python eval to calculate the result after wrapping each number in a class with overloaded operators so that the precedence worked out correctly. Another was to go through and parenthesise every operation and the eval that to get the correct result.

Day 19

Day 19’s problem was another big motivator to check out Crafting Interpreters. The problem was to work out whether a string of characters satisfied a specific grammar. I think that the grammar in this case is actually regular so you can solve both parts by compiling it down to a regex, although part 2 would require some care because it introduces recurisive rules. I instead used a crate called Pest to parse the input, which worked for part 1, but then broke down because of part 2’s recursive rules. I saw some number of people solving part 2 with a recursive DP solution which seemed like a pretty good way to solve the problem, but I’m not super familiar with the algorithm

Day 20

If you see sea monsters, then you’ll be ready for day 20. In this problem we are given a number of satelite pictures of the ocean, in random orientations, and have to stitch them back together, and then find patterns of waves that represent the presence of a sea monster. For part 1 it was sufficient just to find which tiles were the corner tiles, that is tiles with exactly two adjacent tiles. To do this I hashed the boundary edge of each tile and inserted them into a hashmap, with an array of tile ids as values. Then I was able to iterate over each tile, and for each boundary count how many tiles shared that boundary, or the reverse of that boundary, and if that number was two we know that tile is a corner. Then in Part 2 we had to find the seamonsters, which could potentially be across multiple tiles. At this point it would’ve been significantly easier just to reconstruct the entire grid using the map of boundaries we had, however I wanted to do this using only the boundary map. The plan was to start on each point in a tile, and then walk in a direction checking adjacent tiles to see if they matched the seamonster pattern. If the point crossed a tile boundary we would update the direction and orientation based on which side we entered from, and whether or not the boundary had flipped. But this ultimately proved too difficult for me to get working, although I believe it should be possible in theory for a greater mind.

Day 21

I hope you don’t have any allergies going into day 21. In this problem we’re given a number of ingredient lists, each one poorly labelled with the allergens that set of ingredients contains. Poorly here means that not all allergens in that list of ingredients are necessarily listed. The problem then is first to work out which ingredients are definitely safe, and then which ingredients contain which specific allergens, assuming that each allergen is present in exactly one ingredient. The key insight to solving this is realising that the set of ingredients which could potentially contain an allergen is the intersection of the ingredients lists that have that allergen labelled. Put another way, if an ingredient is not present for each ingredient list labelled with an allergen, it can’t possibly be the one ingredient that contains that allergen. I stored these lists of potential ingredients for an allergen as a hashmap of strings, the allergen name, to hash sets of ingredients. Then taking the intersection can be easily done using Rust’s built in intersection function. Once we have that hashmap we can work out which ingredients are definitely safe by working out which never occur in a set of potentials. To work out which ingredients contained specific allergens I used two main ideas, if an ingredient is the only potential ingredient for an allergen we can conclude for sure that it contains that allergen, and when we conclude that an ingredient contains an allergen we can remove it from all other sets of potentials. By repeatedly applying these two steps to the map of potentials we eventually whittle it down to the point where each allergen has only one potential, thankfully no deadlocks or anything in my input set, your mileage might vary in the general case.

Day 22

At this point I had stopped actively participating in the competition, so for these last 4 problems I’m just gonna give my first thoughts on part 1 for each of these days. If there’s enough demand I might revisit, or maybe it’s more correct to just say visit, these problems, and give my full analysis. For this days problem we have to simulate crabs playing a game of war, where the winning player keeps both of the warring cards. The game ends when all of the cards are in one players deck, and the answer is based off the ordering of the winning deck. The easy answer here seems to be to just simulate the game, maintaining two integer lists for the state of the decks. The worry is that just simulating won’t be fast enough to solve the actual input and/or part 2, depending on the problem, but no more intelligent solution leaps out to me immediately.

Day 23

Day 23 is another day of the crabings shuffling number things around. Todays’ numbered things are cups laid out in a circle. Again the obvious solution is just to simulate the rules that the crabs move for moving the cups and let it run, without seeing part 2 it’s hard to say if that’s efficient enough, but for simulating 100 moves it should be plenty fast.

Day 24

This day is about tiling a floor. In particular a hexagonally tiled floor, where each tile is white on one side and black on the other. The part 1 input is a list of directions to a set of tiles to flip from white to black, or black to white as a tile can be flipped multiple times. I think I would go about this by using some kind of hexagonal coordinate system to just represent the colour of tiles using a hashmap, just like with a sparse cartesian grid. I’m not super familiar with hexagonal coordinates, but I found this helpful blog which describes a couple different ways to go about hexagonal coordinates. From there just following the directions and working out the coordinates of the destination tile is pretty straightforward.

Day 25

For the final day we’re back on the math train. The problem is to try and crack a cryptographic key exchange process between two devices. In particular the exchange looks to be using Diffe-Hellman key exchange although there maybe some difference in nuance between the two of them which I’m not noticing. When I see a problem like this in a competitive setting the first thing I think of is brute force. It may not be the fastest solution, but often it may be fast enough or at the very least score you some partial points for that problem, and it’s often one of the fastest to code. Especially for AoC where there are no time limits, you can turn on performance optimisations and leave the program running as long as it needs. To shoutout some more number theoretic things I notice that may or may not help, because the subject number is 7 we know that the final number will have prime factorisation 7^n for some n. So if we start from one of the public keys, and iteratively add 20201227, then the first number which only has factors of 7 must be 7^n, so we can calculate the loop number from there. This feels like it might be faster than just checking all powers of 7, although I’m not sure I could put an upper bound on either approach.