Advent of Code 2020 - Day 19

By Eric Burden | December 20, 2020

In the spirit of the holidays (and programming), I’ll be posting my solutions to the Advent of Code 2020 puzzles here, at least one day after they’re posted (no spoilers!). I’ll be implementing the solutions in R because, well, that’s what I like! What I won’t be doing is posting any of the actual answers, just the reasoning behind them.

Also, as a general convention, whenever the puzzle has downloadable input, I’m saving it in a file named input.txt.

Day 19 - Monster Messages

Find the problem description HERE.

Part One - The One Rule

Today’s puzzle involves our Elf contact at the MIB (I’m going to call him J), asking for help checking some text data against an oddly arranged set of rules that reference other rules within the set. Good news! The messages only consist of the letters ‘a’ and ‘b’ in various combinations. That’s helpful. The approach here was to build one monstrous regular expression for the rule we want to evaluate against, replacing references to other rules with their ‘a’ or ‘b’ representation until all the rule references are fully replaced.

# Setup ------------------------------------------------------------------------
test_input <- c(
  '0: 4 1 5',
  '1: 2 3 | 3 2',
  '2: 4 4 | 5 5',
  '3: 4 5 | 5 4',
  '4: "a"',
  '5: "b"',
  "",
  "ababbb",
  "bababa",
  "abbbab",
  "aaabbb",
  "aaaabbb"
)
real_input <- readLines('input.txt')

# Functions --------------------------------------------------------------------

# Helper function, extracts a pattern-match from a string
extract <- function(str, pattern) {
  regmatches(str, regexpr(pattern, str, perl = TRUE))
}

# Helper function, given a vector of input lines, returns a named vector of 
# parsed validation rules
get_rules <- function(input) {
  input <- gsub('["\']', '', input, )      # Strip any quotes
  split_index <- which(input == "")        # Identify the blank line
  rule_strs <- input[1:(split_index-1)]    # All lines above the blank line
  
  rule_keys <- extract(rule_strs, '^\\d+')         # Rule name, the leading number
  rule_values <- extract(rule_strs, "(?<=: ).+$")  # Everything after the ':'

  # Wrap the values in regex capture groups, except for the static 'a' and 'b'
  # rule values
  root_rule_i <- grepl('[ab]', rule_values)
  rule_values[!root_rule_i] <- paste('(', rule_values[!root_rule_i], ')')
  
  names(rule_values) <- rule_keys  # Add names to the vector of rules
  rule_values
}

# Helper function, given a vector of input lines, returns a vector of 
# messages to validate
get_messages <- function(input) {
  input <- gsub('["\']', '', input)
  split_index <- which(input == "")
  as.character(input[(split_index+1):length(input)])
}

# Helper function, given a rule name (`key`) and the list of validation rules 
# (`rules`), returns the rule text if the rule is in the list of rules, 
# otherwise returns the given `key`. Sometimes, this function will receive
# 'a' and 'b' as `key`s, so it just gives those back.
fetch_if_exists <- function(key, rules) {
  rule <- gsub('[\\^\\$]', '', rules[key])
  if (is.null(rule)) { key } else { rule }
}

# Helper function, given a validation rule `rule`, returns all the references
# to other validation rules.
extract_rule_names <- function(rule) {
  unique(unlist(extract(rule, '\\d+')))
}

# Helper function, given a reference to a validation rule `rule_name` and the 
# list of validation rules `rules`, continuously evaluates the references to
# other validation rules contained in rule `rule_name` until it contains no
# more rule references, then returns the result.
expand_rule <- function(rule_name, rules) {
  rule <- gsub('[\\^\\$]', '', rules[rule_name])  # Remove the string start/end matches
  
  # So long as the rule string contains references to other rules...
  while (grepl('\\d', rule)) {
    
    rule_names <- extract_rule_names(rule)  # The referenced rule names
    
    # Replace each referenced rule name with the text of the referenced rule
    for (name in rule_names) {
      # Wrap the name in word boundaries, to avoid replacing part of a 
      # reference, i.e. the '5' in '15', for example
      pattern <- paste0('\\b', name, '\\b')
      replacement <- fetch_if_exists(name, rules)
      rule <- gsub(pattern, replacement, rule, perl = T)
    }
    
  }
  
  # Remove spaces and add the string start/end matches back
  paste0('^', gsub(' ', '', rule), '$')  
}

# Processing -------------------------------------------------------------------
input_to_process <- real_input              # Which input to process
rules <- get_rules(input_to_process)        # Get rules
messages <- get_messages(input_to_process)  # Get messages
rule0 <- expand_rule('0', rules)            # Expand rule['0'] to regex

# Check each message against rule['0'] to see if it matches
matches <- sapply(messages, function(x) grepl(rule0, x, perl = TRUE))
answer1 <- sum(matches)  # The number of matches

Using this approach, try printing the regular expression for rule['0'] to console. Ain’t she a beaut?

Part Two - Rule Recursion

And here we thought we’d get away with nice, simple, non-recursive rules. Ha! Part two introduces a couple of rules that reference themselves. Luckily, there are regular expressions that can handle that sort of thing. You’ll note that I took the hint from the puzzle text and just straight up replaced the given rule text for the two new rules with regular expressions that would behave nicely with my expand_rule() function and associated code.

# Setup ------------------------------------------------------------------------
source('exercise_1.R')

test_input <- c(
  "42: 9 14 | 10 1",
  "9: 14 27 | 1 26",
  "10: 23 14 | 28 1",
  "1: a",
  "11: 42 31",
  "5: 1 14 | 15 1",
  "19: 14 1 | 14 14",
  "12: 24 14 | 19 1",
  "16: 15 1 | 14 14",
  "31: 14 17 | 1 13",
  "6: 14 14 | 1 14",
  "2: 1 24 | 14 4",
  "0: 8 11",
  "13: 14 3 | 1 12",
  "15: 1 | 14",
  "17: 14 2 | 1 7",
  "23: 25 1 | 22 14",
  "28: 16 1",
  "4: 1 1",
  "20: 14 14 | 1 15",
  "3: 5 14 | 16 1",
  "27: 1 6 | 14 18",
  "14: b",
  "21: 14 1 | 1 14",
  "25: 1 1 | 1 14",
  "22: 14 14",
  "8: 42",
  "26: 14 22 | 1 20",
  "18: 15 15",
  "7: 14 5 | 1 21",
  "24: 14 1",
  "",
  "abbbbbabbbaaaababbaabbbbabababbbabbbbbbabaaaa",
  "bbabbbbaabaabba",
  "babbbbaabbbbbabbbbbbaabaaabaaa",
  "aaabbbbbbaaaabaababaabababbabaaabbababababaaa",
  "bbbbbbbaaaabbbbaaabbabaaa",
  "bbbababbbbaaaaaaaabbababaaababaabab",
  "ababaaaaaabaaab",
  "ababaaaaabbbaba",
  "baabbaaaabbaaaababbaababb",
  "abbbbabbbbaaaababbbbbbaaaababb",
  "aaaaabbaabaaaaababaa",
  "aaaabbaaaabbaaa",
  "aaaabbaabbaaaaaaabbbabbbaaabbaabaaa",
  "babaaabbbaaabaababbaabababaaab",
  "aabbbbbaabbbaaaaaabbbbbababaaaaabbaaabba"
)

real_input <- readLines('input.txt')

# Processing -------------------------------------------------------------------
input_to_process <- real_input              # Which input to process
rules <- get_rules(input_to_process)        # Get rules

# Replace rules '8' and '11' with recursive versions, as described in the
# puzzle instructions. These have been modified to include the relevant
# regex rules for each. For rule['8'], this meant replacing the self-reference
# to rule '8' with the regex for one or more matches to rule '42'. For 
# rule['11'], this meant wrapping the second part of the rule in a named
# capture group ('rec') and replaced the self-reference with a recursive regex
# match to the named capture group `&rec`. 
rules['8'] <- "(42 | (42)+)"
rules['11'] <- "(42 31 | (?'rec' 42 (?&rec)? 31))"


messages <- get_messages(input_to_process)  # Get messages
rule0 <- expand_rule('0', rules)            # Expand rule['0'] to regex

# Check each message against rule['0'] to see if it matches
matches <- sapply(messages, function(x) grepl(rule0, x, perl = TRUE))
answer2 <- sum(matches)  # The number of matches

Today, I learned that recursive regular expressions were a thing. I also learned that they’re extremely flexible and something that I should definitely use going forward.

Wrap-Up

Days like today represent one of the best reasons to try something like Advent of Code. I truly had never heard of recursive regular expressions before, but it turned out this really useful bit of functionality was there all along. I just never had a problem that truly required them come up before. This of course sent me down a rabbit hole of learning about regular expression engines (because not all of them support recursion), which was also quite enlightening.

If you found a different solution (or spotted a mistake in one of mine), please drop me a line!

comments powered by Disqus