First approach for the Kaggle Santa 2019 challenge

Okay here it is: I’ve been waiting for it for almost a year…
The new Santa Kaggle Challenge.

This year is a special one for me in this challenge. Normally I compete with all the time I have, try to get the best score I can, and maybe blog about it when it is all over.

But this year I’m blogging about it while competing. I think it is more interesting for people to read the results in between instead of just the polished end result. I know that everyone can copy it now and that I won’t win this challenge, but I never did anyway 😀

Maybe I can get a discussion medal with it…

Before we start I also want to thank my patrons again because it wouldn’t be that awesome without them. Normally you get posts 2 days earlier when you support me, but that would be against the Kaggle rules, so everyone gets this one as soon as I’ve finished writing.

Let’s start:

The problem

Some families are allowed to visit Santa’s workshop. Amazing news for 5,000 families. Good for them I would say 😉 His workshop is probably quite big, but to keep it from getting too crowded at most 300 people can visit per day, and opening it for fewer than 125 isn’t reasonable. Santa decided to open the workshop for 100 days. The families are super excited to see all the stuff, but some days suit them better than others, so each family ranks its preferred days and sends the list to Santa in advance.

Basically the problem is: How to match each family to a day?
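To make the constraints a bit more concrete, here is a minimal Python sketch of how one could count the visitors per day and check the 125 to 300 limit. The names `daily_occupancy`, `is_feasible`, `assigned_day` and `family_size` are made up for this illustration, and I assume days are numbered 1 to 100 and that we know the size of each family.

```python
MIN_PEOPLE = 125   # fewer visitors than this is not reasonable
MAX_PEOPLE = 300   # more visitors than this is too crowded
N_DAYS = 100       # the workshop is open for 100 days


def daily_occupancy(assigned_day, family_size, n_days=N_DAYS):
    """Count the people per day.

    assigned_day[i] is the day (1..n_days) given to family i and
    family_size[i] is the number of members of family i.
    Both names are hypothetical, chosen for this sketch.
    """
    occupancy = [0] * n_days
    for day, size in zip(assigned_day, family_size):
        occupancy[day - 1] += size  # day 1 is stored at index 0
    return occupancy


def is_feasible(occupancy, lo=MIN_PEOPLE, hi=MAX_PEOPLE):
    """True if every day stays within the allowed number of visitors."""
    return all(lo <= n <= hi for n in occupancy)
```

Any assignment that fails this check is not a valid solution, no matter how cheap it would be for Santa.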

Of course Santa is Santa and therefore wants to be fair, so if a family doesn’t get their best choice they should get something in return. But Santa can’t produce more stuff, so this time it’s just cash, food and a helicopter ride. Yes, you heard that right: a North Pole helicopter ride. Oh man that sounds … Okay, got distracted here.

Now you can see that this costs Santa money, and you know the world we live in: he wants to minimize his expenses.

Additionally there is a non-linear cost called the accounting penalty. It’s a complicated formula, but I trust Santa’s accountants, so it should be correct 😉 (Well, there actually was a bug on the evaluation page before.)

\[
\text{accounting penalty } =\sum_{d=100}^{1} \frac{\left(N_{d}-125\right)}{400} N_{d}^{\left(\frac{1}{2}+\frac{|N_{d}-N_{d+1}|}{50}\right)}\]

Where \(d\) is the number of days before Christmas, so we are counting down towards X-mas here. \(N_d\) is the number of people attending on that day, and because we are counting down, \(N_{d+1}\) is the number of people attending the day before.
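The formula can be sketched in a few lines of Python. This is just my reading of it, not an official implementation: `accounting_penalty` and `occupancy` are names I made up, and for the very first term (\(d=100\)) there is no earlier day, so I assume \(N_{101} = N_{100}\) there.

```python
def accounting_penalty(occupancy):
    """Accounting penalty for a list of daily visitor counts.

    occupancy[d - 1] holds N_d, the number of people on day d
    (d = 1 is the last day before Christmas).
    """
    n_days = len(occupancy)
    total = 0.0
    # Count down from the earliest day (largest d) towards Christmas.
    for d in range(n_days, 0, -1):
        n_d = occupancy[d - 1]
        # Assumption: the earliest day has no predecessor, so N_{d+1} = N_d.
        n_prev = occupancy[d] if d < n_days else n_d
        exponent = 0.5 + abs(n_d - n_prev) / 50
        total += (n_d - 125) / 400 * n_d ** exponent
    return total
```

Two things stand out when playing with it: a day with exactly 125 people costs nothing, and big jumps between consecutive days are brutal. Going from 125 people to 300 the next day gives an exponent of \(0.5 + 175/50 = 4\), so that single day costs around 3.5 billion.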

Now let’s have a deeper look into the other costs which are described here:

If the family gets their first choice for the workshop tour Santa doesn’t have to pay anything. Otherwise we have the preference costs:

  • 2nd choice: 50$
  • 3rd choice: 50$ + 9$ per family member
  • 4th choice: 100$ + 9$ per family…