Testing Rebecca Black’s Claims

Posted by Kurt on March 25th, 2011

I was unsure about some of the bold claims that Rebecca Black made in her latest hit video, so I wrote some unit tests for it. It turns out she is telling the truth!

public void testRebeccaIsntLying() {
    Calendar rebeccasCalendar =
        Calendar.getInstance(TimeZone.getTimeZone("America/Los_Angeles"));

    // The song went viral on Friday, March 11, 2011
    rebeccasCalendar.set(2011, Calendar.MARCH, 11);

    // Yesterday was Thursday, Thursday
    rebeccasCalendar.add(Calendar.DAY_OF_MONTH, -1);
    assertEquals(Calendar.THURSDAY, rebeccasCalendar.get(Calendar.DAY_OF_WEEK));

    // Today i-is Friday, Friday (Partyin’)
    rebeccasCalendar.set(2011, 2, 11);
    assertEquals(Calendar.FRIDAY, rebeccasCalendar.get(Calendar.DAY_OF_WEEK));

    // Tomorrow is Saturday
    rebeccasCalendar.add(Calendar.DAY_OF_MONTH, 1);
    assertEquals(Calendar.SATURDAY, rebeccasCalendar.get(Calendar.DAY_OF_WEEK));

    // And Sunday comes afterwards
    rebeccasCalendar.add(Calendar.DAY_OF_MONTH, 1);
    assertEquals(Calendar.SUNDAY, rebeccasCalendar.get(Calendar.DAY_OF_WEEK));
  }

I sent this code out for review at work, and received quite a bit of insightful feedback from my co-workers:

  • As you mentioned two lines up, “yesterday was Thursday, Thursday”. Do we need to assertEquals twice?
  • I appreciate the attempt, but I have to say I find your test lacking. For starters, you only assert that yesterday was Thursday once.  I’d see two sequential assertions as the absolute minimum. More concerningly, a crucial property of the calendar is that it can be used to determine whether or not we need to get down.  Yet I see no assertion that we need to get down if and only if it is indeed Friday. Even if, in fact, it proves that no one is looking forward to the weekend, I don’t believe your test will catch it, to say nothing of our current observed state of partying. But these are just examples.  Who needs a Calendar object if this is all it’s good for?
  • Shouldn’t the time zone be determined dynamically? Rebecca may be traveling.
  • I’d ask you to assert that she has chosen the appropriate seat, but this may be out of the scope of this change.  A TODO will be fine.
  • While you are adding the above test, I would like you to consider updating this test case as well to assert that Rebecca is indeed looking forward to the weekend, as well as partying partying during the weekend, as appropriate given the day.
  • Can you devise tests for the following constraints made in the original documentation: 1. Gotta get down on Friday.  (e.g. ASSERT(didGetDown) 2. Everybody’s looking forward to the weekend.

Data structure for weighted sampling

Posted by Kurt on March 9th, 2009

I was asked the following question: Design a data structure that supports 2 operations:

  • insert(element, weight)
  • sample()

When you insert an element into the data structure, you specify a weight. This weight divided by the total of the other weights in the data structure, is the probability that the item is returned when sample() is invoked. You should design your data structure with both speed and space in mind.

Here is a good solution. The insert() operation requires O(1) constant amortized time and the sample() operation requires O(lg n) time due to binary search. The space requirements is O(n) as well.

public final class BinarySearchWeightedSampler<E> {
  private final List<E> elements;
  private final List<Double> weights;

  public BinarySearchWeightedSampler() {
    this.elements = new ArrayList<E>();
    this.weights = new ArrayList<Double>();
  }

  public void insert(E element, double weight) {
      elements.add(element);
      if (weights.isEmpty()) weights.add(weight);
      else weights.add(weight + weights.get(weights.size() - 1));
  }

  public E sample() {
    return elements.get(
        binarySearch(Math.random(), 0, weights.size() - 1,
                     weights.get(weights.size() - 1)));
  }

  private int binarySearch(double val, int low, int high,
      double totalWeight) {
    if (high < low) return low;
    int mid = low + ((high - low) / 2);
    int midValue = weights.get(mid) / totalWeight;
    if (midValue > val)
      return binarySearch(val, low, mid - 1, totalWeight);
    else if (midValue < val)
      return binarySearch(val, mid + 1, high, totalWeight);
    else
      return mid;
  }
}

If we ignore the space requirements and assume the weights have finite precision (I chose integers for clarity, but any finite precision number can be made to work), we can sample in O(1) constant time:

public final class DiscreteWeightedSampler<E> {
  private final List<E> elements;

  public DiscreteWeightedSampler() {
    this.elements = new ArrayList<E>();
  }

  public void insert(E element, int weight) {
    for (int i = 0; i < weight; ++i) {
      elements.add(element);
    }
  }

  public E sample() {
    return elements.get((int) (Math.random() * elements.size()));
  }
}

Modified version of Webby Blue
Copyright © 2008 kloover.com. All rights reserved.
**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**