Skip to content
Marshall Pierce edited this page Mar 22, 2015 · 2 revisions

Amazingly slow string length calculation

Create a system that reads lines from a prompt and displays the length of the line using the following highly improbable length calculation method:

public static void showLength(String str) {
    try {
        Thread.sleep(5000);
    } catch (InterruptedException e) {
        System.out.println("Interrupted, was calculating length of <" + str + ">");
        Thread.currentThread().interrupt();
        return;
    }

    System.out.println("Length of <" + str + "> is " + str.length());
}

Given the highly intensive nature of this calculation, you should farm off this work into a thread pool to make the most of your CPU. You could submit a new Runnable for each String, or have a BlockingQueue of Strings that a fixed set of Runnables draws from.

Once that's all done, try adding functionality so that you can quit your program by typing a magic string like "!quit" and having that cancel in-progress tasks and shut down your executor.

Parallel web crawler

Starting with a URL (or set of URLs), download the page and look for things of this pattern:

<a href="http://foo.com">asdf</a>

You can use jsoup or jericho to walk the parse tree.

Extract any URLs you find (like http://foo.com in this case), visit them, etc. You can use the stuff built into the JDK for HTTP downloading or use a library like http components from Apache.

Think carefully about how to store which URLs have already been visited (so you don't visit them twice) and which ones have been discovered but not visited yet. Remember, worry about getting it correct before you worry about making it fast. There are many "right answers" to this, especially since assumptions about network performance, etc are just estimates.

You should make your program stop after visiting some number of URLs so as not to annoy your target servers. Experiment with how different numbers of threads affect the overall URLs/sec rate.

More implementations of Handoff

Based on the examples in IntrinsicHandoff and ExplicitHandoff, write a few more implementations. Try using BlockingQueue or Semaphores.

Other challenges are adding two types of poll() method. Make one that returns null if there's nothing in the Handoff and one that blocks until something is in the handoff and then returns what was put in (but doesn't clear it out like take() does).

ForkJoin

See Dan Grossman's census processing project: [http://www.cs.washington.edu/homes/djg/teachingMaterials/grossmanSPAC_project.html]

Clone this wiki locally