Monday, July 27, 2009

The Broken Stick Experiment

I found several interesting video lectures at the BLOSSOMS Video Library. One in particular that I enjoyed is called The Broken Stick Experiment [Flash]. In it, Professor Richard Larson asks the seemingly simple question,
If you break a stick randomly in two places, what is the probability that you can form a triangle from the three pieces?
If one of the pieces is too long, then the other two can't meet to form the triangle.
The lecture is only 33 minutes, and Professor Larson builds up a very pretty geometric proof, so I won't spoil it by posting the answer here. I will provide code for a simulation, though (yes, this is still a programming blog). See if you can guess the answer before running the simulation or watching the video.

import java.util.Arrays;
import java.util.Random;

public class BrokenStick {

public static void main(String[] args) {
Random random = new Random();

int trials = 10000;
int triangles = 0;

double[] breaks = new double[2];
double[] sides = new double[3];

for( int i=0; i < trials; ++i )
{
breaks[0] = random.nextDouble();
breaks[1] = random.nextDouble();
Arrays.sort(breaks);

sides[0] = breaks[0];
sides[1] = breaks[1] - breaks[0];
sides[2] = 1.0 - breaks[1];
Arrays.sort(sides);

if( sides[2] < (sides[0] + sides[1]) )
{
triangles++;
}
}

System.out.println("Triangles: " + triangles);
System.out.println("Trials: " + trials);

double p = (double)triangles / trials;
System.out.println("Proportion: " + p);
}
}


The simulation breaks 10,000 sticks each into three random pieces using Java's Random class to come up with random lengths. If the longest piece of broken stick is longer than the two remaining pieces, a triangle can't be formed. Ten-thousand trials should be enough to accurately estimate the proportion of triangles made from the broken sticks to two decimal places. How close was your guess?

Related:
The Broken Stick Revisited

41 comments:

rbonvall said...

Thanks, I had a fun time solving it :)

At least to me, this problem lends itself nicely to a geometric solution. In the a-b-c space, the possible breaks must satisfy a+b+c=1, which is a plane. Then, it reduces to find the fraction of the "positive part" of the plane that satisfies the constraints. And I always have fun drawing intersecting planes :P

Bill the Lizard said...

rbonvall,
"And I always have fun drawing intersecting planes :P"

Your comment reminds me of the first lecture (or maybe the second, I can't quite remember) in MIT's Linear Algebra video series. Gilbert Strang shows several methods for solving systems of equations, and makes a complete mess of drawing intersecting planes. He uses that as an illustration of why solving systems of equations using matrices is far superior. :)

John | Retro Programming said...

Thanks, it's an interesting problem. I think I've got the solution, I just need to run a program to double check my result :-)

Alexey Busygin said...

I think it should be 33,3%.

Bill the Lizard said...

Alexey,
It's not 33%, but your initial guess is closer than mine was.

Trevel said...

I estimate at just under 25% triangle. Trying to figure if there's a flaw in my logic, but I can't pick it out...

tg said...

I agree with Trevel. Independent breaks which both must occur on the same half of the stick. <25%

tg said...

Got the right answer, but not sure if the logic is right:

If two breaks occur on the same half of the stick, then on piece is longer than the sum of the other two. These are independent events. The probability of one break on 50% of the stick is 1/2, the probability of two (independent) breaks on the same half is 1/2*1/2 = 0.25.

Bill the Lizard said...

tg,
Wouldn't that indicate that there's a 25% chance of not forming a triangle?

tg said...

back to the drawing board....

Bill the Lizard said...

tg,
Don't feel bad, you almost had me convinced. It took me ten minutes to spot the flaw, and I've watched the video twice! :)

tg said...

How about one more try using the same concept (except using it correctly to represent all three lengths, as I didn't do the first go round):

Probability of NOT forming P(NF) =
P(a > b+c)*P(b>a+c)*P(c>b+a)

As mentioned earlier P(a>b+c) = probability of two breaks on 1/2 of the stick = 0.25.

P(NF) = 0.25*0.25*0.25 = 0.75
Thus P( NOT NF) = 0.25

Bill the Lizard said...

tg,
0.25 * 0.25 * 0.25 = 0.015625

Back to your original statement, though, you said "If two breaks occur on the same half of the stick, then on piece is longer than the sum of the other two." This is correct, but it's only one way that you could get one piece that's longer than the other two. You could also break the stick on opposite ends of the stick and have the middle piece be too long.

tg said...

B the L, you are a patient individual.

One final attempt to redeem myself. The "*" was supposed to be "+". The risks of writing at work.

We are looking for the Probability of (a>b+c) OR (b>a+c) OR (c>a+b).

a + b + c = L
P(a>b+c) = P(a>L/2)

For a>L/2, two breaks must occur on the right half:
P(of a break occurring on the right half) = 0.5
P(of two independent breaks occurring on the right)=
P(a>L/2) = 0.5*0.5 = 0.25

For b>L/2, two independent breaks must occur on either the left quarter, the right quarter, or both breaks on the left quarter, or both on the right quarter. The probability of ONE break landing on that 50% of the stick = 0.5. The probability of another break independently landing on the same 50%of the stick = 0.5*0.5 = 0.25.

c is the same as a, so
P(c>b+a) = P(a>b+c) = 0.25

For any one length > L/2 = (a>b+c) OR (b>a+c) OR (c>a+b) = 0.25+0.25+0.25.

If this is wrong, spare me the indignity and just delete my comments.

Appreciate your comments, didn't realize how lazy I'd gotten!

Bill the Lizard said...

tg,
Now you have it! I like the approach you used because you're showing the exact opposite of what the problem asks for, the probability of NOT forming a triangle. This kind of thinking is a very powerful tool in mathematics, and is often overlooked.

Sam152 said...

Very interesting post, my nifty c++ program calculated about 18.9% successes after 10,000 tries. How close is this?

Bill the Lizard said...

Sam152,
That's a little bit low; lower than rounding error would explain. Do you want to share your code? You could either post it here, or as a Stack Overflow question and I'd be happy to take a look.

fiberfiend6891 said...

Just a heads up - the link directly to the video has http doubled and doesn't work when clicked directly (I had to copy/paste and get rid of the double):
http://http//blossoms.mit.edu/video/larson-watch.html

This is my first time seeing your blog, and I absolutely LOVE it!

Bill the Lizard said...

fiberfiend6891,
I appreciate the heads up. That link is now fixed. Thanks for reading.

Tom Raywood said...

Bill the Lizard,

Hi. Great question.

There is one thing I'm not clear on, though, from the description. Do obtuse triangles count here, or is the solution limited to acute?

Bill the Lizard said...

Tom,
You could end up with an acute, obtuse, or (rarely) equilateral triangle. You've got me curious now, so I'll have to modify the code later and run another simulation to see if acute or obtuse triangles are more common. Good question!

Tom Raywood said...

Bill the Lizard,

Good. If you limit outcomes to acute triangles, (including equilaterals), what probability do you get?

Oh, and another question [intended to confirm an assumption on my part]: Is this strictly an ordered set or can any of the three pieces form the base?

Bill the Lizard said...

Tom,
Any of the three sides could form the base (I'm assuming by base you mean the longest side). In the code I gave you can see how I work around this by calculating the length of the three sides, then immediately sorting them (on line 24) so that I always know where the longest side ends up.

I'm working on an entire post to answer your other question. :)

Tom Raywood said...

Bill the Lizard,

Gee, no I hadn't imagined that the base of the triangle had to be the largest of the three pieces. All of these unstated parameters make it very difficult to approach the problem in a systematic way.

So far, yes, the triangle can be of any sort, the pieces can be moved around at will and, finally, (if I understand your last answer), the base of the triangle is expected to be whichever piece [of the stick] that's the longest.

Is this correct? Is there anything else in the way of allowances and/or limitations?

Bill the Lizard said...

Tom,
Disregard my last answer, as I had misunderstood your question. There are no constraints on what kind of triangle you can get (acute, obtuse, right, or no triangle at all) or on the order of the pieces. Any of the pieces can be the longest when you randomly select the two break-points. The only reason I sort the pieces in my code is because I need to know which piece is longest, so that I can tell if the pieces form a triangle or not. This has no impact on the final outcome.

Tom Raywood said...

Bill the Lizard,

Whoa, hey, no, it can't be as simple as that can it?

All it takes for the three pieces to form a triangle is that none of the pieces have a length greater than or equal to 1/2 the length of the stick. And that's a full 50% of the time.

What gives? I thought certainly the question would pose a greater challenge than that.

Bill the Lizard said...

Tom,
Close. "All it takes for the three pieces to form a triangle is that none of the pieces have a length greater than or equal to 1/2 the length of the stick." This much is true, but that doesn't happen a full 50% of the time (which was also my initial guess before I watched the full lecture).

Tom Raywood said...

Bill the Lizard,

So true. (I'm feeling better already.) After 100 simulations with n=1,000,000 I'm seeing slightly less than 50%. More specifically I'm seeing 49.994067% and am definitely curious as to what explains this slight deviation. I think I'd rather contemplate it for a while though, that is, to see if I can figure it out myself.

Tom Raywood said...

Bill the Lizard,

To follow up here, beginnings of a guess are as follows: For some small range [in the form of a difference], as two legs of the triangle both approach 1/2, the small difference between them cannot be bridged by the infinitesimally small remaining leg. Be nice to make a formal statement.

Bill the Lizard said...

Tom,
Check your simulation code against mine. Your answer is off by quite a bit more than just rounding error. You can post your code here, or on Stack Overflow if you'd like me to take a look at it (provided it's in a language I know reasonably well, Java, C, C++, Python, PHP, or any dialect of Basic).

Tom Raywood said...

Bill the Lizard,
Definitely courting tedium at this point.

n R1 R2 R3 T P1 P2 P3 C1

These represent column headings in Excel.

R1, R2 and R3 denote random numbers using the RANDBETWEEN function; I chose for each number the range from 1 to 10,001.

T denotes the sum of R1, R2 and R3 for each row.

P1, P2 and P3 denote which percentage of T is entailed of R1, R2 and R3 respectively.

C1 (for condition 1) represents a check to see whether P1, P2 and P3 all three entail percentages less than 0.5, posting a 1 if they do and a 0 if they don't.

n denotes the number of iterations which, as I said, I took the time to set at 1 million.

Totaling the C1 column provides a sum which, divided into 1 million, quite closely approximates the probability that no 'leg of the triangle' will be greater than or equal to one half.

Additionally I generated a macro which recalculates the worksheet 100 times, posts each result to a table and then finds the average of that set.

You can imagine my surprise to hear that the actual probability varies significantly from what this arrangement points up.

I'll take a look at that lecture this weekend.

Bill the Lizard said...

Tom,
From your explanation it looks like you're generating three random lengths to represent the three pieces of stick. The problem with this approach is that there's nothing to make sure the three pieces all add up to the original length of the stick.

Try it by generating only 2 random numbers (x and y) that represent the break points. If 10,000 is the length of the original stick, and you pick 2 random numbers between 0 and 10,000 then the lengths of the three pieces will be x, y-x, and 10,000-y when x < y, or y, x-y, and 10,000-x when x > y. (It's easier if you just select two random numbers then sort them so x < y, then just always use x, y-x, and 10,000-y for the three lengths, which is essentially what I did in my code.)

I'll try to get my next post up this weekend. I have the answer to your question about acute and obtuse triangles, but proving why I'm getting the number I'm getting is proving to be a little bit complicated.

Tom Raywood said...

Bill the Lizard,
I'll work this up and see what gives. It's definitely counterintuitive though that always having the same stick length would even matter, that is, any more than it would matter that triangle of type Q presents as q1, q2, ...qn for n of any size. Be nice to hear the why behind your assertion. Maybe the lecture will help in this regard.

Bill the Lizard said...

Tom,
The particular value of the stick length doesn't matter as much as making sure the three pieces add up to whatever length you decide on. I used a stick length of 1.0 in my similation, but it doesn't hurt anything if you choose a length of 10,000.

What really makes a difference is how you choose your random lengths. If you choose three random lengths then add them together, you don't really know the length of the stick you started with. If you start with a specific length, then make two random breaks within the boundaries of the stick, you're guaranteed to end up with three random lengths that add up to your original stick length.

It's possible that I just didn't correctly understand the explanation of your spreadsheet. Without seeing the formulas you used I can't be sure that this part of your simulation isn't correct. The errors you're getting could be from another part of the calculation, but this is the most logical place to start looking.

Tom Raywood said...

Bill the Lizard,

I suspect most people would anticipate that having a consistent or particular stick length shouldn't matter here. What's at issue is where the two breaks appear, that is, relative to overall length. A million sticks all of different lengths and all randomly broken in two places should demonstrate the same probabilistic characteristics of a million sticks all of the same length because, again, what's at issue is relative in any event.

This principle can be demonstrated as follows. In my simulation each stick length is represented by T, that is, the sum of R1, R2 and R3. If I wanted/needed to ensure that each and every occurrence of T always comes out to equal a certain value, a few more columns could be inserted which accomplish this simply by adjusting the sizes of R1, R2 and R3 by some definitive amount. For example, if for one row T presented as equal to 16,000 [compared to, say, the value of 21,000 I'd predetermined as a uniform stick length], obviously all I'd have to do is increase each of the original random numbers by 21/16.

So even though now all million stick lengths are the same, the actual percentage of that length occupied by each R1, R2 and R3 hasn't changed at all. The outcome would still match the 50% probability that my simulation currently produces.

In short, Woe to The Foe, heh heh heh.

Here are the formulas I employ for...

...{R1,R2,R3}
................ =RANDBETWEEN(1,10001)

..........{T}
................ =SUM(E2:G2), because R1, R2 and R3 occupy columns E, F and G.

...{P1,P2,P3}
................ =E2/H2; =F2/H2; and =G2/H2, because T occupies column H.

.........{C1}
................ =IF((J2<0.5)*(K2<0.5)*(L2<0.5),1,0), because P1, P2 and P3 occupy columns J, K and L.

And fun was had by all.

Spartacus said...

1. Probability that both cuts are on opposite sides of the midpoint: 1/2
2. Probability that two cuts on opposite sides of the midpoint are less than half the stick length apart: 1/2

Therefore the probability of forming a triangle is 1/2*1/2=1/4

Here is a more difficult problem with an even more surprising result: You break the stick into two pieces, then randomly choose one of the two pieces and break it into two pieces. What is the probability that the resulting three pieces can form a triangle?

You will be amazed by the solution!

(Hint: the answer is not just a simple fraction like the original problem; however there is a very nice closed form)

Bill the Lizard said...

Spartacus,
I've read about half-a-dozen variations on this problem (mostly linked from the notes on the BLOSSOMS page) but I haven't seen that one. This might take a while. :)

Anonymous said...

Sam152's solution is correct; it is the answer the Spartacus's variant.


Bill's original problem statement was under-specified, admitting two interpretations that yield different answers. (Do you choose two cutpoints independently and uniformly on the original stick; or do you make one uniform random cut, and then make another uniform random cut on one of the pieces?)

See http://www.cut-the-knot.org/Curriculum/Probability/TriProbability.shtml

Anonymous said...

This issue might be very old and might already been resolved. but just for clarification your value for second side is wrong and should be taken as follows:

breaks[1] = random.nextDouble()*(1-breaks[0]);

With this correction, the probability comes to 0.193 which is the correct solution.

Bill the Lizard said...

Anonymous,

No, that's incorrect. The second break can take place at any point along the length of the stick. Your change restricts it to only the length of the first piece broken off.

Anonymous said...

Ok yes you are correct, this imposes a restriction and thus reduces the probability. the correct answer is 0.25 only and not 0.193.
thanks.