Thursday, November 20, 2008

A math proof

It's intuitively obvious that this is true, and it's probably trivial to prove it mathematically, but I am several decades out of practice at writing proofs so I can't quite figure out how to prove it.

Suppose you have a sample of numbers and all you know is its average (not the individual values, nor how many there are). If you then add a bunch more numbers to the sample, and the new numbers have an average higher than the original sample's average, the new cumulative average must be higher than the original average.

As a word problem: If a sports player has a career average on some statistic, and his average for the next year is higher than his career average up to that year, his new career average must have increased.
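
To make the word problem concrete, here is a small sketch with made-up numbers (100 hits in 400 at-bats for the career so far, then 60 hits in 200 at-bats the next year). The function name and the figures are only illustrative assumptions; exact fractions are used so that rounding cannot blur the comparison.

```python
from fractions import Fraction


def combined_average(total_a, count_a, total_b, count_b):
    """Average of the pooled sample, given each group's total and count."""
    return Fraction(total_a + total_b, count_a + count_b)


# Hypothetical figures, chosen only for illustration.
career = Fraction(100, 400)   # career average before the new year
season = Fraction(60, 200)    # average for the new year alone
new_career = combined_average(100, 400, 60, 200)

# The new career average lands strictly between the old one and the season's.
assert career < new_career < season
```

One example proves nothing in general, of course; it only shows what the claim looks like with numbers plugged in.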

In mathematical lingo:
Let A and B be two collections of numbers (repeated values allowed), with at least one element each.
Let mA and mB be the averages of the values in these collections.
Let C be the combined collection holding every value of A together with every value of B; then mC is the average of the values in C.
If mB > mA, it follows that mC > mA.

Trying to write a rigorous mathematical proof, I get nowhere. Writing tA and nA for the total and the count of the values in A (and likewise tB and nB for B), I start with tA/nA < tB/nB and try to do things to that inequality (adding things to both sides, for instance) in hopes of getting closer to (tA+tB)/(nA+nB) > tA/nA, but I never get anywhere.
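
For reference, one route from that starting inequality that does seem to go through is sketched below. It assumes t_A, t_B are the totals and n_A, n_B the counts, with both counts positive so that multiplying or dividing by them keeps the direction of the inequality.

```latex
\begin{align*}
\frac{t_A}{n_A} &< \frac{t_B}{n_B}
  && \text{(given)} \\
t_A\, n_B &< t_B\, n_A
  && \text{(multiply both sides by } n_A n_B > 0\text{)} \\
t_A\, n_A + t_A\, n_B &< t_A\, n_A + t_B\, n_A
  && \text{(add } t_A n_A \text{ to both sides)} \\
t_A\,(n_A + n_B) &< n_A\,(t_A + t_B)
  && \text{(factor each side)} \\
\frac{t_A}{n_A} &< \frac{t_A + t_B}{n_A + n_B}
  && \text{(divide by } n_A (n_A + n_B) > 0\text{)}
\end{align*}
```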

I suppose there's some small chance that this is one of those things where proving it is a lot harder than it seems intuitively, but more likely, it just means that in the twenty years since my college days, I have forgotten almost everything. Meh.


drscorpio said...

OK. I don't know how to format subscripts. This may get messy.

Let k = the number of elements in A, and t = the number of elements in B.
mA = (sum of the elements of A)/k, so
(sum of the elements of A) = k*mA.
Similarly, (sum of elements of B) = t*mB.
Also mC = [(sum of the elements of A) + (sum of the elements of B)]/(k+t).
So, mC = (k*mA + t*mB)/(k+t).
Now, we assume mA < mB, so, since t > 0,
t*mA < t*mB, and thus, adding k*mA to both sides,
k*mA + t*mA < k*mA + t*mB.

Now, mA = mA*(k+t)/(k+t)
So, mA = (k*mA + t*mA)/(k+t)
By the inequality at the end of the last paragraph, divided through by k+t (which is positive),
mA = (k*mA + t*mA)/(k+t) < (k*mA + t*mB)/(k+t) = mC. QED
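
Since plain text makes this hard to read, here is a compact typeset sketch of the same idea, written as a difference instead of a chain of inequalities. It uses the same symbols as above (k and t for the sizes of A and B, both at least 1) and is meant as a restatement under those assumptions, not a different result.

```latex
\begin{align*}
m_C - m_A
  &= \frac{k\, m_A + t\, m_B}{k + t} - m_A \\
  &= \frac{k\, m_A + t\, m_B - (k + t)\, m_A}{k + t} \\
  &= \frac{t\,(m_B - m_A)}{k + t} \;>\; 0
  \qquad \text{because } t > 0,\ k + t > 0,\ \text{and } m_B > m_A .
\end{align*}
```

Written this way, the gain in the combined average is the gap m_B - m_A scaled by B's share of the combined count, t/(k+t).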

Hawthorn Thistleberry said...

Okay, I can understand the proof. (That last step took me a few tries.) But I still don't remember what I would have done to come up with it! I used to know how to do that.

drscorpio said...

I can see how the last part wasn't completely clear. I was frustrated by my lack of formatting ability, so I cut it short.

The key idea in the proof was realizing that you needed to express the sums in terms of the averages. Then the only other clever bit (IMHO) was multiplying by 1 in the form (k+t)/(k+t).

I basically wrote down everything I could algebraically derive from the pieces I had until I saw a connection.

I think your main issue here may just be lack of practice. I do this stuff every day of my working life as well as teach other people how to do it. If you spent more time thinking about such things, your abilities would soon sharpen.

Hawthorn Thistleberry said...

I did the same thing, just starting with what I had and doing stuff with it in hopes of getting to something, but maybe I stopped too soon.

Subscripts are <sub>A</sub> in HTML but Blogger won't allow that in comment boxes, though it is allowed in the original post.