Tuesday, November 10, 2015

Coxeter Exercise 4 of Section 1.3 in "Introduction To Geometry"


In this note we do the exercise listed in the title according to the hint given by the author.

   In the last blog entry it was shown that right triangles BOR and RPP1 in the diagram above were similar by constructing a circle whose center was the midpoint of segment OB, which included points B, P, O, R, and thus had inscribed angles BPR and BOR both subtended by chord BR and therefore equal to each other. We will call this circle [BPOR]. This result is the key to the exercise which is to prove the Erdos-Mordell Theorem: If O is any point inside a triangle ABC and P, Q, R are the feet of the perpendiculars from O upon the respective sides BC, CA, AB, then OA + OB + OC ≥ 2(OP + OQ + OR).
   The hint supplied by Coxeter is this: Let P1 and P2 be the feet of perpendiculars from R and Q upon BC. Define analogous points Q1 and Q2, R1 and R2 on the other sides. Using the similarity of the triangles PRP1 and OBR, express P1P in terms of RP, OR, and OB. After substituting such expressions into OA + OB + OC ≥ OA(P1P + PP2)/RQ + OB(Q1Q + QQ2)/PR + OC(R1R + RR2)/QP, collect the terms involving OP, OQ, OR, respectively.
   In the diagram above the points P1 and P2 are shown along with the feet of the perpendiculars from the arbitrary point O in ABC. Including the perpendiculars from P to Q1, R to Q2, P to R1, and Q to R2 makes for a pretty complicated drawing. It is suggested the reader draw the other two cases separately in order to better picture the argument below. Note: To play around with the parameters of the diagram, try different (sliding) values of the variables defined at the left in this diagram.
   Just as we constructed circle [BPOR] and established the similarity of triangles BOR and RPP1, we can construct five other such circles and pairs of similar triangles. Thus, circle [CPOQ] yields the similarity of triangles COQ and QPP2, circle [CQOP] yields COP similar to PQQ1, circle [AQOR] yields AOR similar to RQQ2, circle [BROP] yields BOP similar to PRR1, and circle [AROQ] yields AOQ similar to QRR2.
   From these similarities we can write:
PP1/OR = RP1/BR = RP/OB, so PP1 = (OR x RP)/OB
QP2/CQ = PP2/OQ = PQ/OC, so PP2 = (OQ x PQ)/OC
PQ1/PC = QQ1/OP = PQ/OC, so QQ1 = (OP x PQ)/OC
RQ2/RA = QQ2/OR = RQ/OA, so QQ2 = (OR x RQ)/OA
PR1/BP = RR1/OP = RP/OB, so RR1 = (OP x RP)/OB
QR2/AQ = RR2/OQ = RQ/OA, so RR2 = (OQ x RQ)/OA.
   Also notice that in the quadrilateral RP1P2Q, the distance between parallel segments RP1 and QP2 is P1P2 and therefore RQ ≥ P1P2 = P1P + PP2, or (P1P + PP2)/RQ ≤ 1. Analogously, we get
(Q1Q + QQ2)/PR ≤ 1 and (R1R + RR2)/QP ≤ 1. Thus,
OA + OB + OC ≥ OA (P1P + PP2)/RQ + OB (Q1Q + QQ2)/PR + OC (R1R + RR2)/QP.
   Now substitute the expressions obtained by the similarities into the right hand side of this last expression and collect terms to get
OA + OB + OC ≥ OR[(OA x RP)/(RQ x OB) + (OB x RQ)/(RP x OA)] + OQ[(OA x PQ)/(RQ x OC) + (OC x RQ)/(PQ x OA)] + OP[(OB x PQ)/(RP x OC) + (OC x RP)/(PQ x OB)], which is of the form
OA + OB + OC ≥ OR[z + 1/z] + OQ[y + 1/y] + OP[x + 1/x], where x, y, z > 0.
   If f(u) = u + 1/u, u > 0, then f'(u) = 1 - 1/u^2 = 0 when u=1, f''(1) > 0, so the minimum value of f is f(1) = 2. Thus OA + OB + OC ≥ 2(OR + OQ + OP).
 

Monday, November 9, 2015

Similar Triangles


     Two triangles are called similar when they have the same three angles. When triangles are similar they have the same shape. That is, corresponding sides (the sides opposite equal angles) are in proportion. So if triangle ABC is similar to triangle DEF where angle A = angle D, angle B = angle E, and angle C = angle F, then BC/EF = AC/DF = AB/DE, where, for example, AB is the length of the side from point A to point B in triangle ABC, etc.
   In general, since the angles of a triangle always add up to 180 degrees, we only have to show that two of the angles are equal because that implies the remaining ones are equal. In the case of two right triangles, since one angle is known to be 90 degrees, we only have to show that one of the other two angles is equal to its counterpart to establish similarity.
   In high school Geometry it is usually pretty straightforward to establish similarity in the problems one is asked to solve. However, there are some cases where things get a little complicated. This note is about one such case.
   In the following diagram we have OR perpendicular to BR, OP perpendicular to BP and RP1 perpendicular to BP at point P1. We want to prove that triangle OBR is similar to triangle PRP1. Since they are both right triangles it is sufficient to show that angle ROB is equal to either angle PRP1 or angle RPP1. From the diagram it "looks like" it should be the latter. Before proving this result I would encourage the reader to play around with this diagram to try to establish the equality by straightforward methods. After all, with all these right triangles around it seems one could deduce this result without too much trouble. When I tried this exercise I was frustrated after many such straightforward attempts and a restless night of sleep. Eventually, an idea popped into my head that yielded the desired result. The fact that it wasn't so straightforward raises an interesting question in itself. That is, what makes a relatively simple problem harder than it first appears? Or, how "obvious" are geometric relationships? Or, why are some things harder to prove than others? Thoughts for another day.



NOTE:    This problem arose as part of Exercise 4 in Section 1.3 of H.S.M. Coxeter's "Introduction to Geometry". The problem in Coxeter is to prove the Erdos-Mordell theorem: If O is any point inside a triangle ABC and P, Q, R are the feet of the perpendiculars from O upon the respective sides BC, CA, AB, then OA + OB + OC ≥ 2(OP + OQ + OR). [The hint he gives is to let P1 and P2 be the feet of the perpendiculars from R and Q upon BC and to define analogous points Q1 and Q2, R1 and R2 on the other sides. Then, using the similarity of the triangles PRP1 and OBR, P1P can be expressed in terms of RP, OR, and OB. This can be done for the other analogous similar triangle pairs. Then, noting that RQ ≥ P1P + PP2, etc., we can write OA + OB + OC ≥ OA(P1P + PP2)/RQ + OB(Q1Q + QQ2)/PR + OC(R1R + RR2)/QP. Finally, substitute expressions and collect terms involving OP, OQ, OR to get the desired result.] So everything depends on establishing the similarity of the triangles PRP1 and OBR in the diagram above.

   The idea that opened the way for me was to notice that the angles RPP1 and ROB are both subtended by the segment BR. Thus, if a point could be found that was equidistant from B, R, O and P then we could draw a circle containing these four points which contained the chord BR subtending  inscribed angles RPP1 and ROB thus making them equal. Of course, it is not always true that given any four points such a circle can be found, but in this case it did work out. Let's see how to find the center of such a circle.
   Since the center must be equidistant from points R and O it lies on the perpendicular bisector of segment RO. Let D and E be the points where this perpendicular bisector of RO meets segments BO and RO respectively. Now look at triangle RDO with perpendicular bisector DE. Since OE = ER, ED is common, and the bisector is perpendicular to RO, the two triangles formed by the bisector are congruent (SAS). Thus RD = OD. Now we look at the angles. Angle ROB is the complement of angle RBO (triangle BOR has its right angle at R), and it is also equal to angle ORD (triangle RDO is isosceles), which is the complement of angle DRB (angle ORB is a right angle). Thus angle RBO = angle DRB so triangle DRB is isosceles and DR = DB. Thus, B, R, and O are equidistant from D.
   Now we can make the same argument with respect to O and P with perpendicular bisector D'F of OP meeting BO at D' and OP at F. From the analogous argument we get D' equidistant from B, P and O.
   Now we need to show that D=D'. But we know that D and D' are both on segment BO and that both are the midpoint of BO (DO=DR=DB from the first argument and D'O=D'P=D'B from the second) so D=D' and we will call this common point D the center of the circle of radius DB that includes points B, P, O, and R.
   Finally, since chord BR subtends both angles RPB and ROB in this circle, we know these angles are equal and so the triangles BOR and RPP1 are similar. QED

   The amazing thing is that this result holds for any point O inside any triangle (as described in the NOTE above). The complete exercise will be done in the next blog entry coming soon.


Thursday, October 15, 2015

C program for the Khan Academy Pixar Parabola


This program takes three input pairs, A,B,C [in the form “x,y” when prompted], B being the point from which segments are joined to the other two forming tangents to the parabola constructed by forming the points P(t) = (1-t)Q(t) + t R(t), where Q(t)=(1-t)A+tB and R(t)=(1-t)B + tC for t in [0,1]. The proof that this forms a parabolic arc is given in oriolescience.blogspot.com entry 16SEP15. The program outputs the parabola which contains A and C and to which lines AB and CB are tangent. The program follows the logic of the proof.

        To run the program just copy the program into a plain text file with extension .c, compile it using "gcc name.c" (on some systems the math library must be linked explicitly with "gcc name.c -lm"), and run "./a.out", which is the executable produced by gcc.


#include <stdio.h>
#include <math.h>
int main(void) {

char s[255],c;
long int i,j;
float x,y,z,a1,a2,b1,b2,c1,c2,d1,d2,mv,mu,mv1,mv2,mu1,mu2;
float A1,A2,B1,B2,C1,C2,D1,D2,ba1,ba2,ba,bc1,bc2,bc,xtemp,xp,aa,bb,cc,A,B,C,D,E,F,check;
float costheta,sintheta,theta;
int k,m,nr;
short n;
float xx,yy,zz;
/*Enter the three points*/
printf("Enter coordinates of Point A: x,y: ");
scanf("%f,%f", &a1,&a2);
printf("Enter coordinates of Point B x,y: ");
scanf("%f,%f", &b1,&b2);
printf("Enter coordinates of Point C: x,y: ");
scanf("%f,%f", &c1,&c2);

printf("A (%f,%f), B (%f,%f), C(%f,%f)\n",a1,a2,b1,b2,c1,c2);
/*Calculate midpoint of AC and determine unit vectors for positive u,v-axes*/
d1=(a1+c1)/2; d2=(a2+c2)/2; nr=-1;
if ((d1==b1) || (d2==b2))
{ /*Check for D with same 1st or 2nd coordinate as B*/
if (d1==b1) { /*D is directly above or below B*/
if(b2<d2) {nr=0;mu1=1;mu2=0;mv1=0;mv2=1;}
else {nr=180;mu1=-1;mu2=0; mv1=0; mv2=-1;};
}

else { /*D is directly to right or left of B*/
if (b1<d1) {nr=270; mu1=0;mu2=-1;mv1=1;mv2=0;}
else {nr=90; mu1=0;mu2=1;mv1=-1;mv2=0;};
}
}
else { /*There is a rotation which is not a multiple of 90*/
mv1=d1-b1; mv2=d2-b2; xtemp = sqrt( mv1*mv1 + mv2*mv2 );
mv1=mv1/xtemp; mv2=mv2/xtemp;
mu1=mv2; mu2=-mv1; /*make unit u = unit v cross unit k*/
}
/*get uv-coordinates for Point A and Point C as components along u, u dot BA, and v, v dot BA, etc*/

ba1=a1-b1;ba2=a2-b2;
A1= mu1*ba1 + mu2*ba2;
A2= mv1*ba1 + mv2*ba2;

bc1=c1-b1;bc2=c2-b2;
C1= mu1*bc1 + mu2*bc2;
C2= mv1*bc1 + mv2*bc2;

printf("unit u is (%f,%f)\n",mu1,mu2);
printf("unit v is (%f,%f)\n",mv1,mv2);
printf("Point A in {A,B,C} is (%f,%f)\n",A1,A2);
printf("Point C in {A,B,C} is (%f,%f)\n",C1,C2);
/*Now can write the coefficients of the parabola in standard form wrt uv-coordinates, i.e. in {A,B,C}
coordinate system as P(u) = au^2 + bu + c ,for u in A1 to -A1=C1 */
aa=(A2+C2)/(4*A1*A1);
bb=(A2-C2)/(2*A1);
cc=aa*A1*A1;
printf("P(u) =  v = %6.3f u^2 + %6.3f u + %6.3f\n",aa,bb,cc);

/*Check that points are on this parabolic arc*/
xx=A1; yy=A2;
check= aa*A1*A1 + bb*A1 + cc - A2;
printf(" At (%5.2f, %5.2f) check = %5.4f \n",A1,A2,check);
xx=C1;yy=C2;
check= aa*C1*C1 + bb*C1 + cc - C2;
printf(" At (%5.2f, %5.2f) check = %5.4f \n",C1,C2,check);


/*Find the Vertex, Focus and Directrix with respect to the coefficients of P(u)*/
xtemp = cc - (bb*bb)/(4*aa); xp= 1/(4*aa);
printf("Vertex (%6.3f,%6.3f)\n", -bb/(2*aa), xtemp);
xtemp = xtemp + xp;
printf("Focus (%6.3f,%6.3f)\n", -bb/(2*aa), xtemp );
xtemp = xtemp - 2*xp;
printf("The directrix is the line v = %6.3f\n", xtemp);

/* Now get the equation in original XY-coordinates by applying appropriate rotation and translation*/
/*Rotation takes unit u-vector to unit i vector of XY-system*/
/* Theory: For simplicity of notation, identify the uv-coordinate system with the X"Y"-coordinate system.
Let  Y"=aX"^2 + bX" + c be the equation in the {A,B,C} or X"Y"-coordinate system. This system is obtained
from the XY-coordinate system by first translating by B to the X'Y'-coordinate system, where X'=X-b1,
and Y'=Y-b2, which in turn is rotated by angle theta with costheta=mu1, sintheta=mu2, the coordinates
of the unit vector u of the X"Y"-coordinate system. That is, u=(mu1,mu2) is the positive unit vector in
the direction of the X"-axis. Then X" = X'costheta + Y'sintheta and Y" = -X'sintheta + Y'costheta. For
example the point with X'Y'-coordinates (1,0) will have X"Y"-coordinates (costheta, -sintheta). (Same point
has different "names" in the different coordinate systems.)
    In our case we are going in reverse, from the X"Y"-system back through the X'Y'-system to the XY-system.
Starting with Y"=aX"^2 + bX" + c, we substitute (-X'sintheta + Y'costheta) for Y" and (X'costheta + Y'sintheta)
for X" to get an equation in X' and Y'. Now substitue X-b1 for X' and Y-b2 for Y' to get an equation in X and Y.
Expanding and collecting terms we get the expression A X^2 + B Y^2 + C XY + D X + E Y + F = 0, as below*/

costheta = mu1;
sintheta = mu2;
printf("Rotation theta has cosine %f and sine %f\n", costheta,sintheta);
printf("Translate by B and Rotate by theta (Y) = %6.3f (X)^2 + %6.3f (X) + %6.3f \nto get \n",aa,bb,cc);
A = aa*costheta*costheta; B = aa*sintheta*sintheta; C = aa*2*sintheta*costheta;
D = sintheta + (bb*costheta) - (2*aa*b2*sintheta*costheta) - (2*aa*b1*costheta*costheta);
E = -costheta + (bb*sintheta) - (2*aa*b1*sintheta*costheta) - (2*aa*b2*sintheta*sintheta);
F = (aa*b1*b1*costheta*costheta) +(aa*b2*b2*sintheta*sintheta) + (2*aa*b1*b2*sintheta*costheta);
F = F - b1*bb*costheta - b1*sintheta - b2*bb*sintheta + b2*costheta + cc;
printf(" %6.3f X^2 + %6.3f Y^2 + %6.3f XY + %6.3f X + %6.3f Y + %6.3f = 0\n", A,B,C,D,E,F);

/*CHECK original points in this parabolic equation*/
xx=a1;yy=a2;
check = (A*xx*xx) + (B*yy*yy) + (C*xx*yy) + (D*xx) + (E*yy) + F;
printf(" At (%5.2f, %5.2f) check = %5.4f \n",xx,yy,check);
xx=c1;yy=c2;
check = (A*xx*xx) + (B*yy*yy) + (C*xx*yy) + (D*xx) + (E*yy) + F;
printf(" At (%5.2f, %5.2f) check = %5.4f \n",xx,yy,check);


return 0;
}


/* Sample Program Prompts, Input and Output
Enter coordinates of Point A: x,y: 3,9
Enter coordinates of Point B x,y: 2.5,6
Enter coordinates of Point C: x,y: 1,4
A (3.000000,9.000000), B (2.500000,6.000000), C(1.000000,4.000000)
unit u is (0.707107,0.707107)
unit v is (-0.707107,0.707107)
Point A in {A,B,C} is (2.474874,1.767767)
Point C in {A,B,C} is (-2.474874,-0.353553)
P(u) =  v =  0.058 u^2 +  0.429 u +  0.354
 At ( 2.47,  1.77) check = -0.0000 
 At (-2.47, -0.35) check = 0.0000 
Vertex (-3.712,-0.442)
Focus (-3.712, 3.889)
The directrix is the line v = -4.773
Rotation theta has cosine 0.707107 and sine 0.707107
Translate by B and Rotate by theta (Y) =  0.058 (X)^2 +  0.429 (X) +  0.354 
to get 
  0.029 X^2 +  0.029 Y^2 +  0.058 XY +  0.520 X + -0.895 Y +  2.338 = 0
 At ( 3.00,  9.00) check = 0.0000 
 At ( 1.00,  4.00) check = 0.0000 

*/

Tuesday, September 22, 2015

The Equilibrium Constant as a Ratio of Probabilities

     Given the balanced chemical equation of a reaction in equilibrium aU + bV <-> cY + dZ, we define as the forward reaction aU + bV -> cY + dZ with reactants U,V and products Y, Z and the reverse reaction as cY + dZ -> aU + bV with reactants Y, Z and products U, V. The coefficients a, b, c, d represent the relative amounts (in moles) of the substances U, V, Y, Z needed for the reaction. That is, a moles of U and b moles of V react to form c moles of Y and d moles of Z. [1 mole is a quantity (6.022 x 10^23) of things just like 1 dozen is a quantity (12) of things]
     By definition, equilibrium is when the rate of the forward reaction equals the rate of the reverse reaction. So when a reaction is in equilibrium, the concentrations of the substances (reactants and products) remain constant even as the reactions continue happening.
     If we add or remove some amount of one (or more) of the substances, the reaction will no longer be in equilibrium. The reactions will continue at different rates until equilibrium is again reached. The new equilibrium concentrations will generally differ from the old ones, but at the same temperature they satisfy the same equilibrium relationship derived below.
     For example, if we add substance U to the mixture the increase in the concentration of U will cause an increase in the rate of the forward reaction (more U-molecules reacting with the available V-molecules) tending to increase the product concentrations [Y] and [Z] while decreasing the reactant concentrations [U] and [V]. The product concentrations will increase until the reverse rate of reaction matches the forward rate. Note the forward rate will decrease as the concentrations of [U] and [V] decrease.
      A reaction depends on the right molecules interacting with each other with the right kinetic energy to cause the necessary reaction-causing collisions to effect the redistribution of electrons amongst the constituent molecules. The ambient temperature is a measure of the average kinetic energy of the molecules and the actual molecules have a kinetic energy distribution which we can take as a probability density function about this average. Thus some molecules have higher than average KE and some lower. In particular, the higher the temperature, the more high-energy collisions there are. At the same time, the concentrations of the constituent molecules are also directly related to the rate of reaction-causing collisions. The more molecules we have per volume, the more likely they will collide.
     A mathematical description of the discussion above follows. The rate of a reaction
aU + bV -> cY + dZ is governed by the probability of getting effective collisions between reacting molecules. For example, in the forward reaction we need to get a molecules of U together with b molecules of V in a small enough volume dW and with sufficient kinetic energy to make effective collisions. Assuming for now that the temperature T allows some number of such collisions (the number then being proportional to the temperature) the other factor is the concentrations [U] and [V] of the reactants. The probability of getting a U molecule in a volume dW is directly proportional to [U] and the same for V molecules. Thus the probability of getting the required number of molecules in dW for a reaction to occur is proportional to [U]^a [V]^b. Let P(T) be the probability of a reaction occurring at temperature T given the molecules are within the reaction volume dW. Then the overall probability of a reaction is proportional to the product [U]^a [V]^b x P(T). Since P(T) is fixed for the reaction we have that the probability of an effective collision is directly proportional to [U]^a [V]^b. Thus the rate of the forward reaction is directly proportional to [U]^a [V]^b. Similarly, the rate of the reverse reaction is directly proportional to [Y]^c [Z]^d. Since at equilibrium these rates are equal, we know that the ratio [U]^a [V]^b/[Y]^c [Z]^d must be equal to some constant for a given temperature T. This equilibrium constant is a property of the chemical reaction aU + bV <-> cY + dZ and can be used to predict outcomes of experiments.
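     The last step can be written compactly if we give names to the two proportionality constants. The symbols k_f and k_r below are labels introduced here for illustration; they do not appear in the discussion above. Note that the conventional equilibrium constant K puts the products in the numerator, so it is the reciprocal of the ratio written above. Either ratio is constant at a given temperature.

```latex
\begin{align*}
\text{rate}_{\text{forward}} &= k_f\,[U]^a[V]^b, \qquad
\text{rate}_{\text{reverse}} = k_r\,[Y]^c[Z]^d,\\
k_f\,[U]^a[V]^b &= k_r\,[Y]^c[Z]^d \quad\text{(at equilibrium)},\\
\frac{[U]^a[V]^b}{[Y]^c[Z]^d} &= \frac{k_r}{k_f}, \qquad
K = \frac{[Y]^c[Z]^d}{[U]^a[V]^b} = \frac{k_f}{k_r}.
\end{align*}
```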

Wednesday, September 16, 2015

The Khan Academy, Pixar and Parabolas

     The Khan Academy is a treasure which I recommend to everyone. Recently they added a short course called Pixar in a Box (link at bottom of entry) in which the first section describes how parabolic arcs are used to make the incredible fields of grass or fur on bears or hairs on heads, etc. It is a fun read and in about a half hour you will get the idea of how these things are done. However, I was left a bit dissatisfied about the explanation of why the construction used really defines a parabolic arc and so I worked out a proof that satisfied myself and perhaps might be of interest to others.
   
     The idea is that if you choose any three distinct points A, B, and C, and divide segments BA and BC into n equal subdivisions each, and connect the dividing points in reverse order, the envelope of the drawn segments will form a polygon and as n increases to infinity, the polygon will approach a parabolic arc. The Khan Academy lesson shows this in pictures which I recommend viewing.

  
     What I will do here is prove the general statement of this property.

    Statement to be Proved: Given any three distinct points A, B, C, define the sets of points 
Q(t) = (1-t)A + tB, R(t) = (1-t)B + tC, for t in [0,1]. The set of points 
P(t) = (1-t)Q(t) + tR(t) for t in [0,1] defines a parabolic arc.

Proof: We know that we have a parabolic arc if we can show that there is a point F and a line L such that for every point P in the set, the distance from P to F equals the distance from P to the line L. 

Define the two-dimensional rectilinear coordinate system {A,B,C} as the system with origin B, a “vertical” v-axis being the line through B and the midpoint M of segment AC where M is on the positive half of the axis, and a “horizontal” u-axis being the perpendicular to line BM through B such that the unit positive u-vector is the cross product of the unit positive v-vector and the unit vector k pointed upward from the uv-plane.

In {A,B,C}, let the points have coordinates A(a1, a2), C(c1, c2), B(0, 0), 
M( (a1+ c1)/2, (a2+ c2)/2 ) where by definition (a1+ c1)/2 = 0, so c1 = -a1 .

Then P(t) = (1-t)Q(t) + tR(t) = (1-t)[(1-t)A + tB] + t[(1-t)B + tC]
= t^2 (A - 2B + C) + 2t(B - A) + A, which we can write componentwise as the pairs:
P(t) = [ t^2 (a1 - 2b1 + c1) + 2t(b1 - a1) + a1 , t^2 (a2 - 2b2 + c2) + 2t(b2 - a2) + a2 ]
= [ a1(1-2t) , t^2 (a2 + c2) + a2(1-2t) ] for t in [0,1].
Let u = a1(1-2t) so t = (a1 - u)/(2 a1) and P(u) = [ u , a u^2 + b u + c ] for u in [a1, -a1] where
a = (a2 + c2)/(4 a1^2) , b = (a2 - c2)/(2 a1) , c = a a1^2 . This shows P(u) forms a parabolic arc (i.e. a quadratic in the {A,B,C} coordinate system).

In particular, we know that for a parabola v = a u^2 + b u + c ,
the vertex V is ( -b/(2a) , c - b^2/(4a) ), the focus F is V + (0, p), and the directrix L is the line v = v2 - p, where v2 = c - b^2/(4a) is the second coordinate of V and p = 1/(4a). Thus we have proved the statement.

Example: Find an equation for the parabolic arc through A(3,9) and C(2,4) as defined by P(t) above with B(5/2,6).
Let the original rectilinear coordinate system which defines A, B, and C for the problem be called the standard xy-coordinate system. In this system the midpoint of segment AC is M(5/2, 13/2). We define the {A,B,C} rectilinear coordinate system as that having origin B(0,0), “vertical” axis as the line through B and M, and the “horizontal” axis as the perpendicular to line BM through B. In this case the “vertical” axis is vertical with respect to the standard coordinates because B and M have the same first coordinate in that system. This tells us that there is no rotation involved in the change of coordinates. Therefore the change of coordinate systems involves only a translation sending point (x,y) in the standard system to the point (x-5/2, y-6) in the new system. 
Thus A(3,9) becomes A(1/2, 3), B(5/2,6) becomes B(0,0), and C(2,4) becomes C(-1/2,-2). Using these translated coordinates, we can write, using the formulas derived in the proof above, P(t) = [(1/2 - t), t^2 - 6t + 3], for t in [0,1] so letting u = (1/2 - t), we get P(u) = [ u , u^2 + 5u + 1/4], for u in [1/2, -1/2]. 

Thus p = 1/4, V(-5/2,-6), F(-5/2, -6 + 1/4), L is v = -6 - 1/4. These values can be translated back to the standard coordinate system by adding (5/2,6) to get the values for the desired focus and directrix given the original points as F(0, 1/4) and y = -1/4 as well as the vertex (0,0) which of course correspond to the parabola y = x^2.

More work would be involved in an example where the line through B and M was rotated with respect to the standard y-axis but the theory is the same and of course the proof does not depend on the particulars of the orientation of the {A,B,C} system.






https://www.khanacademy.org/partner-content/pixar/environment-modeling-2/mathematics-of-parabolas2-ver2/a/parabolas-lesson-brief

Friday, July 17, 2015

Three Useful Theorems from Vector Calculus

In a second year Calculus class, one is usually introduced to vector fields which are so important in Physics. The theorems of Green, Gauss and Stokes are the classic trio which tie together line and surface integrals with ordinary double and triple integrals involving such fields. An illuminating application of Stokes' Theorem is found in my earlier blog on surface tension.

The Theorems of Green, Gauss, and Stokes: A Study Guide from First Principles is a paper I wrote in 2014 to review the mathematical tools necessary to establish these theorems. As such, the paper might be used as a handy study guide for reviewing many of the key aspects of basic Calculus.


If link does not work, copy this address into your browser:
https://drive.google.com/file/d/0B9grE0MhdrFLTklxQzRvdlMtazQ/view?usp=sharing

Conics and Orbits

In a first look at the motion of an object under the influence of gravity, one often makes the simplifying assumption that while an object is in motion near the surface of the earth the force of gravity is constant. Under this assumption, the equations for acceleration, velocity and distance are simple polynomials in time which are easily solved with high school algebra. In particular, the resulting equation of motion is a quadratic equation in time meaning that the path of the object's motion is a parabola. How close is this parabolic path to the actual (gravity varying with distance from the center of the earth) path?

Conics and Orbits is a paper that I worked on in 2013 to address this question. It first takes a Geometric look at Conics in which the basic properties of Conics are developed directly from their definitions as sets of points with a fixed ratio of distances to a fixed point and a fixed line. It then examines the fundamentals of Orbits to show that their paths are conics. At the end it gives a quantitative answer to how good the approximation of a parabola is to the true path (very good for moderate initial velocities)!

In case link does not work, copy this address in browser:
https://drive.google.com/file/d/0B9grE0MhdrFLRHYwV29FWDNzejA/view?usp=sharing



Wednesday, June 3, 2015

Vector Calculus - Feynman's treatment

         This is just a shout out for the treatment of "Differential Calculus of Vector Fields" and "Vector Integral Calculus" given in Chapters 2 and 3 of Richard Feynman's "Lectures on Physics", Vol II.
         In my opinion, when these topics are presented by math professors they are often separated from their real-world origins and applications. Separated like so, the defined entities have no physical life so to speak and just exist as analytical tools (however useful they might be). It always bothered me that this topic, usually encountered near the end of the Calculus sequence, never came to life the way that most of the rest of Calculus did. Of course, what I did not realize, was that that life was just over the Physics horizon (Electromagnetism, Fluid Dynamics, etc) when I turned to study more abstract mathematics.
         The theorems are quite beautiful in that they establish properties which hold for general surfaces and curves. That is, it might be pretty easy to analytically establish a result for a given surface (like a section of a sphere) where known properties of that surface can be used. But to establish a result for any surface is a pretty nice analytical accomplishment. On the other hand, if there is a physical interpretation involved, picturing the situation might bring the abstract result to life and make it more memorable!!
     In any case, I am pointing the work out for anyone who might be interested and for anyone who might like to be reminded, here are the four summary results of the two chapters.

1) The operators ∂/∂x, ∂/∂y, ∂/∂z can be considered as the components of a vector operator ∇, and the formulas that result from vector algebra by treating this operator as a vector are correct.

2) The difference of the values of a scalar field η at two points equals the line integral of the tangential component of the gradient of that scalar along any curve between the first and second point.
   η(2) - η(1) = line integral of ∇η ・ ds along any curve connecting (1) to (2)  , ∫  ∇η ・ ds, where ds is the differential segment in the direction from (1) to (2), that is ds  = t ds where t is the unit tangent vector along the curve. So η(2) - η(1) = ∫  ∇η ・ t ds.

3) The surface integral of the normal component of any vector field C over a closed surface S equals the integral of the divergence of the vector field over the interior volume V. That is
∫ C ・ n da = ∫ ∇・C dV where n is the unit outward normal, the left integral is taken over the closed surface S and the right integral is taken over the volume V.

4) The line integral of the tangential component of any vector field C around a closed loop Γ equals the surface integral of the normal component of the curl of that vector field over any surface S bounded by Γ. That is ∫ C ・ ds = ∫ (∇xC) ・ n da, where the left integral is taken over Γ and the right integral is taken over the surface S.

   And that's what vector calculus is all about.
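   For readers who prefer typeset notation, results 2) through 4) can be written as follows. This is only a restatement of the summary above in standard symbols; the circled integral signs for the closed surface and closed loop are a notational choice made here.

```latex
\begin{align*}
\eta(2) - \eta(1) &= \int_{(1)}^{(2)} \nabla\eta \cdot d\mathbf{s}, \\
\oint_{S} \mathbf{C}\cdot\mathbf{n}\, da &= \int_{V} \nabla\cdot\mathbf{C}\, dV, \\
\oint_{\Gamma} \mathbf{C}\cdot d\mathbf{s} &= \int_{S} (\nabla\times\mathbf{C})\cdot\mathbf{n}\, da.
\end{align*}
```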


Wednesday, May 27, 2015

Surface Tension and Capillary Action



What is capillary action?

Capillary action, the result of a pressure differential at the surface of a liquid in a slender tube, is the rise or fall of the liquid in the tube above or below the original liquid line of the container into which the tube is inserted.

Preliminaries:
  1. Density = Mass/Volume; Force = Mass*Acceleration (N); Pressure = Force/Area (Pa);
    Work = Force*distance (J)
  2. Newton (N) = kg*m/sec^2; Pascal (Pa) = N/m^2; 1 atm = 101.3 kPa = 760 mmHg = 760 torr; 1 liter (L) = 0.001 m^3 = 1 (dm)^3 = 1000 cc; Joule (J) = N*m; the joule is also the unit of Energy (the capacity to do work)
  3. In general, if a liquid has density D and the pressure at the surface is P, the pressure at depth H is P + gDH where g is the acceleration of gravity. Let’s see why. If we consider a cylinder of the liquid of height H, radius R and top at the surface then the mass of the column is M = D*(π*R^2*H) and the downward force due to this mass is Mg = D*(π*R^2*H)*g. Therefore, the additional pressure at the bottom of the cylinder due to this column of liquid is the force per area or g*D*(π*R^2*H)/(π*R^2) = gDH. Thus the total pressure at depth H is P + gDH. For example, water has density 1.0 kg/L = 1000 kg/m^3, so the “deep water” additional pressure at depth H is gDH, where g is 9.8 m/sec^2 and H is measured in meters. That is, 9.8 m/sec^2 * 1000 kg/m^3 * H m = 9800*H Pa = 9.8*H kPa. So at 10 m below the surface of a pool the pressure is nearly twice the standard atmospheric pressure (1 atm). If a tube is inserted 5 cm into a container of water, and the container is in a room with 101.3 kPa pressure, the total pressure at the base of the tube is 101.3 + 9.8(.05) kPa = 101.8 kPa.
  4. When we look at a pressure difference on either side of an interface between two fluids we measure it in the direction opposite to the outward normal (the normal pointing away from the center of curvature). That is, we measure it in the direction towards the center of curvature. In this way we will always have ∆P > 0.


Given a sufficiently thin tube, whether the liquid in the tube rises above or falls below the original liquid line of the container depends on the relative strengths of two forces: cohesion between the molecules of the liquid and adhesion of the liquid molecules to the walls of the tube. For example,  water in a glass tube is a case in which adhesion (water to glass) is stronger than cohesion (water molecules bonded by hydrogen bonds), whereas liquid mercury is a case where cohesion is stronger than adhesion. We will get back to this after explaining surface tension.

Surface tension is a phenomenon that takes place at the interface of two substances. 
For our discussion we will assume an interface between a liquid and a vapor although the arguments can apply to any two immiscible liquids.
There are several ways to explain where surface tension comes from and what it is. Because of this there can be confusion about how surface tension acts. In fact, the way it acts can be described in different ways. There is a very good explanation of surface tension at: doc.utwente.nl/79082/1/why_is_surface.pdf

In particular, this paper fills in the gap that most explanations leave as to why the surface tension acts parallel to the surface of the interface.

The difficulty arises because most descriptions of surface tension point to one phenomenon as its origin: the difference in the number of bonds that exist for bulk molecules versus interface molecules due to the asymmetry at the surface. The difference in molecular density on the two sides of the interface, and the resulting reduction in bonds between surface molecules and molecules above the surface, leads to increased free energy in the interface molecules, which means that the system tends to minimize the number of such higher energy surface molecules. That is, the system tends to minimize its surface area. Surface tension is then defined as the work needed per unit of new surface area created. The units of work (energy) per unit area are equivalent to force per unit length, so this is also used to define surface tension. But from this description it is not obvious why the force that arises from surface tension is parallel to the surface.

In the paper cited above, the direction of the force that arises from surface tension is explained in terms of two forces acting on the molecules near the surface (within a few molecular diameters): not only the attractions used to explain the difference in free energy in the surface molecules, but also an isotropic close-range repulsive force that acts as a counterbalance to the attractive forces. Although the argument is best described in the detail of the paper cited, we will summarize it here.

The repulsive force is a very close-range force that arises only when molecules are near enough to each other that their electron clouds repel each other (according to the Pauli exclusion principle which says that no two electrons can occupy the same quantum state, that is, have the same four electronic quantum numbers, in the same atom). This repulsion is isotropic - the same in all directions.
The attractive forces between molecules are relatively longer range and are anisotropic - not necessarily the same in all directions. In particular, at the surface, where there is an imbalance in molecules above and below the surface, the net attraction on a surface molecule will be inward towards the liquid.

A thought experiment can explain the interaction and the resultant net force from these forces. Consider imaginary surfaces of length w, at depths d1, just below the surface, and d2 a few molecules deeper. Now consider the two subsystems, ad1, the liquid above d1, and bd1, the liquid below d1. Similarly we can consider the subsystems ad2 and bd2, the bodies of liquid above and below d2 respectively. Now consider the overall attraction and repulsion of ad1 on bd1 and ad2 on bd2 across the width w. Since our system is in equilibrium we must have attraction equal to repulsion at each depth. But the attraction of ad1 is certainly less than the attraction of ad2 because there are fewer attracting molecules in ad1. Since attraction equals repulsion at each level we must have that the repulsion is also less at d1 than at d2. Let a1, r1, a2, r2 be the attraction and repulsion forces of subsystems ad1 on bd1 and ad2 on bd2 respectively. We have a1=r1 < a2=r2. Note the overall forces are proportional to w. As we go deeper into the liquid we soon arrive at the state where all forces are perfectly balanced as they are in the bulk of the liquid.

Next we consider two subsystems called left-hand-side (lhs) and right-hand-side (rhs) where the liquid is divided into two parts by a plane perpendicular to the surface (interface) and again of width w. Now let us consider the attraction and repulsion of lhs on rhs. Since repulsion is isotropic, the repulsive force is still r1 at d1 and r2 at d2 in this horizontal direction. But unlike the vertical case, the horizontal attractive forces are balanced and the same at every depth. (Note: since the horizontal forces are balanced by symmetry for both repulsion and attraction at every depth (net zero), there is no requirement that attraction and repulsion balance each other, as there was in the vertical case.) Thus at d1 we have a net attractive force which has diminished at d2 and will go to zero in the bulk. This is the key result which is missing from most discussions of surface tension and I am indebted to the team in the cited paper for this insight.
Taking the thought experiment a few steps further, we see that the line drawn on the surface to establish the lhs/rhs subsystems was arbitrary, so any point P on the surface is attracted in all directions. Therefore, for interior points on the surface the net attraction is zero. However, at the boundary of the surface there is a net attraction parallel (tangent) to the surface and normal to the boundary curve (the non-normal components cancel and the normal components sum to the net). Again the overall attraction is proportional to the length of the boundary curve. Finally, if the surface is curved we make the same argument in the limit by considering the tangent plane at P as the approximation to the case discussed.
So we have the following result: surface tension is a force per unit length along the boundary of the interface which acts in a direction tangent to the surface and normal to the boundary. Thus, if we think of the interface as a sheet, the surface tension acts as a pulling on the sheet around the edges to make the sheet taut (i.e. introduces tension in the sheet). If an object is placed on the sheet the sheet can hold it as long as the upward force (vertical component of surface tension of sheet times length of edge) is greater than or equal to the weight of the object.

The surface tension acts parallel to the surface. Thus, where the surface is flat the surface tension is all horizontal and where the surface is curved the surface tension acts along the tangent to the curve so it has both a horizontal and a vertical component.

Example of surface tension at work: When the circular pod of radius r of a water-walking insect rests on the water, it presses down a bit on the surface creating a tangential angle z around the circumference where the water curves back to the surface. Since all other surface tension components are horizontal and mutually cancelling, the vertical component, T sin z, where T is the surface tension of the water, is the net force which is applied around the circumference for a total net force upwards of (T*sin z)*(2πr) = (vertical component of surface tension) * (distance around circumference). If this force is at least equal to the weight w of the insect then the insect will not break through the water surface. That is, we need the force (2πrTsin z) ≥ w for the insect to rest atop the water....not floating, but held atop by the surface tension of the water.
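The support condition 2πrT sin z ≥ w from the insect example can be tried with numbers. The sketch below is illustrative only: the function name `supporting_force` is invented here, and the surface tension of water (about 0.072 N/m near room temperature), the pad radius, the angle and the insect mass are all assumed values, not measurements from the text.

```python
import math

def supporting_force(tension, pad_radius, angle_deg):
    """Upward force (T*sin z)*(2*pi*r) on a circular pad of radius r."""
    return tension * math.sin(math.radians(angle_deg)) * 2.0 * math.pi * pad_radius

T_WATER = 0.072       # N/m, assumed approximate value for water
PAD_RADIUS = 1.0e-3   # m, hypothetical pad radius
ANGLE_Z = 30.0        # degrees, hypothetical tangential angle z

force = supporting_force(T_WATER, PAD_RADIUS, ANGLE_Z)
weight = 1.0e-5 * 9.8  # N, hypothetical 10 mg insect resting on one pad

# The insect is held atop the water when force >= weight
print(force, weight, force >= weight)
```

With these assumed numbers the available force is roughly twice the weight, so the surface holds.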

Another example of surface tension at work: A small quantity of water wants to be a sphere because that is the shape with the minimal surface area for a given volume (in this case the volume of the drop of water). That is, any deformation of the sphere would increase the surface area thus increasing the number of molecules in a higher energy state which doesn’t happen any more than a marble resting anywhere but at the bottom of a U-shaped trough as systems always try to go to the least energy state. [Note: When the volume increases gravity comes more into play in shaping the water, for example, the teardrop shape or such a blob.]

Another example of surface tension at work: The liquid/air interface surface in a tube is called the meniscus. In a sufficiently narrow tube the meniscus will be curved either in a convex or concave shape as seen from above. The surface tension in the liquid will try to minimize the surface area of the meniscus thereby shaping it into a portion of a sphere (that is a shape of constant curvature) just as in the last example.
Suppose we had a meniscus of radius of curvature R in a tube of radius r. Suppose that z is the angle the (tangent to the) curve makes with the downward tube at the air/liquid interface. Then we can show that cos z = ± r/R depending on z being less than or greater than 90º. To see this, let Q be a point on the meniscus/tube intersection and let O be the center of the sphere of which the meniscus is a part. Then the length of OQ is R. Let B be the intersection of a vertical line through O and the perpendicular to this line through Q. Let L be the vertical line through Q (the line of the tube). Let E be the intersection of the tangent line and the vertical line through O. We know that the tangent line is perpendicular to the radial line through Q. Then the length of BQ is r and triangle ∆OBQ is a right triangle with hypotenuse R. First consider the case z < 90º. Angle z is the complement of angle BQE which in turn is the complement of angle BQO so angle BQO = z and cos z = r/R. Now consider the case z > 90º. Let x be the angle between the radial line through Q and the line L. Then x = z-90º and x equals angle BOQ, since these are alternate angles formed by a transversal (the radial line) through the parallel lines L and the vertical through O. Thus sin x = r/R = sin (z-90º) = -sin(90º-z) = -cos z. So cos z = -r/R.
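The relation cos z = ± r/R can be turned into a small helper that recovers the angle z from the tube radius and the meniscus radius of curvature. This is a sketch under the assumptions of the text (a spherical meniscus); the function name `contact_angle_deg` and its `concave` flag are invented for illustration.

```python
import math

def contact_angle_deg(tube_radius, curvature_radius, concave=True):
    """Angle z from cos z = +r/R (concave, z < 90) or -r/R (convex, z > 90)."""
    ratio = tube_radius / curvature_radius
    if not 0.0 < ratio <= 1.0:
        # A spherical cap spanning the tube needs R >= r > 0
        raise ValueError("need 0 < r <= R")
    return math.degrees(math.acos(ratio if concave else -ratio))

# r/R = 1/2 gives z = 60 degrees (concave) or 120 degrees (convex)
print(contact_angle_deg(1.0, 2.0, concave=True))
print(contact_angle_deg(1.0, 2.0, concave=False))
```

The two angles for the same r and R are supplementary, which matches the step γcos(π-z) = -γcos z used later for the convex case.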


Cohesion and adhesion in a capillary system.

Case 1: Let’s first look at the case of water in the narrow tube (sufficiently small radius so adhesion/cohesion dominate gravitational effects on the water surface). When we submerge one end of the tube into the water container to a depth H, we notice that the water in the tube rises above the container water level by a height h, and that when we look down the tube from above the surface of the water in the tube is concave. Using a photograph we could measure the angle z that the tangent to the curve makes with the vertical tube at the air/water interface (z < 90º). That the water surface is higher around the tube than at the center makes sense when adhesion is stronger than cohesion: molecules are attracted more to the outside, towards the tube and away from the center, so with less density near the center there is a pressure differential, ∆P, at the center between the air pressure above the surface, p1, and the pressure just below the surface, p2. Here ∆P = p1-p2, following the sign convention for pressure differences in the preliminaries. This difference in pressure pushes the water down while the adhesion acts to hold the water to the walls, and together they cause the concave water surface as seen from the top. This differs from the situation of a horizontal interface, where the pressures above and just below the surface are the same (∆P=0) and the pressure in the water increases only at greater depths due to the weight of a column of water, as discussed in the preliminaries on pressure at depth.

  Suppose p1 is the ambient air pressure in the room. Then the pressure on the container surface water is p1, as is the pressure at the top of the water in the tube. Suppose we observe the water in the tube at height h above the container level, and let d be the density of the water. As in the preliminaries on pressure at depth, the pressure at the bottom of the tube is p1 + gdH, pushing up on the water in the tube. But since the pressure, p2, in the water just below the meniscus is less than p1, the stronger pressure p1 + gdH at the bottom of the tube pushes the column of water upwards until it reaches a height h above the original water level, where the downward pressure of the water in the tube, p2 + dgh + dgH, matches the upward pressure. The increase of downward pressure in the tube comes from the newly created h-high “capillary-effect” column of water in the tube. Thus, equilibrium is reached when p1 + gdH = p2 + dgh + dgH, or p1-p2 = ∆P = dgh, where ∆P is the change in pressure at the water/air interface in the tube in the direction towards the center of curvature of the surface.

Case 2: Now suppose cohesion dominates, so cohesion is greater than adhesion. Let’s use mercury as such a liquid. Molecules in the mercury are pulled together more than they are attracted to the tube wall so the density increases and the pressure in the mercury, p2, is greater than the atmospheric pressure, p1, at the interface. We note that the surface of the mercury is now lower around the tube than it is in the middle as the difference in pressure pushes up on the middle against the more limited adhesive attraction to the wall. Again, we could measure the angle z that the tangent to the surface curve makes with the vertical tube at the air/mercury interface (z > 90º). As in case 1, let H be the depth of the tube in the container of mercury, d the density of mercury and h the height needed for equilibrium. Just as before we must have p1 + gdH = p2 + gdH + gdh, or ∆P = p2-p1 = -gdh. But now, since ∆P is positive, we must have h < 0. This means that the mercury column in the tube must go |h| units below the original mercury level in the container.

The analysis so far has been based on pressure and from this we derived the height h (positive or negative) that a liquid attains in a tube relative to the original liquid level in a container in terms of ∆P. We saw that h was positive when adhesion dominated cohesion and that in this case the meniscus was concave so the contact angle z satisfied z < 90º. We also saw that h was negative when cohesion dominated adhesion and that in this case the meniscus was convex so the contact angle z satisfied z > 90º. There is more to the story.

What is ∆P?  It turns out that we can actually compute ∆P as a function of the surface tension of the liquid and the shape of the interface. This is the Young-Laplace equation which says ∆P = γ (∇・n) where n is the outward unit normal (away from center of curvature), γ is the surface tension, and ∆P is the pressure difference in the direction towards the center of curvature (so ∆P ≥ 0). In the case of a spherical shape, this reduces to ∆P = 2γ/R where R is the radius of curvature. This result is explained in detail with some examples in another blog article called Young-Laplace Equation. Using this result we can write (for a spherical interface)
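∆P = dg|h| = 2γ/R, where |h| is the size of the rise or fall (∆P = dgh in case 1 and ∆P = -dgh with h < 0 in case 2).
Now, using the example we worked out at the end of the section on surface tension, we know that cos z = ±r/R where r is the radius of the tube and z is the angle between the curve and the downward vertical. Thus we get


∆P = dg|h| = 2γ/R = ± (2γcos z)/r where + is for z < 90° and - for z > 90°.

This result, 

γ = ± dg|h|r/(2cos z), or equivalently γ = dghr/(2cos z) with the signed height h in both cases,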

allows you to determine the surface tension from experimental measurements. 
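As a rough illustration of using this relation with measurements, the sketch below works with the signed height h (positive for a rise, negative for a depression). With that convention the two sign cases collapse into the single expression γ = dghr/(2cos z), because cos z and h change sign together. The function names and the sample values (water at about 0.072 N/m with an assumed contact angle of 20°, mercury at about 0.485 N/m with an assumed contact angle of 140°, a 0.5 mm tube radius) are assumptions for illustration, not data from the text.

```python
import math

G = 9.8  # acceleration of gravity, m/s^2

def surface_tension(density, height, tube_radius, contact_angle_deg, g=G):
    """gamma = d*g*h*r / (2*cos z), with h signed (negative if depressed)."""
    return density * g * height * tube_radius / (
        2.0 * math.cos(math.radians(contact_angle_deg)))

def capillary_height(gamma, density, tube_radius, contact_angle_deg, g=G):
    """Signed height h = 2*gamma*cos z / (d*g*r)."""
    return 2.0 * gamma * math.cos(math.radians(contact_angle_deg)) / (
        density * g * tube_radius)

r = 0.5e-3  # hypothetical tube radius, m
h_water = capillary_height(0.072, 1000.0, r, 20.0)      # rises, h > 0
h_mercury = capillary_height(0.485, 13600.0, r, 140.0)  # falls, h < 0
print(h_water, h_mercury)

# Recover the surface tension from the "measured" height and angle
print(surface_tension(1000.0, h_water, r, 20.0))
```

Feeding the computed height back into `surface_tension` returns the surface tension that was put in, which is all the experiment relies on.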


Another approach for the special case of a spherical interface.
Finally, we can view the equilibrium condition as one in which the vertical component of surface tension at the interface is counterbalancing the weight (gain/loss ) of the liquid in the tube compared with the container level weight. 

In case 1, where z < 90°, the vertical component of the surface tension is γcos z. We see this by noting that the surface tension has magnitude γ in the direction tangent to the curve and this tangent makes the angle z with the downward vertical. So resolving this into vertical and horizontal components, the horizontals cancel around the boundary and the verticals give us a total force of 2πrγcos z upward. As before, the increased weight of the water being held up by the surface tension is πr²hdg. Thus, at equilibrium we get πr²hdg = 2πrγcos z or
rhdg = 2γcos z, as before.

In case 2, where z > 90°, the vertical component of the surface tension is 
γcos(π-z) = -γcos z, so the net vertical force due to surface tension has magnitude -2πrγcos z and acts downward. It is balanced by the weight πr²|h|dg of the column of liquid missing from the tube (recall h < 0 here). So again, 

r|h|dg = -2γcos z

Thus,

γ = ± dg|h|r/(2cos z)  where + is for z < 90° and - for z > 90°, like before. With the signed height h this is γ = dghr/(2cos z) in both cases.

This second derivation did not rely on the Young-Laplace equation. Young-Laplace provides a much more general framework, but the spherical case can be handled with this less sophisticated tool.
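The force-balance view can also be checked numerically: at the equilibrium height the signed column weight πr²hdg should equal the signed vertical surface-tension force 2πrγcos z. The sketch below assumes "positive means upward" for the force and a signed h; the function names and sample values are hypothetical.

```python
import math

G = 9.8  # acceleration of gravity, m/s^2

def column_weight(tube_radius, height, density, g=G):
    """Signed weight pi*r^2*h*d*g of the liquid column (negative if h < 0)."""
    return math.pi * tube_radius ** 2 * height * density * g

def tension_vertical_force(tube_radius, gamma, contact_angle_deg):
    """Signed vertical force 2*pi*r*gamma*cos z (positive means upward)."""
    return 2.0 * math.pi * tube_radius * gamma * math.cos(
        math.radians(contact_angle_deg))

def equilibrium_height(tube_radius, gamma, density, contact_angle_deg, g=G):
    """Signed h at which the column weight equals the vertical tension force."""
    force = tension_vertical_force(tube_radius, gamma, contact_angle_deg)
    return force / (math.pi * tube_radius ** 2 * density * g)

# Assumed sample values: (gamma N/m, density kg/m^3, angle z in degrees)
for gamma, density, z in [(0.072, 1000.0, 20.0), (0.485, 13600.0, 140.0)]:
    h = equilibrium_height(0.5e-3, gamma, density, z)
    print(h, column_weight(0.5e-3, h, density),
          tension_vertical_force(0.5e-3, gamma, z))
```

The height that comes out is the same 2γcos z/(dgr) obtained from the pressure argument, which is the point of the second derivation.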



Sunday, May 24, 2015

Young-Laplace Equation

Young-Laplace Equation:  To show that ∆P=γ(∇・n) where n=unit normal to surface, γ is the surface tension of the liquid, ∆P=pressure difference Pl - Pv, in moving across a vapor-liquid interface from a vapor with pressure Pv to a liquid with pressure Pl. In the case of a spherical surface of radius R as the interface, show ∆P=2γ/R.

Preliminaries:

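  1. The gradient operator, $\nabla = (\partial/\partial x_1, \partial/\partial x_2, \partial/\partial x_3)$, can act on a scalar function producing a vector, or on a vector function producing a tensor. If $f(x_1,x_2,x_3)$ is a scalar function and $g(x_1,x_2,x_3)$ is a vector function, $g=(g_1,g_2,g_3)$ where the $g_i$ are scalar functions, then $\nabla f=(\partial f/\partial x_1, \partial f/\partial x_2, \partial f/\partial x_3)$, a vector, and $\nabla g= (\nabla g_1,\nabla g_2,\nabla g_3)$, a tensor. It is sometimes convenient to think of $\nabla g$ as a 3x3 matrix with each gradient as a column vector. Also recall that $(\nabla \cdot g)$ is called the divergence of g,
    $\operatorname{div} g = \partial g_1/\partial x_1 + \partial g_2/\partial x_2 + \partial g_3/\partial x_3$, a scalar, and $(\nabla \times g)$ is called the curl of g,
    $\operatorname{curl} g = (\partial g_3/\partial x_2 - \partial g_2/\partial x_3,\ \partial g_1/\partial x_3 - \partial g_3/\partial x_1,\ \partial g_2/\partial x_1 - \partial g_1/\partial x_2)$, a vector.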
  2. The dot product, $a \cdot b$, of two vectors a and b is the scalar $\sum_{i=1}^{3} a_i b_i$. As a matrix multiplication this is (1x3) times (3x1). For a constant vector a, $(a \cdot \nabla)$ is the differential operator $a_1\,\partial/\partial x_1 + a_2\,\partial/\partial x_2 + a_3\,\partial/\partial x_3$, and is not the same as the scalar value $(\nabla \cdot a)$, or div a. Also, $(a \cdot \nabla)f = a \cdot \nabla f$. The dot product of a vector a and a tensor (b,c,d), $a \cdot (b,c,d)$, is the vector $(a \cdot b, a \cdot c, a \cdot d)$. As a matrix multiplication this is (1x3) times (3x3) yielding (1x3), where b, c, d are the columns of the matrix. The dot product of a tensor (b,c,d) and a vector e, $(b,c,d) \cdot e$, is the vector $(b \cdot e, c \cdot e, d \cdot e)$. As a matrix multiplication this is (3x3) times (3x1) yielding (3x1), where b, c, d are now the rows of the matrix. Because matrix multiplication is associative, for any 3x3 tensor T we have $a \cdot (T \cdot e) = (a \cdot T) \cdot e$, so we can write $a \cdot T \cdot e$ to mean one or the other.
    In particular, for a vector function $f = (f_1,f_2,f_3)$ with tensor $\nabla f$, we have $(a \cdot \nabla)f = a \cdot \nabla f = a \cdot \nabla(f_1,f_2,f_3) = (a \cdot \nabla f_1,\ a \cdot \nabla f_2,\ a \cdot \nabla f_3)$, and $a \cdot \nabla f \cdot n = (a \cdot \nabla f) \cdot n = a \cdot (\nabla f \cdot n) = \sum_{i,j=1}^{3} a_i\,(\partial f_j/\partial x_i)\,n_j$.
  3. The total force, fp, due to constant pressure P on a surface S is the product PA where A is the surface area and where the force acts normal to the surface at all points. Taking ∆S as an approximation of a surface area element by an element of the tangent plane at some point on that surface area element and the force on this element as P∆Sn where n is the unit normal to this tangent plane, then fp = ∫S PndS, where ∫indicates a surface integral. Recall, if the surface is parameterized as r(u,v)=(x(u,v),y(u,v),z(u,v)) over the region Ruv, then n = ru x rv/| ru x rv |.
  4. Similarly, the total force, ft, due to surface tension γ along a closed curve C which forms the boundary of the surface S is the product γL where L is the length of C and where the force acts normal to C and parallel (tangent) to the surface at all points. The direction of ft is given by t x n where t is the unit tangent to C and n is the unit normal to S (thus their cross product is normal to C and parallel to S). We can compute ft as a line integral, ft = γ ∫C t x n dr, where the total force due to the surface tension is represented as a limiting sum of the approximation forces (γ ∆r t x n) taken along C.
  5. Stokes’ Theorem states that  ∫C F・dr = ∫C F・t dr = ∫S (∇  x F)・ndS.
  6. In general, curl(a x b) = ∇ x (a x b) = (b・∇)a - (a・∇)b + a (∇・b) - b (∇・a). If a is a vector function and b is a constant (vector function), then ∇ x (a x b) = (b・∇)a - b (∇・a), since any differential operation on a constant vector function leaves the zero vector.
  7. For any vectors a, b, c, the triple product a・(b x c) = b・(c x a) = c・(a x b) = - a・(c x b), where a・(b x c) = determinant of the matrix with rows a, b, c.
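  8. a. For scalar function f and vector function v we have
    $\nabla(fv) \cdot v = (\partial(fv)/\partial x_1,\ \partial(fv)/\partial x_2,\ \partial(fv)/\partial x_3) \cdot v$
    $= (\partial(fv)/\partial x_1 \cdot v,\ \partial(fv)/\partial x_2 \cdot v,\ \partial(fv)/\partial x_3 \cdot v)$
    $= (v_1[v_1\,\partial f/\partial x_1 + f\,\partial v_1/\partial x_1] + v_2[v_2\,\partial f/\partial x_1 + f\,\partial v_2/\partial x_1] + v_3[v_3\,\partial f/\partial x_1 + f\,\partial v_3/\partial x_1],\ \ldots,\ \ldots)$. So by symmetry we can write,
    $= ((v \cdot v)\,\partial f/\partial x_1 + f\,\partial v/\partial x_1 \cdot v,\ (v \cdot v)\,\partial f/\partial x_2 + f\,\partial v/\partial x_2 \cdot v,\ (v \cdot v)\,\partial f/\partial x_3 + f\,\partial v/\partial x_3 \cdot v)$
    $= (v \cdot v)\nabla f + f(\nabla v \cdot v)$.
    b. In particular, if v is a unit vector, $v \cdot v = 1$, so the first term reduces to $\nabla f$ and
    $\nabla v \cdot v = (\partial v/\partial x_1 \cdot v,\ \partial v/\partial x_2 \cdot v,\ \partial v/\partial x_3 \cdot v)$ where
    $2\,\partial v/\partial x_i \cdot v = \partial(v \cdot v)/\partial x_i = 0$. So when v is a unit vector, $\nabla(fv) \cdot v = \nabla f$.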
  9. Since for scalar function f and vector function v, div(fv) = ∇f・v + f div v, when ∇f is perpendicular to v, div(fv) = f div v.
  10. The surface integral ∫S f(x,y,z) dσ is a limiting sum of products f(xi,yi,zi) dσi taken over area elements dσi of the surface S where (xi,yi,zi) is a point on dσi. When we have a vector function F instead of the scalar function f we define ∫S F・dσ = ∫S F・n dσ where n is a unit normal to the surface. In this paper we will also consider the case where we are evaluating an integral of the form ∫S F dσ (a sum of vectors) which we define as ∫S F dσ = (∫S F1 dσ, ∫S F2 dσ, ∫S F3 dσ). In particular we will be integrating to get the total force obtained by the limiting sum of products (pressure times area) or (surface tension times length) over elements of surface area on a surface or arc length along a curve. See (3) and (4) above.
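Two of the algebraic facts in these preliminaries, the triple product identity in (7) and the associativity used in (2), are easy to sanity-check on concrete vectors. The sketch below uses plain Python tuples with integer entries so the comparisons are exact; the helper names (`dot`, `cross`, `det3`, `vec_times_matrix`, `matrix_times_vec`) and the sample vectors are made up for illustration.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def vec_times_matrix(a, T):
    """(1x3) times (3x3): dot a with each column of T."""
    return tuple(sum(a[i] * T[i][j] for i in range(3)) for j in range(3))

def matrix_times_vec(T, e):
    """(3x3) times (3x1): dot each row of T with e."""
    return tuple(sum(T[i][j] * e[j] for j in range(3)) for i in range(3))

a, b, c = (1, 2, 3), (-4, 5, 6), (7, -8, 9)
# Identity (7): cyclic permutations agree, swapping two vectors flips the sign
print(dot(a, cross(b, c)), dot(b, cross(c, a)), dot(c, cross(a, b)),
      -dot(a, cross(c, b)), det3(a, b, c))

T = ((2, 0, 1), (1, 3, -1), (0, 5, 4))
# Associativity from (2): a.(T.c) equals (a.T).c
print(dot(a, matrix_times_vec(T, c)), dot(vec_times_matrix(a, T), c))
```

Each printed row should show the same number repeated, which is what the identities claim.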

Main argument:

Let S be the liquid-vapor interface surface, C the closed curve boundary of the interface, γ the surface tension of the liquid and ∆P the difference in pressure $P_l - P_v$ between the liquid and its vapor. If the system is at equilibrium, the total force on the surface is zero. By (3) and (4) in the preliminaries this gives us the equilibrium equation:
$0 = f_p + f_t$, or $\int_S \Delta P\, n\, dS = -\gamma \int_C t \times n\, dr$.

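From (5) we have $\int_C F \cdot dr = \int_C F \cdot t\, dr = \int_S (\nabla \times F) \cdot n\, dS$, so letting $F = g \times b$ where g is a vector function and b is a constant (vector function),
$\int_C (g \times b) \cdot t\, dr = \int_S (\nabla \times (g \times b)) \cdot n\, dS$. Using (6) and (7) we can write
$b \cdot \int_C t \times g\, dr = \int_S ((b \cdot \nabla)g - b(\nabla \cdot g)) \cdot n\, dS$. Since $(\nabla \cdot g)$ is a scalar we can use (2) to write
$b \cdot \int_C t \times g\, dr = b \cdot \int_S (\nabla g \cdot n - (\nabla \cdot g)n)\, dS$. Since this is true for an arbitrary b, we have
$\int_C t \times g\, dr = \int_S (\nabla g \cdot n - (\nabla \cdot g)n)\, dS$. Now substitute $g = \gamma n$. Using the fact (4) that $\nabla\gamma$ is tangent to the surface (normal to n) and the result in (9), we have
$\int_C t \times \gamma n\, dr = \gamma\int_C t \times n\, dr = \int_S (\nabla(\gamma n) \cdot n - (\nabla \cdot \gamma n)n)\, dS = \int_S (\nabla(\gamma n) \cdot n - \gamma(\nabla \cdot n)n)\, dS$. Using (8b) we get $\gamma\int_C t \times n\, dr = \int_S (\nabla\gamma - \gamma(\nabla \cdot n)n)\, dS$. Now if we assume γ is constant on the surface then $\nabla\gamma = 0$, and we have $\gamma\int_C t \times n\, dr = -\int_S \gamma(\nabla \cdot n)n\, dS$. From the equilibrium equation we then have $\int_S \Delta P\, n\, dS = \int_S \gamma(\nabla \cdot n)n\, dS$ or $\int_S (\Delta P - \gamma(\nabla \cdot n))\, n\, dS = 0$. Since this is true for any S, we must have $\Delta P = \gamma(\nabla \cdot n)$. QED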

For constant γ we have Pl - Pv = γ(∇・n) where n is the unit normal to the surface away from the liquid. When the surface is a plane we have ∇・n = 0, since n is constant. In general, it can be shown that (∇・n) = κ1 + κ2, the sum of the principal curvatures, which is twice the local mean curvature (κ1 + κ2)/2 of the interface. In the case of a spherical surface, (∇・n) = 2/R where R is the radius of curvature (radius) of the sphere (κ1 = κ2 = 1/R). We show this in the examples below.

Note: Pl - Pv is the pressure difference going from the vapor to the liquid which is the opposite direction we are taking for the (outward) normal n as positive orientation. In considering a droplet or bubble this means the pressure inside is greater than the pressure outside. Thus the bubble tries to expand but this expansion is countered by the surface tension. Equilibrium occurs when these forces are balanced. In particular, the Young-Laplace equation says that the difference in pressure (at equilibrium) is proportional to the surface tension and inversely proportional to the radius. Thus smaller bubbles have greater pressure differences, etc.

Note: Soap bubbles have two surfaces with the air (with a thin film of liquid in between). Therefore the pressure difference on the two sides of a soap bubble is twice that of a single surface.
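The two notes above can be put into numbers with ∆P = 2γ/R per surface. In this sketch the function name `laplace_pressure` and its `surfaces` parameter are invented for illustration, and the surface tension values (about 0.072 N/m for water, about 0.025 N/m for a soap solution) are assumed round figures.

```python
def laplace_pressure(gamma, radius, surfaces=1):
    """Pressure difference across a spherical interface: surfaces * 2*gamma/R."""
    return surfaces * 2.0 * gamma / radius

# Water droplets: halving the radius doubles the pressure difference
print(laplace_pressure(0.072, 1.0e-3), laplace_pressure(0.072, 0.5e-3))

# Soap bubble: two liquid/air surfaces, so twice the single-surface value
print(laplace_pressure(0.025, 1.0e-2, surfaces=2))
```

The droplet values illustrate that smaller bubbles or droplets carry larger pressure differences.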


Example 1: Use ∆P = γ(∇・n) to show that on a spherical surface, ∆P = 2γ/R.
We need to show ∇・n = 2/R.
For the sphere $x^2 + y^2 + z^2 = R^2$, the unit (outward) normal is given by $n = (x,y,z)/R$ where $R = (x^2 + y^2 + z^2)^{1/2}$. Consider $\partial/\partial x\,(x/R) = (R - x^2/R)/R^2$. The same type of expression is obtained for $\partial/\partial y\,(y/R)$ and $\partial/\partial z\,(z/R)$ so we have
$\nabla \cdot n = (R - x^2/R)/R^2 + (R - y^2/R)/R^2 + (R - z^2/R)/R^2 = (3R - (x^2 + y^2 + z^2)/R)/R^2 = 2R/R^2 = 2/R$.
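The result ∇・n = 2/R can also be checked numerically by differentiating n = (x,y,z)/(x² + y² + z²)^(1/2) with central differences. This is only a sanity check under assumed values: the function names, the step size and the sample point on a sphere of radius 2 are illustrative choices.

```python
import math

def unit_normal(x, y, z):
    """Outward unit normal field n = (x, y, z) / |(x, y, z)|."""
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r, y / r, z / r)

def divergence(field, point, step=1e-5):
    """Central-difference estimate of div(field) at the given point."""
    total = 0.0
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += step
        minus[i] -= step
        # d(field_i)/d(x_i), the i-th term of the divergence
        total += (field(*plus)[i] - field(*minus)[i]) / (2.0 * step)
    return total

R = 2.0
point = (R * math.sin(1.0) * math.cos(0.5),
         R * math.sin(1.0) * math.sin(0.5),
         R * math.cos(1.0))
print(divergence(unit_normal, point), 2.0 / R)
```

The two printed numbers should agree to several decimal places.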

Example 2: Show directly that on a half-sphere ∆P = 2γ/R. (Not using Young-Laplace)

Step 1. Show that the total force on the half-sphere x² + y² + z² = R², z ≥ 0, where constant pressure P is acting normal at every point is (0, 0, PπR²).

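$\int_S P\,n\,dS = P\int_S n\,dS = (P\int_S n_1\,dS,\ P\int_S n_2\,dS,\ P\int_S n_3\,dS)$. Parameterize in spherical coordinates $x = R\sin\varphi\cos\theta$, $y = R\sin\varphi\sin\theta$, $z = R\cos\varphi$ for $0 \le \theta \le 2\pi$, $0 \le \varphi \le \pi/2$. Then $r_\theta = (-R\sin\varphi\sin\theta,\ R\sin\varphi\cos\theta,\ 0)$ and $r_\varphi = (R\cos\varphi\cos\theta,\ R\cos\varphi\sin\theta,\ -R\sin\varphi)$. Thus
$r_\theta \times r_\varphi = (-R^2\sin^2\varphi\cos\theta,\ -R^2\sin^2\varphi\sin\theta,\ -R^2\sin\varphi\cos\varphi)$. Since $0 \le \varphi \le \pi/2$, the final component is negative so we take the unit normal $n = -\,r_\theta \times r_\varphi/|r_\theta \times r_\varphi|$ as the outward normal to the surface. We have
$P\int_S n\,dS = P\iint_{R_{\theta\varphi}} n\,|r_\theta \times r_\varphi|\,d\varphi\,d\theta$
$= P\iint (R^2\sin^2\varphi\cos\theta,\ R^2\sin^2\varphi\sin\theta,\ R^2\sin\varphi\cos\varphi)\,d\varphi\,d\theta$
$= PR^2\left(\iint \sin^2\varphi\cos\theta\,d\varphi\,d\theta,\ \iint \sin^2\varphi\sin\theta\,d\varphi\,d\theta,\ \iint \sin\varphi\cos\varphi\,d\varphi\,d\theta\right)$ for $0 \le \theta \le 2\pi$, $0 \le \varphi \le \pi/2$. We integrate these components to get $PR^2(0, 0, \pi)$ and so the total force due to pressure is $(0, 0, P\pi R^2)$.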
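The three double integrals in Step 1 can be approximated with a simple midpoint rule to confirm that only the z component survives. The sketch below is a numerical check, not part of the proof; the function name `hemisphere_pressure_force` and the grid sizes are arbitrary choices.

```python
import math

def hemisphere_pressure_force(pressure, radius, n_phi=200, n_theta=400):
    """Midpoint-rule estimate of P * integral of n dS over the upper half-sphere."""
    d_phi = (math.pi / 2.0) / n_phi
    d_theta = (2.0 * math.pi) / n_theta
    weight = pressure * radius * radius * d_phi * d_theta
    fx = fy = fz = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            # Integrand is n * |r_theta x r_phi| / R^2 in spherical coordinates
            fx += weight * math.sin(phi) ** 2 * math.cos(theta)
            fy += weight * math.sin(phi) ** 2 * math.sin(theta)
            fz += weight * math.sin(phi) * math.cos(phi)
    return fx, fy, fz

fx, fy, fz = hemisphere_pressure_force(1.0, 1.0)
print(fx, fy, fz, math.pi)
```

The x and y components cancel around the circle, and the z component approaches PπR², the pressure times the area of the flat disc under the half-sphere.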


Step 2. Show that the total force due to surface tension is given by (0,0,-γ2πR) around the equator of the half-sphere.

Here C is the equator x² + y² + z² = R², z = 0, so the unit tangent at (x,y,0) on the curve is t = (-y,x,0)/R and the unit normal to the surface at (x,y,0) is n = (x,y,0)/R, so the direction of the surface tension force is t x n = (0,0,-(x² + y²)/R²) = (0,0,-1). The curve (equator) has length 2πR, so the total force due to surface tension is (0,0,-γ2πR).

Step 3. By the equilibrium equation, PπR² = γ2πR so P = 2γ/R.


Note: Although the Young-Laplace equation for the spherical shaped surface can be established without the higher mathematics used in the general result, the general result is a much more powerful tool and saves the work of establishing the result for specific cases like we did here.