matheqns

Monday, March 31, 2014

Angry Glass :)

Last week's glass session was bad; I kept making simple mistakes. Anyway, I took all that anger out on the glass, and ended up with this. Another addition to my collection :)








Tuesday, March 18, 2014

Paperweight and colored vase.


This weekend Hillary, the person who got me into glass, came to Hershey, PA. We met, and I gave her the glass pieces I made for her: a paperweight and a colored vase. Pictures are below. Link to her blog: http://herald-dispatchblogs04.blogspot.com/

Paperweight and vase

Paperweight

Saturday, March 1, 2014

My vase


After last week's fiasco where I broke 4 cups, I was finally able to get one in the oven. It was supposed to be a cup, but the shape was nice, so I ended up with a vase instead. I like thicker glass because it does more to light than thinner glass: it reflects, bends, and adds glow to the piece. That's one reason I love Blenko glass. :D

This week I made 2 cups and a vase, although only one of them is close enough to be considered good; the others are bad. Btw, I was using the sock I turned into a heat shield. It worked well. I got just 2 burn marks, one on my left hand and the other on my upper arm, in the gap between where the sock ended and my shirt began. There was no sock covering those. :D




Friday, February 21, 2014

First cup.

My first cup.
Now that I have the basic techniques, I have to practice them for the next 2-3 months. This is supposed to be the first piece, the one you look back on and say, at least I did not make that. But today was so bad that I could not even save them.

Today I tried 4 cups and dropped 3. The one I got into the oven was not even what I wanted: I was aiming for a cup, and got a vase.

What's worse is I burnt my hand several times, and made a lot of mistakes. Might have been distracted.

Earlier, I was walking outside without a jacket in the morning, and then went to the glass studio, where the temperature was 90. I think it got worse due to no sleep.

Just angry.  x-(




Saturday, January 18, 2014

How I got into glassblowing :)


This post does not have math equations in it; it's more of a 'dear diary' post. I have almost no artistic skills, at least not traditional ones. I like glass blowing and ice skating, and love trains, mainly because the first involves high-temperature physics/chemistry, the second involves physics different from what we (at least I) normally experience, and the third involves the physics of controlling a million-ton mass. In short, I love playing with fire, ice and heavy objects. This post is about how I got into glassblowing. I have used initials of people for privacy purposes.

I have been involved in glass for nearly 4 years now. It started with my obsession with trains, but I did not want to spend my money on trains, so I started trading Blenko glass. Through that, I came in touch with Hillary, and started helping her out with the blenkoproject blog; I even made a website for them. The other blog listed as mine is actually Hillary's, I just have admin rights to it. Links below.

I wanted to learn glass making, but the closest place I knew was Corning. Although they have short classes, they run them as a 7-day crash course, and after that you can't practice. It seemed useless to learn an art that takes more than 10 years of practice to master in 7 days, and then not practice. However, I read many textbooks on glass making, high temperature physics, and colored glass. All technical, not artistic. But in glass making, the biggest issue most people face is learning how glass behaves at high temperature.

Glass melts around 1500 C, and if you cool it quickly below 700 C it cracks. So the working range is usually between 800 and 1200 C, and as glass cools, its behavior changes. Within minutes it goes from a gravy-like liquid to a clear solid. And glass responds differently to heat and cold: for glass at 1000 C, room temperature is cold. In fact, one of the techniques to cool glass (marvering) is to roll the hot glass on a steel tabletop (earlier it used to be marble) to form a semi-solid layer on the surface, so you can mold the glass. It's something like shaping a balloon filled with dense liquid. In fact, the way glass is removed from the steel rod uses thermal shock, not actual cutting.

In short, I had been involved with glass for 3 years and had been studying glass for 2 years. I was looking for a place to go and blow glass, but all were either too far or too expensive. I did fuse glass in my apartment, but there's only so much you can do in a microwave. About 6 months ago, there was a fire in State College (Waupelani Dr), where some apartments burnt down. A person (C) was looking to raise money for the victims, and emailed the psu-community-garden workshop list. I wanted to do something, but whomever I asked was either involved with an organization or wanted to give a bible with the cash. So I asked my friends and people on the PSU-PD listserv, and got about 800. When I went to give her the money, she told me she likes glass. Then I went on my usual monologue, and she told me about a place at PSU where people go and blow glass, and said I should check it out. So I did.

I emailed them, and they asked me to shadow the artists for 6 months, saying that after that they would decide whether or not to teach me. I think this is their way to weed out people who are not willing to put in the time or effort. I went there for a month, and then sent a thank-you note to C. She got excited and wanted to learn too, so she started shadowing as well. Yesterday was the first time we got to work on our own. It wasn't easy. My mind is usually restless, and is filled with random thoughts. Sometimes even I don't know what I am saying or thinking. But when I shadowed people, or was working with glass, my mind was blank. In fact, I don't even notice other people talking to me. How are you supposed to learn something in a class, when that something puts you in a trance?

Anyway, after many mishaps, I made 2 snowmen and 1 paperweight. I burnt my hand in the process, but not while making: after making the first snowman, I got excited, and dropped my arm on the hot steel rod. Thankfully, it was after making the glass; the rod was red hot when I started. She did the same too, the 2 snowmen and 1 paperweight part, not the hand-burn part. Anyway, it was fun. And because of the shadowing, it was easy to pick up. The pieces are annealing now. I will get them soon, and then there will be another dear diary post. Not this long, mostly pics. :)


Wednesday, January 15, 2014

Just felt like deriving Rodrigues' rotation formula


Woke up again at 11, and unable to fall asleep. My sleep cycle is following Mr Murphy: most active and alert when I'm supposed to be sleeping. I was reading something online, and came across Wikipedia's article on Rodrigues' rotation formula, and saw that they derived it in a juvenile manner, using 11th grade math (http://en.wikipedia.org/wiki/Rodrigues'_rotation_formula#Derivation). So I just felt like deriving it in a slightly more elegant manner, and here it is. I will add figures that go with this later; I don't have a scanner, nor the pencil, paper or patience to draw lines now. Until then, use your imagination. :)

Let v be the vector to be rotated by angle $\theta$ about the unit axis k.

Component of v along k,
$v_k = (v \cdot k) k$
Component perpendicular to k
$v_{\perp} = v - (v \cdot k) k$
After rotation by $\theta$, the component perpendicular to k that lies along $\hat{v}_{\perp}$ is
$v_{\perp,1} = |v_{\perp}| \cos(\theta) \, \hat{v}_{\perp} = \left| v - (v \cdot k) k \right| \cos(\theta) \, \frac{ v - (v \cdot k) k }{\left| v - (v \cdot k) k \right|} $
$v_{\perp,1} = \left( v - (v \cdot k) k \right) \cos(\theta) $

The component perpendicular to both k and $v_{\perp}$ is
$v_{\perp,2} = |v_{\perp}| \sin(\theta) \, \hat{v}_{\perp,2} = \left| v - (v \cdot k) k \right| \sin(\theta) \, \frac{ k \times \left( v - (v \cdot k) k \right) }{ \left| k \times \left( v - (v \cdot k) k \right) \right| } $
As k is a unit vector perpendicular to $v_{\perp}$,
$ \left| k \times \left( v - (v \cdot k) k \right) \right| = \left| v - (v \cdot k) k \right| $
and, since $k \times k = 0$,
$ k \times \left( v - (v \cdot k) k \right) = k \times v $.
So
$v_{\perp,2} = \left( k \times v \right) \sin(\theta) $

The rotated vector is
$v_{rot} = v_k + v_{\perp,1} + v_{\perp,2}$
$ v_{rot} = (v \cdot k) k + \left( v - (v \cdot k) k \right) \cos(\theta) + ( k \times v) \sin(\theta) $
Simplifying,
$ v_{rot} = v \cos(\theta) + (v \cdot k) k \left( 1 - \cos(\theta) \right) + ( k \times v) \sin(\theta) $
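
To check the result, here is a minimal numerical sketch in Python (numpy assumed; the test vectors are my own, not part of the derivation):

import numpy as np

def rodrigues_rotate(v, k, theta):
    # v_rot = v cos(t) + (v.k) k (1 - cos(t)) + (k x v) sin(t)
    k = k / np.linalg.norm(k)                  # make sure the axis is a unit vector
    return (v * np.cos(theta)
            + k * np.dot(v, k) * (1.0 - np.cos(theta))
            + np.cross(k, v) * np.sin(theta))

# Rotating x-hat by 90 degrees about z-hat should give y-hat.
v = np.array([1.0, 0.0, 0.0])
k = np.array([0.0, 0.0, 1.0])
print(rodrigues_rotate(v, k, np.pi / 2))       # ~ [0, 1, 0]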



Tuesday, January 14, 2014

Can the Hessian be substituted by its corresponding diagonal matrix?


Am up again, and it's 4 am. The gym does not open until 6, so time to ramble on. Previous posts focused on using gradient descent methods for minimization. However, they were for optimization of cost functions parametrized by one variable. For multivariate cases, the update rules for a to minimize F(a) are,
(eq1) $a_{n+1} = a_n - \mu  \left. \nabla F \right| _{a_n}$ and
(eq2) $a_{n+1} = a_n - \mu H(a_n)^{-1} \left. \nabla F \right| _{a_n}$
The error approximations estimated earlier apply to both eq1 and eq2; however, computations using eq2 require the Hessian, which is computationally intensive. Most algorithms (quasi-Newton methods) approximate the Hessian based on changes in the gradient $\left. \nabla F \right| _{a_n}$ as the solution is updated from $a_{n}$ to $a_{n+1}$. Even so, this still involves computing and inverting a large $n \times n$ matrix. In this post I check the errors introduced by replacing the Hessian in eq2 with the diagonal matrix whose diagonal elements are the same as the Hessian's.
(eq3) $a_{n+1} = a_n - \mu H_d(a_n)^{-1} \left. \nabla F \right| _{a_n}$
This way, you only need to store and compute n additional terms at each step, and the inverse of a diagonal matrix is much easier to compute. I will now try to investigate how this approximation influences the gradient descent method, and whether stability can be guaranteed, and if so, under which conditions. The results below are random thoughts for now; I will use them to show convergence in an upcoming post.

Rewriting eq2, 
(eq4) $a_{n+1} = a_n - \mu H(a_n)^{-1} \left. \nabla F \right| _{a_n}$
Now splitting the Hessian into diagonal and non-diagonal terms,
(eq5) $a_{n+1} = a_n - \mu \left( H_d(a_n) + H_{nd}(a_n) \right)^{-1} \left. \nabla F \right| _{a_n}$

Using the matrix inversion lemma:
$(A-B D^{-1} C)^{-1} =  A^{-1} + A^{-1} B ( D - C A^{-1} B)^{-1} C A^{-1}  $, 
Setting B and D to identity matrices and replacing C with -C,
$(A+ C)^{-1} =  A^{-1} - A^{-1}( I + C A^{-1} )^{-1} C A^{-1}  $,
On further simplification,
$(A+ C)^{-1} =  A^{-1} - A^{-1}( I + A C^{-1} )^{-1}  $, 
Dropping the arguments and applying this to the split Hessian,
$(H_d+ H_{nd})^{-1} = H_d^{-1} - H_d^{-1}( I + H_d H_{nd}^{-1} )^{-1}  $,
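
As a quick sanity check of this identity, here is a Python sketch (numpy assumed; the example matrix is made up, and note that the identity needs $H_{nd}$ to be invertible):

import numpy as np

H = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 5.0]])               # a made-up symmetric Hessian
Hd = np.diag(np.diag(H))                      # diagonal part
Hnd = H - Hd                                  # non-diagonal part (invertible here)

lhs = np.linalg.inv(Hd + Hnd)
inv_Hd = np.linalg.inv(Hd)
rhs = inv_Hd - inv_Hd @ np.linalg.inv(np.eye(3) + Hd @ np.linalg.inv(Hnd))
print(np.allclose(lhs, rhs))                  # True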
Substituting this into the update equation eq5,
(eq6) $a_{n+1} = a_n - \mu  (H_d^{-1} - H_d^{-1}( I + H_d H_{nd}^{-1} )^{-1}) \left. \nabla F \right| _{a_n}$
(eq7) $a_{n+1} = a_n - \mu  H_d^{-1} \left. \nabla F \right| _{a_n} +  \underbrace{\mu H_d^{-1}( I + H_d H_{nd}^{-1} )^{-1} \left. \nabla F \right| _{a_n}}_{\text{error due to ignoring non diagonal terms}}$
Further simplification gives,
(eq8) $a_{n+1} = a_n - \mu  H_d^{-1} \left. \nabla F \right| _{a_n} +  \mu H_d^{-1}H_{nd} H^{-1} \left. \nabla F \right| _{a_n}$
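
To get a feel for the size of that last term, here is a toy sketch in Python (numpy assumed; the quadratic cost, the Hessian, and the step size are made up for illustration):

import numpy as np

H = np.array([[4.0, 1.0],
              [1.0, 3.0]])                         # constant Hessian of F(a) = 0.5 a'Ha
grad = lambda a: H @ a                             # gradient of the quadratic cost

a = np.array([2.0, -1.0])
mu = 0.5

full_step = a - mu * np.linalg.solve(H, grad(a))   # eq2, full Hessian
diag_step = a - mu * grad(a) / np.diag(H)          # eq3, H_d^{-1} is elementwise 1/diagonal
print(full_step - diag_step)                       # the eq8 error term for this step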

Will stop now; will post a proof using this over the weekend. It's almost 6, gym time :D





Sunday, January 12, 2014

Optimization using Newton-Raphson with Taylor series approximations

This post is the logical extension of the previous two, where I checked whether Taylor series approximations of derivatives can be used for optimization using steepest descent. I found that derivatives computed this way provided good-enough approximations for use in an optimization routine; the next post investigated how the perturbation size (or step size) must be chosen when computing the derivatives. This post will focus on optimization using the Newton-Raphson method with derivatives computed using Taylor series approximations. Again, the aim is to find a that minimizes F(a). In the Newton-Raphson method, a is updated using
(eq1) $a_{n+1}  = a_n - \mu \frac{F'  (a_n) }{F'' (a_n) }$

The cost function at step (n+1) is
(eq2) $F(a_{n+1}) = F(a_{n}) -  \mu \frac{F'  (a_n) }{F'' (a_n) } F'  (a_n)  + O(\mu^2) $
(eq3) $F(a_{n+1}) = F(a_{n}) -  \mu \frac{F'  (a_n) ^2}{F'' (a_n) }   + O(\mu^2) $
Errors in eq2 are of second order in the learning rate, and it can be seen from eq3 that the Newton-Raphson method will reduce the cost at each step, as long as the learning rate and initial guesses are chosen appropriately.

I am writing this post to investigate what errors result from approximating the derivatives in eq1 with Taylor series approximations. The Newton-Raphson update rule becomes
(eq4) $a_{n+1}  = a_n - \mu \frac{\bar{F}'  (a_n) }{\bar{F}'' (a_n) }$.

As before, let h quantify the relative perturbation (step) size. The first and second derivative approximations then carry the following errors,
(eq5) $\bar{F}'(a_{n}) = F'(a_{n}) + \frac{a_n h}{2}F''(a_n)  + O(h^2) $
(eq6) $\bar{F}''(a_{n}) = F''(a_n) + O(h^2) $
Neglecting the second order terms, and substituting these into eq2 as the derivative estimates,
(eq7) $F(a_{n+1} )  = F(a_n ) - \mu \frac{F'(a_{n}) + \frac{a_n h}{2}F''(a_n) + O(h^2)}{F ''(a_n) }F' (a_n) + O(\mu^2)$.
Expanding and collecting the error terms,
(eq8) $F(a_{n+1})  = F(a_n) - \mu  \frac{F'(a_{n})^2}{F''(a_n)} - \frac{\mu a_n h}{2} F' (a_n) + O(h^2)  +O(\mu^2)$.

Again, choosing
$h = c \mu$
for 0 < c < 1 will give errors of second order in the learning rate. I would recommend choosing c as close to zero as possible, because the Newton-Raphson method tolerates a higher learning rate, while the Taylor series approximation of the derivatives does not tolerate a large perturbation. So choosing a very small c (0.0001 for example) will result in stable solutions, and all the adaptive methods to adjust learning rates can be applied. I tested this on exponential fitting and it gave solutions much faster than standard gradient descent. The next post will be about using this for the conjugate-gradient method; although I love it, and am a fan of smart algorithms, I am not looking forward to it. Another issue is that all these checks are done for 1-parameter problems. Numerically, they apply to solutions using multiple parameters too; however, the next step is to check under what conditions approximations using these methods will fail.
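
For concreteness, here is a minimal one-parameter sketch in Python of the eq4 update (numpy assumed; the cost function, c, and the starting guess are my own, and I use central differences for both derivative estimates, which is slightly more accurate than the forward difference analyzed above):

import numpy as np

F = lambda a: (np.exp(-a) - 0.05) ** 2       # made-up cost, minimum at a = ln(20)

a, mu, c = 1.0, 0.5, 1e-4                    # small c, as recommended above
for _ in range(200):
    da = c * mu * a                          # absolute step for a relative perturbation h = c*mu
    d1 = (F(a + da) - F(a - da)) / (2 * da)              # first derivative estimate
    d2 = (F(a + da) - 2 * F(a) + F(a - da)) / da ** 2    # second derivative estimate
    a -= mu * d1 / d2                        # eq4 update
print(a, np.log(20))                         # both ~ 3.0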


Friday, January 10, 2014

Optimization using Taylor series: Stability Analysis


In the last post I tested whether a forward difference approximation of the first derivative from a Taylor series expansion can be used in optimization routines. I was surprised to find that this method gave stable results, in spite of my repeated attempts to make it unstable. However, the stability of the method depended on the perturbation size (or step size) used to compute the approximate first derivative. As long as it was small (below 5% for the exponential fit) the algorithm gave a convergent solution, but for larger values it did not. Below is an attempt to estimate the source of error due to the Taylor series expansion, and to provide a formal way to choose the perturbation size.

The optimization goal is to minimize F(a) using the update rule,
(eq1) $a^{n+1} = a^n - \mu \left. \frac{dF}{da} \right|_{a^n}$

The estimated derivative is computed as
(eq2) $ \left. \overline{\frac{dF}{da}} \right|_{a^n}= \frac{F(a^n+a^n h)-F(a^n)}{a^n h}$
Noting that
(eq3) $F(a^n+a^n h) = F(a^n ) + a^n h \left. \frac{dF}{da} \right|_{a^n} + \frac{(a^n)^2 h^2}{2} \left. \frac{d^2F}{da^2} \right|_{a^n} + O(h^3)$,
the approximate derivative is
(eq4) $\left. \overline{\frac{dF}{da}} \right|_{a^n} =  \left. \frac{dF}{da} \right|_{a^n}+ \frac{a^n h}{2} \left. \frac{d^2F}{da^2} \right|_{a^n} + O(h^2)$.

The cost function, after the update using eq1, is
(eq5) $F(a^{n+1}) = F(a^n)   - \mu  \left.  \overline{\frac{dF}{da}}  \right|_{a^n} \left. \frac{dF}{da} \right|_{a^n} + O(\mu^2)$.

Substituting the approximate derivative from eq4,
(eq6) $F(a^{n+1}) = F(a^n)   - \mu \left( \left.  \frac{dF}{da}  \right|_{a^n} + \frac{a^n h}{2} \left. \frac{d^2F}{da^2}   \right|_{a^n} + O(h^2) \right) \left. \frac{dF}{da} \right|_{a^n}  + O(\mu^2)$
Rearranging gives
(eq7) $F(a^{n+1}) = F(a^n)   - \mu \left( \left.  \frac{dF}{da} \right|_{a^n} \right)^2  \underbrace{ -  \frac{a^n h \mu}{2} \left. \frac{d^2F}{da^2} \right|_{a^n}   \left. \frac{dF}{da} \right|_{a^n} + O(h^2) }_{\text{effect of Taylor series approximation}}  + O(\mu^2)$

Based on eq7, choosing
(eq8) $h = c \mu$
for any 0 < c < 1 will result in errors of second order in the learning rate, and the cost function reduces to
(eq9) $F(a^{n+1}) = F(a^n)   - \mu \left( \left.  \frac{dF}{da} \right|_{a^n} \right)^2 + O(\mu^2)$,
i.e. with h chosen according to eq8, the error from the approximate derivative is absorbed into the $O(\mu^2)$ term.

CONCLUSION: If h is chosen according to eq8, the error from computing the derivative via the Taylor series expansion is of the same order as that from computing the derivative analytically. This way, all the adaptive methods to modify the learning rate can be applied to this algorithm too.
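
A tiny pure-Python illustration of eq8 (the cost function and all the constants are made up):

F = lambda a: (a - 2.0) ** 2                 # 1-parameter cost, minimum at a = 2
mu = 0.01                                    # learning rate

def descend(h, a=0.5, steps=2000):
    for _ in range(steps):
        d = (F(a + a * h) - F(a)) / (a * h)  # eq2: forward difference, relative step
        a -= mu * d                          # eq1 update
    return a

print(descend(h=0.5 * mu))   # h = c*mu with c = 0.5: converges to ~1.995
print(descend(h=0.2))        # large h: settles on a biased answer, ~1.818

For this forgiving quadratic, a large h only biases the answer (the fixed point of the update is 4/(2+h) instead of 2), which is the $\frac{a h}{2} F''$ bias from eq4 at work.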



Optimization using Taylor series expansion: Feasibility test

THE IDEA:
This post investigates whether it is possible to perform optimization using Taylor series estimates of first derivatives. Most optimization problems involve converting the cost function into a parametric form and evaluating its first derivative; then, by equating the derivative to zero, you solve for the parameters that describe the solution. For now, I will ignore the second order conditions.

So the optimization problem becomes: find the optimal set of parameters a* that minimizes F(a). Or, find a* so that
(eq1)  $ \left. \frac{d F}{d a} \right|_{a^*} = 0$
Most times, it is not easy to solve this analytically, so people resort to numerical techniques. One of the most common numerical techniques is gradient descent, where you start from an initial guess and propagate the solution to step (n+1) as

(eq2) $ a^{n+1} = a^{n} - \mu \left. \frac{d F}{d a} \right|_{a^n}   $

However, it may not always be possible to calculate the derivative of the cost function. So I was wondering: how well would approximating the derivative numerically work?
(eq3) $  \left. \overline{ \frac{d F}{d a} } \right|_{a^n} = \frac{F(a^{n} + \delta a^n ) - F(a^{n} )}{\delta a^n}  $


THE EXPERIMENT:
To test this, I chose to fit an exponential model to data generated from an exponential function. The data to be fit was generated as
(eq4) $Y = a_0 e^{a_1 x} + a_2 + \alpha(0,\sigma)$,
where $\alpha(0,\sigma)$ is zero-mean Gaussian noise with standard deviation $\sigma$. I chose the generating parameters as
(eq5) $a_0 = 5$, $a_1 = -3$, $a_2 = 3$, $\sigma = 0.5$.

To estimate the derivative from eq3, I chose the perturbation as 1% of the current solution and a learning rate of 0.001. I started with an initial guess of 1,1,1, and ran the solution update according to eq2 4000 times. I chose these numbers on purpose because I did not want to give any advantage to the optimization scheme.* My goal was to give no advantage to the optimization routine, so I chose the optimization parameters to be as lame as possible. It is possible to get better initial guesses: knowing the structure of eq4, you can estimate initial guesses for the parameters that are very close to the real solution. The learning rate need not be decided a priori; I could have introduced a subloop to keep reducing the learning rate until it results in a small change in the solution parameters. And I could have made the perturbation size in eq3 adaptive, so that the errors in estimating the derivatives are small. However, I chose to do none of these.
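
Here is a rough sketch of that setup in Python (numpy assumed; the x grid, the noise seed, and the use of a mean-squared cost are my assumptions, the rest follows the text):

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 5.0 * np.exp(-3.0 * x) + 3.0 + rng.normal(0.0, 0.5, x.size)   # data from eq4/eq5

def F(a):
    return np.mean((a[0] * np.exp(a[1] * x) + a[2] - y) ** 2)     # squared-error cost

a = np.array([1.0, 1.0, 1.0])        # the deliberately lame initial guess
mu = 0.001                           # learning rate from the text
for _ in range(4000):
    g = np.zeros(3)
    for i in range(3):
        da = 0.01 * a[i] or 1e-8     # 1% perturbation, guarded against a zero parameter
        ap = a.copy()
        ap[i] += da
        g[i] = (F(ap) - F(a)) / da   # eq3 forward difference
    a -= mu * g                      # eq2 update
print(a)                             # drifts toward (5, -3, 3); how close depends on the noise draw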

I decided to do a Monte Carlo simulation by running this method 1000 times, and checking how many times the optimization routine crashes or gives unstable results.

RESULTS: 

I was expecting (and hoping for) a catastrophic failure. Surprisingly, the optimization gave a stable result every time I tried it. It's 5:30 am now, and I have run the simulation about 104 times with no crashes, so I stopped it.


Figure 1. Representative results from optimization. 

LEARNING:
Optimization using estimates of the first derivative from a Taylor series expansion provided stable results, and may be a viable option, given the simple implementation and the convergence observed here. However, choosing a large perturbation size can result in unbounded errors: when I tried using a 20% perturbation size, I got large errors (I know, 20%). So the next post will be about how to choose an appropriate perturbation size, either analytically or numerically. Also, second order estimates (central differencing) of derivatives are more accurate than first order estimates (forward or backward differencing), so I am assuming these will result in improved stability and a larger usable range of perturbation sizes. These will be topics of upcoming posts.

CONCLUSION:
It's 6 am, off to sleep. :)


_____
*I can write a whole post on how to choose parameters for optimization, and how each one affects the final solution. However, it's trivial, and if you read this far, you can figure it out on your own.






About blog

I am my restless-24-seven self. Those few who know me know that I think I know math, and that I love it. Although, when you learn something, you only learn what you don't know about it. And the more you learn, the more you realize the grandeur of the unknown. As one great person once said, "Only fools and geniuses know everything about everything, and I haven't met a genius until today."

This blog is to document random ideas that I have at odd times, and to document the work done to test/verify those ideas.

I am convinced that math is art, and hope to showcase the art in it here. This is my creative outlet; nothing I say or write here represents my views. I write these to clear my head, mostly when I am outside myself, hence the odd 4 am posts.

Also, am obsessed with 2 and all powers of 2. And it scares me :(