I debated whether to post this to Literature (since I am describing a paper), here (since I am describing how the mathematics in the paper relates to nomography), or Software (since the algorithm has been implemented in the free package [i]R[/i]). I decided to post here and maybe have a supplementary post in Software.

Lately I've been doing a lot of research (mostly reading papers) into transformations and, in particular for the present discussion, transformation to additivity.

This is the situation where you have a collection of z=F(x,y) data (likely in the form of a table, but that's not necessary; it can just be a collection of x,y,z triples). The aim is to find a transformation h(z) such that

h(z) ~= f(x) + g(y) (where ~= is "approximately equals")

This is a transformation to additivity in the sense that the transformation h(z) turns the bivariate function F into a sum of univariate functions.
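As a small illustration (a toy example of my own, not taken from the paper): for z = x*y, taking h = log gives log z = log x + log y exactly. One simple way to check a table for additivity is to double-centre it (subtract the row and column means and add back the grand mean); for an exactly additive table nothing is left over. The helper names below are hypothetical.

```python
import math

def interaction_residuals(table):
    """Double-centre a table: what remains after removing row and column effects."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]

def max_abs(table):
    """Largest absolute entry of a table."""
    return max(abs(v) for row in table for v in row)

xs = [1.0, 2.0, 4.0, 8.0]
ys = [1.0, 3.0, 5.0]
raw = [[x * y for y in ys] for x in xs]                # z = F(x, y) = x*y
logged = [[math.log(z) for z in row] for row in raw]   # h(z) = log z

nonadditivity_raw = max_abs(interaction_residuals(raw))
nonadditivity_log = max_abs(interaction_residuals(logged))
print(nonadditivity_raw, nonadditivity_log)
```

The raw table leaves a clearly nonzero remainder, while the logged table leaves only floating-point rounding error, which is what "transformed to additivity" means here.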

There are actually several algorithms around that try to do this!

I will talk mostly about a particular one, ACE ("alternating conditional expectations"), which is designed to work with noisy data. This is described in a paper by Breiman and Friedman [1].

There are some other algorithms that I may describe later.

In the notation of the paper, there is a response variable Y and predictors X_1,...,X_p. The ACE algorithm finds a transformation \theta(Y) and smooth functions \phi_1,...,\phi_p such that the correlation corr(\theta(Y), \phi_1(X_1) + ... + \phi_p(X_p)) is maximized.

(Excuse the LaTeX-ese, but it's hard to write mathematics in ASCII.)

In terms of our problem, ACE attempts to find a smooth(ish) transformation h(z) and a pair of smooth functions f(x) and g(y) such that the correlation corr(h(z), f(x)+g(y)) is maximized.

ACE alternates between two steps. First, each \phi_i in turn is re-estimated by a kind of local averaging (smoothing) of the ith partial residuals (the transformed Y minus the sum of the [i]other[/i] \phi_j(X_j)) against X_i. Then, once all the \phi functions have been re-estimated, the estimate of the transformation \theta is updated by averaging (smoothing) the sum of the \phi functions at each value of Y, and rescaled to unit variance so that the trivial all-zero solution is excluded.
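To make the alternation concrete, here is a minimal sketch in Python for the two-predictor case h(z) ~= f(x) + g(y). It is not the published implementation: it assumes x and y lie on a grid (so averaging within each grid level stands in for smoothing), it uses a crude running mean over neighbours in the ordering of z as the smoother for h, and it runs a fixed number of iterations instead of testing convergence. All function names are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(v):
    """Rescale to zero mean and unit variance."""
    m, s = mean(v), pstdev(v)
    return [(a - m) / s for a in v]

def cond_mean_by_level(target, levels):
    """Average `target` within each distinct value of a gridded predictor."""
    groups = {}
    for t, lv in zip(target, levels):
        groups.setdefault(lv, []).append(t)
    means = {lv: mean(ts) for lv, ts in groups.items()}
    return [means[lv] for lv in levels]

def running_mean_by_rank(target, key, half_width=2):
    """Crude smoother: average `target` over neighbours in the ordering of `key`."""
    order = sorted(range(len(key)), key=lambda i: key[i])
    out = [0.0] * len(key)
    for pos, i in enumerate(order):
        window = order[max(0, pos - half_width): pos + half_width + 1]
        out[i] = mean(target[j] for j in window)
    return out

def correlation(a, b):
    sa, sb = standardize(a), standardize(b)
    return mean(p * q for p, q in zip(sa, sb))

def ace_two_predictors(x, y, z, n_iter=30):
    """Alternate between updating f, g (given h) and h (given f + g)."""
    h = standardize(z)
    f = [0.0] * len(z)
    g = [0.0] * len(z)
    for _ in range(n_iter):
        # Partial residuals: transformed response minus the *other* function.
        f = cond_mean_by_level([hi - gi for hi, gi in zip(h, g)], x)
        g = cond_mean_by_level([hi - fi for hi, fi in zip(h, f)], y)
        # Update the response transformation and rescale to unit variance.
        total = [fi + gi for fi, gi in zip(f, g)]
        h = standardize(running_mean_by_rank(total, z))
    return h, f, g

# Toy data: z = x*y on an 8 x 8 grid, no noise.
xs = [float(i) for i in range(1, 9) for _ in range(1, 9)]
ys = [float(j) for _ in range(1, 9) for j in range(1, 9)]
zs = [a * b for a, b in zip(xs, ys)]

# Best additive fit to the untransformed z, for comparison.
z0 = standardize(zs)
f0 = cond_mean_by_level(z0, xs)
g0 = cond_mean_by_level([a - b for a, b in zip(z0, f0)], ys)
corr_raw = correlation(z0, [a + b for a, b in zip(f0, g0)])

h, f, g = ace_two_predictors(xs, ys, zs)
corr_ace = correlation(h, [a + b for a, b in zip(f, g)])
print(corr_raw, corr_ace)
```

On this toy table the correlation after the iterations should be noticeably higher than for the untransformed z, with the estimated h bending toward something log-like. For real work, use a proper smoother and a convergence test in place of the running mean and fixed iteration count.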

The overall approach has a number of deficiencies for our purposes, but on the whole it works quite well.

The basic algorithm is not particularly tricky to implement, but fortunately free code to do this is already available. I will write about how you can get it, and perhaps afterward a little about how to use it, in a post in the Software section.

[b]Relationship to nomography[/b]

If Z=h(z), X=f(x) and Y=g(y), then we have a standard nomogram form, Z=X+Y. This is easily implemented as a parallel-scale nomogram (though it can be done in other forms, including N charts and even the elliptic-function nomograms Ron posted about).
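For the parallel-scale case the geometry can be sketched as follows. I am assuming the simplest layout: the x scale on a vertical line at the left with heights X=f(x), the y scale on a vertical line at the right with heights Y=g(y), and the z scale midway between them with heights Z/2. A straight line joining x and y then crosses the middle scale at (X+Y)/2, i.e. at the mark for z. The function names are hypothetical, and log is used only because it makes z=x*y a convenient check.

```python
import math

def parallel_scale_points(x, y, z, f, g, h, width=1.0):
    """Plot positions (u, v) of x, y and z on three parallel vertical scales."""
    px = (0.0, f(x))                   # left scale:   height X = f(x)
    py = (width, g(y))                 # right scale:  height Y = g(y)
    pz = (width / 2.0, h(z) / 2.0)     # middle scale: height Z/2 = h(z)/2
    return px, py, pz

def is_collinear(p, q, r, tol=1e-9):
    """True if the three points lie on one straight line (zero cross product)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) < tol

# Check with z = x*y, where h = f = g = log gives Z = X + Y exactly.
px, py, pz = parallel_scale_points(2.0, 3.0, 6.0, math.log, math.log, math.log)
on_line = is_collinear(px, py, pz)                               # z = 6 is the answer
off_line = is_collinear(px, py, (0.5, math.log(7.0) / 2.0))      # z = 7 is not
print(on_line, off_line)
```

With estimated transformations, f, g and h would be replaced by the fitted curves (for example as interpolated lookup tables) and the alignment would hold only approximately.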

Once you have the set of tick marks (z(i), x(j), or y(k)), you just take the relevant scale and transform via the estimated h, f, or g. This is easiest if the data is already at the desired tick marks, but the program can deal with other cases. Well, there are a few other details, but it's not particularly complex.
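When the desired tick marks are not among the data values, one simple option (my assumption, not necessarily what the program does) is to interpolate linearly between the points where the transformation was estimated. A minimal sketch, with made-up numbers standing in for an estimated h:

```python
from bisect import bisect_left

def interpolate(knots, values, t):
    """Piecewise-linear interpolation of an estimated transform at t.

    `knots` must be sorted in increasing order; no extrapolation is attempted.
    """
    if not knots[0] <= t <= knots[-1]:
        raise ValueError("tick lies outside the range of the data")
    k = bisect_left(knots, t)
    if knots[k] == t:
        return values[k]
    t0, t1 = knots[k - 1], knots[k]
    v0, v1 = values[k - 1], values[k]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Hypothetical estimated transformation: h known only at these z values.
z_data = [1.0, 2.0, 4.0, 8.0]
h_data = [0.0, 1.0, 2.0, 3.0]

# Desired tick marks on the z scale and their positions along the scale.
z_ticks = [1.0, 3.0, 4.0, 6.0]
tick_positions = [interpolate(z_data, h_data, t) for t in z_ticks]
print(tick_positions)
```

Linear interpolation is only as good as the spacing of the data; where the estimated curve bends sharply, a denser table or a smoother interpolant would place the ticks more accurately.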

[b]Minor problems and issues[/b]

1) If the noise is very low (or nonexistent, as with data created from a purely functional relationship), you need to play around with the convergence criteria. I also found it helps to fit a functional approximation to the smooth curves and then cycle around again, using the transformed data as new input.

2) As the algorithm proceeds, regions where it fits badly tend to get strongly transformed so that they have less impact on the correlation. For example, if it fits large values badly, it will tend to use, say, an inverse transformation, making the large values and their large errors simultaneously smaller. You can combat this by adjusting the weight given to each point as the data gets transformed (I will explain what to do here at some later time). This won't stop it using a strong transformation, but it stops it thinking that doing so fixes the problem.

[b]What next?[/b]

In a later post or posts I will put up an example or two showing what's involved, from equation to nomogram, and describe how to get the software.

[b]References[/b]

[1] Breiman, L. and Friedman, J.H. (1985), Estimating Optimal Transformations for Multiple Regression and Correlation, Journal of the American Statistical Association, 80(391), 580-598.

## Additive models and parallel nomograms
