TOP: Introduction

FORWARD: Laser guide stars

- 3.1. Requirements to wave-front sensing
- 3.2. Shack-Hartmann WFS
- 3.3. Curvature sensors
- 3.4. Other wave-front sensors
- 3.5. Wave-front reconstruction

3.1. Requirements to wave-front sensing

The problem of measuring wave-front distortions is common in optics (e.g. in the fabrication and control of telescope mirrors) and is typically solved with the help of interferometers. Why not use standard laser interferometers in Adaptive Optics Wave-Front Sensors (WFSs)?

First, an AO system must use the light of stars passing through the
turbulent atmosphere to measure the wave-fronts, hence it must work with
incoherent (and sometimes non-point) sources. Even laser guide stars are
not coherent enough to work in typical interferometers. **A WFS must
work on white-light incoherent sources.**

Second, the interference fringes are chromatic. We cannot afford to
filter the stellar light, because we want to use faint stars. **A
WFS must use the photons very efficiently.**

Third, interferometers have an intrinsic phase ambiguity of 2\pi,
whereas atmospheric phase distortions typically exceed 2\pi. **
The WFS must be linear over the full range of atmospheric
distortions.** There are algorithms to "un-wrap" the phase and to
remove this ambiguity, but they are slow, while atmospheric turbulence
evolves fast, on a millisecond time scale: **A WFS must be fast.**

These requirements are fulfilled in several existing WFS concepts. Each WFS consists of the following main components:

- **Optical device** transforms the aberrations into light intensity variations (unlike radio waves, the phase of optical waves cannot be measured directly, for a very fundamental physical reason related to the quantum nature of light). The optical part determines the WFS linearity and response.
- **Detector** transforms the light intensity into an electrical signal. This signal has an intrinsic noise due to the photon nature of light, but may also contain a contribution of detector noise. Light integration in the detector causes a delay in the control loop, which limits the servo bandwidth.
- **Reconstructor** is needed to convert the signals into phase aberrations. The computation must be fast enough, which means, in practice, that only linear reconstructors are useful. A linear reconstructor typically performs a matrix multiplication.

Needless to say, any real WFS has a finite spatial resolution,
which must match the size of the correcting elements (e.g. the
inter-actuator spacing of the DM). Wave-front distortions of smaller
size are not sensed. However, they influence the WFS signal, causing
the so-called *aliasing error* (like the aliasing error in temporal
signals with finite sampling, see the Figure). The turbulence spectrum
decreases at high spatial frequencies, hence the aliasing error is often
of little importance compared to other AO errors, e.g. the fitting error.

3.2. Shack-Hartmann WFS

The well-known Hartmann test, devised initially for the control of
telescope optics, was adapted for AO and is the most frequently used
type of WFS. An image of the exit pupil is projected onto a *lenslet
array* - a collection of small identical lenses. Each lens takes a
small part of the aperture, called a *sub-pupil*, and forms an image
of the source. All images are formed on the same detector, typically a
CCD.

When the incoming wave-front is plane, all images are located on a
regular grid defined by the lenslet-array geometry. As soon as the
wave-front is distorted, the images become displaced from their
nominal positions. Displacements of the image centroids in two
orthogonal directions are proportional to the average wave-front
slopes over the sub-apertures. Thus, a Shack-Hartmann (S-H) WFS
*measures the wave-front slopes*. The wave-front itself is
reconstructed from the arrays of measured slopes, up to a constant
which is of no importance for imaging. The resolution of a S-H WFS is
equal to the sub-aperture size.

**Question:** What is the maximum angular size of the source at which
images from adjacent sub-apertures begin to overlap? Take a lenslet size
of 0.5 mm and a focal distance of 50 mm. Will this lenslet array be
adequate for an AO system with sub-aperture size d = 1 m?

**Question:** Estimate the r.m.s. slopes of the wave-fronts on the
sub-apertures as a function of sub-aperture size d and r_0 (use
the coefficients of atmospheric tip and tilt from Sect. 1.10). Compute
for d = 1 m and 1 arcsecond seeing.

A good feature of the S-H WFS is that it is completely achromatic: the
slopes do not depend on the wavelength. It can also work on non-point
(extended) sources. If \varphi(x,y) is the wave-front phase, the
*x*-slope measured by a S-H WFS over a sub-aperture of area S is the
average phase gradient,

s_x = \frac{\lambda}{2\pi S} \iint_S \frac{\partial \varphi}{\partial x} \, dx \, dy ,    (1)

which coincides with the centroid of the sub-aperture image intensity I(x,y),

s_x = \frac{\iint x \, I(x,y) \, dx \, dy}{\iint I(x,y) \, dx \, dy} .    (2)
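As a numerical sketch (not part of the source), the centroid of Eq. (2) can be computed for each sub-aperture image; the function name and the test image below are illustrative:

```python
import numpy as np

def centroid_slopes(img):
    """Centroid (cx, cy) of a sub-aperture image, in pixel units,
    measured from the geometric center of the array (discrete Eq. 2)."""
    img = np.asarray(img, dtype=float)
    total = img.sum()
    ny, nx = img.shape
    x = np.arange(nx) - (nx - 1) / 2.0   # pixel coordinates from the center
    y = np.arange(ny) - (ny - 1) / 2.0
    cx = (img.sum(axis=0) * x).sum() / total
    cy = (img.sum(axis=1) * y).sum() / total
    return float(cx), float(cy)

# A single bright pixel two columns right of center on a 9x9 grid:
img = np.zeros((9, 9))
img[4, 6] = 1.0
print(centroid_slopes(img))  # -> (2.0, 0.0)
```

In a real S-H WFS this centroid (in pixels) is converted to an angular slope through the pixel scale of the lenslet camera.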

Now the error of slope measurement which arises from the photon noise
will be estimated. Let \rho (in radians) be the angular radius of the
image formed by each sub-aperture. For extended sources, \rho is equal
to the source size (more precisely, to the dispersion of the intensity
distribution around the center). For point sources, \rho \approx \lambda/d
if the sub-apertures of size d are smaller than r_0 (diffraction-limited
images), or \rho \approx \lambda/r_0 for large sub-apertures (image size
determined by the atmospheric blur). The image intensity distribution
can be regarded as a probability density distribution of the arriving
photons. Hence, each arriving photon permits us to determine the image
position with an error of \rho. When N photons are detected
during the exposure time, the photon error of the centroid position (i.e.
slope) becomes \sigma = \rho / \sqrt{N}, as after repeating the same
measurement N times.
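The \sqrt{N} scaling can be checked with a toy Monte-Carlo experiment (an illustration, not from the source): draw N photon positions from a Gaussian image of width \rho and look at the scatter of the resulting centroids.

```python
import numpy as np

rng = np.random.default_rng(42)
rho = 1.0        # image radius (arbitrary angular units)
N = 400          # photons detected per exposure
trials = 2000    # independent exposures

# Each exposure: the centroid of N photon positions drawn from the image
# profile, treated as a probability density of arriving photons
centroids = rng.normal(0.0, rho, size=(trials, N)).mean(axis=1)

sigma_measured = centroids.std()
sigma_expected = rho / np.sqrt(N)
print(sigma_measured, sigma_expected)  # both close to 0.05
```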

In the photometric band R (wavelength around 600 nm), where modern
detectors are most sensitive, a star of magnitude 0 gives a flux of
8000 photons per second per square centimeter per nanometer of
bandpass (the effective bandpass may reach 300 nm for a good CCD). For a
star of magnitude *m* the flux diminishes by a factor of 10^{0.4 m}.
In calculating the flux available for the WFS detector, the
optical transmission must be taken into account.
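Putting these numbers together (a sketch: the zero point of 8000 photons/s/cm²/nm, the 300 nm bandpass, and the transmission and quantum-efficiency figures come from the text; the function name and the 10th-magnitude example are illustrative):

```python
def detected_photons(m, t_s, d_m, bandpass_nm=300.0,
                     transmission=0.3, qe=0.6, zero_point=8000.0):
    """Photons detected per square sub-aperture of side d_m (meters) in an
    exposure of t_s seconds from a star of magnitude m.  zero_point is the
    m=0 flux in photons/s/cm^2/nm quoted in the text."""
    area_cm2 = (d_m * 100.0) ** 2
    flux = zero_point * 10.0 ** (-0.4 * m)   # photons / s / cm^2 / nm
    return flux * area_cm2 * bandpass_nm * t_s * transmission * qe

# Illustrative: a 10th-magnitude star, 1 ms exposure, 1 m sub-aperture
print(round(detected_photons(10.0, 1e-3, 1.0)))  # -> 432
```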

**Question:** Compute the number of photons detected in a 1 ms
exposure per sub-aperture of 1 m, available from a star of 15th
magnitude. Assume a total transmission of 0.3 and a quantum efficiency
of 0.6.

It is generally agreed to express all wave-front errors in radians. We multiply the slope error by 2\pi d / \lambda to obtain the variance of the phase difference between the edges of a sub-aperture, in square radians:

\sigma^2 = \left( \frac{2\pi d}{\lambda} \right)^2 \frac{\rho^2}{N} .    (3)

**Question:** How many photons per exposure are needed to achieve a
1 radian photon error in a S-H WFS with d = r_0? Assume that
imaging and sensing are done at the same wavelength.

The error of the reconstructed wave-fronts is proportional to the
measurement error \sigma, with a coefficient called the *noise
propagation* coefficient. It is known that for a S-H WFS the noise
propagation is of the order of 1 and increases only slowly with the
number of elements (the slopes are integrated in the reconstructor,
so noise is not amplified).

The photon flux is proportional to the square of the sub-aperture size d. This means that, for a given \rho, the photon error of a S-H WFS is independent of the size of its sub-apertures. This conclusion applies only to an ideal detector; in real systems with CCDs (e.g. NAOS at the VLT), larger sub-apertures are selected for fainter guide stars.

How many detector pixels must be allocated to each sub-aperture? In
order to compute the centroids accurately, the individual images must
be well sampled, with more than 4x4 pixels per sub-aperture. However,
each pixel of a CCD detector contributes readout noise, which dominates
the photon noise for the faintest guide stars. Thus, in some designs
(e.g.
Altair for Gemini-North) there are only 2x2 pixels per
sub-aperture. In this case each element works as a *quad cell*, and
the *x,y* slopes are deduced from the intensity ratios of the four
pixels I_1 ... I_4 (pixels 1, 2 in the top row, pixels 1, 3 in the left
column):

s_x = \rho \, \frac{(I_2 + I_4) - (I_1 + I_3)}{I_1 + I_2 + I_3 + I_4} , \qquad s_y = \rho \, \frac{(I_1 + I_2) - (I_3 + I_4)}{I_1 + I_2 + I_3 + I_4} .    (4)

The response of a quad-cell slope detector is linear only for slopes less than \rho, and its response coefficient is inversely proportional to \rho (hence it may vary, depending on the seeing or on the object size). This is the price to pay for the increased sensitivity, which is of major importance to astronomers.
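A minimal quad-cell sketch (not from the source; the pixel numbering 1 2 / 3 4 is an assumption, since conventions vary):

```python
def quad_cell_slopes(I1, I2, I3, I4, rho=1.0):
    """x,y slopes from a 2x2 quad cell, in units of the image radius rho.
    Assumed pixel layout:  1 2   (top row)
                           3 4   (bottom row)"""
    total = I1 + I2 + I3 + I4
    sx = rho * ((I2 + I4) - (I1 + I3)) / total  # right minus left
    sy = rho * ((I1 + I2) - (I3 + I4)) / total  # top minus bottom
    return sx, sy

# A spot fully in the right column gives the maximum (saturated) x-signal:
print(quad_cell_slopes(0.0, 0.5, 0.0, 0.5))  # -> (1.0, 0.0)
```

The saturation at ±\rho illustrates why the quad-cell response is linear only for slopes smaller than the image radius.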

**Question:** What shape of the guide-star image is needed to
achieve an exactly linear response curve of a quad cell?

The S-H WFSs are very common because they rely on proven technology and solid experience, and are compact and stable. These WFSs require a calibration of the nominal spot positions, which is achieved by imaging an artificial point source.

3.3. Curvature sensors

Curvature wave-front sensing has been developed by F. Roddier since 1988. His idea was to couple a curvature sensor (CS) and a bimorph DM directly, without any need for intermediate calculations (although in practice nobody does this).

Let I_1(x) be the light intensity distribution in the intra-focal stellar image, defocused by some distance l, and I_2(x) the corresponding intensity distribution in the extra-focal image. Here x is the coordinate in the image plane and F is the focal distance of the telescope. These two images are like pupil images reduced by a factor of l/F. In the geometrical-optics approximation, a local wave-front curvature makes one image brighter and the other one dimmer; the normalized intensity difference is written as

\frac{I_1(x) - I_2(-x)}{I_1(x) + I_2(-x)} = \frac{\lambda F (F - l)}{2\pi l} \left[ \frac{\partial \varphi}{\partial n} \, \delta_c - \nabla^2 \varphi \right] ,    (5)

where \nabla^2 \varphi is the wave-front Laplacian (curvature) and the edge term \partial\varphi/\partial n \, \delta_c accounts for the radial wave-front gradient at the pupil boundary.

**Question:** Draw the pairs of intra- and extra-focal images for
Zernike aberrations from number 2 to 6. Hint: see the defocused images
for the aberrations from astigmatism to number 12.

For a source of finite angular size \theta, the intra- and extra-focal images are blurred by the amount \theta F. The blur must be less than the projected size of a sub-aperture, d\,l/F:

\theta F < d \, l / F ,    (6)

hence the defocusing distance must satisfy

l > \theta F^2 / d .    (7)

Larger de-focusing is needed to measure the wave-front with higher resolution (smaller d), and the sensitivity of the CS is reduced accordingly. This means that a CS may have problems sensing high-order aberrations.
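For a feel of the numbers, the defocusing condition of Eq. (7) can be evaluated directly (a sketch; the numerical values in the example are illustrative, not from the source):

```python
import math

def min_defocus(theta_arcsec, F_m, d_m):
    """Minimum defocusing distance l (meters) from Eq. (7), l > theta F^2 / d,
    for a source of angular size theta (arcseconds), telescope focal
    distance F_m and sub-aperture size d_m (both in meters)."""
    theta_rad = theta_arcsec * math.pi / (180.0 * 3600.0)
    return theta_rad * F_m ** 2 / d_m

# Illustrative: 1 arcsecond blur, F = 120 m focal distance, d = 1 m
print(round(min_defocus(1.0, 120.0, 1.0), 4))  # -> 0.0698
```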

For point sources and large sub-apertures (the case of practical interest) the blur is defined by the atmospheric aberrations, \theta \approx \lambda / r_0, as in the S-H WFS. If the AO system works in closed loop and the residual aberrations (at the sensing wavelength) become small, the blur is reduced to \lambda / d, permitting a smaller de-focusing and a gain in sensitivity. This feature is actually used, to a limited extent, in real AO systems: the de-focusing is reduced once the loop is closed.

The high-frequency wave-front distortions (smaller than the sub-aperture
size) have a power spectrum (variance of Fourier amplitudes) proportional
to f^{-11/3}, but their *curvature* spectrum is proportional to
f^{+1/3} and may cause a large *aliasing error*. To prevent this,
the signal must be smoothed before being sub-divided into sub-apertures
(sampled). Smoothing is achieved by decreasing the defocusing l, which
also increases the sensitivity. In short, the choice of l in a CS is
critical and must be adjusted to the varying seeing conditions. The
signal of a CS is only a more or less crude approximation of the true
wave-front curvature...

We give, without derivation, the formula for the phase variance due to
photon noise in a CS when the defocusing is adjusted to its optimum
value:

\sigma^2 \approx \pi^2 \left( \frac{d}{r_0} \right)^2 \frac{1}{N} ,    (8)

where N is the number of photons detected per sub-aperture.

The scale of the intra- and extra-focal images depends on the defocusing l, which must be changed during operation. This is not convenient; in fact, the curvature signal is detected in a pupil image with fixed scale, while the amount of de-focusing is adjusted by a special optical element (see below). The outer sub-apertures project onto the pupil boundary; their signal provides information on the radial phase gradients, including global tip and tilt (see the Figure).

The CSs that actually work in astronomical AO systems (e.g. in
PUEO and
Hokupa'a)
use Avalanche Photo-Diodes (APDs) as light detectors. These are
single-pixel devices, like photo-multipliers. The individual photons
are detected and converted to electrical pulses with no readout noise
and a small dark count; the maximum quantum efficiency is around
60%. Individual segments of the pupil are isolated by a lenslet array
(which typically matches the radial geometry of the bimorph DM),
then the light from each segment is focused and transmitted to the
corresponding APD via an optical fiber. The number of APDs is equal to
the number of segments. The outer segments sample the edge of the
aperture, and their signals are proportional to the wave-front
gradients along the normal.

APDs are bulky and expensive, hence this design is suitable only for low-order systems. In order to use only one detector per sub-aperture, the intra- and extra-focal images are switched in time and directed to the same APD, and the signal is then de-modulated in the wave-front computer. The focus modulation is done by placing an oscillating membrane mirror in the focal plane (typical frequency 2 kHz). The defocusing is inversely proportional to the amplitude of the membrane oscillation, which is adjusted to the varying seeing conditions and can be reduced once the AO loop is closed, increasing the sensitivity of the CS. Some useful turbulence compensation was achieved even with signals as low as 1 photon per sub-aperture per loop cycle!

An alternative solution would be to use CCDs as light detectors in a CS. This has been discussed for a long time, but not yet implemented in real systems. The drawback of CCDs is their readout noise, which becomes the dominant noise source at low light levels. Special CCDs have been developed at ESO that permit multiple modulation cycles per single readout.

**Question:** Suppose that a CCD with 5 electrons readout noise is
used in the WFS. How large must the number of detected photons be to
make the readout noise smaller than the photon noise?

3.4. Other wave-front sensors

The problems of interferometric wave-front measurement can be overcome
when the interfering beams represent wave-fronts with a small lateral
shift (this is called a **shearing interferometer**). If
the shear s is less than r_0, the phase differences are less than one
wavelength, and there is no ambiguity. The light intensity in
the interferogram is

I(x) \propto 1 + \cos\left[ \varphi(x + s) - \varphi(x) \right] .    (9)

For small shifts the phase difference is proportional to the first
derivative (slope), hence the signal of a shearing interferometer is
similar to that of a S-H WFS. Two shears in orthogonal directions
are needed to measure the *x,y* slopes. The first successful AO system
(RTAC) used a WFS based on a shearing interferometer, but this
approach is now completely abandoned in favor of the S-H WFS.
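The small-shear limit of Eq. (9) can be illustrated numerically (a toy sketch, not from the source; the sinusoidal phase screen and shear value are arbitrary): the phase difference extracted from the fringes approaches s times the local slope d\varphi/dx.

```python
import numpy as np

# Toy 1-D phase screen: a smooth low-order aberration (illustrative)
x = np.linspace(0.0, 1.0, 1000)
phi = 0.3 * np.sin(2 * np.pi * x)              # phase in radians
shear = 0.01                                   # lateral shift, units of x
phi_shifted = 0.3 * np.sin(2 * np.pi * (x + shear))

# Interferogram intensity, Eq. (9)
I = 1.0 + np.cos(phi_shifted - phi)

# For small shear, (phi(x+s) - phi(x)) / s approaches the true slope
slope_estimate = (phi_shifted - phi) / shear
true_slope = 0.3 * 2 * np.pi * np.cos(2 * np.pi * x)
print(np.abs(slope_estimate - true_slope).max())  # small, ~0.06
```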

**Question:** Estimate the maximum shear s that preserves a
linear response of the shearing interferometer under given seeing
conditions (given r_0).

Other types of **interferometers** have been suggested for wave-front
sensing. Some of them can provide signals directly proportional to the
phase (thus needing no reconstructor), although over a limited
dynamical range. Such solutions can be interesting for correcting
high-order residual aberrations (e.g. in AO systems with a very high
degree of compensation, as needed for detecting extra-solar planets).

The **pyramid WFS** (P-WFS) is being developed by Italian
astronomers. A transparent pyramid is placed in the focal plane and
dissects the stellar image into four parts. Each beam is deflected, and
the four beams form four images of the telescope pupil on the same CCD
detector. Thus, each sub-aperture is detected by 4 CCD pixels. This
optical setup is similar to the Foucault knife-edge test.

Let us suppose that the light source is extended, and use geometrical optics. A wave-front slope at some sub-aperture changes the source position on the pyramid, hence changes the light fluxes detected by the 4 pixels, which would otherwise be equal. By computing the normalized intensity differences we get two signals proportional to the wave-front slopes in two directions. The sensitivity of a P-WFS depends on the source size \theta. A P-WFS can be viewed as an array of quad cells and is similar to a S-H WFS.

What happens when a point source (star) is used and diffraction effects are taken into account? The intensity distributions in the four pupil images become complicated, non-linear functions of the wave-front shape, and the P-WFS no longer measures slopes. In the case of weak aberrations (amplitude much less than \lambda) the wave-front shape can still be reconstructed, although in a more complex way. In order to retrieve the linearity, the star is rapidly moved over the pyramid edge (e.g. in a circular pattern), creating a ring-shaped source. This is not modulation (as in the CS), but simply a smearing of the point source, because the signal is integrated over one or more wobble cycles.

**Question:** Draw the four pupil images in a P-WFS for the case of
defocusing (Zernike mode number 6).

What are the advantages of a P-WFS? First, there is no lenslet array; the sub-apertures are defined by the detector pixels. This means that for faint stars the number of sub-apertures can be reduced simply by binning the CCD. Second, the amplitude of the star wobble can be adjusted as a trade-off between sensitivity (smaller wobble) and linearity (larger wobble). At small amplitudes the sensitivity of a P-WFS can be higher than that of a S-H WFS (see Astron. Astrophys. V. 369, P. L9, 2001). Finally, it is possible (at least in principle) to place several pyramids in the focal plane, in order to combine the light from several faint guide stars on a single detector. Despite the general interest in the P-WFS, there are as yet no working AO systems with this kind of WFS.

The phase can be retrieved from the analysis of two simultaneous
images of a star, one in focus and the other defocused (or, more
generally, with some known aberration). This approach is called **
phase diversity**. The algorithm is non-linear (hence slow?); the
advantages of its application to AO are not yet clear.

The "ideal" WFS has not yet been invented. There is no general theorem stating the absolute sensitivity limit of a WFS due to photon noise. Instead, we have several empirical solutions; we optimize their parameters and choose the best among the available options.

3.5. Wave-front reconstruction

In this section, the problem of computing the wave-front shape from the data provided by a WFS is addressed in a general way.

The measurements (WFS data) can be represented by a vector s (its
length is twice the number of sub-apertures *N* for a S-H WFS,
because slopes in two directions are measured, and equal to *N*
for a CS). The unknowns (wave-front) form a vector \varphi, which can be
specified as phase values on a grid or, more frequently, as Zernike
coefficients. It is supposed that the relation between the
measurements and the unknowns is linear, at least to the first
approximation. The most general form of a linear relation is a
multiplication by an interaction matrix A,

s = A \varphi .    (10)

A **reconstructor** matrix B performs the inverse operation,
retrieving the wave-front vector from the measurements:

\varphi = B s .    (11)

**Question:** For a given number of sub-apertures *N*, estimate
the number of arithmetic operations needed to reconstruct the phase. How
does it depend on the imaging wavelength (for a given Strehl ratio)?

The number of measurements is typically larger than the number of unknowns, so a least-squares solution is useful. In the least-squares approach we look for the phase vector that best matches the data. The resulting reconstructor is

B = (A^T A)^{-1} A^T .    (12)

In almost all cases the matrix inversion presents problems because the
matrix is *singular*. This means that some parameters (or
combinations of parameters) are not constrained by the data. For
example, we cannot determine the first Zernike mode (piston) from
slope measurements. In practice the matrix inversion is done by
removing the undetermined (or poorly determined) parameters with the
help of the *Singular Value Decomposition* (SVD) algorithm. In S-H
systems with square geometry, the poorly determined modes typically
include "waffle" (a quasi-periodic deformation at the actuator-grid
frequency).
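A minimal numerical sketch of Eqs. (10)-(12) (illustrative, not the source's code): build a toy interaction matrix for a 1-D chain of sub-apertures measuring finite-difference slopes, invert it with an SVD-based pseudo-inverse (which rejects the unconstrained piston mode), and evaluate the noise propagation Tr(B Bᵀ).

```python
import numpy as np

n = 8                                    # sub-apertures in a 1-D toy model
# Interaction matrix A (Eq. 10): each sub-aperture measures a finite
# difference of the phase, s_i = phi_{i+1} - phi_i
A = np.zeros((n, n + 1))
for i in range(n):
    A[i, i], A[i, i + 1] = -1.0, 1.0

# Least-squares reconstructor via SVD (Eq. 12); the zero singular value
# of the piston mode (unconstrained by slopes) is rejected automatically
B = np.linalg.pinv(A)

rng = np.random.default_rng(1)
phi_true = rng.normal(size=n + 1)
s = A @ phi_true                         # measurements, Eq. (10)
phi_rec = B @ s                          # reconstruction, Eq. (11)

# The wave-front is recovered up to an arbitrary piston (remove the mean)
err = np.abs((phi_rec - phi_rec.mean()) - (phi_true - phi_true.mean())).max()
print(err < 1e-10)  # -> True

# Noise propagation coefficient Tr(B B^T): it grows along this 1-D chain;
# 2-D S-H geometries behave better, with only a slow growth
print(np.trace(B @ B.T))
```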

How many Zernike modes can be reconstructed with a S-H WFS having N
sub-apertures? At first sight, up to 2N, since 2N slopes are measured.
In fact, only about N, because the *x,y* slopes are not completely
independent: they are redundant. For a CS, the maximum number of modes
is also N.

The least-squares reconstructor is not the best one. It is known from
statistical textbooks that by using *a priori* information on
the signal properties a better reconstruction can be achieved. In the
case of AO, this information is the statistics of the wave-front
perturbations (e.g. the covariance of the Zernike modes) and the
statistics of the WFS noise. Looking for the solution that gives the
minimum expected residual phase variance (hence maximum Strehl ratio),
we obtain a reconstructor matrix which is similar to a Wiener filter.

In the case of one-dimensional signals, the Wiener filter in frequency
space is written as

W(f) = \frac{P_s(f)}{P_s(f) + P_n(f)} ,    (13)

where P_s(f) is the power spectrum of the signal and P_n(f) that of the noise.
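A 1-D sketch of Eq. (13) (an illustration, not from the source; the f^{-11/3} signal spectrum mimics the atmospheric phase spectrum quoted earlier, and the white-noise level is an arbitrary assumption):

```python
import numpy as np

f = np.linspace(0.01, 10.0, 1000)   # spatial frequency, arbitrary units
P_signal = f ** (-11.0 / 3.0)       # Kolmogorov-like phase power spectrum
P_noise = 1.0                       # white measurement noise (assumed level)

# Wiener filter, Eq. (13): ~1 where the signal dominates the noise,
# rolling off to ~0 where the noise dominates
W = P_signal / (P_signal + P_noise)

# The filter crosses 0.5 where P_signal = P_noise; beyond this frequency
# the aberrations are effectively left uncompensated
f_max = P_noise ** (-3.0 / 11.0)
print(round(f_max, 3))  # -> 1.0
```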

**Question:** The spatial power spectrum of the slope errors is white
(independent of frequency *f*), and the power spectrum of
atmospheric tilts is proportional to f^{-5/3}. How does the maximum
frequency of the compensated aberrations depend on the noise level
P_n?

In AO systems the expressions for the *minimal variance* reconstructor
involve the interaction matrix and the covariance matrices of the noise
and of the atmospheric perturbations. Similar results are obtained using
other statistical approaches (maximum likelihood or maximum *a
posteriori* probability).

For any reconstructor B, the noise variance of the reconstructed phase
is

\langle |\delta\varphi|^2 \rangle = \sigma_s^2 \, \mathrm{Tr}(B B^T) ,    (14)

where \sigma_s^2 is the variance of the (uncorrelated) measurement noise.

**Summary.** The wave-front sensor is the most critical part of
astronomical AO systems because guide stars are often faint, limiting
the achievable degree of turbulence compensation. The two most common
WFS concepts, Shack-Hartmann and curvature, were studied. For both of
them we can compute the photon error and estimate the error of the
reconstructed wave-fronts as a function of guide-star magnitude and
system parameters. The basic ideas of wave-front reconstruction were
introduced without going into much detail.
