TOP: Introduction

- 5.1. Why MCAO?
- 5.2. Correction-limited field-of-view
- 5.3. Turbulence tomography
- 5.4. Tilt problem in MCAO
- 5.5. Modal MCAO systems
- 5.6. Layer-oriented MCAO
- 5.7. MCAO: the near future

Multi-Conjugate Adaptive Optics (MCAO) is a further development of the
original AO concept. It consists of correcting the turbulence in three
dimensions with more than one deformable mirror (DM). Each DM (see
the Figure) is optically conjugated to a certain distance from the
telescope. We call this distance the *conjugation altitude*, although the
term *range* would be more correct. The benefit of MCAO is
reduced anisoplanatism, hence a larger compensated
field-of-view (FoV).

In order to avoid vignetting, the projected diameter of the DMs must
be equal to $D + 2 \Theta H$, where $D$ is the telescope diameter,
$\Theta$ is the radius of the FoV
and *H* is the conjugation altitude. Hence, the upper DMs must be
larger than the telescope pupil; they are called
*meta-pupils*. The beams from scientific objects and guide stars (GSs)
do not sample the whole meta-pupil, but have smaller *footprints*.

**Question:** What is the meta-pupil diameter for an 8 m telescope
with DM2 conjugated to 8 km and a FoV diameter of 2 arcminutes? What are
the diameters of the beam footprints for a sodium LGS?
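
The geometry behind this question can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the meta-pupil is taken as the telescope diameter plus twice the FoV radius times the conjugation range, and the 90 km range of the sodium layer is an assumed typical value that is not given in the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def meta_pupil_diameter(d_tel_m, fov_radius_arcsec, h_conj_m):
    """Unvignetted meta-pupil diameter D + 2*Theta*H, in metres."""
    return d_tel_m + 2.0 * fov_radius_arcsec * ARCSEC * h_conj_m


def footprint_diameter(d_tel_m, h_conj_m, source_range_m=math.inf):
    """Beam footprint at range h_conj_m for a source at source_range_m.

    A star at infinity gives a cylindrical beam (footprint equal to D);
    a beacon at finite range gives a cone that shrinks as 1 - H / H_source.
    """
    return d_tel_m * (1.0 - h_conj_m / source_range_m)


# Numbers from the question: 8 m telescope, DM2 at 8 km, FoV radius 1 arcmin.
print(meta_pupil_diameter(8.0, 60.0, 8000.0))   # meta-pupil, metres
print(footprint_diameter(8.0, 8000.0))          # natural star footprint
print(footprint_diameter(8.0, 8000.0, 90e3))    # sodium LGS, assumed at 90 km
```

With these assumptions the meta-pupil comes out a little under 13 m, while the LGS footprint is somewhat smaller than the 8 m footprint of a natural star.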

The signals driving these DMs are obtained from several WFSs, each
observing its own guide star (GS). The information from the WFSs is
processed by a *reconstructor* in order to retrieve the
3-dimensional instantaneous wave-front perturbations, as in
medical tomography, where the 3D structure of an object is derived by
viewing it from different angles. In our context the technique is
called *turbulence tomography*; the reconstruction is done by matrix
multiplication.

Tomography is useful even with only one DM, because it makes it possible to
infer the compensation signal for a target which is at a large angular
distance from the GSs. In this way a better compensation quality is
achieved compared to the use of just one GS, and the sky coverage of
an AO system with natural GSs is improved. With LGSs, tomography helps
to reduce the cone effect: solving for several turbulent layers from
the signals of several GSs makes it possible to combine these layers
correctly (without stretching or missing portions) to achieve the
best compensation for a selected target. Thus, tomography can be used
without MCAO, but MCAO would not work without tomography.

The currently mounting interest in tomography and MCAO is directly
related to the prospect of turbulence correction at large
telescopes (LTs) and at future *Extremely Large Telescopes* (ELTs)
over the whole optical and IR range. The logic leading to this
conclusion is shown in this scheme and is
as follows:

- We have seen that NGSs are not suitable for the whole-sky correction in the visible (sky coverage is practically zero). Hence, LGSs are needed.
- Because of the cone effect, LGSs would not work in the visible range at LTs (nor in the IR at ELTs). Hence, multiple LGSs and tomography are needed.
- Even with LGSs there is a problem of sky coverage because of tip-tilt correction. Hence, the corrected FoV must be increased by MCAO to improve the coverage.
- The uniformity of correction with MCAO will be much better and the size of FoV larger, increasing the efficiency of AO systems and their usefulness to astronomy. However strange it might seem, this argument (i.e. the quality of the scientific results) is usually the last on the list.

The enthusiasm about MCAO and a few erroneous papers contributed to the
impression that this is a magic solution for a complete removal of
turbulence effects. In the following we provide some realistic
estimates of the large (although still finite) gain brought by MCAO and of
its associated problems.

Suppose that by some magic the instantaneous perturbations in all
atmospheric layers are known, and that we have at our disposal a
finite number *M* of ideal deformable mirrors. How large a FoV can
be corrected? By answering this question, we obtain the size of the
*correction-limited* FoV.

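It was shown (JOSA, V. A17, P. 1819, 2000) that in the limit of a very large telescope the residual phase variance due to anisoplanatism, $\sigma_\theta^2$, is given by a familiar expression

$$\sigma_\theta^2 = \left( \theta / \theta_M \right)^{5/3}, \qquad (1)$$

where $\theta$ is the field angle and $\theta_M$ is the isoplanatic angle of a system with *M* DMs.

In order to compute $\theta_M$, the turbulence altitude profile
must be known. The profile must be multiplied by some
*weighting function F(h)* and integrated over altitude to obtain the
residual phase error, hence to obtain $\theta_M$.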
The weighting function is zero at the conjugate altitudes of the DMs
(the corresponding layers are perfectly corrected by these ideal DMs), and
positive at intermediate altitudes. These functions are plotted in the
Figure for the classical AO (*F0*, one DM conjugated to zero
altitude), for the case when one DM is conjugated to 5 km
(*F1*), as in the Gemini-N Altair AO system, and for
the case of an MCAO with 2 DMs (*F2*) conjugated to 2 km and 10 km.

The 2-DM curve was obtained by assuming that each intermediate layer
is corrected by *both* DMs, and that the correction is shared between
the DMs in an optimum proportion. This strategy gives better
results than a simple "allocation" of turbulent slabs to be
corrected by the adjacent DMs, although an order-of-magnitude FoV size
can be estimated crudely by summing up the anisoplanatic effects
of all slabs. The same strategy of optimally shared corrections
applies to more than 2 DMs.

For any particular turbulence profile, the conjugation altitudes of the DMs that result in the largest FoV can be found. When there are strong turbulent layers in the atmosphere, it is advantageous to conjugate the DMs to these layers. However, in all cases a significant fraction of the turbulence is distributed continuously at all altitudes (see the profile), hence the performance gain obtained for "layered" profiles is small compared to continuous profiles with the same . The optimum conjugation altitudes for 1, 2, and 3 DMs are shown here for a particular profile. The compensation quality is only a weak function of the exact DM conjugation altitudes.

**Question:** What would be the
compensated FoV size in an ideal MCAO system considered here with 2
DMs and *all* turbulence in 2 thin layers?

The actual gain in the FoV size brought by increasing the number of
DMs was computed for 12 profiles at Cerro Paranal: 4-5 times with
2 DMs, 7-10 times with 3 DMs. Increasing the number of DMs further
brings smaller benefits, and for a large *M* the corrected FoV
eventually grows in proportion to *M*. This result is intuitively clear for a
continuously distributed turbulence: the thickness of the slabs assigned
to each DM is inversely proportional to *M* (recall that
correcting the first Zernike modes brings the largest gain; here the
situation is somewhat similar). So, to correct a large FoV of diameter
$2\Theta$ with sub-aperture size *d*, we need, very roughly,
$M \sim \Theta H / d$ DMs, where *H* is the thickness of the turbulent atmosphere.

**Question:** Supposing that by using 2 DMs instead of one we widen
the FoV size by 5 times, estimate the achievable FoV diameter at 0.5
and 2.2 microns if the isoplanatic angle $\theta_0$ at 0.5 microns is 2.5 arcseconds.
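
A rough way to work through this estimate is sketched below. It takes the 2.5 arcseconds as the classical isoplanatic angle, treats that angle as a FoV radius (so the diameter is twice as large), and assumes the usual Kolmogorov scaling of the isoplanatic angle with wavelength to the power 6/5. None of these assumptions is stated explicitly in the text, and the function name is hypothetical.

```python
def scale_angle(theta_ref, lam_ref_um, lam_um):
    """Scale an isoplanatic angle as lambda**(6/5) (Kolmogorov turbulence)."""
    return theta_ref * (lam_um / lam_ref_um) ** 1.2


theta0_vis = 2.5   # arcsec at 0.5 micron, from the question
gain_2dm = 5.0     # FoV gain with 2 DMs, from the question

for lam in (0.5, 2.2):
    radius = gain_2dm * scale_angle(theta0_vis, 0.5, lam)
    print(f"{lam} um: radius ~{radius:.1f} arcsec, diameter ~{2 * radius:.0f} arcsec")
```

Under these assumptions the corrected field is a few tens of arcseconds across in the visible and of the order of two arcminutes at 2.2 microns.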

The first works in turbulence tomography aimed at modeling the turbulent atmosphere as a few thin layers and at trying to infer the phase perturbations in these layers from the signals obtained on several GSs (by solving a system of linear equations). In order to do this, the number of unknowns must be less than or equal to the number of measurements, which means at least 1 GS per layer. The layers were identified with DMs, of course.

In reality there is an infinite number of turbulent layers in the
atmosphere and the WFS data are noisy. This calls for statistical
techniques like optimum filtering. What is actually needed is
*not* the reconstruction of the whole turbulent volume, but the best
possible estimate of the compensating signals using the information
actually available from the GSs. This approach is also called
tomography; it was experimentally demonstrated (Nature, V. 403, P. 54,
2000).

Suppose again that the telescope is very big, and that only NGSs are
used (no wave-front stretching). Then the problem can be treated by
Fourier techniques. Each spectral component of the wave-front
distortion (a sinusoidal perturbation) is measured and corrected
separately. As seen in the Figure, the relative spatial shift of the
signals between the two sources separated by an angle $\theta$ is
$\theta H$, where *H* is the distance between the layers. For a
Fourier component with spatial frequency *f* the phase shift is
$2 \pi f \theta H$. The WFSs measure the combined effect (sum) of both
layers, which is different for the two GSs owing to the phase shift.
From these two signals it is possible to reconstruct the two layers by
solving the algebraic system of two equations. However, when the phase
shift is exactly $2\pi$, the two signals become identical and the
system cannot be solved. This happens at the critical frequency
$f_c = 1/(\theta H)$.
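
The two-equation system described above can be illustrated with a toy model for a single Fourier component. The sketch assumes two thin layers with complex amplitudes `a1` and `a2`, one guide star on axis and one at angle `theta`, and a phase shift of 2πfθH on the upper layer as seen by the second star. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def phase_shift(f, theta, H):
    """Phase shift (radians) of the upper layer between the two guide stars."""
    return 2.0 * math.pi * f * theta * H


def measure(a1, a2, f, theta, H):
    """Signals of the two WFSs: each sees the sum of both layers."""
    phi = phase_shift(f, theta, H)
    return a1 + a2, a1 + a2 * cmath.exp(1j * phi)


def reconstruct(sA, sB, f, theta, H):
    """Solve the 2x2 system for the layer amplitudes (a1, a2)."""
    det = cmath.exp(1j * phase_shift(f, theta, H)) - 1.0
    if abs(det) < 1e-9:
        raise ValueError("layers cannot be separated at this frequency")
    a2 = (sB - sA) / det
    return sA - a2, a2


theta, H = 1e-4, 5000.0        # about 20 arcsec separation, 5 km between layers
f_c = 1.0 / (theta * H)        # critical frequency, cycles per metre
sA, sB = measure(1 + 2j, -0.5 + 0.3j, 0.25 * f_c, theta, H)
print(reconstruct(sA, sB, 0.25 * f_c, theta, H))   # recovers both amplitudes
try:
    reconstruct(sA, sB, f_c, theta, H)
except ValueError as err:
    print("at f_c:", err)
```

The determinant of the system also vanishes at zero frequency, where the two stars see no relative shift at all; the toy model fails there in the same way.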

If there is turbulence between the layers, the situation remains
qualitatively similar. For phase shifts of more than 1 radian, the
Fourier components from different GSs become de-correlated and the
achievable degree of turbulence compensation diminishes. It means
that in order to correct small-scale perturbations (large *f*),
the distance between guide stars (i.e. FoV size) must become
smaller. When the whole atmosphere becomes thinner (smaller *H*),
the corrected FoV opens up.

**Question:** Estimate the maximum FoV that can be corrected with
sub-apertures of 1 m for a turbulence thickness of *H*=5 km.
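
One possible way to approach this estimate is sketched here. It assumes that the finest corrected scale corresponds to a spatial frequency of 1/(2d), and it evaluates the guide-star separation at which the phase shift 2πfθH reaches either 1 radian (onset of de-correlation) or 2π (the critical frequency). Both the sampling assumption and the function name are illustrative, not taken from the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def separation_for_phase_shift(d_m, thickness_m, max_shift_rad=1.0):
    """GS separation (radians) at which 2*pi*f*theta*H equals max_shift_rad.

    Assumes the finest corrected scale has spatial frequency f = 1/(2*d).
    """
    f = 1.0 / (2.0 * d_m)
    return max_shift_rad / (2.0 * math.pi * f * thickness_m)


theta_1rad = separation_for_phase_shift(1.0, 5000.0)
theta_crit = separation_for_phase_shift(1.0, 5000.0, 2.0 * math.pi)
print(f"1-radian criterion: {theta_1rad / ARCSEC:.0f} arcsec")
print(f"critical frequency: {theta_crit / ARCSEC:.0f} arcsec")
```

The two criteria bracket the answer between roughly ten arcseconds and somewhat more than one arcminute.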

**Question:** For a uniform turbulence distribution of *H*=5 km
thickness, estimate the number of DMs and GSs needed to correct a FoV
of 5 arcminute diameter in the visible (sub-aperture size *d*=0.3
m).

These considerations lead to a formula (JOSA V. A18, P. 873, 2001)
expressing the residual error of a wave-front of some scientific
target that can be reconstructed using a constellation of * K*
bright NGSs at some radius around the target:

(2) |

Using more GSs, we reduce the apparent "thickness" of atmosphere and open up the tomographic FoV. The gain in the FoV size is equal to . It amounts to 10-20 for typical profiles and for 3 and 5 GSs, respectively.

This theory is very general and does not take into account the telescope
diameter, for example. In reality the overlap between the footprints
of GS beams on the high-altitude layers will be incomplete, and some
portions of the layers may not be sampled at all, hence remain
unmeasured. These effects may dominate the total tomographic error
under certain conditions (4 m telescope in the IR range) or may be
insignificant in some other cases (8 m telescope in the visible range
or ELTs). The tomographic patch size provides a
system-independent *lower* limit to the error of phase
reconstruction.

Using several GSs, we collect more photons. Does it mean that the tolerance on the GS brightness can be relaxed and fainter GSs can be used for tomography, as compared to classical AO? The answer depends on the size of reconstructed FoV. If FoV is much smaller than , the GS signals are correlated and, indeed, individual GSs can be fainter than a single GS. On the other hand, if we want to take advantage of the full tomographic FoV, the GSs must be at least as bright (or even brighter) than single GS in the classical AO, because the solution of the tomographic problem leads to noise amplification (like in other inverse problems).

There may not be enough NGSs for tomography, especially for shorter
imaging wavelengths. Moreover, the WFS must be re-configurable for
each telescope pointing, and the command matrix of MCAO must be
updated accordingly. Clearly, LGSs should be a better solution for an
MCAO system (see Gemini MCAO). However, in view of LGS problems, some
researchers think of using several NGSs and tomography to correct the
scientific object, even with a single DM. F. Rigaut proposes to
correct only the lower atmospheric layers. The resulting performance will
be well below the diffraction limit, but an improved seeing over a wide
FoV can be profitable for observations in the visible. E. Gendron
proposes to build a multi-object spectrometer where each target will
be corrected by a miniature AO system using the signals from several
surrounding NGSs and tomographic reconstruction.

Tips and tilts of several LGSs remain undetermined for the same reason
as in single-LGS AO systems. As a consequence, the information brought
by the LGSs becomes insufficient for a full solution of the tomographic
problem. In addition to the overall tip and tilt, there appear at
least 3 additional undetermined modes (or *null modes*). They
correspond to the differential astigmatism and defocus between the two
DMs (see the Figure). These modes do not influence the on-axis image
quality, but rather produce a differential tilt between the different
parts of the FoV, or *tilt anisoplanatism* (this is why they cannot
be measured with LGSs). Simulations show that if tilt
anisoplanatism is left uncorrected, the stars in the FoV will move
with respect to each other, as though the whole FoV were randomly
distorted.

**Question:** Draw the relative displacements of 5 GSs located in
the FoV as shown that are provoked by the Zernike modes 4, 5, 6
applied to the upper DM.

The three additional modes can be sensed with two additional NGSs, making their total number 3. The differential tilts between the NGSs constrain these modes. Alternatively, a single NGS can be used to sense Zernike modes 2 to 6 (radial orders 1 and 2). This requires a brighter NGS, of course. The first solution seems to produce better performance and better sky coverage and hence is preferable.

What happens if the tip-tilt sensors of the 3 NGSs are positioned with small errors? The MCAO system will compensate these errors in the closed loop, hence the FoV will be distorted! For example, the plate scale (arcseconds per pixel) will change if the upper DM has a static defocus. Special procedures must be applied to ensure that these errors do not compromise the astrometric performance of an MCAO system (like flattening of the upper DM before closing the loop).

The insight into the tilt anisoplanatism provided by MCAO leads to a
suggestion to use 3 NGSs even for the "standard" tip-tilt
correction. If, in addition to the tip and tilt, the modes 4-6 are
corrected by a DM conjugated to some altitude, a large part of the
tilt anisoplanatism will vanish. It means that an improvement of the
image quality can be achieved not only in the vicinity of the tip-tilt
NGS, but in a wider FoV. A second advantage of using 3 NGSs for
tip-tilt correction is that even without addition of a low-order DM
the tilt anisoplanatism is measured. Hence, a better correction of the
scientific target can be achieved, e.g. in LGS-based AO systems. The
calculations of sky coverage show that the need to have 3 NGSs instead
of one is over-compensated by the increased FoV where these NGSs can
be found. Hence, "tilt tomography" promises an improved sky coverage
even for LGS-based single-conjugate AO systems.

The general scheme of an MCAO system is shown in the Figure in Sect. 5.1. The WFSs provide information on a
certain number of wave-front parameters, e.g. they measure several Zernike
modes. This *data vector* is multiplied by a command matrix
(see Reconstructors) to obtain the correction signals applied to the DMs.
These signals can also be specified as Zernike modes; this is why we
call it a modal MCAO.

The problem of command matrix optimization was treated in a number of works. When noise and turbulence statistics are taken into account, something like a Wiener filter is obtained. Typically, the optimization criterion is the minimum weighted residual phase variance over the FoV (or at some specified FoV locations). A simpler and more traditional approach consists of constructing an interaction matrix and inverting it to obtain the command matrix. However, optimization gives substantially better results (see below).

The performance of MCAO systems can be studied using a complete Monte-Carlo simulation. This technique requires a large amount of computation and is suitable for a detailed performance analysis of an MCAO system at the design stage. Alternatively, the optimized command matrix and the resulting performance can be derived from second-order statistical quantities, like covariances of Zernike coefficients (in modal MCAO) or covariances of S-H signals and DM actuator signals (in zonal MCAO). The modal covariance codes are the fastest tools to date.

Monte-Carlo simulations and covariance codes address the issues which were neglected in the Fourier theory, namely beam overlap, cone effect (in the case of LGSs), and finite order of correction. In the Figure, the residual phase variance of the first 66 Zernike modes in an 8 m telescope is plotted for an object at the FoV center (this corresponds to tomography, because the object can be corrected with only one DM). The solid lines show the results with 3 and 5 NGSs at increasing distance from the object. The dotted line shows the limiting tomographic error for 3 NGSs and an infinitely large telescope; as can be seen, the actual errors are much larger, because here beam overlap is the major source of tomographic error.

The dashed line shows the performance achievable with 3 sodium LGSs, under the condition that the tip and tilt are perfectly compensated for the object. At close LGS separation the performance is worse than with NGSs because of the cone effect. When the LGS radius reaches 9 arcseconds, their distance from the telescope axis is just 4 m; in this case the cone effect is partially removed by tomography, and the residual error is below 1 square radian at 0.5 microns (caution: higher modes must be considered before stating that the cone effect is beaten and LGS correction in the visible is possible).

It might seem strange that at large separations 3 LGSs give better
results than 3 NGSs; the reason for this paradox is the tip-tilt
compensation, supposed to be perfect for LGSs. The dash-dot line
shows the case when LGSs are replaced by NGSs with perfect tip-tilt
compensation, to demonstrate that the apparent gain is indeed due to
this assumption.

Both covariance codes and Monte-Carlo simulations demonstrate that the
quality of compensated images delivered by MCAO is much more uniform
over the FoV than in the classical AO. For example, variation of the
Strehl ratio (at 2.2 microns) over the 2 arcminute FoV is plotted for
a 2-DM MCAO system using 3 NGSs (full line). Each of the 2 DMs
corrects 66 Zernike modes.
For comparison, the performance of MCAO
with an inverse command matrix (dashed line) and the performance of a
classical AO (dotted lines) are over-plotted. The positions of the GSs and
test points are shown in the inset.

The variations of the PSF shape across the FoV are simulated by
R. Conan for a classical AO compensating 66
Zernike modes at the 8 m
telescope (left) and for the MCAO with 3 DMs and 3 NGSs (right). The
FoV size is 4x4 arcminutes; the guide stars (marked as red dots) are the
14th to 15th magnitude natural stars around the planetary nebula NGC
2346. The imaging wavelength is 2.2 microns.

Another unexpected result of the simulations is that for a 2-DM MCAO
system the conjugation altitude of the second DM can be changed over a
wide range without affecting the performance. This is very useful: the
distance between the telescope and the turbulent layers changes in time and,
additionally, depends on the telescope zenith distance. These changes
can be accommodated by re-optimization of the command matrix; there is no
need to change the optical conjugation.

The concept of *layer-oriented* MCAO is developed by
R. Ragazzoni and his colleagues. It is close to the original
MCAO idea of J. Beckers and to the early versions of medical
tomography, when the layers of a 3-D object were isolated by
"focusing" on them while illuminating the object from different
angles.

Suppose that we measure the wave-fronts using many natural guide
stars. If the WFSs are optically conjugated to some altitude *H*,
the signals of all NGSs corresponding to this layer would be
identical. However, another layer at altitude *h* will be seen with
various relative shifts. If all signals are averaged, the measured layer
will not be affected, but other layers will be
smoothed with a typical length of $2 \Theta |H - h|$, where
$\Theta$ is the
radius of the FoV. In short, the contribution of our selected layer
will be enhanced as compared to other layers.
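
To get a feeling for the scales involved, the smoothing length can be evaluated numerically. The sketch below takes the smoothing length as twice the FoV radius times the distance between the WFS conjugation altitude and the layer; the function name and the example values (a 2 arcminute FoV, a ground-conjugated WFS and layers at 1, 5 and 10 km) are hypothetical.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def smoothing_length(fov_radius_arcsec, h_wfs_m, h_layer_m):
    """Typical smoothing length 2*Theta*|H - h| in metres."""
    return 2.0 * fov_radius_arcsec * ARCSEC * abs(h_wfs_m - h_layer_m)


# Illustrative case: FoV radius 1 arcmin, WFS conjugated to the ground.
for h_layer in (1000.0, 5000.0, 10000.0):
    length = smoothing_length(60.0, 0.0, h_layer)
    print(f"layer at {h_layer / 1000:.0f} km: smoothed over ~{length:.1f} m")
```

Perturbations smaller than this length in the out-of-focus layers are averaged out, while larger scales still leak into the measurement.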

In a layer-oriented system (LOS), the averaging of the signals from
many stars is done not in a computer, but by adding their light on a
single detector (this can be achieved with a multi-pyramid WFS). The
combined signal is fed to a DM conjugated to the same altitude. Part
of the stellar light is used by another WFS conjugated to another altitude,
which drives a second DM. Of course, the layers are not completely
independent: a WFS at some layer *H* "sees" smoothed wave-fronts at
all other layers and the smoothed corrections applied to all other
DMs. But the system works in closed loop, trying to adjust itself and
to drive to zero the signals in all layers. There is a hope that the
contributions of individual layers will be eventually disentangled. The
simulations and theory show that this indeed happens under some
conditions.

**Question:** For a layer-oriented
MCAO system with two DM-WFS pairs separated by 5 km and a FoV diameter
of 5 arcminutes, estimate the size of perturbations that are left
un-compensated in an intermediate turbulent layer. For the same
conditions, estimate at which spatial scales there will be a strong
cross-talk between the layers.

Layer-oriented MCAO can be regarded as an attempt to solve the tomographic problem in hardware. The system cannot be optimized with respect to the brightness of individual stars, turbulence profile, etc. It is expected that under given conditions it will perform slightly worse than an optimized modal MCAO. On the other hand, the LOS concept has several advantages: simplified computing (the AO loop is closed in each layer independently), a possibility to use many very faint NGSs (to overcome detector readout noise) and a possibility to track the wind-driven turbulence in individual layers with longer exposure times. Not all problems in the actual implementation of this concept have been solved yet.

**Question:** How will the photon noise change after the number of
layers in a LOS is doubled?

The Gemini team has embarked on actually building an MCAO system for the Gemini-S telescope. The project is now (2001) past the conceptual design stage. The goal is to achieve a uniform turbulence compensation in the near-IR J, H, K bands over a 1 arcminute FoV.

Although the system parameters may still change, its main features are summarized in the Table.

| Parameter | Value |
|---|---|
| DM conjugate ranges | 0, 4.5 and 9 km |
| DM orders | 16, 16 and 8 actuators across the pupil |
| Number of guide stars | 5 sodium LGSs and 3 NGSs |
| LGS geometry | center and 4 corners of 42.5 arcsec square |
| WFS orders | S-H, 16 by 16 (LGS); tip-tilt (NGS) |
| LGS laser power | 10 W per beam |
| Launch telescope | behind telescope secondary, 45 cm |
| NGS magnitudes | 3 times 19 (for 50% Strehl reduction in H) |
| Control bandwidths | 33 Hz (LGS); 0-90 Hz (NGS) |

The compensated FoV will be at least 1 arcminute square (up to 2 arcminutes for partial compensation in K band), and the variations of the Strehl ratio across the FoV are constrained to be a few percent. Detailed performance characteristics are available at the Gemini web site.

In a parallel effort, the European Southern Observatory (in collaboration with several European institutions) is planning to build an MCAO demonstrator for an 8-m VLT telescope that will use NGSs. The goal of this project is to show the feasibility of MCAO, which is perceived as a major milestone towards ELT projects (ELTs are considered useless without MCAO).

Work on MCAO is also being done at the Lund observatory, at Durham (UK) and at Palomar Observatory.

**Summary.** Multi-conjugate AO will attempt to correct the
3-dimensional turbulence, improving the accessible FoV and other AO
parameters, especially when LGSs are used. It relies on turbulence
tomography - a technique to extract the multi-layer correction
signals in the optimum way by measuring several GSs. With 2-3 DMs and
3-5 GSs, the FoV is opened up by a factor of 5-10, depending on the
vertical turbulence profile. Tomography will improve AO performance
even with a single DM in different ways (correcting the cone effect; better
tip-tilt correction; better sky coverage).
