Structure Determining Methods

David Sayre

IBM Research Center, Yorktown Heights, NY

1974 ACA Annual Meeting, College Park, PA

One of the areas in which American crystallography has made a large contribution is that of structure determining methods, i.e. methods of passing from the diffraction intensities of a structure to the structure itself. An early example is the calculation of the first Fourier maps of electron density (Duane 1925; Havighurst 1925), carried out in the laboratory of physics at Harvard University. W. H. Bragg (1915) had suggested that it might be possible to pass from the intensities to the distribution of scattering power in the crystal by some Fourier process, but it remained for Duane to give the modern expression for the electron density as a Fourier sum with the structure factors as coefficients. Duane also discussed the problem of phasing the structure factors, which he did in the case of sodium chloride by symmetry arguments and by considering the chlorine atom to be at the origin, where it would make all coefficients positive. Using these techniques Havighurst calculated maps of NaCl, KI, NH₄I, NH₄Cl, and diamond. His maps are sometimes described as one-dimensional, but in fact they used full 3-dimensional data and are one-dimensional only in the sense that the density was evaluated along specific lines in the unit cell. It was later, in connection with the diopside structure, that W. L. Bragg (1929) showed that useful maps (projections of the structure along an axis) could be obtained from 2-dimensional data, which were then much more commonly available than 3-dimensional data. Incidentally, it was Duane who supplied one of the first high-precision methods of measuring Planck's constant by determining the high-frequency limit of the X-rays produced by electrons of known energy.
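
The Fourier sum described above can be illustrated with a minimal Python sketch. It is not a reconstruction of Duane's or Havighurst's calculations: the one-dimensional cell, the atom weights, the Gaussian fall-off B and the cutoff H_MAX are all assumptions chosen for illustration. A heavier atom is placed at the origin and a lighter one at x = 1/2, so that, as in the argument for sodium chloride, every coefficient comes out positive.

```python
import math

# Hypothetical 1-D toy cell: a heavier atom at the origin and a lighter one
# at x = 1/2. The weights 18 and 10 are illustrative, not measured values.
ATOMS = [(0.0, 18.0), (0.5, 10.0)]
B = 0.02      # assumed Gaussian fall-off so that the series converges
H_MAX = 12    # highest order included in the sum (assumed)


def structure_factor(h):
    """F(h) for the centrosymmetric toy cell; real because of the symmetry."""
    damping = math.exp(-B * h * h)
    return damping * sum(z * math.cos(2 * math.pi * h * x) for x, z in ATOMS)


def density(x):
    """Fourier sum rho(x) = sum_h F(h) cos(2 pi h x) for a cell of unit length."""
    return sum(structure_factor(h) * math.cos(2 * math.pi * h * x)
               for h in range(-H_MAX, H_MAX + 1))


# Coefficients for h = 0 .. H_MAX and the density sampled along the line.
coefficients = [structure_factor(h) for h in range(H_MAX + 1)]
profile = [(i / 100, density(i / 100)) for i in range(100)]
```

Evaluating density(x) along the line gives its largest value at the origin, a smaller maximum at x = 1/2, and a value close to zero midway between the two atoms.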

The Duane-Havighurst and Bragg maps may be considered as initiating the modern era of structure determination, for although they were of structures which had already been solved, it was evident that the preparation of similar maps could assist greatly in the solution of unknown structures. Indeed the present era of structure determination may be said to differ from the early period largely in the fact that it is based on the idea of first passing from structure-factor magnitudes to map, and then from map to structure, while in the pre-1930 technique the passage from magnitudes to structure was attempted essentially in one step (see Fig. 1). The vast increase in structure solving power today is basically due to the idea of dividing the problem into two relatively more tractable problems.

The first method of using maps in the solution of unknown structures was the Fourier refinement technique (Robertson 1933); once correctly initiated it usually leads rapidly to the successful completion of the structure, and it is still frequently used in the structure-completion process. The first examples of its use, due to Robertson, were with rigid-molecule structures, where it could be initiated by finding the approximate position and orientation of the molecule.

Fig. 1. Structure determination (schematic). Upper box is in reciprocal space and shows the diffraction pattern (magnitudes and phases). Lower boxes are in direct space and show images or maps (Patterson, electron density) and the structure itself (atom parameters (x, y, z, B)'s). Lines show structure determining techniques, in approximate historical order from left to right. Pre-1930: the magnitudes were found from the diffraction experiment (DIF), and a direct structural interpretation (INT) was attempted. (Dashed lines indicate techniques that may not succeed.) The interpretation could be checked by use of the structure factor formula (SF). Post-1930: introduction of maps into the technique. F is Fourier summation, DINT is density map interpretation, P is Patterson or |F|² summation, PINT is Patterson map interpretation, HA is heavy-atom methods, DIR is direct methods, PHYS refers to the possibilities noted in the last paragraph of the article. Not shown are the structure and phase refinement methods, or the processes of calculating a density map from a structure (inverse of DINT) or magnitudes and phases from a density map (inverse of F).

Subsequently, for structures of more general types, essentially three further structure determining methods have been found; in historical order they are Patterson methods, heavy-atom methods, and direct methods. The first employs unphased maps; the second and third are means of obtaining phased maps, which are much easier to interpret than unphased maps; the first and second today are usually used in combination. The first, and to a large extent the third, were introduced by American crystallographers, and the second by British crystallographers. Here we shall concentrate principally on the first and third.

Patterson Methods

The Patterson or |F|²-map (Patterson 1934) was discovered by A. L. Patterson, who was working at M.I.T. at the time. An interesting personal account (Patterson 1962) has been left to us by the discoverer. In this work he showed that the |F|²-map may be used as a sort of poor man's F-map, more difficult to interpret because of showing the structure convoluted with its own inversion, but immediately available once the diffraction intensities are known.
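
The property just described, that the |F|²-map needs no phases and shows the structure convoluted with its own inversion, can be checked on a small example. The following Python sketch assumes a hypothetical one-dimensional structure of two atoms; the coordinates, weights, fall-off B, cutoff H_MAX and sampling GRID are illustrative choices, not taken from any of the work cited.

```python
import cmath
import math

# Hypothetical 1-D structure: (fractional coordinate, weight) for each atom.
ATOMS = [(0.10, 6.0), (0.40, 8.0)]
B = 0.02      # assumed Gaussian fall-off applied to the amplitudes
H_MAX = 12    # highest order included in the sum (assumed)
GRID = 100    # sampling points across the cell (assumed)


def intensity(h):
    """|F(h)|^2, the only quantity the diffraction experiment supplies."""
    f = sum(z * cmath.exp(2j * math.pi * h * x) for x, z in ATOMS)
    return (abs(f) * math.exp(-B * h * h)) ** 2


def patterson(u):
    """P(u) = sum_h |F(h)|^2 cos(2 pi h u); no phases are required."""
    return sum(intensity(h) * math.cos(2 * math.pi * h * u)
               for h in range(-H_MAX, H_MAX + 1))


def peak_indices(threshold=0.1):
    """Grid indices of local maxima above a fraction of the origin peak."""
    p = [patterson(i / GRID) for i in range(GRID)]
    return [i for i in range(GRID)
            if p[i] > threshold * p[0]
            and p[i] > p[i - 1] and p[i] > p[(i + 1) % GRID]]


peaks = peak_indices()
```

The peaks fall at u = 0 (every atom paired with itself) and at u = 0.30 and 0.70, i.e. at plus and minus the interatomic vector 0.40 - 0.10. The atomic positions themselves, 0.10 and 0.40, do not appear, which is why the map is harder to interpret than an F-map.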

I think it is fair to say that for some 30 years after Patterson's discovery the first thing done in virtually every attempt to determine a crystal structure was to calculate "the Patterson", as it was called by every crystallographer except Patterson; this was done to see if it would yield information on the atom positions or on the form or orientation of the molecules. If it did, an attempt to produce an interpretable F-map would be made, and if this succeeded the Fourier refinement process would begin.

Two tendencies can be discerned in the techniques of applying Patterson's idea in structure determination. In America, the theory of the |F|²-map was stressed and an attempt was made to develop it into a general method of structure determination. This can be seen, for example, in the work of David Harker (1936), then at Cal Tech, in analyzing the effects of crystal symmetry on the map; of Patterson (1944), then at Bryn Mawr, on the existence of multiple map interpretations; of M. J. Buerger (1950, 1951), at M.I.T., on systematic methods for interpreting maps; and more recently in the work of C. E. Nordman (1966) at Michigan, M. G. Rossmann (1972) at the University of Cambridge and Purdue, and others, on methods for locating and orientating in the unit cell a known or approximately known structure via its |F|²-map. In Britain, on the other hand, stress was laid on practical aspects of interpreting the maps. It was quickly noticed that the heavy-atom peaks in a Patterson map are generally rather easy to interpret. It was not long before this observation bore fruit with the discovery of the heavy-atom phasing method (Robertson 1936). Thus was born the combination of Patterson and heavy-atom techniques which over 40 years has registered an increasingly magnificent chain of successes on large structures, from penicillin in the 1940s, through the smaller proteins in the 1960s, to the viruses and other macromolecular biological systems under study today.

Direct Methods

It was after World War II that the American tendency to regard the process of structure determination from a mathematical point of view began to produce the third of the major structure determining techniques in use today, the direct methods. A direct method consists of a system of mathematical relations which involve the structure-factor amplitudes and phases (but not the atom coordinates), together with a method for solving the relations for the phases. There are three principal families of mathematical relation systems in use today: in decreasing order of current importance, they are the probabilistic linear relations providing estimates of the values of structure invariants and seminvariants, the convolutional relations, and the determinantal relations. Because of their convenience, direct methods are now usually the first technique tried for structures of up to 100-200 non-hydrogen atoms. The major relation systems will be discussed here in the order in which they appeared historically*.

*In addition to the relation systems listed, one should mention the earliest such system, that of Ott (1927), later rediscovered by M. Avrami (1938) at the University of Chicago. The list here also omits the density modification methods, several of which (including non-crystallographic symmetry and solvent smoothing) have recently begun to be of great importance in macromolecular structure studies.

Determinantal relations. The determinantal inequalities were discovered in 1950 by J. Karle and H. Hauptman at the Naval Research Laboratory in Washington, D.C. (Karle and Hauptman 1950). Earlier, D. Harker and J. S. Kasper, working at the research laboratory of the General Electric Company, had discovered the first system of mathematical relations among the structure factors to allow an unknown structure to be solved directly from the diffraction data (Harker and Kasper 1948; Kasper, Lucht and Harker 1950). Interest in the Harker-Kasper inequalities was considerable, and additional inequalities and improvements in the method of deriving inequalities were found by J. Gillis (1948) in Israel and Caroline MacGillavry (1950) in the Netherlands. It was at this point that the paper by Karle and Hauptman appeared. It showed that the Harker-Kasper inequalities are special cases of a more general inequality, the non-negativity of certain determinants constructed from the structure factors.
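
The non-negativity condition can be made concrete with a short Python sketch. It assumes a hypothetical one-dimensional structure of three point atoms and builds only the 3x3 determinant whose elements are F(h_i - h_j) for h = 0, 1, 2; the structure, the function names and the restriction to order three are assumptions made for illustration and are not taken from the papers cited.

```python
import cmath
import math

# Hypothetical 1-D point-atom structure: (fractional coordinate, weight).
ATOMS = [(0.0, 5.0), (0.2, 1.0), (0.8, 1.0)]


def structure_factor(h):
    """F(h) for point atoms of non-negative weight."""
    return sum(z * cmath.exp(2j * math.pi * h * x) for x, z in ATOMS)


def kh_determinant(f, indices=(0, 1, 2)):
    """Determinant of the 3x3 matrix with elements F(h_i - h_j)."""
    def element(h):
        # Friedel's law, F(-h) = conj F(h), supplies the negative indices.
        return f[h] if h >= 0 else f[-h].conjugate()

    m = [[element(hi - hj) for hj in indices] for hi in indices]
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return det.real  # the matrix is Hermitian, so the determinant is real


true_f = {h: structure_factor(h) for h in range(3)}

wrong_f = dict(true_f)
wrong_f[2] = -true_f[2]      # same magnitude, wrong sign for F(2)

shifted_f = dict(true_f)
shifted_f[1] = -true_f[1]    # what an origin shift of half a cell does

det_true = kh_determinant(true_f)
det_wrong = kh_determinant(wrong_f)
det_shifted = kh_determinant(shifted_f)
```

For this toy structure the determinant is positive with the true signs and negative when the sign of F(2) is reversed, so that sign is excluded by the inequality. Reversing the sign of F(1) leaves the determinant unchanged, since that change corresponds merely to moving the origin by half a cell.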

Were it possible to solve the full system of Karle-Hauptman inequalities, retaining only those sets of phases consistent with every inequality, the determinantal inequalities would probably provide more accurate phases on a wider range of real structures than the other direct-method relation systems currently available. Unfortunately, no good solution techniques for large systems of determinantal inequalities are known at present, and the possibilities for use of the inequalities in practice are accordingly much reduced. A partial step forward was taken when G. Tsoucaris (1970) in France showed, at least for the equal-atom case, that the phases which solve the full system can be approximated by the phases which maximize the value of a single determinant, thus allowing the problem to be treated approximately as a maximization problem, for which reasonably good solution methods are available. In this form the determinantal relations are again a subject of research, both as a practical technique of phase refinement, and in recent work by Karle (1980) as a tool in the development of the probabilistic theory of the structure invariants and seminvariants.

Convolutional relations. The first convolutional relations were discovered by D. Sayre (1952), an American graduate student with Dorothy Hodgkin at Oxford. An account of the discovery has recently been written (Sayre 1980). His relations, the squaring-method equations, hold for structures composed of equal resolved atoms, and are exact (non-statistical). Equation systems of similar form, which are only of statistical validity but in which the requirements for atom equality and resolution are removed or reduced, have also been found: by W. H. Zachariasen (1952) at the University of Chicago (requirement on equality removed); by E. W. Hughes (1953) at Cal Tech (requirement on resolution reduced); by Hauptman and Karle (1953) at the Naval Research Laboratory (requirement on equality removed and on resolution reduced, centrosymmetric structures only); by Karle and Hauptman (1956) (as above, noncentrosymmetric structures); by Karle and Karle (1966) (as above but with requirement on resolution further reduced); and by H. Hauptman (1971), now at the Medical Foundation of Buffalo (similar to Karle and Karle, but with requirement on equality reinstated).
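
Why equality and resolution of the atoms make the squaring relation exact can be seen in a deliberately idealized Python sketch. It assumes a one-dimensional cell sampled on N grid points, with each atom reduced to a single spike on its own grid point; for equal atoms of unit weight the squared density then equals the density itself, so each structure factor equals (1/N) times the sum over k of F(k) F(h - k). In a real structure an additional factor depending on atomic shape enters, which this sketch sets to one. The grid size, the atom sites and the function names are all assumptions.

```python
import cmath
import math

N = 16  # grid points per cell (assumed)

# Hypothetical structures: (grid site, weight) for each atom.
EQUAL = [(1, 1.0), (4, 1.0), (9, 1.0), (14, 1.0)]
UNEQUAL = [(1, 1.0), (4, 1.0), (9, 1.0), (14, 2.0)]


def structure_factors(atoms, n=N):
    """F(h) = sum over atoms of w * exp(2 pi i h x), with x = site / n."""
    return [sum(w * cmath.exp(2j * math.pi * h * s / n) for s, w in atoms)
            for h in range(n)]


def squaring_rhs(f):
    """Self-convolution of the structure factors: (1/n) sum_k F(k) F(h - k)."""
    n = len(f)
    return [sum(f[k] * f[(h - k) % n] for k in range(n)) / n
            for h in range(n)]


def max_mismatch(atoms):
    """Largest |F(h) - (1/n) sum_k F(k) F(h - k)| over all h."""
    f = structure_factors(atoms)
    g = squaring_rhs(f)
    return max(abs(a - b) for a, b in zip(f, g))


mismatch_equal = max_mismatch(EQUAL)
mismatch_unequal = max_mismatch(UNEQUAL)
```

With equal atoms the mismatch is zero to rounding error, and both the magnitude and the phase of every structure factor are reproduced by the self-convolution. Doubling the weight of one atom breaks the equality, which is the situation the later, statistical relation systems were designed to handle.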

The convolutional relations derive their importance from the fact that it is considerably simpler to devise solution techniques for them than for the determinantal relations, even though the inherent phasing power of the two types of system is not greatly different. An important reason for this simplicity is that the convolutional equations express the value of each structure factor directly in terms of the full set of structure factors. This was observed by Karle and Hauptman (1956) and made the basis for tangent-formula refinement, a method of solving the convolutional equations by successive approximations once a starting set of phases is available. For many years direct-method systems have used this technique for the phase extension and refinement section of the system. A more reliable but costlier method of solution was introduced by Sayre (1972).

Probabilistic linear relations. The early history of these relations is somewhat complex. Gillis (1948) noted that the Harker-Kasper inequalities frequently held in cases where they were not strictly phase-determining, but did not place a probabilistic interpretation on this observation; also he did not at that time have the determinantal form of the inequalities, which under a probabilistic interpretation would have yielded the zero-estimate of the 3-phase structure invariant, to work with. Karle and Hauptman (1950) stated the determinantal form of the inequalities, but treated them purely algebraically. Sayre (1952) gave an informal probabilistic interpretation to both the determinantal inequalities and the convolutional relations, obtained the zero-estimate of the invariant as a result, and used it to solve the structure of hydroxyproline by a method somewhat similar to the modern multisolution technique. With knowledge of Sayre's result, Cochran (1952) and Zachariasen (1952) derived it by explicit probabilistic methods which were forerunners of the mathematical techniques used today, and also used it for the first time in the solution of unknown structures.

The value of the relations of this class is that they give rise to a system of equations which, being linear, can be solved for the phases ab initio. Thus a direct-method system can form an approximate solution using linear relations, and then extend and refine this solution, usually by means of a convolutional relation system.
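
The quantity estimated by the simplest of these relations is the three-phase structure invariant, the sum of the phases of F(h), F(k) and F(-h-k); the zero-estimate mentioned above states that this sum tends to be near zero, which is linear in the phases. The Python sketch below illustrates only why the sum is called an invariant: it is unchanged when the origin is moved, although the individual phases are not. The structure, the shift and the function names are hypothetical, and the sketch makes no claim about how closely the invariant approaches zero for a given set of magnitudes.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) for point atoms given as (fractional coordinate, weight)."""
    return sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)


def triplet_invariant(h, k, atoms):
    """phi(h) + phi(k) + phi(-h-k), returned as an angle in [-pi, pi]."""
    product = (structure_factor(h, atoms)
               * structure_factor(k, atoms)
               * structure_factor(-h - k, atoms))
    return cmath.phase(product)


def shift_origin(atoms, dx):
    """Move every atom by dx, equivalent to choosing a different origin."""
    return [((x + dx) % 1.0, z) for x, z in atoms]


ATOMS = [(0.11, 6.0), (0.37, 8.0), (0.62, 6.0)]   # hypothetical structure
SHIFTED = shift_origin(ATOMS, 0.23)

# A single phase changes with the origin ...
phase_before = cmath.phase(structure_factor(1, ATOMS))
phase_after = cmath.phase(structure_factor(1, SHIFTED))

# ... but the three-phase sum does not.
invariant_before = triplet_invariant(1, 2, ATOMS)
invariant_after = triplet_invariant(1, 2, SHIFTED)
```

For a structure consisting of a single atom the invariant is exactly zero, whatever the position of the atom; the probabilistic relations may be regarded as quantifying how far this remains approximately true for many atoms.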

Shortly after the work of Zachariasen and Cochran in 1952, Hauptman and Karle (1953) began their long series of contributions to the study of the probabilistic linear relations, introducing the important concept of structure invariant and seminvariant, and bringing more sophisticated mathematical techniques to the study of the relations than had been used heretofore. With this began the further story which ultimately led in 1975, due largely to the efforts of Schenk in the Netherlands, Giacovazzo in Italy, and Hauptman in the US, to the start of a major new series of discoveries of relations involving the higher-order structure invariants (quartets, quintets, etc.) and the structure seminvariants. In this volume J. Karle and H. Hauptman will tell the earlier and later parts of this story, respectively.

* * *

At the present time, structure determination is almost always carried out either by direct methods or by a combination of Patterson and heavy-atom methods, i.e. by methods which, although much refined since their inception, and made much more powerful by the advent of the computer, date back nearly 30 and 50 years, respectively. We are, in other words, in a period of considerable stability insofar as structure determining technique is concerned. But there are signs now that basic innovations may again be entering the field. It is too early to state it with positiveness, but it seems possible that the major developments may be physical, rather than mathematical and computational, in nature, and may involve imaging the structure directly in the physical experiment, or at least capturing the phases of the diffraction pattern in the experiment. Indications which suggest that the wind may be blowing in this direction, and indications also of the involvement of American scientists in the work, are the recent demonstration of the use of dynamical theory to provide an experimental means of measuring phases (Post 1977), the perceptible drawing-together of diffraction techniques and microscopy (McLachlan 1958, Kuo and Glaeser 1975), and the development of important new X-ray sources, optics, and detectors (Parsons 1980).

References

Avrami, M. (1938) Phys. Rev. 54, 300.

Bragg, W. H. (1915) Phil. Trans. Roy. Soc. A215, 253.

Bragg, W. L. (1929) Proc. Roy. Soc. (London) A123, 537.

Buerger, M. J. (1950) Acta Cryst. 3, 87 and 243.

Buerger, M. J. (1951) Acta Cryst. 4, 531.

Cochran, W. (1952) Acta Cryst. 5, 65.

Duane, W. (1925) Proc. Natl. Acad. Sci. U.S.A. 11, 489.

Gillis, J. (1948) Acta Cryst. 1, 76 and 174.

Harker, D. (1936) J. Chem. Phys. 4, 381.

Harker, D. and Kasper, J. S. (1948) Acta Cryst. 1, 70.

Hauptman, H. (1971) Am. Crystallogr. Assoc. Program Abstr. Ser. 2.

Hauptman, H. and Karle, J. (1953) ACA Monograph #3.

Havighurst, R. J. (1925) Proc. Natl. Acad. Sci. U.S.A. 11, 502 and 507.

Hughes, E. W. (1953) Acta Cryst. 6, 871.

Karle, J. (1980) Proc. Natl. Acad. Sci. U.S.A. 77, 5.

Karle, J. and Hauptman, H. (1950) Acta Cryst. 3, 181.

Karle, J. and Hauptman, H. (1956) Acta Cryst. 9, 635.

Karle, J. and Karle, I. (1966) Acta Cryst. 21, 849.

Kasper, J. S., Lucht, C. M. and Harker, D. (1950) Acta Cryst. 3, 436.

Kuo, I. A. M. and Glaeser, R. M. (1975) Ultramicroscopy 1, 53.

MacGillavry, C. H. (1950) Acta Cryst. 3, 214.

McLachlan, D. (1958) Proc. Natl. Acad. Sci. U.S.A. 44, 948.

Nordman, C. E. (1966) Trans. Am. Crystallogr. Assoc. 2, 29.

Ott, H. (1927) Z. Kristallogr. 66, 136.

Parsons, D. F. (1980) Ann. N.Y. Acad. Sci. 342.

Patterson, A. L. (1934) Phys. Rev. 46, 372.

Patterson, A. L. (1944) Phys. Rev. 65, 195.

Patterson, A. L. (1962) In "Fifty Years of X-Ray Diffraction" edited by P. P. Ewald. Utrecht: A. Oosthoek.

Post, B. (1977) Phys. Rev. Lett. 39, 760.

Robertson, J. M. (1933) Proc. Roy. Soc. (London) A140, 79.

Robertson, J. M. (1936) J. Chem. Soc. p. 1195.

Rossmann, M. G. (1972) The Molecular Replacement Method. New York: Gordon and Breach.

Sayre, D. (1952) Acta Cryst. 5, 60.

Sayre, D. (1972) Acta Cryst. A28, 210.

Sayre, D. (1980) In "Structural Studies on Molecules of Biological Interest" edited by G. Dodson, J. P. Glusker, and D. Sayre. Oxford: Oxford University Press.

Tsoucaris, G. (1970) Acta Cryst. A26, 492.

Zachariasen, W. H. (1952) Acta Cryst. 5, 68.

Reprinted from Sayre, D. In Crystallography in North America; McLachlan, D., Glusker, J.P., Eds.; Am. Cryst. Assn., 1983, pp 273-276.