Efficient correlation-corrected vibrational self-consistent field computation of OH-stretch frequencies using a low-scaling algorithm

The authors present a new computational scheme to perform accurate and fast direct correlation-corrected vibrational self-consistent ﬁeld (cid:1) CC-VSCF (cid:2) computations for a selected number of vibrational modes, which is aimed at predicting a few vibrations in large molecular systems. The method is based on a systematic selection of vibrational mode-mode coupling terms, leading to the direct ab initio construction of a sparse potential energy surface. The computational scaling of the CC-VSCF computation on the generated surface is then further reduced by using a screening procedure for the correlation-correction contributions. The proposed method is applied to the computation of the OH-stretch frequency of ﬁve aliphatic alcohols. The authors investigate the inﬂuence of different pseudopotential and all-electron basis sets on the quality of the correlated potential energy surfaces computed and on the OH-stretch frequencies calculated for each surface. With the help of these test systems, the authors show that their method offers a computational scaling that is two orders of magnitude lower than a standard CC-VSCF method and that it is of equal accuracy. © 2006 American Institute of Physics . (cid:3) DOI: 10.1063/1.2423006 (cid:4)


I. INTRODUCTION
One of the leading methodologies for computing accurate vibrational spectra of polyatomic molecules is the vibrational self-consistent field ͑VSCF͒ method developed by Carney et al. 1 and Bowman, 2 and its correlation-corrected extension ͑CC-VSCF͒ by Norris et al. 3 The VSCF method provides an initial approximate solution to the vibrational wave function that takes into account the anharmonicity of the potential energy surface ͑PES͒ and the coupling between normal modes using a mean-field approach. The VSCF wave function obtained is then further refined using perturbation corrections. This scheme has been implemented in a few ab initio programs, 4 thus allowing the calculation of the potential energy of the system using ab initio electronic structure theory while the CC-VSCF calculation is being performed. This direct CC-VSCF approach dispenses with the need to construct a functional form of the PES prior to solving the vibrational Schrödinger equation, and provides a direct route from the electronic structure to the vibrational spectrum of a molecule.
However, in order to apply such methods efficiently to large molecular systems, two aspects of direct CC-VSCF calculations need to be considered: the potential energy surface calculation and the CC-VSCF computation itself. In a previous paper, 5 we addressed the first aspect by reducing the number of computationally expensive mode-mode coupling grids taken into account when devising the PES on which the vibrational dynamics take place ͑fast-VSCF method͒. We showed that the removal of carefully selected weak modemode couplings from the PES does not impact strongly upon the accuracy of the final CC-VSCF results. This method en-ables the investigation of larger systems that would otherwise be out of reach of a standard direct CC-VSCF calculation. The speed at which the PES can be computed is then directly related to the number of couplings taken into account and the amount of computing power allocated to the ab initio calculation of each grid point. This method was shown to provide accurate results for simple molecular systems but required the prior computation of the coupling grids at a low level of theory to select suitable mode-mode couplings.
In order to remove the need to precalculate the modemode couplings, a fast-VSCF-type scheme can be used with an imposed choice of mode-mode couplings. More precisely, a particular vibrational mode ͑or set of modes͒ can be coupled systematically to all modes in the system rather than having to perform a selection of mode-mode couplings on the whole system. This approach provides the full accuracy of a CC-VSCF treatment for the selected modes without the need to perform an elaborate computation for the less important modes in the system.
In this paper, we describe in detail this new approach, named single-to-all CC-VSCF ͑STA-CC-VSCF͒, and use it to compute selected vibrational frequencies of a molecular system that are fully coupled to all other modes in the system, while the rest of the molecule is treated as a series of almost noncoupled anharmonic oscillators. We also suggest a screening scheme for the correlation-correction ͑CC͒ part of the CC-VSCF method that is particularly suited to sparse PES containing only a few coupled modes, and show that the selected coupling method can lead to large speed improvements in the treatment of correlation corrections of direct CC-VSCF solutions.

II. THEORY AND METHODS
Since the theoretical background of the vibrational selfconsistent field method is well known 1,2 and the specifics of our implementation have already been described in a previous paper, 5 we shall give only a very brief outline of the technique below.
The vibrational Schrödinger equation for a molecular system, expressed for a set of n normal coordinates ͕Q i ͖, can be formulated as follows: where we neglect rotational coupling. The wave function of the system ͑⌿ k ͒ for each vibrational state k is described by a product of single-mode wave functions ͑ i k i ͒: Note that k = ͕k i ͖ is a collective index representing the vibrational excitation quanta for each mode.
To achieve the separation of Eq. ͑1͒ into a collection of single-mode equations, we introduce an effective potential,

͑3͒
This mean-field potential is then used in Eq. ͑1͒ instead of the real potential, leading to the following set of singlemode equations: Since the effective potential depends implicitly on the wave function, the set of equations needs to be solved iteratively until self-consistency is achieved. The solutions of each onedimensional equation ͑one for each mode i͒ can then be used to form the basis functions ͑ i k i ͒ used to construct the vibrational wave function of the system ⌿ k for each vibrational state, as described in Eq. ͑2͒.
In the current implementations of the direct VSCF algorithm, 4 the potential energy surface of the system is explored systematically using a hierarchical many-body expansion limited to second order in V ͑pairwise approximation͒: The separation of the potential energy term into a sum of single-mode contributions, V i diag ͑Q i ͒, and pairwise contributions, V ij coup ͑Q i , Q j ͒, greatly simplifies the evaluation of the integrals needed for the calculation of the effective potential in Eq. ͑3͒, and is used in the GAMESS package 6 and in other VSCF implementations. [7][8][9] A. Single-to-all method Previously, 5 we have described a method that enables the fast computation of the CC-VSCF solutions for a given molecular system. This method is based on a careful selection of the parts of the potential energy surface that are relevant to vibrational dynamics. Although this method offers a considerable speed gain for the computation of the whole vibrational spectrum of a system, this spectrum is rarely measured in its entirety in experimental studies. Most studies focus instead on a narrow part of the vibrational spectrum, be it the OH-stretch region when looking at hydrogen-bonding phenomena or the amide bands in proteins. Consequently, it is desirable to provide a method that enables a fast and accurate simulation of the vibrational bands ͑or modes͒ observed in practice, even if this occurs at the expense of the rest of the molecular vibrational spectrum. This method can be readily constructed as a fast-VSCF-type technique by making the following assumptions. The vibrational modes participating in the observed spectra are defined as nЈ active modes ͑Q A ͒, while the rest of the vibrational modes of the system are defined as nЉ inactive spectator modes ͑Q I ͒, so that n = nЈ + nЉ is the total number of normal modes present in the system. The nЈ active modes are then freely coupled to all n normal modes in the system by means of potential energy coupling terms and the inactive modes are described by a diagonal anharmonic potential only. The resulting potential energy surface is then defined as In this way, we achieve the maximum accuracy for the active modes, since all of their vibrational mode-mode couplings are taken into account, while saving considerable computational time by neglecting the mode-mode couplings of the inactive modes, namely, the V vw coup ͑Q v I , Q w I ͒ terms. As an example, a map of the potential energy coupling terms for methanol is shown in Fig. 1 to illustrate the selection scheme used in our single-to-all method. In contrast to the method described in our previous paper, 5 we do not select a part of the PES based on the predicted coupling strength obtained at a lower level of theory, but instead restrict the PES to include only the couplings of a few selected normal modes with the rest of the system. Note that, for large systems, the singleto-all method and the fast-VSCF approach can be combined to provide a fast, if slightly less accurate, solution to the vibrational problem in a given vibrational frequency window. 10 It is worth mentioning that the computational effort needed to generate the mode-mode coupling part of the potential energy surface in the single-to-all technique scales formally as n · nЈ, and thus depends directly on the number of active modes selected ͑usually one or two͒. For the same computational step, the full CC-VSCF algorithm as implemented in GAMESS requires a computational effort growing formally in n 2 , and is therefore slower for large n.

B. Screened correlation corrections
An efficient way to go beyond the mean-field approximation used in the VSCF approach and correct for the lack of explicit correlation between modes is to apply a perturbation correction to the SCF solutions. This correlationcorrected VSCF method, described by Norris et al., 3 uses the following perturbation potential: Assuming that the effect of the nonseparability of modes is small and that there are no degenerate energy levels, the energy correction is then given by an expectation value of the perturbing potential, in a similar fashion to the usual Møller-Plesset perturbation theory: where Ẽ s ͑0͒ and Ẽ t ͑0͒ are the sum of the single-mode energies ͑⑀ i k i ͒ for the states s and t, respectively, of the unperturbed VSCF solutions.
The Gerber group has shown that such correlation corrections are necessary for a reasonable prediction of vibrational frequencies using the VSCF technique. 3,11,12 However, Eq. ͑8͒ requires the computation of a large number of integrals over various vibrational excited states. This can become a serious issue for systems containing a large number of normal modes. An analysis of a typical CC-VSCF run as implemented in the GAMESS package shows that, once the PES has been calculated, the next speed bottleneck is the computation of the correlation corrections using the perturbative approach outlined above. Indeed, in the extreme case of a single mode being coupled to a large number of system modes, the PES stage of the computation can become a minor part of the calculation, with most of the computational effort being devoted to solving the CC-VSCF problem.
In order to reduce the computational cost of correlation corrections for the CC-VSCF method for a sparse PES spanning many dimensions, we suggest using a screening scheme resembling the techniques used in local-MP2 ͑or local-CCSD͒ methodologies. [13][14][15] Given that the uncoupled modes do not contribute to the summation in expression ͑8͒ and are therefore well described at the VSCF level, we only need to compute the correlation corrections for the mode-mode couplings present on the sparse PES. Thus we obtain a simplified expression for the correlation correction, which can be evaluated more efficiently than in the original CC-VSCF method since it contains fewer contributions to the summation. Note that the correlation corrections computed using this simplified expression are identical to those obtained by including the uncoupled modes.
We define a subset of vibrational states ͕c͖ that will contribute to the energy corrections in Eq. ͑8͒. This subset corresponds to all virtual states built within the ͕A , A͖ ഫ ͕A , I͖ space. The simplified correlation-correction expression is then given by where we take into account only the states that represent coupled modes.

C. Computational details
For each system studied, we first perform a full optimization at the chosen level of theory and then calculate the harmonic frequencies and corresponding normal modes at the minimum obtained. The VSCF routine evaluates the ab initio potential energy for a fixed number of points along each normal coordinate Q i for each diagonal anharmonicity term and then on a grid of points for each mode-mode coupling term. These points form a discretized version of the potential energy surface of the system V͑Q 1 , ... ,Q n ͒, as formulated in Eq. ͑5͒. The self-consistent vibrational calculation is then performed and includes a perturbative correction of the coupling between modes as described by Eqs. ͑4͒ and ͑8͒.
We use the latest implementation of the direct CC-VSCF algorithm available in the GAMESS computer program. 6 Within this implementation, we choose to use 16 regularly spaced points to represent the diagonal potential energy curve along each normal mode, V i diag ͑Q i ͒, in Eq. ͑5͒, and to calculate the second-order pair-coupling potentials, V ij coup ͑Q i , Q j ͒, on a set of 256 regular grid points ͑16ϫ 16͒. In this paper, the results indicated by "full" are obtained using the direct CC-VSCF method as implemented in the GAMESS package, while the others are obtained with our locally modified version of the GAMESS package. All results using pseudopotentials are obtained either with the pseudo- potentials developed by Stevens and co-workers, [16][17][18] identified by "SBK," or with the shape-consistent pseudopotentials developed by Pacios and Christiansen, 19 labeled "CRENBL." The polarization functions used with the CRENBL pseudopotentials are taken from the 6-311G ** basis set, as suggested by Liang et al. 20 The all-electron CC-VSCF calculations are performed using the triple-zeta basis set of Dunning, Jr. with polarization functions 21 ͑TZP͒ and the augcc-pVTZ family of basis sets. 22 All MP2 and CCSD͑T͒ calculations are performed using the frozen-core approximation for each all-electron basis set. All calculations are performed on a cluster of Apple Xserve computers.

A. Computational efficiency
In order to assess the observed computational scaling of the vibrational part of the single-to-all method, with and without the correlation-correction screening described earlier, compared to the original GAMESS implementation of the CC-VSCF, we perform a series of calculations of the OHstretch vibrational frequency of simple aliphatic alcohols of varying carbon-chain length using each method. Note that each molecule is assumed to be in its all-trans ͑anti͒ conformation. The calculations are performed using a PES computed at the PM3 level of theory, in order to focus on the time spent on the actual CC-VSCF computation. The relative timings of the vibrational part of the calculations are shown in Table I along with the observed computational scaling of each method. The full CC-VSCF computation on methanol used as a reference takes approximately 45 s on a single PowerPC G5 / 2.3 GHz processor. We see that the time taken by the CC-VSCF part of the GAMESS algorithm increases rapidly with the size of the alcohol molecules and therefore with the number of normal modes present in the system. A linear fit of the log-log plot of these results reveals an observed scaling approximately equal to n 6.5 , where n is the number of normal modes as defined earlier ͑see also Table I͒. The single-to-all method without any CC screening is faster in the vibrational part of the calculation than the GAMESS algorithm and exhibits a lower computational scaling as a function of the number of normal modes present in the system. Due to the zeroing of a large portion of the PES used in the vibrational calculation, the VSCF solutions are obtained faster but, without any efficient screening mechanism for the CC part of the program, the observed computational scaling remains fairly high ͑close to n 6 ͒. The inclusion of CC screening in the single-to-all method leads to a further reduction in computational time during the vibrational calculation part of the program. The removal of the zero contributions to the correlation corrections arising from uncoupled states leads to an observed computational scaling close to n 4.5 . This scaling is two orders lower than the original CC-VSCF algorithm implemented in GAMESS and one and a half orders lower than the STA-CC-VSCF method without CC screening. For obvious reasons, we use CC screening for the remaining STA-CC-VSCF calculations in this paper.

B. Accuracy of the method
In order to determine the accuracy of the single-to-all ͑STA͒ method, we compare the computed value of the OHstretch frequency for the set of aliphatic alcohols described earlier using both STA-CC-VSCF and full CC-VSCF as implemented in GAMESS. These results are shown in Table II for two PES, namely, PM3 and MP2 / SBK͑d , p͒. At first glance, we see that the difference between the two methods is of the order of a few wave numbers for all systems and levels of theory presented. The largest difference is obtained for methanol at the MP2 / SBK͑d , p͒ level of theory, where we see that the vibrational frequency computed using STA is 5 cm −1 lower than the fully coupled CC-VSCF result. The reason for this difference is that the coupling between the torsional modes of the methyl group and the CH-stretch modes has been neglected. Nonetheless, the difference between the STA and full CC-VSCF results for the vibrational frequency investigated is well within the typical accuracy of CC-VSCF calculations. We also note that the small difference between STA and full CC-VSCF seems to remain low and constant as the system size is increased, at least for the set of alcohols presented here. I. Representative timings for the vibrational part of the direct calculation of the OH-stretch vibrational frequency for aliphatic alcohols of varying carbon-chain length using the full CC-VSCF method, the single-toall ͑STA͒ method without any CC screening ͑full CC͒, and the single-to-all method with CC screening ͑screened CC͒. All timings are relative to methanol at the full CC-VSCF level. The observed scaling was determined by linear regression of ln͑time͒ vs ln͑n͒, n being the total number of normal modes in each system.

VSCF timing
Full CC-VSCF The greater speed of the single-to-all method allows us to perform efficiently a series of benchmark calculations to investigate the influence of the quality of the PES on the computed vibrational frequency of the OH stretch at the CC-VSCF level of vibrational theory. We focus on methanol and ethanol, using potential energy surfaces computed at the PM3, MP2, and CCSD͑T͒ levels of theory using two types of pseudopotential and two all-electron basis sets of triple-zeta quality. Our results are summarized in Table III.
First of all, we observe that, as noted previously for other systems, 5 PM3 gives surprisingly good results. The vibrational frequency computed for methanol is only about 25 cm −1 lower than the experimental value and the computed value for ethanol is a mere 4 cm −1 above experiment. Obviously, the reliability of a potential energy surface computed at the PM3 level is far from comparable to that obtained using MP2 or CCSD͑T͒ and should therefore be used with caution. Nonetheless it is surprising that this semiempirical method is able to reproduce vibrational frequencies successfully at the CC-VSCF level, and this could be an advantage in connection with applications to large molecular systems. 26 MP2 remains the workhorse of most direct VSCF computations on medium-sized molecular systems, owing to its reasonable computational cost in relation to the quality of PES produced, 12 and as such it deserves particular attention. We investigate the influence of pseudopotentials on the OHstretch frequency at this level of ab initio theory, since we have already shown 5 that these speed up the calculation of the PES for large molecular systems dramatically.
Our results show that the inclusion of polarization functions in the valence basis set used with pseudopotentials is important for a realistic computation of the OH-stretch frequency. This is clearly illustrated by the difference between the frequency obtained at the MP2/SBK or MP2/CRENBL level and that obtained when a set of ͑d , p͒ functions has been added to each valence basis set. Further increasing the polarization functions to ͑2df ,2p͒ along with a set of diffuse functions does not produce a significant improvement in the predicted frequency and in some cases ͑CRENBL͒ leads to a worse prediction than the computationally less demanding ͑d , p͒-augmented set.
All-electron basis sets provide results of similar accuracy to those obtained with a polarized pseudopotential, but at a computational cost that can become prohibitive for large systems, even using the STA method. The best all-electron basis set is the TZP set as implemented in GAMESS, in agreement with the observations of Chaban et al. 4 for other systems. This basis set leads to a very accurate prediction of the OH-stretch frequency for the two examples considered and exhibits a much lower computational cost than the larger aug-cc-pVTZ basis set. We note that, for the cc-pVTZ basis set, the addition of diffuse functions does not lead to an improvement in the predicted frequency. It is worth mentioning that, at the MP2 level of theory, the large correlationconsistent Dunning basis set ͑with or without diffuse func-tions͒ is outperformed by both CRENBL͑d , p͒ and TZP basis sets for the systems investigated. Moreover, we observe that the computational cost of TZP and CRENBL͑d , p͒ is similar for most systems studied here, but we expect the pseudopotential basis to offer a more favorable computational scaling with increasing system size than the TZP set.
An improved treatment of electronic correlation using CCSD͑T͒ leads to a softer potential energy surface when compared to the MP2 results. We note that for all three basis sets tested, either all-electron or pseudopotential, the vibrational frequency computed is lower with CCSD͑T͒ than with MP2. This trend seems to indicate that MP2 remains a good compromise for efficient VSCF calculations on large systems. Nevertheless, CCSD͑T͒ is known to be a more robust method than MP2 27 and we can expect that, for some systems, the agreement obtained at the MP2 level of theory can vanish. We see, for example, that a CCSD͑T͒/TZP PES produces results that are about 20 cm −1 lower than the experimental value. In practice, high-level correlation methods, such as CCSD͑T͒, are more demanding in basis-set size 28 and we expect that a much larger set would be needed to obtain an agreement comparable to that of MP2/TZP results. 29

C. Application
The single-to-all method enables us to perform vibrational computations of a particular vibrational frequency in relatively short computational times, even for large molecular systems. As an example of the application of our method, we investigate the behavior of the OH-stretch frequency in relation to the increase in alkane-chain length at the PM3 and MP2 levels of theory with several basis sets.
Our results are shown in Table IV. Generally there is good agreement between predicted and observed frequencies, keeping in mind that some of the experimental values ͑propanol, butanol, and pentanol͒ have an accuracy of ±5 cm −1 and that the accepted accuracy of CC-VSCF calculations on a MP2 PES lies in the region of ±30 cm −1 . 8 The semiempirical Hamiltonian PM3 is mostly correct but not systematic in its accuracy: methanol has a relatively low predicted frequency compared to ethanol, but the remaining alcohols are fairly well described. The MP2 / SBK͑d , p͒ PES predicts a similar evolution of the OH-stretch frequency ͑red-shift͒, but, as shown in the previous section, the actual values of the frequencies remain low when compared with experiment due to the limited accuracy of the SBK͑d , p͒ doublezeta valence basis set. The STA-CC-VSCF results are improved by using a larger valence basis set such as the CRENBL pseudopotential along with its accompanying basis set, and at the MP2 level, the method manages to achieve a root mean square deviation from experiment ͑RMSD͒ of 33 cm −1 . The best results are obtained with a MP2/TZP PES for all compounds except propanol, which is slightly underestimated at this level of theory. We see that this combination of method and basis set has a RMSD of 11 cm −1 compared to experimental results.
We see that MP2 predicts OH accurately at the STA-CC-VSCF level of vibrational theory with a slight tendency to underestimate the experimental stretch frequency. However, as pointed out recently, 8 a CCSD͑T͒ approach would provide a more reliable prediction of the PES for these systems, even if the experimental agreement is expected to be slightly worse according to our benchmark calculations ͑see Table III͒.
There is a general trend in our computed values towards a redshifting of the OH-stretch frequency with increasing alkyl-chain length for all methods. This trend is consistent with the indication that molecules possessing a longer alkyl chain exhibit a larger polarizability, 31 thus leading to a slightly softer OH-bond vibration. However, due to the magnitude of the experimental uncertainty in the measured OHstretch frequencies, this predicted trend is not seen unequivocally in the experimental data.

IV. CONCLUSIONS
We have shown using several examples that the singleto-all method can be used to compute an accurate prediction of a single vibrational frequency in large systems, given an appropriate potential energy surface. Moreover, there is only a negligible difference between STA-CC-VSCF results and values computed using the full CC-VSCF algorithm of GAMESS. Our new method offers a better computational scaling as a function of the number of normal modes ͑n 4.5 ͒ than the full CC-VSCF ͑n 6.5 ͒ and is thus able to compute vibrational solutions efficiently even for a system containing as many as 54 normal modes such as pentanol.
Concerning the direct generation of MP2 potential energy surfaces, we observed that a set of ͑d , p͒ polarization functions is mandatory for pseudopotential basis sets when computing the OH-stretch frequency. However, a further increase in the number of polarization functions is not computationally advantageous for this type of PES. We also noticed that the addition of diffuse functions to the basis set of isolated molecules does not improve the agreement between the STA-CC-VSCF results and the experimental data. We saw, as observed by Chaban et al. 4 for different systems, that a MP2/ TZP potential energy surface leads to CC-VSCF results that are in very good agreement with experimental data. Our calculations showed that the STA-CC-VSCF results obtained for a MP2 / CRENBL͑d , p͒ surface have a better agreement with the experimental data than those obtained for a MP2 / SBK͑d , p͒ surface, yet CRENBL͑d , p͒ does not provide the same experimental agreement as that obtained using MP2/TZP PES for OH-stretch frequencies. Nonetheless, it is expected that a pseudopotential basis should offer better computational scaling than the all-electron TZP basis with increasing system size. For all systems investigated, the MP2-level potential energy surfaces were found to be more efficient and, due to some error compensations between the Band center for the A torsional species ͑Ref. 24͒ ͑uncertainty below 1 cm −1 ͒. b method and basis sets, they led to results that were closer to the experimental data than CCSD͑T͒ PES with the same basis sets. Our application of the STA-CC-VSCF technique to the computation of the vibrational frequencies of the OH stretch of five aliphatic alcohols showed that there is, in general, a good agreement between the predicted values and the experimental data. We also observed a consistent redshift of the predicted OH-stretch frequency at all levels of theory for these systems.
Finally, we would like to mention that the STA-CC-VSCF methodology is also able to describe accurately any given set of vibrational modes in large molecules in relatively modest computational time. This opens the door to accurate calculations-beyond the harmonic approximation-of NH-stretch frequencies in model proteins, for example, where the experimental detection of particular conformational isomers 32 depends on the precise position of their amide-band vibrations. Moreover, the Gerber group has very recently developed a new low-scaling version of the full CC-VSCF method 33 that reduces its computational scaling by two orders of magnitude. Our STA approach can easily be implemented on top of their new method and we expect that a combination of both approaches will lead to CC-VSCF schemes with a computational scaling close to n 2 for correlation corrections. Note that our technique is also applicable to vibrational degenerate perturbation theory 34 ͑DPT2͒ and vibrational configuration interaction 35 ͑VCI͒ calculations. An implementation of the STA scheme within the VCI method is currently underway in our laboratory.

ACKNOWLEDGMENTS
Dr. D. M. Lauvergnat and Dr. B. Madebene are acknowledged for helpful discussions. R. C. Benoit is gratefully acknowledged for proofreading the manuscript.