LARGE-SCALE QUANTUM-MECHANICAL CALCULATIONS WITH A MODIFIED HARTREE-FOCK FORMALISM

Explanation of deposited files:

Large_Scale_Hartree_Fock_main
---> the main file of the Hartree-Fock program

Large_Scale_Hartree_Fock_Notebook
---> main file of the Hartree-Fock program in Jupyter notebook form

Basis_Sets_Fits
---> necessary for generating basis sets

single_atom_densities
---> necessary to build density matrices

Beta_Carotene_Test
Insulin_Test
DNA_Test
---> coordinate files for tests with small molecules

bacteriophage_coordinates
bacteriophage_elements
bacteriophage_density_grid_no_solvent
---> coordinates, elements used for a bacteriophage calculation and returned density on a grid

vault_coordinates
vault_elements
vault_density_grid_no_solvent
---> coordinates, elements used for a vault calculation and returned density on a grid

flagellar_motor_coordinates
flagellar_motor_elements
flagellar_motor_density_grid_no_solvent
---> coordinates, elements used for a flagellar motor calculation and returned density on a grid






The following part of this read-me file gives an overview over the program and how to use to it
by explaining its variables, which values should be chosen and how the output will look like.



======
NIMBLE - a large-scale Hartree-Fock program
===========================================
Nearsighted Integral-Optimized Matrix-Based Large-Scale Electronic Structure Prediction
=           =                  =      =     =           =
=======================================================================================
version 3.10

This Hartree-Fock program is focused on fast calculations for large systems.

The key priciples:
    - Nearsightedness: Based on this principle by Walter Kohn, irrelevant and long-range interactions can be ignored.
    - Integral-Optimized: The integral calculation algorithm can ignore integrals with only small contributions to the result.
    - Matrix-Based: A matrix formalism can be derived which enables an electron repulsion integral scheme which is highly parallelizable.
    - Large-Scale Systems: Sparse formalisms and memory constraints allow fast calculations for systems with multiple of hundereds of atoms.

The key features:
    - electronic structure calculations for large-scale systems with >1,000,000 atoms
    - very fast calculations for systems with a few hundred to a few thousand atoms
    - subsystem/clustering approach for linear scalability
    - visualization of electonic densities
    - UV/Vis spectra for systems containing hundreds or thousands of atoms

To run this program:
1. Read the program configurations below the variable selections and change variables according to your needs. The default values will work fine in most cases.
2. Read the section ---MAIN--- at the very end of the program and follow the instructions there.

Author/Contact information:
Luc Wieners
Institute of Physics, University of Kassel, Heinrich-Plett-Straße 40, 34132 Kassel, Germany
lucwieners@physik.uni-kassel.de







-----------------------------------
VARIABLES:
(written as variable=default_value)
-----------------------------------

SCF CONFIGURATIONS:
max_scf_steps=100
scf_tolerance_density=1.0e-6
max_DIIS_linear_equations=max_scf_steps
pulay_mixing_rate=0.7
DIIS_penalty=1.05
added_electrons=0
level_shift_enabled=False
level_shift_value=0.3

PRECISION CONFIGURATIONS:
density_threshold=1.0e-4
coulomb_threshold=10.0
coulomb_threshold_low=8.0

BASIS FUNCTION CONFIGURATIONS:
sto_precision=3
lobe_precision=6

VERBOSITY:
verbosity=1

THREADING:
num_threads=4
max_electron_integrals=100000000

TIME-DEPENDENT HARTREE-FOCK:
time_steps=2000
delta_t=0.25
pulse_standard_deviation=0.2
pulse_shift_factor=10
e_field_max=2.0e-5
update=1

DIVIDE-AND-CONQUER HARTREE-FOCK:
partition_length=12.5
partition_cut_off=8.0
section_cut_off=10.0

DENSITY PLOTTING CONFIGURATIONS:
pixel_size=0.5
basis_function_space=5.0
additional_space=7.0
hide_core_electrons=True

ALPHA-FOLD PREDICTIONS:
amino_acids_per_cluster=5



-------------------------------
ADDITIONAL VARIABLES
(changes usually not necessary)
-------------------------------


PRECISION CONFIGURATIONS:
density_threshold_2=density_threshold
densities_threshold=density_threshold
densities_threshold_2=density_threshold

THREADING:
num_threads_G=num_threads
num_threads_integrals=num_threads
e_tensor_datatype='float64'

BASIS SET CONFIGURATIONS:
overlap_sto_precision=6
overlap_part_precision=7

ADVANCED VERBOSITY:
print_ERI_relevance=False
display_runtimes=False
display_eigenenergies=False








================
Paramater Manual
================


--------------------
MAIN CONFIGURATIONS:
--------------------
Change these parameters according to the calculation. If in doubt, use the default values.



SCF CONFIGURATIONS:

max_scf_steps: the maximum number of scf steps after which the calculation is stopped regardless of the convergence value. 
Default of max_scf_steps is 100, much higher values are not recommended as convergence should usually occur at latest after this amount of steps if no other error is present.

scf_tolerance_density: The convergence threshold value which terminates the calculation if the threshold is passed.
For the convergence value the RMSD of the density matrix is used i.e. sqrt(mean((P_{i-1}-P_{i})^2)). 
The default value is 1.0e-6. For more precise calculations (especially for real-time Hartree-Fock!) a stricter convergence threshold is recommended as for example 1.0e-8 or 1.0e-9.
For less precise calculations a higher value (for example 1.0e-4) can be used.

max_DIIS_linear_equations: The number of DIIS (direct inversion in the iterative subspace also called Pulay mixing) equations which is used to construct a new density matrix via Pulay mixing. 
The default is max_scf_steps which leads to stable results even for large systems (>1,000 atoms) which are otherwise extremely difficult to converge.

pulay_mixing_rate: The mixing rate of the density matrices during Pulay mixing. The mixing is done as P_new = pulay_mixing_rate*P_Pulay + (1-pulay_mixing_rate)*P_old, 
where P_old is the old density matrix and P_Pulay the newly constructed density matrix via Pulay mixing. The default value is 0.7. A higher mixing rate accelerates the scf procedure
but can lead to instabilities as wells. The mixing rate can be increased if the system contains a lot of water which often stabilizes the calculation. 

DIIS_penalty: A paramter which pushes the DIIS towards a final solution by favouring the lowest-energy state. This parameter stabilizes calculations where a switching between two or more
states occurs (also sometimes called charge sloshing). Very helpful for systems with small HOMO-LUMO (highes/lowest occupied molecular orbital) gaps. The default and recommended value is: 1.05.
Can be disabled by setting the parameter to 1.0 which corresponds to normal mixing. Values >=1.0 are valid, higher values than 1.5 are discouraged. 
The use of this parameter is highly recommended since there are no downsides.

added_electrons: Adds electrons to the system. A negative value removes electrons from the system. 
Is used to add charges to systems which is often done in real-time time-dependet Hartree-Fock calculations. A good measure of added_electrons is the difference between the number of electrons
in the system and the calculated position of the HOMO-LUMO gap.

level_shift_enabled: Artificially increases the HOMO-LUMO gap by adding an energy shift to the Fock matrix. This can help to converge calculation with small HOMO-LUMO gaps.
The default is 'False' since using only the parameter DIIS_penalty works well in most cases.

level_shift_value: The value by which the virtual orbital energies are shifted if level_shift_enabled=True. This parameter is only used if level_shift_enabled=True.
Default and recommended value: 0.3. Values are given in Hartree. Note that 1 Ha = 27.211 eV (electron volts).



THRESHOLDS:

density_threshold: sets the value below which electronic density contributions will be ignored.
Recommended values: standard precision: 1.0e-4
                    higher precision: 1.0e-6
                    minimum: 1.0e-10 (lower values will likely not influence calculation results)
                    maximum: 3.0e-4 (high accuracy loss for higher values)

coulomb_threshold and coulomb_threshold_low: sets the values above which the Coulomb interaction will be ignored or dampened, respectively. Values are in Angstrom!
Recommended values: 10 Angstrom and 8 Angstrom



BASIS FUNCTION CONFIGURATIONS:

sto_precision: determines how from how many Gaussian functions a STO-nG basis function is built.
Currently supported are 2,3,6 (corresponding to STO-2G, STO-3G, STO-6G). Default and recommendend: 3.
Note that this value only influences s-type orbitals since orbitals of higher angular momenta are treated with the Gaussian lobe function expansion algorithm.

lobe_precision: Determines the amount of Gaussian functions in a Gaussian lobe function expansion of a p-type orbital.
Currently supported are 4,6 (approximately corresponds to STO-2G, STO-3G). Default and recommended: 6.
Note that double the amount of Gaussians are needed here since one p-type orbital contains two lobes which are treated independently.



VERBOSITY: 

verbosity: 1 for run-time and integral-count outputs, 0 for no outputs at all (default: 1)



THREADING:

num_threads: the number of threads which will be started for the parallelized computation of the electron repulsion integrals and the G-matrix
             should correspond to the number of CPU cores available

max_electron_integrals: the maximum number of electron repulsion integrals that are used in a calculation



OPTIONAL: TIME-DEPENDENT HARTREE-FOCK
Note: all parameters below are ONLY used if a real-time time-dependent Hartree-Fock (RT-TDHF) calculation is done and will be ignored otherwise.

time_steps: The amount of time steps over which a real-time time-dependent Hartree-Fock calculation is propagated. The default value is: 2000.
Multiplied with delta_t (see below) the total propagation time can be calculated. This time should be around 500 atomic units of time (=~12.1 femtoseconds).
If absorption peaks towards infra-red are computed a higher time might be needed for a better resolution of low-frequency oscillations. A value of 1000 atomic units of time should be enough.
For ultra-violet calculations lower values than 500 might be used.

delta_t: The time of one time step. Default and recommended values is: 0.25. Given in atomic units of time. See also 'time_steps'.
A higher time step can lead to numerical instabilities (starting at around 0.3) and should therefore be avoided!
Lower time steps are possible but have no advantage and should therefore not be used.

pulse_standard_deviation: The standard deviation of the Gaussian pulse with which the system is excited in the calculation. Default and recommended value: 0.2. Given in atomic units of time.
A higher value can lead to different results since the RT-TDHF approach uses only a very small excitation.

pulse_shift_factor: Shifts the pulse forward on the time axis. Default value: 10. Given as a multiple of pulse_standard_deviation.
A much lower value leads to parts of the pulse not being on the positive side of the time axis and a much higher value leads to a large time span before the pulse in which no dynamics occur.

e_field_max: The electric field strength of the pulse. Default and recommended: 1.0e-5 (given in atomic units). This is the maximum value of the electric field.
This value should be chosen dependent on the size and dipole moment of the system which is studied. 
The values of the dipole moment during the time evolution should be between 1.0e-3 and 1.0e-6. They scale proportionally to e_field_max and e_field_max should therefore be chosen accordingly.
Too high values lead the system from linear to non-linear dynamics which is undesired and too low values make the time evolution susceptible to numerical instabilities.

update: How often an update during the time evolution is displayed in form of a console output. Default: 1. Given in the amount of time steps after which in update is shown.
The update contains information about the coorindate (x,y or z) of the evolution, the time step and the corresponding percentage of completion, 
the current dipole moment and the numerical stability (trace deviation of the density matrix). Especially for testing update=1 is recommended.



OPTIONAL: DIVIDE-AND-CONQUER HARTREE-FOCK
Note: all parameters below are ONLY used if a divide-and-conquer calculation is done and will be ignored otherwise.

Additional information for the parameters:
All values are in the variable section should be given in Angstrom!
The divide-and-conquer scheme for dividing into clusters using a 3D grid:
xxxxxxxxxx|---------------------|-----------------------|-----------0----------|-----------------------|---------------------|xxxxxxxxxxx
(ignored) |<--section_cut_off-->|<--partition_cut_off-->|<--partition_length-->|<--partition_cut_off-->|<--section_cut_off-->| (ignored)

partition_length: atoms in this partition/box are determined by the partition num (default: 12.5 Angstrom, variable always given in Angstrom)

partition_cut_off: atoms that are neighbors to atoms in the partition/box (default: 10 Angstrom, variable always given in Angstrom)

section_cut_off: atoms that could be added to avoid bond breaking (default: 10 Angstrom, variable always given in Angstrom)



OPTIONAL: DENSITY PLOTTING CONFIGURATIONS
Note: all parameters below are ONLY used if a density plotting calculation is done and will be ignored otherwise.

pixel_size: The size of one pixel / length of the 3D grid used for the visualization of the electronic density. Default: 0.5 Bohr atomic units. 
For larger systems a value of 1.0 or even 1.5 should be considered due to the high memory requirements of density grids.
For small sytems values up to 0.1 might work. Memory requirements scale cubical with the inverse of pixel_size since we use a 3D grid.

basis_function_space: Space around basis functions on their local grids. Basis functions grids get precomputed for better performance.
Default value: 5.0 Bohr atomic units.
Important: Has to be a multiple of pixel_size!

additional_space: Additional space around the computed structure. Should be larger than basis_function_space and can be used to generate empty outer regions on the grid.
Default value: 7.0 Bohr atomic units.

hide_core_electrons: Core electron basis functions will not be plotted during the density grid calculation. Default and recommended: True.
Core electrons have their electronic density concentrated to a small region which is problematic for visualizing electronic densities since only the core electron densities would be visible
since valence electron densities occupy more space but have lower values.



OPTIONAL: ALPHA-FOLD PREDICTIONS
Note: the parameter below are ONLY used if an atomic energy calculation for comparison with AlphaFold is done and will be ignored otherwise.

amino_acids_per_cluster: The clustering for atomic energy calculations for comparison with AlphaFold is done by 





--------------------------
ADDITIONAL CONFIGURATIONS:
--------------------------
Configurations which usually do not need to changed. Defaults are recommended.



# PRECISION CONFIGURATIONS

density_threshold_2: A lower cut-off for the density relevance value. Can be used to apply a smooth cut-off function for density cut-offs which is usually not done 
since the cut-off values are already small.
Default: density_threshold

densities_threshold: A different threshold value for the combination of two densities. A threshold is used both for the relevance of single densities and for the product of two densities.
Usually the same threshold value is used but the value for the product can be chosen differently if wanted.
Default: density_threshold

densities_threshold_2: A lower cut-off for the relevance value of the product of two densities (i.e. densities_threshold, see above). See also the explanation of density_threshold_2.
Default: density_threshold



# THREADING

num_threads_G: Number of threads for the G matrix computation. In most cases identical to num_threads. 
Note that on some compute nodes a higher number of threads can slow down calculations for large systems since the memory usage during G matrix computations can get very high.
Default: num_threads.

num_threads_integrals: Number of threads for the integral computation. In most cases identical to num_threads. 
Note that on some compute nodes a higher number of threads can slow down calculations for large systems since the memory usage during the integral computation can get very high.
Default: num_threads.

e_tensor_datatype: Datatype of the electron repulsion tensor in which the electron repulsion intgrals (ERIs) are stored. 
This is usually done in double precision but can be changed to single precision if the memory for double precision is not high enough. This can however influence the accuracy of the calculation.
Default: 'float64'.



# BASIS SET CONFIGURATIONS

overlap_sto_precision: The precision of s-type basis functions used to estimate the relevance of electronic densities can be chosen.
Lower values increase the computational speed of this part slightly but the relevance computation part is not performance-critical.
Default: 6. Currently 2,3,6 is supported.

overlap_part_precision: The precision of individual lobes of p-type basis functions used to estimate the relevance of electronic densities can be chosen.
Default: 7. Currently only 7 is supported.



ADVANCED VERBOSITY:

display_runtimes: more detailed outputs for scf runtimes. Useful for runtime analysis.
Default: False

display_eigenenergies: displays eigenenergies around the Fermi edge. Useful to locate the HOMO-LUMO gap.
default: False

print_ERI_relevance: prints detailed information about the distribution of absolute values of ERIs which cooresponds to their relevance.
Can be used to see how many ERIs get ignored in the relevance calculations. 
Default: False.





-------
OUTPUT:
-------
The console output of Nimble consists of run-time logs and density/ERI counts.
Below is an examplary output for beta-carotene (96 atoms) on an intel i5 processor using default paramenters:

Preparations:           23.6352 s
Overlap/Kinetic matrix: 3.6987 s
Nuclei matrix:          7.5813 s
Electron-electron integrals:
    |Relevant densities: 7819
    |Relevant density combinations: 4504594
  >Relevance computations:        4.5151 s
  >ERI computations:              22.3655 s
  >Second relevance computations: 0.1991 s
    |Total integrals:         4294967296
    |Unique integrals:        541089856
    |With relevant densities: 30572290
    |With relevant values:    3946243
  >Process ERIs:                  4.7365 s
Electron tensor:        34.8972 s
Ionic energy:           1.1126 s
Starting SCF cycle: 
SCF step | convergence measure:
       1 | 5.72175e-2  | -4339.2166657470425
       2 | 1.75679e-2  | -4342.3203387113235
       3 | 7.31540e-3  | -4343.106737365173
       4 | 3.79263e-3  | -4343.323867045995
       5 | 1.87254e-3  | -4343.37383228906
       6 | 8.65668e-4  | -4343.383901291234
       7 | 3.69166e-4  | -4343.3856308814475
       8 | 1.43351e-4  | -4343.385883556393
       9 | 5.24774e-5  | -4343.385916932791
      10 | 1.88662e-5  | -4343.385921231835
      11 | 7.04501e-6  | -4343.385921828226
      12 | 2.96648e-6  | -4343.385921929285
      13 | 1.45412e-6  | -4343.385921950744
      14 | 7.58954e-7  | -4343.385921955811
SCF times:
    |Core Hamiltonian: 0.0 s
    |Square root of inverse: 0.0226 s
    |SAD guess: 0.0007 s
    |G tensor: 5.6212 s
    |Fock matrix construction: 0.0 s
    |Commutator SPF-FPS: 0.0483 s
    |Save Fock matrix: 0.0 s
    |Error matrix calculation: 0.0 s
    |Energy calculation: 0.001 s
    |Transform Fock matrix: 0.004 s
    |Eigenvalue calculation: 0.0831 s
    |Transform eigenorbitals: 0.005 s
    |Density matrix: 0.0351 s
    |DIIS coefficients: 0.0104 s
    |DIIS equations: 0.0023 s
    |Mix new Fock matrix: 0.005 s
    |Convergence criterium: 0.0162 s
SCF cycle:              6.1007 s
_______________________________________
Total time:             77.0256 s
Total energy: -1528.5474502578945 Ha





------------------
PROGRAM STRUCTURE:
------------------
Main sections for the different applications are seperated with multi-line comments containing two lines of '======='.
Subsections for functions with usage in similar program parts are declared with comments containing lines of '-------'.
Functions contain a comment with further explanation.





------------
DISCLAIMERS:
------------
- We cannot guarantee that NIMBLE runs smoothly on every device as it is still in a testing phase. Consider contacting us
via the given contact information at the start of the program in case of difficulties.
- There are currently several functions with unused/dead variables. These are mostly attributed to force calculations which were suppported
in previous versions but not in version 3.10. Code for force calculations is still left in some places as forces will be implemented in the future.
It was removed in all cases where it would reduce the performance of the code.