Overlap Matrix#

The overlap matrix is a defining component of a DFT data point in the atomic orbital basis, and is a requisite for DeepH inference:

\[ S_{i\alpha, j\beta}(\boldsymbol{R}_j-\boldsymbol{R}_i) = \int d^3 \boldsymbol{r} \phi^{*}_{i\alpha}(\boldsymbol{r}-\boldsymbol{R}_i-\mathcal{R}_i) \phi_{j\beta}(\boldsymbol{r}-\boldsymbol{R}_j-\mathcal{R}_j) \]

Here \(i,j\) label atoms, \(\alpha,\beta\) label orbitals on those atoms, \(\boldsymbol{R}_i,\boldsymbol{R}_j\) are lattice translation vectors to the unit cells containing atoms \(i,j\), and \(\mathcal{R}_i,\mathcal{R}_j\) are their positions within the unit cell. When SOC is present, \(\alpha\) and \(\beta\) have an additional spin degree of freedom.

In DeepH-dock, overlaps are computed using the efficient grid-based interpolation scheme from the HPRO package, originally proposed by Sankey and Niklewski.


CLI: Compute Overlap from Basis Files#

The user can use the terminal tool to compute overlap matrices given the basis files.

Supported DFT Codes#

DFT Code

Basis Format

CLI Command

SIESTA

.ion files

dock compute overlap ... siesta

OpenMX

.h5 (converted from .pao)

dock compute overlap openmx calc ...

SIESTA#

For the case where spin-orbit coupling (SOC) is present:

dock compute overlap path_to_poscar path_to_basis_files siesta -s

For the spinless case:

dock compute overlap path_to_poscar path_to_basis_files siesta

OpenMX: Compute Overlap from Input File#

OpenMX integration provides an automated workflow for:

  1. Converting PAO basis files to standardized HDF5 format

  2. Computing overlap matrices from OpenMX input files

Architecture#

Step 1: Basis Standardization
OpenMX PAO (.pao) → basis_convert.py → Standardized HDF5 (.h5)

Step 2: Overlap Computation (requires MPI)
OpenMX input (.dat) + basis.h5 → HPRO → overlap.h5, POSCAR, info.json

Requirements#

  • MPI Library: Required for HPRO integration

    # Ubuntu/Debian
    sudo apt-get install openmpi-bin libopenmpi-dev
    
    # Intel OneAPI (recommended for HPC)
    source /path/to/intel/oneapi/setvars.sh
    

CLI Commands#

Calculate Overlap Matrix#

# First time: provide PAO files for auto-conversion
dock compute overlap openmx calc openmx.in basis/ --raw-basis-dir pao_folder/

# Subsequent runs: use existing basis.h5
dock compute overlap openmx calc openmx.in basis/

# With MPI
mpirun -np 1 dock compute overlap openmx calc openmx.in basis/ --raw-basis-dir pao_folder/

Batch Convert PAO Files (Optional Pre-step)#

# Convert all PAO files in a directory
dock convert openmx convert-basis pao_folder/ basis/

# With pattern filter
dock convert openmx convert-basis pao_folder/ basis/ --pattern "*7.0.pao"

For detailed basis conversion workflow, see Convert OpenMX Demo.

Standardized Basis Format (basis.h5)#

The flat format is optimized for AI/ML tensor operations:

basis.h5
│
├── @element: str              # e.g., "Fe"
├── @basis_name: str           # e.g., "Fe6.0H"
├── @source: str               # e.g., "openmx"
├── @normalized: bool          # whether radial functions are normalized
├── @units_length: str         # "bohr"
│
├── radial_grid: [Nr]          # radial grid points (Bohr)
├── mul_list: [lmax+1] int     # number of orbitals per L
└── radial_basis: [total_orbitals, Nr]  # all radial functions stacked

Derived values:

  • lmax = len(mul_list) - 1

  • total_orbitals = sum(mul_list)

  • radial_cutoff = radial_grid.max()

Orbital Selection#

In OpenMX input files, specify which orbitals to use:

<Definition.of.Atomic.Species
 Mo   Mo7.0-s3p2d2   Mo_PBE19   # s3p2d2: 3 s + 2 p + 2 d orbitals
 Te   Te7.0-s3p2d2   Te_PBE19
Definition.of.Atomic.Species>

Syntax

Meaning

s2p2d2

2 s + 2 p + 2 d orbitals

s3p2d1f1

3 s + 2 p + 1 d + 1 f orbitals

Fe6.0H

No selection (use all orbitals)


Output Files#

Output files are saved in the same directory as the input:

output_dir/
├── overlap.h5      # Overlap matrix in DeepH format
├── info.json       # Metadata (atoms, orbitals, etc.)
└── POSCAR          # Crystal structure in VASP format

overlap.h5 Structure#

{
    'atom_pairs': np.ndarray,      # (N_pairs, 5): [Rx, Ry, Rz, i_atom, j_atom]
    'chunk_shapes': np.ndarray,    # (N_pairs, 2): shape of each block
    'chunk_boundaries': np.ndarray,# (N_pairs+1,): data boundaries
    'entries': np.ndarray,         # Flattened matrix elements
}

Python API#

For programmatic access within Python scripts.

SIESTA#

from pathlib import Path
from deepx_dock.compute.overlap.overlap import calc_overlap

# SIESTA overlap calculation
# poscar_path = Path('path/to/POSCAR')
# basis_dir = Path('path/to/basis_files/')
# overlap = calc_overlap(poscar_path, basis_dir, dft_code='siesta', Ecut=50.0)

OpenMX#

Note: OpenMX Python API requires MPI. Make sure MPI environment is available.

from pathlib import Path
from deepx_dock.convert.openmx.basis_convert import convert_pao_to_h5, parse_basis_definition

# Convert PAO to HDF5 (no MPI required)
pao_file = Path('/home/deeph/software/calc/OpenMX/build/openmx3.9/DFT_DATA19/PAO/Mo7.0.pao')
h5_file = Path('/tmp/Mo7.0.h5')

if pao_file.exists():
    convert_pao_to_h5(pao_file, h5_file)
    print(f"Converted: {pao_file.name} -> {h5_file.name}")
import h5py
import numpy as np

# Inspect the converted basis file
if h5_file.exists():
    with h5py.File(h5_file, 'r') as f:
        print("=== File Attributes ===")
        for key, val in f.attrs.items():
            print(f"  {key}: {val}")
        
        print("\n=== Datasets ===")
        for key in f.keys():
            print(f"  {key}: shape={f[key].shape}, dtype={f[key].dtype}")
        
        print("\n=== Derived Values ===")
        mul_list = f['mul_list'][:]
        print(f"  lmax: {len(mul_list) - 1}")
        print(f"  total_orbitals: {np.sum(mul_list)}")
        print(f"  radial_cutoff: {f['radial_grid'][-1]:.4f} Bohr")
# Load basis files using BasisLoader (requires MPI)
try:
    from deepx_dock.compute.overlap.openmx.loader import BasisLoader
    
    loader = BasisLoader(h5_file)
    print(f"Element: {loader.element}")
    print(f"lmax: {loader.lmax}")
    print(f"mul_list: {loader.mul_list}")
    
    # Get individual orbital
    func = loader.get_orbital(ell=0, mul=0)
    print(f"\nFirst s orbital shape: {func.shape}")
    
    # Get all orbitals for angular momentum l
    p_orbitals = loader.get_all_l(ell=1)
    print(f"All p orbitals shape: {p_orbitals.shape}")
    
except RuntimeError as e:
    if "MPI" in str(e):
        print("[Warning] MPI library not found. BasisLoader requires MPI.")
        print("Please source Intel OneAPI or install OpenMPI:")
        print("  source /path/to/intel/oneapi/setvars.sh")
    else:
        raise
# Compute overlap (requires MPI)
try:
    from deepx_dock.compute.overlap.openmx.calculator import OpenMXOverlapCalculator
    
    # Example: compute overlap from OpenMX input file
    # calculator = OpenMXOverlapCalculator(
    #     openmx_input='openmx.in',
    #     basis_dir='basis/',
    #     raw_basis_dir='pao_folder/',  # Optional, for auto-conversion
    #     ecut=50.0,
    # )
    # calculator.run()
    print("OpenMXOverlapCalculator loaded successfully.")
    print("Use calculator.run() to compute overlap matrices.")
    
except RuntimeError as e:
    if "MPI" in str(e):
        print("[Warning] MPI library not found. OpenMXOverlapCalculator requires MPI.")
        print("Please source Intel OneAPI or install OpenMPI:")
        print("  source /path/to/intel/oneapi/setvars.sh")
    else:
        raise

Troubleshooting#

MPI Library Not Found (OpenMX)#

RuntimeError: cannot load MPI library

Solution: Install MPI or source Intel OneAPI:

source /path/to/intel/oneapi/setvars.sh

PAO/Basis File Not Found (OpenMX)#

Solution: Provide --raw-basis-dir for auto-conversion:

dock compute overlap openmx calc openmx.in basis/ --raw-basis-dir /path/to/PAO/

For detailed OpenMX basis conversion workflow, see Convert OpenMX Demo.


Last Updated: 2025-03-24