OpenMX Interface#

Overview#

The OpenMX Interface module converts OpenMX (Open source package for Material eXplorer) calculation outputs to the DeepH data format. OpenMX, one of the earliest software packages to collaborate with DeepH, is an open-source DFT package for nanoscale material simulations developed primarily at the University of Tokyo, Japan. It uses norm-conserving pseudopotentials and pseudo-atomic localized basis functions to make large-scale electronic structure calculations efficient.

With this conversion module, researchers who already have OpenMX datasets can use DeepH's machine learning capabilities to accelerate electronic structure calculations while maintaining first-principles accuracy.

Note: This converter is designed for OpenMX 3.9 format, with automatic version detection from openmx.scfout headers. We welcome community contributions to expand support to additional OpenMX versions.

Preparing OpenMX Calculations#

Required OpenMX Settings#

To obtain the raw data required by DeepH-pack (Hamiltonian, overlap matrix, etc.) from OpenMX calculations, you must add the following line to your OpenMX input file (*.in):

HS.fileout ON

After the calculation completes, OpenMX writes the final structure and physical properties, including the Hamiltonian and overlap matrices, to the *.scfout file.
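Before launching a batch of calculations, it can help to confirm that each input file actually enables this keyword. The following is a minimal sketch, not part of DeepH-dock: the helper name hs_fileout_enabled is hypothetical, and it assumes that OpenMX treats the keyword case-insensitively and that "#" starts a comment in the input file.

```python
import re


def hs_fileout_enabled(input_text: str) -> bool:
    """Return True if the given OpenMX input text sets HS.fileout to ON.

    Hypothetical helper for illustration; assumes '#' starts a comment
    and that keyword matching is case-insensitive.
    """
    for raw_line in input_text.splitlines():
        # Drop anything after a '#' and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        match = re.fullmatch(r"HS\.fileout\s+(\S+)", line, flags=re.IGNORECASE)
        if match:
            return match.group(1).lower() == "on"
    # Keyword not present at all.
    return False


sample = """
System.Name   openmx
HS.fileout    ON   # write openmx.scfout
"""
print(hs_fileout_enabled(sample))  # → True
```

To check a real file, read it first, for example with Path("openmx.in").read_text(), and pass the resulting string to the helper.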

Important Considerations for OpenMX Calculations#

Atomic Position Constraints#

To avoid potential problems in determining atomic positions during conversion, it is recommended to place all atoms within the first unit cell when performing OpenMX calculations. Specifically:

  • All fractional coordinates should be strictly between 0 and 1

  • Avoid using exactly 0.00000 or 1.00000 for coordinates
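The two conditions above can be checked before running OpenMX. The sketch below is an illustration only: the function name and the tolerance eps are assumptions, and the coordinates are expected as plain lists of fractional values.

```python
from typing import List, Sequence


def atoms_outside_open_unit_interval(
    frac_coords: Sequence[Sequence[float]], eps: float = 1e-8
) -> List[int]:
    """Return indices of atoms with a fractional coordinate not strictly inside (0, 1).

    Hypothetical helper; eps is an assumed tolerance so that values that
    are numerically equal to 0 or 1 are also flagged.
    """
    bad = []
    for index, position in enumerate(frac_coords):
        if any(x <= eps or x >= 1.0 - eps for x in position):
            bad.append(index)
    return bad


coords = [
    [0.25, 0.50, 0.75],  # strictly inside the first unit cell
    [0.00, 0.50, 0.50],  # sits exactly on a cell boundary
    [1.20, 0.10, 0.10],  # lies outside the first unit cell
]
print(atoms_outside_open_unit_interval(coords))  # → [1, 2]
```

Flagged atoms can then be moved back into the cell, or the whole structure shifted rigidly by a small offset, before the calculation is run.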

Atomic Ordering#

OpenMX input files place no restrictions on the order of atoms. During conversion, however, the atoms may be reordered to match the standard POSCAR format, so make sure that later steps in your workflow do not rely on the original atom indices.

File Structure Organization#

This section assumes that you are familiar with the OpenMX documentation and with running DFT calculations in OpenMX. Organize the OpenMX data for your different material structures according to the following convention:

OpenMX Data Structure#

openmx_dataset/
├── structure_1/
│   ├── openmx.scfout
│   ├── openmx.out
│   └── other_output_files...
├── structure_2/
│   └── ...
└── ...

Here, structure_1, structure_2, etc., are the directory names of the individual structures; any valid directory name can be used. The DeepH-dock data conversion tool automatically transforms data organized this way into the format recognized by DeepH-pack.
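A quick pre-flight check of this layout can catch structure directories whose calculation did not produce an openmx.scfout file. The sketch below uses only the Python standard library. The function name is hypothetical and the check is not something the converter itself is known to perform.

```python
import tempfile
from pathlib import Path
from typing import List


def find_dirs_missing_scfout(
    dataset_dir: str, scfout_name: str = "openmx.scfout"
) -> List[str]:
    """Return names of structure directories that lack the scfout file.

    Hypothetical helper; assumes the tier-0 layout shown above, where
    each structure lives directly under dataset_dir.
    """
    root = Path(dataset_dir)
    missing = []
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (sub / scfout_name).is_file():
            missing.append(sub.name)
    return missing


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "structure_1").mkdir()
    (Path(tmp) / "structure_1" / "openmx.scfout").touch()
    (Path(tmp) / "structure_2").mkdir()  # no scfout written here
    print(find_dirs_missing_scfout(tmp))  # → ['structure_2']
```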

Converted DeepH Data Structure#

deeph_datasets/
├── structure_1/
│   ├── info.json
│   ├── POSCAR
│   ├── hamiltonian.h5      # Exported by default
│   ├── overlap.h5          # Exported by default
│   ├── density_matrix.h5   # Optional - requires --export-rho flag
│   └── position_matrix.h5  # Optional - requires --export-r flag
├── structure_2/
│   └── ...
└── ...
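After a conversion, the output tree can be compared against the layout above. The following sketch reports, for each structure directory, which of the default files are absent. The function name and the DEFAULT_FILES tuple are assumptions made for this illustration. If the conversion was run with --ignore-S or --ignore-H, the expected list should be adjusted accordingly.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Sequence

# Files the tree above shows for a default conversion (assumed list).
DEFAULT_FILES = ("info.json", "POSCAR", "hamiltonian.h5", "overlap.h5")


def missing_converted_files(
    deeph_dir: str, expected: Sequence[str] = DEFAULT_FILES
) -> Dict[str, List[str]]:
    """Map each structure directory to the expected files it lacks.

    Hypothetical helper; directories with nothing missing are omitted.
    """
    report = {}
    for sub in sorted(p for p in Path(deeph_dir).iterdir() if p.is_dir()):
        absent = [name for name in expected if not (sub / name).is_file()]
        if absent:
            report[sub.name] = absent
    return report


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for name in DEFAULT_FILES:
        (Path(tmp) / "structure_1").mkdir(exist_ok=True)
        (Path(tmp) / "structure_1" / name).touch()
    (Path(tmp) / "structure_2").mkdir()
    for name in ("info.json", "POSCAR", "hamiltonian.h5"):
        (Path(tmp) / "structure_2" / name).touch()
    print(missing_converted_files(tmp))  # → {'structure_2': ['overlap.h5']}
```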

Command Line Interface#

Basic Conversion Command#

You can convert OpenMX format data to DeepH format using the command line interface:

dock convert openmx to-deeph ./openmx_data /tmp/deeph_data -p 2

Expected output:

Data: 2it [00:00, 425.99it/s]
[done] Translation completed successfully!

Complete Command Line Options#

For detailed parameter information, use the help command:

dock convert openmx to-deeph -h
Usage: dock convert openmx to-deeph [OPTIONS] OPENMX_DIR DEEPH_DIR

  Translate the OpenMX output data to DeepH DFT data training set format.

Options:
  --ignore-S                  Do not export overlap.h5
  --ignore-H                  Do not export hamiltonian.h5
  --export-rho                Export density_matrix.h5
  --export-r                  Export position_matrix.h5
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all
                              of the cores.  [default: -1]
  -t, --tier-num INTEGER      The tier number of the OpenMX source data, -1
                              for [openmx_dir], 0 for
                              <openmx_dir>/<data_dirs>, 1 for
                              <openmx_dir>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  --force                     Force to overwrite the existing files.
  -h, --help                  Show this message and exit.

Parameter Details#

Export Options#

  • Default exports: hamiltonian.h5 and overlap.h5

  • Optional exports:

    • --export-rho: Exports density matrices (density_matrix.h5)

    • --export-r: Exports position matrices (position_matrix.h5)

  • Disabling default exports:

    • --ignore-S: Disables overlap matrix export (overlap.h5)

    • --ignore-H: Disables Hamiltonian matrix export (hamiltonian.h5)
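When the conversion is driven from a script, the flags can be assembled programmatically. The sketch below builds the argument list using only the options shown in the help output. The function name and its keyword arguments are hypothetical, and the command is only constructed, not executed.

```python
from typing import List


def build_convert_command(
    openmx_dir: str,
    deeph_dir: str,
    *,
    ignore_s: bool = False,
    ignore_h: bool = False,
    export_rho: bool = False,
    export_r: bool = False,
    parallel_num: int = -1,
    tier_num: int = 0,
    force: bool = False,
) -> List[str]:
    """Assemble a 'dock convert openmx to-deeph' argument list (hypothetical helper)."""
    cmd = ["dock", "convert", "openmx", "to-deeph"]
    if ignore_s:
        cmd.append("--ignore-S")
    if ignore_h:
        cmd.append("--ignore-H")
    if export_rho:
        cmd.append("--export-rho")
    if export_r:
        cmd.append("--export-r")
    # Pass the numeric options explicitly so the call documents itself.
    cmd += ["-p", str(parallel_num), "-t", str(tier_num)]
    if force:
        cmd.append("--force")
    # Positional arguments follow the options, as in the usage line.
    cmd += [openmx_dir, deeph_dir]
    return cmd


print(" ".join(build_convert_command(
    "./openmx_data", "/tmp/deeph_data", export_rho=True, parallel_num=2
)))
# → dock convert openmx to-deeph --export-rho -p 2 -t 0 ./openmx_data /tmp/deeph_data
```

The returned list can be handed to subprocess.run(cmd, check=True) once the dock executable is available on the PATH.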

Parallel Processing#

  • -p, --parallel-num: Controls parallel processing (default: -1, uses all cores)

  • Throughput generally improves with more parallel processes, but memory usage typically grows as well, since more structures are processed at the same time

  • Use lower values for memory-constrained environments

Tier Number Specification#

  • -t, --tier-num: Defines the directory level at which the data directories are located

    • -1: <openmx_dir> itself is the data directory

    • 0: Data directories at <openmx_dir>/<data_dirs> (default)

    • 1: Data directories at <openmx_dir>/<tier1>/<data_dirs>

    • Higher numbers for deeper nesting
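One way to picture the tier number is as the depth of a glob pattern below the top-level directory. The sketch below mirrors the description in the help text. It illustrates the convention and is not the converter's actual implementation; the function name is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def data_dirs_for_tier(openmx_dir: str, tier_num: int) -> List[Path]:
    """List the directories a given tier number would address (illustrative only)."""
    root = Path(openmx_dir)
    if tier_num < -1:
        raise ValueError("tier_num must be -1 or larger")
    if tier_num == -1:
        # The top-level directory is itself the single data directory.
        return [root]
    # tier 0 -> "*", tier 1 -> "*/*", tier 2 -> "*/*/*", ...
    pattern = "/".join(["*"] * (tier_num + 1))
    return sorted(p for p in root.glob(pattern) if p.is_dir())


# Demonstration on a throwaway tree with one intermediate level.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "batch_a" / "structure_1").mkdir(parents=True)
    (Path(tmp) / "batch_b" / "structure_2").mkdir(parents=True)
    print([p.name for p in data_dirs_for_tier(tmp, 0)])  # → ['batch_a', 'batch_b']
    print([p.name for p in data_dirs_for_tier(tmp, 1)])  # → ['structure_1', 'structure_2']
```

In this example the structures sit one level below the batch directories, so the matching call to the converter would use -t 1.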

Advanced Usage: Python Class API#

For programmatic access within Python scripts, you can use the OpenMXDatasetTranslator class:

from pathlib import Path

from deepx_dock.convert.openmx.translate_openmx_to_deeph import OpenMXDatasetTranslator

# Convert every structure under ./openmx_data using 2 parallel jobs
translator = OpenMXDatasetTranslator(
    openmx_data_dir="./openmx_data",
    deeph_data_dir="/tmp/deeph_data",
    n_jobs=2,
)
translator.transfer_all_openmx_to_deeph()

# List the converted structures and the files written for one of them
print([str(v) for v in Path("/tmp/deeph_data").iterdir()])
print([str(v) for v in Path("/tmp/deeph_data/MoTe2").iterdir()])

Expected output:

Data: 2it [00:00, 57.31it/s]
['/tmp/deeph_data/MoTe2', '/tmp/deeph_data/Bi2Se3_SOC']
['/tmp/deeph_data/MoTe2/POSCAR', '/tmp/deeph_data/MoTe2/hamiltonian.h5', '/tmp/deeph_data/MoTe2/overlap.h5', '/tmp/deeph_data/MoTe2/info.json']

Troubleshooting#

Common Issues#

  1. Missing *.scfout files: Ensure HS.fileout ON is set in the OpenMX input file (*.in) before running the calculation

  2. Fractional coordinate errors: Verify all atomic positions are strictly between 0 and 1

  3. Version compatibility issues: Check that OpenMX version is 3.9 or compatible

Getting Help#

  • Consult the OpenMX documentation

  • Report issues and contribute to the project via the community repository


This conversion module bridges traditional quantum mechanical calculations and machine learning approaches in materials science, allowing researchers to reuse existing OpenMX datasets as DeepH training data.