OpenMX Interface#
Overview#
The OpenMX Interface module converts OpenMX (Open source package for Material eXplorer) calculation outputs to the DeepH data format. OpenMX, one of the earliest software packages that DeepH interfaced with, is an open-source DFT package for nanoscale material simulations developed primarily at the University of Tokyo, Japan. It uses norm-conserving pseudopotentials and pseudo-atomic localized basis functions to make large-scale electronic structure calculations efficient.
With this module, researchers who already have OpenMX datasets can use DeepH’s machine learning capabilities to accelerate electronic structure calculations while maintaining first-principles accuracy.
Note: This converter targets the OpenMX 3.9 output format and detects the version automatically from the openmx.scfout header. Community contributions that extend support to other OpenMX versions are welcome.
Preparing OpenMX Calculations#
Required OpenMX Settings#
To obtain the raw data required by DeepH-pack (Hamiltonian, overlap matrix, etc.) from OpenMX calculations, you must add the following line to your OpenMX input file (*.in):
HS.fileout ON
After the calculation completes, the final structure and physical properties are stored in the *.scfout file.
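Before launching a large batch of calculations, it can help to confirm that every input file actually enables this keyword. The following is a minimal sketch, not part of DeepH-dock: the function name hs_fileout_enabled is hypothetical, and it assumes that OpenMX keywords and values are case-insensitive and that # starts a comment.

```python
import re

# Matches a line such as "HS.fileout   ON", ignoring case and leading whitespace.
# Assumption: keyword and value are case-insensitive in OpenMX input files.
_HS_FILEOUT = re.compile(r"^\s*HS\.fileout\s+(\S+)", re.IGNORECASE)


def hs_fileout_enabled(input_text: str) -> bool:
    """Return True if the input text sets HS.fileout to ON.

    If the keyword appears more than once, the last occurrence wins
    (an assumption made for this sketch).
    """
    enabled = False
    for line in input_text.splitlines():
        # Assumption: '#' starts a comment, so drop everything after it.
        line = line.split("#", 1)[0]
        match = _HS_FILEOUT.match(line)
        if match:
            enabled = match.group(1).lower() == "on"
    return enabled
```

A typical call would read the file first, for example hs_fileout_enabled(Path("openmx.in").read_text()), where the file name is only an illustration.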
Important Considerations for OpenMX Calculations#
Atomic Position Constraints#
To avoid potential problems in determining atomic positions during conversion, it is recommended to place all atoms within the first unit cell when performing OpenMX calculations. Specifically:
All fractional coordinates should be strictly between 0 and 1
Avoid using exactly 0.00000 or 1.00000 for coordinates
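The coordinate rule above can be checked before running a calculation. This is a minimal sketch under stated assumptions: the function name find_boundary_atoms and the tolerance tol are hypothetical choices for illustration, not values taken from the converter.

```python
def find_boundary_atoms(frac_coords, tol=1e-5):
    """Return indices of atoms whose fractional coordinates are not safely inside (0, 1).

    frac_coords is a sequence of (x, y, z) fractional coordinates.
    tol is an assumed safety margin; a coordinate within tol of 0 or 1,
    or outside the unit cell entirely, flags the atom.
    """
    flagged = []
    for index, coord in enumerate(frac_coords):
        if any(x <= tol or x >= 1.0 - tol for x in coord):
            flagged.append(index)
    return flagged
```

Flagged atoms can then be shifted slightly away from the cell boundary, or translated by a lattice vector, before the OpenMX input is written.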
Atomic Ordering#
In OpenMX input files, there are no restrictions on the order of atoms. During conversion, atom order may be rearranged to match the standard POSCAR format. Ensure your workflow can accommodate this potential reordering.
File Structure Organization#
This section assumes you are familiar with running DFT calculations in OpenMX. Organize the OpenMX output for each material structure according to the following convention:
OpenMX Data Structure#
openmx_dataset/
├── structure_1/
│ ├── openmx.scfout
│ ├── openmx.out
│ └── other_output_files...
├── structure_2/
│ └── ...
└── ...
Here, structure_1, structure_2, etc., represent the names of individual datasets and can be any combination of characters. The DeepH-dock data conversion tool will automatically transform this organized data into the format recognized by DeepH-pack.
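To check a dataset against this layout before conversion, a short script can list which structure directories contain the expected output file. This is a sketch only: split_by_scfout is a hypothetical helper, and the default file name openmx.scfout is taken from the tree shown above; pass a different name if your calculations use another System.Name.

```python
from pathlib import Path


def split_by_scfout(dataset_dir, scfout_name="openmx.scfout"):
    """Return (ready, missing): names of subdirectories with and without the scfout file."""
    ready, missing = [], []
    for sub in sorted(Path(dataset_dir).iterdir()):
        if not sub.is_dir():
            continue  # loose files at the top level are not structures
        if (sub / scfout_name).is_file():
            ready.append(sub.name)
        else:
            missing.append(sub.name)
    return ready, missing
```

Directories reported as missing usually indicate a calculation that did not finish or one that was run without HS.fileout ON.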
Converted DeepH Data Structure#
deeph_datasets/
├── structure_1/
│ ├── info.json
│ ├── POSCAR
│ ├── hamiltonian.h5 # Exported by default
│ ├── overlap.h5 # Exported by default
│ ├── density_matrix.h5 # Optional - requires --export-rho flag
│ └── position_matrix.h5 # Optional - requires --export-r flag
├── structure_2/
│ └── ...
└── ...
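After a conversion, the output tree can be checked against the layout above. The sketch below is not part of DeepH-dock; missing_outputs is a hypothetical helper, and its default file list covers only the files the tree marks as always present or exported by default.

```python
from pathlib import Path

# Files the layout above shows for a default conversion (no --export-* flags).
DEFAULT_FILES = ("info.json", "POSCAR", "hamiltonian.h5", "overlap.h5")


def missing_outputs(deeph_dir, expected=DEFAULT_FILES):
    """Map each structure directory name to the expected files it lacks.

    Directories that contain every expected file are omitted from the result.
    """
    report = {}
    for sub in sorted(Path(deeph_dir).iterdir()):
        if not sub.is_dir():
            continue
        absent = [name for name in expected if not (sub / name).is_file()]
        if absent:
            report[sub.name] = absent
    return report
```

An empty dictionary means every structure directory holds the full default set. When density or position matrices were requested, extend expected with density_matrix.h5 or position_matrix.h5.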
Command Line Interface#
Basic Conversion Command#
You can convert OpenMX format data to DeepH format using the command line interface:
dock convert openmx to-deeph ./openmx_data /tmp/deeph_data -p 2
Expected output:
Data: 2it [00:00, 425.99it/s]
[done] Translation completed successfully!
Complete Command Line Options#
For detailed parameter information, use the help command:
dock convert openmx to-deeph -h
Usage: dock convert openmx to-deeph [OPTIONS] OPENMX_DIR DEEPH_DIR

  Translate the OpenMX output data to DeepH DFT data training set format.

Options:
  --ignore-S                  Do not export overlap.h5
  --ignore-H                  Do not export hamiltonian.h5
  --export-rho                Export density_matrix.h5
  --export-r                  Export position_matrix.h5
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all
                              of the cores.  [default: -1]
  -t, --tier-num INTEGER      The tier number of the OpenMX source data, -1
                              for [openmx_dir], 0 for
                              <openmx_dir>/<data_dirs>, 1 for
                              <openmx_dir>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  --force                     Force to overwrite the existing files.
  -h, --help                  Show this message and exit.
Parameter Details#
Export Options#
Default exports:
hamiltonian.h5 and overlap.h5 are exported by default
Optional exports:
--export-rho: Exports density matrices (density_matrix.h5)
--export-r: Exports position matrices (position_matrix.h5)
--ignore-S: Disables overlap matrix export
--ignore-H: Disables Hamiltonian matrix export
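When conversions are launched from a script, the flags can be assembled programmatically. The following sketch uses only the options listed in the help output; build_convert_command is a hypothetical helper, and it assumes the dock executable is on PATH when the command is eventually run.

```python
def build_convert_command(
    openmx_dir,
    deeph_dir,
    *,
    export_rho=False,
    export_r=False,
    ignore_s=False,
    ignore_h=False,
    parallel_num=-1,
    tier_num=0,
    force=False,
):
    """Build the argument list for `dock convert openmx to-deeph`.

    The defaults mirror those in the help output (-p -1, -t 0).
    """
    cmd = ["dock", "convert", "openmx", "to-deeph", str(openmx_dir), str(deeph_dir)]
    if ignore_s:
        cmd.append("--ignore-S")
    if ignore_h:
        cmd.append("--ignore-H")
    if export_rho:
        cmd.append("--export-rho")
    if export_r:
        cmd.append("--export-r")
    cmd += ["-p", str(parallel_num), "-t", str(tier_num)]
    if force:
        cmd.append("--force")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Building a list rather than a shell string avoids quoting problems with directory names that contain spaces.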
Parallel Processing#
-p, --parallel-num: Controls parallel processing (default: -1, uses all cores)
Performance scales with core count, but memory usage increases proportionally
Use lower values for memory-constrained environments
Tier Number Specification#
-t, --tier-num: Defines the directory hierarchy level for data access
-1: Access to <openmx_dir> itself
0: Access to <openmx_dir>/<data_dirs> (default)
1: Access to <openmx_dir>/<tier1>/<data_dirs>
Higher numbers for deeper nesting
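The tier number can be read as the count of intermediate directory levels between the root and the data directories. The sketch below illustrates that reading of the help text; candidate_data_dirs is a hypothetical function and is not how DeepH-dock necessarily resolves directories internally.

```python
from pathlib import Path


def candidate_data_dirs(openmx_dir, tier_num=0):
    """List the directories a given tier number would point at.

    -1 -> the root itself
     0 -> <root>/<data_dirs>
     1 -> <root>/<tier1>/<data_dirs>, and so on.
    """
    root = Path(openmx_dir)
    if tier_num < -1:
        raise ValueError("tier_num must be -1 or greater")
    if tier_num == -1:
        return [root]
    # One "*" per directory level below the root.
    pattern = "/".join(["*"] * (tier_num + 1))
    return sorted(p for p in root.glob(pattern) if p.is_dir())
```

Listing the candidates this way before a long conversion is a quick check that the chosen tier number matches how the dataset is nested.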
Advanced Usage: Python Class API#
For programmatic access within Python scripts, you can use the OpenMXDatasetTranslator class:
from deepx_dock.convert.openmx.translate_openmx_to_deeph import OpenMXDatasetTranslator
translator = OpenMXDatasetTranslator(
openmx_data_dir="./openmx_data",
deeph_data_dir="/tmp/deeph_data",
n_jobs=2,
)
translator.transfer_all_openmx_to_deeph()
Data: 2it [00:00, 57.31it/s]
from pathlib import Path
print([str(v) for v in list(Path("/tmp/deeph_data").iterdir())])
print([str(v) for v in list(Path("/tmp/deeph_data/MoTe2").iterdir())])
['/tmp/deeph_data/MoTe2', '/tmp/deeph_data/Bi2Se3_SOC']
['/tmp/deeph_data/MoTe2/POSCAR', '/tmp/deeph_data/MoTe2/hamiltonian.h5', '/tmp/deeph_data/MoTe2/overlap.h5', '/tmp/deeph_data/MoTe2/info.json']
Troubleshooting#
Common Issues#
Missing *.scfout files: Ensure HS.fileout ON is set in OpenMX input
Fractional coordinate errors: Verify all atomic positions are strictly between 0 and 1
Version compatibility issues: Check that OpenMX version is 3.9 or compatible
Getting Help#
Consult the OpenMX documentation
Report issues and contribute to the project via the community repository
This conversion module bridges traditional quantum mechanical calculations and machine learning approaches in materials science by letting researchers reuse existing OpenMX datasets for machine learning applications.