DeepH Interface
Upgrade
This interface converts data from the legacy DeepH-E3/DeepH-2 formats to the updated DeepH-pack specification. Earlier DeepH versions provided the foundational functionality, but their data formats were limited in scalability, metadata richness, and interoperability, and these limits became more apparent as both the framework and user needs evolved.
Recognizing these constraints, we have fundamentally upgraded and re-engineered the underlying data storage architecture. The new DeepH-pack format delivers enhanced performance, improved extensibility for complex workflows, and native support for evolving features like advanced descriptor systems and high-throughput computation pipelines. To ensure a smooth transition for our user community, this built-in converter maintains full backward compatibility, allowing researchers to systematically migrate existing datasets without interrupting ongoing projects or losing valuable computational investments.
Through this upgrade path, legacy DeepH users can seamlessly transition to the modernized specification, immediately gaining access to the latest optimizations and expanded capabilities while preserving their historical work.
dock convert deeph upgrade ./deeph_legacy_data ./deeph_updated_data -p 2
The available arguments can be listed with:
dock convert deeph upgrade -h
Usage: dock convert deeph upgrade [OPTIONS] LEGACY_DIR UPDATED_DIR

  Convert data from legacy DeepH-E3/DeepH-2 formats to the updated DeepH-pack specification

Options:
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all of the cores.
                              [default: -1]
  -t, --tier-num INTEGER      The tier number of the source data, -1 for [legacy], 0 for
                              <legacy>/<data_dirs>, 1 for <legacy>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  --force                     Force to overwrite the existing files.
  -h, --help                  Show this message and exit.
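The -t/--tier-num levels can be illustrated with a throwaway directory tree. The sketch below assumes that the tier number counts the directory levels between the root passed on the command line and the data directories, which is one reading of the help text above. All directory names and the `data_dirs` helper are hypothetical, and the sketch does not call `dock` itself.

```shell
# Build a throwaway layout to illustrate the -t/--tier-num levels.
# Every directory name below is a hypothetical placeholder.
root=$(mktemp -d)
mkdir -p "$root/legacy_t0/sample_a" "$root/legacy_t0/sample_b"
mkdir -p "$root/legacy_t1/group_1/sample_a" "$root/legacy_t1/group_2/sample_b"

# List the directories that would count as data directories for "-t N"
# under the assumption stated above:
#   N = -1 -> the root itself, N = 0 -> its children, N = 1 -> grandchildren.
data_dirs() {
    depth=$(( $2 + 1 ))
    find "$1" -mindepth "$depth" -maxdepth "$depth" -type d | sort
}

tier0=$(data_dirs "$root/legacy_t0" 0)   # <legacy>/<data_dirs>
tier1=$(data_dirs "$root/legacy_t1" 1)   # <legacy>/<tier1>/<data_dirs>
echo "$tier0"
echo "$tier1"
# Clean up afterwards with: rm -rf "$root"
```

Under this reading, `legacy_t0` would be converted with the default `-t 0`, `legacy_t1` with `-t 1`, and a single data directory such as `legacy_t0/sample_a` with `-t -1`.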
Downgrade
This interface supports bidirectional conversion, so data in the current DeepH-pack format can be converted back to the legacy DeepH-E3/DeepH-2 specifications. Researchers may have established workflows, scripts, or analysis tools that are tightly coupled to the older data formats. To ensure continuity and prevent disruption, especially for projects under active development or nearing publication, the downgrade capability provides a practical bridge that lets you proceed with time-sensitive work without delay.
However, while this path offers immediate compatibility, we strongly encourage a gradual migration of your toolchains to the new DeepH-pack format. The legacy formats lack the extensibility, performance optimizations, and rich metadata support intrinsic to the updated specification. Future development, including performance enhancements and new features, will be focused exclusively on the DeepH-pack ecosystem.
Our long-term roadmap, centered on the DeepH-dock project, aims to build a robust, user-friendly production toolkit for a broader scientific community. This next-generation platform will be designed natively for the DeepH-pack format, offering streamlined workflows, advanced functionalities, and improved interoperability. Migrating your data now will ensure seamless access to these powerful upcoming tools and unlock the full potential of the DeepH framework for your future research.
dock convert deeph downgrade ./deeph_updated_data ./deeph_legacy_data -p 2
The available arguments can be listed with:
dock convert deeph downgrade -h
Usage: dock convert deeph downgrade [OPTIONS] UPDATED_DIR LEGACY_DIR

  Convert data from updated DeepH-pack format to legacy DeepH-E3/DeepH-2 formats

Options:
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all of the cores.
                              [default: -1]
  -t, --tier-num INTEGER      The tier number of the updated source data, -1 for [updated], 0 for
                              <updated>/<data_dirs>, 1 for <updated>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  --force                     Force to overwrite the existing files.
  -h, --help                  Show this message and exit.
Standardize
The Standardize module addresses a gauge freedom inherent in the tight-binding Hamiltonian representation used by DeepH. In the localized orbital basis, the Hamiltonian \(H\) is determined only up to a multiple of the overlap matrix \(S\): the transformation \(H \rightarrow H + \mu S\) (where \(\mu\) is an arbitrary real constant) shifts every eigenvalue by the same amount \(\mu\) and leaves the eigenvectors unchanged, so physical observables such as band structures, densities of states, and total energies are invariant up to this choice of energy zero. This is known as the \(\mu\)-gauge freedom or scalar gauge symmetry.
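The invariance can be checked directly from the generalized eigenvalue problem for \(H\) and \(S\). The symbols \(\varepsilon_n\) and \(c_n\) for the eigenvalues and eigenvectors are introduced here for illustration only:

\[
H c_n = \varepsilon_n S c_n
\quad\Longrightarrow\quad
(H + \mu S)\, c_n = (\varepsilon_n + \mu)\, S c_n .
\]

Every eigenvalue moves by the same constant \(\mu\) while the eigenvectors \(c_n\) stay fixed, so the transformation only redefines the zero of energy.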
While physically inconsequential, this arbitrary shift poses a significant challenge for machine learning. A model trained on data with inconsistent gauge choices may learn spurious features tied to the arbitrary offset \(\mu\), rather than the underlying electronic structure, leading to reduced generalization ability and prediction stability.
This tool is designed to systematically remove the \(\mu\)-gauge freedom by applying a consistent, well-defined standardization procedure to the predicted \((H, S)\) matrices. By aligning all Hamiltonians to a common gauge reference (e.g., by enforcing a trace condition or aligning to a reference onsite energy), it produces a unique, canonical representation of the electronic Hamiltonian that is invariant under the original symmetry.
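As an illustration of how a trace condition could fix the gauge, one may choose the shift \(\mu^{*}\) that makes the standardized Hamiltonian \(\tilde H\) traceless. This is a sketch only: the condition actually applied by the tool is not specified in this section, and \(\tilde H\) and \(\mu^{*}\) are notation introduced here.

\[
\tilde H = H + \mu^{*} S, \qquad
\mu^{*} = -\frac{\operatorname{Tr} H}{\operatorname{Tr} S}
\quad\Longrightarrow\quad
\operatorname{Tr} \tilde H = 0 .
\]

If the input is shifted as \(H \rightarrow H + \mu S\), the computed shift becomes \(\mu^{*} - \mu\), so

\[
(H + \mu S) + (\mu^{*} - \mu)\, S = H + \mu^{*} S = \tilde H ,
\]

and the standardized result does not depend on the arbitrary offset \(\mu\).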
The standardized output is significantly more suited for downstream machine learning tasks, ensuring that models learn robust physical relationships, improving training convergence, and enhancing transferability across different systems and configurations. Use this module to preprocess DeepH-generated Hamiltonians before employing them in further property prediction, workflow automation, or dataset construction.
dock convert deeph standardize ./deeph_updated_data --overwrite -p 2
The available arguments can be listed with:
dock convert deeph standardize -h
Usage: dock convert deeph standardize [OPTIONS] DEEPH_DIR

  Standardize DeepH Hamiltonian, so that it can eliminate the mu gauge.

Options:
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all of the cores.
                              [default: -1]
  --overwrite                 Overwrite the existing Hamiltonian file.
  -t, --tier-num INTEGER      The tier number of the source data, -1 for [source], 0 for
                              <source>/<data_dirs>, 1 for <source>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  -h, --help                  Show this message and exit.
Minus core
Users can remove the single-atom Hamiltonian contribution, which significantly reduces the range of Hamiltonian values, and add it back later with the --backward option.
for d1 in $(ls ./deeph_updated_data); do
    dock convert deeph minus-core ./deeph_updated_data/$d1 ./deeph_minus_core/$d1 ./deeph_single_atoms -t -1 -p 1
done
The available arguments can be listed with:
dock convert deeph minus-core -h
Usage: dock convert deeph minus-core [OPTIONS] FULL_DIR CORRECTED_DIR SINGLE_ATOMS_DIR

  Remove or add back the single atomic Hamiltonian, which significantly reduces the range of
  Hamiltonian values.

Options:
  --transform-offsite-blocks  Estimate and remove the offsite Hamiltonian blocks by single atomic
                              Hamiltonians and offsite overlaps.
  --copy-other-files          Copy other files to the input/output dir.
  --backward                  Transform back.
  -p, --parallel-num INTEGER  The parallel processing number, -1 for using all of the cores.
                              [default: -1]
  -t, --tier-num INTEGER      The tier number of the source data, -1 for [source], 0 for
                              <source>/<data_dirs>, 1 for <source>/<tier1>/<data_dirs>, etc.
                              [default: 0]
  -h, --help                  Show this message and exit.
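A round trip with the --backward option might look like the sketch below. It assumes that --backward keeps the same positional order and writes the restored Hamiltonians to FULL_DIR, which the help text does not state explicitly, so please confirm against `dock convert deeph minus-core -h` before relying on it. The `run` helper, the `sample_a` data directory, and the `./deeph_restored` directory are hypothetical. By default the helper only prints the commands, so the sketch can be inspected without DeepH-dock installed.

```shell
# Hypothetical helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

d1=sample_a                          # hypothetical data directory name
full=./deeph_updated_data/$d1        # full Hamiltonians
core=./deeph_minus_core/$d1          # Hamiltonians with single-atom part removed
atoms=./deeph_single_atoms           # single-atom reference data
restored=./deeph_restored/$d1        # hypothetical target for the round trip

# Forward: remove the single-atom Hamiltonian (same flags as the loop above).
run dock convert deeph minus-core "$full" "$core" "$atoms" -t -1 -p 1

# Backward: add it back.
# ASSUMPTION: the positional order is unchanged and FULL_DIR is the destination.
run dock convert deeph minus-core "$restored" "$core" "$atoms" --backward -t -1 -p 1
```

Setting `DRY_RUN=0` executes the commands instead of printing them. Writing the restored data to a separate directory keeps the original files available for comparison.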