
#########################
Acknowledgment of Dataset
#########################

If you use the model and/or any of the data, you must cite the associated preprint/article when available.

###############
MACE-UHTC Model
###############

- "mace_uhtc_stagetwo.model" and "mace_uhtc_stagetwo_compiled.model" are MACE model files for the finetuned MACE-UHTC model; the latter is the compiled version

- "train.xyz", "valid.xyz", and "test.xyz" are the DFT-generated training, validation and test sets, respectively, used to finetune the model

- "run_mace" is an example bash script with appropriate settings to finetune the model; it requires the MACE Python library
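A minimal sketch of how the finetuned model might be evaluated on one structure through ASE, assuming the MACE Python library (PyPI package "mace-torch") and ASE are installed and the files listed above sit in the working directory. The function name evaluate_first_test_structure, the CPU device, and the float64 precision are assumptions made for illustration; they are not settings taken from "run_mace".

```python
def evaluate_first_test_structure(
    model_path="mace_uhtc_stagetwo.model",
    xyz_path="test.xyz",
    device="cpu",
):
    """Return the predicted energy (eV) of the first structure in xyz_path."""
    # Imported inside the function so that reading or importing this sketch
    # does not require the heavy dependencies to be installed.
    from ase.io import read
    from mace.calculators import MACECalculator

    atoms = read(xyz_path, index=0)  # first frame of the extended-XYZ file
    atoms.calc = MACECalculator(
        model_paths=model_path,
        device=device,
        default_dtype="float64",  # assumption: evaluate in double precision
    )
    return atoms.get_potential_energy()
```

Calling evaluate_first_test_structure() from the directory that holds the model and "test.xyz" should return a single energy value; forces can be obtained the same way with the get_forces() method of the ASE Atoms object.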

######################
Crystal Structure Data
######################

- "equimolar_rocksalt_configs.xyz" and "equimolar_relaxed_configs.xyz" are the structures used in the calculation of elastic moduli for all equimolar mixtures with fixed rocksalt and fully relaxed structures, respectively

- The directory "structures_ternary" contains the crystal structures used in the calculations for non-equimolar HfCVCZrC; the files are named x_y_z.xyz, where x = proportion of V, y = proportion of Zr, and z = proportion of Hf
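The x_y_z.xyz naming convention can be decoded programmatically. The sketch below is one possible way to do so; the helper name parse_ternary_name is hypothetical, and it assumes that x, y, and z are written as plain numbers (integers or decimals), which this file does not specify.

```python
from pathlib import PurePath


def parse_ternary_name(filename):
    """Split an 'x_y_z.xyz' file name into V, Zr and Hf proportions."""
    name = PurePath(filename).name
    if not name.endswith(".xyz"):
        raise ValueError(f"expected a .xyz file, got {filename!r}")
    # Strip only the final extension so decimal points in x, y, z survive.
    stem = name[: -len(".xyz")]
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three '_'-separated fields in {filename!r}")
    v, zr, hf = (float(p) for p in parts)
    # Order follows the convention above: x = V, y = Zr, z = Hf.
    return {"V": v, "Zr": zr, "Hf": hf}
```

For example, a file named with the hypothetical fields 0.2_0.3_0.5 would be read as V = 0.2, Zr = 0.3, Hf = 0.5.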

######################
Physical Quantity Data
######################

- "Fig2_data.csv" and "Fig3_data.csv" contain the effective stabilisation temperatures and Young's moduli calculated for all equimolar mixtures and for non-equimolar HfCVCZrC in varying proportions of the transition metals, as plotted in Fig. 2 and Fig. 3, respectively

- "rom_rocksalt_equimolar_elastic_constants.csv", "fixed_rocksalt_equimolar_elastic_constants.csv", and "full_relaxation_equimolar_elastic_constants.csv" give the elastic moduli obtained for all equimolar mixtures via the rule-of-mixtures approximation, using fixed rocksalt structures, and using fully relaxed structures, respectively; used in Fig. 4a and in SI Fig. 4

- "Fig4_b.txt" contains the Young's modulus data for HfCVCZrC along a slice of the full non-equimolar composition space, used for Fig. 4

- "222_binary_DFT.csv", "222_binary_MPA0.csv", and "222_binary_UHTC.csv" are the elastic moduli for mixtures calculated in 2x2x2 cells using DFT, MACE-MPA0, and MACE-UHTC, respectively, and plotted in SI Fig. 2
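For readers unfamiliar with the rule-of-mixtures approximation mentioned above, the sketch below shows its simplest linear form, in which a property of the mixture is estimated as the composition-weighted average of the component values. This file does not state which variant was used to generate the data, so the linear form is an assumption, and the numbers are placeholders chosen for readability, not values from the dataset.

```python
def rule_of_mixtures(fractions, values):
    """Linear rule-of-mixtures estimate: sum over components of x_i * M_i."""
    if fractions.keys() != values.keys():
        raise ValueError("fractions and values must cover the same components")
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-8:
        raise ValueError(f"fractions sum to {total}, expected 1")
    return sum(fractions[name] * values[name] for name in fractions)


# Placeholder numbers only, NOT elastic moduli from the dataset.
placeholder_moduli = {"HfC": 100.0, "VC": 200.0, "ZrC": 300.0}
equimolar = {name: 1.0 / 3.0 for name in placeholder_moduli}
estimate = rule_of_mixtures(equimolar, placeholder_moduli)
```

With equal fractions the estimate reduces to the arithmetic mean of the component values; non-equimolar compositions only change the weights.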

#####################
Model Embeddings Data
#####################

- "mace_descriptors.npz" contains second-order embeddings of the MACE-UHTC model evaluated on the training set, used for Fig. 1

- The following Python script loads "mace_descriptors.npz" and plots the embeddings as a 2D UMAP projection. It requires the numpy, matplotlib, and umap (PyPI package "umap-learn") libraries:



import numpy as np
import matplotlib.pyplot as plt
import umap.umap_ as umap

# Load the descriptor archive
data = np.load("mace_descriptors.npz", allow_pickle=True)

X = data["descriptors"]              # embedding vectors
labels = data["labels"]              # chemical formulas
config_types = data["config_types"]  # configuration types: 2TM-6TM

# Project the descriptors onto a 2D UMAP representation

reducer = umap.UMAP(
    n_neighbors=15,
    min_dist=0.1,
    metric="cosine",
    random_state=1,
)

emb_2d = reducer.fit_transform(X)

# Draw one scatter series per configuration type
plt.figure(figsize=(6, 5))
for cfg in sorted(set(config_types)):
    mask = config_types == cfg
    plt.scatter(
        emb_2d[mask, 0],
        emb_2d[mask, 1],
        label=cfg,
        s=30,
        alpha=0.7,
    )

plt.xlabel("UMAP-1")
plt.ylabel("UMAP-2")
plt.legend(frameon=False, ncol=2)
plt.tight_layout()
plt.show()
