Physics-based Descriptor Calculation

Updated 10/12/2025

Calculate physics-based molecular descriptors using computational chemistry methods. Upload SMILES for global properties or CDXML files for atom/bond-level analysis including atomic charges, steric parameters, and dihedral angles.

Table of Contents (Estimated reading time: 12-15 minutes)

  1. Descriptor calculation overview
  2. Calculating global molecular properties via SMILES upload
  3. Building a molecule library
  4. Library naming and description
  5. Upload SMILES
  6. Explore library
  7. Calculating physics-based descriptors
  8. View descriptors
  9. Calculating atom/bond-level and global molecular properties via CDXML upload
  10. Preparing the CDXML file
  11. Upload the file
  12. CDXML calculations
  13. CDXML results
  14. Feature formats

Descriptor calculation overview

There are two main pipelines: one calculates global molecular properties and accepts SMILES upload; the other calculates global plus atom/bond-level properties (atomic charges, steric parameters) and requires a .cdxml file upload. This document covers global calculations first, then atom/bond-level calculations via .cdxml upload.

Calculating global molecular properties via SMILES upload

If you only want global molecular properties, you can upload molecules in SMILES format.

Building a molecule library

Start by pressing "new library". You will be prompted to proceed via SMILES or ChemDraw file upload. We will start with SMILES upload.

New Library Interface

Library naming and description

You will be prompted to name the library and provide a description.

Library Naming Interface

Upload SMILES

Next, either upload a CSV containing SMILES or copy and paste them in. To paste, copy the whole row of SMILES from a CSV, or a comma-separated list of SMILES, and paste it into the box. Press "upload SMILES". The SMILES are validated and automatically saved as a CSV for your future reference. In this example, we uploaded 286 SMILES strings.

SMILES Upload Interface
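
If your SMILES live in a CSV alongside other columns, a short script can turn them into the comma-separated list that the paste box accepts. The following is a minimal sketch using only the Python standard library; the column name `smiles`, the helper name `smiles_from_csv`, and the example data are assumptions for illustration, not part of the platform.

```python
import csv
import io


def smiles_from_csv(text, column="smiles"):
    """Return the non-empty entries of one CSV column, in file order.

    `column` is a hypothetical header name; change it to match your file.
    """
    reader = csv.DictReader(io.StringIO(text))
    result = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the SMILES cell is blank
            result.append(value)
    return result


# Made-up example file content; in practice, read your own CSV from disk.
example_csv = "name,smiles\nethanol,CCO\nbenzene,c1ccccc1\nblank,\n"

smiles = smiles_from_csv(example_csv)
paste_ready = ",".join(smiles)  # comma-separated list for the paste box
print(paste_ready)
```

The script only strips blank cells. It does not check that each string is a chemically valid SMILES, so format validation is still left to the upload step.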

Explore library

You can now see that your library has been created in the left panel. Click the library to load it.

Library Panel Interface

Press the "molecules" tab to explore individual molecules in the library. Structures are automatically converted to 3D via optimization with MMFF, and RDKit 2D descriptors are automatically computed and displayed in the panel on the right.

Molecules Tab Interface

Calculating physics-based descriptors

To compute 3D, physics-based descriptors, press "calculate descriptors" or "pipeline view". Here you can select which molecules from the library to submit for calculation. Pressing the topmost check box selects all of them.

Then, choose computational chemistry parameters. In the beta version, these parameters are quite limited, but conformational analysis and higher levels of theory will be available soon. Press "calculate" under the "calculate features" header to run the pipeline. Run time depends on the number, size, and type of molecules and on the level of theory selected. When the run is finished, the job table shows "complete".

Descriptor Calculation Interface

View descriptors

Press "view descriptors" or "library view" to reopen the molecule library. Now, if you click the "3D" tab in the rightmost box, you can view the descriptors that were calculated via the computational chemistry pipeline for each molecule. Pressing the download icon provides an organized CSV with descriptors (physics-based and/or RDKit 2D) that can be used for predictive modeling (after adding data labels, e.g., yields or selectivity), chemical space analysis, or other tasks.

View Descriptors Interface
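
Adding data labels to the downloaded CSV can be done in a spreadsheet or with a few lines of code. The sketch below joins a descriptor table to a label table on a shared key column using only the Python standard library. The column names (`smiles`, `descriptor_1`, `descriptor_2`, `yield`), the helper name `attach_labels`, and all values are placeholders invented for illustration; the actual column headers in the downloaded file may differ.

```python
import csv
import io


def attach_labels(descriptor_csv, label_csv, key="smiles", label="yield"):
    """Add a label column to descriptor rows by matching on a shared key column.

    Returns (labeled_rows, keys_without_label). Matching is by exact string,
    so the key must be written identically in both files.
    """
    label_by_key = {}
    for row in csv.DictReader(io.StringIO(label_csv)):
        label_by_key[row[key]] = row[label]

    labeled, unlabeled = [], []
    for row in csv.DictReader(io.StringIO(descriptor_csv)):
        if row[key] in label_by_key:
            labeled.append({**row, label: label_by_key[row[key]]})
        else:
            unlabeled.append(row[key])
    return labeled, unlabeled


# Placeholder data standing in for the downloaded descriptors and your labels.
descriptors = (
    "smiles,descriptor_1,descriptor_2\n"
    "CCO,0.1,0.2\n"
    "c1ccccc1,0.3,0.4\n"
    "CCN,0.5,0.6\n"
)
labels = "smiles,yield\nCCO,85\nc1ccccc1,40\n"

labeled, unlabeled = attach_labels(descriptors, labels)
print(len(labeled), "labeled rows;", "no label for:", unlabeled)
```

Returning the unmatched keys separately makes it easy to spot molecules that were featurized but have no experimental label yet, rather than silently dropping them.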

Calculating atom/bond-level and global molecular properties via CDXML upload

This process gives the same whole-molecule properties as above and, if the molecules in the ChemDraw file are annotated, also provides atomic charges, Sterimol B1, B5, and L parameters for defined bonds, and dihedral angles.

Preparing the CDXML file

First, you will need to prepare a file in ChemDraw. If you want atom/bond-level properties, you must label the atoms of the core conserved structure of your molecules. For example, I have a group of about 29 BOX ligands I want to featurize. You can see how I labeled the core structures:

ChemDraw Labeling Example

To add atom labels, we recommend giving them "letter" assignments. You can do this by hovering over an atom and pressing the apostrophe (') key. ChemDraw assigns a default label automatically, but you can edit the text box to be a letter, as in the example. We also recommend putting a text box below each chemical structure with a molecule number, ID, or name. This will be saved with the molecule when the ChemDraw file is parsed. See the sample CDXML file:

Sample CDXML File

Upload the file

After pressing "new library", choose CDXML upload.

CDXML Upload Interface

Now, you will be asked to name the library and provide a description, and you will be shown the contents of the file. This includes the SMILES of the molecules, as well as the conserved atom labels that you provided. The bonds and dihedrals will also be parsed.

CDXML File Contents

CDXML calculations

Now, navigate to the new library that you created. 3D structures and 2D descriptors are computed automatically, as in the SMILES upload workflow. The difference comes when you go to calculate descriptors. After selecting the molecules you want descriptors for, an option to turn on atom- and bond-level features appears in the pipeline. If you want these descriptors, turn the toggle on as in the example below. Then, run the calculation.

CDXML Calculation Interface

CDXML results

Now, you will see a descriptor library on the right side under the "3D" tab that contains the atom- and bond-level features along with whole-molecule features. These features can all be downloaded as a CSV in an organized, ML-ready format by pressing the download button.

CDXML Results Interface

Feature formats

Features are in the formats below:

Whole molecule properties:

Whole Molecule Properties

Sterimol L parameter for the bond between atoms A and B:

Sterimol L Parameter

Sterimol B1 parameter for the bond between atoms A and B:

Sterimol B1 Parameter

Sterimol B5 parameter for the bond between atoms A and B:

Sterimol B5 Parameter

Partial charge on atom A:

Partial Charge

Dihedral angle of dihedral defined by atoms A-B-C-D:

Dihedral Angle
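
To make the dihedral feature concrete, the sketch below computes the signed dihedral angle for four atoms A-B-C-D from their 3D coordinates, using the standard atan2 formulation with only the Python standard library. This illustrates the geometric quantity and is not the platform's own implementation; the function name, the example coordinates, and the sign convention are assumptions, and the convention used in the downloaded features is not stated here.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral_degrees(a, b, c, d):
    """Signed dihedral angle A-B-C-D in degrees, in the range (-180, 180]."""
    b1 = _sub(b, a)  # bond vector A -> B
    b2 = _sub(c, b)  # bond vector B -> C (the rotation axis)
    b3 = _sub(d, c)  # bond vector C -> D
    n1 = _cross(b1, b2)  # normal to the plane A-B-C
    n2 = _cross(b2, b3)  # normal to the plane B-C-D
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: A, B, C fixed; D placed cis, trans, or out of plane.
A, B, C = (1, 0, 0), (0, 0, 0), (0, 1, 0)
cis = dihedral_degrees(A, B, C, (1, 1, 0))
trans = dihedral_degrees(A, B, C, (-1, 1, 0))
perpendicular = dihedral_degrees(A, B, C, (0, 1, 1))
print(cis, trans, perpendicular)
```

Using atan2 rather than the arccosine of the angle between the plane normals keeps the sign of the angle and stays numerically stable near 0 and 180 degrees.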