schrodinger.active_learning.al_ligand_ml_utils module

schrodinger.active_learning.al_ligand_ml_utils.addExtraFeatures(training_csv, batched_csv_files, extra_features)

Read the extra features from the batched CSV input files and add them to the training data. This only works for AL-FEP, where the batched CSV files are small enough to be read into RAM.

Args:

training_csv (str): Path to the training CSV file.

batched_csv_files (list): List of paths to the batched CSV files.

extra_features (str): Name of the extra features column.

Returns:

None
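The merge described above can be pictured with a minimal pandas sketch. This is not the module's implementation: the function name, the `key` join column (`"SMILES"`), and the choice to overwrite the training file in place are all assumptions made for illustration.

```python
import pandas as pd


def add_extra_features_sketch(training_csv, batched_csv_files, extra_features,
                              key="SMILES"):
    """Merge one extra column from batched CSVs into the training CSV.

    `key` is a hypothetical join column; the real module may match rows
    in a different way.
    """
    # Load every batch fully into memory; only viable when the inputs are small.
    batches = [pd.read_csv(path, usecols=[key, extra_features])
               for path in batched_csv_files]
    extra = pd.concat(batches, ignore_index=True).drop_duplicates(subset=key)

    training = pd.read_csv(training_csv)
    # A left join keeps every training row, even if no batch has its feature.
    merged = training.merge(extra, on=key, how="left")
    # Assumption: the training file is rewritten in place (the function returns None).
    merged.to_csv(training_csv, index=False)
```

Because every batch is concatenated in memory before the join, this approach scales only to inputs of modest size, which is consistent with the AL-FEP restriction noted above.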

schrodinger.active_learning.al_ligand_ml_utils.check_license()

Check for the existence of the AUTOQSAR license and exit with an error message if the license is not found.

schrodinger.active_learning.al_ligand_ml_utils.setup_large_dataset_workers()

Determines the number of cores available on the host and subtracts one from that number, so that the job does not monopolize every core. If only one core is available, returns 1.
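The rule above amounts to a one-line calculation. The sketch below shows it using the standard library; the function name and the optional `available` argument are hypothetical, and the real function may detect cores in another way.

```python
import os


def worker_count_sketch(available=None):
    """Return a worker count that leaves one core free when possible."""
    if available is None:
        # os.cpu_count() can return None when the count cannot be determined.
        available = os.cpu_count() or 1
    # Never drop below one worker, even on a single-core host.
    return max(1, available - 1)
```

For example, under this sketch a host with 8 cores yields 7 workers, while hosts with 1 or 2 cores both yield 1.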

schrodinger.active_learning.al_ligand_ml_utils.write_sorted_results(results, output_csv, sorted_by_uncertainty=None, sample_ratio=0.1, ascending=True)

Sort the results from ligand_ml by prediction value and optionally create uncertainty-sorted output.

Parameters:
  • results (pandas.core.frame.DataFrame) – DataFrame that contains the ligand_ml predictions before sorting.

  • output_csv (str) – path to the file that receives the results sorted by prediction.

  • sorted_by_uncertainty (str or None) – path to the file that receives the results sorted by uncertainty; optional.

  • sample_ratio (float) – ratio of top samples, a value in the open interval (0, 1).

  • ascending (bool) – sort predictions from smallest to largest if True
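A rough pandas sketch of this kind of two-file output follows. It is illustrative only: the column names `pred` and `uncertainty` are hypothetical, and the way `sample_ratio` is applied here (keep the top fraction by prediction, then rank that subset from most to least uncertain) is an assumption, since the documentation does not spell it out.

```python
import pandas as pd


def write_sorted_results_sketch(results, output_csv, sorted_by_uncertainty=None,
                                sample_ratio=0.1, ascending=True,
                                pred_col="pred", unc_col="uncertainty"):
    """Write prediction-sorted results and, optionally, an uncertainty-sorted file.

    pred_col and unc_col are hypothetical column names.
    """
    if not 0 < sample_ratio < 1:
        raise ValueError("sample_ratio must lie in the open interval (0, 1)")

    # Primary output: all rows ordered by predicted value.
    by_pred = results.sort_values(pred_col, ascending=ascending)
    by_pred.to_csv(output_csv, index=False)

    if sorted_by_uncertainty is not None:
        # Assumption: keep the top fraction by prediction (at least one row),
        # then order that subset from most to least uncertain.
        n_top = max(1, int(len(by_pred) * sample_ratio))
        top = by_pred.head(n_top)
        top.sort_values(unc_col, ascending=False).to_csv(
            sorted_by_uncertainty, index=False)
```

Leaving `sorted_by_uncertainty` as None writes only the prediction-sorted file in this sketch.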