Pixel Classification

class detectree.ClassifierTrainer(*, sigmas=None, num_orientations=None, neighborhood=None, min_neighborhood_range=None, num_neighborhoods=None, tree_val=None, nontree_val=None, classifier_class=None, **classifier_kwargs)

Train binary tree/non-tree classifier(s) of the pixel features.

__init__(*, sigmas=None, num_orientations=None, neighborhood=None, min_neighborhood_range=None, num_neighborhoods=None, tree_val=None, nontree_val=None, classifier_class=None, **classifier_kwargs)

Initialize the classifier.

See the background example notebook for details.

Parameters:
  • sigmas (list-like, optional) – The list of scale parameters (sigmas) to build the Gaussian filter bank that will be used to compute the pixel-level features. The provided argument will be passed to the initialization method of the PixelFeaturesBuilder class. If no value is provided, the value set in settings.GAUSS_SIGMAS is used.

  • num_orientations (int, optional) – The number of equally-distributed orientations to build the Gaussian filter bank that will be used to compute the pixel-level features. The provided argument will be passed to the initialization method of the PixelFeaturesBuilder class. If no value is provided, the value set in settings.GAUSS_NUM_ORIENTATIONS is used.

  • neighborhood (array-like, optional) – The base neighborhood structure that will be used to compute the entropy features. The provided argument will be passed to the initialization method of the PixelFeaturesBuilder class. If no value is provided, a square with a side size of 2 * min_neighborhood_range + 1 is used.

  • min_neighborhood_range (int, optional) – The range (i.e., the square radius) of the smallest neighborhood window that will be used to compute the entropy features. The provided argument will be passed to the initialization method of the PixelFeaturesBuilder class. If no value is provided, the value set in settings.ENTROPY_MIN_NEIGHBORHOOD_RANGE is used.

  • num_neighborhoods (int, optional) – The number of neighborhood windows (whose size follows a geometric progression starting at min_neighborhood_range) that will be used to compute the entropy features. The provided argument will be passed to the initialization method of the PixelFeaturesBuilder class. If no value is provided, the value set in settings.ENTROPY_NUM_NEIGHBORHOODS is used.

  • tree_val (int, optional) – The value that designates tree pixels in the response images. The provided argument will be passed to the initialization method of the PixelResponseBuilder class. If no value is provided, the value set in settings.RESPONSE_TREE_VAL is used.

  • nontree_val (int, optional) – The value that designates non-tree pixels in the response images. The provided argument will be passed to the initialization method of the PixelResponseBuilder class. If no value is provided, the value set in settings.RESPONSE_NONTREE_VAL is used.

  • classifier_class (class, optional) – The class of the classifier to be trained. It can be any scikit-learn compatible estimator that implements the fit, predict and predict_proba methods and that can be saved to and loaded from memory using skops. If no value is provided, the value set in settings.CLF_CLASS is used.

  • classifier_kwargs (key-value pairings, optional) – Keyword arguments that will be passed to the initialization of classifier_class. If no value is provided, the value set in settings.CLF_KWARGS is used.
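
The relation between min_neighborhood_range, num_neighborhoods and the default square neighborhood can be illustrated with a minimal sketch. The helper name neighborhood_sides is hypothetical and not part of detectree, and the common ratio of 2 is an assumption, since this reference only states that the window sizes follow a geometric progression.

```python
def neighborhood_sides(min_neighborhood_range, num_neighborhoods, ratio=2):
    """Return the side length of each square neighborhood window.

    Hypothetical helper, not part of detectree. The ratio of the geometric
    progression is an assumption (this reference does not state it).
    """
    sides = []
    for i in range(num_neighborhoods):
        # Range (square radius) of the i-th window in the progression.
        window_range = min_neighborhood_range * ratio**i
        # A square of radius r spans r pixels on each side of the center pixel.
        sides.append(2 * window_range + 1)
    return sides


sides = neighborhood_sides(min_neighborhood_range=2, num_neighborhoods=3)
print(sides)
```

The first entry corresponds to the default base neighborhood, a square with a side of 2 * min_neighborhood_range + 1.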

train_classifier(*, split_df=None, response_img_dir=None, img_filepaths=None, response_img_filepaths=None, img_dir=None, img_filename_pattern=None, method=None, img_cluster=None)

Train a classifier.

See the background example notebook for more details.

Parameters:
  • split_df (pandas DataFrame, optional) – Data frame with the train/test split.

  • response_img_dir (str representing path to a directory, optional) – Path to the directory where the response tiles are located. Required if split_df is provided. Otherwise, response_img_dir is either ignored (if response_img_filepaths is provided) or used as the directory in which to look for the images whose filenames match img_filename_pattern.

  • img_filepaths (list-like, optional) – List of paths to the input tiles whose features will be used to train the classifier. Ignored if split_df is provided.

  • response_img_filepaths (list-like, optional) – List of paths to the binary response tiles that will be used to train the classifier. Ignored if split_df is provided.

  • img_dir (str representing path to a directory, optional) – Path to the directory where the images whose filename matches img_filename_pattern are to be located. Ignored if split_df or img_filepaths is provided.

  • img_filename_pattern (str representing a file-name pattern, optional) – Filename pattern to be matched in order to obtain the list of images. If no value is provided, the value set in settings.IMG_FILENAME_PATTERN is used. Ignored if split_df or img_filepaths is provided.

  • method ({'cluster-I', 'cluster-II'}, optional) – Method used in the train/test split.

  • img_cluster (int, optional) – The label of the cluster of tiles. Only used if method is ‘cluster-II’.

Returns:

clf – The trained classifier.

Return type:

scikit-learn-like classifier
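
The precedence among split_df, img_filepaths and img_dir described in the parameters above can be sketched as follows. This is not detectree code: resolve_img_filepaths is a hypothetical helper, split_df_paths stands in for the tile paths that a real split_df would carry, and the default pattern "*.tif" and the ValueError are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def resolve_img_filepaths(split_df_paths=None, img_filepaths=None,
                          img_dir=None, img_filename_pattern="*.tif"):
    """Pick the list of training tiles following the documented precedence.

    Hypothetical helper, not detectree code. The default pattern is an
    assumption standing in for settings.IMG_FILENAME_PATTERN.
    """
    if split_df_paths is not None:
        # The train/test split overrides every other input.
        return list(split_df_paths)
    if img_filepaths is not None:
        # Explicit paths override the directory/pattern pair.
        return list(img_filepaths)
    if img_dir is not None:
        # Fall back to matching the filename pattern inside the directory.
        return sorted(str(p) for p in Path(img_dir).glob(img_filename_pattern))
    raise ValueError("Provide split_df_paths, img_filepaths or img_dir.")


with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("tile_0.tif", "tile_1.tif", "notes.txt"):
        (Path(tmp_dir) / name).touch()
    from_dir = resolve_img_filepaths(img_dir=tmp_dir)
    from_list = resolve_img_filepaths(img_filepaths=["a.tif"], img_dir=tmp_dir)

print(len(from_dir), from_list)
```

In the sketch, the directory is only scanned when neither the split nor an explicit list of paths is given, which mirrors the "Ignored if ..." notes in the parameter descriptions.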

train_classifiers(split_df, response_img_dir)

Train a classifier for each first-level cluster in split_df.

See the background example notebook for more details.

Parameters:
  • split_df (pandas DataFrame) – Data frame with the train/test split, which must have an img_cluster column with the first-level cluster labels.

  • response_img_dir (str representing path to a directory) – Path to the directory where the response tiles are located.

Returns:

clf_dict – Dictionary mapping each first-level cluster label to a scikit-learn-like classifier.

Return type:

dictionary
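
A minimal sketch of how such a dictionary might be consumed, assuming that the keys are the first-level cluster labels and the values are the classifiers. ConstantClassifier and predict_for_cluster are hypothetical stand-ins written for this illustration; they are not part of detectree or scikit-learn, and the labels 255 and 0 are arbitrary.

```python
class ConstantClassifier:
    """Stand-in for a trained scikit-learn-like classifier (hypothetical)."""

    def __init__(self, label):
        self.label = label

    def predict(self, X):
        # Return the same label for every row of features.
        return [self.label for _ in X]


# Assumed layout: first-level cluster label -> classifier for that cluster.
clf_dict = {0: ConstantClassifier(255), 1: ConstantClassifier(0)}


def predict_for_cluster(clf_dict, img_cluster, X):
    """Select the classifier of the tile's cluster and predict with it."""
    return clf_dict[img_cluster].predict(X)


pixels = [[0.1, 0.2], [0.3, 0.4]]
print(predict_for_cluster(clf_dict, 0, pixels))
print(predict_for_cluster(clf_dict, 1, pixels))
```

Keeping one classifier per cluster means that a tile is always classified by the model trained on tiles similar to it.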

class detectree.Classifier(*, clf=None, clf_dict=None, tree_val=None, nontree_val=None, refine=None, refine_beta=None, refine_int_rescale=None, **pixel_features_builder_kwargs)

Use trained classifier(s) to predict tree pixels.

__init__(*, clf=None, clf_dict=None, tree_val=None, nontree_val=None, refine=None, refine_beta=None, refine_int_rescale=None, **pixel_features_builder_kwargs)

Initialize the classifier instance.

See the background example notebook for more details.

Parameters:
  • clf (scikit-learn-like classifier, optional) – Trained classifier. If no value is provided, the latest detectree pre-trained classifier is used. Ignored if clf_dict is provided.

  • clf_dict (dictionary, optional) – Dictionary mapping each first-level cluster label to a trained scikit-learn-like classifier.

  • tree_val (int, optional) – Label used to denote tree pixels in the predicted images. If no value is provided, the value set in settings.CLF_TREE_VAL is used.

  • nontree_val (int, optional) – Label used to denote non-tree pixels in the predicted images. If no value is provided, the value set in settings.CLF_NONTREE_VAL is used.

  • refine (bool, optional) – Whether the pixel-level classification should be refined by optimizing the consistency between neighboring pixels. If no value is provided, the value set in settings.CLF_REFINE is used.

  • refine_beta (int, optional) – Parameter of the refinement procedure that controls the smoothness of the labelling. Larger values lead to smoother shapes. If no value is provided, the value set in settings.CLF_REFINE_BETA is used.

  • refine_int_rescale (int, optional) – Parameter of the refinement procedure that controls the precision of the transformation of float to integer edge weights, required for the employed graph cuts algorithm. Larger values lead to greater precision. If no value is provided, the value set in settings.CLF_REFINE_INT_RESCALE is used.

  • pixel_features_builder_kwargs (dict, optional) – Keyword arguments that will be passed to detectree.PixelFeaturesBuilder, which customize how the pixel features are built.
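
The role of refine_int_rescale can be illustrated with a minimal sketch of a float-to-integer weight conversion. The exact transformation used by detectree is not given in this reference, so to_int_weight is a hypothetical helper that simply scales and rounds, and the weights are made-up values.

```python
def to_int_weight(weight, refine_int_rescale):
    """Scale a float edge weight and round it to the nearest integer.

    Hypothetical helper: the transformation used by detectree is not
    given in this reference.
    """
    return int(round(weight * refine_int_rescale))


# Made-up float weights; the first two differ only in the fourth decimal.
weights = [0.1234, 0.1236, 0.9]

for rescale in (10, 10000):
    int_weights = [to_int_weight(w, rescale) for w in weights]
    # Map the integers back to floats to measure what the rounding lost.
    max_error = max(abs(w - iw / rescale) for w, iw in zip(weights, int_weights))
    print(rescale, int_weights, max_error)
```

With a factor of 10, the first two weights collapse to the same integer, whereas a factor of 10000 keeps them distinct. This is the sense in which larger values give greater precision.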