Evaluation

detectree.compute_eval_metrics(*, pred_img_filepaths=None, metrics=None, metrics_kwargs=None, clf=None, clf_dict=None, hf_hub_repo_id=None, hf_hub_clf_filename=None, hf_hub_download_kwargs=None, skops_trusted=None, refine_method=None, refine_kwargs=None, split_df=None, img_dir=None, response_img_dir=None, img_filepaths=None, response_img_filepaths=None, img_filename_pattern=None, **classifier_kwargs)

Compute evaluation metrics for the validation images.

Parameters:
  • pred_img_filepaths (list-like, optional) – List of paths to precomputed predicted images. If provided, classification is skipped and metrics are computed directly from these files. Only predictions with a matching response image (by basename) are used. Requires response_img_dir or response_img_filepaths.

  • metrics (str, func or list of str or func) – The metrics to compute. Each metric must be either a string naming a function of sklearn.metrics or a function that takes y_true and y_pred positional arguments with the true and predicted labels respectively; a list-like of either is also accepted. If no value is provided, the values set in settings.EVAL_METRICS are used.

  • metrics_kwargs (dict or list of dict) – Additional keyword arguments to pass to each of the metric functions.

  • clf (scikit-learn-like classifier, optional) – Trained classifier. If no value is provided, the classifier is loaded from HuggingFace Hub using the values provided in hf_hub_repo_id and hf_hub_clf_filename.

  • clf_dict (dictionary, optional) – Dictionary mapping each first-level cluster label to a trained scikit-learn-like classifier.

  • hf_hub_repo_id (str, optional) – HuggingFace Hub repository id (string with the user or organization and repository name separated by a /) of the skops classifier. If no value is provided, the value set in settings.HF_HUB_REPO_ID is used. Ignored if clf or clf_dict are provided.

  • hf_hub_clf_filename (str, optional) – File name of the skops classifier in the HuggingFace Hub repository. If no value is provided, the value set in settings.HF_HUB_CLF_FILENAME is used. Ignored if clf or clf_dict are provided.

  • hf_hub_download_kwargs (dict, optional) – Additional keyword arguments (besides “repo_id”, “filename”, “library_name” and “library_version”) to pass to huggingface_hub.hf_hub_download.

  • skops_trusted (list, optional) – List of trusted object types to load the classifier from HuggingFace Hub, passed to skops.io.load. If no value is provided, the value from settings.SKOPS_TRUSTED is used. Ignored if clf or clf_dict are provided.

  • refine_method (callable or bool, optional) – Method to refine the pixel-level classification. If False is provided, no refinement is performed. If None is provided, the default behavior of detectree.classifier.Classifier is used.

  • refine_kwargs (dict, optional) – Keyword arguments that will be passed to refine_method. Ignored if no refinement is performed.

  • split_df (pandas DataFrame, optional) – Data frame with the validation images.

  • img_dir (str representing path to a directory, optional) – Path to the directory where the images from split_df are located. Required if split_df is provided. Ignored if img_filepaths is provided.

  • response_img_dir (str representing path to a directory, optional) – Path to the directory where the response tiles are located. Ignored if providing response_img_filepaths. Only images with a matching response (by basename) are evaluated.

  • img_filepaths (list-like, optional) – List of paths to the tiles that will be used for validation. Ignored if split_df is provided.

  • response_img_filepaths (list-like, optional) – List of paths to the binary response tiles that will be used for evaluation. Ignored if split_df is provided. Only images with a matching response (by basename) are evaluated.

  • img_filename_pattern (str representing a file-name pattern, optional) – Filename pattern to be matched in order to obtain the list of images. If no value is provided, the value set in settings.IMG_FILENAME_PATTERN is used. Ignored if split_df or img_filepaths is provided.

  • classifier_kwargs (dict, optional) – Additional keyword arguments to pass to the initialization of detectree.classifier.Classifier class.
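The basename matching mentioned for pred_img_filepaths, response_img_dir and response_img_filepaths can be pictured with the sketch below. It is not detectree's implementation: the helper name pair_by_basename and the example paths are hypothetical, and the code only illustrates the rule that files without a same-named response are left out.

```python
from pathlib import PurePosixPath


def pair_by_basename(img_filepaths, response_img_filepaths):
    """Pair each image with the response tile that has the same basename.

    Hypothetical helper, not part of detectree: images without a matching
    response are skipped.
    """
    # index the response tiles by file name (basename)
    response_by_name = {PurePosixPath(p).name: p for p in response_img_filepaths}
    return [
        (img, response_by_name[PurePosixPath(img).name])
        for img in img_filepaths
        if PurePosixPath(img).name in response_by_name
    ]


pairs = pair_by_basename(
    ["tiles/a.tif", "tiles/b.tif", "tiles/c.tif"],
    ["responses/a.tif", "responses/c.tif"],
)
print(pairs)
```

Here tiles/b.tif has no response with the same file name, so it does not appear in the result.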

Returns:

metric_dict – Values of the metrics computed for the validation images. If only one metric is provided, a single value is returned. If multiple metrics are provided, a dict with a key for each metric is returned. The metric values can be of different types depending on the metric function used, e.g., precision_score returns a single float value, precision_recall_curve returns a tuple of arrays, and confusion_matrix returns a two-dimensional array.

Return type:

numeric, dict
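As a sketch of the callable form accepted by metrics, the function below takes y_true and y_pred as positional arguments and returns a single float. The name tree_iou, the assumption that tree pixels are labelled 1, and the commented-out call with placeholder paths are illustrative only; check the label values used in your own response tiles.

```python
def tree_iou(y_true, y_pred, tree_val=1):
    """Intersection over union of the tree class (illustrative metric).

    Assumes `y_true` and `y_pred` are equal-length flat sequences of labels
    and that `tree_val` marks tree pixels; both are assumptions of this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    intersection = sum(1 for t, p in pairs if t == tree_val and p == tree_val)
    union = sum(1 for t, p in pairs if t == tree_val or p == tree_val)
    # convention chosen here: two empty masks agree perfectly
    return intersection / union if union else 1.0


y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = tree_iou(y_true, y_pred)
print(score)

# With detectree installed, such a callable could be passed as a metric,
# e.g. (paths are placeholders):
# detectree.compute_eval_metrics(
#     metrics=tree_iou,
#     img_filepaths=["path/to/tile.tif"],
#     response_img_dir="path/to/responses",
# )
```

Since a single metric is passed in the commented call, the documented return would be a single value rather than a dict.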

detectree.eval_refine_params(*, refine_method=None, refine_params_list=None, metrics=None, metrics_kwargs=None, clf=None, clf_dict=None, hf_hub_repo_id=None, hf_hub_clf_filename=None, hf_hub_download_kwargs=None, skops_trusted=None, tree_val=None, nontree_val=None, split_df=None, img_dir=None, img_filepaths=None, img_filename_pattern=None, response_img_dir=None, **classifier_kwargs)

Evaluate a refinement procedure for different parameters.

Parameters:
  • refine_method (callable, optional) – Refinement method that takes a probability image as the first positional argument followed by tree and non-tree values, e.g., refine_method(p_tree_img, tree_val, nontree_val, **kwargs). If no value is provided, the value from settings.CLF_REFINE_METHOD is used.

  • refine_params_list (list of dict, optional) – Parameters to evaluate for the refinement method, as a list of keyword arguments. The metrics will be computed for each item of this list. If no value is provided, the value from settings.EVAL_REFINE_PARAMS is used.

  • metrics (str, func or list of str or func) – The metrics to compute. Each metric must be either a string naming a function of sklearn.metrics or a function that takes y_true and y_pred positional arguments with the true and predicted labels respectively; a list-like of either is also accepted. If no value is provided, the values set in settings.EVAL_METRICS are used.

  • metrics_kwargs (dict or list of dict) – Additional keyword arguments to pass to each of the metric functions.

  • clf (scikit-learn-like classifier, optional) – Trained classifier. If no value is provided, the classifier is loaded from HuggingFace Hub using the values provided in hf_hub_repo_id and hf_hub_clf_filename.

  • clf_dict (dictionary, optional) – Dictionary mapping each first-level cluster label to a trained scikit-learn-like classifier.

  • hf_hub_repo_id (str, optional) – HuggingFace Hub repository id (string with the user or organization and repository name separated by a /) of the skops classifier. If no value is provided, the value set in settings.HF_HUB_REPO_ID is used. Ignored if clf or clf_dict are provided.

  • hf_hub_clf_filename (str, optional) – File name of the skops classifier in the HuggingFace Hub repository. If no value is provided, the value set in settings.HF_HUB_CLF_FILENAME is used. Ignored if clf or clf_dict are provided.

  • hf_hub_download_kwargs (dict, optional) – Additional keyword arguments (besides “repo_id”, “filename”, “library_name” and “library_version”) to pass to huggingface_hub.hf_hub_download.

  • skops_trusted (list, optional) – List of trusted object types to load the classifier from HuggingFace Hub, passed to skops.io.load. If no value is provided, the value from settings.SKOPS_TRUSTED is used. Ignored if clf or clf_dict are provided.

  • tree_val (int, optional) – The value that designates tree pixels in the response images. If no value is provided, the value set in settings.TREE_VAL is used.

  • nontree_val (int, optional) – The value that designates non-tree pixels in the response images. If no value is provided, the value set in settings.NON_TREE_VAL is used.

  • split_df (pandas DataFrame, optional) – Data frame with the validation images.

  • img_dir (str representing path to a directory, optional) – Path to the directory where the images from split_df are located. Required if split_df is provided. Ignored if img_filepaths is provided.

  • img_filepaths (list-like, optional) – List of paths to the tiles that will be used for validation. Ignored if split_df is provided.

  • img_filename_pattern (str representing a file-name pattern, optional) – Filename pattern to be matched in order to obtain the list of images. If no value is provided, the value set in settings.IMG_FILENAME_PATTERN is used. Ignored if split_df or img_filepaths is provided.

  • response_img_dir (str representing path to a directory, optional) – Path to the directory where the response tiles are located. Ignored if providing response_img_filepaths.

  • classifier_kwargs (dict, optional) – Additional keyword arguments to pass to the initialization of detectree.classifier.Classifier class.
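The call shape documented for refine_method, together with refine_params_list, can be sketched with a simple probability threshold. This is a stand-in for illustration, not the method configured in settings.CLF_REFINE_METHOD: the function name threshold_refine, its threshold keyword, the values 255 and 0, and the nested-list image are all assumptions (real probability images would typically be arrays).

```python
def threshold_refine(p_tree_img, tree_val, nontree_val, *, threshold=0.5):
    """Label a pixel as tree when its tree probability reaches `threshold`.

    Illustrative stand-in following the documented call shape
    refine_method(p_tree_img, tree_val, nontree_val, **kwargs).
    """
    return [
        [tree_val if p >= threshold else nontree_val for p in row]
        for row in p_tree_img
    ]


# one dict of keyword arguments per configuration to evaluate
refine_params_list = [{"threshold": 0.3}, {"threshold": 0.7}]

p_tree_img = [[0.1, 0.4], [0.6, 0.9]]
refined = [
    threshold_refine(p_tree_img, 255, 0, **params) for params in refine_params_list
]
print(refined)
```

Each dict in refine_params_list is unpacked as keyword arguments, so the stricter threshold keeps fewer tree pixels in this toy image.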

Returns:

results – A DataFrame with the computed values for each metric (row) and each refinement keyword argument set (column, stringified).

Return type:

pandas DataFrame

detectree.get_true_pred_arr(*, pred_img_filepaths=None, clf=None, clf_dict=None, hf_hub_repo_id=None, hf_hub_clf_filename=None, hf_hub_download_kwargs=None, skops_trusted=None, refine_method=None, refine_kwargs=None, split_df=None, img_dir=None, response_img_dir=None, img_filepaths=None, response_img_filepaths=None, img_filename_pattern=None, **classifier_kwargs)

Get true and predicted values for the validation images.

Parameters:
  • pred_img_filepaths (list-like, optional) – List of paths to precomputed predicted images. If provided, classification is skipped and predictions are read directly from these files. Only predictions with a matching response image (by basename) are used, and all other arguments except response_img_dir and response_img_filepaths are ignored.

  • clf (scikit-learn-like classifier, optional) – Trained classifier. If no value is provided, the classifier is loaded from HuggingFace Hub using the values provided in hf_hub_repo_id and hf_hub_clf_filename.

  • clf_dict (dictionary, optional) – Dictionary mapping each first-level cluster label to a trained scikit-learn-like classifier.

  • hf_hub_repo_id (str, optional) – HuggingFace Hub repository id (string with the user or organization and repository name separated by a /) of the skops classifier. If no value is provided, the value set in settings.HF_HUB_REPO_ID is used. Ignored if clf or clf_dict are provided.

  • hf_hub_clf_filename (str, optional) – File name of the skops classifier in the HuggingFace Hub repository. If no value is provided, the value set in settings.HF_HUB_CLF_FILENAME is used. Ignored if clf or clf_dict are provided.

  • hf_hub_download_kwargs (dict, optional) – Additional keyword arguments (besides “repo_id”, “filename”, “library_name” and “library_version”) to pass to huggingface_hub.hf_hub_download.

  • skops_trusted (list, optional) – List of trusted object types to load the classifier from HuggingFace Hub, passed to skops.io.load. If no value is provided, the value from settings.SKOPS_TRUSTED is used. Ignored if clf or clf_dict are provided.

  • refine_method (callable or bool, optional) – Method to refine the pixel-level classification. If False is provided, no refinement is performed. If None is provided, the default behavior of detectree.classifier.Classifier is used.

  • refine_kwargs (dict, optional) – Keyword arguments that will be passed to refine_method. Ignored if no refinement is performed.

  • split_df (pandas DataFrame, optional) – Data frame with the validation images.

  • img_dir (str representing path to a directory, optional) – Path to the directory where the images from split_df are located. Required if split_df is provided. Ignored if img_filepaths is provided.

  • response_img_dir (str representing path to a directory, optional) – Path to the directory where the response tiles are located. Required if split_df is provided. Otherwise, it is either ignored (if response_img_filepaths is provided) or used as the directory in which to locate the images whose file name matches img_filename_pattern. Only images with a matching response (by basename) are evaluated.

  • img_filepaths (list-like, optional) – List of paths to the tiles that will be used for validation. Ignored if split_df is provided.

  • response_img_filepaths (list-like, optional) – List of paths to the binary response tiles that will be used for evaluation. Ignored if split_df is provided. Only images with a matching response (by basename) are evaluated.

  • img_filename_pattern (str representing a file-name pattern, optional) – Filename pattern to be matched in order to obtain the list of images. If no value is provided, the value set in settings.IMG_FILENAME_PATTERN is used. Ignored if split_df or img_filepaths is provided.

  • classifier_kwargs (dict, optional) – Additional keyword arguments to pass to the initialization of detectree.classifier.Classifier class.

Returns:

true_pred – Array with two rows respectively containing the true and predicted values for the provided images.

Return type:

numpy.ndarray
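To make the return layout concrete, the sketch below builds a two-row structure by hand from a pair of tiny tiles, with row 0 holding the true values and row 1 the predicted ones. Nested lists stand in for numpy.ndarray, and the tile contents and the helper name stack_true_pred are made up for illustration; how detectree orders pixels across images is not specified here.

```python
from itertools import chain


def stack_true_pred(response_tiles, pred_tiles):
    """Flatten paired tiles into [true_values, predicted_values].

    Hypothetical helper: each tile is a list of pixel rows, and tiles are
    concatenated in the order given.
    """

    def flatten(tiles):
        # tiles -> rows -> pixels, flattened into one list
        return list(chain.from_iterable(chain.from_iterable(tiles)))

    y_true, y_pred = flatten(response_tiles), flatten(pred_tiles)
    if len(y_true) != len(y_pred):
        raise ValueError("response and predicted tiles must have the same size")
    return [y_true, y_pred]


true_pred = stack_true_pred(
    response_tiles=[[[255, 0], [0, 0]]],
    pred_tiles=[[[255, 255], [0, 0]]],
)
y_true, y_pred = true_pred  # unpack the two rows
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(true_pred, accuracy)
```

Unpacking the two rows this way gives the y_true and y_pred sequences that a metric function expects.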