Utils Module
Utility Functions
The Utils module within the ExSeq Toolbox provides a comprehensive suite of utility functions to support the preprocessing, retrieval, manipulation, and visualization of expansion microscopy data.
- exm.utils.utils.chmod(path)
Sets permissions so that users and the owner can read, write and execute files at the given path.
- Parameters:
path (pathlib.Path) – Path in which privileges should be granted.
- Return type:
None
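As a rough illustration of the kind of permission change described above, the following sketch adds read, write, and execute bits for the owner and group using only the standard library. The helper name `grant_rwx` and the choice of owner-plus-group bits are assumptions; the toolbox's own implementation may set a different mode.

```python
import stat
import tempfile
from pathlib import Path


def grant_rwx(path: Path) -> None:
    """Hypothetical helper: add rwx bits for owner and group to `path`."""
    current = path.stat().st_mode
    # OR-ing keeps the existing bits and adds the owner/group rwx bits.
    path.chmod(current | stat.S_IRWXU | stat.S_IRWXG)


# Demonstrate on a throwaway file that starts out owner read/write only.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    target.write_text("data")
    target.chmod(0o600)
    grant_rwx(target)
    mode = stat.S_IMODE(target.stat().st_mode)
```

On a POSIX system, `mode` then includes the full owner permission set in addition to whatever bits were already present.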
- exm.utils.utils.display_img(img)
Displays an image using the Image module from the Python Imaging Library (PIL).
The function supports images of type boolean and other numpy data types. For boolean images, the function multiplies the image by 255 to create an 8-bit grayscale image. For non-boolean images, the function simply converts the image to an 8-bit grayscale image without scaling.
- Parameters:
img (Union[np.ndarray, bool]) – The input image to display. This can be a boolean or non-boolean numpy array.
- Return type:
None
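The conversion rule described above (boolean images scaled by 255, everything else cast without scaling) can be sketched with NumPy alone. The helper name `to_uint8` is hypothetical and only mirrors the documented behaviour; it is not the toolbox's code.

```python
import numpy as np


def to_uint8(img: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the conversion described above."""
    if img.dtype == bool:
        # True/False become 255/0 so a mask shows up as white on black.
        return img.astype(np.uint8) * 255
    # Non-boolean data is cast directly, with no intensity rescaling.
    return img.astype(np.uint8)


mask = np.array([[True, False], [False, True]])
gray = to_uint8(mask)
```

Assuming Pillow is installed, the resulting array could then be wrapped with `PIL.Image.fromarray(gray)` for display.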
- exm.utils.utils.downsample_volume(array, factors)
Reduces the size of an array by downsampling along each dimension using specified factors.
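The description does not say which downsampling scheme is used. As one possible reading, the sketch below keeps every k-th sample along each axis (strided slicing); the helper name `downsample_by_stride` is hypothetical, and the real function may instead average or interpolate.

```python
import numpy as np


def downsample_by_stride(array: np.ndarray, factors) -> np.ndarray:
    """Hypothetical helper: keep every k-th sample along each axis."""
    if len(factors) != array.ndim:
        raise ValueError("one factor is required per array dimension")
    # Build one slice(None, None, k) per axis and apply them together.
    slices = tuple(slice(None, None, int(f)) for f in factors)
    return array[slices]


volume = np.arange(4 * 6 * 8).reshape(4, 6, 8)
small = downsample_by_stride(volume, (2, 3, 4))
```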
- exm.utils.utils.enhance_and_filter_volume(volume, low_percentile=0, high_percentile=100, acclerated=False)
Enhances the contrast of a volume using specified percentiles and applies a median filter to reduce noise. Optionally uses GPU acceleration for the median filtering step if accelerated is set to True.
- Parameters:
volume (np.ndarray) – The input volume to be processed.
low_percentile (float, optional) – The lower percentile to use for contrast adjustment. Values below this percentile are set to the minimum intensity. Default is 0.
high_percentile (float, optional) – The upper percentile to use for contrast adjustment. Values above this percentile are set to the maximum intensity. Default is 100.
acclerated (bool, optional) – If True, uses GPU acceleration to perform the median filtering. Requires CuPy to be installed. Default is False. The keyword is spelled acclerated in the function signature.
- Returns:
The volume after contrast enhancement and median filtering.
- Return type:
np.ndarray
- Raises:
ValueError – If the percentiles are out of the [0, 100] range or if high_percentile is not greater than low_percentile.
TypeError – If the input volume is not a numpy ndarray or if percentiles are not numeric.
ImportError – If accelerated is True but CuPy is not installed.
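The percentile-based contrast step can be illustrated on its own. The sketch below clips a volume to two percentiles and rescales it to the range [0, 1]; it deliberately omits the median filter and the GPU path. The helper name `stretch_contrast` and the [0, 1] output range are assumptions, not the toolbox's actual behaviour.

```python
import numpy as np


def stretch_contrast(volume, low_percentile=0.0, high_percentile=100.0):
    """Hypothetical helper: clip to two percentiles and rescale to [0, 1]."""
    if not isinstance(volume, np.ndarray):
        raise TypeError("volume must be a numpy ndarray")
    if not (0 <= low_percentile < high_percentile <= 100):
        raise ValueError("percentiles must satisfy 0 <= low < high <= 100")
    lo, hi = np.percentile(volume, [low_percentile, high_percentile])
    if hi == lo:
        # A constant volume has no contrast to stretch.
        return np.zeros(volume.shape, dtype=np.float64)
    # Values outside [lo, hi] saturate at the ends of the output range.
    clipped = np.clip(volume.astype(np.float64), lo, hi)
    return (clipped - lo) / (hi - lo)


rng = np.random.default_rng(0)
vol = rng.integers(0, 1000, size=(4, 16, 16))
out = stretch_contrast(vol, 1, 99)
```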
- exm.utils.utils.gene_barcode_mapping(args)
Loads a CSV file containing gene symbols and corresponding barcodes, and creates mappings between them.
This function reads a CSV file specified by args.gene_digit_csv, which contains gene symbols and their corresponding barcodes. It converts the barcodes into digit representations and creates two mappings: ‘digit2gene’ for mapping from digit representation to gene symbol, and ‘gene2digit’ for mapping from gene symbol to digit representation. These mappings are useful for identifying genes associated with puncta barcodes in a field of view.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
- Returns:
A tuple containing:
- A pandas DataFrame with the original CSV data and an additional column for digit representations.
- A dictionary mapping from digit representation to gene symbol (‘digit2gene’).
- A dictionary mapping from gene symbol to digit representation (‘gene2digit’).
- Return type:
Tuple[pd.DataFrame, Dict, Dict]
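To make the two mappings concrete, here is a minimal sketch that uses only the standard library. The column names (`Symbol`, `Barcode`, `Digits`), the base-to-digit code, and the helper name `build_mappings` are all hypothetical; the real CSV layout and encoding are defined by the toolbox configuration.

```python
import csv
import io

# Hypothetical CSV layout and base-to-digit code; the real file may differ.
CSV_TEXT = "Symbol,Barcode\nActb,ACGT\nGapdh,TTAG\n"
BASE_TO_DIGIT = {"A": "0", "C": "1", "G": "2", "T": "3"}


def build_mappings(csv_file):
    """Return (rows, digit2gene, gene2digit) from an open CSV file."""
    rows, digit2gene, gene2digit = [], {}, {}
    for row in csv.DictReader(csv_file):
        digits = "".join(BASE_TO_DIGIT[base] for base in row["Barcode"])
        row["Digits"] = digits  # extra column alongside the original data
        rows.append(row)
        digit2gene[digits] = row["Symbol"]
        gene2digit[row["Symbol"]] = digits
    return rows, digit2gene, gene2digit


rows, digit2gene, gene2digit = build_mappings(io.StringIO(CSV_TEXT))
```

Keeping both directions as separate dictionaries makes lookups from a decoded punctum barcode to a gene, and from a gene back to its barcode, a single dictionary access each.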
- exm.utils.utils.generate_debug_candidate(args, gene=None, fov=None, num_missing_code=1)
Generates a candidate puncta for debugging purposes.
The function first randomly selects a gene if not provided and retrieves all corresponding puncta. It then filters the puncta based on the number of missing codes in their barcodes. Finally, it randomly selects one puncta from the filtered list.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
gene (Optional[str]) – The gene of interest. If none is provided, a gene is randomly selected.
fov (Optional[int]) – The field of view (FOV) to consider. If none is provided, all FOVs are considered.
num_missing_code (int) – The number of missing codes in the barcode of the puncta to be retrieved. Default is 1.
- Returns:
A single randomly chosen puncta that satisfies all the criteria (matching gene, within FOV, correct number of missing codes).
- Return type:
Optional[Dict]
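The select-filter-pick sequence described above can be sketched on in-memory records. The record layout, the use of "_" to mark a missing code, and the helper name `pick_candidate` are assumptions made for illustration only.

```python
import random

# Hypothetical puncta records; "_" marks a missing code in the barcode.
PUNCTA = [
    {"index": 0, "fov": 0, "gene": "Actb", "barcode": "0123"},
    {"index": 1, "fov": 0, "gene": "Actb", "barcode": "01_3"},
    {"index": 2, "fov": 1, "gene": "Actb", "barcode": "0_2_"},
    {"index": 3, "fov": 1, "gene": "Gapdh", "barcode": "33_2"},
]


def pick_candidate(puncta, gene=None, fov=None, num_missing_code=1, rng=random):
    """Return one random record matching the filters, or None."""
    if gene is None:
        # No gene given: choose one at random from those present.
        gene = rng.choice(sorted({p["gene"] for p in puncta}))
    matches = [
        p for p in puncta
        if p["gene"] == gene
        and (fov is None or p["fov"] == fov)
        and p["barcode"].count("_") == num_missing_code
    ]
    return rng.choice(matches) if matches else None


candidate = pick_candidate(PUNCTA, gene="Actb", num_missing_code=1)
```

Returning `None` when nothing matches is consistent with the `Optional[Dict]` return type listed above.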
- exm.utils.utils.get_offsets(filename)
Given the filename for the BDV/H5 XML file, returns the stitching offset as an (N,3) array in (X,Y,Z) order.
The offsets are expressed in micrometers (µm) and are extracted from the XML file produced by the BigStitcher plugin of Fiji.
- Parameters:
filename (str) – The file name of the BDV/H5 XML file.
- Returns:
An array of stitching offsets in the format of (X, Y, Z).
- Return type:
np.ndarray
- Raises:
FileNotFoundError – If the XML file cannot be found.
ET.ParseError – If there is an error parsing the XML file.
ValueError – If the XML file has an unexpected structure or if the affine transformation cannot be read.
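As a simplified illustration of reading translations from such a file, the sketch below parses a small inline XML string and takes the translation column of the first affine transform in each view registration. The helper name `read_translations` and the sample values are hypothetical. Real files may list several transforms per view, and the conversion to micrometers performed by the toolbox is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for a BDV/H5 XML file with two registered views.
XML_TEXT = """
<SpimData>
  <ViewRegistrations>
    <ViewRegistration timepoint="0" setup="0">
      <ViewTransform type="affine">
        <affine>1 0 0 10.5 0 1 0 20.0 0 0 1 0.0</affine>
      </ViewTransform>
    </ViewRegistration>
    <ViewRegistration timepoint="0" setup="1">
      <ViewTransform type="affine">
        <affine>1 0 0 510.5 0 1 0 22.0 0 0 1 -1.5</affine>
      </ViewTransform>
    </ViewRegistration>
  </ViewRegistrations>
</SpimData>
"""


def read_translations(xml_text):
    """Return one (x, y, z) translation per ViewRegistration element."""
    root = ET.fromstring(xml_text)
    offsets = []
    for registration in root.iter("ViewRegistration"):
        # Only the first transform of each view is considered here.
        affine = registration.find("ViewTransform/affine")
        if affine is None or affine.text is None:
            raise ValueError("ViewRegistration without an affine transform")
        values = [float(v) for v in affine.text.split()]
        if len(values) != 12:
            raise ValueError("expected a 3x4 affine with 12 entries")
        # Row-major 3x4 matrix: the last entry of each row is the translation.
        offsets.append((values[3], values[7], values[11]))
    return offsets


offsets = read_translations(XML_TEXT)
```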
- exm.utils.utils.retrieve_all_puncta(args, fov)
Returns all identified puncta for a given field of view.
This function loads and returns all puncta data from a pickle file for the specified field of view. The path to the pickle file is constructed using the configuration options provided in the args parameter.
- exm.utils.utils.retrieve_complete(args)
Retrieves a complete summary of barcodes present in both the gene-barcode mapping and the overall barcode summary.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
- Returns:
A pandas DataFrame containing the complete summary of barcodes, indexed by barcode with columns for total count (‘number’) and count per FOV (e.g., ‘fov1’, ‘fov2’, …), and a ‘gene’ column mapping each barcode to its corresponding gene. The DataFrame is sorted by gene name in ascending order.
- Return type:
pd.DataFrame
- exm.utils.utils.retrieve_digit(args, digit)
Retrieves all puncta with a specified barcode (represented as a digit) across all fields of view.
This function iterates over all provided fields of view (FOVs) and retrieves puncta that match the specified barcode. Each matching puncta, along with its FOV information, is appended to a list.
- Parameters:
- Returns:
A list of dictionaries where each dictionary contains information about a puncta and the FOV it was found in.
- Return type:
List[Dict]
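The iterate-and-collect pattern described for this function can be sketched with plain dictionaries. The in-memory `PUNCTA_BY_FOV` structure, the `barcode` key, and the helper name `collect_by_digit` are hypothetical stand-ins for the per-FOV data the toolbox loads from disk.

```python
# Hypothetical in-memory stand-in for the per-FOV puncta files.
PUNCTA_BY_FOV = {
    0: [{"index": 0, "barcode": "0123"}, {"index": 1, "barcode": "3302"}],
    1: [{"index": 0, "barcode": "0123"}],
}


def collect_by_digit(puncta_by_fov, digit):
    """Gather every record whose barcode equals `digit`, tagged with its FOV."""
    found = []
    for fov, puncta in puncta_by_fov.items():
        for record in puncta:
            if record["barcode"] == digit:
                # Copy so the stored record is not modified in place.
                found.append({**record, "fov": fov})
    return found


hits = collect_by_digit(PUNCTA_BY_FOV, "0123")
```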
- exm.utils.utils.retrieve_gene(args, gene)
Retrieves all puncta associated with a specific gene across all fields of view (FOVs).
- Parameters:
- Returns:
A list of dictionaries, each representing a puncta associated with the gene, including puncta’s properties and the FOV in which it is found.
- Return type:
List[Dict]
- exm.utils.utils.retrieve_img(args, fov, code, channel, ROI_min, ROI_max)
Returns the middle slice of a specified volume chunk.
This function retrieves a middle z-slice from a 3D volume chunk specified by its field of view, code, and channel. The ROI (Region of Interest) is defined by minimum and maximum coordinates.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
fov (int) – The field of view of the volume slice to be returned.
code (int) – The code of the volume slice to be returned.
channel (int) – The channel of the volume slice to be returned.
ROI_min (List[int]) – Minimum coordinates of the volume chunk in the format of [z, y, x].
ROI_max (List[int]) – Maximum coordinates of the volume chunk in the format of [z, y, x].
- Returns:
A 2D numpy array representing the middle z-slice of the specified volume chunk.
- Return type:
np.ndarray
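A minimal sketch of taking the middle z-slice of an ROI, using an in-memory NumPy array in place of the stored volume. The helper name `middle_slice`, the exclusive upper bounds, and the floor-division choice of the middle index are assumptions; the toolbox may round or bound the ROI differently.

```python
import numpy as np


def middle_slice(volume, roi_min, roi_max):
    """Hypothetical helper: crop [z, y, x] bounds, return the central z-plane."""
    z0, y0, x0 = roi_min
    z1, y1, x1 = roi_max
    # Midpoint of the z-range, rounded down.
    z_mid = (z0 + z1) // 2
    return volume[z_mid, y0:y1, x0:x1]


volume = np.arange(10 * 20 * 30).reshape(10, 20, 30)
plane = middle_slice(volume, [2, 5, 10], [8, 15, 25])
```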
- exm.utils.utils.retrieve_one_puncta(args, fov, puncta_index)
Retrieves information about a specific puncta from a given field of view.
This function uses the provided configuration options to access and return data for a single puncta, identified by its index, within the specified field of view.
- Parameters:
- Returns:
A dictionary containing information about the puncta.
- Return type:
Dict
- exm.utils.utils.retrieve_summary(args)
Retrieves a summary of all puncta for each field of view (FOV).
This function iterates over the provided list of FOVs, retrieves all puncta for each FOV, and aggregates the count of each barcode across all FOVs and individually per FOV. The summary is then saved to a CSV file.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
- Returns:
A pandas DataFrame containing the summary of barcodes. The DataFrame is indexed by barcode with columns for total count (‘number’) and count per FOV (e.g., ‘fov1’, ‘fov2’, …). The DataFrame is sorted by total count in descending order.
- Return type:
pd.DataFrame
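The aggregation described above (a total count per barcode plus one count per FOV, sorted by total) can be sketched with `collections.Counter`. The input structure and the helper name `summarize` are hypothetical, and the sketch returns a list of row dictionaries rather than a pandas DataFrame.

```python
from collections import Counter

# Hypothetical barcodes observed per field of view.
BARCODES_BY_FOV = {
    0: ["0123", "0123", "3302"],
    1: ["3302", "0123"],
    2: ["1111"],
}


def summarize(barcodes_by_fov):
    """Return rows of barcode counts, sorted by total count, descending."""
    per_fov = {fov: Counter(codes) for fov, codes in barcodes_by_fov.items()}
    total = Counter()
    for counts in per_fov.values():
        total.update(counts)
    rows = []
    for barcode, number in total.most_common():
        row = {"barcode": barcode, "number": number}
        for fov, counts in per_fov.items():
            # Counter returns 0 for barcodes absent from this FOV.
            row[f"fov{fov}"] = counts[barcode]
        rows.append(row)
    return rows


summary = summarize(BARCODES_BY_FOV)
```

Rows in this shape could be written out with `csv.DictWriter` or loaded into a DataFrame indexed by the `barcode` field.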
- exm.utils.utils.retrieve_vol(args, fov, code, c, ROI_min, ROI_max)
Returns a specified volume chunk from a dataset.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
fov (int) – The field of view of the volume chunk to be returned.
code (int) – The code of the volume chunk to be returned.
c (int) – The channel of the volume chunk to be returned.
ROI_min (List[int]) – Minimum coordinates of the volume chunk in the format of [z, y, x].
ROI_max (List[int]) – Maximum coordinates of the volume chunk in the format of [z, y, x].
- Returns:
A numpy array representing the retrieved volume chunk.
- Return type:
h5py.Dataset
- exm.utils.utils.subtract_background_rolling_ball(volume, radius=50, num_threads=40)
Performs background subtraction on a volume image using the rolling ball method.
- Parameters:
- Returns:
The volume image after background subtraction.
- Return type:
np.ndarray
- exm.utils.utils.subtract_background_top_hat(volume, radius=50, use_gpu=True)
Performs top-hat background subtraction on a volume image.
- Parameters:
- Returns:
The volume image after background subtraction.
- Return type:
np.ndarray
- exm.utils.utils.visualize_progress(args)
Visualizes the progress of the ExSeq Toolbox.
This function creates a heatmap visualizing the completion status of different steps in the ExSeq Toolbox for each field of view (FOV) and each code.
- Parameters:
args (Args) – Configuration options. This should be an instance of the Args class.
- Return type:
None
Package Logger
- exm.utils.log.configure_logger(name, log_file_name='ExSeq-Toolbox_logs.log')
Configures and returns a logger with both stream and file handlers.
This function sets up a logger to send log messages to the console and to a log file. The console will display messages with a level of INFO and higher, while the file will contain messages with a level of DEBUG and higher.
- Parameters:
- Returns:
Configured logger object.
- Return type:
logging.Logger
- Raises:
OSError – If there is an issue with opening or writing to the log file.
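The two-handler setup described above (INFO and higher to the console, DEBUG and higher to a file) can be sketched with the standard `logging` module. The helper name `make_logger`, the format string, and the temporary log path are assumptions; this is not the toolbox's actual implementation.

```python
import logging
import os
import tempfile


def make_logger(name, log_file_name):
    """Hypothetical helper: INFO+ to the console, DEBUG+ to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        formatter = logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"
        )
        stream = logging.StreamHandler()
        stream.setLevel(logging.INFO)
        stream.setFormatter(formatter)
        file_handler = logging.FileHandler(log_file_name)
        file_handler.setLevel(logging.DEBUG)
        file_handler.setFormatter(formatter)
        logger.addHandler(stream)
        logger.addHandler(file_handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "example.log")
logger = make_logger("exm_example", log_path)
logger.debug("written to the file only")
logger.info("written to the console and the file")
for handler in logger.handlers:
    handler.flush()
with open(log_path) as fh:
    log_text = fh.read()
```

Setting the logger itself to DEBUG and filtering per handler is what lets the same message stream feed a quieter console and a more detailed file.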