Order Trace Algorithm
- class modules.order_trace.src.alg.OrderTraceAlg(data, poly_degree=None, expected_traces=None, orders_ccd=-1, do_post=False, config=None, logger=None)[source]
Order trace extraction.
This module defines the class ‘OrderTraceAlg’ and methods to extract order traces from a 2D spectral FITS image. The extraction steps are:
locate clusters: smooth the image and convert image pixels to be either black or white (‘1’ or ‘0’).
form clusters: find cluster units (each unit containing connected pixels with value ‘1’).
clean the clusters: remove noisy clusters, trim noise from the clusters, split the clusters, and clean the clusters along the top and bottom borders.
merge clusters: merge broken clusters to form order traces based on closeness and polynomial curve fitting.
model order trace: approximate each order trace with a least-squares polynomial fit.
find top and bottom widths: compute the top and bottom widths along the order trace by modeling the distribution of the spectral data across the trace with a normal distribution and setting the widths to a multiple of the standard deviation from the mean. If a width cannot be resolved this way, it is either assigned a default value or estimated from the widths of the surrounding orders.
- Parameters:
data (numpy.ndarray) – 2D spectral data.
poly_degree (int) – Order of polynomial for order trace fitting.
config (configparser.ConfigParser) – config context.
logger (logging.Logger) – Instance of logging.Logger.
- flat_data
Numpy array storing 2d image data.
- Type:
numpy.ndarray
- config_ins
Related to ‘PARAM’ section or section associated with the instrument if it is defined in the config file.
- Type:
ConfigHandler
- orders_ccd
Total orders of the ccd. Defaults to -1.
- Type:
number, optional
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for data or config.
Exception – If the size of data is less than 20 pixels by 20 pixels.
- advanced_cluster_cleaning_handler(index: ndarray, x: ndarray, y: ndarray, start_cluster: int = None, stop_cluster: int = None)[source]
Remove or clean noisy clusters.
This removal process uses polynomial fitting on all or selected clusters formed by form_clusters().
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates on cluster pixels.
y (numpy.ndarray) – Array of y coordinates on cluster pixels.
start_cluster (int, optional) – Cluster id of the first cluster to process. Defaults to None.
stop_cluster (int, optional) – Cluster id of the last cluster to process. Defaults to None.
- Returns:
cleaning status on clusters:
index_p (numpy.ndarray): Array of cluster id on cluster pixels after cleaning.
all_status (dict): Cleaning status on processed clusters, like:
{
    <cluster_id_i> int: <cleaning status> dict,
        # <cluster_id_i> is cluster id of i-th cluster.
        # <cleaning status> is cleaning status for the cluster.
        # See Returns in handle_noisy_cluster().
    :
}
- Return type:
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for x, y or index.
Exception – If the size of x, y, or index are not the same.
- approximate_width_of_default(cluster_widths: list, cluster_points: ndarray, cluster_coeffs: ndarray, poly_fit_power: int = 2)[source]
Approximate unresolved width using least square polynomial fit to determined widths.
- Parameters:
cluster_widths (list) – Top and bottom widths of all clusters, like [{‘top_edge’: <number>, ‘bottom_edge’: <number>},…].
cluster_points (numpy.ndarray) – Array containing cluster points (y values) along the trace based on the polynomial fitting. Each row holds the y values along the x axis for one cluster.
cluster_coeffs (numpy.ndarray) – Coefficients of Polynomial fit and area of all order traces.
poly_fit_power (int, optional) – Degree of polynomial fit for width estimation, degree 2 or 3 is suggested. Defaults to 2.
- Returns:
top and bottom widths of all clusters after using polynomial approximation, like:
[
    {
        'top_edge': float,     # top width of first cluster,
        'bottom_edge': float   # bottom width of first cluster
    },
    :
    {
        'top_edge': float,     # top width of last cluster,
        'bottom_edge': float   # bottom width of last cluster
    }
]
- Return type:
- clean_clusters_on_border(x: ndarray, y: ndarray, index: ndarray, border_y: int)[source]
Clean clusters crossing the top or bottom boundary based on the given border position along y axis.
- Parameters:
x (array) – Array of x coordinates of cluster pixels.
y (array) – Array of y coordinates of cluster pixels.
index (array) – Array of cluster id on cluster pixels.
border_y (int) – The vertical position (y coordinate) of the border to check.
- Returns:
Cluster pixels after cleaning:
(numpy.ndarray): Array of x coordinates of cluster pixels after cleaning.
(numpy.ndarray): Array of y coordinates of cluster pixels after cleaning.
(numpy.ndarray): Array of cluster id on cluster pixels after cleaning.
- Return type:
- clean_clusters_on_borders(x: ndarray, y: ndarray, index: ndarray, top_border: int = None, bottom_border: int = None)[source]
Clean clusters crossing the top and bottom boundaries of the image.
- Parameters:
x (array) – Array of x coordinates of cluster pixels.
y (array) – Array of y coordinates of cluster pixels.
index (array) – Array of cluster id on cluster pixels.
top_border (int, optional) – Top border vertical position (along y axis). Defaults to None.
bottom_border (int, optional) – Bottom border vertical position (along y axis). Defaults to None.
- Returns:
Cluster pixels after cleaning:
new_x (numpy.ndarray): Array of x coordinates of cluster pixels after cleaning.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels after cleaning.
new_index (numpy.ndarray): Array of cluster id on cluster pixels after cleaning.
- Return type:
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for x, y or index.
Exception – If the size of x, y, or index are not the same.
- collect_clusters(c_x: ndarray, c_y: ndarray)[source]
Identify cluster units per positions of cluster pixels.
The cluster units are identified by scanning the set of cluster pixels. No pixel of one resulting cluster unit is connected to a pixel of another unit.
- Parameters:
c_x (numpy.ndarray) – Array of x coordinates for cluster pixels.
c_y (numpy.ndarray) – Array of y coordinates for cluster pixels.
- Returns:
identified cluster units from the image, like:
{
    <y_1>: <clusters_1> list,
    <y_2>: <clusters_2> list,
    ...,
    <y_n>: <clusters_n> list
}
where <y_n> is a vertical location (value along the y axis) and <clusters_n> is the list of cluster units ending at <y_n>, like:
    [cluster_1, cluster_2, ..., cluster_n]
where cluster_i (dict) contains the area of the cluster and the horizontal segments it covers, like:
    {
        'x1': int,    # left of the cluster.
        'x2': int,    # right of the cluster.
        'y1': int,    # top of the cluster.
        'y2': int,    # bottom of the cluster.
        <y_i_1>: <segments_1> dict,
        ...,
        <y_i_n>: <segments_n> dict
    }
where <y_i_t> is one of the y locations ranging from cluster_i['y1'] to cluster_i['y2'], and <segments_t> contains the horizontal segments at <y_i_t>, like:
    {'segments': [[x_0, x_1], [x_2, x_3], ..., [x_i, x_i+1]]}
where x_i and x_i+1 are the starting and ending index into array c_x.
ex: cluster units ending at y = 10 and y = 11,
    {
        10: [
            {
                'x1': 20, 'x2': 30, 'y1': 9, 'y2': 10,
                9: {'segments': [[4, 8], [12, 13]]},
                10: {'segments': [[100, 107], [109, 118]]}
            },
            {
                'x1': 50, 'x2': 77, 'y1': 7, 'y2': 10,
                7: {'segments': [...]},
                8: {'segments': [....]},
                9: {'segments': [....]},
                10: {'segments': [....]}
            }
        ],
        11: [
            {<cluster unit ends at y = 11>},
            {<cluster unit ends at y = 11>}...
        ]
    }
- Return type:
- cross_other_cluster(polys: ndarray, cluster_nos_for_polys: ndarray, cluster_nos: ndarray, x: ndarray, y: ndarray, index: ndarray, power: int, merged_coeffs: ndarray)[source]
Detect if there is another cluster that will prevent the merge of two given clusters.
- Parameters:
polys (numpy.ndarray) – Array contains coefficients of polynomial fit to the clusters and the area of the clusters. Each row contains the coefficients and the area for one cluster.
cluster_nos_for_polys (numpy.ndarray) – The map between the polys and cluster no. Value of cluster_nos_for_polys points to the row index for polys.
cluster_nos (numpy.ndarray) – Array containing the cluster ids of the two clusters to be tested for merging.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
power (int) – Degree of the polynomial to fit the cluster.
merged_coeffs (numpy.ndarray) – Coefficients of the polynomial fit to, and the area of, the merged cluster, assuming the two clusters in cluster_nos are merged.
- Returns:
The merge is blocked by other cluster if True, or the merge is safe if False.
- Return type:
bool
- curve_fitting_on_all_clusters(index: ndarray, x: ndarray, y: ndarray)[source]
Do polynomial fitting on cluster pixels for all clusters.
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates on cluster pixels.
y (numpy.ndarray) – Array of y coordinates on cluster pixels.
- Returns:
Coefficients and errors from polynomial fit:
poly_all (numpy.ndarray): Array containing the coefficients of the polynomial fit and the area of all clusters. Each row contains the coefficients and the area for one cluster. Please see Returns in curve_fitting_on_one_cluster() for the detail of each row.
errors (numpy.ndarray): Array containing the least-squares errors of the polynomial fit to all clusters.
- Return type:
- static curve_fitting_on_one_cluster(cluster_no: int, index: ndarray, x: ndarray, y: ndarray, power: int, poly_info: ndarray = None)[source]
Find a polynomial that fits the cluster pixels.
- Parameters:
cluster_no (int) – cluster id
index (numpy.ndarray) – Array of cluster id of cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
power (int) – Degree of fitting polynomial.
poly_info (numpy.ndarray, optional) – Array contains the coefficients of polynomial fit and the area of the cluster. Defaults to None.
- Returns:
Coefficients and errors from polynomial fit:
poly_info (numpy.ndarray): Array contains coefficients of fitting polynomial from higher degree to the lower and the area enclosing the cluster, minimum x, maximum x, minimum y and maximum y.
error (float): Polynomial fitting error.
area (list): Cluster area, like [min_x, max_x, min_y, max_y].
- Return type:
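The row layout described above (coefficients from highest degree to lowest, followed by min_x, max_x, min_y, max_y) can be illustrated with a minimal sketch. The helper name `fit_one_cluster` is hypothetical, and using the RMS residual as the fitting error is an assumption; the actual method may compute the error differently.

```python
import numpy as np

def fit_one_cluster(cluster_no, index, x, y, power):
    """Fit y(x) to the pixels of one cluster (illustrative helper, not the library code)."""
    sel = index == cluster_no
    cx, cy = x[sel], y[sel]
    coeffs = np.polyfit(cx, cy, power)              # highest degree first
    residual = np.polyval(coeffs, cx) - cy
    error = float(np.sqrt(np.mean(residual ** 2)))  # assumed error metric: RMS residual
    area = [int(cx.min()), int(cx.max()), int(cy.min()), int(cy.max())]
    poly_info = np.concatenate([coeffs, area])      # coefficients followed by the area
    return poly_info, error, area

index = np.array([1, 1, 1, 1, 2, 2])
x = np.array([0, 1, 2, 3, 0, 1])
y = np.array([1, 3, 5, 7, 9, 9])
poly_info, error, area = fit_one_cluster(1, index, x, y, power=1)
```

For a fit of degree `power`, each row in this sketch has `power + 5` entries.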
- static distance_between_clusters(cluster_nos: ndarray, x: ndarray, y: ndarray, index: ndarray)[source]
Find the horizontal and vertical distances between two clusters. The first cluster is the one with the smaller x position.
- Parameters:
cluster_nos (numpy.ndarray) – Array contains the cluster id of two clusters.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
- Returns:
tuple containing:
dist_x (float): The horizontal gap between two clusters. The distance is 0 if there is horizontal overlap between two clusters.
dist_y (float): The vertical gap between two clusters. The distance is 0 if there is vertical overlap between two clusters.
- Return type:
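One way to read the gap definition above is as the distance between the bounding intervals of the two clusters, clamped to 0 when they overlap. The sketch below follows that reading; the helper name `cluster_gap` is hypothetical and measuring the gap as a coordinate difference is an assumption.

```python
import numpy as np

def cluster_gap(cluster_nos, x, y, index):
    """Horizontal and vertical gap between two clusters (illustrative helper)."""
    a = index == cluster_nos[0]
    b = index == cluster_nos[1]
    # Gap between two closed intervals [a1, a2] and [b1, b2]:
    # max(a1, b1) - min(a2, b2), clamped to 0 when the intervals overlap.
    dist_x = max(0, max(x[a].min(), x[b].min()) - min(x[a].max(), x[b].max()))
    dist_y = max(0, max(y[a].min(), y[b].min()) - min(y[a].max(), y[b].max()))
    return float(dist_x), float(dist_y)

index = np.array([1, 1, 1, 2, 2, 2])
x = np.array([0, 2, 4, 8, 10, 12])
y = np.array([10, 11, 12, 11, 12, 13])
dist_x, dist_y = cluster_gap(np.array([1, 2]), x, y, index)
```

Here the clusters are separated horizontally but overlap vertically, so only `dist_x` is nonzero.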
- extract_order_from_cluster(cluster_no: int, index: ndarray, x: ndarray, y: ndarray)[source]
Get curve fitting result on specified cluster.
- Parameters:
cluster_no (int) – id of the cluster to find the curve fitting results.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Please see Returns of curve_fitting_on_one_cluster().
- Return type:
- extract_order_trace(power_for_width_estimation: int = -1, data_range=None, show_time: bool = False, print_debug: str = None, rows_to_reset=None, cols_to_reset=None, orderlet_gap_pixels=2)[source]
Order trace extraction.
The order trace extraction includes the steps to smooth the image, locate the clusters, form clusters, remove and trim noisy clusters, merge the clusters to form order traces, model the order trace using polynomial fit and find the top and bottom widths along the traces.
- Parameters:
power_for_width_estimation (int) – Degree of polynomial fit for trace width estimation. Defaults to -1.
data_range (list) – Area of the data to be processed, given as x1, x2, y1, y2, where (x1, y1) and (x2, y2) are the corner coordinates of the area. A value greater than or equal to 0 is a horizontal (x1, x2) or vertical (y1, y2) position relative to the first column or row of the image; a negative value is a position relative to the last column or row.
show_time (bool, optional) – Show running time if True. Defaults to False.
print_debug (str, optional) – Destination of debug information: stdout if it is an empty string, the file at path print_debug if it is a non-empty string, or no output if it is None. Defaults to None.
rows_to_reset (list, optional) – Collection of rows to reset. Defaults to None.
cols_to_reset (list, optional) – Collection of columns to reset. Defaults to None.
orderlet_gap_pixels (number, optional) – Number of pixels to ignore between orderlets during extraction. Defaults to 2.
- Returns:
order trace extraction and analysis result, like:
{
    'order_trace_result': pandas.DataFrame,
        # table storing coefficients of polynomial
        # fit, bottom/top width, and left/right boundary.
    'cluster_index': numpy.ndarray,
        # Array of cluster id of cluster pixels.
    'cluster_x': numpy.ndarray,
        # Array of x coordinates of cluster pixels.
    'cluster_y': numpy.ndarray
        # Array of y coordinates of cluster pixels.
}
- Return type:
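The sign convention for data_range can be sketched as follows. The helper `resolve_data_range` is hypothetical, and mapping a negative value to `size + value` (Python-style negative indexing, so that -1 is the last column or row) is an assumption about how "relative to the last column or row" is applied.

```python
def resolve_data_range(data_range, nx, ny):
    """Convert [x1, x2, y1, y2] with possibly negative entries to absolute positions."""
    def absolute(pos, size):
        # Assumption: negative positions count back from the last column/row,
        # in the style of Python negative indexing (-1 is the last one).
        return pos if pos >= 0 else size + pos

    x1, x2, y1, y2 = data_range
    return [absolute(x1, nx), absolute(x2, nx), absolute(y1, ny), absolute(y2, ny)]

# Image of 100 columns by 50 rows; skip 10 rows at the top and at the bottom.
resolved = resolve_data_range([0, -1, 10, -11], nx=100, ny=50)
```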
- find_all_cluster_widths(index_t: ndarray, new_x: ndarray, new_y: ndarray, power_for_width_estimation: int = 3, cluster_set: list = None)[source]
Compute the top and bottom widths along the order trace.
- Parameters:
index_t (numpy.ndarray) – Array of cluster id on cluster pixels.
new_x (numpy.ndarray) – Array of x coordinates of cluster pixels.
new_y (numpy.ndarray) – Array of y coordinates of cluster pixels.
power_for_width_estimation (int, optional) – Degree of polynomial fit for width estimation, degree 2 or 3 is suggested. Defaults to 3. The estimation step is skipped if it is less than 0.
cluster_set (list, optional) – Set of selected cluster id for width finding. Defaults to None. Widths of all clusters are computed if None.
- Returns:
a list of width information for each order trace. Each element in the list is like:
{
    'top_edge': float,     # top width along the trace.
    'bottom_edge': float   # bottom width along the trace.
}
- Return type:
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for new_x, new_y or index_t.
Exception – If the size of new_x, new_y, or index_t are not the same.
- find_cluster_width_by_gaussian(cluster_no: int, poly_coeffs: ndarray, cluster_points: ndarray)[source]
Find the width of the cluster using Gaussian to approximate the distribution of collected spectral data.
- Parameters:
cluster_no (int) – Cluster id.
poly_coeffs (numpy.ndarray) – Polynomial fitting information and the covered area of all clusters.
cluster_points (numpy.ndarray) – Pixel position (y values) along the polynomial fit of every cluster.
- Returns:
cluster width information like:
{
    'cluster_no': int,     # cluster id.
    'avg_pwidth': float,   # bottom width of cluster.
    'avg_nwidth': float    # top width of cluster.
}
- Return type:
- static find_mean_from_histogram(vals: ndarray, bin_no: int = 4, c_range: list = None, cut_at: float = None)[source]
Find the mean value based on the histogram of the data set.
Calculate the mean of the data selected from the given data set based on the histogram of the set.
- static fit_width_by_gaussian(x_set: ndarray, y_set: ndarray, center_y: float, xs: int, sigma: float = 3.0)[source]
Find the width using Gaussian fitting.
Fit the x, y set of data using Gaussian and find the width by looking at sigma of the Gaussian fit.
- Parameters:
- Returns:
Gaussian fit results:
gaussian_fit: Gaussian fit object.
width (float): Width found by Gaussian fit.
gaussian_center (float): Mean of Gaussian fit.
- Return type:
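The idea of deriving a width from the sigma of a Gaussian fit can be sketched with NumPy alone. This is not the fitting routine used by the method (which returns a fit object); it swaps in a log-parabola fit, which only works for strictly positive, low-noise samples. The helper name `gaussian_width` is hypothetical, and treating the `sigma` argument as the number of standard deviations is an assumption.

```python
import numpy as np

def gaussian_width(x_set, y_set, sigma=3.0):
    """Estimate Gaussian center and width from positive samples (illustrative)."""
    # For y = A * exp(-(x - mu)^2 / (2 * s^2)), ln(y) is a parabola in x:
    # ln(y) = a*x^2 + b*x + c with a = -1 / (2 * s^2) and b = mu / s^2.
    a, b, _ = np.polyfit(x_set, np.log(y_set), 2)
    std = np.sqrt(-1.0 / (2.0 * a))
    center = -b / (2.0 * a)
    return float(center), float(sigma * std)   # width = number of sigmas * std

xs = np.arange(-10, 11, dtype=float)
ys = 5.0 * np.exp(-((xs - 1.5) ** 2) / (2 * 2.0 ** 2))   # mu = 1.5, s = 2.0
center, width = gaussian_width(xs, ys, sigma=3.0)
```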
- form_clusters(c_x: ndarray, c_y: ndarray, th=None)[source]
Form clusters and assign id to each formed cluster.
Form the cluster units and remove the small size cluster units. There is no pixel connected between different cluster units.
- Parameters:
c_x (numpy.ndarray) – Array of x coordinates for cluster pixels.
c_y (numpy.ndarray) – Array of y coordinates for cluster pixels.
th (int, optional) – Size threshold used for removing noisy cluster. Defaults to None.
- Returns:
Information of cluster pixels after cluster units are formed,
new_x (numpy.ndarray): Array of x coordinates of cluster pixels.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels.
new_index (numpy.ndarray): Array of cluster id on cluster pixels.
- Return type:
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for c_x or c_y.
Exception – If the size of c_x or c_y are not the same.
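The notion of cluster units with no connected pixels between them can be sketched as connected-component labeling over the pixel coordinates. The helper `label_clusters` is hypothetical; 8-connectivity follows the description in locate_clusters(), while the threshold rule (units with at most `th` pixels get id 0) is an assumption and not the method's actual size test.

```python
import numpy as np
from collections import deque

def label_clusters(c_x, c_y, th=0):
    """Label 8-connected pixel groups; groups with <= th pixels keep id 0 (illustrative)."""
    pixel_at = {(int(px), int(py)): i for i, (px, py) in enumerate(zip(c_x, c_y))}
    visited = np.zeros(len(c_x), dtype=bool)
    index = np.zeros(len(c_x), dtype=int)
    next_id = 1
    for start in range(len(c_x)):
        if visited[start]:
            continue
        visited[start] = True
        members, queue = [], deque([start])
        while queue:                      # breadth-first search over neighbors
            i = queue.popleft()
            members.append(i)
            px, py = int(c_x[i]), int(c_y[i])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    j = pixel_at.get((px + dx, py + dy))
                    if j is not None and not visited[j]:
                        visited[j] = True
                        queue.append(j)
        if len(members) > th:             # assumed size rule for noisy units
            index[members] = next_id
            next_id += 1
    return index

c_x = np.array([0, 1, 2, 10, 11, 20])
c_y = np.array([0, 1, 1, 5, 5, 9])
index = label_clusters(c_x, c_y, th=1)
```

The isolated pixel at (20, 9) is treated as noise and keeps id 0.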
- get_cluster_points(polys_coeffs: ndarray)[source]
Compute cluster points (y values) along the fitting curve within x range of the cluster.
- Parameters:
polys_coeffs (numpy.ndarray) – Polynomial fit coefficients and area on clusters.
- Returns:
Array containing cluster points (y values) along the trace based on the polynomial fitting. Each row holds the y values along the x axis for one cluster.
- Return type:
numpy.ndarray
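A minimal sketch of evaluating each fitted polynomial over the x range of its cluster is shown below. The helper `cluster_points` is hypothetical; the row layout (coefficients, then min_x, max_x, min_y, max_y) follows curve_fitting_on_one_cluster(), while the output width `nx` and the zero fill outside the cluster's x range are assumptions.

```python
import numpy as np

def cluster_points(polys_coeffs, power, nx):
    """Evaluate each cluster's polynomial within its own x range (illustrative)."""
    points = np.zeros((polys_coeffs.shape[0], nx))
    for row, info in enumerate(polys_coeffs):
        coeffs = info[:power + 1]                        # highest degree first
        min_x, max_x = int(info[power + 1]), int(info[power + 2])
        xs = np.arange(min_x, max_x + 1)
        points[row, min_x:max_x + 1] = np.polyval(coeffs, xs)
    return points

# One cluster: y = 0.5 * x + 10 for x in [2, 5].
polys = np.array([[0.5, 10.0, 2, 5, 11, 13]])
pts = cluster_points(polys, power=1, nx=8)
```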
- static get_cluster_size(c_id: int, index: ndarray, x: ndarray, y: ndarray)[source]
Compute the width, height, total pixels and pixel index collection per specified cluster id.
- Parameters:
c_id (int) – Cluster id.
index (np.ndarray) – Array of cluster id on cluster pixels.
x (np.ndarray) – Array of x coordinates of cluster pixels.
y (np.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Size information of the cluster,
w (int): width of the cluster c_id.
h (int): height of the cluster c_id.
total_pixel (int): total pixel contained in the cluster c_id.
crt_idx (numpy.ndarray): Array contains the index from index for all pixels belonging to cluster c_id.
- Return type:
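The four returned values can be sketched with a few NumPy calls. The helper `cluster_size` is hypothetical, and counting width and height inclusively (max - min + 1) is an assumption.

```python
import numpy as np

def cluster_size(c_id, index, x, y):
    """Width, height, pixel count and pixel indices of one cluster (illustrative)."""
    crt_idx = np.where(index == c_id)[0]                 # positions in index for c_id
    w = int(x[crt_idx].max() - x[crt_idx].min() + 1)     # assumed inclusive width
    h = int(y[crt_idx].max() - y[crt_idx].min() + 1)     # assumed inclusive height
    return w, h, crt_idx.size, crt_idx

index = np.array([1, 1, 2, 1])
x = np.array([3, 4, 9, 6])
y = np.array([7, 7, 2, 8])
w, h, total_pixel, crt_idx = cluster_size(1, index, x, y)
```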
- get_config_value(param: str, default)[source]
Get defined value from the config file.
Search for the value of the specified property in the config section. The default value is returned if the property is not found.
- Parameters:
param (str) – Name of the parameter to be searched.
default (str/int/float) – Default value for the searched parameter.
- Returns:
Value for the searched parameter.
- Return type:
int/float/str
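The lookup-with-default behavior can be sketched with the standard `configparser` module. The helper `lookup_config_value` and the key names `fitting_poly_degree` and `trace_gap` are hypothetical, and converting the stored string to the type of the default is an assumption about how the return type is chosen.

```python
import configparser

def lookup_config_value(section, param, default):
    """Return section[param] cast to the default's type, or default if missing."""
    if param not in section:
        return default
    raw = section[param]            # configparser stores every value as a string
    if isinstance(default, int):
        return int(raw)
    if isinstance(default, float):
        return float(raw)
    return raw

config = configparser.ConfigParser()
config.read_string("[PARAM]\nfitting_poly_degree = 3\n")
degree = lookup_config_value(config["PARAM"], "fitting_poly_degree", 2)
gap = lookup_config_value(config["PARAM"], "trace_gap", 10.5)   # not defined: default
```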
- get_fit_error_threshold()[source]
Get polynomial fitting mean square error threshold
- Returns:
error threshold
- Return type:
- get_poly_degree()[source]
Order of polynomial for order trace fitting.
- Returns:
Order of polynomial.
- Return type:
- static get_segments_from_index_list(id_list: ndarray, loc: ndarray)[source]
Find horizontal segments at some y location.
Horizontal segment means a segment containing continuous cluster pixels at the same y position. The finding is based on index list associated with an array of x coordinates.
- Parameters:
id_list (numpy.ndarray) – Array of index for the array of loc.
loc (numpy.ndarray) – Array of x coordinates of cluster pixels.
- Returns:
List of horizontal segments, like:
[[<start_idx>_i, <end_idx>_i], ..., [<start_idx>_n, <end_idx>_n]]
where <start_idx>_i and <end_idx>_i represent the starting and ending index of the i-th segment and the index is associated with parameter loc.
ex. [[1, 3], [7, 10], ..., [150, 160]] means the following segments are included,
    1st segment is from loc[1] to loc[3] along x-axis.
    2nd segment is from loc[7] to loc[10] along x-axis.
    last segment is from loc[150] to loc[160] along x-axis.
- Return type:
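Splitting a row of cluster pixels into runs of consecutive x values can be sketched as below. The helper `horizontal_segments` is hypothetical, and it assumes id_list is ordered by increasing x.

```python
import numpy as np

def horizontal_segments(id_list, loc):
    """Group indices into runs whose x coordinates are consecutive (illustrative)."""
    segments = []
    if len(id_list) == 0:
        return segments
    start = prev = int(id_list[0])
    for i in id_list[1:]:
        i = int(i)
        if loc[i] - loc[prev] != 1:     # gap in x: close the current run
            segments.append([start, prev])
            start = i
        prev = i
    segments.append([start, prev])      # close the last run
    return segments

loc = np.array([4, 5, 6, 9, 10, 15])    # x coordinates at one y position
segs = horizontal_segments(np.arange(6), loc)
```

Each pair holds indices into `loc`, not x values, matching the format described above.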
- get_sigma_for_width_fititng()[source]
Get the deviation number to estimate the width of the order
- Returns:
number of sigma
- Return type:
- static get_sorted_index(poly_coeffs: ndarray, cluster_no: int, power: int, x_loc: int)[source]
Get sorted index for a cluster.
Do sorting on the list with cluster id based on the cluster’s position (y values) at x_loc and find the index of the cluster with cluster_no in the newly sorted list.
- Parameters:
- Returns:
contains the sorted information, like:
{
    'idx': int,
        # index of the cluster `cluster_no` from the new sorted list.
    'index_v_pos': numpy.ndarray
        # sorted list of cluster id based on the y position at `x_loc`.
}
- Return type:
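The sorting step can be sketched by evaluating every cluster's polynomial at x_loc and ordering the cluster ids by the result. The helper `sorted_index` is hypothetical. It assumes that row k of poly_coeffs describes cluster id k and that row 0 is an unused placeholder; the example rows hold only the coefficients, without the area columns.

```python
import numpy as np

def sorted_index(poly_coeffs, cluster_no, power, x_loc):
    """Sort cluster ids by y position at x_loc and locate cluster_no (illustrative)."""
    ids = np.arange(1, poly_coeffs.shape[0])        # assumed: row k <-> cluster id k
    y_at_x = np.array([np.polyval(poly_coeffs[c, :power + 1], x_loc) for c in ids])
    index_v_pos = ids[np.argsort(y_at_x)]           # cluster ids ordered by y at x_loc
    idx = int(np.where(index_v_pos == cluster_no)[0][0])
    return {'idx': idx, 'index_v_pos': index_v_pos}

# Linear traces: cluster 1 at y = 30, cluster 2 at y = 10, cluster 3 at y = 0.1 x + 15.
coeffs = np.array([[0.0, 0.0], [0.0, 30.0], [0.0, 10.0], [0.1, 15.0]])
result = sorted_index(coeffs, cluster_no=3, power=1, x_loc=50)
```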
- get_spectral_data()[source]
Get spectral information including data and dimension.
- Returns:
Information of spectral data,
(numpy.ndarray): 2D spectral data.
nx (int): Width of the data.
ny (int): Height of the data.
- Return type:
- get_trace_vertical_gap()[source]
Get the estimated vertical gap between the traces
- Returns:
Vertical gap between traces
- Return type:
- get_width_default()[source]
Get the trace width default
- Returns:
number of width default
- Return type:
- handle_noisy_cluster(index_t: ndarray, x: ndarray, y: ndarray, num_set: list)[source]
Handle the cluster which is not well fitted by polynomial curve.
- Parameters:
index_t (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates on cluster pixels.
y (numpy.ndarray) – Array of y coordinates on cluster pixels.
num_set (list) – The cluster with the specified id (currently, the first member in the list) is handled.
- Returns:
Status after processing:
new_index_t (np.ndarray): updated version of index_t after processing
status (dict): One of the following possible process results is returned:
the cluster is to be deleted.
the cluster pixels are to be changed.
the cluster is to be split into multiple cluster units.
the cluster remains the same.
it is like:
{
    'msg': 'delete'/'change'/'split'/'same',
    'cluster_id': <target_cluster_id> int,
    'cluster_added': [<new_id_1>, <new_id_2>, ..., <new_id_i>],
    'poly_fitting': {
        <cluster_id>: {
            'errors': float,          # Least square error by using polynomial fit.
            'coeffs': numpy.ndarray,  # Coefficients of polynomial fit.
            'area': list,             # Area of the cluster, like
                                      # [<min_x>, <max_x>, <min_y>, <max_y>]
                                      # for 4 borders of the cluster.
        },
        <new_id_1>: {'errors': ..., 'coeffs': ..., 'area': ...},
        <new_id_n>: {'errors': ..., 'coeffs': ..., 'area': ...}
    }
}
# <new_id_i> is the id for newly created cluster, if 'split'.
- Return type:
- locate_clusters(img_rows_to_reset=None, img_cols_to_reset=None)[source]
Find cluster pixels from 2D data array.
Smooth the image, convert the pixels to 1 or 0, and find the cluster pixels. Cluster pixels are pixels with value 1, each connected to at least one neighboring pixel in the vertical, horizontal or diagonal direction.
- Parameters:
- Returns:
result of formed clusters, like:
{
    'x': numpy.ndarray,
        # Array of x coordinates of cluster pixels.
    'y': numpy.ndarray,
        # Array of y coordinates of cluster pixels.
    'cluster_image': numpy.ndarray
        # 2D image in which the cluster pixels are with
        # value 1 and non cluster pixels are with value 0.
}
- Return type:
- make_2d_data(index: ndarray, x: ndarray, y: ndarray, selected_clusters: ndarray = None)[source]
Create 2D data based on cluster pixels related information.
- Parameters:
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
selected_clusters (numpy.ndarray, optional) – Make 2D data based on selected clusters only. Defaults to None.
- Returns:
2D data with pixels set as 1 on cluster pixels, or 0 on non cluster pixels.
- Return type:
numpy.ndarray
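Rebuilding a binary image from the cluster pixel arrays can be sketched as below. The helper `make_image` is hypothetical; the explicit `shape` argument is an assumption (the method presumably uses the dimensions of the stored image), as is treating ids below 1 as non-cluster pixels.

```python
import numpy as np

def make_image(index, x, y, shape, selected_clusters=None):
    """Binary image with 1 on cluster pixels and 0 elsewhere (illustrative)."""
    img = np.zeros(shape, dtype=np.uint8)
    if selected_clusters is None:
        keep = index > 0                        # assumed: ids below 1 are unassigned
    else:
        keep = np.isin(index, selected_clusters)
    img[y[keep], x[keep]] = 1                   # rows are y, columns are x
    return img

index = np.array([1, 1, 2, 0])
x = np.array([0, 1, 3, 2])
y = np.array([0, 0, 2, 1])
img_all = make_image(index, x, y, shape=(3, 4))
img_sel = make_image(index, x, y, shape=(3, 4), selected_clusters=np.array([2]))
```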
- merge_clusters(index: ndarray, x: ndarray, y: ndarray)[source]
Merge clusters based on the closeness between the clusters and the fitting quality by the same polynomial.
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Information of cluster pixels after processing,
new_x (numpy.ndarray): Array of x coordinates of cluster pixels after processing.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels after processing.
new_index (numpy.ndarray): Array of cluster id of cluster pixels.
m_coeffs (numpy.ndarray): Array containing polynomial fitting coefficients and the area of the clusters. Each row of the array has the data for one cluster.
- Return type:
- merge_clusters_and_clean(index: ndarray, x: ndarray, y: ndarray)[source]
Merge clusters and remove the clusters with big opening in the center.
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Information of cluster pixels after merging,
new_x (numpy.ndarray): Array of x coordinates of cluster pixels after processing.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels after processing.
new_index (numpy.ndarray): Array of cluster id of cluster pixels after processing.
- Return type:
- Raises:
AttributeError – The Raises section is a list of all exceptions that are relevant to the interface.
TypeError – If there is type error for x, y or index.
Exception – If the size of x, y, or index are not the same.
- merge_fitting_curve(poly_curves: ndarray, index: ndarray, x: ndarray, y: ndarray, threshold=2.5)[source]
Merge the cluster to the closest neighbor.
The merge iterates over cluster pairs and stops when one merge is made or all pairs have been tested.
- Parameters:
poly_curves (numpy.ndarray) – Array containing coefficients of polynomial fitting to all clusters and the area of the clusters. Each row contains the coefficients and the area for one cluster.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
threshold (float) – error threshold to determine the polynomial fitting quality.
- Returns:
merge status, like:
{
    'status': 'changed'|'nochange',
    'index': numpy.ndarray,   # Array of cluster id on cluster pixels after merge.
    'kept_curves': list,      # Array of cluster id of unchanged clusters.
    'log': <message>
}
# 'status' means if clusters are 'changed' (if merge happens) or 'nochange'.
# 'log' contains the message regarding any merge action if there is,
# like 'remove id' or 'merge id_1 and id_2'.
- Return type:
- static merge_two_clusters(cluster_nos: ndarray, x: ndarray, y: ndarray, index: ndarray, power: int)[source]
Calculate the polynomial fitting error and distance in case two clusters are merged.
- Parameters:
cluster_nos (numpy.ndarray) – Array of two cluster ids; the first is the leftmost cluster.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
power (int) – Degree of polynomial to fit two clusters.
- Returns:
Information of polynomial fit to two clusters,
poly_info (numpy.ndarray): Array contains coefficients of fitting polynomial and area of the cluster after the merge.
errors (float): Least square error of polynomial fitting.
- Return type:
- static mirror_data(x_set: ndarray, y_set: ndarray, mirror_side: int)[source]
Mirror y value to the left or right side of x_set.
- Parameters:
x_set (numpy.ndarray) – Array of x values.
y_set (numpy.ndarray) – Array of y values paired to each of x_set.
mirror_side (int) – Mirror direction. Mirror to the left side of x_set if 0, or to the right side of x_set if 1.
- Returns:
Data after mirroring,
x_new_set (numpy.ndarray): Array containing x coordinates from left to the right after mirroring.
y_new_set (numpy.ndarray): Array containing y coordinates relevant to x_new_set.
- Return type:
- one_step_merge_cluster(crt_coeffs: ndarray, crt_index: ndarray, crt_x: ndarray, crt_y: ndarray)[source]
Single step of cluster merging, at most one pair of clusters is merged.
- Parameters:
crt_coeffs (numpy.ndarray) – Coefficients of polynomial fit and the area of the clusters.
crt_index (numpy.ndarray) – Array of cluster id on cluster pixels.
crt_x (numpy.ndarray) – Array of x coordinates of cluster pixels.
crt_y (numpy.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Information of cluster pixels after merge and merge status:
(numpy.ndarray): Array of cluster id of cluster pixels after merge.
(numpy.ndarray): Array of x coordinates of cluster pixels after merge.
(numpy.ndarray): Array of y coordinates of cluster pixels after merge.
(numpy.ndarray): Coefficients of polynomial fit and the area of the clusters after the merge.
merge_status (dict): merge status, please see merge_fitting_curve() for the detail.
- Return type:
- post_process(orig_coeffs, orig_widths, orderlet_gap=2)[source]
Post-process and refine the calculated widths so that they lie closer to the valley between two consecutive orderlet traces and are more symmetric about the valley.
- Parameters:
orig_coeffs – coeffs from high order to low order
orig_widths – widths array of lower and upper widths
- Returns:
new_coeffs: orig_coeffs with one extra row added, in the same format as the parameter cluster_coeffs of write_cluster_info_to_dataframe.
list: orig_widths containing the width information, in the same format as the parameter cluster_widths of write_cluster_info_to_dataframe.
- Return type:
numpy.array
- remove_broken_cluster(index: ndarray, x: ndarray, y: ndarray)[source]
Remove the cluster which has big opening around the center of the image.
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
- Returns:
Information of cluster pixels after processing,
new_x (numpy.ndarray): Array of x coordinates of cluster pixels after processing.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels after processing.
new_index (numpy.ndarray): Array of cluster id on cluster pixels after processing.
- Return type:
- remove_cluster_by_size(clusters_endy_dict: dict, x_index: ndarray, y_index: ndarray, th=None)[source]
Remove noisy clusters.
The removal process is based on the pixel count and the size of the cluster. An id is assigned to each non-noisy cluster.
- Parameters:
clusters_endy_dict (dict) – Collection of clusters collected by collect_clusters, please see Returns section of collect_clusters() for more detail.
x_index (numpy.ndarray) – Array of x coordinates of cluster pixels.
y_index (numpy.ndarray) – Array of y coordinates of cluster pixels.
th (int, optional) – Size threshold for removing the noisy cluster. Defaults to None.
- Returns:
cluster information containing assigned id, like:
{
    'index': numpy.ndarray,   # array of cluster id associated with cluster pixels.
    'n_regions': int          # total number of clusters.
}
- Return type:
- remove_noise_in_cluster(cluster_curves: list, x_index: ndarray, y_index: ndarray, crt_cluster_idx: ndarray, th=None)[source]
Remove a noisy cluster, trim noise from the cluster, or split the cluster into several clusters.
The removal works on the clusters collected by handle_noisy_cluster(). Whether a cluster in the collection is kept or removed depends on its size and polynomial fitting result.
- Parameters:
cluster_curves (list) – Array of clusters collected by handle_noisy_cluster(), which are tested to be kept or removed.
x_index (numpy.ndarray) – Array of x coordinates of cluster pixels.
y_index (numpy.ndarray) – Array of y coordinates of cluster pixels.
crt_cluster_idx (numpy.ndarray) – Set of indices for the clusters included in cluster_curves. The indices refer to the cluster pixel arrays, like x_index or y_index.
th (float, optional) – Threshold for cluster size. Defaults to None.
- Returns:
Polynomial fit results and cluster ids for the clusters that were not removed,
index (numpy.ndarray): Array associated with cluster pixels, in which the pixels covered by any remaining cluster of cluster_curves are marked with a cluster number starting from 1.
poly_fitting_results (dict): Polynomial fitting results for the remaining clusters in cluster_curves, like:
{
  'errors': float,          # least-squares error of the polynomial fit.
  'coeffs': numpy.ndarray,  # coefficients of the polynomial fit.
  'area': list              # area of the cluster, like [<min_x>, <max_x>, <min_y>, <max_y>] for the 4 borders of the cluster.
}
- Return type:
- static remove_unassigned_cluster(x: ndarray, y: ndarray, index: ndarray)[source]
Remove the cluster pixels that have no cluster number assigned.
- Parameters:
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
index (numpy.ndarray) – Array of cluster id on cluster pixels.
- Returns:
Information of cluster pixels after processing,
x_r (numpy.ndarray): Array of x coordinates of cluster pixels after processing.
y_r (numpy.ndarray): Array of y coordinates of cluster pixels after processing.
index_r (numpy.ndarray): Array of cluster id on cluster pixels after processing.
- Return type:
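The filtering described for remove_unassigned_cluster can be illustrated with a boolean mask. This is only a sketch, not the module's code: the helper name drop_unassigned is hypothetical, and it assumes that an unassigned pixel carries a cluster id below 1.

```python
import numpy as np


def drop_unassigned(x, y, index):
    """Keep only pixels whose cluster id is at least 1 (assumed convention)."""
    keep = index >= 1  # boolean mask over the pixel arrays
    return x[keep], y[keep], index[keep]


x = np.array([4, 5, 6, 7])
y = np.array([10, 10, 11, 12])
index = np.array([1, 0, 2, 0])
x_r, y_r, index_r = drop_unassigned(x, y, index)
```

The three arrays stay aligned because the same mask is applied to each of them.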
- reorganize_index(index: ndarray, x: ndarray, y: ndarray, return_map: bool = False)[source]
Remove cluster pixels with an unassigned cluster number and reorder the clusters.
Remove the cluster pixels with a cluster number less than 1 and re-assign cluster ids to the remaining cluster pixels.
- Parameters:
index (numpy.ndarray) – Array of cluster id on cluster pixels.
x (numpy.ndarray) – Array of x coordinates of cluster pixels.
y (numpy.ndarray) – Array of y coordinates of cluster pixels.
return_map (bool, optional) – Return map between old cluster id and new cluster id if True.
- Returns:
Information of cluster pixels after processing,
new_x (numpy.ndarray): Array of x coordinates of cluster pixels after processing.
new_y (numpy.ndarray): Array of y coordinates of cluster pixels after processing.
new_index (numpy.ndarray): Array of cluster id on cluster pixels after processing.
return_map (dict): Map between old cluster id and new cluster id like:
{ <old cluster id> int: <new cluster id> int }
- Return type:
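One way to picture the relabelling in reorganize_index is with numpy.unique. The sketch below is an illustration under assumptions, not the method's actual code: the name relabel_clusters is hypothetical, and new ids are assumed to follow the ascending order of the old ids, which the documentation above does not specify.

```python
import numpy as np


def relabel_clusters(index, x, y, return_map=False):
    """Drop pixels with id < 1 and renumber the remaining ids as 1..n."""
    keep = index >= 1
    new_x, new_y, kept = x[keep], y[keep], index[keep]
    # old_ids is sorted; inverse gives each pixel's position in old_ids.
    old_ids, inverse = np.unique(kept, return_inverse=True)
    new_index = inverse + 1
    if return_map:
        id_map = {int(old): new for new, old in enumerate(old_ids, start=1)}
        return new_x, new_y, new_index, id_map
    return new_x, new_y, new_index


index = np.array([0, 3, 3, 7, -1, 5])
x = np.arange(6)
y = np.arange(6) * 10
new_x, new_y, new_index, id_map = relabel_clusters(index, x, y, return_map=True)
```

Returning the map only on request mirrors the return_map flag in the signature above.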
- static reset_row_or_column(imm: ndarray, reset_ranges: list = None, row_or_column: int = 0, val: int = 0)[source]
Set columns or rows of a 2D image array to a value.
Assign the value to all pixels in the selected columns or rows.
- Parameters:
- Returns:
Pixel information after resetting,
imm (numpy.ndarray): 2D array with reset value.
(numpy.ndarray): Array of x coordinate of pixels with value greater than 0.
(numpy.ndarray): Array of y coordinate of pixels with value greater than 0.
- Return type:
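Because the parameter descriptions for reset_row_or_column are not listed here, the following sketch rests entirely on assumptions: reset_lines is a hypothetical stand-in, each entry of reset_ranges is taken to be a half-open [start, end) pair, and row_or_column equal to 0 is taken to mean rows. It only illustrates the general idea of overwriting bands of an image and then collecting the pixels still greater than 0.

```python
import numpy as np


def reset_lines(imm, reset_ranges=None, row_or_column=0, val=0):
    """Hypothetical stand-in: overwrite bands of rows or columns with val."""
    # Assumption: each entry is a half-open [start, end) pair and
    # row_or_column == 0 selects rows, any other value selects columns.
    for start, end in (reset_ranges or []):
        if row_or_column == 0:
            imm[start:end, :] = val
        else:
            imm[:, start:end] = val
    y_pos, x_pos = np.where(imm > 0)  # first axis is y (rows), second is x
    return imm, x_pos, y_pos


image = np.ones((4, 3), dtype=int)
image, xs, ys = reset_lines(image, reset_ranges=[[0, 2]], row_or_column=0, val=0)
```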
- set_data_range(data_range=None)[source]
Set the data range to be processed.
- Parameters:
data_range (list) – Area of the data, [x1, x2, y1, y2], to be processed. A non-negative number counts columns (or rows) from the first column (or row); a negative number counts from the last one.
- Returns:
Data range position relative to the first column and first row of the raw image.
- Return type:
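The handling of negative entries in data_range can be sketched as follows. This is a hedged illustration rather than the method itself: resolve_data_range, n_cols and n_rows are hypothetical names, and the sketch assumes Python-style counting in which -1 refers to the last column or row.

```python
def resolve_data_range(data_range, n_cols, n_rows):
    """Convert [x1, x2, y1, y2] with possibly negative entries to absolute positions."""
    sizes = [n_cols, n_cols, n_rows, n_rows]
    # Assumption: a negative value v is counted from the end, so -1 is the last one.
    return [v if v >= 0 else size + v for v, size in zip(data_range, sizes)]


resolved = resolve_data_range([10, -1, 0, -5], n_cols=100, n_rows=50)
```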
- sort_cluster_in_y(cluster_coeffs: ndarray)[source]
Sort cluster based on vertical position.
- Parameters:
cluster_coeffs (np.ndarray) – Array containing the coefficients of the polynomial fit and the area of each cluster.
- Returns:
Sorted list of cluster id based on the vertical position of the clusters.
- Return type:
np.ndarray
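A vertical sort of this kind could be done by evaluating each fitted polynomial at a common x position and ordering the results. The sketch below assumes a layout that the text above does not confirm: one row of coefficients per cluster, lowest order first, with cluster ids starting at 1. The names sort_ids_by_y and x_ref are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def sort_ids_by_y(coeffs, x_ref):
    """Return 1-based cluster ids ordered by fitted y value at x_ref."""
    # Assumption: coeffs has shape (n_clusters, degree + 1), lowest order first.
    y_at_ref = np.array([P.polyval(x_ref, c) for c in coeffs])
    return np.argsort(y_at_ref) + 1


coeffs = np.array([
    [50.0, 0.10],   # cluster 1: y = 50 + 0.10 * x
    [10.0, 0.00],   # cluster 2: y = 10
    [30.0, -0.05],  # cluster 3: y = 30 - 0.05 * x
])
order = sort_ids_by_y(coeffs, x_ref=100)
```

Comparing at a single reference column is a simplification; traces that cross or cover different x ranges would need a more careful comparison.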
- static sort_cluster_on_loc(clusters: list, loc)[source]
Sort the clusters based on the specified location key.
- static sort_cluster_segments(segments: list)[source]
Sort a set of segments based on the first number contained in each segment.
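Sorting segments by their first number maps directly onto Python's sorted with a key function. A minimal sketch, assuming each segment is a sequence such as [start, end]; the name sort_segments is hypothetical:

```python
def sort_segments(segments):
    """Return the segments ordered by the first number in each segment."""
    return sorted(segments, key=lambda seg: seg[0])


ordered = sort_segments([[30, 45], [5, 12], [18, 29]])
```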
- write_cluster_info_to_dataframe(cluster_widths: list, cluster_coeffs: ndarray)[source]
Write the coefficients of polynomial fit, area and top/bottom widths of order trace to DataFrame object.
- Parameters:
cluster_widths (list) –
Array containing the top and bottom widths of the clusters, like:
[
  {
    'top edge': float,     # top width of first cluster
    'bottom edge': float   # bottom width of first cluster
  },
  ....,
  {
    'top edge': float,     # top width of last cluster
    'bottom edge': float   # bottom width of last cluster
  }
]
cluster_coeffs (numpy.ndarray) – Array containing the coefficients of the polynomial fit and the area of each cluster.
- Returns:
Instance of DataFrame with columns (for a polynomial of degree 3) like,
Coeff0, Coeff1, Coeff2, Coeff3, BottomEdge, TopEdge, X1, X2
holding the polynomial fit coefficients from lowest to highest order, the bottom and top widths, and the left and right boundaries of the orders.
- Return type:
pandas.DataFrame
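The column layout described above can be reproduced with a small pandas sketch. It is an illustration under assumptions, not the method's implementation: clusters_to_frame is a hypothetical helper, the coefficients are assumed to be stored lowest order first, and the left and right boundaries are passed as a separate x_ranges array because the exact layout of cluster_coeffs is not documented here.

```python
import numpy as np
import pandas as pd


def clusters_to_frame(cluster_widths, coeffs, x_ranges):
    """Build a table with CoeffK, BottomEdge, TopEdge, X1 and X2 columns."""
    degree = coeffs.shape[1] - 1
    # Assumption: column k of coeffs holds the coefficient of x**k.
    columns = {f"Coeff{k}": coeffs[:, k] for k in range(degree + 1)}
    columns["BottomEdge"] = [w["bottom edge"] for w in cluster_widths]
    columns["TopEdge"] = [w["top edge"] for w in cluster_widths]
    columns["X1"] = x_ranges[:, 0]
    columns["X2"] = x_ranges[:, 1]
    return pd.DataFrame(columns)


widths = [
    {"top edge": 3.5, "bottom edge": 3.0},
    {"top edge": 4.0, "bottom edge": 4.5},
]
coeffs = np.array([[100.0, 0.1, 0.0, 0.0], [140.0, 0.1, 0.0, 0.0]])
x_ranges = np.array([[0, 4079], [12, 4079]])
df = clusters_to_frame(widths, coeffs, x_ranges)
```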