Practical Guide to Machine Learning, NLP, and Generative AI: Libraries, Algorithms, and Applications First Edition by unknow

Author:unknow
Language: eng
Format: epub
ISBN: 9781040260050
Published: 2024-12-15T00:00:00+00:00


4.1.1 Gene expression data

Code 4.1:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score
from scipy.cluster.hierarchy import dendrogram, linkage

# Generate synthetic gene expression data
data, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Perform hierarchical clustering
n_clusters = 4  # Number of clusters
agg_clustering = AgglomerativeClustering(n_clusters=n_clusters, linkage='ward')
agg_labels = agg_clustering.fit_predict(data)

# Calculate silhouette score
silhouette_avg = silhouette_score(data, agg_labels)
print(f"Silhouette Score: {silhouette_avg}")

# Plot dendrogram (truncated to the top levels of the tree)
plt.figure(figsize=(12, 6))
dendrogram(linkage(data, method='ward'), truncate_mode='level', p=3)
plt.title('Hierarchical Clustering Dendrogram', fontsize=16, fontweight='bold')
plt.xlabel('Sample Index', fontsize=14, fontweight='bold')
plt.ylabel('Distance', fontsize=14, fontweight='bold')
plt.xticks(fontsize=12, fontweight='bold')
plt.yticks(fontsize=12, fontweight='bold')
plt.show()

# Plot clustering results, coloring each point by its cluster label
plt.figure(figsize=(10, 6))
scatter = plt.scatter(data[:, 0], data[:, 1], c=agg_labels, cmap='viridis', s=50, alpha=0.5)
plt.title('Hierarchical Clustering', fontsize=16, fontweight='bold')
plt.xlabel('Feature 1', fontsize=14, fontweight='bold')
plt.ylabel('Feature 2', fontsize=14, fontweight='bold')
cbar = plt.colorbar(scatter)
cbar.set_label('Cluster Label', fontsize=14, fontweight='bold')
cbar.ax.yaxis.set_tick_params(labelsize=12)
plt.xticks(fontsize=12, fontweight='bold')
plt.yticks(fontsize=12, fontweight='bold')
plt.show()

Code 4.1 demonstrates hierarchical clustering on synthetic gene expression data. The program first imports the libraries needed to generate, cluster, evaluate, and plot the data. It then calls the ‘make_blobs’ function to create the synthetic gene expression data: 300 two-dimensional points scattered around four cluster centers. The ‘AgglomerativeClustering’ class performs the hierarchical clustering, with four clusters and ‘ward’ as the linkage criterion. Ward linkage merges, at each step, the pair of clusters whose union causes the smallest increase in within-cluster variance. A silhouette score is then computed to assess clustering quality in terms of the compactness and separation of the clusters. The dendrogram depicts the clustering structure, showing the order in which clusters are merged and the distance at which each merge occurs; it is truncated to the top levels of the tree so that it stays readable. Lastly, a scatter plot displays the clustering results, with each data point colored by its cluster label. Together, these steps show how hierarchical clustering can be applied to gene expression data and how the results can be visualized for further analysis and interpretation.
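The link between the dendrogram and the four cluster labels can be made concrete with a short sketch. It is not part of Code 4.1: it assumes the same synthetic data and uses SciPy's ‘fcluster’ function (not used in the original program) to cut the Ward linkage tree into at most four flat clusters, then compares that partition with the labels from ‘AgglomerativeClustering’ using the adjusted Rand index. The variable names are illustrative.

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Same synthetic data as in Code 4.1
data, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Linkage matrix: one row per merge, holding the two merged cluster ids,
# the merge distance, and the size of the newly formed cluster
Z = linkage(data, method="ward")

# Cut the tree so that at most four flat clusters remain
tree_labels = fcluster(Z, t=4, criterion="maxclust")

# Labels from the estimator used in Code 4.1
agg_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(data)

# The adjusted Rand index ignores label numbering, so it compares partitions only
ari = adjusted_rand_score(agg_labels, tree_labels)
print(f"Linkage matrix shape: {Z.shape}")
print(f"Agreement between the two partitions (ARI): {ari:.3f}")
```

With n samples the linkage matrix has n − 1 rows, one per merge. An adjusted Rand index near 1 would indicate that cutting the dendrogram and fitting the estimator describe the same grouping.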

The silhouette score quantifies how well an item fits its own cluster (cohesion) compared with the other clusters (separation). It ranges from −1 to 1: values closer to 1 indicate that items are well matched to their own cluster and poorly matched to neighboring clusters, while lower values indicate a worse match. Scores close to 0 show that the item lies on or near the boundary between two neighboring clusters, and negative values imply that the item may have been assigned to the wrong cluster. The silhouette score of around 0.792 shows that the clustering produced compact, well-separated clusters. This result supports the conclusion that the hierarchical clustering algorithm grouped the gene expression data successfully and that the clusters are distinct from one another. Figures 4.1 and 4.2 show the hierarchical clustering analysis and the hierarchical clustering formation, respectively.
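The definition above can be checked by hand for a single item. The following is a minimal sketch, assuming the same synthetic data as Code 4.1; the helper ‘manual_silhouette’ is hypothetical and written only for illustration. It computes a, the mean distance from the item to the other members of its own cluster, and b, the smallest mean distance from the item to the members of any other cluster, then forms s = (b − a) / max(a, b) and compares the result with scikit-learn's ‘silhouette_samples’.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances, silhouette_samples, silhouette_score

# Same synthetic data and clustering as in Code 4.1
data, _ = make_blobs(n_samples=300, centers=4, random_state=42)
labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(data)


def manual_silhouette(X, labels, i):
    """Silhouette value of sample i (hypothetical helper for illustration)."""
    # Euclidean distances from sample i to every sample
    dists = pairwise_distances(X[i:i + 1], X)[0]
    own = labels[i]

    # a: mean distance to the other members of the same cluster
    same = labels == own
    same[i] = False  # exclude the sample itself
    if not same.any():
        return 0.0  # convention for a cluster with a single member
    a = dists[same].mean()

    # b: smallest mean distance to the members of any other cluster
    b = min(dists[labels == other].mean()
            for other in np.unique(labels) if other != own)

    return (b - a) / max(a, b)


per_sample = silhouette_samples(data, labels)
manual_first = manual_silhouette(data, labels, 0)
overall = silhouette_score(data, labels)

print(f"Manual value for sample 0:  {manual_first:.4f}")
print(f"silhouette_samples value:   {per_sample[0]:.4f}")
print(f"Mean of per-sample values:  {per_sample.mean():.4f}")
print(f"silhouette_score:           {overall:.4f}")
```

The overall score reported by ‘silhouette_score’ is the mean of the per-sample values, so a few poorly placed items can hide behind a high average; inspecting the per-sample values shows which items sit near a cluster boundary.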

Figure 4.1: Hierarchical clustering analysis.

Figure 4.2: Hierarchical clustering formation.


