Active Machine Learning with Python by Margaux Masson-Forsythe
Author:Margaux Masson-Forsythe
Language: eng
Format: epub
Publisher: Packt Publishing Ltd.
Published: 2024-03-27T00:00:00+00:00
Applying uncertainty sampling to improve classification performance
We will choose the most informative images to label next from our dataset, namely the frames where the model is least confident, a method discussed in Chapter 2, Designing Query Strategy Frameworks.
We first define a function to get the model's uncertainty scores:
import numpy as np

def least_confident_score(predicted_probs):
    # Least-confidence score: 1 minus the probability of the top predicted class
    return 1 - predicted_probs[np.argmax(predicted_probs)]
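To make the score concrete, here is a minimal sketch that applies the same function to two made-up probability vectors. The vectors are illustrative assumptions, not outputs of the chapter's model.

```python
import numpy as np

def least_confident_score(predicted_probs):
    # 1 minus the probability of the most likely class
    return 1 - predicted_probs[np.argmax(predicted_probs)]

# Hypothetical softmax outputs for a three-class problem
confident = np.array([0.95, 0.03, 0.02])   # one class clearly dominates
uncertain = np.array([0.40, 0.35, 0.25])   # probabilities are close together

confident_score = least_confident_score(confident)
uncertain_score = least_confident_score(uncertain)
print(confident_score, uncertain_score)
```

The second vector gets the higher score, so it is the one a least-confidence strategy would prefer to send for labeling.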
Then, we define our data loader for the unlabeled set. We will use a batch size of 1 as we will loop through all the images to get the uncertainty scores:
unlabeled_loader = DataLoader(full_dataset, batch_size=1)
We collect the least-confidence scores for our set of unlabeled images:
least_confident_scores = []
for image, label in unlabeled_loader:
    probs = F.softmax(model(image), dim=1)
    score = least_confident_score(probs.detach().numpy()[0])
    least_confident_scores.append(score)
print(least_confident_scores)
This returns the following:
[0.637821763753891, 0.4338147044181824, 0.18698161840438843, 0.6028554439544678, 0.35655343532562256, 0.3845849633216858, 0.4887065887451172, ...]
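Scoring one image at a time is simple but slow on a large pool. The following is a minimal sketch of a batched variant that also disables gradient tracking. The names toy_model, toy_dataset, and toy_loader are hypothetical stand-ins for the chapter's model and dataset, built from random tensors so that the snippet runs on its own.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-ins: a linear classifier with 3 classes and 10 random samples
toy_model = torch.nn.Linear(8, 3)
toy_dataset = TensorDataset(torch.randn(10, 8), torch.zeros(10, dtype=torch.long))
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=False)

toy_model.eval()  # put layers such as dropout into inference mode
batch_scores = []
with torch.no_grad():  # gradients are not needed for scoring
    for images, _ in toy_loader:
        probs = F.softmax(toy_model(images), dim=1)
        # Least-confidence score per sample: 1 minus the highest class probability
        batch_scores.append(1 - probs.max(dim=1).values)
scores = torch.cat(batch_scores)
print(scores.shape)
```

Keeping shuffle=False matters here: the position of each score then matches the index of its image in the dataset, which the later selection step relies on.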
These values are the least-confidence scores of the model's predictions. The higher the score, the less confident the model is. Next, we want the indices of the images with the highest scores. We decide to select 200 images (queries):
num_queries = 200
Then, we sort the scores by uncertainty in ascending order, which is the default for torch.sort:
sorted_uncertainties, indices = torch.sort(torch.tensor(least_confident_scores))
We get the original indices of the most uncertain samples and print the results:
most_uncertain_indices = indices[-num_queries:]
print(f"sorted_uncertainties: {sorted_uncertainties} \nmost_uncertain_indices selected: {most_uncertain_indices}")
This returns the following:
sorted_uncertainties: tensor([0.0000, 0.0000, 0.0000, ..., 0.7419, 0.7460, 0.7928], dtype=torch.float64)
most_uncertain_indices selected: tensor([45820, 36802, 15912, 8635, 32207, 11987, 39232, 6099, 18543, 29082, 42403, 21331, 5633, 29284, 29566, 23878, 47522, 17097, 15229, 11468, 18130, 45120, 25245, 19864, 45457, 20434, 34309, 10034, 45285, 25496, 40169, 31792, 22868, 35525, 31238, 24694, 48734, 18419, 45289, 16126, 31668, 45971, 26393, ... 44338, 19687, 18283, 23128, 20556, 26325])
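Because the sort is ascending, slicing the last num_queries indices returns the selected samples ordered from least to most uncertain. As a small sketch on made-up scores (the values are assumptions for illustration), we can compare this with torch.topk, which selects the same samples but lists the most uncertain one first:

```python
import torch

# Hypothetical least-confidence scores for five images
scores = torch.tensor([0.10, 0.75, 0.40, 0.05, 0.62])
num_queries = 2

# Approach used in the text: ascending sort, then keep the last num_queries indices
_, indices = torch.sort(scores)
by_sort = indices[-num_queries:]

# Alternative: topk returns the largest values first
top_values, by_topk = torch.topk(scores, k=num_queries)

print(by_sort.tolist())   # [4, 1]
print(by_topk.tolist())   # [1, 4]
```

Both approaches pick images 1 and 4; only the order within the selection differs.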
Now we have the indices of the images selected using our active ML least-confident strategy. These are the images that would be sent to our oracles to be labeled and then used to train the model again.
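One possible way to organize this hand-off is to split the pool into the images to be labeled and the images that stay unlabeled for the next round. The sketch below uses torch.utils.data.Subset for that; toy_dataset and selected are hypothetical stand-ins for full_dataset and most_uncertain_indices, and the chapter itself may handle this step differently.

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Hypothetical stand-ins for full_dataset and most_uncertain_indices
toy_dataset = TensorDataset(torch.randn(10, 3), torch.zeros(10, dtype=torch.long))
selected = torch.tensor([7, 2, 9])

# Convert the index tensor to plain Python ints before indexing the dataset
selected_list = selected.tolist()
selected_set = set(selected_list)
remaining_list = [i for i in range(len(toy_dataset)) if i not in selected_set]

to_label = Subset(toy_dataset, selected_list)          # images sent to the oracle
still_unlabeled = Subset(toy_dataset, remaining_list)  # pool for the next query round

print(len(to_label), len(still_unlabeled))  # 3 7
```

Removing the selected images from the pool keeps the next round from querying the same samples again.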
Let's take a look at five of these selected images:
fig, axs = plt.subplots(1, 5)
for i in range(5):
    image, label = full_dataset[most_uncertain_indices[i]]
    image = image.squeeze().permute(1, 2, 0) / 2 + 0.5
    axs[i].imshow(image)
    axs[i].axis('off')
plt.show()
