Data Labeling in Machine Learning with Python by Vijaya Kumar Suda

Author: Vijaya Kumar Suda
Language: eng
Format: epub
Publisher: Packt Publishing Pvt Ltd
Published: 2024-01-30T00:00:00+00:00

Use case 4 – information retrieval using entity extraction

Let us see how to extract the entity names from the text data using Azure OpenAI.

Entity extraction is a vital aspect of NLP, involving the identification and extraction of specific entities, such as names, organizations, locations, and contact numbers, from a given text. In the presented code snippet, the task is to identify and extract people’s names, organization names, geographical locations, and contact numbers from various text passages.

The prompt provides clear instructions for the entity extraction task, specifying the entities of interest and their corresponding categories. It includes examples that illustrate how to extract information from different texts, showcasing the versatility of the entity extraction process.
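As a rough sketch of the prompt structure described above, the following shows one way an instruction, a worked example, and a new passage could be assembled into a single prompt string. The helper name build_prompt, the "Text:" layout, and the Raj Patel example are all assumptions made for illustration; they are not taken from the book's code.

```python
# Hypothetical helper (not from the book): assembles a few-shot prompt
# for entity extraction from an instruction, worked examples, and new text.
INSTRUCTION = (
    "Identify the individual's name, organization, geographical location, "
    "and contact number in the following text."
)


def build_prompt(examples, text):
    """Return the instruction, the worked examples, and the text to label."""
    parts = [INSTRUCTION, ""]
    for example_text, labels in examples:
        parts.append(f"Text: {example_text}")
        for field, value in labels.items():
            parts.append(f"{field}: {value}")
        parts.append("")  # blank line between examples
    parts.append(f"Text: {text}")
    parts.append("Name:")  # cue the model to answer in the same layout
    return "\n".join(parts)


# Made-up example pair, for illustration only
examples = [
    (
        "Hi, this is Raj Patel from Acme Corp in Denver, Colorado. "
        "Call me at (555) 987-6543.",
        {
            "Name": "Raj Patel",
            "Organization": "Acme Corp",
            "Geographical location": "Denver, Colorado",
            "Contact number": "(555) 987-6543",
        },
    )
]

prompt = build_prompt(
    examples,
    "Hello. I'm Sarah Johnson, and I'm reaching out on behalf of "
    "XYZ Tech Solutions based in Austin, Texas.",
)
print(prompt)
```

Ending the prompt with the first label is a design choice that nudges a completion-style model to continue in the same labeled layout, which keeps the later parsing step simple.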

The code uses the OpenAI API to generate a response containing the extracted entities (people’s names, organization names, locations, and contact numbers) from the given text passage. The response is returned as a JSON object, which makes it easy to parse and to integrate the extracted entities into further processing or analysis.

This example demonstrates a practical application of entity extraction: pulling relevant information out of diverse textual data, which is useful in domains such as customer relationship management, information retrieval, and data analysis:

import openai

# Uses the legacy (pre-1.0) openai SDK interface
response = openai.Completion.create(
    engine="gpt3.5 deployment name",
    prompt="""Identify the individual's name, organization, geographical location, and contact number in the following text.

Hello. I'm Sarah Johnson, and I'm reaching out on behalf of XYZ Tech Solutions based in Austin, Texas. Our team believes that our innovative products could greatly benefit your business. Please feel free to contact me at (555) 123-4567 at your convenience, and we can discuss how our solutions align with your needs.""",
    temperature=0.2,
    max_tokens=150,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None)
print(response['choices'])

Here’s the output:

[<OpenAIObject at 0x215d2c40770> JSON: {
  "text": " Thank you for your time, and I look forward to hearing from you soon.\n\nName: Sarah Johnson\nOrganization: XYZ Tech Solutions\nGeographical location: Austin, Texas\nContact number: (555) 123-4567",
  "index": 0,
  "finish_reason": "stop",
  "logprobs": null,
  "content_filter_results": {
    "hate": {
      "filtered": false,
      "severity": "safe"
    },
    "self_harm": {
      "filtered": false,
      "severity": "safe"
    },
    "sexual": {
      "filtered": false,
      "severity": "safe"
    },
    "violence": {
      "filtered": false,
      "severity": "safe"
    }
  }
}]
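Before parsing the text, it can help to check the other fields in the response. The sketch below uses a plain dictionary as a stand-in for one element of response['choices'], with values copied from the output above. The helper name usable_text and the policy of rejecting anything that did not finish with "stop" are assumptions for illustration, not part of the book's code.

```python
# Stand-in for one element of response['choices'], copied from the output
# shown above (a plain dict, so no API call is needed to try this out).
choice = {
    "text": (
        " Thank you for your time, and I look forward to hearing from you soon."
        "\n\nName: Sarah Johnson\nOrganization: XYZ Tech Solutions"
        "\nGeographical location: Austin, Texas\nContact number: (555) 123-4567"
    ),
    "index": 0,
    "finish_reason": "stop",
    "logprobs": None,
    "content_filter_results": {
        "hate": {"filtered": False, "severity": "safe"},
        "self_harm": {"filtered": False, "severity": "safe"},
        "sexual": {"filtered": False, "severity": "safe"},
        "violence": {"filtered": False, "severity": "safe"},
    },
}


def usable_text(choice):
    """Return the completion text, or None if it looks incomplete or filtered.

    Hypothetical policy: accept only completions that ended with "stop"
    and that no content filter category flagged.
    """
    if choice.get("finish_reason") != "stop":
        return None  # e.g. "length" means the max_tokens limit cut it short
    filters = choice.get("content_filter_results", {})
    if any(result.get("filtered") for result in filters.values()):
        return None
    return choice.get("text", "").strip()


print(usable_text(choice))
```

A completion cut off by max_tokens may be missing some of the labeled lines, so skipping it early avoids errors in the extraction step.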

Now let’s extract the required information (name, organization, location, and contact number) from the output JSON, as follows:

# Each element of response['choices'] holds one completion
json_data = response['choices']

# Extract the information from each completion
for entry in json_data:
    text = entry.get("text", "")

    # Take the text after each label, up to the end of that line
    name = text.split("Name:")[1].split("\n")[0].strip()
    organization = text.split("Organization:")[1].split("\n")[0].strip()
    location = text.split("Geographical location:")[1].split("\n")[0].strip()
    contact_number = text.split("Contact number:")[1].split("\n")[0].strip()

    # Print the extracted information
    print("Name:", name)
    print("Organization:", organization)
    print("Location:", location)
    print("Contact Number:", contact_number)

Here’s the output:

Name: Sarah Johnson
Organization: XYZ Tech Solutions
Location: Austin, Texas
Contact Number: (555) 123-4567
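The split-based approach raises an IndexError if the model omits one of the labels. As a hedged alternative, the sketch below uses regular expressions and returns None for any missing field. The function name extract_entities and the FIELDS list are assumptions for illustration; the sample text is the completion shown earlier.

```python
import json
import re

# Labels expected in the completion text (taken from the output shown earlier)
FIELDS = ["Name", "Organization", "Geographical location", "Contact number"]


def extract_entities(text):
    """Map each label to the value that follows it on its line, or None."""
    entities = {}
    for field in FIELDS:
        # Match "Label: value" anchored to the start and end of a single line
        pattern = rf"^{re.escape(field)}:[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        entities[field] = match.group(1).strip() if match else None
    return entities


sample = (
    " Thank you for your time, and I look forward to hearing from you soon.\n\n"
    "Name: Sarah Johnson\n"
    "Organization: XYZ Tech Solutions\n"
    "Geographical location: Austin, Texas\n"
    "Contact number: (555) 123-4567"
)

entities = extract_entities(sample)
print(json.dumps(entities, indent=2))
```

Returning a dictionary rather than separate variables makes it straightforward to serialize the result with json.dumps or to load many results into a table for further analysis.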

