Machine Learning Upgrade: A Data Scientist's Guide to MLOps, LLMs, and ML Infrastructure by Kristen Kehrer & Caleb Kaiser

Author: Kristen Kehrer & Caleb Kaiser
Language: eng
Format: epub
Publisher: John Wiley & Sons, Incorporated
Published: 2024-07-04


The second thing you'll need to improve your inference pipeline is a metric to optimize. With LLMs, this is a tricky task. Many of the traditional natural language processing metrics, like Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores, are just too simple in their heuristics to accurately score LLMs. ROUGE is a set of metrics used to evaluate the quality of automatic summarization systems. There are multiple variations, including ROUGE-N, ROUGE-L, ROUGE-W, and more. One of the best approaches research teams have taken recently is to use humans as direct evaluators, but this too creates problems, not the least of which is the associated cost of manually scoring samples.
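For intuition, here is a minimal sketch of what a ROUGE computation looks like in practice, using the open source rouge-score package (the library choice and example sentences are assumptions for illustration, not something this chapter prescribes):

# Minimal ROUGE sketch; the rouge-score package is an illustrative choice.
from rouge_score import rouge_scorer

reference = "The function returns a sorted list of integers."
candidate = "The function gives back the integers in sorted order."

# ROUGE-1 counts unigram overlap; ROUGE-L uses the longest common subsequence.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
print(scores["rouge1"].fmeasure, scores["rougeL"].fmeasure)

Even this tiny example hints at the problem: the two sentences mean roughly the same thing, but n-gram overlap rewards surface similarity rather than semantic or functional correctness.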

Because of this, most researchers are stuck implementing custom scoring functions for their particular task, often combining different metrics like BERTScore, ROUGE, and custom benchmarks. With code generation, you have the advantage of being able to use unit tests to evaluate whether the code works, and that is exactly what you'll be doing in this next exercise.
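Before turning to that exercise, here is a rough sketch of what combining metrics into a custom scoring function can look like; the bert-score and rouge-score packages, the 50/50 weighting, and the function name are illustrative assumptions rather than the book's method:

# Illustrative custom metric: a weighted blend of BERTScore F1 and ROUGE-L.
from bert_score import score as bert_score
from rouge_score import rouge_scorer

def combined_score(prediction: str, reference: str) -> float:
    # BERTScore compares contextual embeddings; returns precision, recall, F1.
    _, _, f1 = bert_score([prediction], [reference], lang="en")
    # ROUGE-L measures longest-common-subsequence overlap.
    rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(reference, prediction)["rougeL"].fmeasure
    # The 50/50 weighting is an arbitrary assumption.
    return 0.5 * float(f1[0]) + 0.5 * rouge_l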

Your task is to build a pipeline that, given a description of a Python function and some associated unit tests, will generate an acceptable piece of code.

To test your pipeline, you'll use the following prompt template:

code_gen_template = """#INSTRUCTION: Write a Python function named {name} that {description}. Make sure to include all necessary imports.
#RESPONSE
"""

code_gen_template_w_tests = """#INSTRUCTION: Write a Python function named {name} that {description}. Make sure to include all necessary imports. The function {name} will be evaluated with the following unit tests:
{tests}
#RESPONSE
"""
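To make the substitution concrete, here is a hypothetical rendering of the test-aware template for the first task defined below (image_tests is the unit-test source string introduced in a moment):

# Hypothetical rendering of the test-aware template for the image task.
prompt = code_gen_template_w_tests.format(
    name="generate_image(dimensions)",
    description="takes a string containing the dimensions of an image, "
                "like '200x300', and generates an image of those dimensions "
                "using 3 random colors, before finally returning the image object",
    tests=image_tests,  # unit-test source code as a string, defined below
)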

You'll also need some prompts and associated unit tests. The full code for the unit tests is available at this book's GitHub, but in general, the unit tests look like this:

class TestGenerateImage(unittest.TestCase):
    def test_valid_input(self):
        width, height = 200, 300
        image = generate_image(f'{width}x{height}')
        self.assertEqual(image.size, (width, height))

They are accompanied by a variable containing all of the code for the unit tests as a string. You can store all of this information, along with your prompts, in a list like so:

TESTS = [
    {
        "name": "generate_image(dimensions)",
        "description": "takes a string containing the dimensions of an image, like '200x300', and generates an image of those dimensions using 3 random colors, before finally returning the image object.",
        "tests": image_tests,
        "tests_class": TestGenerateImage
    },
    {
        "name": "evaluate_expression(expression)",
        "description": "takes a string containing a mathematical equation, parses the equation, and returns its evaluated result.",
        "tests": math_tests,
        "tests_class": TestEvaluateExpression
    },
    {
        "name": "merge_k_lists(lists)",
        "description": "takes an array of k linked-lists lists, each sorted in ascending order, and merges all the linked-lists into one sorted linked-list, returning the final sorted linked-list.",
        "tests": merge_k_tests,
        "tests_class": TestMergeKLists
    }
]
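To turn one of these entries into a score, the generated code has to be executed and the entry's test class run against it. The helper below is a minimal sketch of that idea, assuming exec() into the module globals so the test class can find the newly defined function; it is an illustration, not necessarily the book's implementation:

import unittest

def score_generation(generated_code: str, entry: dict) -> bool:
    # Define the generated function (e.g. generate_image) at module scope
    # so the entry's test class can resolve it by name.
    exec(generated_code, globals())
    # Load and run the entry's unittest class; True only if every test passed.
    suite = unittest.TestLoader().loadTestsFromTestCase(entry["tests_class"])
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()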

Now, to perform inference, you'll need a pipeline, including nodes for your prompt and for evaluating your output, as shown in Listing 4.3.
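Listing 4.3 itself is not reproduced in this excerpt; purely to sketch the shape such a pipeline can take, the loop below renders a prompt for each entry, asks an LLM for code, and scores the result with the score_generation helper sketched above. The OpenAI client, the model name, and the assumption that the model returns bare Python are illustrative choices, not the book's listing:

# Illustrative pipeline only -- not the book's Listing 4.3.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_code(entry: dict) -> str:
    # Prompt node: render the template for this entry and call the model.
    prompt = code_gen_template_w_tests.format(
        name=entry["name"],
        description=entry["description"],
        tests=entry["tests"],
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Evaluation node: run each generation against its unit tests.
for entry in TESTS:
    code = generate_code(entry)
    passed = score_generation(code, entry)  # helper sketched above
    print(f"{entry['name']}: {'pass' if passed else 'fail'}")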


