Large Language Models (1.1.0)

Function Calling

You can connect NIM to external tools and services using function calling (also known as tool calling). By providing a list of available functions, NIM can choose to output function arguments for the relevant function(s), which you can then execute to augment the prompt with relevant external information.

Function calling is controlled using the tool_choice, tools, and parallel_tool_calls parameters. Only the following models support function calling, and only a subset of those models support parallel tool calling.

Model                     Parallel Tool Calls Supported
Llama-3.1-8b-Instruct     No
Llama-3.1-70b-Instruct    No

To use function calling, modify the tool_choice, tools, and parallel_tool_calls parameters.

Parameter             Description
tool_choice           Specifies how the model should choose tools. Has four options: "none", "auto", "required", or named tool choice. Requires that tools is also set.
tools                 The list of tool objects that define the functions the model can call. Requires that tool_choice is also set.
parallel_tool_calls   Boolean value (True or False) specifying whether to make tool calls in parallel. Default is False. Requires a model that supports it.

tool_choice options

  • "none": Disables the use of tools.

  • "auto": Enables the model to decide whether to use tools and which ones to use.

  • "required": Forces the model to use a tool, but the model chooses which one.

  • Named tool choice: Forces the model to use a specific tool. It must be in the following format:


    { "type": "function", "function": { "name": "name of the tool goes here" } }

Note: tool_choice can only be set when tools is also set, and vice versa. These parameters work together to define and control the use of tools in the model’s responses. For further information on these parameters and their usage, see the OpenAI API documentation.
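
Of the four tool_choice options, "none" is the only one not demonstrated in the numbered examples below. As a minimal sketch, assuming the same local NIM endpoint, placeholder API key, and model used in those examples, it suppresses tool calls even when tools are supplied:

from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="not-used")

# A simple tool definition; any valid tool object works here.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
}

chat_response = client.chat.completions.create(
    model="meta/llama-3.1-70b-instruct",
    messages=[{"role": "user", "content": "Is it hot in Pittsburgh, PA right now?"}],
    tools=[weather_tool],
    tool_choice="none",  # tools are listed, but the model must answer directly
    stream=False
)
print(chat_response.choices[0].message.content)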

These examples showcase various ways to use function calling with NIM:

  1. Basic Function Calling: Demonstrates how to use a single function with automatic tool choice.

  2. Multiple Tools: Shows how to provide multiple tools, including one without parameters.

  3. Named Tool Usage: Illustrates how to force the model to use a specific tool.

  4. Parallel Tool Calling: Exemplifies how to use parallel tool calling with a supporting model.

1. Basic Function Calling

This example shows how to use a single function with automatic tool choice.


from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="not-used")

MODEL_NAME = "meta/llama-3.1-70b-instruct"

# Define available function
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the user's location."
                }
            },
            "required": ["location", "format"]
        }
    }
}

messages = [
    {"role": "user", "content": "Is it hot in Pittsburgh, PA right now?"}
]

chat_response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=[weather_tool],
    tool_choice="auto",
    stream=False
)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)
# Example output:
# ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_abc123', function=Function(arguments='{"location": "Pittsburgh, PA", "format": "fahrenheit"}', name='get_current_weather'), type='function')])

# Simulate external function call
tool_call_result = 88
tool_call_id = assistant_message.tool_calls[0].id
tool_function_name = assistant_message.tool_calls[0].function.name
messages.append({
    "role": "tool",
    "content": str(tool_call_result),
    "tool_call_id": tool_call_id,
    "name": tool_function_name
})

chat_response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=[weather_tool],
    tool_choice="auto",
    stream=False
)
assistant_message = chat_response.choices[0].message
print(assistant_message)
# Example output:
# ChatCompletionMessage(content='Based on the current temperature of 88°F (31°C) in Pittsburgh, PA, it is indeed quite hot right now. This temperature is generally considered warm to hot, especially if accompanied by high humidity, which is common in Pittsburgh during summer months.', role='assistant', function_call=None, tool_calls=None)
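
The example above simulates the external call with a hard-coded value. In practice you would dispatch each tool call to a local implementation and feed the result back. A minimal sketch, reusing assistant_message and messages from the first request above; the get_current_weather implementation here is hypothetical:

import json

# Hypothetical local implementation; a real application would query a
# weather service here.
def get_current_weather(location: str, format: str) -> str:
    return "88" if format == "fahrenheit" else "31"

# Map tool names advertised to the model onto local callables.
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

for tool_call in assistant_message.tool_calls or []:
    function = AVAILABLE_FUNCTIONS[tool_call.function.name]
    # Arguments arrive as a JSON-encoded string; tools without parameters
    # may send an empty string, so fall back to "{}".
    arguments = json.loads(tool_call.function.arguments or "{}")
    result = function(**arguments)
    messages.append({
        "role": "tool",
        "content": str(result),
        "tool_call_id": tool_call.id,
        "name": tool_call.function.name
    })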

2. Multiple Tools

You can also define more than one tool in the tools list, including tools with no parameters, like the time_tool below.


weather_tool = {
    # ... (same as in the previous example)
}

time_tool = {
    "type": "function",
    "function": {
        "name": "get_current_time_nyc",
        "description": "Get the current time in NYC.",
        "parameters": {}
    }
}

messages = [
    {"role": "user", "content": "What's the current time in New York?"}
]

chat_response = client.chat.completions.create(
    model="meta/llama-3.1-70b-instruct",
    messages=messages,
    tools=[weather_tool, time_tool],
    tool_choice="auto",
    stream=False
)
assistant_message = chat_response.choices[0].message
print(assistant_message)
# Example output:
# ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[
#     ChatCompletionMessageToolCall(id='call_ghi789', function=Function(arguments='{}', name='get_current_time_nyc'), type='function')
# ])

# Process tool calls and generate final response as in the previous example

3. Named Tool Usage

This example forces the model to use a specific tool.


chat_response = client.chat.completions.create(
    model="meta/llama-3.1-70b-instruct",
    messages=[{"role": "user", "content": "What's the weather in New York City like?"}],
    tools=[weather_tool],
    tool_choice={
        "type": "function",
        "function": {"name": "get_current_weather"}
    },
    stream=False
)
assistant_message = chat_response.choices[0].message
print(assistant_message)
# Example output:
# ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_jkl012', function=Function(arguments='{"location": "New York, NY", "format": "fahrenheit"}', name='get_current_weather'), type='function')])

# Process tool call and generate final response as in the previous examples

4. Parallel Tool Calling

Some models are able to respond with multiple tool calls in one message. This example demonstrates parallel tool calling using a model that supports it.


messages = [
    {"role": "user", "content": "What's the weather and time in New York?"}
]

chat_response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=messages,
    tools=[weather_tool, time_tool],
    tool_choice="auto",
    parallel_tool_calls=True,
    stream=False
)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)
# Example output:
# ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[
#     ChatCompletionMessageToolCall(id='call_mno345', function=Function(arguments='{"location": "New York, NY", "format": "fahrenheit"}', name='get_current_weather'), type='function'),
#     ChatCompletionMessageToolCall(id='call_pqr678', function=Function(arguments='{}', name='get_current_time_nyc'), type='function')
# ])

# Process multiple tool calls in parallel
tool_results = []
for tool_call in assistant_message.tool_calls:
    if tool_call.function.name == "get_current_weather":
        # Simulate weather API call
        weather_result = "75°F"
        tool_results.append({
            "role": "tool",
            "content": weather_result,
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name
        })
    elif tool_call.function.name == "get_current_time_nyc":
        # Simulate time API call
        time_result = "2:30 PM EDT"
        tool_results.append({
            "role": "tool",
            "content": time_result,
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name
        })

# Add tool results to messages
messages.extend(tool_results)

# Generate final response based on all tool call results
# Note that not all models support parallel tool calls
chat_response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=messages,
    tools=[weather_tool, time_tool],
    tool_choice="auto",
    stream=False
)
final_response = chat_response.choices[0].message
print(final_response)
# Example output:
# ChatCompletionMessage(content="In New York, the current weather is 75°F (23.9°C), which is quite pleasant. It's not too hot or cold. The current time in New York is 2:30 PM EDT (Eastern Daylight Time). It's mid-afternoon there right now.", role='assistant', function_call=None, tool_calls=None)
