Function Calling: Receiving structured data from LLMs
In the previous Function Calling example, Converting Natural Language into Structured Function Parameters, we used the power of LLMs to parse natural-language user input into structured data for use by our own API.
However, as LLMs have become more and more powerful, they can now be used as the backend processing engine itself. Imagine you have an app that lists movies, but instead of providing the full plots, which can be long and hard to read on mobile, you want to provide the key points of each plot. If you send a basic completion request asking for the key points of the movie The Matrix, for example, you will get text similar to this:
"1. Introduction to the Matrix and Key Characters: The plot begins with Trinity exhibiting superhuman abilities and evading capture by the police and Agents, setting the stage for the introduction of the Matrix. Thomas Anderson, also known as Neo, is introduced as a computer programmer and hacker who is searching for answers about the Matrix. Key characters include Trinity, Neo, Morpheus, and Agent Smith. 2. Neo's Awakening: Neo is led to Morpheus, who offers him a choice between two pills. Choosing the red pill leads Neo to awaken in a real-world dystopia where humans are harvested by machines. Neo learns that the Matrix is a simulated reality created by intelligent machines to pacify and control humans while their bodies are used as an energy source. 3. Training and Prophecy: Neo undergoes rigorous virtual training aboard Morpheus's ship, the Nebuchadnezzar, learning to manipulate the Matrix. Morpheus believes Neo is "the One," prophesied to liberate humanity. The Oracle, a prophet within the Matrix, subtly suggests otherwise but indicates Neo will have to make a crucial choice involving Morpheus's life. 4. Betrayal and Capture: Crew member Cypher betrays Morpheus's team, leading to a confrontation with the Agents. Morpheus is captured and interrogated for access codes to Zion's mainframe. Cypher's betrayal results in the deaths of several crew members, but Tank saves Neo and Trinity, allowing them to plan a rescue mission for Morpheus. 5. Neo's Transformation and Liberation: Neo and Trinity rescue Morpheus, with Neo demonstrating growing confidence and control over the Matrix, performing feats on par with the Agents. Neo is seemingly killed but is revived by Trinity's love, realizing his full potential as "the One". He defeats Agent Smith, escapes the Matrix, and vows to expose its reality to other humans, symbolized by his concluding act of flying away, representing his newfound freedom and power."
This is really great, but it is all one text block. As Apple developers, we will want to style our UI so that it presents each key point more attractively, perhaps with a screen that shows only the title of each key point and a detail view when the user taps on one. We may also want to include user interactions. Maybe the user can star their favorite key point or rearrange them. Maybe they can even go deeper into each key point and discuss it with the AI or with other users.
Sure, you can parse the text and pull out the individual components, but the LLM does not guarantee that the text will come back in exactly the same format every time. Using a Few-Shot Prompting technique to pin down the exact return format will improve your chances, but sending those examples along with the entire plot summary consumes a lot of tokens and is not ideal.
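To make the fragility concrete, here is a minimal sketch of a naive text parser. The ParsedKeyPoint type and the parseKeyPoints helper are hypothetical names invented for this illustration, and the parser assumes every key point looks exactly like "N. Title: detail".

```swift
import Foundation

// Hypothetical display model, invented for this illustration.
struct ParsedKeyPoint: Equatable {
    let title: String
    let detail: String
}

// Naive parser that assumes each key point looks like "N. Title: detail".
// Any deviation from that layout causes key points to be lost.
func parseKeyPoints(from text: String) -> [ParsedKeyPoint] {
    var points: [ParsedKeyPoint] = []
    var number = 1
    // Everything before the "1. " marker is ignored.
    guard let firstMarker = text.range(of: "1. ") else { return [] }
    var remainder = text[firstMarker.upperBound...]

    while !remainder.isEmpty {
        number += 1
        // The current item ends where the next marker (" 2. ", " 3. ", ...) begins.
        let nextMarker = remainder.range(of: " \(number). ")
        let item = remainder[..<(nextMarker?.lowerBound ?? remainder.endIndex)]
        remainder = remainder[(nextMarker?.upperBound ?? remainder.endIndex)...]

        // The title is whatever precedes the first colon; items without one are dropped.
        guard let colon = item.firstIndex(of: ":") else { continue }
        let title = item[..<colon].trimmingCharacters(in: .whitespaces)
        let detail = item[item.index(after: colon)...].trimmingCharacters(in: .whitespaces)
        points.append(ParsedKeyPoint(title: title, detail: detail))
    }
    return points
}

let consistent = "1. Intro: Trinity escapes. 2. Awakening: Neo takes the red pill."
let reformatted = "1) Intro - Trinity escapes. 2) Awakening - Neo takes the red pill."
print(parseKeyPoints(from: consistent).map(\.title))  // ["Intro", "Awakening"]
print(parseKeyPoints(from: reformatted).count)        // 0
```

The second input carries the same information, but a small change in numbering and separators is enough for the parser to return nothing.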
Instead, you can use Function Calling to get the same information as a JSON object. In this case, there is no actual local function to be called. The LLM response is the end result that you will display to the user. But since the LLM expects a function to be there, and we have to describe a function to the LLM as part of the function calling process, we will make up a function that adds a key point to our database. Maybe something like:
// Note: keep the function naming closer to what would be used in a web-based API or SQL databases to match the LLM training data.
func addMovieKeyPointToDB(_ keyPoint: String, description: String) {
    // This function does not exist in our app, but we pretend that it does for the purpose of using function calling to get a JSON response of the function parameters.
}
Our app will be providing movie plot summaries from Wikipedia to the LLM for key point extraction. In other words, there will be no inconsistent data (such as the user input in the previous example), so the parameters of our addMovieKeyPointToDB function can be marked as required output for the LLM.
The object for the addMovieKeyPointToDB function parameters could be set as follows:
struct AddMovieKeyPointFunctionParameters: Codable, Hashable, Sendable {
    let keyPoint: String
    let description: String
}
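Because both properties are non-optional, a payload that omits either one cannot be decoded into this struct, which is one reason it makes sense to mark them as required. The sketch below shows this with plain Foundation decoding; the decodeKeyPoint helper and the sample JSON strings are made up for illustration and say nothing about how the library decodes internally.

```swift
import Foundation

// Same shape as the struct above, repeated so the snippet is self-contained.
struct AddMovieKeyPointFunctionParameters: Codable, Hashable, Sendable {
    let keyPoint: String
    let description: String
}

// Hypothetical helper: returns nil when the payload does not match the struct.
func decodeKeyPoint(_ json: String) -> AddMovieKeyPointFunctionParameters? {
    try? JSONDecoder().decode(AddMovieKeyPointFunctionParameters.self, from: Data(json.utf8))
}

// Made-up sample payloads.
let complete = #"{"keyPoint": "Red Pill", "description": "Neo chooses the red pill."}"#
let incomplete = #"{"keyPoint": "Red Pill"}"#

print(decodeKeyPoint(complete) != nil)    // true
print(decodeKeyPoint(incomplete) != nil)  // false: "description" is missing
```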
We will now use the AddMovieKeyPointFunctionParameters object to set our JSON schema. In this case, we expect an array of Movie Key Points:
import OpenAI
import CorePersistence
do {
    let movieKeyPointParameterSchema: JSONSchema = try JSONSchema(
        type: AddMovieKeyPointFunctionParameters.self,
        description: "Key Points of a Movie Plot",
        propertyDescriptions: [
            "keyPoint": "A short few-word overview of the key point of the movie plot",
            "description": "A paragraph with more specific details about the key point.",
        ],
        required: true
    )
    // The snippets that follow use movieKeyPointParameterSchema, so they belong inside this do block (see the full listing).
} catch {
    print(error)
}
Since we want an array of key points back from the LLM, we will now set the properties of the function call to an array of movieKeyPointParameterSchema objects:
let movieKeyPointProperties: [String: JSONSchema] =
    ["movie_key_points_parameters": .array(movieKeyPointParameterSchema)]
We will also set the decodable object with the same movie_key_points_parameters key (note that this is converted automatically to camel case for Swift):
struct MovieKeyPointsResult: Codable, Hashable, Sendable {
    let movieKeyPointsParameters: [AddMovieKeyPointFunctionParameters]
}
Now we are ready to specify the function call:
let addMovieKeyPointsFunction = AbstractLLM.ChatFunctionDefinition(
    name: "add_movie_key_points_to_db",
    context: "Adds key points of a movie plot to the database",
    parameters: JSONSchema(
        type: .object,
        description: "Movie Plot Key Points",
        properties: movieKeyPointProperties
    )
)
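For orientation, the sketch below builds by hand the kind of JSON that a function definition like this roughly corresponds to in OpenAI-style function calling: a name, a description, and a JSON Schema for the parameters. It uses only Foundation. The makeFunctionDefinitionJSON helper is hypothetical, and the exact payload the library generates is an assumption that may differ in detail.

```swift
import Foundation

// Hypothetical helper that hand-builds an approximate function definition payload.
func makeFunctionDefinitionJSON() -> String? {
    // JSON Schema for one key point; both properties are required.
    let keyPointProperties: [String: Any] = [
        "keyPoint": [
            "type": "string",
            "description": "A short few-word overview of the key point of the movie plot"
        ],
        "description": [
            "type": "string",
            "description": "A paragraph with more specific details about the key point."
        ]
    ]
    let keyPointSchema: [String: Any] = [
        "type": "object",
        "description": "Key Points of a Movie Plot",
        "properties": keyPointProperties,
        "required": ["keyPoint", "description"]
    ]
    // The function takes one parameter: an array of key point objects.
    let arraySchema: [String: Any] = ["type": "array", "items": keyPointSchema]
    let parameters: [String: Any] = [
        "type": "object",
        "description": "Movie Plot Key Points",
        "properties": ["movie_key_points_parameters": arraySchema]
    ]
    let functionDefinition: [String: Any] = [
        "name": "add_movie_key_points_to_db",
        "description": "Adds key points of a movie plot to the database",
        "parameters": parameters
    ]
    guard let data = try? JSONSerialization.data(
        withJSONObject: functionDefinition,
        options: [.prettyPrinted, .sortedKeys]
    ) else { return nil }
    return String(decoding: data, as: UTF8.self)
}

print(makeFunctionDefinitionJSON() ?? "serialization failed")
```

Seeing the schema spelled out makes it clearer why the decodable result type needs a single top-level property holding an array.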
Our messages will include the system prompt giving the LLM the job of extracting up to 5 key points from the given movie plot, and a movie plot description:
let messages: [AbstractLLM.ChatMessage] = [
    .system {
        "You will be provided with a detailed movie plot. Your task is to identify and extract up to 5 key points from the plot. For each key point, you need to provide two pieces of information: 1) Key Point: A concise and descriptive phrase (a few words) summarizing the key point. 2) Description: A paragraph explaining the significance and context of the key point within the plot. Ensure that each key point captures a critical element of the movie, such as important events, character developments, or major twists. The descriptions should provide enough detail to understand why each key point is significant to the overall story."
    },
    .user {
        // example: Matrix plot summary from Wikipedia https://en.wikipedia.org/wiki/The_Matrix
        moviePlotText
    }
]
Finally, the function can be called:
let client = OpenAI.Client(apiKey: "YOUR_API_KEY")
let functionCall: AbstractLLM.ChatFunctionCall = try await client.complete(
    messages,
    functions: [addMovieKeyPointsFunction],
    as: .functionCall
)
let result = try functionCall.decode(MovieKeyPointsResult.self)
print(result.movieKeyPointsParameters)
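To illustrate the decoding step in isolation, here is a minimal sketch using Foundation's JSONDecoder with the convertFromSnakeCase key strategy, which maps movie_key_points_parameters to movieKeyPointsParameters. The arguments string is a hand-written stand-in for what the model might return, decodeKeyPoints is a hypothetical helper, and whether the library uses this exact strategy internally is an assumption.

```swift
import Foundation

// Same types as in the walkthrough, repeated so the snippet is self-contained.
struct AddMovieKeyPointFunctionParameters: Codable, Hashable, Sendable {
    let keyPoint: String
    let description: String
}

struct MovieKeyPointsResult: Codable, Hashable, Sendable {
    let movieKeyPointsParameters: [AddMovieKeyPointFunctionParameters]
}

// Hypothetical helper: decodes function call arguments from a JSON string.
func decodeKeyPoints(from arguments: String) throws -> MovieKeyPointsResult {
    let decoder = JSONDecoder()
    // Maps "movie_key_points_parameters" to "movieKeyPointsParameters".
    // Keys without underscores, such as "keyPoint", are left unchanged.
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    return try decoder.decode(MovieKeyPointsResult.self, from: Data(arguments.utf8))
}

// Hand-written stand-in for the arguments of a function call.
let arguments = """
{
  "movie_key_points_parameters": [
    { "keyPoint": "Red Pill", "description": "Neo chooses the red pill." },
    { "keyPoint": "Rescue", "description": "Neo and Trinity rescue Morpheus." }
  ]
}
"""

do {
    let decoded = try decodeKeyPoints(from: arguments)
    print(decoded.movieKeyPointsParameters.count)  // 2
} catch {
    print("Decoding failed: \(error)")
}
```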
The Full Function Call
The final code will be as follows:
import OpenAI
import CorePersistence
let client = OpenAI.Client(apiKey: "YOUR_API_KEY")
let messages: [AbstractLLM.ChatMessage] = [
    .system {
        "You will be provided with a detailed movie plot. Your task is to identify and extract up to 5 key points from the plot. For each key point, you need to provide two pieces of information: 1) Key Point: A concise and descriptive phrase (a few words) summarizing the key point. 2) Description: A paragraph explaining the significance and context of the key point within the plot. Ensure that each key point captures a critical element of the movie, such as important events, character developments, or major twists. The descriptions should provide enough detail to understand why each key point is significant to the overall story."
    },
    .user {
        // example: Matrix plot summary from Wikipedia https://en.wikipedia.org/wiki/The_Matrix
        moviePlotText
    }
]
struct AddMovieKeyPointFunctionParameters: Codable, Hashable, Sendable {
    let keyPoint: String
    let description: String
}
struct MovieKeyPointsResult: Codable, Hashable, Sendable {
    let movieKeyPointsParameters: [AddMovieKeyPointFunctionParameters]
}
do {
    let movieKeyPointParameterSchema: JSONSchema = try JSONSchema(
        type: AddMovieKeyPointFunctionParameters.self,
        description: "Key Points of a Movie Plot",
        propertyDescriptions: [
            "keyPoint": "A short few-word overview of the key point of the movie plot",
            "description": "A paragraph with more specific details about the key point.",
        ],
        required: true
    )
    let movieKeyPointProperties: [String: JSONSchema] =
        ["movie_key_points_parameters": .array(movieKeyPointParameterSchema)]
    let addMovieKeyPointsFunction = AbstractLLM.ChatFunctionDefinition(
        name: "add_movie_key_points_to_db",
        context: "Adds key points of a movie plot to the database",
        parameters: JSONSchema(
            type: .object,
            description: "Movie Plot Key Points",
            properties: movieKeyPointProperties
        )
    )
    let functionCall: AbstractLLM.ChatFunctionCall = try await client.complete(
        messages,
        functions: [addMovieKeyPointsFunction],
        as: .functionCall
    )
    let result = try functionCall.decode(MovieKeyPointsResult.self)
    print(result.movieKeyPointsParameters)
} catch {
    print(error)
}
The result will look something like this:
[AddMovieKeyPointFunctionParameters(
keyPoint: "Trinity\'s Escape and Police Encounter",
description: "At the beginning of the movie, Trinity, a key member of the resistance, is cornered in an abandoned hotel by a police squad. She displays superhuman abilities to overpower them and flees, pursued by the police and sentient programs known as Agents. Her escape and the subsequent chase introduce the concept of the Matrix and the enhanced abilities its inhabitants can have."),
AddMovieKeyPointFunctionParameters(
keyPoint: "Neo\'s Choice and Awakening",
description: "Thomas Anderson, or Neo, a computer programmer intrigued by the Matrix, is offered a choice by Morpheus between a red pill, which will reveal the truth about the Matrix, and a blue pill, which will allow him to forget everything and return to his life. Choosing the red pill, Neo\'s reality distorts and he awakens in a liquid-filled pod, discovering the true nature of his existence and the world dominated by machines, setting the stage for his transformation and the central conflict of the story."),
AddMovieKeyPointFunctionParameters(
keyPoint: "Morpheus\' Capture and Betrayal by Cypher",
description: "During a mission to meet the Oracle, Morpheus is captured by the Agents after a crew member named Cypher betrays him, hoping to return to the illusionary comfort of the Matrix. Morpheus\' capture leads to heightened stakes as the survival of the rebellion\'s leader and their cause hangs in balance, prompting Neo and Trinity to plan a daring rescue."),
AddMovieKeyPointFunctionParameters(
keyPoint: "Neo\'s Resurrection and Newfound Powers",
description: "After seemingly being killed by Agent Smith, Neo is revived following Trinity\'s confession of love, reflecting the prophecy that she would fall in love with \'the One.\' Neo\'s resurrection and the discovery of his ability to see and manipulate the Matrix\'s code mark his full transformation into \'the One,\' equipped to challenge the machines\' dominance."),
AddMovieKeyPointFunctionParameters(
keyPoint: "Neo\'s Promise and Emergence as The One",
description: "Ending the movie, Neo, having defeated the Agents and with his new abilities, promises to the machines to show the people trapped in the Matrix \'a world where anything is possible.\' His powerful exit, flying away, underscores his evolution into a nearly omnipotent force within the Matrix and sets the stage for further confrontations with the machine overlords.")]
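Once the key points arrive as typed values, the interactions mentioned earlier (a title list, starring, rearranging) become simple model operations. The following is a minimal sketch of such a model; KeyPointItem and KeyPointList are hypothetical names invented for this illustration and are not part of any library.

```swift
import Foundation

// Hypothetical UI-facing model; the names here are illustrative only.
struct KeyPointItem: Identifiable, Equatable {
    let id = UUID()
    let title: String
    let detail: String
    var isStarred = false
}

struct KeyPointList {
    private(set) var items: [KeyPointItem]

    init(_ pairs: [(title: String, detail: String)]) {
        items = pairs.map { KeyPointItem(title: $0.title, detail: $0.detail) }
    }

    // Titles only, for example to drive a list screen.
    var titles: [String] { items.map(\.title) }

    // Flips the starred state of the item at the given index, if it exists.
    mutating func toggleStar(at index: Int) {
        guard items.indices.contains(index) else { return }
        items[index].isStarred.toggle()
    }

    // Moves an item so that it ends up at the destination index.
    mutating func move(from source: Int, to destination: Int) {
        guard items.indices.contains(source), items.indices.contains(destination) else { return }
        let item = items.remove(at: source)
        items.insert(item, at: destination)
    }
}

var list = KeyPointList([
    (title: "Escape", detail: "Trinity evades the police and Agents."),
    (title: "Awakening", detail: "Neo chooses the red pill."),
    (title: "Betrayal", detail: "Cypher betrays the crew.")
])
list.toggleStar(at: 1)
list.move(from: 2, to: 0)
print(list.titles)                                    // ["Betrayal", "Escape", "Awakening"]
print(list.items.filter(\.isStarred).map(\.title))    // ["Awakening"]
```

In the app, the list could be built from the decoded result with something like result.movieKeyPointsParameters.map { (title: $0.keyPoint, detail: $0.description) }.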
Using function calling to receive a structured JSON response from an LLM as an end in itself ensures a consistent data format and makes the data easier to handle. Because each key point arrives as a typed value rather than as part of a text block, you have more flexibility in customizing the user interface: you can tailor how the data is displayed to suit your app's design and functionality, and provide a more engaging and seamless experience for your users.