Conversation is one of the highest forms of intelligence. Once characters in virtual worlds can hold intelligent conversations, we naturally expect them to show other intelligent behaviors, such as showing the right body language or taking actions aligned with what they are saying and what’s happening around them. Characters become more lifelike when their conversations have meaningful consequences that involve responsive actions and behaviors.
To truly come to life, a character must understand its environment, communicate, and act. Multimodal models have opened the doors to perception, making it possible for these characters to interpret and interact with their world in more nuanced ways. However, for an experience in a 3D world to feel genuinely realistic, these characters must also exhibit dynamic actions.
In this blog post we explore the mechanics behind NPC actions, the impact of modern AI technologies in making NPCs more dynamic, and the process of configuring character actions using the Convai Actions feature within a game engine.
Understanding Actions in Non-Player Characters (NPCs)
Actions allow characters to execute commands or requests from you (the user) or other characters in the virtual world if they align with the character's abilities. In a game, you could ask an NPC to bring you a specific weapon from the storage area, ask the in-game merchant to show you various objects from the shelf, or request a teammate to guard an area. It’s all about helping a character understand its environment and carry out complex actions.
Gamers have experienced Non-Player Characters (NPCs) that can carry out various animations and sequences that seem logical to their behavior. Let’s understand the traditional approaches to achieving NPC actions.
Approaches to Achieving NPC Actions
Traditionally, you can use various methods to bring NPC actions to life within a game. Each path has unique advantages and caters to different aspects of NPC behavior, complexity, and game design requirements. Let’s look at five:
- Scripted Behaviors: Game developers program NPCs with scripted behaviors to deliver pre-set responses. These scripts define a range of responses and actions based on specific triggers or conditions within the game.
- Instance: If a player approaches an NPC, the NPC should deliver a pre-set line of dialogue.
- Decision Trees: This approach gives NPCs a rudimentary decision-making capability, making their behavior more dynamic than simple scripted responses.
- Instance: An NPC guard in a stealth game uses a decision tree to react to noises, investigating loud sounds and merely pausing for soft ones, which demonstrates dynamic behavior based on game events.
- Finite State Machines (FSMs): This approach manages an NPC's state (such as idle, alert, and attack) and the transitions between these states based on game events. With this method, you have a structured way to handle complex behaviors, which makes NPC actions more predictable and easier to debug.
- Instance: In a stealth game, an NPC guard transitions between patrolling, investigating noises, alerting on player detection, and returning to patrol, which demonstrates adaptive behavior.
- Goal-Oriented Action Planning (GOAP): NPCs can achieve goals by planning a series of actions. An NPC using GOAP evaluates its environment, sets a goal (e.g., patrol an area, investigate a noise), and calculates a sequence of actions to achieve this goal. This approach allows for more flexible and intelligent NPC behavior.
- Behavior Trees: This approach is a more advanced and flexible alternative to decision trees and FSMs. It structures NPC decision-making into a hierarchy of tasks, which can be sequences, selections, or more complex behaviors. This system is widely used for modeling complex AI behaviors in games, as it allows for easy adjustments and scaling.
For these traditional approaches, the limitation has always been that NPC behaviors are hardcoded and can only adapt to a few handcrafted scenarios. Programming such behaviors is time-intensive, provides only a certain level of immersion, and reveals its hardcoded nature upon deeper inspection by the gamer.
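To make that limitation concrete, here is a minimal sketch of the stealth-game guard from the FSM example above. The state and event names are illustrative assumptions, not taken from any particular engine. Notice that every reaction has to be written into the transition table by hand, and any event the designer did not anticipate is simply ignored.

```python
from enum import Enum, auto


class GuardState(Enum):
    PATROL = auto()
    INVESTIGATE = auto()
    ALERT = auto()


# Hand-written transition table: (current state, event) -> next state.
# Event names are hypothetical and exist only for this illustration.
TRANSITIONS = {
    (GuardState.PATROL, "noise_heard"): GuardState.INVESTIGATE,
    (GuardState.PATROL, "player_seen"): GuardState.ALERT,
    (GuardState.INVESTIGATE, "player_seen"): GuardState.ALERT,
    (GuardState.INVESTIGATE, "nothing_found"): GuardState.PATROL,
    (GuardState.ALERT, "player_lost"): GuardState.PATROL,
}


def next_state(state: GuardState, event: str) -> GuardState:
    """Return the next state, or stay in the current one for unknown events."""
    return TRANSITIONS.get((state, event), state)


def run(events: list[str], start: GuardState = GuardState.PATROL) -> GuardState:
    """Feed a sequence of game events through the machine."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

A request such as "fetch me a jetpack" has no entry in the table, so `next_state(GuardState.PATROL, "player_asks_for_jetpack")` leaves the guard patrolling. Supporting it would mean adding new states and transitions by hand.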
Moving from Traditional NPC Actions to LLM-Driven NPCs
You probably saw it at CES 2024: Video games have evolved into complex, immersive worlds. There is now a demand for NPCs that offer depth, realism, and unpredictability, something that traditional approaches do not offer. This demand has shifted towards AI-driven NPCs capable of exhibiting behaviors that adapt and change in response to player actions and game dynamics.
At the core of this evolution is the concept of 'Actions': the specific behaviors or tasks that NPCs can perform within the game environment. In a game, for instance, users can issue commands like "fetch me a jetpack," "aid me in battle," or "show me a dance move." These commands showcase the NPCs' abilities to understand and react, leading to an improved gaming experience.
The Importance of LLM-Driven Actions in AI NPCs
In traditional game design, actions were limited to a finite set of hardcoded options. For instance, a shopkeeper NPC in a role-playing game might only greet the player, sell items, and provide specific information when prompted. Fast forward to today, and AI-driven NPCs can perform many actions beyond these static interactions.
These NPCs can remember past encounters with the player, express emotions based on the game's events, and even engage in complex decision-making processes that lead to unforeseen outcomes.
To facilitate a clearer discussion regarding actions throughout this guide, we introduce the following terms:
- Atomic Actions: These are fundamental actions executed in their entirety. On the implementation front, game engines like Unreal or Unity are equipped to manage these atomic actions. Examples include Move, Dance, PickUp, and Drop.
- Complex Actions: These actions materialize by combining multiple atomic actions. For instance, a command such as "fetch me a jetpack" might be executed through a series of atomic actions: [Move to Jetpack, PickUp Jetpack, Move to User, Drop Jetpack].
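One way to picture the two terms is as a small data structure: an atomic action is a name plus an optional target, and a complex action is just an ordered list of them. The `AtomicAction` class below is a hypothetical illustration for this post, not a Convai or game-engine type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AtomicAction:
    """Hypothetical representation of one atomic action."""
    name: str                      # e.g. "Move", "PickUp", "Drop", "Dance"
    target: Optional[str] = None   # object or character the action applies to

    def __str__(self) -> str:
        if self.target is None:
            return self.name
        joiner = " to " if self.name == "Move" else " "
        return f"{self.name}{joiner}{self.target}"


# A complex action is an ordered sequence of atomic actions.
fetch_jetpack = [
    AtomicAction("Move", "Jetpack"),
    AtomicAction("PickUp", "Jetpack"),
    AtomicAction("Move", "User"),
    AtomicAction("Drop", "Jetpack"),
]

print([str(action) for action in fetch_jetpack])
# ['Move to Jetpack', 'PickUp Jetpack', 'Move to User', 'Drop Jetpack']
```

Keeping the atomic set small means the game engine only needs to implement a handful of behaviors, while the planner is free to combine them in new ways.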
In the next section, you will see how we think about AI NPC actions at Convai and why our new Actions feature is what you need to create dynamic characters for your virtual worlds.
How We Think About Actions Within Convai
At Convai, we are excited to launch our Actions feature, which integrates LLMs into NPCs' actions in a virtual world or game engine. We incorporated advances in robotics, task planning methodologies, and different LLM function-calling approaches to discover the best solution for complex actions in 3D worlds. The Convai Actions feature uses the rich metadata from the game engine to enable perception and actions for the NPC.
Perception is the NPC understanding the scene from either camera inputs or metadata and properly planning the correct set of tasks required for a complex action.
When an NPC encounters a scenario requiring action, the feature generates a series of potential actions relevant to the situation. These candidates are then filtered on three criteria: whether the character can feasibly execute the action, the character's state of mind ('thoughts,' 'emotions,' and 'intentions'), and the narrative framework, which keeps the action consistent and relevant. The result is an NPC that executes a contextually appropriate (and potentially unpredictable) action, improving the game's immersion and dynamism.
Let’s understand how it works.
Actions Workflow With Convai
Configuring Actions within Convai, whether programmatically or via the website, initiates a comprehensive workflow designed for dynamic and authentic NPC engagement.
Convai's action decision-making process begins when you submit an action request, which includes the desired action, contextual scene information, and, optionally, a predefined list of actions.
Here’s how Convai processes each request:
- Action and Scene Understanding: Upon receiving a request from the user, Convai uses scene information and other preconfigured metadata to build an understanding of the scene and the list of possible actions. This allows for a dynamic response that reflects the character's immediate situation and capabilities, going beyond static database information.
- Assessing Physical Feasibility: Convai evaluates if the requested action aligns with the character's physical capabilities, considering factors such as physics and the character's inherent abilities.
- For example, if a character is not programmed for actions like 'Climb,' these are immediately flagged as infeasible. This step also plans the sequence of steps needed to perform the action, ensuring that each proposed action is executable and logically ordered.
- Mind State and Personality Assessment: Beyond physical capabilities, the character's psychological profile, including traits like agreeableness and adherence to ethical guidelines, plays a crucial role in determining its willingness to act.
- For instance, if an NPC is programmed to adhere to Asimov's Laws of Robotics, it should automatically decline commands that involve harming another player or user. Another character busy with a high-priority task might decline a new action request, showcasing the system's ability to prioritize based on the character's current focus and personality traits.
- Feasibility and Execution: Upon evaluating the feasibility of the requested action, the server will proceed as follows:
- Feasible Actions: For actions deemed physically and psychologically viable, Convai outlines the necessary steps for execution.
- Infeasible Actions: If an action is not possible, Convai returns a null response for the action steps.
In both cases, Convai communicates the decision through a verbal response, ensuring you are informed about the action's outcome and the rationale behind it.
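The decision flow above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Convai's implementation: the `Character` and `ActionResponse` types, the `decide` function, and the canned replies are all hypothetical, and the candidate plan is passed in by hand, whereas in the real feature it would come from the planning step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Character:
    """Hypothetical character profile used only for this sketch."""
    abilities: set[str]           # atomic actions the character can perform
    refuses_harm: bool = False    # stand-in for an ethical guideline
    busy: bool = False            # stand-in for a high-priority task


@dataclass
class ActionResponse:
    steps: Optional[list[str]]    # None when the action is infeasible
    verbal: str                   # a spoken reply is returned in both cases


def decide(character: Character, plan: list[str], harmful: bool = False) -> ActionResponse:
    """Check a candidate plan (e.g. ["Move Jetpack", "PickUp Jetpack"])."""
    # 1. Physical feasibility: every step must use a known ability.
    verbs = [step.split()[0] for step in plan]
    missing = [verb for verb in verbs if verb not in character.abilities]
    if missing:
        return ActionResponse(None, f"I can't do that. I don't know how to {missing[0]}.")
    # 2. Mind state and personality: the character may be unwilling.
    if harmful and character.refuses_harm:
        return ActionResponse(None, "I won't do that. It would harm someone.")
    if character.busy:
        return ActionResponse(None, "I'm busy with something more important right now.")
    # 3. Feasible: return the ordered steps along with a verbal reply.
    return ActionResponse(plan, "On it.")
```

For example, a character whose abilities are `{"Move", "PickUp", "Drop"}` gets its jetpack plan back unchanged, while a plan containing `"Climb Wall"` comes back with `steps` set to `None` and an explanation in `verbal`.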
This structured approach allows for nuanced interactions with NPCs, enhancing the gaming experience by accounting for a range of factors that influence NPC behavior and decision-making.
Configuring Actions with Convai
Actions can be configured and activated through the Convai user interface (UI) or API. When you activate it for a character, Actions enables you to issue commands that prompt the character to perform specific tasks.
You can configure actions for a character in either of two ways:
Through the Convai User Interface (Website)
You can easily set up actions for each character through the UI on the Convai website. It provides a straightforward way to configure basic actions. Simply select a character and then navigate to the Actions tab.
Let’s see how to do it in 5 steps with your Convai Character:
- Step 1: Within your Character menu, find the Actions tab
- Step 2: Toggle the Enable Action Generation button on
- Step 3: Enter a few pre-determined actions you want the character to take
- Step 4: Click the Update button to refresh your character
- Step 5: Prompt it to execute the chosen action.
Those are for atomic actions. If you want to add complex action sequences to your Character, enter them in the Action/State box and click 'Update'.
Dynamic Action Configuration through the Game Engine
Alternatively, you can configure actions dynamically via the game engine for real-time interaction with the game environment.
Whenever you issue a request that entails an action, and that action is feasible, Convai's NPC engine produces a sequence of atomic actions. The game engine can then execute these actions to fulfill the desired task. This approach improves the NPC's responsiveness and offers the following:
- Augmented Action List: You can augment the action list with characters and objects from the scene or game, which lets characters interact with various elements of the game world. This feature is currently exclusive to game engine configurations and unavailable through the website interface.
- Scene Perception-Based Actions: This method takes into account scene context, interactive objects, other characters, and the player's focus, which markedly improves NPC responsiveness to context-sensitive commands such as "pick that up." Grounding actions in the NPC's perception of the scene makes for more immersive gameplay.
- Flexibility and Integration: Configure NPC actions directly within your preferred game engine. For detailed steps on implementing dynamic configurations, refer to our dedicated documentation for Unity and Unreal Engine setups.
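On the game-engine side, executing a returned sequence comes down to dispatching each atomic action to a handler. The sketch below is engine-agnostic Python with hypothetical handlers that only record what they would do. In a real project these would call into Unity or Unreal. It also resolves "that" from the player's focus locally, purely for illustration. This post does not specify where that resolution actually happens.

```python
from typing import Callable, Optional


# Hypothetical handlers: each one just logs the work it would hand to the engine.
def move(target: str, log: list[str]) -> None:
    log.append(f"moving to {target}")


def pick_up(target: str, log: list[str]) -> None:
    log.append(f"picking up {target}")


def drop(target: str, log: list[str]) -> None:
    log.append(f"dropping {target}")


HANDLERS: dict[str, Callable[[str, list[str]], None]] = {
    "Move": move,
    "PickUp": pick_up,
    "Drop": drop,
}


def resolve_target(target: str, player_focus: Optional[str]) -> str:
    """Map a reference such as 'that' to the object the player is focused on."""
    if target.lower() in {"that", "this", "it"} and player_focus:
        return player_focus
    return target


def execute(steps: list[tuple[str, str]], player_focus: Optional[str] = None) -> list[str]:
    """Run (action name, target) pairs in order and return a log of what ran."""
    log: list[str] = []
    for name, target in steps:
        handler = HANDLERS.get(name)
        if handler is None:
            raise ValueError(f"no handler registered for atomic action {name!r}")
        handler(resolve_target(target, player_focus), log)
    return log
```

With the player looking at a crate, `execute([("Move", "that"), ("PickUp", "that")], player_focus="Crate")` returns `["moving to Crate", "picking up Crate"]`. An action name without a registered handler raises an error instead of being silently skipped, which makes gaps in the character's abilities easy to spot during development.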
Key Takeaways: Integrating Dynamic NPC Actions for Game Development with Convai
Let’s recap the key points on understanding and implementing NPC Actions:
Large Language Models (LLMs) and Robotic Task Planning are Improving NPC Interactivity
Convai combines robotic task planning algorithms with different LLM function-calling approaches to find the best solution for complex actions in 3D worlds, which significantly enhances NPC behavior. NPCs are now more dynamic, responsive, and realistic. They can interpret player actions and environmental cues more effectively, leading to more immersive and engaging gameplay experiences.
Convai’s Dual Configuration Approach: Web UI and API
The Convai platform allows you to configure static and dynamic NPC actions, so you can create NPCs that follow predefined behaviors (from the web UI) and respond adaptively to the game environment and player interactions (through APIs and game engines). This dual approach ensures that NPCs exhibit consistent behaviors while adapting to context-specific player inputs and environmental cues for a richer gameplay experience.
See how Actions work in Convai
Check out our "AI NPCs Take Action from Conversation in Unreal Engine" tutorial. This guide on using AI NPCs to take actions from conversations in Unreal Engine illustrates how NPCs can actively interpret player dialogue and environmental cues to decide on the most appropriate actions—offering assistance, providing information, or guiding players to resources.
Following this tutorial, you'll learn how to combine NPC animations with Convai-configured actions, creating dynamic and immersive gameplay experiences. As demonstrated in our collaboration with NVIDIA, well-crafted actions are pivotal in engaging players and bringing the game world to life.
Integrating Convai’s dynamic action configuration opens up new dimensions of NPC interactivity, making your game worlds more alive and responsive to the player's journey.