
Building an AI-Powered SAP Ticket Triage Agent with Google GenKit and SAP-RPT-1

November 14, 2025
15 min read

The Problem: In most SAP organizations, helpdesk teams manually categorize incoming tickets—assigning the right SAP module, setting priority levels, and flagging escalations. It’s repetitive work that slows down response times.

The Solution: Replace this manual triage with an AI agent.

In this guide, I’ll show you how to build an agent using Google GenKit and SAP-RPT-1 that automatically analyzes ticket descriptions and predicts module, priority, and escalation status. We’ll explore why a simple LLM prompt isn’t enough—and how a tool-based architecture makes this production-ready.

Note: This is a PoC-level implementation. Since most AI + SAP examples focus on CAP integrations, I wanted to explore a different orchestration framework. Let’s see what GenKit brings to the table!

Section 1: Your Development Toolkit & Setup

Step 1. Initialize Your Node.js & TypeScript Project

# Create and enter the project directory
mkdir my-genkit-app
cd my-genkit-app

# Initialize a Node.js project
npm init -y

# Set up the source directory and main file
mkdir src
touch src/index.ts

# Install and configure TypeScript
npm install -D typescript tsx
npx tsc --init

Step 2. Install Genkit and Dependencies

# Install Genkit CLI globally
npm install -g genkit-cli

# Install local project dependencies
npm install genkit @genkit-ai/google-genai dotenv

Step 3. Configure Your API Keys

Create a .env file in the project root:

# .env
GEMINI_API_KEY=your-actual-gemini-api-key-here
SAP_RPT_API_KEY=your-actual-rpt-sandbox-api-key-here

Get Your API Keys: you'll need a Gemini API key (from Google AI Studio) and an SAP-RPT-1 key (from the RPT sandbox at https://rpt.cloud.sap, where you can sign in with SSO or email).
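
To fail fast when a key is missing, you can add a small startup check (a sketch; the variable names match the .env above):

import * as dotenv from 'dotenv';

dotenv.config();

// Fail fast if a required key is missing
for (const key of ['GEMINI_API_KEY', 'SAP_RPT_API_KEY']) {
  if (!process.env[key]) {
    throw new Error(`Missing environment variable: ${key}`);
  }
}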

Why GenKit?

GenKit is a lightweight, model-agnostic orchestration framework from Google. It's well suited to building AI-powered experiences, provides built-in observability, and works with any cloud provider.

A Quick Note on SAP-RPT-1

Before we dive in, I need to give props to whoever at SAP decided to create the RPT-1 sandbox. Seriously—this person deserves a promotion.

Historically, the biggest barrier to experimenting with SAP’s AI/ML tools has been access. Everything was locked behind enterprise licenses, complex provisioning, or buried in AI Core documentation. You’d spend more time getting credentials than actually building anything.

SAP-RPT-1 flips this. You can log in to the sandbox with SSO or email, get an API key in 1 minute, and start making predictions.

What is SAP-RPT-1?

RPT-1 (Relational Pre-Trained Transformer) is SAP’s first open-source AI model, specifically designed for tabular data prediction. Unlike general-purpose LLMs that are trained on text, RPT-1 understands structured data—think spreadsheets, database tables, CSV files.

For our use case (ticket classification), this is perfect. We have historical tickets with columns like SAP_MODULE, PRIORITY, DESCRIPTION, and ESCALATION. RPT-1 can look at patterns in this structured data and predict missing values with surprising accuracy.

The model works by:

  1. Taking your historical data as “context rows” (examples)
  2. Taking your new ticket with [PREDICT] placeholders
  3. Using transformer architecture to understand relationships between columns
  4. Returning predictions based on learned patterns

Think of it as a specialized AI that’s really good at one thing: understanding how values in one column relate to values in other columns.

Why Use RPT-1 Instead of Just Gemini?

You might wonder: “Why bother with RPT-1 when Gemini can read tables too?” Valid question. Here’s why:

  • Token Efficiency: RPT-1 is designed for this. You’re not burning tokens on JSON formatting and structural overhead.
  • Specialized Training: It’s trained on relational data patterns, not just general text.
  • Consistent Predictions: For tabular data, RPT-1 gives more consistent, pattern-based results than trying to prompt-engineer a general LLM.

In practice, we’ll see that Gemini + RPT-1 as a tool is roughly 36x more token-efficient than the Gemini-only approach, while also improving accuracy.

Section 2: The Journey to an Intelligent Agent (in 3 Attempts)

Quick Note on GenKit Flows

Before we start, a quick word on GenKit’s core concept: Flows.

A Flow is a structured AI action with defined input/output schemas. You specify what goes in and what should come out, and GenKit handles the rest. This gives you type-safety, built-in UI for testing, and observability out of the box.

Basic structure:

export const myFlow = ai.defineFlow(
  {
    name: 'FlowName',
    inputSchema: z.object({ ... }),
    outputSchema: z.object({ ... }),
  },
  async (input) => {
    // Your logic here
    return output;
  }
);

That’s all you need to know for now. Let’s build.

Step 4. Attempt 1: The “Naive” LLM-Only Flow

File: src/index.ts

// src/index.ts - Attempt 1: Naive LLM-only approach
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';
import { googleAI } from '@genkit-ai/google-genai';

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', { temperature: 0.8 }),
});

// Input schema
const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description'),
});

// Output schema
const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean(),
});

// Define the AI Flow
export const ticketPredictFlow = ai.defineFlow(
  {
    name: 'TicketPredictFlow',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert.
Based on ${input.description} predict output schema fields (SAP Module, Priority).
If we need to act quickly raise escalation flag.`;

    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);

async function main() {
  // Intentionally empty - run the flow from the GenKit Dev UI.
}
main().catch(console.error);

Run it:

genkit start -- npx tsx --watch src/index.ts

GenKit comes with a built-in developer UI where you can test flows, debug tool calls, and trace execution in real time. Navigate to http://localhost:4000 in your browser.

Test in Genkit UI:

{"description": "Can't login to the development system"}

Result:

  • Guessed sapModule: "BASIS"
  • Assigned priority: "Critical"

The LLM lacks business context—it doesn’t know that “can’t login to DEV” is typically Medium priority, not Critical.

Step 5. Attempt 2: The “Token-Hungry” LLM + History Flow

The next logical step was to provide historical context to the LLM. I generated a 1,000-ticket history.json file to include in the prompt.

But here’s something that surprised me: JSON is actually not optimal for LLM prompts.

I know, I know—JSON is everywhere, it’s the standard for APIs, we use it without thinking. But when you’re feeding data into an LLM context window, JSON is incredibly wasteful. All those quotes, brackets, commas, and structural overhead? Those are tokens. And tokens cost money.

Think about it: {"name": "value"} uses way more tokens than just name: value. Multiply that by 1,000 rows of ticket data, and you’re burning through your token budget on syntax instead of actual content.

Enter TOON (Token-Oriented Object Notation). It’s a format specifically designed for LLM prompts—strips away JSON’s bloat while keeping data readable. The savings? 30-60% fewer tokens for the same data.

For our use case where we’re feeding 1,000 historical tickets on every request, this actually matters.

First, install TOON for token optimization:

npm install @toon-format/toon

Create a sample src/history.json file:

[
  {
    "TICKET_ID": "TKT_001",
    "SAP_MODULE": "BASIS",
    "PRIORITY": "Medium",
    "DESCRIPTION": "Cannot login to development system",
    "ESCALATION": false
  },
  {
    "TICKET_ID": "TKT_002",
    "SAP_MODULE": "FI",
    "PRIORITY": "Critical",
    "DESCRIPTION": "Payment posting failed in production",
    "ESCALATION": true
  },
  {
    "TICKET_ID": "TKT_003",
    "SAP_MODULE": "MM",
    "PRIORITY": "Low",
    "DESCRIPTION": "Purchase order print layout issue",
    "ESCALATION": false
  }
]
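
For reference, encode() turns that array into a compact tabular layout, roughly like this (illustrative; the exact output can differ between TOON library versions):

[3]{TICKET_ID,SAP_MODULE,PRIORITY,DESCRIPTION,ESCALATION}:
  TKT_001,BASIS,Medium,Cannot login to development system,false
  TKT_002,FI,Critical,Payment posting failed in production,true
  TKT_003,MM,Low,Purchase order print layout issue,false

Same data, a fraction of the punctuation - that's where the token savings come from.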

File: src/index.ts (Updated)

// src/index.ts - Attempt 2: LLM + Historical Context (inefficient)
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';
import { googleAI } from '@genkit-ai/google-genai';
import { encode } from '@toon-format/toon';
import historicalDataJson from './history.json' with { type: 'json' };

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', { temperature: 0.8 }),
});

// Convert historical data to TOON format (saves 30-60% tokens)
const historicalDataToon = encode(historicalDataJson);

const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description'),
});

const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean(),
});

export const ticketPredictFlowWithHistory = ai.defineFlow(
  {
    name: 'TicketPredictFlowWithHistory',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert.

Historical ticket data (TOON format):
${historicalDataToon}

Based on the historical patterns above and this new ticket description: "${input.description}"
Predict the SAP Module, Priority, and whether escalation is needed.`;

    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);

async function main() {
  // Intentionally empty - run the flow from the GenKit Dev UI.
}
main().catch(console.error);

Result:

  • ⏱️ Time: 12.72 seconds
  • 🪙 Tokens: 22,993 input tokens

Even with TOON optimization, this approach proved to be highly inefficient. Feeding 1,000 tickets to the LLM on every request is fundamentally the wrong architecture—no amount of token optimization can fix that. It’s slow, expensive, and doesn’t scale.

Section 3: The “Right Way” — LLM as Orchestrator, SAP-RPT-1 as Tool

After the token disaster in Attempt 2, I had to rethink the architecture. The problem wasn’t the prompt—it was the fundamental approach.

The Key Insight: Separation of Concerns

Instead of making the LLM do everything (understanding context + making predictions), what if we split responsibilities?

  • Gemini (LLM): Acts as the orchestrator—understands the user’s request, decides what needs to happen, formats the final response
  • SAP-RPT-1: Acts as the specialist tool—handles the actual prediction based on historical patterns

This is analogous to how you’d structure a real team. You don’t ask a senior architect to also manually check every ticket priority. The architect decides “we need a priority prediction for this ticket,” then delegates to a specialist who has the historical knowledge.

How SAP-RPT-1 API Actually Works

RPT-1 is designed specifically for this pattern. Here’s the mental model:

You send a POST request to https://rpt.cloud.sap/api/predict with a body containing:

  1. Context Rows: Your complete historical data (all 1,000 tickets with known values)
  2. Query Row: Your new ticket, but with [PREDICT] placeholders for fields you want the model to fill in

Example query row:

{
  "TICKET_ID": "TKT_NEW",
  "SAP_MODULE": "[PREDICT]",     // ← RPT-1 will predict this
  "PRIORITY": "[PREDICT]",        // ← RPT-1 will predict this
  "DESCRIPTION": "Can't login",   // ← We provide this
  "ESCALATION": "[PREDICT]"       // ← RPT-1 will predict this
}

The model looks at patterns in your context rows and predicts the missing values. It’s like asking: “Given all these examples where Description X correlated with Module Y and Priority Z, what should the values be for this new description?”
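Putting it together, the request body pairs the context rows with the query row. A sketch of the shape (the rows and index_column fields match the tool implementation later in this section):

{
  "rows": [
    { "TICKET_ID": "TKT_001", "SAP_MODULE": "BASIS", "PRIORITY": "Medium", "DESCRIPTION": "Cannot login to development system", "ESCALATION": false },
    { "TICKET_ID": "TKT_NEW", "SAP_MODULE": "[PREDICT]", "PRIORITY": "[PREDICT]", "DESCRIPTION": "Can't login", "ESCALATION": "[PREDICT]" }
  ],
  "index_column": "TICKET_ID"
}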

Why This Architecture Wins

Compare the two approaches:

Attempt 2 (Bad):

User Request → LLM with 1,000 tickets in prompt → Prediction
              ↑
        (22,993 tokens, 12.7s, $$$)

Attempt 3 (Good):

User Request → LLM decides: "I need a prediction"
            → LLM calls RPT-1 tool (sends 1,000 tickets to specialized API)
            → RPT-1 returns prediction
            → LLM formats response
              ↑
        (639 tokens, 9.3s, $)

The historical data never enters the LLM’s context window. It goes directly to the specialized prediction model that was built for this exact job.

GenKit’s Role: Tools

GenKit makes this pattern easy through its “Tools” concept. A tool is just a function that:

  • Has a defined input/output schema
  • Can be called by the LLM during generation
  • Returns structured data back to the LLM

When you give the LLM access to tools, it automatically decides when to use them based on the prompt. You’ll see this in the GenKit UI trace—the LLM literally “thinks”: “I need to predict ticket properties, I should call the predictTicket tool.”

This is the magic of agentic AI: the LLM becomes a coordinator, not a do-everything oracle.

How to Build the SAP-RPT-1 Tool

Now let’s implement this. We’ll create a GenKit tool that wraps the RPT-1 API call:

File: src/index.ts (Final Version)

// src/index.ts - Final: LLM Orchestrator + SAP-RPT-1 Tool
import { googleAI } from '@genkit-ai/google-genai';
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', {
    temperature: 0.8,
  }),
});

// ===== SCHEMAS =====
const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description')
});

const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean()
});

// ===== SAP-RPT-1 TOOL =====
export const predictTicketWithSAPRPT1 = ai.defineTool({
  name: 'predictTicket',
  description: 'Predicts SAP service ticket properties using SAP-RPT-1 AI model',
  inputSchema: TicketInputSchema,
  outputSchema: TicketSchema
}, async (input) => {
  try {
    // Load historical data (in production, this could come from a database)
    const historicalDataJson = (await import('./history.json', {
      with: { type: 'json' }
    })).default;

    // Create prediction row with [PREDICT] markers
    const predictionTicket = {
      TICKET_ID: "TKT_PREDICT",
      SAP_MODULE: "[PREDICT]",
      PRIORITY: "[PREDICT]",
      DESCRIPTION: input.description,
      ESCALATION: "[PREDICT]",
    };

    // Build request: historical context + new ticket
    const body = {
      rows: [...historicalDataJson, predictionTicket],
      index_column: "TICKET_ID",
    };

    // Call SAP-RPT-1 API
    const response = await fetch('https://rpt.cloud.sap/api/predict', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${process.env.SAP_RPT_API_KEY}`,
      },
      body: JSON.stringify(body)
    });

    if (!response.ok) {
      throw new Error(`SAP-RPT API error: ${response.status} ${response.statusText}`);
    }

    const data = await response.json();
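    // RPT-1 returns one prediction set per query row; we sent a single
    // query row, so read predictions[0]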
    const prediction = data.prediction.predictions[0];

    // Map RPT-1 response to our schema
    return {
      description: input.description,
      sapModule: prediction.SAP_MODULE[0].prediction,
      priority: prediction.PRIORITY[0].prediction,
      // The escalation flag comes back as a stringified number ("1.0" = true)
      needEscalation: prediction.ESCALATION[0].prediction === "1.0"
    };

  } catch (error) {
    console.error('Error calling SAP-RPT API:', error);
    throw new Error(`Failed to get prediction from SAP-RPT: ${error instanceof Error ? error.message : 'Unknown error'}`);
  }
});

// ===== MAIN AGENT FLOW =====
export const serviceTicketAgent = ai.defineFlow(
  {
    name: 'ServiceTicketAgent',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert. Analyze ticket description: ${input.description}.

Predict and return JSON with:
- SAP Module (FI, CO, MM, SD, etc.)
- Priority (Critical, High, Medium, Low)
- needEscalation: true if downtime, data loss, security risk, or critical response needed

Use historical patterns for consistent routing decisions.`;

    // Genkit automatically decides when to call the tool
    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
      tools: [predictTicketWithSAPRPT1]
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);


async function main() {
  // Intentionally empty - run the flow from the GenKit Dev UI,
  // or call it directly as shown below.
}

main().catch(console.error);
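
If you want to exercise the flow without the Dev UI, a flow defined with ai.defineFlow is itself an async function, so you can call it straight from main(). A minimal sketch (the ticket description is made up for illustration):

async function main() {
  const result = await serviceTicketAgent({
    description: "Goods receipt posting blocked in production",
  });
  console.log(JSON.stringify(result, null, 2));
}

main().catch(console.error);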

Complete File Structure

my-genkit-app/
├── src/
│   ├── index.ts          # Main code
│   └── history.json      # Historical tickets
├── .env                  # API keys (don't commit!)
├── .gitignore           # Add .env here
├── package.json
└── tsconfig.json
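
The .gitignore only needs two entries to keep secrets and dependencies out of version control:

# .gitignore
node_modules/
.env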

Results Comparison

Attempt              Time    Input Tokens   Cost/Request   Accuracy
1: Naive LLM         ~3s     150            $              ⭐⭐
2: LLM + History     12.7s   22,993         $$$            ⭐⭐⭐
3: LLM + RPT-1 Tool  9.3s    639            $              ⭐⭐⭐⭐

Winner: Attempt 3 - 36x more token-efficient than Attempt 2, with better accuracy!

When This Approach Makes Sense

Use This Architecture When:

  • You have historical tickets for training context
  • Your historical data is of good quality
  • You need consistent, explainable predictions
  • Token costs matter (high volume environments)

Don’t Use This For:

  • One-off tickets without historical patterns
  • Unclear or constantly changing categories
  • Simple keyword-based routing (regex is cheaper!)


Found this helpful? Share your thoughts in the comments or connect with me to discuss AI integration.
