User Guide — Servantica AI Assistant

🚀 1. Getting Started

Open the Servantica AI Assistant at /assistant.html (or /voice-assistant).

1. Login: Enter your Username & Password (or paste a Bearer Token) in the auth card at the top, then click 🔗 Login.
2. Start chatting: Type a message in the text bar at the bottom and press Enter, or tap the 🎤 mic button to speak.
3. Get responses: The AI replies with text and can optionally speak the answer aloud via text-to-speech (TTS).

💡 Tip: You can log out anytime with the 🚪 Logout button. Your session will be cleared.

💬 2. AI Chat (Text)

The assistant uses GPT-4o-mini for intelligent conversation. It remembers context within a session.

Try saying:

"What is Spring Boot?"
"Explain microservices architecture"
"Write a Java function to reverse a string"
"What can you do?" — shows all capabilities

Multi-turn · Session memory · Project-aware

🎤 3. Voice Assistant

Speak into your microphone and get both text and spoken responses.

1. Tap the 🎤 mic button next to the text input bar.
2. The voice recording panel opens with a waveform visualizer and timer.
3. Speak your command. Recording stops automatically after about 2 seconds of silence, detected by voice activity detection (VAD).
4. Choose a voice (alloy, echo, fable, onyx, nova, shimmer) from the voice selector.

Whisper STT · OpenAI TTS · 6 Voices · WAV/MP3/WEBM
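
The silence-based auto-stop in step 3 can be pictured with a small sketch. This is only an illustration, not the assistant's actual detector: the frame length, the loudness threshold, and the function name `auto_stop_index` are all assumptions, and only the 2-second silence window comes from this guide.

```python
def auto_stop_index(levels, frame_ms=100, silence_threshold=0.02, max_silence_ms=2000):
    """Return the index of the frame where recording would stop, or None.

    levels: per-frame loudness values (for example RMS in the range 0..1).
    frame_ms and silence_threshold are assumed values for illustration.
    """
    # Number of consecutive quiet frames that add up to the silence window
    # (ceiling division, so a partial frame still counts as a full one).
    frames_needed = -(-max_silence_ms // frame_ms)
    silent_frames = 0
    for index, level in enumerate(levels):
        if level < silence_threshold:
            silent_frames += 1
        else:
            # Any frame at or above the threshold resets the silence counter.
            silent_frames = 0
        if silent_frames >= frames_needed:
            return index
    return None
```

With 100 ms frames, 20 quiet frames in a row make up the 2-second window, so a short pause in the middle of a sentence does not end the recording.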

🗣️ 4. Wake Word — "Hey Servantica"

Enable the Hey Servantica toggle for hands-free activation. The assistant listens continuously and auto-starts recording when it hears the wake word.

Recognized variants (24 total; a few examples):

"Hey Servantica" · "Hi Servantica" · "OK Servantica"
"Hey Servant" · "Hey Servantika" · "Hey Cervantic"

🎧 A green status bar at the top shows when wake-word listening is active.
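
As a rough illustration of variant matching, the sketch below checks whether a transcript begins with one of the six variants listed above and returns the rest of the utterance. The function name `strip_wake_word` and the normalization rules are assumptions; the full set of 24 variants and the assistant's real matching logic are not documented here.

```python
import re

# Only the six variants shown in this guide; the remaining ones are not listed.
WAKE_VARIANTS = (
    "hey servantica", "hi servantica", "ok servantica",
    "hey servant", "hey servantika", "hey cervantic",
)


def strip_wake_word(transcript):
    """Return the text after the wake word, or None if no variant leads the transcript."""
    # Lowercase, drop punctuation (keeping apostrophes), and collapse whitespace.
    normalized = re.sub(r"[^a-z0-9' ]+", " ", transcript.lower())
    normalized = " ".join(normalized.split())
    for variant in WAKE_VARIANTS:
        # Require a word boundary after the variant so "hey servant"
        # does not match the start of "hey servantica".
        if normalized == variant or normalized.startswith(variant + " "):
            return normalized[len(variant):].strip()
    return None
```

For example, `strip_wake_word("Hey Servant open notes")` yields the command text `open notes`, while a transcript without a wake word yields `None`.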

📱 5. Send SMS

Send or schedule SMS messages to any phone number via natural language.

Examples:

"Send SMS to +917259510747 with message Hello from Servantica"
"Text +1234567890 saying Your order is ready"
"Schedule SMS to +917259510747 at 2026-03-22T10:00:00 with message Meeting reminder"

📡 SMS is delivered via the Servantica Notification Service (Twilio).
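
To make the shape of a scheduled-SMS command concrete, here is a minimal sketch that pulls the number, time, and message out of the third example. The regular expression and the name `parse_scheduled_sms` are hypothetical; the assistant interprets natural language and is unlikely to rely on a fixed pattern like this one. The phone-number check follows the general E.164 form (a plus sign followed by up to 15 digits).

```python
import re
from datetime import datetime

# E.164: "+", a non-zero leading digit, at most 15 digits in total.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")

# Hypothetical fixed pattern that mirrors the example phrasing in this guide.
SCHEDULED_SMS = re.compile(
    r"^schedule sms to (?P<to>\S+) at (?P<at>\S+) with message (?P<body>.+)$",
    re.IGNORECASE,
)


def parse_scheduled_sms(command):
    """Split a scheduled-SMS command into recipient, send time, and body."""
    match = SCHEDULED_SMS.match(command.strip())
    if match is None:
        raise ValueError("not a scheduled-SMS command")
    to = match["to"]
    if not E164.match(to):
        raise ValueError(f"not an E.164 number: {to}")
    return {
        "to": to,
        # ISO 8601 timestamp such as 2026-03-22T10:00:00.
        "send_at": datetime.fromisoformat(match["at"]),
        "body": match["body"],
    }
```

The example shows which parts of the command matter: an international-format number, an ISO 8601 timestamp, and free text for the message.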

📞 6. Make a Call

Initiate voice calls with a spoken message to any phone number.

Examples:

"Call +917259510747 and say Your appointment is confirmed"
"Make a call to +1234567890 saying Please call back"
"Schedule a call to +917259510747 at 2026-04-15T14:00:00 and say Reminder"

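The timestamps in the scheduling examples carry no UTC offset, and this guide does not say which time zone the assistant assumes. The sketch below shows one way a client could make the zone explicit before sending a command; the function name `resolve_schedule_time` and the choice of default zone are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def resolve_schedule_time(stamp, default_tz=timezone.utc):
    """Parse an ISO 8601 timestamp and attach default_tz if it has no offset."""
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        # The timestamp is naive, so interpret it in the assumed zone.
        parsed = parsed.replace(tzinfo=default_tz)
    return parsed


# Example: interpret the guide's timestamp as India Standard Time (UTC+05:30).
ist = timezone(timedelta(hours=5, minutes=30))
call_time = resolve_schedule_time("2026-04-15T14:00:00", ist)
print(call_time.isoformat())
```

Including an explicit offset (for example `2026-04-15T14:00:00+05:30`) removes the ambiguity, provided the assistant accepts that form.
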
📧 7. Send Email

Send emails with a subject and body through natural language commands.

Examples:

"Send email to user@example.com with subject Welcome and body Thanks for joining"
"Email anand@servantica.in subject Project Update body The sprint is complete"

🎨 8. Image Generation (DALL·E)

Generate images from text descriptions. Images appear inline in the chat and can be downloaded as PNG.

Examples:

"Generate an image of a sunset over mountains"
"Create a picture of a futuristic city at night"
"Draw a cartoon robot holding a coffee cup"

🖼️ Click the generated image to view it full-size in a new tab. Images are 1024×1024 PNG.

🎬 9. Video Generation (Sora-2)

Generate videos from text prompts using OpenAI's Sora-2 model. Video generation is asynchronous.

Examples:

"Create a video of ocean waves crashing on a beach"
"Generate a video of a rocket launching into space"

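Because generation is asynchronous, a client typically checks back until the video is ready. The sketch below shows a generic polling loop; it is not Servantica's API. The `get_status` callable, the `state`, `url`, and `error` fields, and the timing values are all hypothetical placeholders.

```python
import time


def wait_for_video(get_status, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Poll get_status() until the job completes, fails, or attempts run out.

    get_status is a hypothetical callable returning a dict such as
    {"state": "completed", "url": "..."}; the real response shape may differ.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status["state"] == "completed":
            return status["url"]
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        # Still queued or rendering: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError("video was not ready in time")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.
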
🌤️ 10. Live Weather

Ask about the weather in any city and get real-time forecasts.

Examples:

"What's the weather in Bangalore?"
"Weather forecast for New York"
"Is it raining in London?"

📰 11. Latest News

Get the latest headlines on any topic.

Examples:

"What's the latest news on AI?"
"Top headlines in technology"
"News about space exploration"

🔍 12. Web Search

Search the web for any topic and get summarized results.

Examples:

"Search the web for Java 21 new features"
"Google Spring Boot best practices 2026"

📝 13. Voice Notes

Save, list, and delete notes using voice or text. Notes appear in the slide-in side panel.

Examples:

"Save a note: Buy groceries tomorrow"
"Show my notes"
"Delete all notes"

📌 Click the floating notes button (bottom-right) to toggle the notes panel.

⏰ 14. Reminders

Set, list, and cancel reminders via natural language. You'll get a notification when it's time.

Examples:

"Remind me to call John at 3 PM"
"Set a reminder for meeting at 5:30 PM"
"List my reminders"
"Cancel all reminders"

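Clock times such as "3 PM" or "5:30 PM" have to be turned into a concrete date and time before a reminder can fire. The sketch below is one simple way to do that; the function names are hypothetical, and rolling a time that has already passed over to the next day is an assumption, not documented behavior.

```python
from datetime import datetime, time, timedelta


def parse_clock_time(text):
    """Parse '3 PM' or '5:30 PM' into a datetime.time (English AM/PM markers)."""
    cleaned = " ".join(text.upper().split())
    for fmt in ("%I:%M %p", "%I %p"):
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue
    raise ValueError(f"unrecognized time: {text!r}")


def next_occurrence(at, now):
    """Return today's date at the given time, or tomorrow's if that moment has passed."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)
```

For instance, asking at 4 PM for a reminder "at 3 PM" would resolve to 3 PM on the following day under this assumption.
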
🎙️ 15. Voice Cloning (ElevenLabs)

Clone your voice by uploading audio samples, then use the cloned voice for AI responses.

1. Open the Voice Clone Studio modal from the header.
2. Upload 1–25 audio samples of your voice.
3. Give your clone a name and click Clone. You'll receive a voiceId.
4. Use the cloned voice for TTS responses, and the AI will speak with your voice.

📍 16. Live Location / Nearby Search

Ask about places near you and get Google Maps buttons for directions.

Examples:

"Find restaurants near me"
"Nearest hospital"
"ATMs around me"
"Closest pharmacy"

📍 The assistant auto-detects "near me" keywords and shows Find on Maps & Directions buttons.
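
A sketch of the keyword detection described above: the message is checked for a proximity phrase and, if one is found, turned into a Google Maps search link. The keyword list is inferred from the examples in this section and the function names are hypothetical; the link uses the documented Google Maps URL form `https://www.google.com/maps/search/?api=1&query=...`.

```python
import re
from urllib.parse import urlencode

# Proximity phrases taken from the example queries; the real list may be longer.
NEARBY = re.compile(r"\b(near me|nearest|around me|closest)\b", re.IGNORECASE)


def maps_search_url(query):
    """Build a Google Maps search link for a free-text query."""
    return "https://www.google.com/maps/search/?" + urlencode({"api": 1, "query": query})


def nearby_link(message):
    """Return a Maps link if the message looks like a nearby search, else None."""
    return maps_search_url(message) if NEARBY.search(message) else None
```

Messages without a proximity phrase return `None`, so they fall through to ordinary chat handling.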

🔬 17. Deep Search

Search across multiple engines simultaneously — YouTube, Google, Reddit, Stack Overflow, Maps, and more.

1. Click the 🔬 Deep Search chip in the header.
2. Select the engines you want to search across.
3. Type your query and click 🚀 Launch Deep Search.

🚨 18. Emergency Protocol

One-tap emergency activation — sends your GPS location via SMS and auto-calls your emergency contact.

1. Click the 🚨 Emergency chip (appears after login).
2. The system acquires your GPS location via the Geolocation API.
3. An SMS with a Google Maps link of your location is sent to your emergency contact.
4. An auto-call is placed to your emergency contact with a voice alert, and a siren alarm sounds.

⚠️ You can cancel the emergency before activation with the cancel button or Escape.
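
The SMS in step 3 carries a Google Maps link built from the GPS coordinates. The sketch below shows how such a message could be assembled; the wording of the message and the name `emergency_sms_body` are hypothetical, and the actual text sent by the assistant is not documented here.

```python
from urllib.parse import urlencode


def emergency_sms_body(latitude, longitude):
    """Build an alert message containing a Google Maps link to the coordinates."""
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError("coordinates out of range")
    # Six decimal places is roughly 0.1 m of precision, more than GPS provides.
    query = f"{latitude:.6f},{longitude:.6f}"
    link = "https://www.google.com/maps/search/?" + urlencode({"api": 1, "query": query})
    # Hypothetical wording, for illustration only.
    return f"EMERGENCY: I need help. My location: {link}"
```

Validating the coordinate range first avoids sending a link that points nowhere if the location fix is corrupt.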

📄 19. PDF Report Generation

Ask the AI to generate detailed reports on any topic — delivered as a downloadable PDF.

Examples:

"Generate a PDF report on Artificial Intelligence in 2026"
"Create a report about climate change"
"Write a detailed report on wars after 2000"

📄 A Download PDF Report button appears in the chat after the report is generated.

📥 20. Export Conversations

Export your chat history as TXT or PDF.

1. Click the 📥 Export button in the session bar.
2. Choose TXT for a plain-text file or PDF for a styled printable document.

🌗 21. Dark / Light Theme

Toggle between Dark and Light mode using the switch in the header. Your preference is saved to localStorage.

Dark Mode: Galaxy background with stars, nebula, aurora, and shooting stars
Light Mode: Soft floating blobs, shimmer lines, and bokeh circles

⌨️ 22. Keyboard Shortcuts

`Enter`: Send text message
`Space`: Toggle voice recording (when not typing)
`Escape`: Cancel recording or cancel in-flight processing

🔐 23. Authentication

The assistant supports three authentication methods:

HTTP Basic Auth: Username + Password (simplest method)
Auth0 JWT: Paste your Auth0 Bearer token
Keycloak JWT: Paste your Keycloak Bearer token

🔒 After login, credential fields are frozen. All interactive features (mic, send, wake word) are disabled until you log in.
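
For readers calling the backend directly, the sketch below shows how the two header styles are formed. The Basic scheme follows RFC 7617 (Base64 of `username:password`), and the Bearer form is the usual way to send an Auth0 or Keycloak JWT. The function names are illustrative, and whether the Servantica endpoints expect exactly these headers is an assumption.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(jwt):
    """Build a Bearer Authorization header from a pasted JWT."""
    # Strip stray whitespace or newlines picked up when pasting the token.
    return {"Authorization": f"Bearer {jwt.strip()}"}
```

Basic credentials are only encoded, not encrypted, so they should be sent over HTTPS.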

⚡ Quick Capabilities Reference

💬 AI Chat

🎤 Voice Input

🔊 Voice Output

📱 Send SMS

📞 Make Calls

📧 Send Emails

🎨 Generate Images

🎬 Generate Videos

🌤️ Weather

📰 News

🔍 Web Search

📝 Notes

⏰ Reminders

🎙️ Voice Clone

📍 Nearby Places

🔬 Deep Search

🚨 Emergency

📄 PDF Reports

🗣️ Wake Word

🔔 Notifications
