Why Add AI to Your Application?
AI-powered features can transform your application, from intelligent search and automated customer support to content generation and data analysis. The OpenAI API makes these capabilities accessible without training your own models.
Here are some high-value integrations I've built for clients:
- Intelligent chatbots that understand context and provide relevant answers
- Content generation for marketing copy, product descriptions, and emails
- Data extraction from unstructured text (invoices, contracts, reviews)
- Smart search that understands natural language queries
Getting Started
API Setup
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain microservices in simple terms."},
    ],
    temperature=0.7,
    max_tokens=500,
)
print(response.choices[0].message.content)

Key Parameters
- model: gpt-4o for best quality, gpt-4o-mini for speed and cost efficiency
- temperature: 0 for the most consistent outputs, 0.7-1.0 for more creative responses
- max_tokens: caps the length of the response (and therefore its cost)
- system message: sets the AI's behavior and constraints
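One way to keep these parameter choices consistent is to define them once per task. The sketch below is a minimal illustration: the task names (`classify`, `chat`, `creative`), the `PRESETS` table, and the `build_request` helper are all hypothetical, and the values are examples rather than recommendations.

```python
# Hypothetical per-task presets; task names and values are illustrative assumptions.
PRESETS = {
    "classify": {"model": "gpt-4o-mini", "temperature": 0.0, "max_tokens": 10},
    "chat": {"model": "gpt-4o", "temperature": 0.7, "max_tokens": 500},
    "creative": {"model": "gpt-4o", "temperature": 1.0, "max_tokens": 800},
}


def build_request(task: str, system: str, user: str) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    if task not in PRESETS:
        raise ValueError(f"unknown task: {task!r}")
    return {
        **PRESETS[task],
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
```

With the client from the setup above, a call would look like `client.chat.completions.create(**build_request("classify", "Classify sentiment.", text))`.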
Prompt Engineering Best Practices
The quality of your AI integration depends heavily on your prompts.
Be Specific About the Output Format
system_prompt = """You are a product description generator.
Output format: JSON with fields: title, description, features (array), seo_keywords (array).
Keep descriptions under 200 words.
Tone: Professional but approachable."""

Use Few-Shot Examples
messages = [
    {"role": "system", "content": "Classify customer feedback as positive, negative, or neutral."},
    {"role": "user", "content": "The product arrived on time and works great!"},
    {"role": "assistant", "content": "positive"},
    {"role": "user", "content": "It broke after two days. Very disappointed."},
    {"role": "assistant", "content": "negative"},
    {"role": "user", "content": actual_feedback},
]

Add Guardrails
system_prompt = """You are a customer support assistant for an e-commerce store.
Rules:
- Only answer questions about orders, shipping, and returns
- Never provide medical, legal, or financial advice
- If unsure, say "Let me connect you with a human agent"
- Always be polite and concise"""

Streaming Responses
For chat interfaces, streaming provides a much better user experience:
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    stream=True,
)
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

FastAPI Streaming Endpoint
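import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    # Chat history as a list of {"role": ..., "content": ...} dicts
    messages: list[dict]

@app.post("/api/chat")
async def chat(request: ChatRequest):
    # A plain (sync) generator: StreamingResponse iterates it in a thread pool,
    # so the blocking client from API Setup does not stall the event loop.
    def generate():
        stream = client.chat.completions.create(
            model="gpt-4o",
            messages=request.messages,
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                # JSON-encode so newlines in the content cannot break SSE framing
                yield f"data: {json.dumps(content)}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

Production Considerations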
Rate Limiting and Caching
import json
from functools import lru_cache

# Cache identical requests to reduce API costs.
# lru_cache needs hashable arguments, so the messages are passed as a JSON string.
@lru_cache(maxsize=1000)
def _cached_completion(model: str, messages_json: str):
    messages = json.loads(messages_json)
    return client.chat.completions.create(
        model=model,
        messages=messages,
    )

def cached_completion(messages: list, model: str = "gpt-4o-mini"):
    # sort_keys makes equivalent messages serialize to the same cache key
    return _cached_completion(model, json.dumps(messages, sort_keys=True))

Error Handling
import time

from openai import APIError, RateLimitError

def safe_completion(messages, retries=3):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
            )
        except RateLimitError:
            # Re-raise once retries are exhausted instead of silently returning None
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, ...
        except APIError:
            if attempt == retries - 1:
                raise
            time.sleep(1)

Cost Management
- Use gpt-4o-mini for simple tasks (classification, extraction)
- Use gpt-4o only when quality matters (customer-facing, complex reasoning)
- Set max_tokens to limit response length
- Cache repeated queries
- Monitor usage with OpenAI's dashboard
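To make the cost difference between models concrete, a rough per-request estimate can be computed from token counts. This is a minimal sketch: the `PRICES` table, its model keys, and the `estimate_cost` helper are hypothetical, and the prices are placeholders, not OpenAI's actual rates.

```python
# Placeholder prices in USD per 1M tokens. These are NOT real OpenAI prices;
# replace them with the current figures from OpenAI's pricing page.
PRICES = {
    "small-model": {"input": 1.00, "output": 2.00},
    "large-model": {"input": 10.00, "output": 20.00},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    price = PRICES[model]
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / 1_000_000
```

For a real request, the token counts are available on the response as `response.usage.prompt_tokens` and `response.usage.completion_tokens`, so the estimate can be logged alongside each call.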
Real-World Architecture
Here's a typical architecture for an AI-enhanced application:
- Frontend sends user input to your API
- Your API validates input, applies rate limits, constructs the prompt
- OpenAI API processes the request and returns the response
- Your API post-processes the response (validation, formatting, logging)
- Frontend displays the result with streaming for chat interfaces
The key insight: the AI is a tool in your stack, not the entire stack. Wrap it with validation, error handling, and monitoring just like any other external service.
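The validation and post-processing steps in this flow can be sketched as a single function. This is an illustration under stated assumptions, not a definitive implementation: `handle_request`, `MAX_INPUT_CHARS`, and `REQUIRED_FIELDS` are hypothetical names, the model call is injected as a plain callable so it can be stubbed, and rate limiting is left out.

```python
import json
from typing import Callable

MAX_INPUT_CHARS = 2000  # assumed input limit, for illustration only
REQUIRED_FIELDS = {"title", "description"}  # assumed response schema


def handle_request(user_input: str, call_model: Callable[[list], str]) -> dict:
    """Validate input, build the prompt, call the model, and check the output."""
    # Validate input before spending tokens on it.
    text = user_input.strip()
    if not text or len(text) > MAX_INPUT_CHARS:
        return {"ok": False, "error": "invalid input"}

    messages = [
        {"role": "system", "content": "Reply with JSON containing title and description."},
        {"role": "user", "content": text},
    ]

    # The model call is injected so it can be replaced with a stub in tests.
    raw = call_model(messages)

    # Post-process: do not assume the model returned well-formed output.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"ok": False, "error": "model returned invalid JSON"}
    if not isinstance(data, dict) or not REQUIRED_FIELDS.issubset(data):
        return {"ok": False, "error": "missing fields"}
    return {"ok": True, "data": data}
```

In production, `call_model` would wrap the OpenAI client call and return `response.choices[0].message.content`; keeping it as a parameter means the surrounding logic can be tested without network access.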
Getting Help
If you're looking to add AI capabilities to your application, check out our AI & ML integration services. We've helped businesses across industries implement practical AI solutions that deliver real value.
For more engineering insights, read about web scraping with Python or software development costs.
