Lesson 2: Essential Libraries for AI Builders
Building AI applications requires a robust toolkit. While the Python standard library is powerful, a set of third-party libraries has become essential for modern AI development. This lesson covers the absolute must-haves that you'll use in nearly every project.
We'll also introduce uv, the modern, high-performance package manager we'll use to install them.
1. uv: The Modern Python Package Manager
Before we install libraries, let's talk about *how* we install them. In 2025, the Python community is rapidly adopting uv, a Rust-based tool that replaces pip and venv.
Why uv?
- Blazing Fast: It's an order of magnitude faster than pip, which makes setting up projects and managing dependencies a breeze.
- All-in-One: It handles creating virtual environments (uv venv) and installing packages (uv pip install), simplifying your workflow.
- Drop-in Replacement: You can use it just like you would use pip.
To install requests using uv:

```bash
# Create a virtual environment
uv venv

# Activate it (macOS/Linux)
source .venv/bin/activate

# Install a package
uv pip install requests
```
2. requests: The Go-To for API Calls
Virtually all AI applications interact with external APIs, especially LLM provider APIs (like OpenAI, Anthropic, or Groq). The requests library is the de facto standard for making HTTP requests in Python.
- API Calls: Sending prompts to LLM endpoints and reading back completions; requests is the engine for this.
- Data Fetching: You'll use it to download data for RAG systems or for fine-tuning.
Example: Calling the Groq API
```python
import os

import requests
from dotenv import load_dotenv

# A .env file is recommended for storing API keys
load_dotenv()

GROQ_API_KEY = os.environ.get("GROQ_API_KEY")
GROQ_API_URL = "https://api.groq.com/openai/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {GROQ_API_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "model": "llama3-8b-8192",
    "messages": [
        {"role": "user", "content": "Explain the importance of fast inference in AI."}
    ],
}

try:
    response = requests.post(GROQ_API_URL, headers=headers, json=payload)
    response.raise_for_status()  # Raises an HTTPError for bad responses (4xx or 5xx)
    data = response.json()
    print(data['choices'][0]['message']['content'])
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
```
3. pydantic: Bulletproof Data Validation
When you receive data from an API, you can't trust that it will always have the correct format. Pydantic enforces data schemas using Python's type hints. If the data doesn't match the schema, it raises a clear validation error.
Example: Validating LLM JSON Output
```python
from pydantic import BaseModel, ValidationError


class AISearchResult(BaseModel):
    tool_name: str
    description: str
    year_released: int


# Imagine this is a JSON string from an LLM
llm_output = """
{
    "tool_name": "Phidata",
    "description": "A new agent framework with a clean API.",
    "year_released": "2024"
}
"""

try:
    # Pydantic automatically converts the 'year_released' string to an int!
    result = AISearchResult.model_validate_json(llm_output)
    print("Validation successful!")
    print(result)
    print(f"Tool: {result.tool_name} (Released: {result.year_released})")
except ValidationError as e:
    print("Validation failed!")
    print(e)
```
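It's worth seeing the failure branch fire, too. This sketch reuses the AISearchResult model defined above, with deliberately malformed input (the values are made up for illustration):

```python
# 'year_released' can't be coerced to an int, so validation fails loudly
bad_output = '{"tool_name": "Phidata", "description": "An agent framework.", "year_released": "next year"}'

try:
    AISearchResult.model_validate_json(bad_output)
except ValidationError as e:
    # The error message names the offending field and the expected type
    print(e)
```

This is exactly the behavior you want when parsing LLM output: a loud, specific error at the boundary instead of a silent bad value deep in your application.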
4. python-dotenv: Managing Environment Variables
You should never hardcode sensitive information like API keys in your code. The python-dotenv library provides a simple way to load these values from a .env file, keeping secrets out of your source files and out of version control.
Example Usage
1. Create a file named .env in your project root:
```
GROQ_API_KEY="your-secret-api-key-here"
OPENAI_API_KEY="another-secret-key"
```
2. Add .env to your .gitignore file to prevent it from being committed.
3. In your Python code, load the variables:
```python
import os

from dotenv import load_dotenv

# Load variables from the .env file into the environment
load_dotenv()

# Now you can use this key safely in your application
api_key = os.getenv("GROQ_API_KEY")
```
5. NumPy & Pandas: Data Manipulation (Brief Overview)
While we won't dive deep into data science in this track, it's important to know what NumPy and Pandas are used for.
- NumPy: The fundamental package for numerical computing in Python. It provides a powerful N-dimensional array object. You'll encounter it when working with embeddings (which are often represented as NumPy arrays) or doing low-level, performance-critical calculations.
- Pandas: Built on top of NumPy, Pandas provides high-performance, easy-to-use data structures (like the DataFrame) and data analysis tools. It's the standard for cleaning, transforming, and analyzing structured data. If you're preparing a dataset for fine-tuning, you'll almost certainly use Pandas. (See the short sketch after this list.)
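As a taste of where they show up in AI work, here's a minimal sketch; the embedding values and dataset rows are made up for illustration:

```python
import numpy as np
import pandas as pd

# Two toy "embeddings" -- real ones come from an embedding model
a = np.array([0.1, 0.3, 0.5])
b = np.array([0.2, 0.1, 0.6])

# Cosine similarity: the workhorse of vector search
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"Cosine similarity: {cosine:.3f}")

# A tiny fine-tuning dataset: drop rows with missing completions
df = pd.DataFrame({
    "prompt": ["What is RAG?", "Define embedding", "Explain uv"],
    "completion": ["Retrieval-augmented generation...", None, "A fast package manager..."],
})
clean = df.dropna(subset=["completion"])
print(clean)
```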
We will use these libraries lightly in later modules, but a deep dive is a key part of our "Data Science for AI" track.