First: Thank you to Thorsten Ball, for writing https://ampcode.com/how-to-build-an-agent. This blog post is an attempt to further simplify his code. Please do read the original; Thorsten's humor and expertise are infectious and inspired me to write this myself.
Second: You can find a video walkthrough of this demo at https://youni.ai/lightning-lessons/coding-agent-deepseek-scratch. In that video, I type out each line (often incorrectly) so you definitely can't miss any steps. Please do leave your email, so I can justify the time spent on this to my employer.
If you want to see the final product, you can find it here: https://github.com/coderschool/python-coding-agent-tutorial. Note that it varies a little from what I'll show below; I've cleaned it up a bit, added some color in the terminal, and added the tiniest bit of error handling. Pull requests are welcome, of course.
My next blog post will show how to use MCPs instead of this basic tool implementation, since MCPs are the next big thing everyone's making a fuss about, despite the concepts being painfully simple to the point of being boring to write about.
So, you think Cursor/Windsurf/Trae/Claude Code/OpenAI Codex CLI are going to replace coders forever?
Well, maybe. But those of us who will be replaced last, the ones hanging on for dear life, will survive because we're inquisitive and fast learners. The survivor group will be those who learn how the tools work, how to work best with them, and how to create and improve them. Let's be survivors. (For the record, I don't think coders are being replaced, but everyone's scared to say that now lest they sound arrogant and entitled.)
So today, let's learn the fundamentals of how Cursor et al., and really all "agentic AI" tooling, is implemented, by building a very simple coding "agent".
Our goal here is just to send messages back and forth with our LLM. I'm using DeepSeek because it's inexpensive, and if you want to substitute OpenAI's GPT models, it doesn't require any code change (just remove the `base_url` and choose a GPT model).
Import the basic libraries, and get the API set up. You'll need your API key: create a new file called `.env` containing a single line:

```
DEEPSEEK_API_KEY=sk-123123123123
```
Storing secrets in a separate file is about the only "real-world engineering" type thing I'll do in this tutorial; note that everything else is purely for educational purposes.
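While we're on real-world engineering (my addition, not in the original flow): make sure `.env` never lands in version control. One line takes care of it:

```bash
echo ".env" >> .gitignore
```

With that out of the way, here's the setup code: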
```python
import os
import sys
import json
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
api_key = os.environ.get("DEEPSEEK_API_KEY")
if not api_key:
    print("Error: DEEPSEEK_API_KEY not found in .env", file=sys.stderr)
    sys.exit(1)

client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com/")
MODEL = "deepseek-chat"
```
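If you'd like to sanity-check that your key and base URL actually work before going further, a quick throwaway call (optional; my addition) does the trick:

```python
# Optional sanity check: one cheap request to confirm the client is configured.
response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Say hello in five words."}],
    max_tokens=32,
)
print(response.choices[0].message.content)
```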
Next, we'll create a new class for our "Agent", so we can finally add "Agentic AI Programming" to our LinkedIn skills.
```python
class Agent:
    def __init__(self, client_instance):
        self.client = client_instance

    def run(self):
        print(f"Chatting with {MODEL}. Hit Ctrl+C to exit.")
        while True:
            try:
                user_input = input("You: ")
                if not user_input:
                    continue
                print(f"Processing: '{user_input}'...")  # Placeholder
            except KeyboardInterrupt:
                print("\nExiting...")
                break


if __name__ == "__main__":
    agent_instance = Agent(client_instance=client)
    agent_instance.run()
```
Congratulations, you now realize that an Agent is just a class. This current agent can't do anything other than echo things back to the console. Let's now make it actually talk to our LLM API, by adding a function called `run_inference` to our class (put it above `def run`):
```python
    def run_inference(self, conversation_history):
        response = self.client.chat.completions.create(
            model=MODEL,
            messages=conversation_history,
            max_tokens=1024,
        )
        return response.choices[0].message
```
Now that our Agent knows how to call the LLM API, let's modify that `run` function:
```python
    def run(self):
        conversation = []
        print(f"Chatting with {MODEL}. Hit Ctrl+C to exit.")
        while True:
            try:
                user_input = input("You: ")
                conversation.append({"role": "user", "content": user_input})
                assistant_message_obj = self.run_inference(conversation)
                conversation.append(assistant_message_obj.model_dump(exclude_unset=True))
                if assistant_message_obj.content:
                    print(f"{MODEL}: {assistant_message_obj.content}")
            except KeyboardInterrupt:
                print("\nExiting...")
                break
```
Now you can have a conversation with our Agent, just as if it were the regular ChatGPT app, except you're in the terminal. So far we've done nothing important, but it's still kinda fun to have built your own interface to DeepSeek.
Now, by default, LLMs are trained to be eager to use "tools". You can see this in a simple conversation by offering the LLM an imaginary tool to use. Here's the conversation you can now have in the terminal:
You: You are a weather expert. When I ask you about the weather in a given location, I want you to reply with `get_weather(<location_name>)`. I will then tell you what the weather in that location is. Understood?
deepseek-chat: Understood! From now on, when you ask about the weather in a location, I’ll reply with:
`get_weather(<location_name>)`
(Example: You say *“Weather in Tokyo?”* → I reply `get_weather(Tokyo)`.)
Ready when you are! 🌤️
You: Hey, what's the weather in Ho Chi Minh City?
deepseek-chat: `get_weather(Ho Chi Minh City)`
Let me know the details, and I’ll summarize or analyze it for you! 🌧️☀️
You: Hot, Humid, about 33 degrees Celsius
deepseek-chat: Got it! Here’s your **Ho Chi Minh City** weather breakdown:
**🌡️ Weather Update:**
- **Temperature:** **33°C** (*Scorching!*)
- **Conditions:** **Hot & Humid** (Feels like a steamy sauna 🧖♂️💦)
- **Survival Tips:**
- Wear light, breathable fabrics (linen is your friend!).
- Hydrate like it’s your job—water, coconut juice, or iced coffee works! ☕🧊
- Avoid midday sun; embrace shady alleys or air-conditioned malls.
**Fun Twist:** Did you know HCMC’s heat inspired the phrase *"Don’t fry eggs on the sidewalk… but maybe you could?"* 🍳 (Disclaimer: Please don’t test this.)
Need a virtual ice cream emoji to cope? 🍦😅
As you can see in the conversation, the bot is eager to call a function, `get_weather`, with the argument `Ho Chi Minh City`. Luckily, the OpenAI client libraries expose a parameter called `tools`, where we can do something similar to what I just did: let the AI know what tools it can call, for what purpose, and with which arguments.
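Concretely, each entry in that `tools` list is a JSON-Schema-flavored function description. For our imaginary weather tool, it would look something like this (we'll generate these from a class in a moment):

```python
# What one entry in the `tools` parameter looks like, using the imaginary weather tool:
{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location_name": {
                    "type": "string",
                    "description": "The city to look up",
                }
            },
            "required": ["location_name"],
        },
    },
}
```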
Step 1 Checkpoint - code looks like:
```python
import os
import sys
import json
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
api_key = os.environ.get("DEEPSEEK_API_KEY")
if not api_key:
    print("Error: DEEPSEEK_API_KEY not found in .env", file=sys.stderr)
    sys.exit(1)

client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com/")
MODEL = "deepseek-chat"


class Agent:
    def __init__(self, client_instance):
        self.client = client_instance

    def run_inference(self, conversation_history):
        response = self.client.chat.completions.create(
            model=MODEL,
            messages=conversation_history,
            max_tokens=1024,
        )
        return response.choices[0].message

    def run(self):
        conversation = []
        print(f"Chatting with {MODEL}. Hit Ctrl+C to exit.")
        while True:
            try:
                user_input = input("You: ")
                conversation.append({"role": "user", "content": user_input})
                assistant_message_obj = self.run_inference(conversation)
                conversation.append(assistant_message_obj.model_dump(exclude_unset=True))
                if assistant_message_obj.content:
                    print(f"{MODEL}: {assistant_message_obj.content}")
            except KeyboardInterrupt:
                print("\nExiting...")
                break


if __name__ == "__main__":
    agent_instance = Agent(client_instance=client)
    agent_instance.run()
```
Okay, so let's create a way to tell our LLM about the tools it can use.
```python
import typing  # add this to your imports at the top of the file


class ToolDefinition:
    def __init__(self, name: str, description: str, parameters: dict, function_to_call: typing.Callable):
        self.name = name
        self.description = description
        self.parameters = parameters
        self.function = function_to_call
```
I've added some type hints to the Python declaration - they're optional, but hopefully make the code easier for you to follow. If you are actually reading the text in between the code snippets, first of all, thank you. Secondly, it's similar to what we did before with `get_weather`, but we're doing it in code instead of in conversation.
We'll need to format our tool description to match what the API expects, so let's add a simple function to this class that handles it:
```python
    def get_api_description(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }
```
And now we just need our Agent to know what tools it has, so we add a line to our constructor like so:
```python
class Agent:
    def __init__(self, client_instance, tool_definitions: list[ToolDefinition] = []):
        self.client = client_instance
        self.tool_definitions = {tool.name: tool for tool in tool_definitions}
```
And finally, we need our Agent to tell the LLM about its tools, which we can do via the `tools` parameter built into the LLM API call:
```python
    def run_inference(self, conversation_history):
        response = self.client.chat.completions.create(
            model=MODEL,
            messages=conversation_history,
            max_tokens=1024,
            tools=[tool.get_api_description() for tool in self.tool_definitions.values()],
        )
        return response.choices[0].message
```
With that, we have all the plumbing complete, and we can get to the fun part: actually writing our tools. The first tool we'll make is something to read files. Let's instantiate a new `ToolDefinition` like so:
```python
read_file_tool = ToolDefinition(
    name="read_file",
    description="Read the contents of a file",
    parameters={
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": "Path to the file to read",
            }
        },
        "required": ["file_path"],
    },
    function_to_call=read_file_from_path,
)
```
We also need to implement the `function_to_call`. For now, let's just keep it simple with these three lines:
```python
def read_file_from_path(file_path: str) -> str:
    with open(file_path, 'r', encoding='utf-8') as f:
        return f.read()
```
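Three lines, no error handling: if the model asks for a file that doesn't exist, the whole program crashes with an exception. A slightly sturdier variant (my addition here; the GitHub version adds "the tiniest bit of error handling" in this spirit) returns the error as a string, so the model can read it and recover:

```python
def read_file_from_path(file_path: str) -> str:
    # Return errors as strings so the agent loop survives bad paths
    # and the model can see the error message and try again.
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read()
    except OSError as e:
        return f"Error reading {file_path}: {e}"
```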
Last but not least, we need to edit that one line we wrote ages ago, to pass our new tool to our agent instantiation:
```python
if __name__ == "__main__":
    agent_instance = Agent(client_instance=client, tool_definitions=[read_file_tool])
    agent_instance.run()
```
As you can see, most of the work is just connecting things to each other. The actual "brains" of the tools can be found in the three lines that read from the filesystem. But wait! There's more! We still haven't connected the tool to our program's output yet. Just like we needed to read and react to the LLM saying `get_weather`, our code needs to respond. The response from our `run_inference` call has an attribute called `tool_calls`, which will be populated with one or more tools that the AI wants called. We then need to respond with the answer, similar to how we said "33 degrees, hot" when asked about the weather.
```python
if assistant_message_obj.tool_calls:
    for tool_call in assistant_message_obj.tool_calls:
        tool_result = self.execute_tool_call(tool_call)
        tool_response = {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name,
            "content": str(tool_result),
        }
        conversation.append(tool_response)
```
To make this whole loop work, we need to restructure our `run` function a bit. Here's the new version incorporating these changes:
```python
    def run(self):
        conversation = []
        print(f"Chatting with {MODEL}. Hit Ctrl+C to exit.")
        while True:
            try:
                user_input = input("You: ")
                conversation.append({"role": "user", "content": user_input})
                while True:
                    assistant_message_obj = self.run_inference(conversation)
                    assistant_message = assistant_message_obj.model_dump(exclude_unset=True)
                    conversation.append(assistant_message)
                    if assistant_message_obj.content:
                        print(f"{MODEL}: {assistant_message_obj.content}")
                    if assistant_message_obj.tool_calls:
                        for tool_call in assistant_message_obj.tool_calls:
                            print(f"Tool Call: {tool_call.function.name}")
                            tool_result = self.execute_tool_call(tool_call)
                            tool_response = {
                                "role": "tool",
                                "tool_call_id": tool_call.id,
                                "name": tool_call.function.name,
                                "content": str(tool_result),
                            }
                            conversation.append(tool_response)
                        # Tool results appended; loop again so the model sees them.
                        continue
                    # No tool calls: the model is done, go back to the user.
                    break
            except KeyboardInterrupt:
                print("\nExiting...")
                break
```
Note the inner loop: after appending tool results, we `continue` so the model can see them and respond (or call more tools); once a response contains no tool calls, we `break` back to waiting for user input. We also need to add the `execute_tool_call` method:
```python
    def execute_tool_call(self, tool_call):
        tool_name = tool_call.function.name
        tool_args_str = tool_call.function.arguments
        tool_args = json.loads(tool_args_str)
        result = self.tool_definitions[tool_name].function(**tool_args)
        return result
```
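One caveat (my addition, not in the original): the model occasionally hallucinates a tool name or emits arguments that aren't valid JSON, and as written either one will crash the loop. A more defensive sketch returns the problem to the model as a string instead:

```python
    def execute_tool_call(self, tool_call):
        # Defensive variant: report bad tool names or malformed arguments
        # back to the model instead of raising an exception.
        tool = self.tool_definitions.get(tool_call.function.name)
        if tool is None:
            return f"Error: unknown tool '{tool_call.function.name}'"
        try:
            tool_args = json.loads(tool_call.function.arguments)
            return tool.function(**tool_args)
        except (json.JSONDecodeError, TypeError) as e:
            return f"Error calling {tool_call.function.name}: {e}"
```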
All we do in either version is look the tool up by name, find its function, and invoke it with the supplied arguments. Now that you have this, you should be able to do cool stuff like this:
You: what's in main.py?
Tool Call: read_file
deepseek-chat: The `main.py` file contains a Python script that sets up an agent to interact with the DeepSeek API. Here's a summary of its contents:
1. **Imports**:
- `openai`: Used to interact with the DeepSeek API.
- `dotenv`: Loads environment variables (e.g., API keys).
- `json`: Handles JSON data.
- `os`: Accesses environment variables.
2. **Environment Setup**:
- Loads the `.env` file to get the `DEEPSEEK_API_KEY`.
- Initializes the OpenAI client with the DeepSeek API URL.
3. **Classes**:
- **`ToolDefinition`**: Defines a tool (function) that the agent can use, including its name, description, parameters, and the function to call.
- **`Agent`**: Manages the conversation with the DeepSeek API, handles tool calls, and executes functions.
4. **Functions**:
- **`read_file_path`**: Reads the contents of a file (used as a tool by the agent).
- **`run_inference`**: Sends a conversation to the DeepSeek API and retrieves the response.
- **`execute_tool`**: Executes a tool (function) based on the API's response.
- **`run`**: Main loop for interacting with the user, sending queries to the API, and handling responses.
5. **Tool Definition**:
- `read_file_tool`: An instance of `ToolDefinition` for reading files.
6. **Main Execution**:
- Initializes the `Agent` with the `read_file_tool`.
- Starts the agent's interaction loop.
### Key Features:
- The agent can dynamically call tools (like `read_file`) based on the API's response.
- It maintains a conversation history and handles user input interactively.
- The script is designed to be extensible with additional tools.
Congratulations. If you haven't already added Agentic AI programming to your LinkedIn, now you certainly should - your agent is alive.
Seriously, feel free to add me on LinkedIn at http://linkedin.com/in/charleslee5 and ask me to "Endorse" you. I'll do it. I'd do anything for readers of my blog.
(Imagine how much more we'd do for learners at www.coderschool.vn)
Step 2 Checkpoint - code looks like:
```python
import os
import sys
import json
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
api_key = os.environ.get("DEEPSEEK_API_KEY")
if not api_key:
    print("Error: DEEPSEEK_API_KEY not found in .env", file=sys.stderr)
    sys.exit(1)

client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com/")
MODEL = "deepseek-chat"


class ToolDefinition:
    def __init__(self, name, description, parameters, function_to_call):
        self.name = name
        self.description = description
        self.parameters = parameters
        self.function = function_to_call

    def get_api_description(self):
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


class Agent:
    def __init__(self, client_instance, tool_definitions: list[ToolDefinition]):
        self.client = client_instance
        self.tool_definitions = {tool.name: tool for tool in tool_definitions}

    def run_inference(self, conversation_history):
        response = self.client.chat.completions.create(
            model=MODEL,
            messages=conversation_history,
            max_tokens=1024,
            tools=[tool.get_api_description() for tool in self.tool_definitions.values()],
        )
        return response.choices[0].message

    def execute_tool_call(self, tool_call):
        tool_name = tool_call.function.name
        tool_args_str = tool_call.function.arguments
        tool_args = json.loads(tool_args_str)
        result = self.tool_definitions[tool_name].function(**tool_args)
        return result

    def run(self):
        conversation = []
        print(f"Chatting with {MODEL}. Hit Ctrl+C to exit.")
        while True:
            try:
                user_input = input("You: ")
                conversation.append({"role": "user", "content": user_input})
                while True:
                    assistant_message_obj = self.run_inference(conversation)
                    conversation.append(assistant_message_obj.model_dump(exclude_unset=True))
                    if assistant_message_obj.content:
                        print(f"{MODEL}: {assistant_message_obj.content}")
                    if assistant_message_obj.tool_calls:
                        for tool_call in assistant_message_obj.tool_calls:
                            print(f"Tool call: {tool_call.function.name}")
                            tool_result = self.execute_tool_call(tool_call)
                            tool_response = {
                                "role": "tool",
                                "content": str(tool_result),
                                "tool_call_id": tool_call.id,
                                "name": tool_call.function.name,
                            }
                            conversation.append(tool_response)
                        continue
                    break
            except KeyboardInterrupt:
                print("\nExiting...")
                break


def read_file_from_path(file_path):
    with open(file_path, "r") as file:
        return file.read()


read_file_tool = ToolDefinition(
    name="read_file",
    description="Read a file from the given path",
    parameters={
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": "The path to the file to read",
            }
        },
        "required": ["file_path"],
    },
    function_to_call=read_file_from_path,
)

if __name__ == "__main__":
    agent = Agent(client_instance=client, tool_definitions=[read_file_tool])
    agent.run()
```
I saved the easiest step for last, partly because it really underscores how simple AI can be sometimes. If you think about what a coding agent does, it basically just reads and edits files, with some LLM magic sprinkled throughout. To be fully-featured like the production apps, you'd have to read multiple files, index a codebase, follow rules, and so on, but at the core, we're most of the way there with basic file-writing capability. If you don't believe me, I applaud you for reading this far despite your skepticism. Let's banish that skepticism with the following lines of code:
```python
def edit_file_path(file_path, old_str, new_str):
    try:
        with open(file_path, 'r') as f:
            old_content = f.read()
        new_content = old_content.replace(old_str, new_str)
        with open(file_path, 'w') as f:
            f.write(new_content)
        # Return a string so the model gets confirmation, not "None".
        return f"Successfully edited file {file_path}"
    except FileNotFoundError:
        return create_new_file(file_path, new_str)
```
```python
def create_new_file(file_path: str, content: str) -> str:
    dir_path = os.path.dirname(file_path)
    if dir_path and dir_path != ".":
        os.makedirs(dir_path, exist_ok=True)
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(content)
    return f"Successfully created file {file_path}"
```
```python
edit_file_tool = ToolDefinition(
    name="edit_file",
    description="Edit the contents of a file. Replaces 'old_str' with 'new_str'. 'old_str' and 'new_str' MUST be different. If the file with the path doesn't exist, it will be created.",
    parameters={
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": "path of the file to edit",
            },
            "old_str": {
                "type": "string",
                "description": "text to search for; must match exactly and appear exactly once",
            },
            "new_str": {
                "type": "string",
                "description": "text to replace old_str with",
            },
        },
        "required": ["file_path", "old_str", "new_str"],
    },
    function_to_call=edit_file_path,
)
```
Don't forget to pass this new `edit_file_tool` to the agent invocation:
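```python
if __name__ == "__main__":
    agent = Agent(client_instance=client, tool_definitions=[read_file_tool, edit_file_tool])
    agent.run()
```

And bam. You'll be able to have the following conversation: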
Chatting with deepseek-chat. Hit Ctrl+C to exit.
You: hey deepseek, create fizzbuzz.js that I can run with Nodejs and that has fizzbuzz in it and executes it
Tool: edit_file({"path":"fizzbuzz.js","old_str":"","new_str":"// FizzBuzz implementation in Node.js\nfunction fizzbuzz(n) {\n for (let i = 1; i <= n; i++) {\n if (i % 15 === 0) {\n console.log('FizzBuzz');\n } else if (i % 3 === 0) {\n console.log('Fizz');\n } else if (i % 5 === 0) {\n console.log('Buzz');\n } else {\n console.log(i);\n }\n }\n}\n\n// Execute FizzBuzz for numbers 1 to 100\nfizzbuzz(100);"})
deepseek-chat: I've created a file named `fizzbuzz.js` with a FizzBuzz implementation in Node.js. Here's what it contains:
```javascript
// FizzBuzz implementation in Node.js
function fizzbuzz(n) {
for (let i = 1; i <= n; i++) {
if (i % 15 === 0) {
console.log('FizzBuzz');
} else if (i % 3 === 0) {
console.log('Fizz');
} else if (i % 5 === 0) {
console.log('Buzz');
} else {
console.log(i);
}
}
}
// Execute FizzBuzz for numbers 1 to 100
fizzbuzz(100);
```
To run it, simply execute the following command in your terminal:
```bash
node fizzbuzz.js
```
This will print the FizzBuzz sequence from 1 to 100. Let me know if you'd like any modifications!
You: Please edit fizzbuzz.js so that it only prints until 15
Tool: edit_file({"path":"fizzbuzz.js","old_str":"// Execute FizzBuzz for numbers 1 to 100\nfizzbuzz(100);","new_str":"// Execute FizzBuzz for numbers 1 to 15\nfizzbuzz(15);"})
deepseek-chat: The `fizzbuzz.js` file has been updated to print the FizzBuzz sequence only until 15. Here's the updated content:
```javascript
// FizzBuzz implementation in Node.js
function fizzbuzz(n) {
for (let i = 1; i <= n; i++) {
if (i % 15 === 0) {
console.log('FizzBuzz');
} else if (i % 3 === 0) {
console.log('Fizz');
} else if (i % 5 === 0) {
console.log('Buzz');
} else {
console.log(i);
}
}
}
// Execute FizzBuzz for numbers 1 to 15
fizzbuzz(15);
```
Run it with:
```bash
node fizzbuzz.js
```
Let me know if you'd like any further changes!
You:
Exactly. You're speechless at the end.
Hopefully you've enjoyed this tutorial and actually followed along! The GitHub implementation has some quality-of-life improvements around logging, the tools defined in their own file to keep `main.py` clean, and a `list_files` tool that can execute `ls`.
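If you'd like to try that last one yourself before peeking at the repo, here's a minimal sketch (my guess at the shape; the repo's version may differ):

```python
def list_files_in_path(dir_path: str = ".") -> str:
    # Roughly what `ls` gives you: the entries in a directory, one per line.
    try:
        return "\n".join(sorted(os.listdir(dir_path)))
    except OSError as e:
        return f"Error listing {dir_path}: {e}"

list_files_tool = ToolDefinition(
    name="list_files",
    description="List the files in a directory",
    parameters={
        "type": "object",
        "properties": {
            "dir_path": {
                "type": "string",
                "description": "The directory to list; defaults to the current directory",
            }
        },
        "required": [],
    },
    function_to_call=list_files_in_path,
)
```

Add it to the `tool_definitions` list and your agent can explore a project on its own.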