Python + AI Weekly Office Hours: Recordings & Resources #280
Replies: 40 comments
2026/01/06: How do you set up Entra OBO (On-Behalf-Of) flow for Python MCP servers? 📹 5:48 The demo showed how to use the Graph API with the OBO flow to look up the signed-in user's group memberships and use them to decide whether to allow access to a particular tool. The flow works as follows:
For the authentication dance, FastMCP handles the DCR (Dynamic Client Registration) flow since Entra itself doesn't support DCR natively. To test from scratch:
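The token exchange at the heart of the OBO flow can be sketched with plain stdlib HTTP calls. This is not code from the demo: the tenant/client values are placeholders, and the `tool_allowed` gate is a hypothetical helper showing how group IDs would gate a tool.

```python
import json
import urllib.parse
import urllib.request

TENANT_ID = "YOUR_TENANT_ID"        # placeholder
CLIENT_ID = "YOUR_APP_ID"           # placeholder
CLIENT_SECRET = "YOUR_APP_SECRET"   # placeholder

def obo_exchange(incoming_token: str) -> str:
    """Exchange the caller's access token for a Graph token via the OBO grant."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "assertion": incoming_token,
        "scope": "https://graph.microsoft.com/.default",
        "requested_token_use": "on_behalf_of",
    }).encode()
    url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)["access_token"]

def user_group_ids(graph_token: str) -> list[str]:
    """List the signed-in user's group object IDs from Microsoft Graph."""
    req = urllib.request.Request(
        "https://graph.microsoft.com/v1.0/me/memberOf",
        headers={"Authorization": f"Bearer {graph_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return [group["id"] for group in json.load(resp)["value"]]

def tool_allowed(group_ids: list[str], required_group: str) -> bool:
    """Gate access to an MCP tool on group membership."""
    return required_group in group_ids
```

In a real server you would call `obo_exchange` with the token FastMCP validated, fetch the groups once per request, and check `tool_allowed` inside each protected tool.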
Links shared:
2026/01/06: Which MCP inspector should I use for testing servers with Entra authentication? 📹 20:24 The standard MCP Inspector doesn't work well with Entra authentication because it doesn't do the DCR (Dynamic Client Registration) dance properly. MCP Jam is recommended instead because it properly handles the OAuth flow with DCR. To set it up:
MCP Jam also has nice features like:
One note: enum values in tools don't yet show as dropdowns in MCP Jam (an issue will be filed). Links shared:
What's the difference between MCP Jam and LM Studio? 📹 34:19 LM Studio is primarily for playing around with LLMs locally. MCP Jam overlaps somewhat, since it also includes a chat interface with access to models, but its main purpose is to help you develop MCP servers and apps. It's focused on the development workflow rather than just chatting with models.
2026/01/06: How do you track LLM usage tokens and costs? 📹 28:04 For basic tracking, Azure portal shows metrics for token usage in your OpenAI accounts. You can see input tokens and output tokens in the metrics section. You can also:
If you use multiple providers, you need a way to consolidate the tracking. OpenTelemetry metrics could work, but you'd need a way to hook into each system.
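A consolidation layer can start as simple as an in-process tracker that every provider call reports into. This is a sketch, not a recommendation of any specific tool; the model name and per-token prices are made up for illustration.

```python
# Illustrative per-token prices; real prices vary by provider and change over time.
PRICES = {"gpt-4o-mini": {"input": 0.15 / 1e6, "output": 0.60 / 1e6}}  # $ per token

class UsageTracker:
    """Accumulate token usage across providers/models in one place."""

    def __init__(self) -> None:
        self.totals: dict[str, dict[str, int]] = {}

    def record(self, model: str, input_tokens: int, output_tokens: int) -> None:
        # Call this after each LLM response, using the usage block the API returns.
        totals = self.totals.setdefault(model, {"input": 0, "output": 0})
        totals["input"] += input_tokens
        totals["output"] += output_tokens

    def cost(self) -> float:
        # Only models with known prices are counted toward the total.
        return sum(
            t["input"] * PRICES[m]["input"] + t["output"] * PRICES[m]["output"]
            for m, t in self.totals.items()
            if m in PRICES
        )
```

Each provider's SDK returns usage in a slightly different shape, so the `record` call is where you would normalize them.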
2026/01/06: How do you keep yourself updated with all the new changes related to AI? 📹 30:32 Several sources recommended:
Particularly recommended:
Links shared:
2026/01/06: How do you build a Microsoft Copilot agent in Python with custom API calls? 📹 36:30 For building agents that work with Microsoft 365 Copilot (which appears in Windows Copilot and other Microsoft surfaces):
The agent framework team is responsive if there are issues. Links shared:
2026/01/06: As a backend developer with a non-CS background, how do I learn about AI from scratch? 📹 46:39 Recommended approach:
Links shared:
2026/01/06: What's new with the RAG demo (azure-search-openai-demo) after the SharePoint data source was added? 📹 49:50 The main work is around improving ACL (Access Control List) support. The cloud ingestion feature was added recently, but it doesn't yet support ACLs. The team is working on making ACLs compatible with all features including:
A future feature idea: adding an MCP server to the RAG repo for internal documentation use cases, leveraging the Entra OBO flow for access control.
2026/01/06: Do you think companies will create internal MCP servers for AI apps to connect to? 📹 53:53 Yes, this is already happening quite a bit. Common use cases include:
A particularly valuable use case is data science/engineering teams creating MCP servers that enable less technical folks (marketing, PMs, bizdev) to pull data safely without needing to write SQL. The pattern often starts with an engineer building an MCP server for themselves, sharing it with colleagues, adding features based on their needs, and growing from there. Links shared:
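A sketch of that "pull data safely without SQL" pattern: the server owns a whitelist of parameterized queries, and users only pick a report name. The table and report names here are hypothetical; in practice you would expose `run_report` as an MCP tool (e.g. via FastMCP's tool decorator) rather than call it directly.

```python
import sqlite3

# The server owns the SQL; users choose a report by name and supply parameters.
REPORTS = {
    "signups_by_day": (
        "SELECT day, COUNT(*) FROM signups WHERE day >= ? GROUP BY day ORDER BY day"
    ),
}

def run_report(db_path: str, report: str, *params) -> list[tuple]:
    """Run a whitelisted, parameterized query; reject anything else."""
    if report not in REPORTS:
        raise ValueError(f"unknown report: {report}")
    with sqlite3.connect(db_path) as conn:
        return conn.execute(REPORTS[report], params).fetchall()
```

Because only named reports run, users (or the LLM calling the tool) never inject raw SQL, which is what makes this safe to hand to non-technical teams.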
2026/01/13: What advantages do other formats have over .txt for prompts? How do you improve prompts with DSPy and evals? 📹 4:55 Prompty is a template format that mixes Jinja and YAML together. The YAML goes at the top for metadata, and the rest is Jinja templating. Jinja is the most common templating system for Python (used by Flask, etc.). The nice thing about Jinja is you can pass in template variables—useful for customization, passing in citations, etc. Prompty turns the file into a Python list of chat messages with roles and contents. However, we're moving from Prompty to plain Jinja files because:
Recommendation: Keep prompts separate from code when possible, especially long system prompts. Use plain .txt or .md if you don't need variables, or Jinja if you want to render variables. With agents and tools, some LLM-facing text (like tool descriptions in docstrings) will inevitably live in your code—that's fine. For iterating on prompts: Run evaluations, change the prompt, and see whether it improves things. There are tools like DSPy and Agent Framework's Lightning that do automated prompt optimization/fine-tuning. Lightning says it "fine-tunes agents" but may actually be doing prompt changes. Most of the time, prompt changes don't make a huge difference, but sometimes they might. Links shared:
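As a tiny sketch of the "plain Jinja" approach described above: render the template with variables, then build the chat message list yourself. The prompt text and variable names are invented for illustration; in a real project the template would live in its own .jinja file next to the code.

```python
import jinja2

# Hypothetical prompt that would normally live in its own .jinja file.
PROMPT_TEMPLATE = """You are a helpful assistant for {{ company }}.
Answer using only these sources:
{% for source in sources %}- {{ source }}
{% endfor %}"""

def build_messages(company: str, sources: list[str], question: str) -> list[dict]:
    """Render the system prompt with variables, then build the chat message list."""
    system_prompt = jinja2.Template(PROMPT_TEMPLATE).render(
        company=company, sources=sources
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
```

This is essentially what Prompty did for you; doing it with plain Jinja keeps the prompt file free of framework-specific metadata.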
2026/01/13: What is the future of AI and which specialization should I pursue? 📹 11:54 If you enjoy software engineering and full-stack work, the goal is to understand the models well enough to know why they behave the way they do, but the real craft is in how you build on top of them. There's lots of interesting stuff to learn, and it really depends on you and what you're most interested in doing.
2026/01/13: Which livestream series should I follow to build a project using several tools and agents, and should I use a framework? 📹 13:33 Everyone should understand tool calling before moving on to agents. From the original 9-part Python + AI series, start with tool calling, then watch the high-level agents overview. The upcoming six-part series in February will dive deeper into each topic, especially how to use Agent Framework. At the bare minimum, you should understand LLMs, tool calling, and agents. Then you can decide whether to do everything with just tool calling (you can do it yourself with an LLM that has tool calling) or use an agent framework like LangChain or Agent Framework if you think it has enough benefits for you. It's important to understand that agents are based on tool calling—it's the foundation of agents. The success or failure of agents depends on how well LLMs can use tool calling. Links shared:
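Since agents are built on tool calling, it helps to see the bare mechanics: you describe a function to the model, and when the model responds with a tool call, you dispatch it to the matching Python function yourself. The schema follows the common OpenAI-style `tools` format; the tool itself is a toy.

```python
import json

def get_weather(city: str) -> str:
    """A toy tool the model can request."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# The schema you would send to the model in the `tools` parameter:
TOOL_SCHEMAS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def execute_tool_call(name: str, arguments_json: str) -> str:
    """When the model responds with a tool call, run the matching function."""
    arguments = json.loads(arguments_json)
    return TOOLS[name](**arguments)
```

An agent framework is essentially this loop run repeatedly: send messages plus schemas, execute any requested tool calls, append the results, and ask the model again.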
2026/01/13: How does Azure manage the context window? How do I maintain a long conversation with a small context window? 📹 15:21 There are three general approaches:
With today's large context windows (128K, 256K tokens), it's often easier to just wait for an error and tell the user to start a new chat, or run summarization when the error occurs. This approach is the most portable across models, since every model should throw an error once you exceed its context window. Links shared:
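One of the classic approaches, trimming old messages to fit a budget, can be sketched in a few lines. The token estimate here is deliberately crude (words-based); real code would use a tokenizer such as tiktoken.

```python
def approx_tokens(message: dict) -> int:
    """Very rough token estimate; real code would use a tokenizer like tiktoken."""
    return len(message["content"].split()) * 4 // 3

def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the system prompt plus the newest messages that fit the token budget."""
    system, rest = messages[0], messages[1:]
    budget = max_tokens - approx_tokens(system)
    kept: list[dict] = []
    for message in reversed(rest):  # walk newest-first
        cost = approx_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return [system] + list(reversed(kept))
```

The system prompt is always preserved because dropping it silently changes the assistant's behavior mid-conversation.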
2026/01/13: How do we deal with context rot and how do we summarize context using progressive disclosure techniques? 📹 19:17 Read through Kelly Hong's (Chroma researcher) blog post on context rot. The key point is that even with a 1 million token context window, you don't have uniform performance across that context window. She does various tests to see when performance starts getting worse, including tests on ambiguity, distractors, and implications. A general tip for coding agents with long-running tasks: use a main agent that breaks the task into subtasks and spawns sub-agents for each one, where each sub-agent has its own focused context. This is the approach used by the LangChain Deep Agents repo. You can also look at how different projects implement summarization. LangChain's summarization middleware is open source—you can see their summary prompt and approach. They do approximate token counting and trigger summarization when 80% of the context is reached. Links shared:
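The 80% trigger described above for LangChain's summarization middleware can be approximated like this; the window size and the characters-per-token ratio are illustrative, not LangChain's actual numbers.

```python
CONTEXT_WINDOW = 128_000  # tokens; illustrative

def approx_token_count(messages: list[dict]) -> int:
    # ~4 characters per token is a common rough approximation for English text
    return sum(len(m["content"]) for m in messages) // 4

def should_summarize(messages: list[dict], threshold: float = 0.8) -> bool:
    """Trigger summarization once the conversation nears the context window."""
    return approx_token_count(messages) >= threshold * CONTEXT_WINDOW
```

Triggering well before the hard limit leaves room for the summarization call itself, which also consumes context.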
How do I deal with context issues when using the Foundry SDK with a single agent? 📹 25:03 If you're using the Foundry SDK with a single agent (hosted agent), you can implement something like middleware through hooks or events. Another approach is the LangChain Deep Agents pattern: implement sub-agents as tools where each tool has a limited context and reports back a summary of its results to the main agent. For the summarization approach with Foundry agents, you'd need to figure out what events, hooks, or middleware systems they have available.
2026/01/13: Have you seen or implemented anything related to AG-UI or A2UI? 📹 29:02 AG-UI (Agent User Interaction Protocol) is an open standard introduced by the CopilotKit team that standardizes how front-end applications communicate with AI agents. Both Pydantic AI and Microsoft Agent Framework have support for AG-UI—they provide adapters to convert messages to the AG-UI format. The advantage of standardization is that if people agree on a protocol between backend and frontend, you can build reusable front-end components that understand how to use that backend. Agent Framework also supports different UI event stream protocols, including Vercel AI (though Vercel is a competitor, so support may be limited). These are adapters—you can always adapt output into another format if needed, but it's nice when it's built in. A2UI was created by Google together with a consortium including CopilotKit, and it relates to A2A (Agent-to-Agent). A2UI appears to be newer, with less support currently in Agent Framework, though A2A is supported. Links shared:
2026/01/20: What do you think about coding agents like Claude Code, OpenCode, and Antigravity for complex tasks? 📹 55:54 These coding agents use patterns like creating plans, dividing tasks into steps, implementing and testing separately, and storing progress in files. There's been talk of "gas town" approaches with lots of agents working in parallel. For complex tasks, I prefer to stay really involved because I need to review the code afterwards. If I don't understand why some code is there, that's a problem. I typically use a really long thread and maybe make a plan in a markdown file first. For personal/utility software (like the presentation write-up tool), it's fine to let agents write code you don't fully review, since you only care about the output. But for maintainable code that others will look at, every line needs to make sense. The models are getting better with large context windows; it seems like you might not need an explicit planning phase or to-do list. Good models plus some summarization might be enough. This pattern is related to SDD (Spec-Driven Development).
2026/01/20: Who maintains spec-kit now that Den Delimarsky left Microsoft? 📹 59:56 Den Delimarsky left Microsoft to join Anthropic, where he's now working directly on MCP. His blog is still great for MCP-related content. John Lam is now the maintainer/contact for spec-kit. Links shared:
2026/01/20: News & Updates
- GitHub Copilot SDK
- OpenCode support for GitHub Copilot
- Python Agent Framework breaking changes
- What Pamela is working on
2026/01/20: Upcoming Events
2026/01/27: What is MCP Apps support in VS Code? 📹 0:13 MCP Apps (previously known as MCP UI) is the first official MCP extension. While not technically part of the core Model Context Protocol, it is an official extension that allows MCP servers to pass down rich interfaces instead of just text or binary files. The idea is that sometimes you need richer interaction than just text—you want actual user interfaces. This appears to work via iframes where you pass down an iframe with width, height, etc. This could be the future of the web—instead of going to websites, we might do everything through agentic chat interfaces (like Microsoft Copilot, Claude, ChatGPT) with rich interaction coming into the chat terminal using MCP Apps. Kent C. Dodds has a great talk called "The Future of User Interaction" discussing this vision. Links shared:
Will MCP Apps be available in Teams/Copilot? 📹 1:35 Unknown at this time. VS Code is the most feature-complete MCP client out there, which is why it often supports new features immediately. Other clients, including Teams Copilot, tend to lag behind in MCP support. There may also be additional security considerations for Teams/Copilot, though if the apps are iframed, that should relieve many security concerns.
Is there a Python version of MCP Apps? 📹 3:12 Not yet. The current MCP Apps playground repository is JavaScript-based. A Python version would need to be created. The FastMCP SDK doesn't appear to have support for apps yet—no open issues were found for MCP Apps or MCP UI support. Update: FastMCP maintainer Jeremiah Lowin confirmed they've been waiting to see the spec and absolutely want to add support; they're currently doing design work on it.
2026/01/27: What's new in VS Code Insiders? 📹 7:58 Follow Pierce Boggan for the latest VS Code Insiders updates. Recent additions include:
Links shared:
2026/01/27: What is the GitHub Copilot SDK and CLI? 📹 11:07 The GitHub Copilot SDK is a programmatic way to access GitHub Copilot. Key benefits:
The limitation is that you're currently restricted in model selection (e.g., can't use Opus, only Sonnet). The GitHub Copilot CLI requires the CLI to be installed. From the CLI you can:
For an example of using the SDK to analyze pull requests, see the write-up from the January 26 live stream. Links shared:
2026/01/27: What developer hackathons are coming up? 📹 19:06 Two hackathons were mentioned:
Links shared:
2026/01/27: Is this a good place to ask about Microsoft Foundry SDK or Agent Framework SDK? 📹 20:53 Yes, you can ask questions about these here. The upcoming Python + Agents series will be diving deep into the Agent Framework SDK (which sometimes wraps the Foundry SDK). If you haven't registered for the agent series yet, definitely do that—it will cover the basics and go deeper into Agent Framework. Links shared:
2026/01/27: Are you using Spec-Driven Development (SDD) or SpecKit to guide coding agents? 📹 21:59 Pamela had not used SpecKit before. For bigger projects, her approach has been to either:
SpecKit seems good if you really know what you want versus being more experimental. Den, the original creator of SpecKit, has moved to Anthropic, but there is a new maintainer. Update: Pamela tried out SpecKit in a livestream later that day. It worked pretty well, but may not be as necessary with newer models and GitHub Copilot features. Links shared:
2026/01/27: What's new with the RAG demo - ACL support? 📹 25:30 A new release was just made for the azure-search-openai-demo repo adding ACL (Access Control List) support for the cloud ingestion pipeline. This enables document-level security filtering in Azure AI Search. How it works:
AI Search has built-in understanding for ACLs. You set up fields for user IDs and group IDs, mark them as such, and enable access control enforcement on the index. Links shared: Can the RAG repo support ACLs from other identity providers like Okta? 📹 31:01 Yes, but it requires custom implementation. You would need to:
AI Search only has built-in support for Entra. For other IDPs, you'd implement it similarly to how the repo worked before the built-in Entra support was added: by passing along the token and checking permissions manually.
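For the manual approach, the per-request security trimming typically becomes an OData filter over ACL fields stored on each document. The field names `oids`/`groups` below follow the pattern commonly used with Entra IDs; for Okta or another IDP you would adapt them to your index schema.

```python
def build_security_filter(user_id: str, group_ids: list[str]) -> str:
    """Build an Azure AI Search OData filter that restricts results to
    documents whose ACL fields include the calling user or their groups.
    Field names `oids`/`groups` are illustrative and must match your index."""
    groups = ", ".join(group_ids)
    return (
        f"oids/any(id: id eq '{user_id}') "
        f"or groups/any(g: search.in(g, '{groups}'))"
    )
```

You would resolve `user_id` and `group_ids` from the IDP token on each request and pass the resulting string as the search query's `filter` parameter.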
2026/01/27: Have you tried memory tools in GitHub Copilot? 📹 36:16 Pamela has not used memory tools in GitHub Copilot, but recently experimented with memory in Microsoft Copilot. To save a memory, you need to cue it up with "remember", for example: "remember to never call me Pam, only call me Pamela." You can view saved memories under Settings > Personalization and Memories > Manage Saved Memories. Without explicitly saying "remember," it doesn't seem to save memories automatically, even across many conversations. For VS Code, there are MCP servers for memory, though it's unclear which ones work well.
2026/01/27: When should I use Foundry IQ knowledge bases vs MCP tools? 📹 45:25 Foundry IQ (the new name for Azure AI Search capabilities in Azure AI Foundry) provides:
MCP as a knowledge source is in private preview. If you want to use something like Elastic's MCP server as a knowledge source in Foundry IQ, you could request access to the private preview. Additionally, there is an MCP endpoint for knowledge bases—meaning that if you create a knowledge base in AI Search, you can use it as an MCP server. Note on naming: Azure Search → Azure Cognitive Search → Azure AI Search → Foundry IQ. The underlying Bicep/ARM resources still use "search services."
2026/01/27: What tools do you use to automate developer workflows? 📹 38:48 Several approaches are being used:
The flexibility is notable—you can pick whatever form of programmatic manipulation you want: custom agents, skills with prompts, Python scripts, or agent frameworks. Different approaches suit different needs. One interesting use case: converting presentations to writeups, including automatic slide-to-timestamp alignment (which LLMs handle surprisingly well). Links shared:
2026/01/27: What is Work IQ? 📹 17:09 Work IQ is a new command line tool and MCP server from Microsoft. It has read-only access to your Outlook, email, and Teams. You can:
Example uses:
The read-only access limits some usefulness, but it's helpful for information retrieval. Links shared:
Each week, we hold office hours about all things Python + AI in the Foundry Discord.
Join the Discord here: http://aka.ms/aipython/oh
This thread lists the recording of each office hours session, along with any other resources that come out of the sessions. The questions and answers are automatically posted (based on the transcript) as comments in this thread.
January 27, 2026
Topics covered:
January 20, 2026
Topics covered:
January 13, 2026
Topics covered:
January 6, 2026
Topics covered: