2 | 2 | <feed xmlns="http://www.w3.org/2005/Atom"> |
3 | 3 | <id>/r/LocalLLaMA/.rss</id> |
4 | 4 | <title>LocalLlama</title> |
5 | | - <updated>2026-03-28T16:12:00+00:00</updated> |
| 5 | + <updated>2026-03-28T16:28:56+00:00</updated> |
6 | 6 | <link href="https://old.reddit.com/r/LocalLLaMA/" rel="alternate"/> |
7 | 7 | <generator uri="https://lkiesow.github.io/python-feedgen" version="1.0.0">python-feedgen</generator> |
8 | 8 | <icon>https://www.redditstatic.com/icon.png/</icon> |
9 | 9 | <subtitle>Subreddit to discuss locally hostable AI</subtitle> |
10 | 10 | <entry> |
11 | | - <id>t3_1s639zz</id> |
12 | | - <title>How are you solving agent-to-agent access control?</title> |
13 | | - <updated>2026-03-28T15:25:13+00:00</updated> |
| 11 | + <id>t3_1s63wpc</id> |
| 12 | + <title>Post your Favourite Local AI Productivity Stack (Voice, Code Gen, RAG, Memory etc)</title> |
| 13 | + <updated>2026-03-28T15:49:28+00:00</updated> |
14 | 14 | <author> |
15 | | - <name>/u/nightFlyer_rahl</name> |
16 | | - <uri>https://old.reddit.com/user/nightFlyer_rahl</uri> |
| 15 | + <name>/u/No-Paper-557</name> |
| 16 | + <uri>https://old.reddit.com/user/No-Paper-557</uri> |
17 | 17 | </author> |
18 | | - <content type="html"><!-- SC_OFF --><div class="md"><p><strong>Builders, how are you solving the access control problem for agents?</strong></p> <p>Context: I'm building <a href="https://github.com/GetBindu/Bindu">Bindu</a>, an operating layer for agents. The idea is any framework, any language - agents can talk to each other, negotiate, do trade. We use DIDs (decentralized identifiers) for agent identity. Communication is encrypted.</p> <p>But now I'm hitting a wall: <strong>agent trust.</strong></p> <p>Think about it. In a swarm, some agents should have more power than others. A high trust orchestrator agent should be able to:</p> <ul> <li>compress or manage the context window</li> <li>delegate tasks to lower trust worker agents</li> <li>control who can write to the database</li> </ul> <p>The low trust agents? They just do their job with limited scope. They shouldn't be able to escalate or pretend they have more access than they do.</p> <p>The DB part: sure, MCP and skills can handle that. But what about at the agent-to-agent level? How does one agent prove to another that it has the authority to delegate? How do you stop a worker agent from acting like an orchestrator?</p> <p>In normal software we'd use Keycloak or OAuth for this. But those assume human users, sessions, login flows. In the agent world, there are no humans — just bots talking to bots.</p> <p>What are you all doing for this? Custom solutions? Ignoring it? Curious what's actually working in practice.</p> <p><em>English is not my first language, I use AI to clean up grammar. If it smells like AI, that's the editing</em></p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/nightFlyer_rahl"> /u/nightFlyer_rahl </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s639zz/how_are_you_solving_agenttoagent_access_control/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s639zz/how_are_you_solving_agenttoagent_access_control/">[comments]</a></span></content> |
19 | | - <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s639zz/how_are_you_solving_agenttoagent_access_control/"/> |
| 18 | + <content type="html"><!-- SC_OFF --><div class="md"><p>Hi all,</p> <p>It seems like so many new developments are being released as OSS all the time, but I’d like to get an understanding of what you’ve personally found to work well.</p> <p>I know many people here run the newest open-source/open-weight models with llama.cpp or ollama etc., but I wanted to gather feedback on how you use these models for your productivity.</p> <p>1) Voice Conversations - If you’re using things like voice chat, how are you managing that? Previously I was recommended this solution: “Faster-whisper + LLM + Kokoro, tied together with LiveKit, is my local voice agent stack. I’ll share it if you want and you can just copy the setup.”</p> <p>2) Code generation - What’s your best option at the moment? E.g., are you using Open Code or something else? Are you managing this with llama.cpp, and does tool calling work?</p> <p>3) Any other enhancements - RAG, memory, web search, etc.</p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/No-Paper-557"> /u/No-Paper-557 </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s63wpc/post_your_favourite_local_ai_productivity_stack/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s63wpc/post_your_favourite_local_ai_productivity_stack/">[comments]</a></span></content> |
| 19 | + <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s63wpc/post_your_favourite_local_ai_productivity_stack/"/> |
20 | 20 | <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
21 | | - <published>2026-03-28T15:25:13+00:00</published> |
| 21 | + <published>2026-03-28T15:49:28+00:00</published> |
22 | 22 | </entry> |
23 | 23 | <entry> |
24 | 24 | <id>t3_1s5g1v9</id> |
34 | 34 | <published>2026-03-27T20:43:24+00:00</published> |
35 | 35 | </entry> |
36 | 36 | <entry> |
37 | | - <id>t3_1s60en2</id> |
38 | | - <title>Does it make sense to use 4x32Gb RAM or 2x64Gb is the only reasonable option?</title> |
39 | | - <updated>2026-03-28T13:30:32+00:00</updated> |
| 37 | + <id>t3_1s64eux</id> |
| 38 | + <title>Which is better: one highly capable LLM (100+B) or many smaller LLMs (>20B)</title> |
| 39 | + <updated>2026-03-28T16:08:30+00:00</updated> |
40 | 40 | <author> |
41 | | - <name>/u/Real_Ebb_7417</name> |
42 | | - <uri>https://old.reddit.com/user/Real_Ebb_7417</uri> |
| 41 | + <name>/u/More_Chemistry3746</name> |
| 42 | + <uri>https://old.reddit.com/user/More_Chemistry3746</uri> |
43 | 43 | </author> |
44 | | - <content type="html"><!-- SC_OFF --><div class="md"><p>Hi, I currently own:</p> <p>GPU: RTX5080</p> <p>CPU: AMD 9950 x3d</p> <p>RAM: 2x32Gb DDR5 6000MT/s 30CL</p> <p>Aaaaand I'd like to slowly gear up to be able to run bigger models OR run them faster. Obviously GPU is an important factor here (and I'm planning to change it to RTX5090), but the immediate and cheaper upgrade is to increase my RAM.</p> <p>I could buy 2x64Gb instead of my current 2x32Gb (but with worse stats, 2x64Gb are hard to get now and almost nonexistant with 6000MT/s. I found some available with 5600MT/s and 40CL though)... But changing my RAM to 2x64Gb, while probably better, is also much more expensive.</p> <p>Another option is to buy the same 2x32Gb that I currently have and put it next to my current RAM. (my motherboard has 4 sockets)</p> <p>But I wonder how much it might slow down interference for models that are partially offloaded to RAM? As far as I understand, it might slow the RAM down (not sure how exactly it works, I'm not good at hardware xd), but I also don't know if it will be an issue in case of running models or playing video games (two things I care about on that PC). Maybe the bottleneck is actually somewhere else and runnning 4x32GB RAM instead of 2x64Gb won't give me any noticeable difference?</p> <p>So... do you know if it's worth trying? Or I should totally abandon this cheaper idea and go for 2x64Gb with worse parameters?</p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/Real_Ebb_7417"> /u/Real_Ebb_7417 </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s60en2/does_it_make_sense_to_use_4x32gb_ram_or_2x64gb_is/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s60en2/does_it_make_sense_to_use_4x32gb_ram_or_2x64gb_is/">[comments]</a></span></content> |
45 | | - <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s60en2/does_it_make_sense_to_use_4x32gb_ram_or_2x64gb_is/"/> |
| 44 | + <content type="html"><!-- SC_OFF --><div class="md"><p>I'm thinking about either having multiple PCs that run smaller models, or one powerful machine that can run a large model. Let's assume both the small and large models run in Q4 with sufficient memory and good performance.</p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/More_Chemistry3746"> /u/More_Chemistry3746 </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s64eux/which_is_better_one_highly_capable_llm_100b_or/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s64eux/which_is_better_one_highly_capable_llm_100b_or/">[comments]</a></span></content> |
| 45 | + <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s64eux/which_is_better_one_highly_capable_llm_100b_or/"/> |
46 | 46 | <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
47 | | - <published>2026-03-28T13:30:32+00:00</published> |
| 47 | + <published>2026-03-28T16:08:30+00:00</published> |
48 | 48 | </entry> |
49 | 49 | <entry> |
50 | 50 | <id>t3_1s57ky1</id> |
59 | 59 | <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
60 | 60 | <published>2026-03-27T15:37:04+00:00</published> |
61 | 61 | </entry> |
62 | | - <entry> |
63 | | - <id>t3_1s5yv7o</id> |
64 | | - <title>Running my own LLM as a beginner, quick check on models</title> |
65 | | - <updated>2026-03-28T12:20:55+00:00</updated> |
66 | | - <author> |
67 | | - <name>/u/PiratesOfTheArctic</name> |
68 | | - <uri>https://old.reddit.com/user/PiratesOfTheArctic</uri> |
69 | | - </author> |
70 | | - <content type="html"><!-- SC_OFF --><div class="md"><p>Hi everyone</p> <p>I'm on a laptop (Dell XPS 9300, 32gb ram / 2tb drive, linux mint), don't plan to change it anytime soon.</p> <p>I'm tip toeing my way into the llm, and would like to sense check the models I have, they were suggested by claude when asking about lightweight types, claude made the descriptions for me: </p> <p>llama.cpp<br /> Openweb UI</p> <p>Models:<br /> Qwen2.5-Coder 3B Q6_K - DAILY: quick Python, formulas, fast answers<br /> Qwen3.5-9B Q6_K - DEEP: complex financial analysis, long programs<br /> Gemma 3 4B Q6_K - VISION: charts, images, screenshots<br /> Phi-4-mini-reasoning Q6_K - CHECK: verify maths and logic</p> <p>At the moment, they are working great, response times are reasonably ok, better than expected to be honest!</p> <p>I'm struggling (at the moment) to fully understand, and appreciate the different models on huggingface, and wondered, are these the most 'lean' based on descriptions, or should I be looking at swapping any? I'm certainly no power user, the models will be used for data analysis (csv/ods/txt), python programming and to bounce ideas off.</p> <p>Next week I'll be buying a dummies/idiot guide. 30 years IT experience and I'm still amazed how much and quick systems have progressed!</p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/PiratesOfTheArctic"> /u/PiratesOfTheArctic </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/">[comments]</a></span></content> |
71 | | - <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/"/> |
72 | | - <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
73 | | - <published>2026-03-28T12:20:55+00:00</published> |
74 | | - </entry> |
75 | 62 | <entry> |
76 | 63 | <id>t3_1s5z0kx</id> |
77 | 64 | <title>Local LLM evaluation advice after DPO on a psychotherapy dataset</title> |
124 | 111 | <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
125 | 112 | <published>2026-03-28T14:13:57+00:00</published> |
126 | 113 | </entry> |
| 114 | + <entry> |
| 115 | + <id>t3_1s5yv7o</id> |
| 116 | + <title>Running my own LLM as a beginner, quick check on models</title> |
| 117 | + <updated>2026-03-28T12:20:55+00:00</updated> |
| 118 | + <author> |
| 119 | + <name>/u/PiratesOfTheArctic</name> |
| 120 | + <uri>https://old.reddit.com/user/PiratesOfTheArctic</uri> |
| 121 | + </author> |
| 122 | + <content type="html"><!-- SC_OFF --><div class="md"><p>Hi everyone,</p> <p>I'm on a laptop (Dell XPS 9300, 32GB RAM / 2TB drive, Linux Mint) and don't plan to change it anytime soon.</p> <p>I'm tiptoeing my way into LLMs and would like to sense-check the models I have; they were suggested by Claude when I asked about lightweight types, and Claude made the descriptions for me: </p> <p>llama.cpp<br /> Open WebUI</p> <p>Models:<br /> Qwen2.5-Coder 3B Q6_K - DAILY: quick Python, formulas, fast answers<br /> Qwen3.5-9B Q6_K - DEEP: complex financial analysis, long programs<br /> Gemma 3 4B Q6_K - VISION: charts, images, screenshots<br /> Phi-4-mini-reasoning Q6_K - CHECK: verify maths and logic</p> <p>At the moment they are working great; response times are reasonably OK, better than expected to be honest!</p> <p>I'm struggling (at the moment) to fully understand and appreciate the different models on Hugging Face, and wondered: are these the most 'lean' based on the descriptions, or should I be looking at swapping any? I'm certainly no power user; the models will be used for data analysis (csv/ods/txt), Python programming, and to bounce ideas off.</p> <p>Next week I'll be buying a dummies/idiot guide. 30 years of IT experience and I'm still amazed how much and how quickly systems have progressed!</p> </div><!-- SC_ON --> &#32; submitted by &#32; <a href="https://old.reddit.com/user/PiratesOfTheArctic"> /u/PiratesOfTheArctic </a> <br /> <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/">[link]</a></span> &#32; <span><a href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/">[comments]</a></span></content> |
| 123 | + <link href="https://old.reddit.com/r/LocalLLaMA/comments/1s5yv7o/running_my_own_llm_as_a_beginner_quick_check_on/"/> |
| 124 | + <category term="LocalLLaMA" label="r/LocalLLaMA"/> |
| 125 | + <published>2026-03-28T12:20:55+00:00</published> |
| 126 | + </entry> |
127 | 127 | <entry> |
128 | 128 | <id>t3_1s5plrv</id> |
129 | 129 | <title>Anyway to get close to GPT4o on a local model (I know it’s a dumb question)</title> |