-
Hello, I'm starting to use the Perplexity API to integrate information-query functionality into an ERP. As I'm just beginning, I suspect I might be doing something wrong. The issue is that a simple question like "Is there any module related to invoicing in Colombia available on Dolistore?" returns a very interesting response from the web UI, mentioning and commenting on several modules offered on Dolistore. However, when I use the API, it returns a paragraph that "doesn't say anything useful", like this:
Am I doing something wrong? The endpoints I've tried are: …
Additionally, I'd like to ask how I can obtain the "sources" or "citations" used for the response. I have this variable defined in the API call, but it never returns anything:
By the way, other variables I use:
The system prompt is the typical one for any "chat completion" endpoint. That's not wrong, right? Should I be using a "search prompt" or a special format for system and user messages? I haven't found much relevant information in the API documentation, which is why I'm writing here. Thanks in advance for any help.
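Roughly, the kind of call being described looks like the sketch below. It is only an illustration: the PPLX_API_KEY variable name, the model identifier, and the return_citations flag are assumptions based on the API reference at the time, not the exact code used here.

```python
import os
import requests

# The key is read from an environment variable; the variable name is arbitrary.
API_KEY = os.environ["PPLX_API_KEY"]

payload = {
    # Model name as listed in the docs at the time; adjust to whatever is current.
    "model": "llama-3.1-sonar-large-128k-online",
    "messages": [
        # A plain chat-completion style system prompt, as mentioned above.
        {"role": "system", "content": "Be precise and concise."},
        {
            "role": "user",
            "content": (
                "Is there any module related to invoicing in Colombia "
                "available on Dolistore?"
            ),
        },
    ],
    # Assumed flag for getting sources back; check the current API reference.
    "return_citations": True,
}

response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
data = response.json()

print(data["choices"][0]["message"]["content"])
# If citations come back at all, they appear as a top-level list of URLs.
print(data.get("citations", []))
```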
-
After writing my initial post, I revisited the API reference regarding this matter and discovered some mistakes I was making. Upon fixing these issues, the results began to more closely resemble those offered by the web UI:
Changing the following parameter from 0.5 to 0.9 made the answer seem "a bit more diverse" (it became more proactive in adding things not directly requested, which is fine for me):
However, the real magic happened when I omitted these two parameters I had been sending:
I first removed the search_domain_filter, but the resulting answer to my query barely changed. Then, when I omitted the second one as well, the answer improved noticeably. I hope this helps other people. Please feel free to add any other discoveries you've made in this regard.
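Roughly, the shape of the change described above is the following. This is a sketch only: top_p stands in for whichever sampling parameter was at 0.5, and the dolistore.com filter value is just an illustrative example, not the actual payload.

```python
messages = [
    {"role": "system", "content": "Be precise and concise."},
    {"role": "user", "content": "Is there any module related to invoicing "
                                "in Colombia available on Dolistore?"},
]

payload = {
    "model": "llama-3.1-sonar-large-128k-online",
    "messages": messages,
    # Placeholder: raise whichever sampling parameter you had at 0.5 up to 0.9.
    "top_p": 0.9,
    # Parameters no longer sent at all (kept here only as comments):
    # "search_domain_filter": ["dolistore.com"],  # removing this barely changed the answer
    # the second omitted parameter is the one whose removal made the real difference
}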
-
That's very interesting. We ran into the exact same issue, and it would be ideal if the API were brought closer to parity with the web app.
-
The answers you get in the UI will likely always be better than the answers from the API. That's not only intentional; it's in Perplexity's financial interest. If their API gave the same results as their UI, there would be no reason to use their UI rather than a competitor wrapping their API in a better interface. Several moderators on their Discord have also officially stated that there is no intention of matching the output quality of the API to the UI. From a technical standpoint, compared to the UI, the API is not only heavily restricted but also lacks the majority of the UI's features:
While this is mainly my opinion, I can't be the only one to have noticed the lack of attention the API has been getting recently. They originally released the API a year ago, in October 2023, advertising it as a "one-stop shop for open-source LLMs" like Mistral and Llama-2 with "blazing fast inference". All of the open-source models have since been removed except for Llama 3.1, and their inference speed doesn't compare to emerging competitors like Groq at this point. They haven't made a blog post about the API since November 2023, when they announced their first Online LLM, and even the API wiki's changelog has been silent since July 2024, and even then the entry only announced updated fine-tuned models. I really like their API; it's just clear at this point that their priority is the UI, and if you want a similar experience to that via the API, you probably need to look elsewhere.
-
The answers you get from the API should closely resemble the default search answers in the UI. The API uses the same search subsystem as the UI with small differences in configuration. However, you could be seeing differences between the API and UI due to the following reasons:
- Pro Search: The API doesn't support Pro Search today. Pro Search uses a multi-step reasoning process which increases the quality of the answer.
- Using third-party models: At this time, the API only supports the Sonar models. Using other third-party models like GPT-4o/Sonnet 3.5 in the UI could lead to diverging results.
- Tuning of sampling parameters (presence_penalty, top_p, etc.) and system prompt in the API: Our defaults are tuned to give the best results from the API and match the default search experience in the UI. We give users the power to tune the API to their respective use cases, and custom tuning to specific use cases might lead to less generalization of the API and different results vs the UI. We recommend not explicitly providing sampling parameters in your API requests if you want parity with the default experience in the UI.
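As an illustration of that last recommendation, a minimal request with no sampling parameters might look like the sketch below; only model and messages are set, so the API's tuned defaults apply. The "sonar" model name and PPLX_API_KEY variable are assumptions, not part of the official guidance.

```python
import os
import requests

# Minimal request: only model and messages, so the API's default sampling applies.
payload = {
    "model": "sonar",  # assumed current Sonar model identifier
    "messages": [
        {
            "role": "user",
            "content": "Is there any module related to invoicing in Colombia "
                       "available on Dolistore?",
        },
    ],
}

response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```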
-
@shubhang98 As of Dec 16th, 2024, the Perplexity APIs are still lacking in terms of features and quality compared to the UI search, even in the free tier. The two features most relevant to me:
Is there a roadmap to bring the API services on par with the UI search? I would appreciate a response from the Perplexity team. Thanks a lot!
-
Would also appreciate any update on this. For my use case, given the same prompt, the UI search is always spot on, while the API search is usually far from that. I was hoping the new …
-
I integrated Sonar & Sonar PRO with one of my software applications yesterday, and I did some testing. I agree with you: the responses haven't improved.
It was certainly a great improvement when, a few months ago, the API started returning the "citations" array. That has already allowed me to give a more professional touch and a better user experience to my software integrated with "Perplexity searches." But yesterday I ran a couple of simple tests with a prompt like "make me a comparison of the latest models with reasoning abilities, even if they're experimental or Chinese," and the results returned by the API are quite confusing. In fact, I couldn't say whether they're "worse" than those returned by PPLX's official web UI, because every time I run the query in either environment the response varies quite a bit in format and content. What does seem apparent is that in the official UI the result is more "thoughtful" and the response is more elaborate, in terms of understanding the question and the expected answer. The API response, on the other hand, lists models that aren't reasoning models and even mentions very old models like GPT-3!?
Another very strange thing: I haven't noticed ANY DIFFERENCE between the Sonar and Sonar PRO responses, neither in speed, format, nor content. Even stranger: for some prompts, Sonar's response is better than Sonar PRO's?!? I hope someone from PPLX reads this and checks whether there's a problem with that implementation... I would say it's always the same model responding. Could that be? Regards!
Note: in any case, I remain a firm enthusiast of Perplexity's value and the great work their team has done so far. I never tire of recommending it and bringing them new fans :-)
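To reproduce the comparison, something along these lines can send the same prompt to both models and compare what comes back. This is only a sketch: the "sonar" / "sonar-pro" identifiers and the PPLX_API_KEY variable are assumptions, not the poster's actual integration code.

```python
import os
import requests

PROMPT = ("Make me a comparison of the latest models with reasoning abilities, "
          "even if they're experimental or Chinese.")


def ask(model: str) -> dict:
    """Send the same prompt to the given model and return the parsed JSON response."""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
        json={"model": model, "messages": [{"role": "user", "content": PROMPT}]},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()


for model in ("sonar", "sonar-pro"):
    data = ask(model)
    print(f"=== {model} ===")
    print(data["choices"][0]["message"]["content"][:500])  # first part of the answer
    print("citations:", len(data.get("citations", [])))
```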
-
I've noticed similar results. Even with DeepSeek now integrated into the sonar-reasoning model, I only get a couple of citations from the API, whereas I get 10+ in the UI. I've even added an instruction to the prompt to specifically give me 10 or more citations, and it refuses. Search is a large part of what I want to build; is anyone having better luck with search through the API?
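A quick way to see how many citations actually come back is something like the sketch below. The sonar-reasoning model name, the example query, and the PPLX_API_KEY variable are assumptions; the citation list, when present, is a top-level array of URLs in the response, and its length appears to be decided by the search backend rather than by anything in the prompt.

```python
import os
import requests

response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
    json={
        "model": "sonar-reasoning",
        "messages": [{"role": "user", "content": "What are the latest invoicing "
                                                 "regulations in Colombia?"}],
    },
    timeout=120,
)
response.raise_for_status()

# Count and list the sources the API actually returned for this query.
citations = response.json().get("citations", [])
print(f"{len(citations)} citations returned")
for url in citations:
    print("-", url)
```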
-
Same trouble here. It seems even worse with the new Sonar model; it's close to useless at this point. I've also noticed that the quality of the citations provided by the API is much lower than in the UI, especially considering that the API bases its text on about 10 citations while the UI can go above 40... @shubhang98 Some guidance on how to get closer to the Perplexity experience would be highly appreciated. Thanks in advance.