Based on our analysis, Bing Chat consistently outperformed ChatGPT and Bard. Aside from its clunky user experience, Bing Chat is a very impressive tool.
Many of us have been captivated by the recent unveiling of Google’s Large Language Model (LLM), Bard. The media has lavished it with praise, dubbing it the ultimate AI tool for research.
Nevertheless, when I dove into using Bard as my research aid, it left much to be desired. It seemed to falter on a number of essential aspects.
To find a research tool for my purposes, I put ChatGPT, Bard, and Bing Chat to the test.
As of the writing of this article on June 15, 2023, I found Bing Chat to be the most proficient LLM widely available, owing to its impressive factual precision, the relevancy of its results, and reliable citations. ChatGPT + Browsing Plugin came in second, and Bard came in dead last.
Trust me, this discovery was as startling for me as it probably is for you.
Here is a brief overview of my analysis. Each tool is ranked from 1 (best) to 3 (worst) in every category.
AI Tool | Factual Accuracy | Relevance of Output | Citations | Writing Quality and Creativity | User Experience | Price |
---|---|---|---|---|---|---|
ChatGPT + Browsing Plugin | 2 | 2 | 2 | 1 | 2 | $20 / month |
Bard | 3 | 3 | 3 | 3 | 1 | Free |
Bing Chat | 1 | 1 | 1 | 2 | 3 | Free |
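One rough way to condense the table is to average each tool's rank across the five scored categories. This is only a sketch: the equal weighting of categories is an assumption made here for illustration, not part of the original analysis, and price is left out because it is not a rank.

```python
# Ranks (1 = best, 3 = worst) copied from the table above, in column order:
# factual accuracy, relevance, citations, writing quality, user experience.
# Assumption: every category counts equally; price is excluded.
ranks = {
    "ChatGPT + Browsing Plugin": [2, 2, 2, 1, 2],
    "Bard": [3, 3, 3, 3, 1],
    "Bing Chat": [1, 1, 1, 2, 3],
}


def mean_rank(tool):
    """Average rank of a tool across the scored categories (lower is better)."""
    values = ranks[tool]
    return sum(values) / len(values)


# Sort tools from best (lowest mean rank) to worst.
ordering = sorted(ranks, key=mean_rank)

for tool in ordering:
    print(f"{tool}: {mean_rank(tool):.1f}")
```

Under this equal weighting, Bing Chat comes out ahead of ChatGPT, with Bard last. A reader who cares more about writing quality or user experience could weight those categories more heavily and reach a different ordering.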
Factual Accuracy
Based on our analysis, Bing Chat had the greatest factual accuracy.
I tested this by posing a straightforward query to the three models:
I want to know the 2022 revenue for the following companies:
1) Udemy
2) Coursera
3) Instructure
4) Chegg
This question should be a breeze for a competent browser plugin, given that these companies are public and disclose annual financial reports. Nonetheless, the models returned remarkably varied responses:
Tool | Udemy | Coursera | Chegg | Instructure |
---|---|---|---|---|
ChatGPT | – | – | – | – |
Bard | $1.2 billion | $580 million | $1.7 billion | $240 million |
Bing Chat | $629 million | $524 million | $760 million | $475 million |
Actual | $629 million | $524 million | $766 million | $475 million |
Bing Chat takes the lead in providing factually accurate information for research purposes. It hit the mark on its first attempt, matching the actual revenue figures except for a minor discrepancy on Chegg: it reported $760 million against the actual $766 million.
ChatGPT refrained from giving any answer at all, while Bard hallucinated badly. It overestimated Udemy’s revenue by nearly 2x and was similarly far off on Chegg and Instructure.
I actually appreciate that ChatGPT admitted that it didn’t know. I would prefer that my research tool admit it does not know instead of hallucinating a confident-sounding answer. That’s why I ranked ChatGPT above Bard in this category.
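To put numbers on the gap, the figures in the table above can be converted to a signed percent error against the actual revenue. This is a minimal sketch: the helper names `percent_error` and `error_report` are made up for illustration, and ChatGPT is omitted because it gave no figures.

```python
# Revenue figures in millions of USD, copied from the table above.
actual = {"Udemy": 629, "Coursera": 524, "Chegg": 766, "Instructure": 475}
answers = {
    "Bard": {"Udemy": 1200, "Coursera": 580, "Chegg": 1700, "Instructure": 240},
    "Bing Chat": {"Udemy": 629, "Coursera": 524, "Chegg": 760, "Instructure": 475},
}


def percent_error(reported, true_value):
    """Signed error as a percentage of the true value (positive = overestimate)."""
    return (reported - true_value) / true_value * 100


def error_report(tool):
    """Percent error per company for one tool, rounded to one decimal place."""
    return {
        company: round(percent_error(answers[tool][company], actual[company]), 1)
        for company in actual
    }


for tool in answers:
    print(tool, error_report(tool))
```

On these numbers, Bard's Udemy figure is roughly 91% too high, which is where "nearly 2x" comes from, while Bing Chat's only miss is under 1% on Chegg.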
Relevance of Output
Bing Chat took the cake on this one.
I asked each model to summarize the recent Reddit blackout:
Did you hear about the Reddit blackout that started on June 12, 2023? Please summarize the details of the blackout and what the outcome was.
Bing Chat demonstrated a more current and comprehensive understanding in summarizing the news about the Reddit blackout. It pinpointed several key aspects of the news story that the other models overlooked, such as:
- The onset of the site outage on Monday morning (06/12/2023)
- The heavy reliance of the Reddit volunteer community on third-party apps
In contrast, ChatGPT merely confessed its ignorance about the outcome of the blackout, while Bard hallucinated again, saying:
Reddit backed down from its plan to charge fees for third-party access to the platform.
As of the writing of this article on June 15, 2023, that appears to be a strong hallucination. On Monday, June 12, Reddit CEO Steve Huffman sent an internal memo doubling down on Reddit’s plans:
There’s a lot of noise with this one. Among the noisiest we’ve seen. Please know that our teams are on it, and like all blowups on Reddit, this one will pass as well. The most important things we can do right now are stay focused, adapt to challenges, and keep moving forward.
It sounds like Redditors and the company will not be coming to an agreement anytime soon.
Citations
Once again, Bing Chat is the winner.
Let’s use the same prompt as before:
Did you hear about the Reddit blackout that started on June 12, 2023? Please summarize the details of the blackout and what the outcome was.
This is what Bing’s citations looked like. Your English teacher would be proud. Each citation includes the name of the article, the URL, and the date that the resource was accessed.
Source: Conversation with Bing
6/15/2023
(1) Reddit went down amid blackout protest over company’s new policy. https://www.msn.com/en-us/news/technology/reddit-is-down-amid-blackout-protest-over-companys-new-policy/ar-AA1crUHQ Accessed 6/15/2023.
(2) Reddit blackout: Thousands of communities go dark to protest … – CNN. https://www.cnn.com/2023/06/12/tech/reddit-blackout/index.html Accessed 6/15/2023.
While ChatGPT appears to cite multiple sources, it only draws information from two unique sites. The browsing plugin appears to be restricted in the number of webpages it can scrape per request, which is a real limitation for a research tool.
Bard, on the other hand, did not offer any citations. Even upon specific request to generate citations, it was unable to provide them.
Writing Quality and Creativity
ChatGPT takes the prize here.
I presented the three models with the following task:
Write a single paragraph, no more than 150 words, trying to inspire people to save the environment
ChatGPT’s response was so poignant it nearly moved me to tears. This was the first sentence:
Our planet is an extraordinary orchestra of life, harmoniously interwoven with remarkable ecosystems and diverse species.
However, the outputs from both Bing Chat and Bard were notably lackluster. Bard almost put me to sleep with this first sentence:
The environment is our home, and it is in danger.
Bard also disregarded the instructions by providing multiple paragraphs. The subpar writing quality and creativity demonstrated by Bard, Google’s flagship language model, were rather surprising. Google still has a long way to go to catch up to ChatGPT.
Given that both Bing Chat and ChatGPT are powered by the same GPT-4 model, I was somewhat taken aback by Bing Chat’s inferior writing quality.
It seems that Bing Chat is designed for more casual, chat-like interactions with users. I consistently found Bing Chat’s responses to be more succinct compared to the other two models. I suspect this aspect influences the overall writing quality delivered by Bing Chat.
Price
AI Tool | Price |
---|---|
ChatGPT + Browsing Plugin | $20 / month |
Bard | Free |
Bing Chat | Free |
Hard to compete with free, isn’t it? Both Bing Chat and Bard offer their LLM services without any cost, making them ideal choices for cost-conscious users.
ChatGPT Plus costs $20 a month and gives users access to GPT-4, but only 25 calls to the flagship model every 3 hours.
However, the real bargain here is Bing Chat. Not only is it free, but it also provides effectively unlimited daily calls to GPT-4. Though Microsoft claims to limit users to 30 calls per day, I discovered a loophole: simply refreshing the page gives me a new batch of calls for the day. But let’s keep that between us!
Access to GPT-4 through Bing Chat for free is a smashing deal, and I encourage you to take advantage of it.
User Experience
My favorite user experience has to be Bard. Thanks to an array of handy features that seamlessly integrate AI into your Google Workspace, Bard provides a rich search experience.
Google has been in this game for longer than I have been alive, and I would expect nothing less. Bard has the following features that I find really useful:
- Export to Gmail and Docs, plus a “Google It” button if you do not find what you are looking for
- The ability to view 3 different generated versions of the response to your prompt, a feature I wish all AIs had
The Verdict
In the arena of large language models as research aids, Bing Chat undoubtedly takes the top spot, leaving both ChatGPT and Bard trailing behind. While it may not excel in writing flair and user experience, Bing Chat compensates with its unfailing factual accuracy, relevance of responses, and solid citation methods, thus qualifying as a superb choice for thorough and credible research.
Moreover, its free access and potential for unlimited daily calls to GPT-4 make it highly cost-effective.
If you’re looking for a reliable, accurate, and cost-effective AI research tool, Bing Chat should be your go-to option. It’s a testament to the fact that efficiency in AI does not always need to come at a price.