
Accounting

Which generative AI model did best on the CPA exam? Depends on the section

ChatGPT is no longer the only large language model to pass the CPA exam.

ChatGPT 3.5 initially bombed the CPA exam before version 4.0 passed, and ChatGPT remains the top performer overall. However, like any human accountant, it has its strengths and weaknesses.

These were part of the findings of a recent paper from Case Western Reserve University and accounting automation solutions provider AIgency. The researchers systematically evaluated the performance of Google Gemini, ChatGPT-4, Claude, Mixtral and Llama-2b on multiple-choice questions from CPA test preparation tools.

Overall, they found that ChatGPT-4 scored the best, with Claude 3-opus coming in a close second, followed by Google Gemini Advanced, then Mixtral-8x7b-32768. Llama 2B-70b-4096 did the worst.

Source: William Zacher Jr. & Sanmukh Kuppannagari

However, as the results show, not every model did uniformly well on all sections. ChatGPT, while a strong performer overall, was especially good on the BAR section for business analysis and reporting. Meanwhile, although its weakest point is REG, the regulatory area that is mostly devoted to tax regulations, it did better on this section of the exam than any other model.

Claude was the best performer in the AUD section on auditing and attestation. While its weakest point was FAR, the section on financial accounting and reporting, even there its performance was second only to ChatGPT. Gemini was the second strongest performer on the BAR section, but did not do so well on REG.

Mixtral, overall, had decent enough scores compared to a human but would only pass BAR, making it a mediocre player compared to its peers. Llama was the only one that would not pass any section, and it did especially poorly on REG. It was also the only one that did worse than a human. The average score for human test takers on REG was 59.19%, according to the paper.

“The study revealed that while some LLMs have made significant advances in mimicking the complex decision-making skills required for CPA exams, there remains variability in performance across different sections of the test,” said the paper. “This variability underlines the importance of tailored training and specialization in developing LLMs for professional applications such as the CPA exams.”

To perform the test, the researchers drew their multiple-choice questions from the Becker CPA test preparation suite. Google Gemini, Claude and ChatGPT-4 were accessed via their online platforms. Mixtral and Llama-2b models were accessed through the Groq platform, an advanced computational infrastructure for high-speed AI processing. The questions were copied and pasted directly into the AI platforms from Becker’s test preparation material without any additional prompting or modification, to ensure each AI model received the questions in their original form as they would appear in a CPA exam context.

Becker’s platform randomized the questions in batches of 15 questions, which the researchers said further mitigated potential selection bias. The tester, responsible for inputting the questions into the AI models, deliberately refrained from reading or evaluating the questions beforehand to prevent any unconscious bias in the prompting process. For each question, the tester selected the AI model’s first response marked as “correct,” irrespective of any variations in the explanations or outputs provided by different models.

Each AI model was subjected to each multiple-choice section of the CPA test three times, allowing for a comprehensive assessment of its performance across multiple attempts. The criterion for determining an AI model’s success in this study was achieving a passing score, defined as an average score of 75 or higher, on any given section.
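The pass criterion described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name `section_passed` is hypothetical, and the attempt scores below are made-up placeholders, not figures from the paper. Only the threshold of 75 and the averaging over attempts come from the study as reported.

```python
from statistics import mean

# Passing threshold as reported in the study: an average of 75 or higher.
PASSING_SCORE = 75.0


def section_passed(attempt_scores):
    """Return True if the average across attempts meets the passing score.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not attempt_scores:
        raise ValueError("at least one attempt score is required")
    return mean(attempt_scores) >= PASSING_SCORE


# Placeholder scores for three attempts per section (NOT data from the paper).
example_attempts = {
    "BAR": [80.0, 77.5, 82.5],
    "REG": [70.0, 76.0, 72.0],
}

results = {
    section: section_passed(scores)
    for section, scores in example_attempts.items()
}
```

With these placeholder numbers, a section counts as passed only when the three attempts average 75 or more, so one attempt above 75 is not enough by itself.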

The researchers said the data indicates there is no one universal model for all tasks, so it is important to use the right model for the right applications. For example, the paper concluded that ChatGPT is “the only real option for zero-shot BAR automation,” as “no other model came close to its performance, and it had a relatively narrow variance,” meaning that ChatGPT-4 could be used to help with automated financial statement preparation or additional forecasting. On the other hand, the researchers said Claude was probably better on auditing-related tasks, which the paper said “is a solid indication that it can be used for fraud detection and internal control validation.”

“It is apparent from the results that there is no clear-cut winner,” the researchers concluded. “Most companies utilizing AI to perform financial administration functions should use a software infrastructure that allows them to use multiple task-dependent AI models.”

However, the researchers did recommend that “model selection for AI in an applied accounting setting should avoid Llama-2B, which performed worse than any other model in every section.”
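As a rough illustration of what task-dependent model selection could look like in software, the sketch below maps each CPA exam area to the model the article reports as the top performer in that area. The names `PREFERRED_MODEL`, `EXCLUDED_MODELS` and `pick_model` are hypothetical, and the fallback to ChatGPT-4 for unknown areas is an assumption based on its best overall score, not something the paper prescribes.

```python
# Top performer per section, as reported in the article.
PREFERRED_MODEL = {
    "BAR": "ChatGPT-4",      # business analysis and reporting
    "AUD": "Claude 3-opus",  # auditing and attestation
    "REG": "ChatGPT-4",      # regulation, mostly tax
    "FAR": "ChatGPT-4",      # financial accounting and reporting
}

# Assumption: fall back to the best overall performer for unlisted areas.
DEFAULT_MODEL = "ChatGPT-4"

# The paper recommends avoiding this model in applied accounting settings.
EXCLUDED_MODELS = {"Llama-2B"}


def pick_model(task_area: str) -> str:
    """Return the model name to use for a given task area (hypothetical helper)."""
    model = PREFERRED_MODEL.get(task_area.strip().upper(), DEFAULT_MODEL)
    if model in EXCLUDED_MODELS:
        raise ValueError(f"{model} is excluded from selection")
    return model


audit_model = pick_model("aud")
reporting_model = pick_model("BAR")
```

A real system would attach API clients and evaluation data to each entry; the point here is only that the choice of model is a lookup keyed on the task rather than a single global setting.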

Accounting

In the blogs: To be continued?

TikTok and taxes; future of L.A. revenues; engagement limits; and other highlights from our favorite tax bloggers.

Accounting

Carr, Riggs & Ingram merges in CapinCrouse

Carr, Riggs & Ingram, a Top 25 Firm based in Enterprise, Alabama, has added CapinCrouse, a Regional Leader based in Indianapolis, effective Jan. 17, 2025.

The deal is CRI’s biggest merger in its history, and the first since it received outside investment last November from Centerbridge Partners and Bessemer Venture Partners. 

CapinCrouse focuses exclusively on serving nonprofits, such as faith-based organizations and private colleges. The merger will add 40 partners, 185 professionals and 15 offices to CRI, which has 437 partners and 2,304 staff.

After the outside investment, CRI split its attest and non-attest practices, as is common when accounting firms receive private equity or venture capital funding. Carr, Riggs & Ingram, L.L.C., as an independent licensed CPA firm, is providing assurance, attest and audit services. CRI Advisors, LLC (including its subsidiary entities) operates as a separate legal entity, providing clients with tax and business consulting services.  

“This merger represents an exciting milestone in our firm’s history and a significant advancement for both CRI and CapinCrouse,” said CRI Advisors LLC chairman Bill Carr in a statement Tuesday. “We have previously invested in firms that specialize in serving faith-based organizations and private colleges. With the addition of CapinCrouse, CRI is now positioned to become the leading national provider in these vital markets. By combining our strengths, we will enhance the value we offer and greatly expand our national geographical presence. We are proud to welcome CapinCrouse to the CRI family.”

Financial terms of the deal were not disclosed. CRI ranked No. 24 on Accounting Today’s 2024 list of the Top 100 Firms, with $455.36 million in annual revenue. CapinCrouse ranked No. 27 on Accounting Today’s Regional Leaders list of the Top Firms in the Great Lakes region, with $35.51 million in annual revenue.

“We are very pleased to join CRI,” said Fran Brown, Managing Partner of CapinCrouse. “For over 50 years, our focus has been on providing innovative service to nonprofit organizations whose outcomes are measured in lives changed. CRI’s commitment to client service, respect, and integrity is an excellent fit with our mission and firm culture. We will continue to operate under the CapinCrouse brand and are excited to now have access to more offerings and resources to further drive exceptional client service.”

Koltin Consulting Group CEO Allan Koltin advised both firms on the merger. “It is interesting to note that this is CRI’s biggest M&A deal in its history, and it comes on the heels of their private equity deal with Centerbridge Partners and Bessemer Venture Partners,” he said in a statement. “CapinCrouse, a top 125 firm nationally, is viewed by many as the preeminent firm in the country when it comes to the audit and related advisory services of nonprofits and religious organizations. My intuition suggests that going forward, we will see CRI expanding its geographic reach nationally by combining with more top 200 firms.”

Last August, CRI added ProSport CPA, a firm in New Kent County, Virginia, offering tax and accounting services within the sports and entertainment niche. In 2023, CRI expanded into Oklahoma by adding Stanfield + O’Dell PC, a firm in Tulsa. CRI expanded to South Carolina in 2022 by adding Lanning Group LLC, a firm based in Mount Pleasant in the Charleston suburbs, and expanded in Florida by adding Alonso & Garcia, a firm in Miami. It also expanded that year in Florida by adding Travani & Richter in Jupiter, and in Texas by adding Pharr Bounds LLP in Austin.

In 2022, CapinCrouse acquired the Global Center for Nonprofit Excellence.

Accounting

Trump names Mark Uyeda acting chair of SEC

SEC commissioner Mark Uyeda, speaking at the AICPA & CIMA Conference on Current SEC and PCAOB Developments

President Donald Trump named Mark Uyeda, a Republican member of the Securities and Exchange Commission, as acting chairman of the SEC, pending confirmation hearings for Paul Atkins, Trump’s official pick as chairman.

Uyeda has been an SEC commissioner since 2022 and a member of the staff since 2006. Last month, he discussed at an AICPA & CIMA conference in Washington how the SEC is likely to pursue a more deregulatory approach during the Trump administration. The previous SEC chair, Gary Gensler, pursued an active approach to enforcement and rulemaking, provoking opposition and a wave of lawsuits from the financial industry. A few weeks after the election, Gensler announced plans to step down on Jan. 20, Inauguration Day.

“I am honored to serve in this capacity after serving as a Commissioner since 2022, and a member of the staff since 2006,” Uyeda said in a statement Monday. “I have great respect for the knowledge, expertise and experience of the agency and its people. The SEC has a vital mission—protecting investors, maintaining fair, orderly, and efficient markets, and facilitating capital formation—that plays a key role in promoting innovation, jobs creation, and the American Dream.”

Last month, Trump named Paul Atkins, a former SEC commissioner, as a replacement for Gensler. Atkins has been a proponent of cryptocurrency, while Gensler had imposed steep penalties on companies in the crypto industry. Confirmation hearings have not yet begun for Atkins, but he has been meeting with lawmakers privately and is expected to be confirmed.

As acting chairman, Uyeda announced Monday that he would be launching a crypto task force dedicated to developing a comprehensive and clear regulatory framework for crypto assets. The task force will be led by another Republican commissioner, Hester Peirce. 

The task force plans to collaborate with SEC staff and the public to set the SEC on a regulatory path as opposed to pursuing enforcement actions to regulate crypto “retroactively and reactively,” according to a news release.

“This undertaking will take time, patience and much hard work,” Peirce said in a statement. “It will succeed only if the Task Force has input from a wide range of investors, industry participants, academics and other interested parties. We look forward to working hand-in-hand with the public to foster a regulatory environment that protects investors, facilitates capital formation, fosters market integrity, and supports innovation.”

The task force plans to hold roundtables in the future, but in the meantime is asking for public input at [email protected].  
