Accounting

Microsoft researchers teach LLMs to use spreadsheets well

Large language models like ChatGPT have traditionally had trouble reading and interacting with spreadsheets, limiting their application in this realm, but recent research from Microsoft claims to have found an answer. 

The paper, SPREADSHEETLLM: Encoding Spreadsheets for Large Language Models, described the problems LLMs typically face with spreadsheets and proposed what it called the “SheetCompressor” framework to address them. 

The issue LLMs have with spreadsheets has to do with token requirements. LLMs generally run on “tokens,” the basic units of data the model processes. Tokens are the words, character sets, or combinations of words and punctuation into which a large language model decomposes text. LLMs operate by converting input text into a series of tokens, which the model then uses to understand and generate responses. 

The number of tokens determines the computational cost and capacity needed to handle the input, making token management crucial, especially for complex data like spreadsheets. For example, the phrase “I heard a dog bark loudly at a cat” would be represented by roughly nine tokens under a simple word-level scheme, one for each word. To preserve system resources, many LLMs have token limits, but even without such limits, complex jobs are resource intensive, with significant computational effort that affects both performance and efficiency. 
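As a rough illustration of token counting, the sketch below splits the example phrase on whitespace. This is an assumption made for simplicity: production LLM tokenizers typically work on subword units, so real counts vary by model. The helper name `naive_tokens` is hypothetical.

```python
def naive_tokens(text: str) -> list[str]:
    """Split text into lowercase word tokens (illustrative only, not a real LLM tokenizer)."""
    cleaned = text.replace(",", " ").replace(".", " ")
    return cleaned.lower().split()


phrase = "I heard a dog bark loudly at a cat"
tokens = naive_tokens(phrase)

print(len(tokens))       # number of word tokens in the phrase
print(len(set(tokens)))  # number of distinct words ("a" appears twice)
```

Every occurrence of a word costs a token, which is why repeated or blank spreadsheet cells add up quickly.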

Typically, each part of a spreadsheet — even blank cells or repeating cells or those with irrelevant information — costs tokens, meaning even a simple spreadsheet has a much higher token requirement than traditional text. Furthermore, LLMs often struggle with spreadsheet-specific features such as cell addresses and formats, complicating their ability to effectively parse and utilize spreadsheet data. These challenges have limited just how much generative AI models can be applied to reading and interacting with spreadsheets. Considering how many spreadsheets the profession tends to use, this consequently limits their application towards deep accounting work. 

What Microsoft researchers discovered, in short, is that the LLM does not need to burn tokens reading and processing the entire spreadsheet. Instead, people can create a compressed version of the document that functions as something like an index, with markers or “anchors” indicating especially important information like totals. Additional compression comes from grouping together similar types of data, such as date columns. So, in a sense, the LLM does not work through the spreadsheet itself but instead references it via a much more efficient index. 
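The idea of an index standing in for the full sheet can be sketched in a few lines. This is not the paper's SheetCompressor algorithm, only a minimal illustration under two assumptions: blank cells are skipped, and cells sharing a value are listed once with their addresses. All function names here are hypothetical.

```python
from collections import defaultdict


def col_letter(idx: int) -> str:
    """Convert a zero-based column index to a spreadsheet label (0 -> A, 26 -> AA)."""
    label = ""
    idx += 1
    while idx > 0:
        idx, rem = divmod(idx - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def naive_encode(grid: list[list[str]]) -> str:
    """Emit every cell, including blanks, as 'address,value'."""
    return "|".join(
        f"{col_letter(c)}{r + 1},{value}"
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    )


def indexed_encode(grid: list[list[str]]) -> str:
    """Skip blanks and list each distinct value once with the cells that hold it."""
    index = defaultdict(list)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != "":
                index[value].append(f"{col_letter(c)}{r + 1}")
    return "|".join(f"{value}:{','.join(cells)}" for value, cells in index.items())


grid = [
    ["Region", "Status", "", ""],
    ["East", "Paid", "", ""],
    ["West", "Paid", "", ""],
    ["North", "Paid", "", ""],
]
naive = naive_encode(grid)
compact = indexed_encode(grid)
print(len(naive), len(compact))  # the indexed form is shorter for this sheet
```

The saving grows with the number of blank and repeated cells, which matches the article's point that larger sheets benefit most.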

Complex spreadsheets are further supported through a concept called “chain of spreadsheet,” which is similar to “chain of thought” prompting. The method unfolds in two stages. First, the model identifies the table that is relevant to the query and determines the precise boundaries of the relevant content. This step ensures that only pertinent data is considered in the subsequent analysis. Then, the query and the identified table section are re-input into the LLM. The model then processes this information to generate an accurate response to the query.
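The two stages described above can be sketched as a small pipeline. The `locate` and `answer` callables stand in for LLM calls and are stubbed with plain Python here. These stubs, the sample grid, and every name below are assumptions for illustration, not the paper's implementation.

```python
from typing import Callable

Grid = list[list[str]]
Bounds = tuple[int, int, int, int]  # top, left, bottom, right (zero-based, inclusive)


def slice_range(grid: Grid, top: int, left: int, bottom: int, right: int) -> Grid:
    """Return the rectangular region inside the given inclusive bounds."""
    return [row[left:right + 1] for row in grid[top:bottom + 1]]


def chain_of_spreadsheet(grid: Grid, query: str,
                         locate: Callable[[Grid, str], Bounds],
                         answer: Callable[[Grid, str], str]) -> str:
    # Stage 1: find the table relevant to the query and its boundaries.
    top, left, bottom, right = locate(grid, query)
    table = slice_range(grid, top, left, bottom, right)
    # Stage 2: re-input the query together with only that table section.
    return answer(table, query)


def locate_stub(grid: Grid, query: str) -> Bounds:
    """Stand-in for the model: find the block whose header row contains 'Amount'."""
    top = next(i for i, row in enumerate(grid) if "Amount" in row)
    cols = [c for c, v in enumerate(grid[top]) if v != ""]
    bottom = top
    while bottom + 1 < len(grid) and grid[bottom + 1][cols[0]] != "":
        bottom += 1
    return top, cols[0], bottom, cols[-1]


def answer_stub(table: Grid, query: str) -> str:
    """Stand-in for the model: total the 'Amount' column of the selected table."""
    col = table[0].index("Amount")
    return str(sum(int(row[col]) for row in table[1:]))


grid = [
    ["Notes", "", ""],
    ["", "", ""],
    ["Item", "Amount", ""],
    ["Rent", "1200", ""],
    ["Power", "300", ""],
]
result = chain_of_spreadsheet(grid, "What is the total amount?", locate_stub, answer_stub)
print(result)
```

Keeping the two stages as separate callables means the second call only ever sees the selected table, which is the source of the token savings.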

“Through the CoS, SPREADSHEETLLM effectively handles complex spreadsheets by breaking down the process into manageable parts, thus enabling precise and context-aware responses,” said the paper. 

Experiments with this method found that it significantly increased performance on larger spreadsheets, where token limits are a particular challenge. The F1 score (a measure of model accuracy that balances precision and recall) for massive spreadsheets was 75% higher than GPT-4 and 19% higher than TableSense-CNN, another spreadsheet methodology for AI; for large spreadsheets, the difference was 45% and 17% respectively; for medium spreadsheets it was 13% and 5%; and for small spreadsheets it was 8%. Overall, the results show that while the method gets more effective the larger the spreadsheet, it can still improve results on even small spreadsheets. 
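For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The numbers in the example are made up for illustration and are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when there are no true positives."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 tables detected correctly, 2 false detections, 2 tables missed.
score = f1_score(true_pos=8, false_pos=2, false_neg=2)
print(round(score, 3))
```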

“Through a novel encoding method, SHEETCOMPRESSOR, this framework effectively addresses the challenges posed by the size, diversity, and complexity inherent in spreadsheets,” the paper concluded. “It achieves a substantial reduction in token usage and computational costs, enabling practical applications on large datasets. The fine-tuning of various cutting-edge LLMs further enhances the performance of spreadsheet understanding. Moreover, Chain of Spreadsheet, the framework’s extension to spreadsheet downstream tasks illustrates its broad applicability and potential to transform spreadsheet data management and analysis, paving the way for more intelligent and efficient user interactions.”

Implications

Donny Shimamoto, founder and managing director of the tech-focused accounting firm IntrapriseTechKnowlogies, said that enabling LLMs to “understand” tabular spreadsheets will give accountants an increased ability to summarize or analyze a set of data. More than that, he said, this will likely allow even non-accountants to do the same, removing the accountant as the middle person. While some accountants may see this as a threat, he said its main effect would be to clear the majority of simple inquiries from their plates, letting them save their energy for more complex questions and deeper analysis.

“Implementing something like this will require good testing to ensure that the risk of hallucinations is minimized, especially if it is going to help provide non-accountants with information to support decision-making,” said Shimamoto.

David Wood, a Brigham Young University accounting professor who specializes in AI within the profession, raised a similar point: the technology would allow those without significant technical knowledge to do the kinds of tasks that previously could only be done by seasoned accounting experts. He gave the example of novices using generative AI to make spreadsheets that only expert professionals could otherwise put together. However, while he thinks this could be possible soon, he said that, despite the Microsoft research, it hasn’t arrived just yet.

“However, there are at least three challenges holding back using GenAI with spreadsheets: the size and complexity of the spreadsheets, and the required accuracy for most uses of spreadsheets. This paper takes a large step in the right direction, but it doesn’t solve all the challenges and more work will still be needed in each of these three areas. It would be a mistake to assume that after reading this paper, we have fully figured out how to use spreadsheets and GenAI together. More work is still needed. … I think the path these researchers are taking is significant, but the research ‘hasn’t arrived yet,’ meaning that more work is needed. The accuracy rates are just not high enough … yet. Hopefully this paves the way for the next researcher to move it forward further,” he said in an email.

The basics of tax-aware long-short investment strategies

Financial advisors and clients seeking to boost the tax savings available through loss harvesting may consider an increasingly popular leveraging strategy known as the “long-short” method.

The combination of “long” investments on a stock’s positive outlook with “short” ones based on equity declines, plus margin loans that add debt leverage to the vehicle, may turn off some advisors with risk-averse clients who don’t have a lot of capital gains that need offsetting. But tax-aware long-short investing is drawing clients seeking to maximize returns through active management on a lengthy timeline with lower payments to Uncle Sam.

At their root, tax-aware long-short vehicles present “an opportunity to go overweight certain factors and go underweight certain factors and find alpha between the two,” said Brent Sullivan, a consultant on taxable investing product distribution to sub-advisory and ETF firms who writes the Tax Alpha Insider blog. The accompanying tax savings stem from loss harvesting that “oftentimes will exceed a dollar contributed” or could even reach 200% to 400% of the principal, he noted. Continual rebalancing pushes up the losses past the level available from many direct indexing strategies in a process Sullivan compares to a “perpetual ball machine.”

“The loss harvesting paradigm here is just totally different than a direct indexing long-only,” Sullivan said. “As the market goes up, you can continue shorting. Those shorts generate harvestable losses.”

A ‘rapidly growing but sometimes confusing area’

Much like his research documenting the continual rise in Section 351 conversions to ETFs, Sullivan is keeping close watch on tax-aware long-short vehicles, which have already surpassed his prediction of attracting $30 billion in assets under management by the end of the year. AQR Capital Management, a pioneer in tax-aware long-short strategies, is leading the way with $21.7 billion, but other managers such as Invesco, BlackRock and Quantinno have pushed the total to at least $35 billion, Sullivan noted in a newsletter last month.

“Today, advisers recognize that tax is a practice differentiator and a source of recurring client value,” Sullivan wrote. “They may be torn between low-cost, passive index ETFs and direct indexing, but that debate fades into the background once they learn of tax-aware long/short strategies.”

On the other hand, AQR itself is seeking to “help parse the jargon of this rapidly growing but sometimes confusing area” amid some “blurring of terminology, strategy design and investment objectives,” the asset management firm said in a blog post earlier this year. The company pushed back on the idea that the strategies are “only for billionaires” or simply trying to achieve benchmark returns, along with the notion that they are a form of “supercharged direct indexing.” While their tax benefits “are larger and last longer” than those of direct indexing, the two strategies come from “diametrically opposite starting points (active management for the former versus passive indexing for the latter),” the post said.

“Tax-aware long-short factor strategies realize higher tax benefits than direct indexing not because they try harder, but because they (1) trade quite a bit due to changes in pretax alpha, (2) hold large positions relative to invested capital due to leverage, and (3) can slow unnecessary gain recognition without significantly impacting pretax alpha, thanks to relatively long holding periods and highly diversified portfolios,” the company wrote. “The core strength of tax-aware long-short strategies lies in their ability to align pretax performance with the needs of tax-sensitive investors.”

Estate implications

Those characteristics may eventually pose tax problems with a client’s estate plans, Sullivan noted. Estates face an obligation to settle any debts.

“The strategy is effectively over,” he told FP. “You will realize a ton of capital gains if you suddenly, without planning, close the long and short positions.”

Advisors and their clients could take steps to wind down the leverage “years and years in advance” with as low tax exposure as possible, he said. Or they could set up an intentionally defective grantor trust or another entity instructing the trustee to manage the strategy based on a “prudent investor standard” and a long-term plan for the estate and its heirs, Sullivan said.

Since “you do not want to be auto-liquidated” upon the benefactor’s death, some of “the brightest minds out there are thinking about trust structures” to hold the tax-aware long-short strategies, he said.

“That can be a real tax drag for any assets passing to beneficiaries,” Sullivan said. “What you do is, make sure that the trust is properly structured to continue holding margin and short positions. You’re essentially transferring the entire balance sheet of the strategy.”

House tax bill calls for $30K SALT, omits millionaire tax

The House tax committee is seeking to increase the state and local tax deduction and make official several of President Donald Trump’s campaign tax pledges in a multitrillion-dollar package that will serve as Republicans’ signature legislative effort.

The House Ways and Means Committee’s release of the tax measures, ahead of planned debate on the panel Tuesday, is a sign the Republican-controlled chamber is moving toward a floor vote this month on the legislation. The bill aims to cut taxes by more than $4 trillion and reduce spending by at least $1.5 trillion over a decade.

The proposal doesn’t include a tax hike on the wealthiest Americans, after weeks of debate among Republicans about whether to raise levies on millionaires. The bill would permanently extend the 37% top rate for individuals that was set in Trump’s 2017 tax law. That’s despite Trump telling Speaker Mike Johnson as recently as last week that he wanted a 39.6% rate for individuals making more than $2.5 million.

The package, which Trump has dubbed his “one big, beautiful bill,” is the centerpiece of his legislative agenda. It renews many of his first-term tax cuts, set to expire at the end of the year. But narrow Republican margins in the House mean that the president needs near-unanimous support from his party to pass the bill.

The bill would raise the nation’s borrowing limit by $4 trillion. This is smaller than the Senate’s preferred $5 trillion level. Lawmakers are hoping to push any additional votes on raising the debt ceiling until after the 2026 midterms.

The draft language eliminates income taxes on tips and overtime pay through 2028. House Ways and Means Committee Chairman Jason Smith had vowed to follow through on Trump’s campaign pledges to end those levies.

Trump had also campaigned on ending taxes on Social Security benefits, but that cannot be done in the special budget process that Congress is using to advance the tax package. Instead, the bill provides a $4,000 bonus for seniors on top of the regular standard deduction.

One of the thorniest issues, a contentious standoff over increasing the state and local tax deduction, is still not resolved. The draft calls for increasing the deduction to $30,000 for both individuals and couples, up from $10,000, with income limits for single taxpayers earning $200,000 or joint filers making twice that. But some lawmakers representing high-tax areas want an even bigger tax break, as much as $124,000 for joint filers.

On the hook for tax increases: wealthy private universities, which could see an increase in the levy on endowments from 1.4% to as high as 21% on investment income. 

Johnson told reporters Monday that the House is on track to pass the legislation by Memorial Day. It would then go to the Senate, where it could be subject to major revisions.

The new details come after the tax-writing committee released some initial provisions late Friday. Those included raising the maximum child tax credit to $2,500 from $2,000 and increasing the standard deduction, both retroactive to 2025 to put more money in voters’ pockets before the 2026 election. 

The bill also raises the estate tax exemption to $15 million and increases the 20% deduction for closely held businesses to 23%.

Jon Voight joins studios, unions to press Trump for film aid

President Donald Trump’s Hollywood ambassadors joined studios, labor unions and producers in asking the White House to expand and extend tax incentives as part of an upcoming budget reconciliation bill.

A letter dated Monday asked the president to include three film and TV incentives in the budget bill being drafted by Congress. The coalition includes the Motion Picture Association, which represents Hollywood studios, as well as unions of writers, actors and other trades.

Actor Jon Voight, who was named one of three special ambassadors to Hollywood in January, is leading the effort to obtain assistance from Washington to boost U.S. film and TV jobs. The groups signing the letter represent nearly 400,000 industry professionals. Sylvester Stallone, another Trump ambassador, also signed the letter.

The U.S. film and TV industry has struggled in recent years as entertainment companies reduced their spending and moved production overseas, where cheaper labor and more generous government subsidies make their business more profitable. 

The letter doesn’t mention tariffs on foreign film production, which Trump said he would pursue in a social media post on May 4. His 100% tariff proposal, made after a visit with Voight, sent the shares of studios such as Netflix Inc. and Walt Disney Co. tumbling as investors considered the possibility of rising costs and a trade war in the entertainment business. 

The specific proposals in the new letter involve reviving Section 199 of the tax code, which provided deductions for manufacturing to film and TV production, extending Section 181, which allows for accelerated deductions, and restoring Section 461, which lets businesses use past losses to reduce future taxes.
