
Sensient Technologies Corp (SXT) Q3 2020 Earnings Call Transcript


Sensient Technologies Corp (NYSE:SXT)
Q3 2020 Earnings Call
Oct 16, 2020, 9:30 a.m. ET

Contents:

  • Prepared Remarks
  • Questions and Answers
  • Call Participants

Prepared Remarks:

Operator

Good morning and welcome to the Sensient Technologies Corporation 2020 Third Quarter Earnings Conference Call. [Operator Instructions] After today’s presentation, there will be an opportunity to ask questions. [Operator Instructions]

I would now like to turn the conference over to Mr. Steve Rolfs. Please go ahead, sir.

Stephen J. Rolfs — Senior Vice President and Chief Financial Officer

Good morning. I’m Steve Rolfs, Senior Vice President and Chief Financial Officer of Sensient Technologies Corporation. I would like to welcome all of you to Sensient’s third quarter earnings call. I’m joined this morning by Paul Manning, Sensient’s Chairman, President and Chief Executive Officer. This morning, we released our 2020 third quarter financial results. A copy of the release and our investor presentation is now available on our website at sensient.com.

During our call today, we will reference certain non-GAAP financial measures, which we believe provide investors with additional information to evaluate the Company’s performance and improve the comparability of results between reporting periods. These non-GAAP financial results should not be considered in isolation from or as a substitute for financial information calculated in accordance with GAAP. A reconciliation of non-GAAP financial measures to the most directly comparable GAAP financial measure is available on the Investor Information section of our website at sensient.com and in our press release. We encourage investors to review these reconciliations in connection with the comments we make this morning.

I would also like to remind everyone that comments made this morning, including responses to your questions, may include forward-looking statements. Our actual results may differ materially, particularly in view of the uncertainties created by the COVID-19 pandemic, governmental attempts at remedial action and the timing of a return of more normal economic activity. We urge you to read Sensient’s filings, including our 10-K, our second quarter 10-Q and our forthcoming third quarter 10-Q for a description of additional factors that could potentially impact our financial results. Please bear these factors in mind when you analyze our comments today.

Due to changes we have made to our portfolio and the divestitures we announced last year, we are updating our group and product lines. The most notable change is that our flavors and fragrances segment will now be named the flavors and extracts segment. You will also notice some small changes to the names we use for some of our product lines. Sensient’s focused portfolio strengthens our ability to service the food, pharmaceutical and personal care markets. We will continue to report the three divested product lines of fragrances, yogurt fruit preparations and inks as long as these product lines impact our comparisons.

Now we’ll hear from Paul.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Thanks, Steve. Good morning. Sensient reported third quarter earnings this morning, and I’m pleased to report that results were in line with our expectations and our overall guidance for the year. I’m very pleased with the continued revenue and profit growth in our flavors and extracts group as well as in our food and pharmaceutical business in the color group. Our Asia Pacific group also posted solid profit growth in the quarter.

Overall, each of our groups performed well despite the adverse impact of COVID-19. COVID-19 continues to be a net negative to the Company. The market decline in the makeup industry continues to impact the color group’s personal care business. And on a geographic basis, we continue to see headwinds in Asia Pacific, Europe and Latin America. In the midst of this pandemic, we have ensured our employees are safe and healthy, our facilities remain open, our supply chain remains strong and we have delivered our products on time to our customers.

Based on current trends, I expect that we will deliver on our EPS outlook for the year as the foundation of our business remains strong. Our focus over the years on customer service, on-time delivery and sales execution has led to a high level of revenue from new product wins during the second half of 2019 and the first half of this year. Furthermore, as the pandemic continues and new product development at certain companies has slowed, we have focused on regaining lost business and gaining share at our customers. This focus, coupled with lower overall sales attrition, is paying off in our results and should continue to benefit future periods.

Last year at this time, we announced three divestitures. In the second quarter, we completed the sale of inks. And I’m pleased to say that we completed the sale of the yogurt fruit prep business during the third quarter. This is our second completed divestiture in 2020, and I am optimistic that we will complete the divestiture of fragrances in the near future. As I mentioned last year, the divestiture of these three businesses allows us to focus on our key customer markets, food, pharmaceutical and personal care.

I’m very pleased with the progress of our flavors and extracts group this year. The group had an impressive quarter with adjusted local currency revenue growth of 13% and profit growth of 24%. This is the third straight quarter of revenue growth, which has resulted in continued profit and margin improvement. This growth is based on the group’s focus on sales execution, which has resulted in a high win rate, a focus on retaining existing business and an overall decline in the group’s attrition rate. Additionally, the group’s focus on transitioning the product portfolio to more value-added solutions and the reduction of its production cost structure from restructuring and ongoing initiatives is complementing the revenue growth and the overall improvement in the group’s profit and margin.

Within the flavors and extracts group, the natural ingredients business had another strong quarter with local currency sales growth of 14.5% as a result of strong demand for seasoning, snacks and packaged foods. This business has a solid foundation to deliver a consistent and reliable supply of high-quality natural ingredients to its customers.

Flavors, extracts and flavor ingredients also had a nice quarter, up 12% in local currency. The business’ strong technology platform in flavor modulation and enhancement, its clean label solutions and its applications expertise are leading factors in the growth of this business. Overall, the flavors and extracts group’s operating profit margin was up 110 basis points in the quarter. Long term, I expect mid-single digit revenue growth with continued margin improvement for the group.

Now turning to colors. Revenue for food and pharmaceutical colors was up low-single digits for the quarter. The group continues to see solid demand for natural colors in the market. There’s also strong consumer interest in functional natural extracts and nutraceuticals, and the group’s product portfolio and innovation are well positioned to support this demand. Despite the continued growth in food and pharmaceutical colors, revenue in personal care continues to be down as a result of the negative impacts of COVID-19 on the color makeup market. The demand for makeup in Europe, North America and Asia continues to be down substantially for the year. Given the uncertainty with COVID-19 and ongoing restrictions, I anticipate challenges for this cosmetics product line to continue.

The color group’s adjusted operating profit increased 3% in the quarter. Food and pharmaceutical colors had a great quarter, generating profit growth of more than 20% and about 15% for the first nine months of 2020. However, the lower demand for makeup and other personal care products continues to be a drag on the group’s profit performance. Overall, the color group’s operating profit margin increased 110 basis points in this quarter. Long term, I continue to expect mid-single digit revenue growth from food and pharmaceutical colors as well as personal care once demand normalizes from the impacts of COVID-19.

We’ve made good progress in our Asia Pacific group this year. Similar to flavors and extracts and colors, the group has focused on sales execution and building a stronger customer service and technology-driven organization. The group has created a solid infrastructure and has been focused on localizing production.

During the quarter, the group had solid sales growth in certain regions. However, this growth was offset by declines in other regions, as government COVID-19 restrictions continue to significantly impact many sales channels. The group had another strong quarter of profit growth, up approximately 15% in the quarter and 17% for the first nine months of 2020. The group’s operating profit margin increased 200 basis points in the quarter. This was the third straight quarter of strong profit improvement. The Asia Pacific group is well positioned for long-term growth. And I anticipate that as certain COVID-19-related restrictions ease, the group will resume mid-to-high single-digit revenue growth.

Overall, I’m very pleased with the results of our groups this year. Our flavors and extracts group is having a great year, and the food and pharmaceutical business within the color group continues to have solid revenue and very strong profit growth. Our Asia Pacific group is well positioned for future revenue growth. Overall, COVID-19 continues to be a headwind for the Company. Despite this headwind, I’m excited about the future growth opportunities for Sensient due to the strength of our portfolio of technologies and our exceptional customer service.

Steve will now provide you with additional details on the third quarter results.

Stephen J. Rolfs — Senior Vice President and Chief Financial Officer

Thank you, Paul. In my comments this morning, I will be explaining the differences between our GAAP results and our adjusted results. The adjusted results for 2020 and 2019 remove the impact of the divestiture-related costs, the operations divested or to be divested and our recently implemented operational improvement plan. We believe that the removal of these items provides a clearer picture to investors of the Company’s performance.

This also reflects how management reviews the Company’s operations and performance. During the third quarter, the Company initiated a plan primarily to consolidate some of our global cosmetic manufacturing operations. The Company expects to complete this operational improvement plan during the first half of 2021. The costs of this plan are estimated to be approximately $5 million to $7 million.

Our third quarter GAAP diluted earnings per share was $0.78. Included in these results are $1.4 million or approximately $0.03 per share of costs related to the divestitures and other related costs and the cost of the operational improvement plan. In addition, our GAAP earnings per share this quarter include approximately $0.04 of earnings related to the results of the operations targeted for divestiture, which represents approximately $23.6 million of revenue in the quarter.

Last year’s third quarter GAAP results included approximately $0.02 of earnings per share from the operations to be divested and approximately $34.1 million of revenue. Excluding these items, consolidated adjusted revenue was $300 million, an increase of approximately 6.1% in local currency compared to the third quarter of 2019. This revenue growth was primarily a result of the flavors and extracts group, which was up approximately 13% in local currency.

Consolidated adjusted operating income increased 10.1% in local currency to $41.5 million in the third quarter of 2020. This growth was led by the flavors and extracts group, which increased operating income by 24.1% in local currency. The Asia Pacific group also had a nice growth in operating income in the quarter, up 15.5% in local currency. And operating income in the food and pharmaceutical business in the color group was up over 20% in local currency. The increase in operating income in these businesses is a result of the volume growth Paul explained earlier combined with the overall lower cost structure across the Company.

Our adjusted diluted earnings per share was $0.77 in this year’s third quarter compared to $0.74 in last year’s third quarter. As Paul mentioned, the overall impact of COVID on the Company’s results has been a headwind. The impact on our food and pharmaceutical businesses is mixed, but as we have discussed, the negative impact in our personal care business is significant.

We have reduced debt by approximately $60 million since the beginning of the year. We have adequate liquidity to meet operating and financial needs through our cash flow and available credit lines. Our debt-to-EBITDA is 2.6, down from 2.9 at the start of the year. Cash flow from operations was $143 million for the first nine months of 2020, an increase of 12% compared to prior year. Capital expenditures were $34 million in the first nine months of 2020 compared to $26.1 million in the first nine months of 2019. Our free cash flow increased 7% to $109 million for the first nine months of this year.

Consistent with what we communicated during our last call, we expect our adjusted consolidated operating income and earnings may be flat to lower in 2020 because of the level of non-cash performance-based equity expense in 2020. We also expected a higher tax rate in 2020 compared to our 2019 rate, which was lower as a result of a number of planning opportunities. Based on current trends, we are reconfirming our previously issued full-year GAAP earnings per share guidance of $2.10 per share to $2.35 per share.

The full-year guidance also now includes approximately $0.05 of currency headwinds based on current exchange rates. We are also reconfirming our previously issued full-year adjusted earnings per share guidance of $2.60 to $2.80, which excludes divestiture-related costs, operational improvement plan costs, the impact of the divested or to-be-divested businesses and foreign currency impacts. The Company expects foreign currency impacts to be minimal in the fourth quarter. We are also maintaining our adjusted EBITDA guidance of low-to-mid single-digit growth.

In conclusion, we continue to expect long-term revenue growth rates of mid-single digits in each of our groups. Our stock-based compensation and other incentive costs have reset this year. Going forward, this should be less of a headwind for us. We do expect our tax rate to trend up slightly in future years under current law. As a result, we believe adjusted EBITDA is a better measure of the Company’s operating performance, and expect this metric to grow at a mid-single digit rate or better.

In terms of our capital allocation priorities, we will continue to pay down debt in the near term. We also continue to evaluate acquisition opportunities. Absent an acquisition, we have the ability to buy back shares. We expect our capital expenditures to be in a range of $50 million to $60 million annually. Our divestiture activity and our operational improvement plan allow us to focus on our key customer markets of food, pharmaceutical excipients and personal care, while providing the foundation for future revenue and margin growth.

Thank you for your time this morning. We’ll now open the call for questions.

Questions and Answers:

Operator

Thank you. We will now begin the question-and-answer session. [Operator Instructions] And the first question will be from Mark Connelly with Stephens. Please go ahead.

Mark Connelly — Stephens — Analyst

Thank you. So you’ve talked in the past about the benefits of B2C clients in terms of food innovation. But one of the most common questions we get is what are supermarkets doing in terms of prioritization? There’s a view that supermarkets are de-emphasizing new and smaller companies that are — that they see as less reliable. Although I have to say in my own area, we’re not seeing that. So, I was hoping you could just give us a sense of how that’s impacting your customer base, just the access to the market right now.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Hey, Mark. Good morning. So I would say this: there are many different channels that our customers deal in, and supermarkets are certainly one of them. Supermarkets in the U.S. are a subset of that. Your comment about certain B and C brands perhaps being de-emphasized in supermarkets in the U.S. — I think that’s probably directionally correct. But for B and C companies, many of the ones that we deal with, supermarkets is one channel, but there are many other channels: specialty stores, online, a lot of these other areas that continue to grow.

Certainly online is still a small fraction of what you could do in a supermarket channel, but I think directionally your comment is correct. I would tell you that we see some of that in the European market, perhaps a little bit less pronounced than in the U.S. market. And I see that as being even less of a factor in, say, places like Latin America and Asia Pacific.

Mark Connelly — Stephens — Analyst

Okay. That’s helpful. Thank you. And just one more question, then I’ll jump back in the queue. In Asia, you talked a little more extensively last quarter about the impact of local restrictions. I was hoping you could give us a little bit of a sense of how that’s evolving. You’ve obviously got a lot of local manufacturing and supply. Can you talk to about how much of what you produce in those countries in Asia stays within the country?

Paul Manning — Chairman of the Board, President and Chief Executive Officer

So, as a general statement, in each country, we produce where our customers are. And so, to that end, our operations in China largely produce for China, although we do bring in products from other parts of the Sensient manufacturing footprint into China because we don’t make everything there. But as a general statement, yeah, our supply chains for raw material production tend to be fairly localized.

That said, we have continued to generate really solid — in fact, probably even better than in normal times — on-time delivery and service levels to our customers. And so, I’d like to tell you that we’ve been out in front of this really since this began seven months ago. We began to accumulate certain raw materials in key markets and in key product lines. And from that, plus having really good supply chain operations, we’ve been able to maintain output to our customers.

As you think about lockdowns in Asia versus, say, Europe or the U.S., what we see happening right now is that Asia lockdowns tend to be a little bit more broad-based than what you may see elsewhere. So for example, you look at the United States: Wisconsin, where we are, is on kind of a bad-boy list right now for COVID infections. And so, that has restricted certain travel, but the lockdowns are very much targeted here.

You go to Asia, some of these lockdowns are less about an individual county or state or even city, and they’re more broad-based. So I think that has been the big difference between the markets as we see it and in our experience with our products. So nevertheless, Asia was able to generate some topline growth, and they did really exceptionally well on their profit growth. So, I think we were able to overcome that pretty nicely.

But kind of circling back to your question, I would continue to tell you that we feel very good about our supply chain. Even though we do produce locally, we do still source many raw materials from Asia for the rest of the Sensient operations in, say, the Americas and Europe. Nevertheless, we feel very good about that. And whether it came through sort of stockpiling some of those raw materials or just having a very broad-based supply base, I think we’ve been able to capitalize on this pretty well.

Mark Connelly — Stephens — Analyst

Thank you. I’ll jump back in the queue. Thank you.

Operator

And the next question will come from Heidi Vesterinen with Exane. Please go ahead.

Heidi Vesterinen — Exane — Analyst

Good morning. I have a few questions. The first one, why have you not upgraded full-year guidance after such a strong quarter?

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Heidi, so I’ll take that one. I would say this: it was a really strong quarter. We had great results out of each of the three groups. Am I being conservative? Sure, I’m probably being conservative. We certainly feel very confident about being at the top end of that range, possibly even above that. But I don’t necessarily want to be too granular on any sort of 90-day period. Ultimately, I think the businesses are going to continue to deliver, and to deliver very nicely. You saw the flavor numbers, you saw the food colors, and you saw the Asia profit numbers. Not to mention they each had really good EBIT margin growth.

So I feel good about them. I feel good about them in the rest of the year and into next year, but there are moving parts in tax. There are moving parts in COVID, whether there may be additional lockdowns. It’s hard to anticipate at this point. But we’ve kept guidance all year. Some companies removed their guidance; we did not. We’ve been very committed to that. And so, in short, yeah, maybe there are some moving parts, but maybe I’m being a little conservative.

Heidi Vesterinen — Exane — Analyst

Thanks for that. That’s great to hear. And congratulations on the flavors number, by the way; spectacular. So just to focus on that segment, can you confirm that there was nothing really one-off or exceptional in nature in that growth, no pull-forward of demand or anything like that? And also, what was the contribution of volume and price in that flavor growth number?

Paul Manning — Chairman of the Board, President and Chief Executive Officer

So yeah. In flavors, we were certainly up well into the double digits, and that has translated very nicely to the operating profit. So, we’re starting to see that operating leverage I referenced at the beginning of this year, where we would see an increasing profit picture as we went.

In terms of one-off, I would not point to any one-off. I think the demand has been pretty solid, and it’s been pretty well across the board. You look at each of our product categories I referenced in the script. Whether it’s SNI or flavor ingredients or any of our segments for that matter, the growth was really quite good and quite broad-based.

Certainly, there are pockets where demand is struggling considerably, for example, quick service restaurants. I think that’s been a very hard hit area. You look at some other product categories in the sweet flavors and even some of the dairy categories. Some of them not doing as well. Some of them are rebounding. But then, of course, you see seasonings and snacks and things like that doing quite well. So no, I wouldn’t point to any one-off. They just generated a lot of good wins.

I think that our attrition is way down. And one of the things that I talked about last year, though it was somewhat lost in what was going on, is that we had very good win rates last year in the flavor group. What was ultimately suppressing that optically was the fact that we had very high attrition in some of our legacy product lines.

Now that we’ve largely flushed that through, and of course, as you know, we sold two of our businesses, and we expect to sell that third, I think we very strongly removed that headwind. So I suppose that there’s one — if there’s a one-timer that I would point to, maybe it’s those businesses kind of going away, and you can kind of see that distinction between our GAAP and non-GAAP on those business lines. But no, I think ultimately, you’re getting at the sustainability of the flavor results. And I feel really good about flavors moving forward. And I think that mid-single digit revenue is going to be very, very achievable. And I think we’ve got a real nice future going there in flavors.

Heidi Vesterinen — Exane — Analyst

Thank you very much. I’ll get back in the queue.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Okay. Thanks, Heidi.

Operator

Thank you. And our next question is from Mitra Ramgopal with Sidoti. Please go ahead.

Mitra Ramgopal — Sidoti — Analyst

Yes. Hi, good morning. Thanks for taking the questions. I believe in the first half, the net impacts of COVID on EPS was about $0.10. Just wanted to get a sense as to that impact in 3Q and how you see it playing out over the rest of the year.

Stephen J. Rolfs — Senior Vice President and Chief Financial Officer

So on the negative impact, and understand that this is somewhat subjective because we don’t know exactly where every product goes with our customers, we looked at direct costs and then we also looked at the sales impact. So year-to-date, the direct costs are probably close to $5 million. However, we’re seeing an offset from lower travel and other SG&A-type expenses. So, that mitigates most of the direct cost impact.

And so, where we really see the negative impact is on the topline. It is certainly most pronounced within the color group, where our cosmetic business is down about 12% on the topline in the quarter. So, we really have probably in excess of $20 million of revenue year-to-date that we believe was lost as a result of COVID. And what that converts into in terms of EPS is around $0.20.

Mitra Ramgopal — Sidoti — Analyst

Okay. No, that’s great. Thanks for the color there. And obviously, you’ve talked about the operational improvement plan. Was that something that came about as a result of what you’re seeing with the COVID-19 pandemic? Or was that something you were probably going to do in any event as you look to improve efficiencies?

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Yeah, it’s probably a matter of interpretation. For folks who may be newer to the Sensient story, they’ll note that we did have a series of restructuring events over the last number of years. Those were designed to consolidate many of our facilities, shrink some of our capacity and ultimately remove some of our legacy products that had become quite a headwind. We had tremendously high fixed costs in many parts of the Company.

And so, we went through those restructuring programs and took out a lot of that cost. I think some of the impact you see in that operating leverage is obviously very closely linked to those efforts. But we’re always looking at the business. Any time you talk to a plant manager and he’s talking about volume and the need for volume to cover fixed costs, well, you intuitively know you have a fixed cost problem in that facility. With that philosophy, which we’ve applied really everywhere in the Company, we’re always looking for opportunities in different parts of the world and in different business lines, for that matter.

So we’ve done a lot of work there in flavors. We’ve done work there in colors. We’re doing some additional work there in colors now. So I would say, Mitra, it’s really in the normal course of business for us to be looking to operate as efficiently as possible. The rounds that we’ve been working on most closely have really been tied to fixed costs, but we’re not blind to opportunities within SG&A as well, whether that is automating certain processes or standardizing. It’s always in the mix for us.

So, we’re always looking to drive that op margin. There’s a lot more op margin that this Company, I think, is going to be able to produce. Many of you who’ve been with the story for some time will note that flavors is now starting to move up the op margin ladder, and I think that’s going to continue. Certainly, as we get into 2021, I could see that being up another 50 basis points to 100 basis points. And color in Asia right now, you see those folks are at roughly 20%, and 20% plus at times. So, I feel quite good about those groups. And I think the focus here is going to be in flavors, and so a lot of the activity has been in flavors. But yeah, in short, it’s kind of in the normal course of things, I would tell you, Mitra.

Mitra Ramgopal — Sidoti — Analyst

Okay. No, that’s definitely great. And then, just curious on the personal care side. Obviously, you’re experiencing some softness due to COVID, but I believe at one point this is an area you felt there were some really nice opportunities you’d be looking to expand into whether it was oral care, etc. I was wondering if anything has changed on that front.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

No. Cosmetics, or as we oftentimes refer to it, personal care, is an outstanding business for us. It’s generated a lot of profit over the years. It’s very profitable. It’s a very technically driven business; a lot of applications work is required. We have a very extensive portfolio. And it’s a tremendous market with tremendous customers, right? We deal with the who’s who of the cosmetic world.

And so, as you look at that business today, it’s makeup, it’s skin care, it’s hair care, but it’s also more specifically personal care items, as you referenced, such as oral care, things like body wash, and those types of products, which as you can imagine are doing quite well. Well, actually, not so much the oral care. You may find this one interesting, Mitra. A lot fewer people are brushing their teeth nowadays and also chewing gum, right? So be mindful, the next person you get close to — although you shouldn’t get close to them, right, with COVID times.

But you get the idea here; there are some temporary factors in the market that are playing out. The majority of our business, more than 50% of that business, is makeup. And then, as you know, we have a lot within hair care and hair colors, and we also have skin care and other related products. So, we continue to diversify that business. Makeup is still going to be a very good category.

I think as restrictions continue to ease and as we potentially find vaccines and other ways to suppress this virus, we’re going to see a very nice return in that business. And even with that, come March, we start lapping a lot of these really negative headwinds we’ve had in cosmetics. But no, this is an outstanding business. It’s very much going to be a part of our future. The dynamics are outstanding in it. And I think we’ve got a very strong leadership position there to boot.

Mitra Ramgopal — Sidoti — Analyst

Okay. No, that’s great. And then finally, just again, as a result of COVID, obviously, there have been some positive [Indecipherable]. I believe you had talked in the past about a favorable product mix shift in terms of customers transitioning from synthetic to more natural, etc. Just wondering if there are any other trends you’re seeing that you think would really be positive for you longer term.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Well, I think you mentioned one. You mentioned natural colors, which is going to continue to be a very nice trend for our Company. Natural flavors, which has been a long-standing trend, I think that continues. But extracts and functional ingredients in general, whether it’s designed for a nutraceutical product, for a food product or even for a pharmaceutical over-the-counter product, that continues to be a very, very strong part of the portfolio for us.

You probably noted some CPGs talking about returning to their core brands, many of which do not really contain a lot of natural ingredients. I think that’s kind of more of a temporary statement because as I look at our pipelines around the Company, I see a lot of activity continuing in this world of natural colors, extracts, functional ingredients.

So while there may be a small hiatus from new product launches in many of the markets from the large multi-nationals, the level of product launches and pipelines on B and C customers continues to be quite strong. And we continue to generate wins, right. The revenue you’re seeing in the Company right now is no accident. And Steve kind of told you about the headwinds here, but we’ve been able to be successfully winning new projects at a lot of different customers.

And it’s because we’re very committed to continuing to operate this Company. Our employees are very committed to the mission of this Company, and that is providing these essential products throughout the world. But in some cases, we’ve won because customers call up and say, you guys are the only guys who are working.

So we continue to take advantage where we can take advantage. And these products that we have are ideally suited for many of our customers right now who are trying to advance these more health-driven products. But long term, it’s a tremendous portfolio to do just that because I think those trends are trends, they’re by no means fads in any way.

Mitra Ramgopal — Sidoti — Analyst

Okay. No, that’s great. Thanks for taking the questions.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Okay. Thanks, Mitra.

Operator

[Operator Instructions] The next question is a follow-up from Mark Connelly with Stephens. Please go ahead.

Mark Connelly — Stephens — Analyst

Thank you. Just two more. I was hoping you could give us a little bit of a sense of the impact of this restaurant recovery, with restaurants opening at reduced capacities. If this ends up being a new normal for, say, the next year or so, would you have to scale back any of your operations serving that market more than you already have?

Paul Manning — Chairman of the Board, President and Chief Executive Officer

I would say no. When we talk about restaurants, there’s really kind of a very simple interpretation of things. You have quick service, the brands you know and love, and oftentimes, those are serviced through drive-throughs anyway. They are still being hurt, but I think we can ultimately mitigate the impacts from that standpoint. Traditional sit-down restaurants are certainly part of our portfolio, but they don’t constitute a vast part of it. So, the short answer is no. Even if this were to continue, I would not anticipate the need to do any sort of production or supply chain reconfiguration on the food side of things to address that.

Mark Connelly — Stephens — Analyst

Okay. That’s helpful. And just one financial question. I was a little bit high on my cash flow assumptions. Can you tell us if there’s anything that might be swinging in the fourth quarter? And how I should be thinking about working capital next year, assuming that we do have sort of a steady recovery?

Stephen J. Rolfs — Senior Vice President and Chief Financial Officer

Yeah. So year-to-date, our results on cash flow are good. We’re up about 12% in cash flow from operations. There’s a little bit of a dip in the third quarter. One of the things going on there is that there were a number of tax payment deferrals. A lot of companies, Sensient included, did not make their federal tax payments in the first half of the year and then had to catch up in the third quarter, and that was in place in some other countries as well.

And then our sales were very strong throughout the quarter in certain product lines, so I think there’s a little bit of a timing element on receivables. So if there was a little bit of a dip in the quarter, it’s just those two items, but again, we’re still up double digits year-to-date. We’ve made a lot of really nice progress in bringing inventories down, primarily in our flavors and extracts group this year. On a normalized basis, taking out the divestitures, I think we’re down about 24 days year-over-year. We have some additional improvement we can make in flavors and in colors, but we’ve made a lot of improvement over the last year. So, I would look for maybe smaller incremental improvement next year.

Mark Connelly — Stephens — Analyst

Super helpful. Thank you.

Operator

And our next question is also a follow-up and it’s from Heidi Vesterinen with Exane. Please go ahead. Please proceed, Heidi. Perhaps your line is muted on your end.

Heidi Vesterinen — Exane — Analyst

Sorry about that. Thanks for that. So we saw recently that Chr. Hansen’s colors business was sold for nearly 21 times EBITDA. Can you explain how your food and beverage colors business compares with Chr. Hansen, please? Thank you.

Paul Manning — Chairman of the Board, President and Chief Executive Officer

Well, I think Chr. Hansen is a great competitor. And I think under their new ownership, I think they’re going to continue to be a great competitor. So yeah, I guess that’s what I’d say about that.

Operator

Thank you. At this time, I would like to turn the conference back over to the Company for any closing remarks as there are no further questions.

Stephen J. Rolfs — Senior Vice President and Chief Financial Officer

Okay. Thank you very much, everyone. That will conclude our call for this quarter. Thank you for your time this morning. Goodbye.

Operator

[Operator Closing Remarks]

Duration: 43 minutes

 


WSJ News Exclusive | Justice Department to File Long-Awaited Antitrust Suit Against Google


The Justice Department will file an antitrust lawsuit Tuesday alleging that Google engaged in anticompetitive conduct to preserve monopolies in search and search-advertising that form the cornerstones of its vast conglomerate, according to senior Justice officials.

The long-anticipated case, expected to be filed in a Washington, D.C., federal court, will mark the most aggressive U.S. legal challenge to a company’s dominance in the tech sector in more than two decades, with the potential to shake up Silicon Valley and beyond. Once a public darling, Google attracted considerable scrutiny over the past decade as it gained power but has avoided a true showdown with the government until now.

The department will allege that Google, a unit of Alphabet Inc., is maintaining its status as gatekeeper to the internet through an unlawful web of exclusionary and interlocking business agreements that shut out competitors, officials said. The government will allege that Google uses billions of dollars collected from advertisements on its platform to pay mobile-phone manufacturers, carriers and browsers, like Apple Inc.’s Safari, to maintain Google as their preset, default search engine.

The upshot is that Google has pole position in search on hundreds of millions of American devices, with little opportunity for any competitor to make inroads, the government will allege.

Justice officials said the lawsuit will also take aim at arrangements in which Google’s search application is preloaded, and can’t be deleted, on mobile phones running its popular Android operating system. The government will allege Google unlawfully prohibits competitors’ search applications from being preloaded on phones under revenue-sharing arrangements, they said.

Google owns or controls search distribution channels accounting for about 80% of search queries in the U.S., the officials said. That means Google’s competitors can’t get a meaningful number of search queries and build a scale needed to compete, leaving consumers with less choice and less innovation, and advertisers with less competitive prices, the lawsuit will allege.

Google didn’t immediately respond to a request for comment, but the company has said its competitive edge comes from offering a product that billions of people choose to use each day.

The Mountain View, Calif., company, sitting on a $120 billion cash hoard, is unlikely to shrink from a legal fight. The company has argued that it faces vigorous competition across its different operations and that its products and platforms help businesses small and large reach new customers.

Google’s defense against critics of all stripes has long been rooted in the fact that its services are largely offered to consumers at little or no cost, undercutting the traditional antitrust argument around potential price harms to those who use a product.

The lawsuit follows a Justice Department investigation that has stretched more than a year, and comes amid a broader examination of the handful of technology companies that play an outsize role in the U.S. economy and the daily lives of most Americans.

A loss for Google could mean court-ordered changes to how it operates parts of its business, potentially creating new openings for rival companies. The Justice Department’s lawsuit won’t specify particular remedies; that is usually addressed later in a case. One Justice official said nothing is off the table, including possibly seeking structural changes to Google’s business.

A victory for Google could deal a huge blow to Washington’s overall scrutiny of big tech companies, potentially hobbling other investigations and enshrining Google’s business model after lawmakers and others challenged its market power. Such an outcome, however, might spur Congress to take legislative action against the company.

The case could take years to resolve, and the responsibility for managing the suit will fall to the appointees of whichever candidate wins the Nov. 3 presidential election.

The challenge marks a new chapter in the history of Google, a company formed in 1998 in a garage in a San Francisco suburb—the same year Microsoft Corp. was hit with a blockbuster government antitrust case accusing the software giant of unlawful monopolization. That case, which eventually resulted in a settlement, was the last similar government antitrust case against a major U.S. tech firm.

Google’s billionaire co-founders Sergey Brin, left, and Larry Page, shown in 2008, gave up their management roles but remain in effective control of the company.



Photo: Paul Sakuma/Associated Press

Google started as a simple search engine with a large and amorphous mission “to organize the world’s information.” But over the past decade or so it has developed into a conglomerate that does far more than that. Its flagship search engine handles more than 90% of global search requests, some billions a day, providing fodder for what has become a vast brokerage of digital advertising. Its YouTube unit is the world’s largest video platform, used by nearly three-quarters of U.S. adults.

Google has been bruised but never visibly hurt by various controversies surrounding privacy and allegedly anticompetitive behavior, and its growth has continued almost entirely unchecked. In 2012, the last time Google faced close antitrust scrutiny in the U.S., the search giant was already one of the largest publicly traded companies in the nation. Since then, its market value has roughly tripled to almost $1 trillion.

The company takes on this legal showdown under a new generation of leadership. Co-founders Larry Page and Sergey Brin, both billionaires, gave up their management roles last year, handing the reins solely to Sundar Pichai, a soft-spoken, India-born engineer who earlier in his career helped present Google’s antitrust complaints about Microsoft to regulators.

The chief executive has in his corner Messrs. Page and Brin, who remain on Alphabet’s board and in effective control of the company thanks to shares that give them, along with former Chief Executive Eric Schmidt, disproportionate voting power.


Executives inside Google are quick to portray their divisions as mere startups in areas—like hardware, social networking, cloud computing and health—where other Silicon Valley giants are further ahead. Still, that Google has such breadth at all points to its omnipresence.

European Union regulators have targeted the company with three antitrust complaints and fined it about $9 billion, though the cases haven’t left a big imprint on Google’s businesses there, and critics say the remedies imposed on it have proved underwhelming.

In the U.S., nearly all state attorneys general are separately investigating Google, while three other tech giants—Facebook Inc., Apple and Amazon.com Inc.—likewise face close antitrust scrutiny. And in Washington, a bipartisan belief is emerging that the government should do more to police the behavior of top digital platforms that control widely used tools of communication and commerce.

More than 10 state attorneys general are expected to join the Justice Department’s case, officials said. Other states are still considering their own cases related to Google’s search practices, and a large group of states is considering a case challenging Google’s power in the digital advertising market, The Wall Street Journal has reported. In the ad-technology market, Google owns industry-leading tools at every link in the complex chain between online publishers and advertisers.

The Justice Department also continues to investigate Google’s ad-tech practices.

Democrats on a House antitrust subcommittee released a report this month following a 16-month inquiry, saying all four tech giants wield monopoly power and recommending congressional action. The companies’ chief executives testified before the panel in July.

Google CEO Sundar Pichai testified before Congress in July, in hearings where lawmakers pressed tech companies’ leaders on their business practices.



Photo: Graeme Jennings/Press Pool

Big Tech Under Fire

The Justice Department isn’t alone in scrutinizing tech giants’ market power. These are the other inquiries now under way:

  • Federal Trade Commission: The agency has been examining Facebook’s acquisition strategy, including whether it bought platforms like WhatsApp and Instagram to stifle competition. People following the case believe the FTC is likely to file suit by the end of the year.
  • State attorneys general: A group of state AGs led by Texas is investigating Google’s online advertising business and expected to file a separate antitrust case. Another group of AGs is reviewing Google’s search business. Still another, led by New York, is probing Facebook over antitrust concerns.
  • Congress: After a lengthy investigation, House Democrats found that Amazon holds monopoly powers over its third-party sellers and that Apple exerts monopoly power through its App Store. Those findings and others targeting Facebook and Google could trigger legislation. Senate Republicans are separately moving to limit Section 230 of the Communications Decency Act, which gives online platforms a liability shield, saying the companies censor conservative views.
  • Federal Communications Commission: The agency is reviewing a Trump administration request to reinterpret key parts of Section 230, for the same reasons cited by GOP senators. Tech companies are expected to challenge possible action on free-speech grounds.

“It’s Google’s business model that is the problem,” Rep. David Cicilline (D., R.I.), the subcommittee chairman, told Mr. Pichai. “Google evolved from a turnstile to the rest of the web to a walled garden that increasingly keeps users within its sights.”

“We see vigorous competition,” Mr. Pichai responded, pointing to travel search sites and product searches on Amazon’s online marketplace. “We are working hard, focused on the users, to innovate.”

Amid the criticism, Google and other tech giants remain broadly popular and have only gained in might and stature since the start of the coronavirus pandemic, buoying the U.S. economy—and stock market—during a period of deep uncertainty.

At the same time, Google’s growth across a range of business lines over the years has expanded its pool of critics, with companies that compete with the search giant, as well as some Google customers, complaining about its tactics.

Specialized search providers like Yelp Inc. and Tripadvisor Inc. have long voiced such concerns to U.S. antitrust authorities, and newer upstarts like search-engine provider DuckDuckGo have spent time talking to the Justice Department.

News Corp, owner of The Wall Street Journal, has complained to antitrust authorities at home and abroad about both Google’s search practices and its dominance in digital advertising.

Some Big Tech detractors have called to break up Google and other dominant companies. Courts have indicated such broad action should be a last resort available only if the government clears high legal hurdles, including by showing that lesser remedies are inadequate.

The outcome could have a considerable impact on the direction of U.S. antitrust law. The Sherman Act that prohibits restraints of trade and attempted monopolization is broadly worded, leaving courts wide latitude to interpret its parameters. Because litigated antitrust cases are rare, any one ruling could affect governing precedent for future cases.

Google’s growth across a range of business lines has expanded its pool of critics. The company exhibited at the CES 2020 electronics show in Las Vegas on Jan. 8.



Photo: Mario Tama/Getty Images

The tech sector has been a particular challenge for antitrust enforcers and the courts because the industry evolves rapidly and many products and services are offered free to consumers, who in a sense pay with the valuable personal data companies such as Google collect.

The search company famously outmaneuvered the Federal Trade Commission nearly a decade ago.

The FTC, which shares antitrust authority with the Justice Department, spent more than a year investigating Google but decided in early 2013 not to bring a case in response to complaints that the company engaged in “search bias” by favoring its own services and demoting rivals. Competition staff at the agency deemed the matter a close call, but said a case challenging Google’s search practices could be tough to win because of what they described as mixed motives within the company: a desire to both hobble rivals and advance quality products and services for consumers.

The Justice Department’s case won’t focus on a search-bias theory, Justice officials said.

Google made a handful of voluntary commitments to address other FTC concerns, a resolution that was widely panned by advocates of stronger antitrust enforcement and continues to be cited as a top failure. Google’s supporters say the FTC’s light touch was appropriate and didn’t burden the company as it continued to grow.


The Justice Department’s current antitrust chief, Makan Delrahim, spent months negotiating with the FTC last year for jurisdiction to investigate Google this time around. He later recused himself in the case—Google was briefly a client years before while he was in private practice—as the department’s top brass moved to take charge.

The Justice Department lawsuit comes after internal tensions, with some staffers skeptical of Attorney General William Barr’s push to bring a case as quickly as possible, the Journal has reported. The reluctant staffers worried the department hadn’t yet built an airtight case and feared rushing to litigation could lead to a loss in court. They also worried Mr. Barr was driven by an interest in filing a case before the election. Others were more comfortable moving ahead.

Mr. Barr has pushed the department to move forward under the belief that antitrust enforcers have been too slow and hesitant to take action, according to a person familiar with his thinking. He has taken an unusually hands-on role in several areas of the department’s work and repeatedly voiced interest in investigating tech-company dominance.

Attorney General William Barr has pushed to bring an antitrust case quickly against Google, in some cases taking an unusually hands-on role in preparations.



Photo: Matt McClain/Press Pool

If the Microsoft case from 20 years ago is any guide, Mr. Barr’s concern with speed could run up against the often slow pace of litigation.

After a circuitous route through the court system, including one initial trial-court ruling that ordered a breakup, Microsoft reached a 2002 settlement with the government and changed some aspects of its commercial behavior but stayed intact. It remained under court supervision and subject to terms of its consent decree with the government until 2011.

Antitrust experts have long debated whether the settlement was tough enough on Microsoft, though most observers believe the agreement opened up space for a new generation of competitors.

Write to Brent Kendall at brent.kendall@wsj.com and Rob Copeland at rob.copeland@wsj.com

Copyright ©2020 Dow Jones & Company, Inc. All Rights Reserved.



You Reap What You Code


2020/10/20


This is a loose transcript of my talk at Deserted Island DevOps Summer Send-Off, an online conference in COVID-19 times. One really special thing about it is that the whole conference takes place over the Animal Crossing video game, with quite an interesting setup.

It was the last such session of the season, and I was invited to present with few demands. I decided to make a compressed version of a talk I had been mulling over for close to a year, which I had lined up for at least one in-person conference that got cancelled in April, and which I had given in its full hour-long length internally at work. The final result is a condensed 30 minutes that touches all kinds of topics, some of which have been borrowed from previous talks and blog posts of mine.

If I really wanted to, I could probably make one shorter blog post out of every one or two slides in there, but I decided to go for coverage rather than depth. Here goes nothing.

'You Reap What You Code': shows my character in-game sitting at a computer with a bunch of broken parts around, dug from holes in the ground

So today I wanted to give a talk on this tendency we have as software developers and engineers to write code and deploy things that end up being a huge pain to live with, to an extent we hadn’t planned for.

In software, a pleasant surprise is writing for an hour without compiling once and then it works; a nasty surprise is software that seems to work and after 6 months you find out it poisoned your life.

This presentation is going to be a high-level thing, and I want to warn you that I’m going to go through some philosophical concerns at first, follow that up with research that has taken place in human factors and cognitive science, and tie that up with broad advice that I think could be useful to everyone when it comes to systems thinking and designing things. A lot of this may feel a bit out there, but I hope that by the end it’ll feel useful to you.

'Power and Equity; Ivan Illich' shows a screenshot of the game with a little village-style view

This is the really philosophical stuff we’re starting with. Ivan Illich was a wild ass philosopher who hated things like modern medicine and mandatory education. He wrote this essay called “Power and Equity” (to which I was introduced by reading a Stephen Krell presentation), where he decides to also dislike all sorts of motorized transportation.

Illich introduces the concept of an “oppressive” monopoly: if we look at societies that developed for foot traffic and cycling, you can generally use any means of transportation whatsoever and effectively manage to live and thrive there. Whether you live in a tent or a mansion, you can get around the same.

He pointed out that cycling was innately fair because it does not require more energy than what is required as a baseline to operate: if you can walk, you can cycle, and cycling, for the same energy as walking, is incredibly more efficient. Cars don’t have that; they are rather expensive, and require disproportionate amounts of energy compared to what a basic person has.

His suggestion was that all non-freight transport, whether cars or buses and trains, be capped to a fixed percentage above the average speed of a cyclist, which is based on the power a normal human body can produce on its own. He suggested we do this to prevent…

Aerial stock photo of an American suburb

that!

We easily conceived of cars as a way to make existing burdens lighter: they created freedoms and widened our access to goods and people. The car was a better horse, and a less exhausting bicycle. And so society developed to embrace cars in its infrastructure.

Rather than having a merchant bring goods to the town square, the milkman drop milk on the porch, and markets smaller and distributed closer to where they’d be convenient, it is now everyone’s job to drive for each of these things while stores go to where land is cheap rather than where people are. And when society develops with a car in mind, you now need a car to be functional.

In short, the cost of participating in society has gone up, and that's what an oppressive monopoly is.

'The Software Society': Van Bentum's painting The Explosion in the Alchemist's Laboratory

To me, the key thing that Illich did was twist the question another way: what effects would cars have on society if a majority of people had them, and what effect would that have on the rest of us?

The question I now want to ask is whether we have the equivalent in the software world. What are the things we do that we perceive increase our ability to do things, but turn out to actually end up costing us a lot more to just participate?

We kind of see it with our ability to use all the bandwidth a user may have; trying to use old dial-up connections is flat out unworkable these days. But do we have the same with our cognitive cost? The tooling, the documentation, the procedures?

'Ecosystems; we share a feedback loop': a picture of an in-game aquarium within the game's museum

I don’t have a clear answer to any of this, but it’s a question I ask myself a lot when designing tools and software.

The key point is that the software and practices we choose to use are not just things we do in a vacuum, but part of an ecosystem; whatever we add to it changes and shifts expectations in ways that are out of our control, and impacts us back again. The software isn't trapped with us, we're trapped with the software.

Are we not ultimately just making our life worse for it? I want to focus on this part where we make our own life, as developers, worse: when we write or adopt software to help ourselves but end up harming ourselves in the process, because that speaks to our own sustainability.

'Ironies of Automation' (Bainbridge, 1983): a still from Fantasia's broom scene

Now we’re entering the cognitive science and human factors bit.

Rather than just being philosophical here, I want to ground things in the real world with practical effects, because this is something that researchers have covered. The ironies of automation come from cognitive research (Bainbridge, 1983) that looked into people automating tasks and found that the effects weren't as good as expected.

Mainly, it’s attention and practice clashing. There are tons of examples over the years, but let’s take a look at a modern one with self-driving cars.

Self-driving cars are a fantastic case of clumsy automation. What most established players in the car industry are doing is lane tracking, blind spot detection, and handling parallel parking.

But high-tech companies (Tesla, Waymo, Uber) are working towards full self-driving, with Tesla's Autopilot being the most ambitious one released to the public at large. But all of these right now operate in ways Bainbridge fully predicted in 1983:

  • the driver is no longer actively involved and is shifted to the role of monitoring
  • the driver, despite no longer driving the car, must still be fully aware of everything the car is doing
  • when the car gets in a weird situation, it is expected that the driver takes control again
  • so the car handles all the easy cases, but all the hard cases are left to the driver

Part of the risk there is twofold: people have limited attention for tasks they are not involved in—if you’re not actively driving it’s going to be hard to be attentive for extended periods of time—and if you’re only driving rarely with only the worst cases, you risk being out of practice to handle the worst cases.

Similar automation exists in aviation, where airlines make up for it with simulator hours and by still manually handling planned difficult phases like takeoff and landing. Even so, a number of airline incidents show that this hand-off is complex and often does not go well.

Clearly, when we ignore the human component and its responsibilities in things, we might make software worse than what it would have been.

'HABA-MABA problems': a chart illustrating Fitts's model using in-game images

In general, most of these errors come from the following point of view. This is called the "Fitts" model, also "HABA-MABA", for "Humans are better at, machines are better at" (the original version was referred to as MABA-MABA, using "Men" rather than "Humans"). This model frames humans as slow, perceptive beings capable of judgement, and machines as fast, undiscerning, indefatigable things.

We hear this a whole lot even today. This is, to be polite, a beginner's approach to automation design. It's based on scientifically outdated concepts and intuitive-but-wrong sentiments; it's comforting in letting you think that only the predicted results will happen, and it totally ignores any emergent behaviour. It operates on what we think we see now, not on stronger underlying principles, and often has strong limitations when it comes to being applied in practice.

It is disconnected from the reality of human-machine interactions, and frames choices as binary when they aren’t, usually with the intent of pushing the human out of the equation when you shouldn’t. This is, in short, a significant factor behind the ironies of automation.

'Joint Cognitive Systems': a chart illustrating the re-framing of computers as teammates

Here’s a patched version established by cognitive experts. They instead reframe the human-computer relationship as a “joint cognitive system”, meaning that instead of thinking of humans and machines as unrelated things that must be used in distinct contexts for specific tasks, we should frame humans and computers as teammates working together. This, in a nutshell, shifts the discourse from how one is limited to terms of how one can complement the other.

Teammates do things like being predictable to each other, sharing a context and language, being able to notice when their actions may impact others and adjust accordingly, communicate to establish common ground, and have an idea of everyone’s personal and shared objectives to be able to help or prioritize properly.

Of course we must acknowledge that we're nowhere close to computers being teammates, given the state of the art today. And since computers currently need us to keep realigning them all the time, we have to admit that the system is not just the code and the computers; it's the code, the computers, and all the people who interact with them and each other. And if we want our software to help us, we need to be able to help it, which means the software needs to be built with the knowledge that it will be full of limitations, and with us working to make it easier to diagnose issues and to form and improve mental models.

So the question is: what makes a good model? How can we help people work with what we create?

'How People Form Models': a detailed road map of the city of London, UK

note: this slide and the next one are taken from my talk on operable software

This is a map of the city of London, UK. It is not the city of London, just a representation of it. It’s very accurate: it has streets with their names, traffic directions, building names, rivers, train stations, metro stations, footbridges, piers, parks, gives details regarding scale, distance, and so on. But it is not the city of London itself: it does not show traffic nor roadwork, it does not show people living there, and it won’t tell you where the good restaurants are. It is a limited model, and probably an outdated one.

But even if it's really limited, it is very detailed. Detailed enough that pretty much nobody out there can fit it all in their head. Most people will have some detailed knowledge of some parts of it, like the zoomed-in square in the image, but pretty much nobody will just know the whole of it in all dimensions.

In short, pretty much everyone in your system only works from partial, incomplete, and often inaccurate and outdated data, which itself is only an abstract representation of what goes on in the system. In fact, what we work with might be more similar to this:

A cartoony tourist map of London's main attractions

That’s more like it. This is still not the city of London, but this tourist map of London is closer to what we work with. Take a look at your architecture diagrams (if you have them), and chances are they look more like this map than the very detailed map of London. This map has most stuff a tourist would want to look at: important buildings, main arteries to get there, and some path that suggests how to navigate them. The map has no well-defined scale, and I’m pretty sure that the two giant people on Borough road won’t fit inside Big Ben. There are also lots of undefined areas, but you will probably supplement them with other sources.

But that’s alright, because mental models are as good as their predictive power; if they let you make a decision or accomplish a task correctly, they’re useful. And our minds are kind of clever in that they only build models as complex as they need to be. If I’m a tourist looking for my way between main attractions, this map is probably far more useful than the other one.

There's a fun saying about this: "Something does not exist until it is broken." Subjectively, you can be entirely content operating a system for a long time without ever knowing about entire aspects of it. It's when things start breaking, or when your predictions about the system no longer work, that you have to go back and re-tune your mental models. And since this is all very subjective, everyone has different models.

This is a vague answer to what a good model is, and the follow-up question is: how can we create and maintain them?

'Syncing Models': a still from the video game in the feature where you back up your island by uploading it online

One simple step, outside of all technical components, is to challenge and help each other to sync and build better mental models. We can’t easily transfer our own models to each other, and in fact it’s pretty much impossible to control them. What we can do is challenge them to make sure they haven’t eroded too much, and try things to make sure they’re still accurate, because things change with time.

So in a corporation, things like training, documentation, and incident investigations all help surface aspects of, and changes to, our systems to everyone. Game days and chaos engineering are also excellent ways to discover how our models might be broken in a controlled setting.

They’re definitely things we should do and care about, particularly at an organisational level. That being said, I want to focus a bit more on the technical stuff we can do as individuals.

'Layering Observability': a drawing of abstraction layers and observation probes' locations

note: this slide is explored more in depth in my talk on operable software

We can’t just open a so-called glass pane and see everything at once. That’s too much noise, too much information, too little structure. Seeing everything is only useful to the person who knows what to filter in and filter out. You can’t easily form a mental model of everything at once. To aid model formation, we should structure observability to tell a story.

Most applications and components you use that are easy to operate do not expose their internals to you, they mainly aim to provide visibility into your interactions with them. There has to be a connection between the things that the users are doing and the impact it has in or on the system, and you will want to establish that. This means:

  • Provide visibility into interactions between components, not their internals
  • Log at the layer below the one you want to debug; this saves time and reduces how many observability probes you need to insert in your code base. We have a tendency to stick everything at the app level, but that's misguided.
  • This means the logs around a given endpoint have to be about the user interactions with that endpoint, and require no knowledge of its implementation details
  • For developer logs, you can have one log statement shared by all the controllers by inserting it a layer below endpoints within the framework, rather than having to insert one for each endpoint.
  • These interactions will let people make a mental picture of what should be going on and spot where expectations are broken more easily. By layering views, you then make it possible to skip between layers according to which expectations are broken and how much knowledge people have.
  • Where a layer provides no easy observability, people must cope through inferences in the layers above and below it. It becomes a sort of obstacle.

Often we are stuck with only observability at the highest level (the app) or the lowest level (the operating system), with nearly nothing useful in-between. We have a blackbox sandwich where we can only look at some parts, and that can be a consequence of the tools we choose. You’ll want to actually pick runtimes and languages and frameworks and infra that let you tell that observability story and properly layer it.

'Logging Practices': a game character chopping down trees

Another thing that helps with model formation is keeping that relationship between humans and machines going smoothly. This is a trust relationship, and providing information that is considered misleading or unhelpful erodes that trust. There are a few things you can do with logs that can help not ruin your marriage to the computer.

The main one is to log facts, not interpretations. You often do not have all the context from within a single log line, just a tiny part of it. If you start trying to be helpful and suggesting things to people, you change what is a fact-gathering expedition into a murder-mystery investigation where bits of the system can't be trusted or you have to read between the lines. That's not helpful. A log line that says TLS validation error: SEC_ERROR_UNKNOWN_ISSUER is much better than one that says ERROR: you are being hacked, regardless of how much experience you have.

A thing that helps with that is structured logging, which is better than regular text. It makes it easier for people to use scripts or programs to parse, aggregate, route, and transform logs. It prevents you from needing full-text search to figure out what happened. If you really want to provide human readable text or interpretations, add it to a field within structured logging.

Finally, adopting consistent naming mechanisms and units is always going to prove useful.

'Hitting Limits': the game's museum's owl being surprised while woken up

There is another thing called the Law of Requisite Variety, which says that only complexity can control complexity. If an agent can't represent all the possible states and circumstances around a thing it tries to control, it won't be able to control all of it. Think of an airplane's flight stabilizers: they're able to cope with only a limited amount of adjustment, usually at a higher rate than we humans could. Unfortunately, once they reach a certain limit in their actions and the things they can perceive, they stop working well.

That's when control is either ineffective, or passed on to the next best thing. In the case of software we run and operate, that's us; we're the next best thing. And here we fall into the old idea that if you write something as cleverly as you can, you're in trouble, because you need to be twice as clever to debug it.

That’s because to debug a system that is misbehaving under automation, you need to understand the system, and then understand the automation, then understand what the automation thinks of the system, and then take action.

That’s always kind of problematic, but essentially, brittle automation forces you to know more than if you had no automation in order to make things work in difficult times. Things can then become worse than if you had no automation in the first place.

'Handle Hand-Offs First': this in-game owl/museum curator accepting a bug he despises for his collection

When you start creating a solution, do it while being aware that it is possibly going to be brittle and will require handing control over to a human being. Focus on the path where the automation fails and on how the hand-off will take place. How are you going to communicate that, and which clues or actions will an operator need in order to take over?

When we accept and assume that automation will reach its limits, and the thing that it does is ask a human for help, we shift our approach to automation. Make that hand-off path work easily. Make it friendly, and make it possible for the human to understand what the state of automation was at a given point in time so you can figure out what it was doing and how to work around it. Make it possible to guide the automation into doing the right thing.

Once you’ve found your way around that, you can then progressively automate things, grow the solution, and stay in line with these requirements. It’s a backstop for bad experiences, similar to “let it crash” for your code, so doing it well is key.

'Curb Cut Effect': a sidewalk with the classic curb cut in it

Another thing that I think is interesting is the curb cut effect. The curb cut effect was noticed as a result of the various American laws about accessibility that started in the 60s. The idea is that to make sidewalks and streets accessible to people in wheelchairs, you would cut away part of the curb so that it would create a ramp from sidewalk to street.

The thing that people noticed is that even though you’d cut the curb for handicapped people, getting around was now easier for people carrying luggage, pushing strollers, on skateboards or bicycles, and so on. Some studies saw that people without handicaps would even deviate from their course to use the curb cuts.

Similar effects are found when you think of something like subtitles which were put in place for people with hearing problems. When you look at the raw number of users today, there are probably more students using them to learn a second or third language than people using them with actual hearing disabilities. Automatic doors that open when you step in front of them are also very useful for people carrying loads of any kind, and are a common example of doing accessibility without “dumbing things down.”

I’m mentioning all of this because I think that keeping accessibility in mind when building things is one of the ways we can turn nasty negative surprises into pleasant emerging behaviour. And generally, accessibility is easier to build in than to retrofit. In the case of the web, accessibility also lines up with better performance.

If you think about diversity in broader terms, how would you rethink your dashboards and monitoring and on-call experience if you were to run it 100% on a smartphone? What would that let people on regular computers do that they cannot today? Ask the same question but with user bases that have drastically different levels of expertise.

I worked with an engineer who used to work in a power station and the thing they had set up was that during the night, when they were running a short shift, they’d generate an audio file that contained all the monitoring metrics. They turned it into a sort of song, and engineers coming in in the morning would listen to it on fast forward to look for anomalies.

Looking at these things can be useful. If you prepare for your dashboard users to be colorblind, would customizing colors be useful? And could that open up new regular use cases, such as annotating metrics that tend to look weird and that you want to keep an eye on?

And so software shouldn't be about doing more with less; it should be about requiring less to do more, as in letting other people do more with less.

'Complexity Has To Live Somewhere': in-game's 'The Thinker' sitting at a desk, looking like it's pondering at papers

note: this slide is a short version of my post on Complexity Has to Live Somewhere

A thing we try to do, especially as software engineers, is to try to keep the code and the system—the technical part of the system—as simple as possible. We tend to do that by finding underlying concepts, creating abstractions, and moving things outside of the code. Often that means we rely on some sort of convention.

When that happens, what really goes on is that the complexity of how you chose to solve a problem still lingers around. Someone has to handle the thing. If you don’t, your users have to do it. And if it’s not in the code, it’s in your operators or the people understanding the code. Because if the code is to remain simple, the difficult concepts you abstracted away still need to be understood and present in the world that surrounds the code.

I find it important to keep that in mind. There’s this kind of fixed amount of complexity that moves around the organization, both in code and in the knowledge your people have.

Think of how people interact with the features day to day. What do they do, how does it impact them? What about the network of people around them? How do they react to that? Would you approach software differently if you think that it’s still going to be around in 5, 10, or 20 years when you and everyone who wrote it has left? If so, would that approach help people who join in just a few months?

One of the things I like to think about is that instead of using military analogies of fights and battles, it’s interesting to frame it in terms of gardens or agriculture. When we frame the discussion that we have in terms of an ecosystem and the people working collectively within it, the way we approach solving problems can also change drastically.

'Replacing, Adding, or Diffusing?': the trolley problem re-enacted with in-game items

Finally, one of the things I want to mention briefly is this little thought framework I like when we’re adopting new technology.

When we first adopt a new piece of technology, the thing we try to do—or tend to do—is to start with the easy systems first. Then we say "oh that's great! That's going to replace everything we have." Eventually, we try to migrate everything, but it doesn't always work.

So an approach that makes sense is to start with the easy stuff to prove that it's workable for the basic cases. But also try something really, really hard, because that would be the endpoint. The endgame is to migrate the hardest thing that you've got.

If you’re not able to replace everything, consider framing things as adding it to your system rather than replacing. It’s something you add to your stack. This framing is going to change the approach you have in terms of teaching, maintenance, and in terms of pretty much everything that you have to care about so you avoid the common trap of deprecating a piece of critical technology with nothing to replace it. If you can replace a piece of technology then do it, but if you can’t, don’t fool yourself. Assume the cost of keeping things going.

The third one there is diffusing. I think diffusing is something we do implicitly when we do DevOps. We took the Ops responsibilities and the Dev responsibilities, and instead of having them in different areas with small groups of experts in dev and in operations, you end up making it everybody's responsibility to be aware of all aspects.

That creates a diffusion which, in this case, can be positive: you want everyone to be handling a task. But if you look at the way some organisations are handling containerization, it can be a bunch of operations people who no longer have to care about that aspect of their job, while all of the development teams now have to know and understand how containers work, how to deploy them, and adapt their workflow accordingly.

In such a case we haven’t necessarily replaced or removed any of the needs for deployment. We’ve just taken it outside of the bottleneck and diffused it and sent it to everyone else.

I think having an easy way, early in the process, to figure out whether what we’re doing is replacing, adding, or diffusing things will drastically influence how we approach change at an organisational level. I think it can be helpful.

'Thanks': title slide again

This is all I have for today. Hopefully it was practical.

Thanks!

 


Latest

The Surprising Impact of Medium-Size Texts on PostgreSQL Performance

Mish Boyka

Published

on

 


Any database schema is likely to have plenty of text fields. In this article, I divide text fields into three categories:

  1. Small texts: names, slugs, usernames, emails, etc. These are text fields that usually have some low size limit, maybe even using varchar(n) and not text.
  2. Large texts: blog post content, articles, HTML content etc. These are large pieces of free, unrestricted text that are stored in the database.
  3. Medium texts: descriptions, comments, product reviews, stack traces etc. These are any text fields that fall between the small and the large. These types of texts would normally be unrestricted, but naturally smaller than the large texts.

In this article I demonstrate the surprising impact of medium-size texts on query performance in PostgreSQL.

Sliced bread… it gets better
Photo by Louise Lyshøj

When talking about large chunks of text, or any other field that may contain large amounts of data, we first need to understand how the database handles the data. Intuitively, you might think that the database is storing large pieces of data inline like it does smaller pieces of data, but in fact, it does not:

PostgreSQL uses a fixed page size (commonly 8 kB), and does not allow tuples to span multiple pages. Therefore, it is not possible to store very large field values directly.

As the documentation explains, PostgreSQL can’t store rows (tuples) in multiple pages. So how does the database store large chunks of data?

[…] large field values are compressed and/or broken up into multiple physical rows. […] The technique is affectionately known as TOAST (or “the best thing since sliced bread”).

OK, so how does this TOAST work exactly?

If any of the columns of a table are TOAST-able, the table will have an associated TOAST table

So TOAST is a separate table associated with our table. It is used to store large pieces of data of TOAST-able columns (the text datatype for example, is TOAST-able).

What constitutes a large value?

The TOAST management code is triggered only when a row value to be stored in a table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code will compress and/or move field values out-of-line until the row value is shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB, adjustable) or no more gains can be had

PostgreSQL will try to compress the large values in the row, and if the row still can't fit within the limit, the values will be stored out-of-line in the TOAST table.

Finding the TOAST

Now that we have some understanding of what TOAST is, let’s see it in action. First, create a table with a text field:

db=# CREATE TABLE toast_test (id SERIAL, value TEXT);
CREATE TABLE

The table contains an id column, and a value field of type TEXT. Notice that we did not change any of the default storage parameters.
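
If you want to double-check which storage strategy PostgreSQL assigned to each column, one way is to query the pg_attribute catalog. This is only a small illustrative sketch; the CASE mapping below just spells out the four documented strategies:

SELECT attname,
       atttypid::regtype AS type,
       CASE attstorage
         WHEN 'p' THEN 'PLAIN'     -- always inline, never compressed
         WHEN 'm' THEN 'MAIN'      -- prefers inline, compression allowed
         WHEN 'e' THEN 'EXTERNAL'  -- out-of-line allowed, no compression
         WHEN 'x' THEN 'EXTENDED'  -- out-of-line and compression allowed
       END AS storage
FROM pg_attribute
WHERE attrelid = 'toast_test'::regclass
  AND attnum > 0
  AND NOT attisdropped;

On a default setup this should report PLAIN for the integer id column and EXTENDED for the text value column, which is exactly what makes the value column TOAST-able.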

The text field we added supports TOAST, or is TOAST-able, so PostgreSQL should create a TOAST table. Let’s try to locate the TOAST table associated with the table toast_test in pg_class:

db=# SELECT relname, reltoastrelid FROM pg_class WHERE relname = 'toast_test';
  relname   │ reltoastrelid
────────────┼───────────────
 toast_test │        340488

db=# SELECT relname FROM pg_class WHERE oid = 340488;
     relname
─────────────────
 pg_toast_340484

As promised, PostgreSQL created a TOAST table called pg_toast_340484.

TOAST in Action

Let’s see what the TOAST table looks like:

db=# \d pg_toast.pg_toast_340484
TOAST table "pg_toast.pg_toast_340484"
   Column   │  Type
────────────┼─────────
 chunk_id   │ oid
 chunk_seq  │ integer
 chunk_data │ bytea

The TOAST table contains three columns:

  • chunk_id: A reference to a toasted value.
  • chunk_seq: A sequence within the chunk.
  • chunk_data: The actual chunk data.

Similar to “regular” tables, the TOAST table also has the same restrictions on inline values. To overcome this restriction, large values are split into chunks that can fit within the limit.

At this point the table is empty:

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

This makes sense because we did not insert any data yet. So next, insert a small value into the table:

db=# INSERT INTO toast_test (value) VALUES ('small value');
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

After inserting the small value into the table, the TOAST table remained empty. This means the small value was small enough to be stored inline, and there was no need to move it out-of-line to the TOAST table.

Small text stored inline

Let’s insert a large value and see what happens:

db=# INSERT INTO toast_test (value) VALUES ('n0cfPGZOCwzbHSMRaX8 ... WVIlRkylYishNyXf');
INSERT 0 1

I shortened the value for brevity, but that’s a random string with 4096 characters. Let’s see what the TOAST table stores now:

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼──────────────────────
   995899 │         0 │ \x30636650475a4f43...
   995899 │         1 │ \x50714c3756303567...
   995899 │         2 │ \x6c78426358574534...
(3 rows)

The large value is stored out-of-line in the TOAST table. Because the value was too large to fit inline in a single row, PostgreSQL split it into three chunks. The \x3063... notation is how psql displays binary data.

Large text stored out-of-line, in the associated TOAST table

Finally, execute the following query to summarize the data in the TOAST table:

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;
 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
   995899 │      3 │ 4096 bytes
(1 row)

As we’ve already seen, the text is stored in three chunks.

Size of Database Objects

There are several ways to get the size of database objects in PostgreSQL:

  • pg_table_size: Get the size of the table including TOAST, but excluding indexes
  • pg_relation_size: Get the size of just the table
  • pg_total_relation_size: Get the size of the table, including indexes and TOAST

Another useful function is pg_size_pretty: used to display sizes in a friendly format.
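
As an illustration, here is one way to compare these numbers for the toast_test table we have been using (the exact sizes will vary from system to system):

SELECT pg_size_pretty(pg_relation_size('toast_test'))       AS table_only,
       pg_size_pretty(pg_table_size('toast_test'))          AS table_and_toast,
       pg_size_pretty(pg_total_relation_size('toast_test')) AS total_with_indexes;

The gap between pg_relation_size and pg_table_size is, roughly, the space taken by the TOAST table and the auxiliary maps, and the remaining gap to pg_total_relation_size is the indexes.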

TOAST Compression

So far I refrained from categorizing texts by their size. The reason is that the size of the text itself does not matter; what matters is its size after compression.

To create long strings for testing, we’ll implement a function to generate random strings at a given length:

CREATE OR REPLACE FUNCTION generate_random_string(
  length INTEGER,
  characters TEXT default '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
) RETURNS TEXT AS
$$
DECLARE
  result TEXT := '';
BEGIN
  IF length < 1 then
      RAISE EXCEPTION 'Invalid length';
  END IF;
  FOR __ IN 1..length LOOP
    result := result || substr(characters, floor(random() * length(characters))::int + 1, 1);
  end loop;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

Generate a string made out of 10 random characters:

db=# SELECT generate_random_string(10);
 generate_random_string
────────────────────────
 o0QsrMYRvp

We can also provide a set of characters to generate the random string from. For example, generate a string made of 10 random digits:

db=# SELECT generate_random_string(10, '1234567890');
 generate_random_string
────────────────────────
 4519991669

PostgreSQL TOAST uses the LZ family of compression techniques. Compression algorithms usually work by identifying and eliminating repetition in the value. A long string drawn from only a few distinct characters should compress very well compared to a string made of many different characters, once encoded into bytes.

To illustrate how TOAST uses compression, we’ll clean out the toast_test table, and insert a random string made of many possible characters:

db=# TRUNCATE toast_test;
TRUNCATE TABLE

db=# INSERT INTO toast_test (value) VALUES (generate_random_string(1024 * 10));
INSERT 0 1

We inserted a 10kb value made of random characters. Let’s check the TOAST table:

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB

The value is stored out-of-line in the TOAST table, and we can see it is not compressed.

Next, insert a value with a similar length, but made out of fewer possible characters:

db=# INSERT INTO toast_test (value) VALUES (generate_random_string(1024 * 10, '123'));
INSERT 0 1

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB
  1495961 │      2 │ 3067 bytes

We inserted a 10K value, but this time it only contained 3 possible digits: 1, 2 and 3. This text is more likely to contain repeating binary patterns, and should compress better than the previous value. Looking at the TOAST, we can see PostgreSQL compressed the value to ~3kB, which is a third of the size of the uncompressed value. Not a bad compression rate!
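
If you prefer not to query the TOAST table directly, a more convenient check (shown here as a sketch; the exact numbers depend on your data) is to compare octet_length, which returns the uncompressed length of the text, with pg_column_size, which reflects the size of the value as it is actually stored:

SELECT id,
       octet_length(value)   AS raw_size,
       pg_column_size(value) AS stored_size
FROM toast_test
ORDER BY id;

For the random string the two numbers should be close, while for the repetitive strings stored_size should be much smaller than raw_size.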

Finally, insert a 10K long string made of a single digit:

db=# insert into toast_test (value) values (generate_random_string(1024 * 10, '0'));
INSERT 0 1

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB
  1495961 │      2 │ 3067 bytes

The string was compressed so well that the database was able to store it inline.

Configuring TOAST

If you are interested in configuring TOAST for a table you can do that by setting storage parameters at CREATE TABLE or ALTER TABLE ... SET STORAGE. The relevant parameters are:

  • toast_tuple_target: The minimum tuple length after which PostgreSQL tries to move long values to TOAST.
  • storage: The TOAST strategy. PostgreSQL supports 4 different TOAST strategies. The default is EXTENDED, which means PostgreSQL will try to compress the value and store it out-of-line.

I personally never had to change the default TOAST storage parameters.
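
For reference, and purely as an illustration (the per-table toast_tuple_target parameter assumes PostgreSQL 11 or later), this is roughly what adjusting those parameters looks like:

-- Lower the length threshold after which PostgreSQL starts TOASTing values in this table.
ALTER TABLE toast_test SET (toast_tuple_target = 256);

-- Change the strategy for a single column; EXTERNAL allows out-of-line storage
-- but disables compression.
ALTER TABLE toast_test ALTER COLUMN value SET STORAGE EXTERNAL;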


To understand the effect of different text sizes and out-of-line storage on performance, we’ll create three tables, one for each type of text:

db=# CREATE TABLE toast_test_small (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_medium (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_large (id SERIAL, value TEXT);
CREATE TABLE

Like in the previous section, for each table PostgreSQL created a TOAST table:

SELECT
    c1.relname,
    c2.relname AS toast_relname
FROM
    pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
    c1.relname LIKE 'toast_test%'
    AND c1.relkind = 'r';

      relname      │  toast_relname
───────────────────┼─────────────────
 toast_test_small  │ pg_toast_471571
 toast_test_medium │ pg_toast_471580
 toast_test_large  │ pg_toast_471589

Set Up Test Data

First, let’s populate toast_test_small with 500K rows containing a small text that can be stored inline:

db=# INSERT INTO toast_test_small (value)
SELECT 'small value' FROM generate_series(1, 500000);
INSERT 0 500000

Next, populate the toast_test_medium with 500K rows containing texts that are at the border of being stored out-of-line, but still small enough to be stored inline:

db=# WITH str AS (SELECT generate_random_string(1800) AS value)
INSERT INTO toast_test_medium (value)
SELECT value
FROM generate_series(1, 500000), str;
INSERT 0 500000

I experimented with different values until I got a value just large enough to be stored out-of-line. The trick is to find a string which is roughly 2K that compresses very poorly.

Next, insert 500K rows with large texts to toast_test_large:

db=# WITH str AS (SELECT generate_random_string(4096) AS value)
INSERT INTO toast_test_large (value)
SELECT value
FROM generate_series(1, 500000), str;
INSERT 0 500000

We are now ready for the next step.

Comparing Performance

We usually expect queries on large tables to be slower than queries on smaller tables. In this case, it's not unreasonable to expect the query on the small table to run faster than on the medium table, and the query on the medium table to be faster than the same query on the large table.

To compare performance, we are going to execute a simple query to fetch one row from the table. Since we don’t have an index, the database is going to perform a full table scan. We’ll also disable parallel query execution to get a clean, simple timing, and execute the query multiple times to account for caching.

db=# SET max_parallel_workers_per_gather = 0;
SET

Starting with the small table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
                                    QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────
 Gather  (cost=1000.00..7379.57 rows=1 width=16)
   ->  Parallel Seq Scan on toast_test_small  (cost=0.00..6379.47 rows=1 width=16)
        Filter: (id = 6000)
        Rows Removed by Filter: 250000
 Execution Time: 31.323 ms
(8 rows)

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
Execution Time: 25.865 ms

I ran the query multiple times and trimmed the output for brevity. As expected the database performed a full table scan, and the timing finally settled on ~25ms.

Next, execute the same query on the medium table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
Execution Time: 321.965 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
Execution Time: 173.058 ms

Running the exact same query on the medium table took significantly more time, 173ms, which is roughly 6x slower than on the smaller table. This makes sense.

To complete the test, run the query again on the large table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
Execution Time: 49.867 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
Execution Time: 37.291 ms

Well, this is surprising! The timing of the query on the large table is similar to the timing of the small table, and 6 times faster than the medium table.

Table               Timing
toast_test_small    31.323 ms
toast_test_medium   173.058 ms
toast_test_large    37.291 ms

Large tables are supposed to be slower, so what is going on?

Making Sense of the Results

To make sense of the results, have a look at the size of each table, and the size of its associated TOAST table:

SELECT
    c1.relname,
    pg_size_pretty(pg_relation_size(c1.relname::regclass)) AS size,
    c2.relname AS toast_relname,
    pg_size_pretty(pg_relation_size(('pg_toast.' || c2.relname)::regclass)) AS toast_size
FROM
    pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
    c1.relname LIKE 'toast_test_%'
    AND c1.relkind = 'r';
relname             size     toast_relname     toast_size
toast_test_small    21 MB    pg_toast_471571   0 bytes
toast_test_medium   977 MB   pg_toast_471580   0 bytes
toast_test_large    25 MB    pg_toast_471589   1953 MB

Let’s break it down:

  • toast_test_small: The size of the table is 21MB, and there is no TOAST. This makes sense because the texts we inserted to that table were small enough to be stored inline.
Small texts stored inline
  • toast_test_medium: The table is significantly larger, 977MB. We inserted text values that were just small enough to be stored inline. As a result, the table got very big, and the TOAST was not used at all.
Medium texts stored inline
  • toast_test_large: The size of the table is roughly similar to the size of the small table. This is because we inserted large texts into the table, and PostgreSQL stored them out-of-line in the TOAST table. This is why the TOAST table is so big for the large table, but the table itself remained small.
Large texts stored out-of-line in TOAST

When we executed our query, the database did a full table scan. To scan the small and large tables, the database only had to read 21 MB and 25 MB respectively, and the queries were pretty fast. However, when we executed the query against the medium table, where all the texts are stored inline, the database had to read 977 MB from disk, and the query took a lot longer.

TAKE AWAY

TOAST is a great way of keeping tables compact by storing large values out-of-line!

Using the Text Values

In the previous comparison we executed a query that only used the ID, not the text value. What will happen when we actually need to access the text value itself?

db=# \timing
Timing is on.

db=# SELECT * FROM toast_test_large WHERE value LIKE 'foo%';
Time: 7509.900 ms (00:07.510)

db=# SELECT * FROM toast_test_large WHERE value LIKE 'foo%';
Time: 7290.925 ms (00:07.291)

db=# SELECT * FROM toast_test_medium WHERE value LIKE 'foo%';
Time: 5869.631 ms (00:05.870)

db=# SELECT * FROM toast_test_medium WHERE value LIKE 'foo%';
Time: 259.970 ms

db=# SELECT * FROM toast_test_small WHERE value LIKE 'foo%';
Time: 78.897 ms

db=# SELECT * FROM toast_test_small WHERE value LIKE 'foo%';
Time: 50.035 ms

We executed a query against all three tables to search for a string within the text value. The query is not expected to return any results, and is forced to scan the entire table. This time, the results are more consistent with what we would expect:

Table               Cold cache    Warm cache
toast_test_small    78.897 ms     50.035 ms
toast_test_medium   5869.631 ms   259.970 ms
toast_test_large    7509.900 ms   7290.925 ms

The larger the table, the longer it took the query to complete. This makes sense because to satisfy the query, the database was forced to read the texts as well. In the case of the large table, this means accessing the TOAST table as well.

What About Indexes?

Indexes help the database minimize the number of pages it needs to fetch to satisfy a query. For example, let’s take the first example when we searched for a single row by ID, but this time we’ll have an index on the field:

db=# CREATE INDEX toast_test_small_id_ix ON toast_test_small(id);
CREATE INDEX

db=# CREATE INDEX toast_test_medium_id_ix ON toast_test_medium(id);
CREATE INDEX

db=# CREATE INDEX toast_test_large_id_ix ON toast_test_large(id);
CREATE INDEX

Executing the exact same query as before with indexes on the tables:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_small_id_ix on toast_test_small(cost=0.42..8.44 rows=1 width=16)
  Index Cond: (id = 6000)
Time: 0.772 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_medium_id_ix on toast_test_medium(cost=0.42..8.44 rows=1 width=1808)
  Index Cond: (id = 6000)
Time: 0.831 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_large_id_ix on toast_test_large(cost=0.42..8.44 rows=1 width=22)
  Index Cond: (id = 6000)
Time: 0.618 ms

In all three cases the index was used, and we see that the performance in all three cases is almost identical.

By now, we know that the trouble begins when the database has to do a lot of IO. So next, let’s craft a query that the database will choose to use the index for, but will still have to read a lot of data:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id BETWEEN 0 AND 250000;
                                QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_small_id_ix on toast_test_small(cost=0.4..9086 rows=249513 width=16
  Index Cond: ((id >= 0) AND (id <= 250000))
Time: 60.766 ms
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id BETWEEN 0 AND 250000;
Time: 59.705 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id BETWEEN 0 AND 250000;
Time: 3198.539 ms (00:03.199)
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id BETWEEN 0 AND 250000;
Time: 284.339 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id BETWEEN 0 AND 250000;
Time: 85.747 ms
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id BETWEEN 0 AND 250000;
Time: 70.364 ms

We executed a query that fetches half the data in the table. This was a low enough portion of the table to make PostgreSQL decide to use the index, but still high enough to require lots of IO.

We ran each query twice on each table. In all cases the database used the index to access the table. Keep in mind that the index only helps reduce the number of pages the database has to access, but in this case, the database still had to read half the table.

Table               Cold cache    Warm cache
toast_test_small    60.766 ms     59.705 ms
toast_test_medium   3198.539 ms   284.339 ms
toast_test_large    85.747 ms     70.364 ms

The results here are similar to the first test we ran. When the database had to read a large portion of the table, the medium table, where the texts are stored inline, was the slowest.

If after reading so far, you are convinced that medium-size texts are what’s causing you performance issues, there are things you can do.

Adjusting toast_tuple_target

toast_tuple_target is a storage parameter that controls the minimum tuple length after which PostgreSQL tries to move long values to TOAST. The default is 2K, but it can be decreased to a minimum of 128 bytes. The lower the target, the more likely it is for a medium-size string to be moved out-of-line to the TOAST table.

To demonstrate, create a table with the default storage params, and another with toast_tuple_target = 128:

db=# CREATE TABLE toast_test_default_threshold (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_128_threshold (id SERIAL, value TEXT) WITH (toast_tuple_target=128);
CREATE TABLE

db=# SELECT c1.relname, c2.relname AS toast_relname
FROM pg_class c1 JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE c1.relname LIKE 'toast%threshold' AND c1.relkind = 'r';

           relname            │  toast_relname
──────────────────────────────┼──────────────────
 toast_test_default_threshold │ pg_toast_3250167
 toast_test_128_threshold     │ pg_toast_3250176

Next, generate a value larger than 2KB that compresses to less than 128 bytes, insert to both tables, and check if it was stored out-of-line or not:

db=# INSERT INTO toast_test_default_threshold (value) VALUES (generate_random_string(2100, '123'));
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_3250167;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

db=# INSERT INTO toast_test_128_threshold (value) VALUES (generate_random_string(2100, '123'));
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_3250176;
─[ RECORD 1 ]─────────────
chunk_id   │ 3250185
chunk_seq  │ 0
chunk_data │ \x3408.......

The (roughly) similar medium-size text was stored inline with the default params, and out-of-line with a lower toast_tuple_target.

Create a Separate Table

If you have a critical table that stores medium-size text fields, and you notice that most texts are being stored inline and perhaps slowing down queries, you can move the column with the medium text field into its own table:

CREATE TABLE toast_test_value (fk INT, value TEXT);
CREATE TABLE toast_test (id SERIAL, value_id INT);
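
To sketch how the split schema above might be queried (the join condition is an assumption about how the two tables are linked; adapt it to whichever foreign key you actually keep):

-- Most queries only touch the narrow toast_test table; the text is joined in
-- only when it is actually needed. Here we assume toast_test_value.fk
-- references toast_test.id.
SELECT t.id, v.value
FROM toast_test t
JOIN toast_test_value v ON v.fk = t.id
WHERE t.id = 6000;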

In my previous article I demonstrated how we use SQL to find anomalies. In one of those use cases, we actually had a table of errors that contained Python tracebacks. The error messages were medium texts, many of them stored inline, and as a result the table got big very quickly! So big, in fact, that we noticed queries getting slower and slower. Eventually we moved the errors into a separate table, and things got much faster!


The main problem with medium-size texts is that they make the rows very wide. This is a problem because PostgreSQL, like other OLTP-oriented databases, stores values in rows. When we ask the database to execute a query with only a few columns, the values of these columns are most likely spread across many blocks. If the rows are wide, this translates into a lot of IO, which affects query performance and resource usage.

To overcome this challenge, some non-OLTP-oriented databases use a different type of storage: columnar storage. With columnar storage, data is stored on disk by columns, not by rows. This way, when the database has to scan a specific column, the values are stored in consecutive blocks, which usually translates to less IO. Additionally, values of a specific column are more likely to have repeating patterns and values, so they compress better.

Row vs Column Storage

For non-OLTP workloads such as data warehouse systems, this makes sense. The tables are usually very wide, and queries often use a small subset of the columns and read a lot of rows. In OLTP workloads, the system will usually read one or very few rows, so storing data in rows makes more sense.

There has been chatter about pluggable storage in PostgreSQL, so this is something to look out for!
