

Takeaway Friday – A Stoic Philosopher in a Hanoi Prison

Mish Boyka



James B. Stockdale was a vice admiral and aviator who served for 37 years in the United States Navy, spending the majority of his time as a fighter pilot on aircraft carriers. Shot down on his third combat tour over North Vietnam, Stockdale was the senior-most navy prisoner of war in Hanoi, where he spent over seven years – four of them in solitary confinement and two in leg irons – before being released. He was awarded the Medal of Honor, along with 26 other combat decorations, and retired as an educator and author.

Preparation and experience work most of the time – but not all of the time

Stockdale had no reason to think that the day’s mission was to be anything unique.

The flight in September 1965 was part of his third combat tour over North Vietnam, during which he served as Wing Commander aboard the aircraft carrier Oriskany. Despite his misgivings about the purpose of his being in Vietnam, he was a competent and skilled career fighter pilot. Nothing suggested he shouldn’t expect to make it back home that day – let alone that decade.

But sometimes life deals you a lousy hand, and it dealt Stockdale quite an unhappy one.

While trying to aid trapped American soldiers on the ground, he was suddenly falling out of the sky and hurtling towards a small Vietnamese village. His plane was on fire, the control system shot out by North Vietnamese who had used the grounded soldiers as bait, and he didn’t have much choice beyond punching out of the plane.

After ejection, I had about 30 seconds to make my last statement in freedom before I landed in the main street of a little village right ahead. And, so help me, I whispered to myself: “Five years down there, at least. I’m leaving the world of technology and entering the world of Epictetus.”

Just like that, Stockdale’s day had gone from routine to disastrous – but why was the first thing to jump into his mind the ancient philosopher, Epictetus?

It is never too late to learn something that can be meaningful, sometimes so meaningful that it saves your life

Five years previously, Stockdale had enrolled in the International Relations graduate program at Stanford University. Twenty years in the Navy had given him the background needed to be a strategic planner in the Pentagon. But something wasn’t right – his heart wasn’t in it.

He eventually landed in Stanford’s Philosophy department where he was introduced to Epictetus, amongst others. He took the study of philosophy to heart, and his decades of real-life military experience focused his attention on the “noble philosophy that has proven to be more practicable than a modern cynic would expect” – Stoicism.

His bedside table on the aircraft carrier was no longer stacked with busy work to impress his superiors, but with the Discourses, Xenophon’s Memorabilia, recollections of Socrates, and of course, The Iliad and The Odyssey.

I came to the philosophic life as a 38-year-old Navy pilot in graduate school at Stanford University. I had been in the Navy for 20 years and scarcely ever out of a cockpit… Then I cruised into Stanford’s philosophy corner one winter morning and met Philip Rhinelander, dean of humanities and sciences… Within 15 minutes, we had agreed that I would enter his two-term course in the middle. To make up for my lack of background, I would meet him for an hour a week for a private tutorial in the study of his campus home.

Phil Rhinelander opened my eyes. In that study, it all happened for me – my inspiration, my dedication to the philosophic life. From then on, I was out of international relations and into philosophy… On my last session, he reached high on his wall of books and brought down a copy of the Enchiridion. He said, “I think you’ll be interested in this.”

Success is never certain, but suffering is – it is best to prepare for it

But what was it about Epictetus and Stoicism that hit so close to home for Stockdale? Why did he think it more practical than other schools of thought?

Epictetus is known amongst Stoic philosophers for his blunt advice – he referred to the lecture room not only as a place of learning, but as a hospital where you leave in pain but are better prepared for the future.

One of Epictetus’s key messages is that the only thing guaranteed humans is that they will suffer. By being prepared to appropriately handle suffering, we will also be better prepared to humbly handle and appreciate any success. For a military man who had seen the brutality of war first hand for decades, you can imagine why such an acceptance of human frailty would be attractive.

But further still, Epictetus lays the source of all that suffering on the suffering individual themselves. A disciplined mind, for the Stoics, along with taking personal responsibility for your orientation towards your situation, is the key to being able to get out of bed in the morning. In this way, you have control over yourself and your suffering without letting the world dictate it for you, regardless of the circumstances.

Epictetus explained that his curriculum was not about “revenues or income, or peace or war, but about happiness and unhappiness, success and failure, slavery and freedom.” His model graduate was not a person “able to speak fluently about philosophic principles as an idle babbler, but about things that will do you good if your child dies, or your brother dies, or if you must die or be tortured. …Let others practice lawsuits, others study problems, others syllogisms; here you practice how to die, how to be enchained, how to be racked, how to be exiled.”

A man is responsible for his own “judgments, even in dreams, in drunkenness, and in melancholy madness.” Each individual brings about his own good and his own evil, his good fortune, his ill fortune, his happiness, and his wretchedness. It is unthinkable that one man’s error could cause another’s suffering; suffering, like everything else in stoicism, was all internal – remorse at destroying yourself.

Epictetus was telling his students that there can be no such thing as being the “victim” of another. You can only be a “victim” of yourself. It’s all in how you discipline your mind. “Who is your master? He who has authority over any of the things on which you have set your heart. …What is the result at which all virtue aims? Serenity. …Show me a man who though sick is happy, who though in danger is happy, who though in prison is happy, and I’ll show you a Stoic.”

Your status and security in life can change in an instant, so do not define yourself by what you are

Back in September 1965, Stockdale can feel the bullets whistling past his head and into the parachute canopy above him.

Closer to the ground now, he hears the shouting of villagers and can see the angry eyes, fists raised, looking up at him. His parachute catches on a tree near the main street, but he hits the ground in relatively good shape.

That outcome wasn’t acceptable to the unhappy villagers – so a group of ten or fifteen of them gang-tackled Stockdale and beat him for over three minutes. He was saved by a policeman’s whistle, but a few broken bones and a badly twisted and shattered leg were harbingers of what was to be the next seven years of his life.

As I glide down toward that little town on my short parachute ride, I’m just about to learn how negligible is my control over my station in life. It’s not at all up to me. Of course, I’m going right now from being the Wing Commander, in charge of a thousand people (pilots, crewmen, maintenance men), responsible for nearly a hundred airplanes, and beneficiary of goodness knows all sorts of symbolic status and goodwill, to being an object of contempt. “Criminal,” I’ll be known as.

And more than that even, you’re going to face fragilities you never before let yourself believe could be true. [ See * at end of post, removed description of “taking the ropes” in case it is your preference to skip reading it] … that you can be made to blurt out answers, probably correct answers, to questions about anything they know you know. I’m not going to pull you through that explanation again. I’ll just call it “taking the ropes.”

No, “station in life” can be changed from that of a dignified and competent gentleman of culture to that of a panic-stricken, sobbing, self-loathing wreck, maybe a permanent wreck if you have no will, in less than an hour. So what? So after you work a lifetime to get yourself all set up, and then delude yourself into thinking that you have some kind of ownership claim on your station in life, you’re riding for a fall. You’re asking for disappointment. To avoid that, stop kidding yourself, just do the best you can on a common-sense basis to make your station in life what you want it to be, but never get hooked on it. Make sure in your heart of hearts, in your inner self, that you treat your station in life with indifference. Not with contempt, only with indifference.

Shame at not living up to your own ideals is worse than a terrible situation itself

Torture was now a part of Stockdale’s daily life – as it was for any of the prisoners of war in North Vietnam. And that started with the first introduction to the Hanoi prisoner of war camp – you were broken down and isolated. The only way out of this pain was the betrayal of ideals you previously held – giving up military secrets, aiding in propaganda by admitting wrongdoing. Your reward for giving your captors what they wanted was to be isolated further for two months “to contemplate your crimes.”

According to Stockdale, what was actually contemplated by these men was “his betrayal of himself and everything that he stood for.”

Once released into the general population, this feeling of shame caused men to recoil from other prisoners, thinking themselves uniquely fragile. However, once they realized that everyone in the camp had gone through – and said – the same things, there was a turning point. They gained strength and together the men were able to make the best of an impossibly terrible situation.

The keyword for all of us at first was fragility. Each of us, before we were ever in shouting distance of another American, was made to “take the ropes.” That was a real shock to our systems – and as with all shocks, its impact on our inner selves was a lot more impressive and lasting and important than to our limbs and torsos. These were the sessions where we were taken down to submission and made to blurt out distasteful confessions of guilt and American complicity into antique tape recorders, and then to be put in what I call “cold soak,” six or eight weeks of total isolation to “contemplate our crimes.”

What we actually contemplated was what even the most self-satisfied American saw as his betrayal of himself and everything he stood for. It was there that I learned what “Stoic harm” meant. A shoulder broken, a bone in my back broken, and a leg broken twice were peanuts by comparison. Epictetus said: “Look not for any greater harm than this: destroying the trustworthy, self-respecting, well-behaved man within you.”

When put into a regular cell block, hardly an American came out of that without responding something like this when first whispered to by a fellow prisoner next door: “You don’t want to talk to me; I am a traitor.” And because we were equally fragile, it seemed to catch on that we all replied something like this: “Listen, pal, there are no virgins in here. You should have heard the kind of statement I made. Snap out of it. We’re all in this together. What’s your name? Tell me about yourself.” To hear that was, for most new prisoners just out of initial shakedown and cold soak, a turning point in their lives.

You can always be in control of you, in any situation

Stockdale was the senior-most officer in the prison, and as such, he was in command of what was essentially an American military colony on Vietnamese soil. In this position, he was able to multiply what he had learned from Epictetus across the entire American prisoner of war population. He knew that allowing men to be isolated without a sense of purpose would result in them all breaking down – feeling shame that they did not live up to their own definition of themselves was worse than a broken body.

We are in a spot like we’ve never been in before. But we deserve to maintain our self-respect, to have the feeling we are fighting back. We can’t refuse to do every degrading thing they demand of us, but it’s up to you, boss, to pick out things we must all refuse to do, unless and until they put us through the ropes again. We deserve to sleep at night. We at least deserve to have the satisfaction that we are hewing to our leader’s orders. Give us the list: What are we to take torture for?

And Stockdale followed his own advice – he took “the ropes” fifteen times, had his shoulder broken, his back broken, and the same leg broken twice.

The only way to survive without feeling debilitating shame – shame that was worse than the torture itself – was to have control over what you did and what you said.

This was a first step in claiming what was rightfully ours. Epictetus said: “The judge will do some things to you which are thought to be terrifying; but how can he stop you from taking the punishment he threatened?”

That’s my kind of Stoicism. You have a right to make them hurt you, and they don’t like to do it. The prison commissar told my fellow prisoner Ev Alvarez when he was released: “You Americans were nothing like the French; we could count on them to be reasonable.”

An individual pursuing that which is meaningful can find a reason to face any adversity

Epictetus turned out to be right. All told, it was only a temporary setback from things that were important to me, and being cast in the role as the sovereign head of an American expatriate colony which was destined to remain autonomous, out of communication with Washington, for years on end, was very important to me. I was determined to “play well the given part.”

It matters not how strait the gate,

How charged with punishment the scroll,

I am the master of my fate,

I am the captain of my soul.


NFL Power Rankings: Can Bill Belichick and Cam Newton fix the Patriots?

Emily Walpole




A look at the AFC East standings is a shock to the system.

It has been rare over the past two decades to need to scroll down to see the Patriots. But there it is: The Buffalo Bills in first place at 4-2, and then (sit down for this one) the Miami Dolphins in second place ahead of the Patriots. If you needed more proof that 2020 is weird.

This has been a rough season for the Patriots already, and it’s not even one-third done. They lost a lot of defensive talent in the offseason. The team and Tom Brady parted ways. New England was hit harder than anyone else by coronavirus-related opt-outs. Then, the Patriots had Cam Newton and Stephon Gilmore go to the reserve/COVID-19 list. A Week 5 game against the Broncos was moved back to last Monday, and then moved to Week 6.

Even in a normal year, there would be questions about the 2-3 Patriots, who haven’t been under .500 past the fifth week of the season since 2002, when they were 3-4. There’s simply not a lot of high-end talent, particularly on offense. The 18-12 loss to the Broncos drove that point home.

Before Sunday, the Patriots had never lost a game under Bill Belichick when not giving up a touchdown, which is a remarkable streak and also speaks to the lack of offense for New England. No Patriots running back had more than 19 rushing yards against Denver. No pass catcher had more than 38 receiving yards aside from James White, who had eight catches for 65 yards. No matter how good a coaching staff is, at some point you need players.

In Week 1, the Patriots offense was a lot of Cam Newton, especially as a runner. On Sunday, his 76 rushing yards were the most reliable part of New England’s offense. A great dual-threat quarterback can be a foundation of a good offense, but it’s hard to ask Newton to carry the offense that way over a full season. Also, there has to be something other than that to lean on.

Of course, it’s an annual occurrence to question if the Patriots have lost it. It’s an October tradition like the leaves changing. Everyone questions the Patriots, then they figure it out and play fantastic football. Last season, those questions came after a win, an ugly 17-10 Week 11 win over the Philadelphia Eagles. The Patriots needed a Julian Edelman pass to get their only second-half touchdown. Brady was uncharacteristically grumpy after the win, upset with the offense’s production.

The difference is the Patriots never really did turn it around after that. They went 3-4 counting a playoff loss, with some rough offensive performances. They’re 2-3 this season. It has been 12 games of below-average football for New England. That’s not an insignificant number of games. In terms of the roster, they’re not as good as they were last season. Too much was lost in the offseason and not much emerging talent has stepped up so far.

If Belichick gets this team to another AFC East championship, he deserves NFL coach of the year (he probably should win it most years since he’s, you know, the best coach in the NFL). Despite a two-game losing streak, the Bills are pretty good. The Patriots defense is good, but not in contention to be the best in the NFL like it was last season. And it’s hard to figure out the blueprint for the offense to be much better. Newton can only do so much. As great as Newton has been through his career, he’s not Brady.

Maybe this is another season in which the Patriots flip the switch and make all the skepticism about them look silly. If this were any other team, we’d take one look at the roster and wonder how a turnaround is even possible.

If that’s the effort the Jets are going to give, just stay in New York and forfeit the game. Don’t risk injury or coronavirus exposure.

31. Washington Football Team (1-5, LW: 30)

I won’t criticize the coach of a one-win football team taking his chances on a 2-point conversion to win a game. You’re more likely to gain 2 yards on that play than outplay anyone in overtime. “The only way to learn how to win is to play to win,” Ron Rivera said. “I told them in the locker room, I said, ‘Guys, I play to win.’ That’s part of my philosophy.”

30. New York Giants (1-5, LW: 31)

To finally get a win, the Giants needed to turn away a 2-point conversion at home against Washington, which is clearly one of the worst teams in football. They also barely won despite a defensive touchdown. I kept thinking the Giants are at least a little better than their record. They’re not.

29. Jacksonville Jaguars (1-5, LW: 29)

Since losing in the AFC championship game, the Jaguars are 12-26. They have two straight double-digit loss seasons and will have to finish 6-4 to avoid another one. We make a lot of jokes about Adam Gase and Matt Patricia, but why is Doug Marrone never mentioned in hot-seat conversations?

28. Dallas Cowboys (2-4, LW: 22)

If anything, Jerry Jones is too patient with his coaches. And it’s hard to fire Mike McCarthy after he lost his quarterback. But let’s say the Cowboys lose double-digit games and don’t win a pathetic NFC East. Is it possible McCarthy is one and done? A team with Ezekiel Elliott, CeeDee Lamb, Amari Cooper and Michael Gallup was stuck on 3 points at home against the Cardinals on Monday night, until a late garbage-time touchdown. That’s hard to explain, even after giving the Cowboys some slack for losing Dak Prescott.

27. Philadelphia Eagles (1-4-1, LW: 23)

Don’t let the final score fool you. The Eagles were entirely overmatched for three-and-a-half quarters. Maybe that last rally sparks something going forward, but probably not. This is just a bad football team. And yes, four of the bottom five teams in these power rankings come from the same division.

26. Minnesota Vikings (1-5, LW: 20)

Well, the Vikings are bad. But we might be watching something special with Justin Jefferson. Heading into Monday’s games, he was third in the NFL in receiving yards despite barely playing the first two weeks. Again, why was he playing behind Olabisi Johnson the first two weeks?

25. Atlanta Falcons (1-5, LW: 28)

We’re not going to do the whole “Falcons start the season miserably and then change who they are and finish 7-9” again, are we?

24. Houston Texans (1-5, LW: 24)

Will Fuller is having a nice season. He had a strange game with no catches after it seemed he hurt his hamstring in Week 2, but has 455 yards and four touchdowns in the other five games. Hopefully, he can stay healthy, which has been a recurring problem.

23. Cincinnati Bengals (1-4-1, LW: 25)

Tee Higgins had six catches for 125 yards. He was the first pick of the second round and the Bengals should feel very grateful that many teams at the bottom of the first round — hello, Packers — passed on him. He and Joe Burrow are a nice 1-2 punch out of this draft.

22. Los Angeles Chargers (1-4, LW: 21)

The Chargers’ next five games are vs. Jaguars, at Broncos, vs. Raiders, at Dolphins, vs. Jets. It’s not crazy to think the Chargers could win four of five and get back in the playoff race. Their season outlook would be a lot more positive if they hadn’t let that Saints game get away.

21. Denver Broncos (2-3, LW: 27)

The Broncos are doing OK filling in for injuries. Tim Patrick has emerged as a nice option with Courtland Sutton on IR. Phillip Lindsay returned and played really well with Melvin Gordon out. The defense had a nice game despite its injuries. But they’re going to need Drew Lock to play better to keep winning games.

20. Detroit Lions (2-3, LW: 26)

Adrian Peterson on Sunday: 15 carries for 40 yards. D’Andre Swift on Sunday: 14 carries for 116 yards. The Lions signed Peterson just before the season, practically changed their offense to force the ball to a 35-year-old back who doesn’t contribute anything in the passing game and did so at the expense of a second-round rookie pick. That’s bad. What’s worse is we all know Matt Patricia and his crew will give Peterson more carries than Swift next week, too.

19. Miami Dolphins (3-3, LW: 18)

It was only a handful of plays but given what Tua Tagovailoa has been through, that was one great debut for him on Sunday.

18. New England Patriots (2-3, LW: 15)

Since a huge game at Seattle in Week 2, Julian Edelman has seven catches for 66 yards in three games. It’s not like anyone else for the Patriots is catching the ball. Edelman is 34 years old, and there’s no guarantee he’ll rebound to his normal form. That’s yet another concern for the Patriots offense.

17. Carolina Panthers (3-3, LW: 16)

There’s no word of when Christian McCaffrey can return from an ankle injury. Even though the Panthers have mostly played well without him, McCaffrey is one of the most talented players in the NFL. They will be better when he can return, and perhaps Sunday’s loss against the Bears would have turned out differently with their best player on the field.

16. San Francisco 49ers (3-3, LW: 19)

That was an impressive game by Kyle Shanahan. Realistically, the 49ers needed that win to keep their season alive. It will be interesting to watch the chess match between Shanahan and Bill Belichick in Week 7. One of the few coaches who Belichick had a hard time against was Shanahan’s father, Mike, who had a 5-3 record against Belichick when he was coaching the Broncos.

15. Cleveland Browns (4-2, LW: 9)

I’m not going to do a 180 on the Browns. I think Sunday was more about the Steelers’ dominance than Cleveland being exposed. Still, the Browns have been absolutely destroyed by the Steelers and Ravens this season, and that’s not the most promising development.

14. New Orleans Saints (3-2, LW: 14)

The Saints should have Michael Thomas back this week, and presumably the fight that led to a one-game suspension is behind everyone. Very few non-quarterbacks in the NFL have a bigger impact on their team than Thomas.

13. Arizona Cardinals (4-2, LW: 17)

We saw why Budda Baker is the highest-paid safety in NFL history. He was phenomenal on Monday night against the Cowboys. His versatility is the biggest asset the Cardinals defense has, especially with Chandler Jones out with an injury.

12. Chicago Bears (5-1, LW: 11)

The Bears are just that team whose record won’t match how good they are. They were outgained by the Panthers. Carolina had more yards per play, more first downs, mostly more of everything except points. The Bears defense is good, but probably not good enough to continually overcome a bad offense. I can’t justify putting them higher than this, despite their record.

11. Indianapolis Colts (4-2, LW: 13)

It’s a bit of an odd stat, but Philip Rivers became the oldest quarterback since Earl Morrall in 1974 to lead his team to a regular-season win after it trailed by 21 points or more (via NFL Research). Rivers got a lot of grief for a poor game at Cleveland, and he deserves a lot of credit for how well he played to save what would have been a bad loss to Cincinnati.

10. Los Angeles Rams (4-2, LW: 8)

It at least needs to be said: The Rams are 4-0 against the NFC East, which is trending as the worst division in modern NFL history, and 0-2 against everyone else. I don’t think the Rams are a bad team. I do wonder why in two of the past three weeks their offense has struggled so much, and how good they’ll be now that there are no more NFC East games on the schedule.

9. Las Vegas Raiders (3-2, LW: 10)

One thing the Raiders need to figure out is a pass rush. They have only eight sacks all season and just one player (Maxx Crosby) has recorded more than one. It’s hard to consistently win with no pressure on the opposing quarterback.

8. Buffalo Bills (4-2, LW: 7)

I’ve been high on the Bills, and I’m not bailing on them now. But that was ugly. The Chiefs were miles ahead of Buffalo in a big spot for the Bills. The offense flailed around in the rain, and the defense continued a very unimpressive start to the season. They’ll get it fixed. I think.

7. Green Bay Packers (4-1, LW: 3)

The Packers hadn’t really played a tough defense before Tampa Bay and their defense was shaky at best. Sunday’s loss doesn’t erase the first four games, but the way the Buccaneers dominated was troubling. Now, the Packers have to prove their offense can cook against a top-flight defense, and that their defense can stop anyone.

6. Tampa Bay Buccaneers (4-2, LW: 12)

The Bucs’ offense gets the attention, and it is good. But the defense, which had allowed the fewest yards in the league after Sunday’s games, is the strength of this team. Holding the Packers to 201 yards, 10 points and getting a pick six off Aaron Rodgers should bring some attention to that unit.

5. Tennessee Titans (5-0, LW: 6)

I just can’t get over Derrick Henry breaking away from the defense in the open field. He’s 247 pounds and he just ran away from Houston’s secondary, reaching an unbelievable 21.6 miles per hour on his run. What an athlete.

4. Seattle Seahawks (5-0, LW: 4)

Everyone in the NFC West has at least two losses other than the Seahawks. While I still think their defense is going to start costing them, they have a cushion if that does ever happen.

3. Pittsburgh Steelers (5-0, LW: 5)

I liked the Steelers on paper but thought some of their first four wins were closer than they should have been. I was waiting for them to hit that level that really stamped them as a top-tier team. Well, Sunday was it. If the Steelers play like that, they can beat anyone. They are on a short list of legitimate Super Bowl contenders.

2. Baltimore Ravens (5-1, LW: 2)

Had Sunday’s game ended with five minutes left in the fourth quarter, we’d be much more impressed with the Ravens. I’m not sure how they let the Eagles almost steal a game they mostly dominated. It’s probably best to shrug, chalk up another win and move on.

1. Kansas City Chiefs (5-1, LW: 1)

The Chiefs, who have the best quarterback in the NFL, bludgeoned the Bills on the ground. And they’re adding Le’Veon Bell to the offense. There’s a reason this team didn’t move out of the No. 1 spot after one loss.




Building Netflix’s Distributed Tracing Infrastructure

Mish Boyka



Our Team — Kevin Lew, Narayanan Arunachalam, Elizabeth Carretto, Dustin Haffner, Andrei Ushakov, Seth Katz, Greg Burrell, Ram Vaithilingam, Mike Smith and Maulik Pandey

“@Netflixhelps Why doesn’t Tiger King play on my phone?” — a Netflix member via Twitter

This is an example of a question our on-call engineers need to answer to help resolve a member issue — which is difficult when troubleshooting distributed systems. Investigating a video streaming failure consists of inspecting all aspects of a member account. In our previous blog post we introduced Edgar, our troubleshooting tool for streaming sessions. Now let’s look at how we designed the tracing infrastructure that powers Edgar.

Prior to Edgar, our engineers had to sift through a mountain of metadata and logs pulled from various Netflix microservices in order to understand a specific streaming failure experienced by any of our members. Reconstructing a streaming session was a tedious and time-consuming process that involved tracing all interactions (requests) between the Netflix app, our Content Delivery Network (CDN), and backend microservices. The process started with a manual pull of member account information that was part of the session. The next step was to put all the puzzle pieces together and hope the resulting picture would help resolve the member issue. We needed to increase engineering productivity via distributed request tracing.

If we had an ID for each streaming session then distributed tracing could easily reconstruct session failure by providing service topology, retry and error tags, and latency measurements for all service calls. We could also get contextual information about the streaming session by joining relevant traces with account metadata and service logs. This insight led us to build Edgar: a distributed tracing infrastructure and user experience.
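To make the idea concrete, here is a minimal sketch of how spans sharing one trace ID could be reduced to a service topology, error list, and per-service latency. The `Span` fields, the service names, and the `summarize_trace` helper are illustrative assumptions, not Edgar's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One service call in a trace; field names are illustrative only."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float
    tags: dict = field(default_factory=dict)


def summarize_trace(spans):
    """Derive caller -> callee edges, failing services, and latency per service."""
    by_id = {s.span_id: s for s in spans}
    edges = set()
    latency = defaultdict(float)
    errors = []
    for s in spans:
        latency[s.service] += s.duration_ms
        if s.tags.get("error"):
            errors.append(s.service)
        parent = by_id.get(s.parent_id)  # None for the root span
        if parent is not None and parent.service != s.service:
            edges.add((parent.service, s.service))
    return {"edges": sorted(edges), "errors": errors, "latency_ms": dict(latency)}


# Hypothetical spans for a single streaming session.
session = [
    Span("t1", "a", None, "api-gateway", 120.0),
    Span("t1", "b", "a", "playback", 90.0),
    Span("t1", "c", "b", "license", 60.0, {"error": True, "retry": 2}),
]
summary = summarize_trace(session)
print(summary["edges"])
print(summary["errors"])
```

Joining this summary with account metadata and logs would then be a lookup keyed on the same session or trace identifier.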

Figure 1. Troubleshooting a session in Edgar

When we started building Edgar four years ago, there were very few open-source distributed tracing systems that satisfied our needs. Our tactical approach was to use Netflix-specific libraries for collecting traces from Java-based streaming services until open source tracer libraries matured. By 2017, open source projects like Open-Tracing and Open-Zipkin were mature enough for use in polyglot runtime environments at Netflix. We chose Open-Zipkin because it had better integrations with our Spring Boot based Java runtime environment. We use Mantis for processing the stream of collected traces, and we use Cassandra for storing traces. Our distributed tracing infrastructure is grouped into three sections: tracer library instrumentation, stream processing, and storage. Traces collected from various microservices are ingested in a stream processing manner into the data store. The following sections describe our journey in building these components.

How will tracing impact our service? That is the first question our engineering teams asked us when integrating the tracer library. It is an important question because tracer libraries intercept all requests flowing through mission-critical streaming services. Safe integration and deployment of tracer libraries in our polyglot runtime environments was our top priority. We earned the trust of our engineers by developing empathy for their operational burden and by focusing on providing efficient tracer library integrations in runtime environments.

Distributed tracing relies on propagating context for local interprocess calls (IPC) and client calls to remote microservices for any arbitrary request. Passing the request context captures causal relationships between microservices during runtime. We adopted Open-Zipkin’s B3 HTTP header based context propagation mechanism. We ensure that context propagation headers are correctly passed between microservices across a variety of our “paved road” Java and Node runtime environments, which include both older environments with legacy codebases and newer environments such as Spring Boot. We execute the Freedom & Responsibility principle of our culture in supporting tracer libraries for environments like Python, NodeJS, and Ruby on Rails that are not part of the “paved road” developer experience. Our loosely coupled but highly aligned engineering teams have the freedom to choose an appropriate tracer library for their runtime environment and have the responsibility to ensure correct context propagation and integration of network call interceptors.
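As a rough illustration of B3 header propagation, the sketch below reads the trace context from an incoming request and builds headers for a downstream call. The header names come from the B3 specification; the helper functions, the example IDs, and the choice to start a new sampled trace when no context arrives are assumptions made for this example, not the behavior of any Netflix or Zipkin library.

```python
import secrets


def new_id(nbytes=8):
    """Random lower-hex identifier (8 bytes gives 16 hex characters)."""
    return secrets.token_hex(nbytes)


def extract_context(headers):
    """Read B3 context from incoming headers, or start a new trace (assumption)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    trace_id = lowered.get("x-b3-traceid")
    if trace_id is None:
        return {"trace_id": new_id(), "span_id": new_id(),
                "parent_id": None, "sampled": "1"}
    return {
        "trace_id": trace_id,
        "span_id": lowered.get("x-b3-spanid") or new_id(),
        "parent_id": lowered.get("x-b3-parentspanid"),
        "sampled": lowered.get("x-b3-sampled", "1"),
    }


def inject_child_context(ctx):
    """Headers for a downstream call: same trace, new span, caller as parent."""
    return {
        "X-B3-TraceId": ctx["trace_id"],
        "X-B3-SpanId": new_id(),
        "X-B3-ParentSpanId": ctx["span_id"],
        "X-B3-Sampled": ctx["sampled"],
    }


# Hypothetical incoming request headers.
incoming = {
    "X-B3-TraceId": "463ac35c9f6413ad",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "X-B3-Sampled": "1",
}
ctx = extract_context(incoming)
outgoing = inject_child_context(ctx)
print(outgoing["X-B3-TraceId"], outgoing["X-B3-ParentSpanId"])
```

Because the trace ID and sampling decision are copied unchanged while the caller's span ID becomes the parent, each hop records who called whom, which is what captures the causal relationships described above.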

Our runtime environment integrations inject infrastructure tags like service name, auto-scaling group (ASG), and container instance identifiers. Edgar uses this infrastructure tagging schema to query and join traces with log data for troubleshooting streaming sessions. Additionally, it became easy to provide deep links to different monitoring and deployment systems in Edgar due to consistent tagging. With runtime environment integrations in place, we had to set an appropriate trace data sampling policy for building a troubleshooting experience.

This was the most important question we considered when building our infrastructure because data sampling policy dictates the number of traces that are recorded, transported, and stored. A lenient trace data sampling policy generates a large number of traces in each service container and can lead to degraded performance of streaming services as more CPU, memory, and network resources are consumed by the tracer library. An additional implication of a lenient sampling policy is the need for scalable stream processing and storage infrastructure fleets to handle increased data volume.

We knew that a heavily sampled trace dataset is not reliable for troubleshooting because there is no guarantee that the request you want is in the gathered samples. We needed a thoughtful approach for collecting all traces in the streaming microservices while keeping the operational complexity of running our infrastructure low.

Most distributed tracing systems enforce sampling policy at the request ingestion point in a microservice call graph. We took a hybrid head-based sampling approach that allows for recording 100% of traces for a specific and configurable set of requests, while continuing to randomly sample traffic per the policy set at the ingestion point. This flexibility allows tracer libraries to record 100% of traces in our mission-critical streaming microservices while collecting minimal traces from auxiliary systems like offline batch data processing. Our engineering teams tuned their services for performance after factoring in increased resource utilization due to tracing. The next challenge was to stream large amounts of traces via a scalable data processing platform.
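A hybrid head-based sampling decision of this kind might look roughly like the sketch below. It is an illustration only: the function name `should_record`, the rule format (a tag name mapped to the values that are always recorded), and the example tag `service` are all hypothetical and do not describe the actual tracer configuration.

```python
import random


def should_record(request_tags, always_record_rules, sample_rate, rng=random.random):
    """Decide at the ingestion point whether to record a trace.

    request_tags:        tags known when the request arrives (hypothetical schema)
    always_record_rules: tag name -> set of values that are recorded 100% of the time
    sample_rate:         probability in [0, 1] applied to all other traffic
    rng:                 source of random floats in [0, 1), injectable for testing
    """
    # Configurable set of requests that bypass random sampling entirely.
    for key, always_values in always_record_rules.items():
        if request_tags.get(key) in always_values:
            return True
    # Everything else falls back to random sampling at the configured rate.
    return rng() < sample_rate


# Example: record every playback request, and 1% of batch traffic.
rules = {"service": {"playback"}}
record_playback = should_record({"service": "playback"}, rules, sample_rate=0.01)
record_batch = should_record({"service": "offline-batch"}, rules, sample_rate=0.01)
```

Because the decision is made once at the head of the call graph, it can be propagated downstream with the request context rather than being re-evaluated in every service.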

Mantis is our go-to platform for processing operational data at Netflix. We chose Mantis as our backbone to transport and process large volumes of trace data because we needed a backpressure-aware, scalable stream processing system. Our trace data collection agent transports traces to a Mantis job cluster via the Mantis Publish library. In the first job, we buffer spans for a time period in order to collect all spans for a trace. A second job taps the data feed from the first job, does tail sampling of the data, and writes traces to the storage system. This setup of chained Mantis jobs allows us to scale each data processing component independently. An additional advantage of using Mantis is the ability to perform real-time ad-hoc data exploration in Raven using the Mantis Query Language (MQL). However, having a scalable stream processing platform doesn’t help much if you can’t store data in a cost efficient manner.
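The span buffering step in the first job can be pictured with the plain-Python sketch below. It does not use the Mantis API; the class `SpanBuffer`, the span dictionary layout, and the fixed window measured from the first span of a trace are assumptions chosen to keep the example small.

```python
from collections import defaultdict


class SpanBuffer:
    """Group spans by trace id and release a trace once its window has elapsed."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._spans = defaultdict(list)   # trace id -> spans seen so far
        self._first_seen = {}             # trace id -> arrival time of its first span

    def add(self, span, now):
        """Buffer one span; `now` is a timestamp in seconds."""
        trace_id = span["trace_id"]
        self._first_seen.setdefault(trace_id, now)
        self._spans[trace_id].append(span)

    def flush(self, now):
        """Return every buffered trace whose window has expired, as lists of spans."""
        ready = [t for t, seen in self._first_seen.items()
                 if now - seen >= self.window_seconds]
        traces = []
        for trace_id in ready:
            traces.append(self._spans.pop(trace_id))
            del self._first_seen[trace_id]
        return traces
```

Passing the clock value in explicitly keeps the sketch deterministic; a downstream stage would consume the lists returned by `flush` as complete traces.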

We started with Elasticsearch as our data store due to its flexible data model and querying capabilities. As we onboarded more streaming services, the trace data volume started increasing exponentially. The increased operational burden of scaling Elasticsearch clusters due to the high data write rate became painful for us. The data read queries took an increasingly long time to finish because Elasticsearch clusters were using heavy compute resources for creating indexes on ingested traces. The high data ingestion rate eventually degraded both read and write operations. We solved this by migrating to Cassandra as our data store for handling high data ingestion rates. Using simple lookup indices in Cassandra gives us the ability to maintain acceptable read latencies while doing heavy writes.

In theory, scaling up horizontally would allow us to handle higher write rates and retain larger amounts of data in Cassandra clusters. This implies that the cost of storing traces grows linearly with the amount of data being stored. We needed to ensure storage cost growth was sub-linear in the amount of data being stored. In pursuit of this goal, we outlined the following storage optimization strategies:

We were adding new Cassandra nodes whenever the EC2 SSD instance stores of existing nodes reached maximum storage capacity. The use of a cheaper EBS Elastic volume instead of an SSD instance store was an attractive option because AWS allows dynamic increase in EBS volume size without re-provisioning the EC2 node. This allowed us to increase total storage capacity without adding a new Cassandra node to the existing cluster. In 2019 our stunning colleagues in the Cloud Database Engineering (CDE) team benchmarked EBS performance for our use case and migrated existing clusters to use EBS Elastic volumes. By optimizing the Time Window Compaction Strategy (TWCS) parameters, they reduced the disk write and merge operations of Cassandra SSTable files, thereby reducing the EBS I/O rate. This optimization helped us reduce the data replication network traffic amongst the cluster nodes because SSTable files were created less often than in our previous configuration. Additionally, by enabling Zstd block compression on Cassandra data files, the size of our trace data files was reduced by half. With these optimized Cassandra clusters in place, it now costs us 71% less to operate clusters and we can store 35x more data than with our previous configuration.

We observed that Edgar users explored less than 1% of collected traces. This insight led us to believe that we can reduce write pressure and retain more data in the storage system if we drop traces that users will not care about. We currently use a simple rule-based filter in our Storage Mantis job that retains interesting traces for service call paths that are very rarely looked at in Edgar. The filter qualifies a trace as an interesting data point by inspecting all buffered spans of a trace for warnings, errors, and retry tags. This tail-based sampling approach reduced the trace data volume by 20% without impacting user experience. There is an opportunity to use machine learning based classification techniques to further reduce trace data volume.
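A rule-based tail filter along these lines could be sketched as follows. This is a minimal illustration, not the actual Storage Mantis job: the tag names in `INTERESTING_TAGS` and the span layout (a dictionary with a `tags` mapping) are hypothetical stand-ins for the warning, error, and retry tags mentioned above.

```python
# Hypothetical tag names standing in for warning, error, and retry markers.
INTERESTING_TAGS = {"warning", "error", "retry"}


def is_interesting(trace):
    """Return True if any buffered span of the trace carries an interesting tag."""
    return any(INTERESTING_TAGS & set(span.get("tags", {})) for span in trace)


def tail_sample(traces):
    """Keep only the traces that qualify as interesting; drop the rest."""
    return [trace for trace in traces if is_interesting(trace)]


# Example: the second trace is dropped because none of its spans is tagged.
traces = [
    [{"name": "play", "tags": {}}, {"name": "license", "tags": {"retry": "2"}}],
    [{"name": "play", "tags": {"device": "tv"}}],
]
kept = tail_sample(traces)
```

The decision needs every span of a trace, which is why this kind of filter can only run after the spans have been buffered and grouped.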

While we have made substantial progress, we are now at another inflection point in building our trace data storage system. Onboarding new user experiences on Edgar could require us to store 10x the current data volume. As a result, we are currently experimenting with a tiered storage approach for a new data gateway. This data gateway provides a querying interface that abstracts the complexity of reading and writing data from tiered data stores. Additionally, the data gateway routes ingested data to the Cassandra cluster and transfers compacted data files from the Cassandra cluster to S3. We plan to retain the last few hours' worth of data in Cassandra clusters and keep the rest in S3 buckets for long term retention of traces.
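The read-side routing of such a gateway might be sketched as below. Everything here is an assumption for illustration: the function `choose_tier`, the tier labels, and the six-hour hot window are hypothetical, since the retention period is described only as the last few hours.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hot-tier retention; the text only says "the last few hours".
HOT_WINDOW = timedelta(hours=6)


def choose_tier(trace_time, now, hot_window=HOT_WINDOW):
    """Pick the store a query should read from, based on the age of the trace."""
    if now - trace_time <= hot_window:
        return "cassandra"   # recent data stays in the hot tier
    return "s3"              # older data lives in long term storage


now = datetime(2020, 10, 19, 12, 0, tzinfo=timezone.utc)
recent_tier = choose_tier(now - timedelta(hours=1), now)
old_tier = choose_tier(now - timedelta(days=3), now)
```

Hiding this choice behind one querying interface is what would let callers stay unaware of where a given trace currently lives.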

Table 1. Timeline of Storage Optimizations

In addition to powering Edgar, trace data is used for the following use cases:

Application Health Monitoring

Trace data is a key signal used by Telltale in monitoring macro level application health at Netflix. Telltale uses the causal information from traces to infer microservice topology and correlate traces with time series data from Atlas. This approach paints a richer observability portrait of application health.

Resiliency Engineering

Our chaos engineering team uses traces to verify that failures are correctly injected while our engineers stress test their microservices via the Failure Injection Testing (FIT) platform.

Regional Evacuation

The Demand Engineering team leverages tracing to improve the correctness of prescaling during regional evacuations. Traces provide visibility into the types of devices interacting with microservices such that changes in demand for these services can be better accounted for when an AWS region is evacuated.

Estimate infrastructure cost of running an A/B test

The Data Science and Product team factors in the costs of running A/B tests on microservices by analyzing traces that have relevant A/B test names as tags.

The scope and complexity of our software systems continue to increase as Netflix grows. We will focus on the following areas for extending Edgar:

As we progress in building distributed tracing infrastructure, our engineers continue to rely on Edgar for troubleshooting streaming issues like “Why doesn’t Tiger King play on my phone?”. Our distributed tracing infrastructure helps in ensuring that Netflix members continue to enjoy a must-watch show like Tiger King!



Top fashion designer Dame Trelise Cooper burgled: ‘One lonely hanger is all that’s left’ – NZ Herald

Emily Walpole





Dame Trelise Cooper. Photo / Norrie Montgomery

Top fashion designer Dame Trelise Cooper is devastated after being burgled and losing her entire 2021 spring and summer samples.

“One lonely hanger is all that’s left,” said the Auckland-based designer on social media.

Cooper said at the weekend the company’s styling room was burgled and stripped of the Spring ’21 and Summer ’21 sample collections for Trelise Cooper, Cooper, Coop and Curate, along with a number of their unique couture pieces. 1800 samples gone, to the value of half a million dollars.

“All of our hard work through Covid lockdowns and 2020 – gone!”


One lonely hanger is all that’s left.

Over the weekend our styling room was burgled and our entire Spring…

Posted by Trelise Cooper on Monday, October 19, 2020

She asked people to be on the lookout for any Trelise Cooper, Cooper, Coop and Curate garments on the market, saying anyone noticing anything suspicious should contact her with information.

“The garments taken were size 8 or small samples so do not have care labels and many of these garments are not available in store yet.”

The fashion designer was very thankful no staff were harmed, “but we are truly devastated by this huge loss”.

A police spokesman said they received a report relating to a burglary of a commercial premises on Lion Place, Epsom, over the weekend.

The exact time of the burglary isn’t known at this stage.

The store was broken into and a significant amount of clothing and shoes was reported stolen.

Police have been making inquiries and a forensic examination of the scene has taken place.

Anyone with information about this incident is asked to contact Police on 105 quoting file number 201019/3913 or Crimestoppers anonymously on 0800 555 111.


US Election

US Election Remaining