Ben Sigelman and Bryan Cantrill talk Sun Microsystems, DTrace, and Shouting in the Data Center
In this conversation with Lightstep Co-Founder and CEO Ben Sigelman, Bryan Cantrill recalls his experiences working alongside Jeff Bonwick at Sun Microsystems in the 90s, (almost) interviewing with Tom West for a position at Data General, the origins of DTrace, and the infamous Shouting in the Datacenter video.
Check out the full video below.
The entire discussion is worth a watch, but we highlighted some of our favorite parts for you here.
Dear Tom West
Bryan: Data General was going to interview at our campus, and I had read The Soul of a New Machine, which is about Data General in the 70s. Now, it's the mid-90s and even as a 22-year-old, I should've realized that DG's star had long since crashed to Earth. Its finest days were definitely behind it. But, I looked at the interview schedule and it was Tom West, himself, coming to interview on campus.
I wrote this love letter, basically, to Tom West describing how I couldn't wait for him to come interview and I was really looking forward to the interview with DG. And Data General canceled. They actually pulled out. Which, you're not supposed to do. You're not allowed to. The Career Services Department, basically, their only mission in life is to make sure that this can't happen. I was the only person that had signed up to interview with them and it was very hard to not take this personally. It's like, "Clearly, this kid sent me this letter that was way too forward, and creepy, and weird. And so, there's no way I'm going to go. I'm going to cancel this."
The highs and lows of DTrace
BHS: Was there a moment for you with DTrace when you realized this was actually going to work? Because for me there was a period of, "Oh, I don't actually know if this is going to work or not, but let's find out." But then it was like, "Oh! This actually has [worked]." I'm curious if you had a similar experience.
Bryan: Yeah. As with many of these things, it comes in fits and starts, right? There's that moment where the initial idea has legs. I definitely remember that moment. What we had basically asked for was six months to prove out some of these ideas.
It was only maybe two months into that when we could dynamically instrument the entire kernel and you could turn on all this instrumentation around function boundaries that we hadn't had before. I remember being able to instrument the NIC driver while the system was running and see the entire flow of NIC functions. It was like, "Okay, this is actually pretty neat." You could see the initial idea had legs then.
Then there was an extended period of, "is this thing ever going to work?" I mean, I would make a habit of, whenever I wanted to debug a problem, I would go to use DTrace to debug it. For years in using DTrace to debug a problem, I would find the problem and then I would also find a bug in DTrace. I just remember [thinking], "am I ever going to go use DTrace and not also find a bug in DTrace?"
There's a moment when you realize, "This is going to work for me." Then there's a moment, and I imagine the same thing with Dapper, where you're like, "Okay. I'm on to something that's actually kind of larger than myself here."
The true story behind Shouting in the Datacenter
BHS: With DTrace, I always think of that Shouting in the Datacenter video. How did that happen?
Bryan: Brendan [Gregg] has gone off to the data center to play this Python script to play tones on a disk drive to see if he could get it to do anything. Brendan comes running in and says, "you have to see this," and goes running out. Brendan is just the kind of guy that if Brendan is running, you run in the same direction that Brendan is running in, right?
We go into the data center and he screams at the disks and all of a sudden we see all the latency numbers go wild. He does it and I'm like, "alright, we've got to video this." The video that you've seen is me grabbing the video camera, and I'm only seeing it basically for the second time, which is why you can hear us both laughing when he's doing it because we're so delighted and surprised by the fact that this is happening.
We cut the video and uploaded it to YouTube, which was very new at the time. I think YouTube had only been around for a couple of years. I mean, I'm thinking, "this will get 1000 views or something." That video I think has got well over a million views at this point. It became the most viewed material ever produced by Sun that was not a Super Bowl ad.
Be sure to subscribe to Lightstep’s YouTube channel for more conversations with Ben and to learn more about Lightstep.
Note: These excerpts have been edited for clarity and concision.
April 20, 2020
About the author
Ashley Rahimi Syed