Transcript from Cloud Cafe Episode #31: "Opscode Introduces Chef"

Adam’s podcast with John Willis last week covered so much detail that I decided it was worth transcribing. You can listen to the Podcast Here, and you can subscribe to the podcast here.

Cloud Cafe #31 – Opscode Introduces Chef

[0:02] [music]

John Willis: [0:18] All right, hey, this is John Willis at johnmwillis.com doing another Cloud Cafe podcast. This is an interesting one. I think it's been a long day here for the person I'm about to talk to. They've just announced a company called Opscode, but I'll get the big skinny here in a minute. [0:38] Adam, do you want to introduce yourself?

Adam Jacob: [0:40] Sure, yes. I'm Adam Jacob, I'm the CTO at Opscode and if you're a fan of reading a bunch of presentations about infrastructure, I also started HJK Solutions. Today we launched Chef, which is our new systems integration framework.

John: [1:01] Yeah, Chef. I think me and you first banged heads, or not banged heads, but got into kind of an online discussion, when I met Luke Kanies from Puppet. I did an interview with him and I wrote a blog post on Infrastructure 2.0. [1:19] You picked up on that and we went back and forth. I guess Luke's original story was that he wanted to know about iLike.com, how the heck they could build their infrastructure so fast. He called them and they told him, yeah, we're using these guys called HJK Solutions. Then he called you guys and you said, yeah, we're using Puppet.

Adam: [1:39] Yes, right, it was something like that. I think actually the way that story goes is that Travis Cole, who was the SA at iLike, was kind of already using Puppet. When the call went out for iLike to scale and they wound up getting that hardware, the only reason that they could actually build out all that hardware was because Travis had already done the work with Puppet. [2:04] After that they worked with us to kind of consolidate their Puppet infrastructure and kind of integrate it with the rest of their infrastructure. I think I'm the one who told Luke about iLike but I don't know, doesn't matter.

John: [2:21] OK. [laughs] Well either way. But I've been following you and obviously I could tell. I've always been interested in you guys because as I looked around at people that were doing this kind of really interesting thing, these scalable infrastructures, it just seemed like you guys were one of the few companies out there that was doing it. I've been following you for over a year now so you've probably got a tremendous amount of field experience. [2:49] Explain to me how that translated into what Opscode and what Chef is all about.

Adam: [2:55] Sure, HJK was a consulting company. What we started out doing and intended to do was figure out a way to let startups and any kind of web business have a fully automated infrastructure. [3:11] When I say fully automated, what I'm talking about is everything. Not just config management or just a monitoring system or something. It's actually every piece of your infrastructure needs to interact with every other piece in a coherent, fully automated way.

[3:29] The job of your system administrators really ought to be coordinating and orchestrating that infrastructure as opposed to dealing with each little component part. When we started HJK, what we were doing was consulting with startups and building automated infrastructures based on Puppet and some other tools.

[3:51] Basically every time we encountered a situation where we needed a tool, we would either build that tool or find one from the open source community and we'd fill that niche. We got to a place where, in building those infrastructures (we built 1,000 of them or so, maybe more, over the course of a couple years), we found that there were some fundamental problems just with the way the tools were designed.

[4:13] In particular, they were throwing up a lot of trouble with getting the level of integration you really need in order to have the entire stack, your whole infrastructure, automated. The tools just really weren't built to integrate well with each other.

[4:27] And so, from that, we started work on what would a framework look like, what would a toolkit look like that would actually let you really automate the whole infrastructure, to the point where you could take away the need for someone like me, at least for the initial pass.

[4:47] And from that was born Chef and Opscode.

John: [4:51] OK. And so, Opscode is basically the company, and Chef is the first offering? Is that..?

Adam: [4:57] Yeah, that's right.

John: [4:58] OK. And just to get some basics out of the way. So, is Chef open source, then?

Adam: [5:04] Yeah. Chef's open source.

John: [5:06] OK. Good, yeah.

Adam: [5:07] It's Apache-like.

John: [5:08] OK. Cool. Cool. And so, walk me through this. I've been looking and watching different spaces, obviously cloud, all the different things I've been looking at. And so I really thought there was kind of a sweet spot of kind of what Puppet's doing, Capistrano, and I know you guys had written an open-source tool, was it iClassify?

Adam: [5:31] Mm-hmm.

John: [5:32] And then, even if you look at ControlTier. So, what have you done here? Have you kind of created a new version of Puppet, or have you subsumed some of those open-source projects into Chef?

Adam: [5:48] Sure. So, we think of systems automation and systems integration in the same way that application architecture developers tend to think about service-oriented architecture, right? So, if you think about it as you've got this whole infrastructure you need to automate and all these components within it, and some of those components are things like configuration management, right? Which is kind of a role that Puppet fills. Some of them are ad-hoc changes and deployment. So that's a role that like Capistrano and ControlTier fill, right?

John: [6:20] Right.

Adam: [6:20] And then you've got monitoring and trending, and those sorts of roles get filled by the Nagioses and the Zenosses and those sorts of people of the world, Hyperic. Right? [6:31] So, what we've done with Chef is kind of analyzed, out of our own experience, where the best place was. If we were going to rethink those tools and how they work and how they integrate, what's the thing that you need most? What's the thing that's most missing? And the answer for us was we needed a way to actually call out to the config management layer, in a way that kind of exposes the resources under management in a way that is kind of like a service.

[7:05] So, when your config management system knows that it should be managing the Apache daemon, right?

John: [7:11] Right.

Adam: [7:13] Why is it that then, when I go and write a Capistrano script, I have to tell Capistrano how to manage Apache, right? Why can't Capistrano just talk to my config management system and say, "Hey, man, deal with Apache," right? And why can't my applications also inform that same structure, right? Like the infrastructure doesn't exist without an application. That's the purpose of the infrastructure. [7:36] But it's very difficult to have your application inform the infrastructure, and it's very difficult for your infrastructure to inform the application. And so Chef is kind of our first pass at a framework that lets you bridge those gaps.

John: [7:50] OK. So Chef would be like kind of a configuration management database, or what I guess Puppet would call recipes. And so the idea is that any infrastructure tool, if it uses, and I'm going to be asking you to help walk me through this…

Adam: [8:11] Sure.

John: [8:13] Any infrastructure tool that's out there, if they use kind of the Chef model or API or interface, they could then retrieve that data, and then, like your example, retrieve how Apache should be managed, pull that.

Adam: [8:29] Sure.

John: [8:30] And so, where does that layer break out, and what are some, if you could, first-cut examples?

Adam: [8:37] Sure. So let's kind of back it up a second.

John: [8:40] OK. Yeah.

Adam: [8:41] In some ways Chef and Puppet both provide a very similar functionality in that they both do configuration management. One of the differences between Puppet and Chef, and there's a lot of them, is that Chef recipes are just Ruby. So it's a DSL built on top of Ruby. And one of the things that that lets you do is very easily extend the language and very easily extend what recipes are capable of doing. [9:17] So, let's say that you, in your config management layer, need to know all of the servers that are running a particular application. In Chef, that's just a simple query. Literally, the command is "search" in a recipe, and it would return to you a list of all the systems that match that search and all of their attributes.

[9:39] And you can extend those indexes, or you can extend the language to go out and reach into other third-party tools. So, if you have a database that is running your application and you need to use some of that information — say, a list of customers — and you need to have the config management system take action based on who those customers are, you can just write a library in Chef that knows how to call out to that database and then use that library in the recipe.

[10:07] And so, in that way, Chef kind of forms this central integration point where a lot of things that you used to have to cobble together multiple tools to do, you can now just do in Chef.
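A minimal Ruby sketch of the idea Adam describes here: a recipe as a plain Ruby block that can run a search and call a helper library. Everything in it is hypothetical and for illustration only. `RecipeContext`, `CustomerLibrary`, `config_file`, and the signature of `search` are assumptions, not Chef's actual classes or API, and the node data is made up.

```ruby
# Toy model only: not Chef's real classes or method signatures.
Node = Struct.new(:name, :attributes)

# Stand-in for the server-side index of node attributes (made-up data).
NODE_INDEX = [
  Node.new("web1", { "app" => "storefront", "ip" => "10.0.0.11" }),
  Node.new("web2", { "app" => "storefront", "ip" => "10.0.0.12" }),
  Node.new("db1",  { "app" => "billing",    "ip" => "10.0.0.21" })
].freeze

# Hypothetical "library": plain Ruby that a recipe can call.
# A real one might query the application's database instead.
module CustomerLibrary
  def customers
    %w[acme globex]
  end
end

# Minimal recipe context: a recipe is just a Ruby block evaluated here.
class RecipeContext
  include CustomerLibrary

  attr_reader :actions

  def initialize
    @actions = []
  end

  # Returns every node whose attribute `key` equals `value`.
  def search(key, value)
    NODE_INDEX.select { |node| node.attributes[key] == value }
  end

  # Records a file we would like written; nothing touches the disk.
  def config_file(path, content)
    @actions << { path: path, content: content }
  end
end

# Evaluates the recipe block with a fresh context as `self`.
def run_recipe(&recipe)
  context = RecipeContext.new
  context.instance_eval(&recipe)
  context.actions
end

ACTIONS = run_recipe do
  backends = search("app", "storefront").map { |n| n.attributes["ip"] }
  customers.each do |customer|
    config_file("/etc/proxy/#{customer}.conf", "backends=#{backends.join(',')}")
  end
end
```

Because the recipe is ordinary Ruby, the search result and the library call compose with normal language features such as `map` and `each`, which is the kind of extensibility being discussed.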

John: [10:18] OK. So then do you have kind of a phone-home architecture, like where you just have the Chef server and then agents running on the different nodes that you want to manage, like the Puppet model, where they kind of ping back?

Adam: [10:34] Yeah.

John: [10:34] OK.

Adam: [10:37] Yeah, that's right. So basically, you have a centralized Chef server, and then you've got Chef clients. And it's all just REST, right?

John: [10:42] OK.

Adam: [10:44] So, kind of the architecture of a Chef run is that you've got these clients. They connect up to a Chef server. They register themselves for an OpenID. So, that's kind of the authentication layer for Chef, is actually OpenID. And that has interesting side effects in how you can scale traffic. But we can kind of get to that later, or you can read about it on the wiki. [11:07] But these nodes register themselves for an ID and then kind of report all the information about the system, and that stuff gets cataloged inside the Chef server and in a full-text search index. And then recipes get applied to the servers.

[11:24] So, one of the things that Puppet does is all the recipes get compiled on a centralized server, and Chef moves that over to the edge. So, basically, the edges synchronize all the data that they need in order to build that configuration, and then the clients will build it themselves.
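The run Adam outlines (register, report, get indexed, then build the configuration on the client) can be sketched in plain Ruby roughly as follows. This is a toy model under stated assumptions: `ToyServer`, `ToyClient`, and all their methods are invented names, the registration step merely hands out a random ID rather than doing OpenID, and the "full-text search" is a naive substring match.

```ruby
require "securerandom"

# Toy server: hands out IDs, stores node data, and serves recipes.
class ToyServer
  def initialize(recipes)
    @recipes = recipes   # recipe name => Ruby callable
    @nodes = {}          # node id => reported attributes
  end

  # Registration: the client receives an opaque ID (stand-in for the OpenID step).
  def register
    SecureRandom.uuid
  end

  # The client reports facts about itself; the server only catalogs them.
  def report(node_id, attributes)
    @nodes[node_id] = attributes
  end

  # Very rough stand-in for full-text search over reported attributes.
  def search(term)
    @nodes.select { |_id, attrs| attrs.values.any? { |v| v.to_s.include?(term) } }.keys
  end

  # The server ships recipe code as-is; it does not evaluate it.
  def recipes_for(names)
    @recipes.values_at(*names)
  end
end

class ToyClient
  attr_reader :node_id, :config

  def initialize(server, attributes, run_list)
    @server = server
    @attributes = attributes
    @run_list = run_list
    @config = {}
  end

  def run
    @node_id = @server.register
    @server.report(@node_id, @attributes)
    # The configuration is built here, on the client ("the edge").
    @server.recipes_for(@run_list).each { |recipe| recipe.call(@attributes, @config) }
    @config
  end
end

RECIPES = {
  "apache" => ->(attrs, config) { config["apache.listen"] = "#{attrs['ip']}:80" }
}.freeze

SERVER = ToyServer.new(RECIPES)
CLIENT = ToyClient.new(SERVER, { "hostname" => "web1", "ip" => "10.0.0.11" }, ["apache"])
CLIENT.run
```

The point of the sketch is the division of labor: the server stores and indexes data and hands out recipe code, while the evaluation of that code happens on each client.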

John: [11:39] OK. And I know I know the answer to this, but it's just for my dumb, big, brick head. But you are not using Puppet. You are using a purely Chef, new-built implementation.

Adam: [11:51] Yeah, that's true. We are not using Puppet.

John: [11:53] Yeah. OK. I knew I knew the answer, but I had to ask. [laughs] I figured if I was just a percentage confused…

Adam: [11:59] Sure.

John: [12:00] Well, cool. So, a first cut, if you will, or first version, you are basically a standalone — and I'm going to stop saying "Puppet." But you are a standalone configuration management that can do, basically, the things that Puppet can do. And then, are you doing some of the other things you talked about at this point, or have you built in kind of the Capistrano or some of the ControlTier agent functionality as well?

Adam: [12:29] We haven't.

John: [12:30] OK.

Adam: [12:33] Some of that stuff, you absolutely could do with Chef. Some of that stuff will wind up being a different service that just knows how to talk to Chef and integrates really well. And some of those things might be ControlTier and Capistrano. I mean, tying Capistrano into Chef is a no-brainer. It's seconds' worth of work, literally. So that sort of thing's going to happen kind of immediately. You wouldn't really replace Capistrano. What you would do is extend Capistrano to use Chef.

John: [13:03] OK.

Adam: [13:07] Does that answer your question?

John: [13:08] Yeah. But, from kind of your original discussion about what Chef is, that you want to take the guy, like a guy like me, right? I mean, if I was to go out and install Chef and try to work with it tomorrow, I wouldn't have to bring you in for three hours of consulting or a week of consulting, right? And that's the goal.

Adam: [13:28] No.

John: [13:29] So you could tie in something like Capistrano, where if I got through kind of the getting started Chef, I wouldn't have to really know — not that there is too much to know about Capistrano or anything like that…

Adam: [13:42] [laughs] Sure.

John: [13:43] But I'd have that kind of built into my Chef knowledge pretty easily?

Adam: [13:47] Yeah, that's exactly right. Yeah.

John: [13:49] Good. Good.

Adam: [13:50] And one of the things that Chef is really focused on is redistribution. So one of the things that makes possible a world where you can have people build automated infrastructures without having a super-deep knowledge of what's going on under the covers is just kind of getting the best practice from the people who really did spend a lot of time figuring out what's the right way to do this.

John: [14:08] Yeah.

Adam: [14:09] So that redistribution problem is something that we've worked really hard on with Chef, and it's kind of on our road map to follow through on, in what we think is going to be a really awesome way. So the cookbooks that you produce when you write Chef recipes are shareable units.

John: [14:24] Good.

Adam: [14:25] In addition to being shareable, they're also overwritable. So you can actually take an upstream cookbook and then have just slight modifications for your own infrastructure, right? So, I like how this particular cookbook does this, but not this other thing, so I'm going to use that and just overwrite this one section.
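One way to picture "use the upstream cookbook and overwrite one section" is as a layered lookup where local entries shadow upstream ones. The Ruby sketch below is only an analogy under that assumption. It does not show how Chef stores or resolves cookbooks, and the section names and strings are invented.

```ruby
# Upstream cookbook: named sections mapped to plain Ruby callables.
UPSTREAM_APACHE = {
  "install"   => -> { "install package apache2" },
  "configure" => -> { "write upstream httpd.conf" },
  "service"   => -> { "enable and start apache2" }
}.freeze

# Local overlay: replaces only the section we disagree with.
SITE_OVERRIDES = {
  "configure" => -> { "write site-specific httpd.conf" }
}.freeze

# Later entries win, so the overlay shadows the matching upstream section.
def resolve_cookbook(upstream, overrides)
  upstream.merge(overrides)
end

RESOLVED = resolve_cookbook(UPSTREAM_APACHE, SITE_OVERRIDES)
STEPS = RESOLVED.map { |name, section| [name, section.call] }.to_h
```

The upstream hash is never modified, so a newer upstream version can be dropped in and the same small overlay reapplied.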

John: [14:43] Oh, very cool. Yeah.

Adam: [14:44] So that kind of distribution model is the thing that, in order for this to work, you really need to focus in on. And every decision we've made, we've asked ourselves, "Does this have some impact on our ability to distribute this to other people? If we made this design decision right here, is that going to negatively impact the ability for people to share these recipes?" And anywhere the answer was yes, we chose to not do that.

John: [15:08] Yeah. Well, that's the thing. We've been talking about doing this, and we waited, obviously, till you announced it. But I knew, with your guys' kind of domain experience in the field doing this stuff day to day, and when I heard you guys were going to have a product, I just thought, "This has got to be interesting." [laughs] [15:28] So it takes me to my favorite subject. So tell me about… because, again, when I look at what is Puppet missing, or how do you take some of these features together and make it easier to do some of the things people are doing in the cloud right now, right? So there's a lot of these kind of add-ons trying to make… it's even a bigger problem there, right? Because people that are going into the cloud, if you're going to build a data center, or 10 or 15 or 30 or 50 servers, there's some expectation that you're going to have some kind of domain expertise, right? Or sysadmin.

[16:00] But people are literally going in the clouds. They don't have any experience, and they want this kind of infrastructure in a bucket, [laughs] or on a stick, right?

Adam: [16:13] Yeah. I said "infrastructure in a box" for a long time.

John: [16:15] There you go. I think I like my new one: infrastructure on a stick.

Adam: [16:18] Yeah.

John: [16:20] Yeah. So tell me that story.

Adam: [16:22] I mean, that's definitely what's happening. And one of the reasons we released Chef — I mean, we released it when we did because it's ready and it's usable now to do real-world, complicated things. And one of the places where it's doing that is Engine Yard's new Solo product is driven by Chef, on the backend. And so, kind of directly in the cloud. [16:48] One of the places where we're going here is that, if you have these resources that are kind of available as a service, and you have this nice systems integration framework, you can start making those decisions and building a system that actually automates your cloud infrastructure, and your physical infrastructure, in the exact same way, without having to really know what's in the cloud and what isn't. Like, if they're just resources under management that you don't really care if they're in the cloud or if they're in your physical infrastructure, or whatever.

[17:20] And to some degree, eventually, you're going to wind up leaving some of those decisions up to people other than you. You'll pick a cookbook and go.

John: [17:32] Yeah, I know. I was just thinking about that this morning. I did a podcast and I was thinking that the ultimate game here is… literally right now, the early adoptions of the enterprise clouds today are the things like really self-service provisioning. [17:50] You know, I want a server that looks like this, that I can do some testing on between the hours of 8:00 and 5:00 tomorrow. And I hit a button and I get it. Right? But the real Holy Grail here is when you get self-service to the point of, and this is a real stretch, I want an Animoto and I want to hit a button. Know what I mean? You know, I want that kind of infrastructure on a stick.

Adam: [18:16] We're a lot closer to that than it seems. But the reality is that to get there, the world needs to look a lot more like the Internet and a lot less like the enterprise. [18:31] The problem is that a lot of the tools that exist today and that people have been using, we think about configuration and we think about infrastructure in the way that the enterprise products do, which is to be careful, to be slow, to be tightly coupled, to be highly defined. Right?

[18:48] But the Internet taught us that it actually is better to be loosely coupled and to not be quite so hung up on definitions. It's actually better to just do it and have that standard sort of form up around you. REST kind of taught us about having the interfaces be well-defined and then having them be really simple. Right? I have it now, I'm going to get this resource and it's going to return whatever representation of it I ask for, and then I'm going to do something with it and I'm going to put it back.

[19:15] And that's the whole API. How much simpler is that than the interactions that we had to deal with in a lot of the kind of enterprise management frameworks? And the same thing's going to be true of getting to a place where you can kind of click a button and get an Animoto. Right?

[19:36] The way you do that is all the things that go into building an Animoto just need to be resources that you don't really care about how they work.
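The "get it, do something with it, put it back" interaction can be sketched with an in-memory store in Ruby. This is a minimal illustration, not an HTTP client and not Chef's API. `ToyResourceStore`, the `/nodes/web1` path, and the `format:` option are all assumptions made for the example.

```ruby
require "json"

# In-memory stand-in for a service that exposes resources by path.
class ToyResourceStore
  def initialize
    @resources = {}
  end

  # GET: return the resource in whichever representation was asked for.
  def get(path, format: :hash)
    resource = @resources.fetch(path)
    format == :json ? JSON.generate(resource) : resource.dup
  end

  # PUT: store the new state of the resource at the same path.
  def put(path, resource)
    @resources[path] = resource
  end
end

STORE = ToyResourceStore.new
STORE.put("/nodes/web1", { "role" => "web", "max_clients" => 100 })

# Get it, do something with it, put it back.
node = STORE.get("/nodes/web1")
node["max_clients"] = 200
STORE.put("/nodes/web1", node)
```

The caller never needs to know how the store keeps its data. It only needs a path, a representation, and two verbs.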

John: [19:39] Right, that's right.

Adam: [19:40] And we're on the way there.

John: [19:42] Yeah, and I can see that. Now I guess the obvious question, the Engine Yard thing, it's funny, I was just blogging about that. Obviously it was news within the last couple of days about their kind of AWS offering. So that's cool. I work with a lot of the Atlanta web entrepreneur guys, right, and they tell me, "John, we need that kind of sub-$100 version of RightScale." [20:06] And I'm like, "Well, there's some things, but you've still got to get your fingers dirty." You're going to have to get a consultant in to get that last mile in, you know?

[20:20] It looks like you guys… the thing that comes up to me immediately is what's missing with some configuration management tools is the provisioning. Can you provision? It looks like you already have provisioning, I mean provisioning for things like cloud, like AWS, starting, stopping, instances, AMIs and all of that. So you have that stuff kind of built in today? Obviously, it must have been necessary for the Engine Yard implementation.

Adam: [20:46] If you think about it as disconnected services, the real question is, is it possible from within a Chef recipe to call out to the Amazon Web Services API and provision servers? And the answer is yes. So, because Chef is just Ruby, all you have to do, if that was something you wanted your Chef server to do, is write a library or even just put it directly in the recipe and call out. [21:18] So literally what you would end up doing is extending the Chef language to understand resources that are AWS and just doing it.

John: [21:27] OK. So you have kind of a provisioning infrastructure? Because, you clearly have a configuration infrastructure based around the Chef, and that's kind of built in. Right?

Adam: [21:40] Chef doesn't really do provisioning in that way.

John: [21:45] OK.

Adam: [21:47] You could use Chef to do provisioning. If you wanted to, there's nothing stopping you from writing a recipe that understands how many EC2 servers you should be running and whether or not they have elastic block devices attached to them. You can totally do all that within a Chef recipe. The question is whether you would want to and the answer is maybe. Right? Hard to say. But, yeah, right now, it doesn't do that kind of provisioning for you.
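A recipe that "understands how many servers you should be running" amounts to converging the actual count toward a desired count. The Ruby sketch below shows that loop against a fake, offline cloud object. `FakeCloud` and its `launch` and `terminate` methods are invented for illustration. They are not the AWS API or any real SDK, and nothing here is taken from Chef itself.

```ruby
# Fake cloud API so the example runs offline. It is NOT the AWS SDK;
# the method names here are made up for illustration.
class FakeCloud
  attr_reader :instances

  def initialize(running)
    @instances = Array.new(running) { |i| "i-fake#{i}" }
  end

  # Pretends to start one more server.
  def launch
    @instances << "i-fake#{@instances.size}"
  end

  # Pretends to stop the most recently started server.
  def terminate
    @instances.pop
  end
end

# Converge the number of running servers toward the desired count.
# Running it again with the same target changes nothing.
def converge_server_count(cloud, desired)
  cloud.launch while cloud.instances.size < desired
  cloud.terminate while cloud.instances.size > desired
  cloud.instances.size
end

CLOUD = FakeCloud.new(2)
FINAL_COUNT = converge_server_count(CLOUD, 5)
```

Describing the target state rather than the individual launch calls is what lets the same code scale up, scale down, or do nothing, depending on what is already running.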

John: [22:12] Got you. You guys, do you see that as an opportunity? It doesn't sound like, maybe a reference…

Adam: [22:20] Yeah, it's hard to say. Certainly it wouldn't surprise me at all if us or someone contributed to Chef a recipe that knew how to manage those resources. It's more likely if it were something we were going to build, that that would be a separate service that also utilized Chef.

John: [22:39] OK. Yeah. I get back to, are we getting closer to finding that Holy Grail that the entrepreneurial web guy can say, "I can use this. This looks like it handles most of what I've got." You're familiar, obviously, with Scalr or something like that. Right?

Adam: [22:57] Mm-hmm.

John: [22:58] Would it be nice to have a Chef version of a Scalr?

Adam: [23:02] Absolutely.

John: [23:03] Yeah. And so how does the web entrepreneur get there? He brings in a Ruby guy to help him do that last mile or does he look for you guys doing it?

Adam: [23:15] Today, I don't think we know the answer to that quite yet. The world that I think is going to arrive is a world where the management… how you want to approach managing those resources. Amazon launched their AWS console that lets you just do basic web management. You've got the RightScales and the Scalrs and, to a lesser extent, the Elastras of the world who provide you kind of a UI to manage certain types of those resources. [23:50] Tools like Chef that are built for the cloud environment and are built to deal with that sort of an environment are going to make those sorts of consoles much more powerful. So how you manage them is going to be less important. How you manage them in terms of the UI and how you manage them in terms of like, where's the thing I go to that understands that these are the systems that are running and that my load just spiked over this little bit and so I want to scale up a little bit more.

[24:18] Once we get the fundamentals right, those problems become much, much simpler. Right? So, if you think about the things you need to do in order to implement autoscaling, all of those things are actually, in some ways, separate services. And what they need is to be able to understand how much of other services are working together.

[24:40] You need to know: how's my infrastructure performing? If I need to take action, what does that mean in terms of the config management layer? What does it mean in terms of deployment error rate? Does my app automatically redeploy? What does that mean, right?

John: [24:52] Right.

Adam: [24:54] And a lot of that stuff we're just now getting to the point where we have kind of the ancillary technologies that are good enough to really put those together in a way that is actually really great. Most of the solutions to those problems right now, they're really – they're just not broad enough. Right?

John: [25:11] No, I totally agree. Yeah.

Adam: [25:12] It's been a really small niche and for that small niche, they're really great. And the way we're going to get that out of that tiny niche and into ubiquity is getting all the other tools right.

John: [25:25] Yeah. I know. We could spend days on terminology here, but I always think of kind of autonomics or, you know, to a certain extent, autoscaling as the combination of configuration management, provisioning, automation in the sense of something like Capistrano's or ControlTier's, you know, ability to run commands, and then monitoring.

Adam: [25:46] Yup.

John: [25:47] And so today, if those were the four pieces of the pie, what does Chef give us today?

Adam: [26:00] Right now, Chef is giving you the configuration management layer. And it's giving you the orchestration services that you need in order to tie the other services together.

John: [26:11] OK. OK.

Adam: [26:13] So it's the thing that's going to let you tie together the fact that your monitoring system has sent an alert and that that needs to take action on the infrastructure. Those sorts of things, that kind of orchestration, that kind of glue, that's where Chef is.

John: [26:26] Your holy grail, is what you said, is really for the orchestrator to be able to get to that ubiquitous layer, is that right?

Adam: [26:37] Absolutely.

John: [26:38] OK. I'm trying to get to where, I'm thinking the long way here, but to get to what is the holy grail version of Chef, so now is the time to ask.

Adam: [26:48] The holy grail version of Chef is a place where other developers, other tool builders, start to realize the value of building these kind of really smart accessible services and exposing what used to be really closed, walled-garden applications, where the data is really kept inside of the apps, and they start to expose that information differently. [27:13] When that starts to happen what we're going to start to see is people are going to be able to select which tools they want to fulfill those goals and they're going to be able to really simply tie them together. Then we can build other more complex services on top of that.

[27:28] Infrastructure, really, if you think about software as a service, and you think about infrastructure, the holy grail is a world where we have all of those tools that used to stand alone and be in their walled garden and only communicate with each other because the systems administrator wrote a Perl script, which is awesome, it totally worked, but it doesn't really work well, it doesn't scale, you can't share it, it's not repeatable, and the list goes on and on.

[27:52] The Holy Grail answer is that we start to build those in that world where everything is service orientated and you actually can orchestrate them together.

[28:01] In that world, who cares what you're using for the config management layer, right? What you care about is that there's a resource that I expected to take some action, and did it, or did it not take that action? Did it work, yes or no, right?

John: [28:15] Yes, no.

Adam: [28:16] You don't care what the monitoring system is. You just care: are you watching this thing and is it in a failed state, right?

John: [28:21] Right, absolutely. Not to say that this is the ending, but to kind of summarize, my understanding of what you guys have done is you've built this with all your best knowledge, and I say that because I watch what you guys say, I respect what you say, and I know you guys know what you're doing. So with your best knowledge of what infrastructure is today, which is great, because when you think about the money that companies like BMC have spent on BladeLogic, and HP, those are products that totally missed this last three to five year window, you know what I mean? What's going on? [29:04] There's no better time for somebody with the knowledge of infrastructure to build a new version, the year 2K version Provision, or whatever you want to call it. All right, so I'm rambling.

[29:16] [laughter]

John: [29:18] So you've taken this knowledge and your first cut is obviously configuration, because you know that, you really got that, but you built this orchestration model. The hope is to get that really right. Now you think you've got it right first cut, you want to make sure you have it right so you can mature it while, in the meantime, some of these guys like monitoring and provisioning tools flesh out that you've got it right. [29:46] Then ultimately, either you provide that service, or because it's fleshed out so well, people integrate and it becomes this seamless software-as-a-service variety of infrastructure.

Adam: [29:58] Yes, you've got it.

John: [29:59] How do you like that? Just to kind of go back a level a little, I always wonder about what ControlTier does and I really like what they do on the application lifecycle.

Adam: [30:16] I do too.

John: [30:17] That's why I got real excited when I saw you at Velocity, you talking to Puppet, and I'm thinking, "Boy what a marriage here." [30:25] The obvious question is: do some of these Ruby-based configuration orchestration tools miss out on the Enterprise Java guys or is that just a possible perception thing?

Adam: [30:44] Maybe they do. I think there's always going to be a certain degree of language bigotry, right? Like, "Oh, you didn't write it in Java. Well, that's not Enterprise-ready," right?

John: [30:53] Right.

Adam: [30:54] Or, "Well, they wrote it in Java. I'm not using that at my website." It's less about the language; it's more about what it does for you and how you relate to it.

John: [31:03] I know that. I think the more intelligent seeker would know that, we always notice the dopes.

Adam: [31:12] Yes.

John: [31:13] I guess the real gut to the question: in your opinion, with your expertise, is there something beyond just the language bigotry, it being Ruby-based and being able to manage Java or not?

Adam: [31:22] Oh, no.

John: [31:23] That's what I figured the answer would be.

Adam: [31:25] The reality is that the thing about the Enterprise and infrastructure automation and infrastructure tools in general is that their requirement list hasn't caught up to the requirement list of the big web shops. [31:42] So if you think about big web shops, they don't use any of the big Enterprise Infrastructure tools, right? The reason is that they don't actually fit their model at all: the way they work, the way they grow, the agility that they have, they just don't usually mesh with those really big products.

[31:59] So what you wind up with is this kind of divide where the enterprise is like, "Well…" One question you get a lot of when you're building these tools is, "Can I roll back?" The enterprise really wants to know about rollbacks. How often does the web roll back, right? The web rolls forward.

John: [32:14] Right.

Adam: [32:17] Rollbacks are hard. What you really need is an appropriate apologies. The line there is less about the language, it's more just about the sort of requirements that each side thinks that they need. Over time, I think, they're going to wind up seeing the enterprise look more like the web and not the web look more like the Enterprise.

John: [32:38] From your mouth to God's ears…

Adam: [32:42] God willing. Who knows how long that will take, but I think it winds up that way partly; it's the tools and ControlTier. One of the things that's awesome about ControlTier, and there's a lot of things that are awesome about ControlTier, but one of them is it fits that niche really well.

John: [32:56] Right.

Adam: [32:57] ControlTier has straddled that line, so ControlTier really does have some really cool features that the enterprise really likes and respects. It is actually functional enough, agile enough, and works well enough that you totally can use it in a web shop. They've done a great job with that.

John: [33:15] Yes, I totally agree. I don't know if you saw the whole Eli Lilly webcast? It was great. I keep saying 2009 is the year of the enterprise and the Cloud. I think when you start seeing companies like Eli Lilly coming out and saying things like, "Our researchers used to have to wait 8-12 weeks to get a server provisioned. Now they wait three to five minutes."

Adam: [33:46] Right.

John: [33:47] And he said things like, and this is mind boggling to me, that Amazon has redefined the concept of time. It doesn't just mean provisioning. It means this kind of concept of latent demand. You always see it when somebody builds: they have the capacity planners figuring out for months and years, "OK, we're going to add another server." And then all of a sudden that new server is 90% used in a hundredth of the time they expected. It was all this latent demand. [34:20] I think that's what Eli Lilly has started, this exposure of what happens when your crown jewel researchers, or developers, or whatever you do, have had this kind of blockage. Like, "Well, I could never even try that. The chances of getting servers to do that would take me a year to get." Where now they think in terms of, "I could do that because I could run 100 of these right now."

Adam: [34:42] Yeah!

John: [34:42] I think that story's going to get out. The problem is going to be for the enterprises, the companies that buy into what Eli Lilly has done. They're not going to be able to use the HPs and the Tivolis to get them there. You know what I mean? [34:56] They're going to have to change really fast, from your mouth to God's ears. What they're going to be looking for is things like Puppet, or now Chef, and tools that can get them there quickly to give them this kind of functionality that they need. They need this configuration management.

Adam: [35:15] Yeah. Absolutely. Well, and there's a layer going on underneath that, too, that's just cultural. If you are in web operations and you can have a developer come to you and say, "I need 10 more servers," and your answer is, "I'll have those for you in 18 weeks," you're done. Right? That's not an option if you run a useful web shop, right?

John: [35:40] Right, right, right.

Adam: [35:42] That's not a choice, right? And you can totally do that in the enterprise, and that's just how it is, and just, "It's going to take 18 weeks to get those resources, Bob." [35:51] Well, culturally, the enterprise is going to wind up looking more like web ops and less like the enterprise, because, in a web-ops shop, one of the ways that you can get to that place is by saying, "You know what? My developers can go in the data center." They get trained by my ops guys, right? And they have to do the server installs the way my ops guys tell them to do it, but now they're unblocked, right?

[36:18] And that kind of thinking and that sort of operational shift, where operations isn't a group. It's not a unit that winds up like in the corner of the company that only gets called up when the servers are on fire. It's the integral soul of everything you do. Operations is your business. It's your whole company. And in that way, that's really the place that's going to push the enterprise away from what they've been doing and to look more like the web.

[36:44] That cultural shift is going to come along with the desire for these tools. The tools are going to be so sexy that they're going to be like, "How can we use these tools?" The only way to use those tools is to make these cultural changes. Next thing you know, the enterprise is the web.

John: [36:59] Well, and my pitch to the enterprises is that they'll make the same mistake that they made before. This shows my age, but with client-server, it was the glass-house mentality in the '80s.

Adam: [37:11] Yeah.

John: [37:12] And then client-server came out, and what happened is the enterprise drew a line in the sand and said, "Not until we understand this, we're not going to give you that." And they would do these year-long studies. And meanwhile, their business units went out and just bought client-server solutions.

Adam: [37:29] Right.

John: [37:29] And we spent 10 years trying to clean up that mess. And if they don't watch it now, the same thing — I'm seeing it every day. The "do now, ask for forgiveness later." The "New York Times" story, right? They're just going to go out and they're going to say, "I'm not waiting a half-year for that." [37:49] And I think the end result of that, too, is a product like yours that, part of that 8-12 weeks, or three months, or whatever it is, right? Or six months. I know a customer, a large bank, that's waiting a year for a server to be…

Adam: [38:01] Yeah.

John: [38:03] And most of that, almost all of that is like compliance and certification. And so one of the points of a product like yours is that the certification is built into the knowledge modules.

Adam: [38:17] That's right.

John: [38:19] And today it's a round robin. It goes to the SAN guys. Then it goes to the security guys. Then it goes to the monitoring guys. Then it goes to these. And if they could all build their knowledge into the recipes, right?

Adam: [38:29] That's right.

John: [38:31] Yeah. So they just get the best of all worlds. It's one size fits all.

Adam: [38:36] You can actually use a continuous integration server to show that you are compliant.

John: [38:41] Yeah. Yeah, that's right. And then your reporting. Again, people spend a lot of money just to buy other products to go report to them what the heck they do have after they've gone through all that. Yeah. This is all very, very exciting stuff, I think. In a way, even when I first started talking about the Infrastructure 2.0 stuff.

Adam: [39:01] Yeah.

John: [39:01] Do you have any ideas about monitoring, since I'm a monitoring geek? What are best fits for kind of auto-scaling?

Adam: [39:08] Boy, I have a whole bunch of ideas about monitoring.

John: [39:10] [laughs]

Adam: [39:13] I don't know. I haven't seen the monitoring tool that I want yet.

John: [39:17] Ah. OK.

Adam: [39:19] The thing that kills me about the monitoring tools is that why is it that there's no monitoring tool in the world
