Eric Kavanagh: Okay folks, hello and welcome back once again. You can see the slide in front of me, hopefully, it’s called “Hot Technologies of 2016.” The years keep flying by. Today we’re talking about “Analyze and Optimize: A New Approach to Monitoring.” Oops, we’ve got a little error on the slide there, don’t look, don’t look! Okay, so, there’s a slide about yours truly. I will be your host, you can look me up on Twitter, @Eric_Kavanagh, and I’ll be glad to tweet back at you.
We have a different format than The Briefing Room here, so first of all we’re going to have a couple of analysts, Rick Sherman and our very own Dez Blanchfield, data scientist at the Bloor Group, they’re going to give you their take on the topic. Then we’re going to hear from Robert Vandervoort, the expert, he’s over at IDERA, which is a very interesting company. They bought a company we know, called Embarcadero, but they have a whole bunch of other stuff, and some interesting stuff, which is now being used in some new and cool ways. Rick Sherman comes first.
Before I go there, let me just throw out a couple of quick thoughts. I like this concept of analyzing and optimizing through monitoring, and I like what we’re going to hear from Robert today about changing the way you think about monitoring solutions. Because the fact is, monitoring is what you do all the time anyway, if you’re in the IT world. Somehow, some way, even in the business world, you are doing monitoring. It might be formal, it might be informal, but there’s some mechanism by which you operate your day-to-day tasks. And if you’re working with machines, you’re trying to figure out what they’re doing. You’re trying to prevent them from failing, for example, or having poor performance.
How do you do that? Well, there are lots of ways to do that. The cloud has really spurred on this whole wave of innovation in monitoring, which I think is quite interesting. We’ve seen companies like Splunk come along and really change the game, and a lot of different companies are now trying to monitor in different and interesting ways. And what we’ll hear today from IDERA is what I think is one of the more creative approaches we’ve come across in quite some time, and I hope it’s one that resonates with you folks out there today. You can ask questions at any time, using the Q and A component of your webcast console. Don’t be shy, send those questions in. And with that, I’m going to hand it off to Rick Sherman. Stand by. Take it away, the floor is yours.
Rick Sherman: Okay, thanks Eric. Hey, everyone. We’re going to talk about that monitoring thing and why there’s definitely been a need to change how we approach things. Now first off, just quickly, my background: I’m in the world of business intelligence, business analytics, data integration, etc., as opposed to the app side. I’ve been sort of at the back end of these different trends that are happening in the industry. We have the data deluge: big data, small data, data coming from all over the place, inside and outside the enterprise.
We have the internet of things, things coming in from monitors and devices, and then we have an explosion of things other than just relational databases out there, both on-premise and in the cloud, etc. But what all of this means for monitoring, for system performance, application monitoring and management, etc., as well as for data integration and for business intelligence, is that we used to have a nice simple world. At least it was simple from the IT perspective: IT used to have a set of servers and everything was on there, the applications, the data, and it was all on-premise, so they controlled the whole world. It was a lot easier to manage. But what’s happened has been that the enterprise has gotten much, much, much more complex.
We have an explosion – forgetting just big data – we have an explosion of applications both on-premise, and in the cloud, to improve business productivity, to enhance different business processes, for businesses to interact with other businesses and with their customers, be they businesses or people. We’ve had an explosion, as the other slide shows, of different kinds of different databases, big databases, relational, cloud, etc., and we have had much more, better utilization of servers, operating systems, both with real and virtualized servers out there, to better manage, better utilize the individual servers themselves. And, of course, we have a whole network of things happening between all these applications, databases and servers.
Another thing, particularly in my world, has been that all this has spurred even more application synchronization. We have more and more application servers and databases that are used to move data, synchronize data, and integrate data between different processes, both inside and outside an enterprise. And of course we have the data integration that’s needed to support that.
With that in mind, and with the fact that we’ve moved from this nice, safe world of an on-premise set of servers that we managed, to this enterprise and extra-enterprise mix of applications and data, the question becomes “how do we actually manage that environment?” And the reason why this webinar is interesting is because the current state of affairs hasn’t been too good. We’ve had a lot of different tools to look at databases, servers, SharePoint, operating systems, data movement, etc., and they’ve all been scattered. As such, we’ve had silos, so we’ve been able to manage or monitor a specific server, a specific application, a specific database, but we haven’t been able to put them together. Now, since they’re all interactive and interrelated, it’s more than just the individual piece parts, you need to put them together, and as such we’ve had – I guess this is my high school picture – we’ve had people that have had specialized knowledge of these tools to get deep into the bowels of the systems in order to manage them.
They’ve been expensive and costly, time consuming, and we’ve been sort of stuck in the mud in that we keep looking at and trying to manage these piece parts and haven’t been able to really manage the enterprise. Where that’s left us, or where that’s brought us to, is the need. The need has been to get into enterprise monitoring. We need to be able to look at applications both on-premise and in the cloud, databases, the same way. Servers, networks, virtualized, non-virtualized systems, the data integration, the applications synchronization that’s out there. As in business intelligence analytics, the first thing you need to do is capture the data about all these different services and the infrastructure, the applications.
The second thing you need to do is to then put that data together in order to look at how the pieces are interrelated. You can’t do anything until you figure out how these pieces are interrelated and bring that together. How we’ve moved up from the piece parts to more comprehensive, enterprise application management has really grown out of this: because we’re capturing the data and because we’re integrating the data, we’re able to enhance the analysis of that application management and monitoring.
The first thing we need to do is figure out what is happening to these individual systems or piece parts. Second thing we need to do is understand why it’s happening. That requires more in-depth knowledge of the applications, the databases, servers, and how they’re inter-connected and how they’re related to each other and what one thing will trigger something else. I mean, often we run into problems where something happens and it’s really not the root cause, it’s just the symptom of something else. We need to figure out why it’s happening, but we have to collect the data and be able to monitor the piece parts.
Finally, we’ve got to get into a little bit of the predictive analytics or predictive monitoring, where we’re starting to figure out what is likely to happen or what’s going to happen next. If something fails or is about to fail or hits some threshold, we’ll need to be able to trigger on that and understand what it implies and what else will happen next. We’re capturing the data with the monitoring, we’re starting to analyze the what, why, and what’s next, and then we finally get into managing based on the data and based on the analysis.
Remember, it’s nice to capture data, it’s nice to analyze data, but that analysis and data actually has to be actionable. You need to be able to be reactive, to react to what’s happening, and be proactive in trying to fix it. So we also have to have not just monitoring tools and visual analysis, but it’s also critical to be able to actually fix things in an automated or systematic fashion. This is sort of the need that’s grown in the enterprise, and again from the BI and business analytics perspective and the data integration perspective, we’ll often have issues trying to figure out what the break points are. Why isn’t something scaling, why is something failing, why are business users not feeling that the service level agreements are being met? We can do all this great stuff with the applications, with the data, but the systems that support it have to be managed in order to enable all these great things that are going on out there. Dez?
Eric Kavanagh: Right, take it away, Dez.
Dez Blanchfield: Thank you, wow. We probably have a couple little areas that we completely agree on there. A quick bit of background on my life in the world of monitoring things. In fact, almost 20-odd years ago my brother and I used to work together in environments that looked a lot like this. This is a network operation center. This is a current one, and we managed everything from routers and switches, and servers, and firewalls, and systems running applications, and the applications on there and databases on there and a whole range of servers were interconnected.
At the time, there weren’t that many tools available to do monitoring. There were quite a few free and open source tools, but the few application stacks that did end-to-end monitoring were expensive and hard to get your hands on. And so we actually sat down and wrote one, believe it or not, and the internet was sort of just becoming a thing, and we used to run tools on these Unix systems, Solaris systems, to collect system activity reports and disk usage and memory usage and so forth, and log it to a file and run a script on it. We actually used to email the collected data into a central server, pull those log file entries out of the emails as they came in, analyze them, stick them in a database and draw pretty graphs about them.
We thought we were pretty clever and pretty cool because we could tell what was going on, but the thing that struck us before long was that, although we could actually report on the historical state of the nation, it didn’t really tell us a lot about the current state of the nation in the immediate sense. The data we were collecting was being emailed somewhere, so it was invariably a couple of minutes before it went from the server that it was collected on, across the network, via email into a mail server, and was pulled apart and put in a database. So really it was pretty graphs, but it was all in arrears, all historical.
In fact, in the top left-hand corner of this pretty picture of like 18 LCD panels pretending to be one virtual desktop, there’s a graph, a little green graph in the top left corner that looks very similar to what we used to do, mapping things out. And we had this constant frustration that it was almost impossible for us to kind of tell what was happening at the moment, or even what was going to happen in the future. No matter how many times we tried to do some sort of predictive graphing, and this is nearly twenty years ago roughly, from memory.
This is a picture of an actual network operation center screen, it’s 18 LCD panels all glued together pretending to be one great big Windows desktop, and this is often the state of the nation currently for the types of things that organizations or telcos or large enterprises run to keep track of what’s happening in their world, whether it’s their networks and their routers, or their switches and application servers. What’s interesting inside this particular screen, or rather this photo, is that it’s not one great big window, it’s not one great big web browser stretched out, it’s lots of little tiny windows overlapping. If this thing ever crashes or reboots or has to be shut down for some reason and powered back up, some poor fool has to sit down and reopen all those individual applications and tile all the windows manually to get that same view. It’s extremely laborious and it’s risky, because if someone doesn’t know the order in which to put them back, it’s nearly impossible to re-create, and it’s a pretty sad state of affairs given this is currently what most network operation centers look like. Someone’s got to physically run multiple apps on multiple systems, and they are looking at the past. So not a lot has changed in many ways with what a lot of companies think monitoring actually should be.
Twenty years ago, we used to have this view that if you could ping a server it was up, but the reality we found was that just because you could ping a server, as in send it an ICMP echo request that it would echo back to say, “I’m alive,” didn’t mean it was actually up. Even if it did ping back, sometimes the apps on the server weren’t running. And so, monitoring is a whole science. It’s come a long way, but even then many of the modern application stacks that we buy in the monitoring world and the service management world don’t do predictive. Things were a lot simpler back then. The types of things we’d think about were, “Well, is the server up and responding? Is the operating system online and can we connect to it? Are the applications up and running and can we monitor that? Are the app services responding? The web server looks like it’s running, but can we connect to port 80 or 443 on it? Can users connect to the services that are on there?” And quite often it came down to something as simple as the help desk phone ringing, and if it wasn’t, then the biggest decision we had to make for that day was whose turn it was to get the doughnuts.
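The gap described here between “the server answers a ping” and “can we connect to port 80 or 443 on it?” can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the tooling the speakers built; the function names `tcp_port_open` and `check_services` are hypothetical.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Unlike an ICMP ping, this shows that something is actually listening
    on the service port, not merely that the machine is reachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, ports=(80, 443)) -> dict:
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: tcp_port_open(host, port) for port in ports}
```

A successful connection still says nothing about whether the application behind the port is returning sensible content, which is the next layer of checking.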
Then along came this concept of hyperscale everything, and particularly hyperscale computing, and by that I mean the volume, the speed and the size of things that we’re dealing with now. A lot of people talk about the unicorns of the world, the Facebooks and LinkedIns and Googles of the world, but there are actually a lot of organizations of small to medium size that have very, very complex business and IT environments that they’re trying to monitor and trying to get a handle on, to put their finger on the digital pulse of the business, and unfortunately they fail dismally, just due to the sheer levels of complexity, which have increased by orders of magnitude, in my view, on almost every level.
If you look at two really basic pieces of what a modern enterprise is having to deal with, the first is something as simple as the big data platforms that we take for granted now. On the left we’ve got the framework of what used to be Hadoop version one, a very batch-orientated version of what Hadoop was about: the MapReduce framework running on top of the Hadoop file system and a bunch of tools that we effectively plugged in, like Pig and Hive and others. On the right is essentially the second rework of the Hadoop framework, all built around YARN, with a slightly more high-performance computing architecture and better scheduling. When you look at these individual frameworks themselves, they’re extremely complex, and the things you can do within them are even more complex.
When we look at the cloud paradigm, this is a model of what OpenStack looks like. OpenStack is an open-source cloud platform built up of many, many little modules, and this is just a rough diagram of the key components that make the OpenStack cloud work. It’s very, very powerful, but extremely complex. Think about trying to monitor anything in the previous style in the Hadoop world, with Hadoop and now Spark and all the pieces of that ecosystem, or rolling out clouds such as the OpenStack-based platforms: even understanding the complexity of the environment is hard, never mind trying to work out what you’re monitoring, what service you’re monitoring, why you’re monitoring it, and what you’re looking to get from monitoring it. These are really large problems we’re facing now with some of the most fundamental pieces of our world and the cloud ecosystems we’re trying to run either on-premise or in public or hybrid.
Then with some of the frameworks in the big data world, such as Hadoop and so forth, these are really big challenges, and the speed at which things in them are changing also makes it difficult to monitor and gain any futuristic insight. And we’re still kind of stuck in this world of saying, “Well, what happened five minutes ago?” As you heard earlier, there’s the challenge of on-premise versus off-site, and that’s when you just think about things inside computers or data centers. You’ve got a mixture of physical servers and virtual servers, and they’ve changed: what we used to think about as a physical server with one application stack is now invariably an environment that runs virtualized infrastructure, whether it’s Hyper-V or VMware or OpenStack or Xen.
Now you don’t have to have one server running one application stack; it’s running a hypervisor, it’s running multiple stacks. I’ve just listed a couple of common ones in VMware, Hyper-V and OpenStack, but there are dozens of others and many people using them. And then there’s the cloud combination of infrastructure as a service, platform as a service, and software as a service, and each of those in its own right has levels of complexity that we’re just trying to get our head around managing and monitoring at the base level, let alone trying to figure out what’s going to happen.
And if that wasn’t bad enough, we’re now at the point where we’re defining things in a software sense, in that we’ve got software-defined networking and [inaudible] defined networking. We’ve got network function virtualization, and trying to manage and monitor a software-defined network of which components include such as network function virtualization, virtual routers, virtual switches, virtual firewalls, virtual interfaces on servers, bonded virtual interfaces, all the way up into sort of the combination of services versus apps, and trying to work out the difference of monitoring those.
And now we’ve got some more fun challenges in that we’re moving quickly from virtualization to containerization, with the recent creation of Kubernetes, the open-source version of Google’s toolset for virtualization, and the Docker project, and the ability to create farms of containers. Now, the interesting thing about trying to monitor a farm of containers, or even an individual container, is that once upon a time we had a physical machine or a virtual machine and then the entire app stack and ecosystem on it. Now you have an environment where you could have a Docker instance that runs for as little as a couple of milliseconds: it gets instantiated, it receives a request, it deals with it, it delivers the service that was required and then it dies. As I think Randy Bias was once quoted as saying, we need to move from treating servers and services as pets, trying to keep them alive all the time, to treating them as cattle, and monitoring that is an even more interesting challenge.
We’ve got the hybrid environment, so traditional application stacks, such as traditional database environments, alongside the new environments, such as the Hadoop and Spark big data environments, with linear growth in storage, linear growth in scalability, and elastic environments for some of these compute platforms. And there’s the demand for mobility, people doing BYOD. How do you monitor a laptop that your company doesn’t own? How do you monitor the applications and services and the security on there? And then there’s the exponential explosion from machine-to-machine and the internet of things that’s come along. Machine-to-machine and internet of things is a near impossibility at the moment for some of the platforms that have traditionally been used in the normal sense of monitoring, particularly when you get to the scale of industrial devices.
For example, the Dreamliner 787 airplane, when it was created, the first edition, had something like 6,000 sensors in the machine itself, the entire airplane. Now, I understand the latest version of the Airbus, I think it’s the A320, has 10,000 sensors in it. That takes monitoring and managing information coming from devices being monitored to a whole new level. We have this ever-increasing challenge not just to keep up the basic capability of monitoring something and seeing that it’s online and available, but now this demand for predictive analytics being applied to it.
Because we’ve been doing predictive analytics on a whole range of things around the business we run, and the systems we run, and the type of services we deliver. And so people have now looked and realized that actually we could provide predictive analytics on a monitoring service and tell you not just what happened a second ago or five minutes ago, but what’s going to happen in five minutes based on what we know so far. I think it’s an extremely exciting time to be thinking about how to manage services. Consider the things that we now look at such as auto-scaling in our cloud and virtualized environments, where if a server realizes that it’s a little overloaded, it can instantiate another copy of itself, stand up the ecosystem and handle more workload, and then when the workload goes down, it automatically scales downwards, puts one of its machines to sleep and goes back to its normal state. Now imagine being able to use predictive analytics and a future view of what’s happening by monitoring things all the way from the infrastructure and the hardware through to the end-user services. The whole end-to-end journey, the mind boggles at what we’re going to be able to do for what is essentially now an always-on world that we live in. And with that in mind, I’m going to hand over.
Eric Kavanagh: All right, let me hand the keys to Robert Vandervoort. Covered a lot of ground there and I’m curious to see what you guys do and that, like I said, I love that whole philosophy. So either share your desktop if you want to do that, or move the slides. Take it away.
Robert Vandervoort: Alrighty. If I know where that button is, that’s what I’m working on here.
Eric Kavanagh: You’ve got to click Start, top left.
Robert Vandervoort: Ah, okay.
Eric Kavanagh: Click on that, you should be able to see a shared screen. There you go, take it away.
Robert Vandervoort: Saved the day. Awesome. All right, so Dez, that wasn’t intimidating at all. Oh man. No, good talks, guys, good talks. So yes, definitely, I’m of the same mind, we’re going, kind of, to the moon. I mean, we’ve got to figure out how we’re going to be able to follow this thing as it tracks along the trajectory that it’s taking, and that’s really difficult. Man, I can tell you from working for a software company that does this and being in daily development meetings, these are things that we talk about, these are very real concerns. How do we keep up with industry? We don’t want to be that decade-past monitoring system.
With a lot of thought and, as I told some of the guys in the pre-chat, one of my favorite books, and this hopefully doesn’t say too much about me, is “Zen and the Art of Motorcycle Maintenance.” I would consider it kind of a philosophy book, and it’s actually a non-fiction novel, but whatever. He talks about quality, and what is quality and what is the quality of things, and so this whole metaphysics of quality has emerged. I’m not going to try and give you guys a philosophy lesson today, but a little bit. This whole pragmatic monitoring, what is this? It’s what I came up with basically after a lot of thought about this whole issue, and this kind of paradigm that we’re moving into, shifting away from just, like you said, servers as pets – a great way to put it.
It’s literally the definition of the two words. One, pragmatism: dealing with things sensibly and realistically. Basically just being practical, it’s a fancy word for practical. Monitor: duh. We want to poke at something, we want to stick a thermometer in it, we’re re-measuring, re-measuring, re-measuring and reviewing it. The idea then is to couple these two things, in that we’re monitoring things in a practical fashion. It’s very easy to get carried away, and I can tell you, with so many people that I deal with, being on the pre-sales side, I’m dealing with technicians in different companies, all different kinds of technicians, all different kinds of companies, verticals, whatever, and it’s always the same kind of stuff. A lot of times when we get into these deals, folks are like, “Well, I really want to monitor my servers, I want to know what my CPU is, what my processes are doing, and I want to make sure I don’t run out of space on the drives.” And I’m thinking, all right, this is really simple stuff. But I really want to try and wrap our heads around a slightly different process here.
First off, the technical questions that always come in when we start talking about monitoring all focus really on availability. Is our hardware/software working, to the point that it answers a ping? Yes, okay. But no, ping does not mean your software works. It might mean that your server’s online, and if that’s the approach that you’re taking, as in, “Well, let me ping the web server and see why it doesn’t respond,” you’re going to find out that, “Hey, look, it’s responding. Now I’ve got to go remote into that web server and take a look at this, and can I get to it on the box?” There’s this whole crazy troubleshooting effort that goes into it when you don’t have anything monitoring, which, surprisingly, does happen. I’m not going to name any names, but there are some fairly large companies that don’t do much of anything at all in the way of monitoring.
Of course, to me this is an obvious thing, because I work for the company that makes the software. Anyway, does the web page respond? Not only is this thing up and alive, but is it actually telling me what I want to see? You can’t just say, “Oh yeah, the web page responded in 40 milliseconds,” because it can be a 404 error page. We’ve got to make sure we can get deep enough on these questions, as far as the answers are concerned, that we can answer the question in a way that actually serves the question. Availability, then performance: is the hardware/software performing well? There are tons of performance counters across all these different technologies we talk about. Whether it’s Hadoop or IIS or Apache or whatever, they all have some set of performance counters. Pretty much everything Microsoft is going to have WMI performance counters. You’ve got SNMP, loads of different ways of finding out what’s going on under the hood, how it’s feeling.
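The point that a fast response can still be the wrong page can be sketched as a content-aware check. This is a minimal sketch using only Python’s standard library; the function names, the three status labels, and the idea of matching on an expected text marker are assumptions for illustration, not a description of any IDERA product.

```python
import time
import urllib.error
import urllib.request


def evaluate_response(status: int, body: str, expected_text: str) -> str:
    """Classify a response by what it says, not just by the fact it arrived."""
    if status != 200:
        return f"DOWN: HTTP {status}"
    if expected_text not in body:
        return "DEGRADED: page responded but expected content is missing"
    return "OK"


def check_page(url: str, expected_text: str, timeout: float = 5.0):
    """Fetch url and return (verdict, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        status, body = exc.code, ""
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return f"DOWN: {exc}", time.monotonic() - start
    return evaluate_response(status, body, expected_text), time.monotonic() - start
```

Keeping the classification in `evaluate_response` separate from the network call means the rule itself can be exercised without a live web server.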
And then the last thing here is capacity planning, so, doing some analytics on the stuff. We’ve got this long trail of historical data, and what we want to know is – and this is sort of an emotional need; just because we work in IT doesn’t mean we’re not emotional animals, there’s that sense of security – if you have something that fails a lot, you’re thinking, “Well, when is it going to fail again? Is this something that’s really a problem?” We do have a great ability to recognize patterns in things, not only in life but in the world around us, and over a timeline as well, but things might not be as problematic as you think they are. Or they might be more problematic than you think they are. When we’re trying to make good business decisions, this is definitely an issue. We’ve got to have real metrics, we’ve got to be able to substantiate our feelings and our perceptions of that world, put them into numbers and empiricize it – science!
So anyway, philosophy time: Charles Sanders Peirce. He’s the guy that basically started pragmatism, and so I’m going to bust out some 1800s language here: “Consider what effects, that might conceivably have practical bearings, we conceive the object of our conception to have.” What he’s saying here is, “What is that thing? What does that thing do?” So whatever the thing does is what it is to me. A web server is a thing that spits out web pages; you don’t have to think about it any more complicated than that. Is it made up of a lot of complicated software? You bet. The operating system alone is probably far more complicated than any of the stuff that actually runs on it. But that doesn’t matter. When we’re trying to test these questions, we need to know, does the web page work? All right, this is all really pretty simple stuff. And to finish the quote: “Then, our conception of these effects is the whole of our conception of the object.” Let’s conceive these objects. This is the difficulty. Most of the folks I talk to are, again, concerned with monitoring a server: “I want to monitor my network hardware,” or, “I want to do this.” It’s one specific piece of hardware or it’s a specific technology, and it’s usually whichever one is the biggest pain in the neck to them.
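One simple way to turn a trail of historical data into a number rather than a feeling is a straight-line trend. The sketch below fits a least-squares line to disk usage samples and extrapolates to the point where the disk would be full. It assumes growth is roughly linear, which real workloads often are not, and the names `linear_fit` and `days_until_full` are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_gb, capacity_gb):
    """Estimate days remaining after the last sample until usage hits capacity.

    Returns None when usage is flat or shrinking, since a linear trend
    then never reaches capacity.
    """
    slope, intercept = linear_fit(days, used_gb)
    if slope <= 0:
        return None
    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - days[-1])
```

For example, samples of 100, 110, 120 and 130 GB on days 0 to 3 against a 200 GB disk give a trend of 10 GB per day and an estimate of 7 days remaining.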
Chances are they already have some other monitoring software in-house, doing another piece of it. I’m like, “Well, hey, why can’t you” – I like to play the devil’s advocate a little bit – “can’t you use that other piece of software to do that?” “Oh, well it doesn’t really do that very well.” “Okay, well what about this?” “Well, whatever.” And to me, all these questions are a load of hay. I’m in pre-sales, don’t hold it against me too hard, but I am an engineer so, conceiving this object. So we need to be able to understand what the object is, what are all the moving parts. If somebody says, “Well, database server,” I’m like, “Okay, what does a database server serve?” “Ah, well mostly our ERP.” “Okay, so you have performance issues with your ERP.” “Yeah, but we think it might be the database.” “Okay, look, let’s talk about the ERP. ERP runs on Oracle.” “Check.” “Okay, you’ve got a web front end on this sucker or is it all client server?” “Oh, well, it’s actually kind of both.” “Okay, cool, so you’ve got a web front end, you’ve got client server connectivity to it, where’s the storage, what kind of server does this thing run on, what does your network look like?” I ask them a hundred questions, it seems.
It’s very non-incumbent at all, also that people just don’t know. “I started here four months ago. I really am not that familiar with the environment.” Okay, well you’re trying to diagnose fairly complex issues when you’re not familiar with the environment, I feel you, but this doesn’t help the paradigm. We need to understand. We need to build this understanding. And so, often when I ask them, “Hey do you got a book, is there a chart, do you have a diagram, is there an email, can you do ask somebody?” It’s usually the latter. “Oh I got to go ask Bob, but he’s actually on vacation, he comes back, let’s set something up two weeks from now and we can get access to that system, hopefully,” and so on and so on. And so, immediately I’m totally feeling his pain. Okay. We need to be able to build this understanding in whatever this tool is that we use. And so just keep that one in mind here.
And the business questions can’t go unanswered, I mean, very often talking to technicians, they’re in the trenches. We’re fixing stuff. We’re a lot of times in firefighter mode, sometimes in a little bit of shock and definitely some awe. Not to quote any past presidents, but anyway, so the business questions you hear, they’re very much aligned with the technical questions. And really what you guys need to do, if you are those technicians, is try to align these business issues with the technical issues. They do really kind of come one to one. Write down the list – availability, performance, and capacity planning. Are we using our resources wisely? Where does this money go, that we spent? We bought all these shiny servers, what are they doing, do we know that they’re being used correctly? Who knows? Unless you’re measuring it. Hot spots and cold spots. All that stuff that the points are in bold, so if you guys get the slide show later, hot and cold spots is a network in trouble. How is there internet and WAN connectivity? Of course your bandwidth providers want to sell you more bandwidth. Do you really need it? How are you using it? We’re talking about performance. Do we have any kind of thing in place that says we’re supposed to hit certain goals? We’ve got to respond to things. Most folks don’t.
And I know I sound very passionate, hopefully I don’t sound too preachy here, but have an SLA. Set goals for yourself. We’re talking about fifteen is half way to thirty. Yes, set goals for yourself, there’s nothing wrong with that at all. Set unicorn goals. Set completely unreachable goals. No server can ever go down for more than ever. They have to be on 24/7, doesn’t matter if our employees only work nine to five, I don’t ever want anything to break, of course I don’t. I may have personal expectations, but we can actually express these in a business sense as well. Meeting SLAs, we definitely do escalate management. Are the current operations sustainable, so can we keep doing this? Is this madness? Can we sustain this?
Again, I’m not mentioning names, to try and be fair, but in a prior employment we had one of those, “Oops, we need to buy a new drawer for the SAN, because it’s full.” “Hmm, well, we’ve got two months until the next quarter, are we going to have that kind of cash?” “Well, we need it now.” “Well, how do we do this?” Of course I’m like, “I can go down to Fry’s and get some hard drives” and they’re like, “no, you can’t do that, so, sorry Robert, can’t get the Drobo and plug it in.” Although, some of you, I’m sure, are probably nodding your heads and have seen that before.
Anyway, so capacity planning, we need to make sure, not just from a storage perspective, but in this hyperscale environment, as we’re virtualizing and abstracting all these compute resources, it’s just a bunch of CPU cores and gigabytes. We’ve got to know how it’s used. I need to know if I’m trending to run out. If I’m completely fine, if I’m actually low. Where is it going, how long do I have, doc? Do I have nine hundred days until I run out of space, or do I have nine? There’s a big difference there. You don’t want to get caught like that. So, a lot of talk. How does Uptime fit into this picture?
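To make the "nine hundred days or nine" question concrete, here is a minimal sketch of a capacity projection. It assumes a plain least-squares straight-line fit over daily usage samples; the function name `days_until_full` and the sample figures are hypothetical, and this is not a description of how Uptime itself computes its projections.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until a volume fills, from (day, used_gb) samples.

    Fits a least-squares straight line to the samples and extrapolates
    from the most recent reading. Returns None when usage is flat or
    shrinking, since there is then no projected run-out date.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to see a trend")

    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    spread = sum((day - mean_day) ** 2 for day, _ in samples)
    if spread == 0:
        raise ValueError("samples must cover more than one day")

    # Slope of the fitted line: growth in GB per day.
    growth_per_day = sum(
        (day - mean_day) * (used - mean_used) for day, used in samples
    ) / spread
    if growth_per_day <= 0:
        return None

    _, latest_used = samples[-1]
    return (capacity_gb - latest_used) / growth_per_day


# Hypothetical readings: usage grows 10 GB per day.
readings = [(0, 100), (1, 110), (2, 120), (3, 130)]
print(days_until_full(readings, capacity_gb=9130))  # roomy volume
print(days_until_full(readings, capacity_gb=220))   # nearly full volume
```

The same growth rate gives very different answers depending on headroom, which is why the trend matters more than the current percentage used.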
Well, number one, first of all, after this I’m going to show you guys kind of like old model/new model stuff, but kind of understanding how the product actually fits into this, we have to measure the effects. You’ve got to be able to measure all the little things in order to understand the big picture, but as our sales guys say, you don’t have to boil the ocean to do that. From the technical side of things, and this kind of does move into a gradient here, but from the technical side we need to measure those virtualization environments. Things that start at the hypervisor. How are the resources that are abstracted being used? Are they being used wisely? How are those ESX hosts doing, and so forth.
The OS, because certainly if anybody has spent any time looking at metrics in vSphere – not to point at any particular virtualization platform – it’s not going to tell you why your SQL server is on fire. It won’t. It’ll say, “Hey, it’s using more than what’s provisioned for it because you allowed that.” Okay, great. “You’re ballooning your memory.” Okay, great. What is ballooning my memory? Did my anti-virus go haywire? Who knows. We’ve got to hit the OS. Obviously, right? Seems obvious. Processes, file systems, am I running out of space, that kind of stuff. If you’ve got a Linux file system, you’ve got logical volume management, you might have a dozen file systems on that one virtual hard drive and you’re not going to see a single one of those in the virtual layer. Anyway, preaching.
Network ties it all together, and just we’ll leave it there. Is networking complex? It can be extremely complex, it can be pretty straightforward, all points in between. We need to understand that network is how things get around. The more complex environments get, you go hybrid cloud, all this, IoT, oh my goodness. I mean I certainly, myself, I’m a home automator, and I like to see all my metrics, I just discovered some services, whatever, I’m not going to favor anybody, but whatever, pulling those metrics out, being able to visualize that stuff. I can imagine the guys that receive that data from everybody all over the place from hundreds of thousands of devices, it’s insane. A lot of stuff goes over the network, SAN network. Out the pipe out to the internet. We need to monitor that.
We need to know if there are any problems, if there are any [inaudible] issues, etc. What we call service monitors. So when I was talking about the ERP or SharePoint, or whatever, the service monitor monitors something that runs on all this wonderful shiny stuff: it’s IIS, it’s Apache, it’s fill-in-the-blank, it’s the database engine, it’s a Windows service running. If I connect via SSH to a router to pull some configuration information and see if it’s changed, or, what circuit am I running on? Whatever. It’s some kind of test, okay? Visualizing the object. We have plug-ins, and so kind of keeping up with the industry here.
And I’ll make sure I move along here, anybody, give me a little sanity check if I talk too much. But plugins allow us to be flexible. In order to be agile, we need to have something that’s divorced from the life cycle of Uptime proper, since we do have about four major releases a year. I think we had, honestly, I think we’ve had four in the last six months. We do keep up with it from a development standpoint, but you don’t want to wait around. Say you’ve got SharePoint 2013, you’re moving off to 2016, you might not want to wait until December for us to come up with another release that does that.
The plugins either allow you to do that yourself, using any one of probably several pre-baked scripts that are out there, or just write your own and have that right in Uptime’s core functionality, and we can do that for you as well. I would put out there, just from the sales-y perspective, that we actually do support these. Which is a very different paradigm than the open source community – which I love, very dear to my heart, very involved in – but if you’re buying monitoring software you want to be able to have somebody to call. You’ve got to have a phone you can pick up and be like, “Blah doesn’t work,” or “What does this mean?” Just keep that in mind.
The application – and this is really where we’re starting to move into the business value of things. And also from the, kind of, keeping-yourself-sane level. I mean, all that little stuff down there at the bottom, if you were to alert on all that stuff you’d be getting emails all day, I guarantee you. Rules get created, emails get ignored, things go unmonitored, shelfware. Very bad place to be. It’s also a bad place to be from a stress perspective. Anyway, regardless and that’s why we do that. So been there, done that. Application level is where, really, I feel we should be alerting. We need to set up criteria and we have to obviously have that small world built, but we set up these criteria to say, “Hey, this is what our application is built on. Here’s the database, here’s the web front, here’s the storage, here’s the network, dingdingdingdingding, here’s the web pages, etc.” And then I can say, “Hey, your applications are not happy.” And the service level agreement, at this point it’s a no-brainer, and that in itself is almost a no-effort thing, because all the effort was really just out there in building the understanding and the application out of these little pieces.
Service level agreement, you simply say, “Hey, I want this thing up four nines.” Boom. Done. It’ll alert you when you’re trending to fail. It’ll tell you why you’re starting to fail and it even looks at historic data, so it can tell you why you’re not meeting your goals, which is very different from what I would consider largely a smoke alarm. That is the business end of things. What I love about Uptime coming into it, I’m actually an IDERA veteran, I’ve been with the company for four and a half years now, when we bought Uptime software – it’s a Toronto-based company – I was really skeptical, like I am of absolutely everything, but it really impressed me because I’ve had to deliver those reports, those BI reports to management, of are we meeting SLAs, and I’m usually pulling them out from silly places, like my ITSM software, which will only collate my incidents and let me know how much downtime I had, and I know myself and a lot of people just don’t make tickets. It’s really probably worked in our favor if anything, but it’s not good for the business. The product actually thinks of those things.
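For a sense of what "four nines" commits you to, the arithmetic is short enough to sketch. The function names and the 30-day reporting window below are assumptions made for illustration, not Uptime settings; the point is only that a 99.99% target over 30 days leaves about 4.32 minutes of allowed downtime.

```python
def downtime_budget_minutes(target, period_days=30):
    """Minutes of downtime allowed by an availability target.

    target is a fraction, e.g. 0.9999 for "four nines".
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - target)


def sla_status(target, downtime_minutes, period_days=30):
    """Compare observed downtime against the budget for the period."""
    total_minutes = period_days * 24 * 60
    budget = downtime_budget_minutes(target, period_days)
    return {
        "achieved": 1 - downtime_minutes / total_minutes,
        "budget_minutes": budget,
        "remaining_minutes": budget - downtime_minutes,
        "met": downtime_minutes <= budget,
    }


# Four nines over a 30-day window: roughly 4.32 minutes of budget.
print(round(downtime_budget_minutes(0.9999), 2))
# Three minutes of outage still meets the goal; ten minutes does not.
print(sla_status(0.9999, 3.0)["met"])
print(sla_status(0.9999, 10.0)["met"])
```

Tracking `remaining_minutes` over the period is one simple way to see that you are trending toward a miss before it actually happens.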
Here are these two paradigms and the one that we’re largely in now, and the one that I’m trying to pull everybody’s head out of, is the bad way of thinking about monitoring your stuff. All right? Why is it bad? It’s because it’s serial. I have my monitoring station, I monitor a server, it’s got metrics on it, I alert on those metrics. You can see I try to make it purposefully cluttered on the right-hand side, over there. Based on an understanding of a bunch of boxes need monitoring, so that’s really it and it’s noisy because of all those metrics, it’s really noisy and your CPU is high, your memory is high, your file system’s running out of space, your web page’s response time is five seconds, you’re, you know, blah blah blah.
That stuff, it’s noise. Unless you can in some sort of, like, calm fashion, zip through your emails and mentally collate all these things and to try and understand the bigger picture, it doesn’t really serve the point, which is alerting on what’s going wrong. It’s just symptomatic and it’s difficult to decipher the impact and it provides very little business value. I pretty much guarantee you that your CIO does not care how many CPU ticks were used on your SQL server. He’s more concerned with was the service that you guys provide actually working well and did people have problems accessing it and what is the customer thinking, and that kind of stuff.
Rage guy, yes, not fun. This is how I found that the BlackBerries were very resilient. While the ball may fall out, they will survive a flight of stairs, or five. Anyway, sorry BlackBerry.
New way of thinking about monitoring your stuff – I mean IT systems and Apple cases. This is where I want our heads to be and I just picked two really simple things here. I love that open stack by Graham, I’m probably going to try and steal that at some point, but we’re moving to this connected understanding. How are things connected together based on this understanding of the dependence and the functional parts of all these things? Again, it’s that object, we’re going back into this whole pragmatic thing again. It’s quiet.
Two alerts – your ERP is not happy because your database is running slow and your web page is running slow. One can say, “Hey! ERP is not happy, web page is slow and the database is slow.” It might be the database. Now, to be fair, I’m not going to tell you, “Yes, the reason your web page is slow is because the database is slow.” I don’t do that. I’m not a root-cause APM solution, but when we’ve built this understanding and we get emails like that, it makes a whole lot of sense, and from your troubleshooting effort, instead of saying, “Hmm, that didn’t work,” and remote and whatever, or perfmon, all these multiple tools bouncing around all over the place, this at the very least streamlines your troubleshooting efforts incredibly. But I haven’t even gotten to the graph side of things yet. This is just the not-looking-at-a-screen point, and I don’t like staring at monitoring tools, honestly.
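The difference between the noisy model and the quiet one can be sketched as a roll-up: individual checks feed an application, and only the application raises an alert. The classes below are hypothetical and far simpler than any real product; like the approach described here, the sketch lists which parts are unhealthy without claiming which one is the cause.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A single monitored piece, such as a web page check or a database."""
    name: str
    healthy: bool = True


@dataclass
class Application:
    """A named group of components; applications may contain applications."""
    name: str
    parts: list = field(default_factory=list)

    def unhealthy(self):
        """Collect the names of every unhealthy component, however nested."""
        bad = []
        for part in self.parts:
            if isinstance(part, Application):
                bad.extend(part.unhealthy())
            elif not part.healthy:
                bad.append(part.name)
        return bad

    def alert(self):
        """Return one message for the whole application, or None if all is well."""
        bad = self.unhealthy()
        if not bad:
            return None
        return f"{self.name} is not happy: " + ", ".join(bad)


erp = Application("ERP", [
    Component("web page", healthy=False),
    Component("database", healthy=False),
    Component("storage"),
])
print(erp.alert())  # ERP is not happy: web page, database
```

One message per application replaces one message per metric, which is what keeps the inbox quiet enough that alerts still get read.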
It’s easy to understand then, right? We know what’s going on because we built it, we built the understanding. But the best part about this, I think, is that it shares a lot of the knowledge with other folks in the teams. We talk about silos all the time and is it the app or the database or whatever. It’s so much so that it’s actually become marketing campaigns for some companies, it’s all database tools, you’ve probably seen them.
So, knowledge – knowledge is power. A little bit of peripheral understanding of how systems come together. Does your help desk guy need to know all the ins and outs of your network and how SharePoint functions and how your ERP connects? Probably not, but it’s very helpful when I can look at a dashboard and somebody calls in and says they can’t access something, I can be like, “Oh yeah well, it looks like we’re having problems right now with our edge router. So if you’re off campus, SharePoint’s going to be a problem for you, but we’re on it.” People like that, they don’t like, “Mmm hmm,” turns out.
Anyway, this is business value, right? Besides start, run, IP config, I heard, “ughhhh” a lot on a help desk. Anyway, but this is providing that business value, because we understand how the parts are moving. We understand when things are going wrong, we’ve got those SLAs, we’re doing capacity planning, all these things that might seem like a unicorn at first, when all your concern is how well your servers are performing, are things that are very straightforward to set up and that’s the key. I could cry rainbows.
Set and meet expectations. This is the SLA bit. Have them. I think I’ve already probably harped on this enough but we’re monitoring everything already. We’ve already built the understanding, the applications, dependency and connectivity, this is the hard part, just understanding your own environment. It does the thing, it’s not getting any easier. Already alerting, people are already getting relevant emails here, I can even do escalation paths and I’m not going to try to show you all the stuff in any kind of software demo here, there’s definitely forums for that.
I’m already automating fixes, Uptime can even react to things. I mean there’s always those silly, stupid things like the print spooler that crashes for some unknown reason, it’s still on Windows 2000 and one day you’re going to upgrade it, you swear – whatever. It takes several minutes out of your day and somebody knowing that it’s broken for you to fix it, right?
Automated, that kind of stuff is just automation fodder. Already made an awesome dashboard, you know, subject matter dashboards – it’s really a thing. Anything that I collect in Uptime I can graph in some sort of sensible fashion. So if you’ve got a DBA who’s like, “I really wish I had some performance dashboard for my SQL.” Done. You want an application dashboard that includes technologies across the entire stack? Done. Capacity planning? Done.
So pretty straightforward. Set those goals, make those SLAs, understand why you don’t meet them. That’s really the key here, you know, it literally takes only a couple of seconds really, not minutes, longer for me to explain it, but just to say, “Hey, here’s my expectation, here’s those things that I’m expecting to work,” and then Uptime tells you what’s not working.
Anyway, I was going to steal the double rainbow pictures but I’ll probably get in trouble for that. More exciting than a double rainbow, oh my god – it’s the website here. I’m going to pop over. Do I still have a few minutes? Let me get a sanity check time here, how are we doing?
Eric Kavanagh: Yeah, show us some stuff.
Robert Vandervoort: All right, cool. Like I said, I covered a lot of ground; that saves me from showing you the not-so-sexy bits that are all just text and settings and whatever. What I want to show off is like the graphical end. Like I said, I don’t like to stare at monitoring tools, I want to be able to walk away from this thing. I want it to be my babysitter, if you will, but I don’t want it to be like, “Hey, can your kid have a goldfish, I’m in a movie, yeah, whatever.” “Okay.” Ring, ring, “Hey, is it okay if your kid goes to the bathroom? He says he has to go.” “Yeah, okay, whatever.” I want a responsible babysitter only bug me. So alert noises are a big deal for me, if you can’t tell, I probably have some kind of advanced form of monitoring PTSD.
I do want to point out that from Uptime’s perspective we’ve got all these different profiles. I’ve made a couple crazy things in here just to kind of like showcase how Uptime can do different things and work with people, which is a big deal. I didn’t tell you guys really my background. I have an IT background, honestly going back since I was 13 working in the back room of a computer store. Perhaps that might not have been the most legal thing in the world, but whatever, and I’ve never stopped. I’m 37 now, I have a psychology degree because to me people are way harder to figure out than computers. But from a UI and a UX standpoint, I don’t want a tool to tell me how I have to do my job, or how it should work, or I want to bend the way that it wants to do things. I know I’m kind of like drilling some philosophy and an understanding, hopefully it’s going to make things easier for you guys, don’t take it like, “Hey, you gotta do this” or “I’m telling you what to do.” But this is kind of my thing.
Anyway, HipChat integration, spoken alerts. I mean, this one will actually make that 18-monitor NOC that you were looking at, tell you what’s wrong verbally. Imagine your wall goes, “Warning, SharePoint is in a critical state because your database is slow, blah blah blah, it’s been this way for seven minutes.” Yeah, it’s kind of jewelry, maybe it’s hinky, whatever. I’m trying to show you that it’s a very flexible tool. We have script-based outputs, can do anything you want.
HipChat, I use the heck out of HipChat and Skype – probably more so than my email, probably to the chagrin of many a salesperson, but anyway – integrating HipChat as well, it doesn’t matter what it is, River, Slack, whatever you want to do, very straightforward to do.
Anyway, from the user’s perspective, we actually start off with your contact info and your work hours and off hours if you have it. Once you get to the point of actually making the alert, Uptime already knows how to get in touch with you, which is really key. I mean, how many times has it been, “Oh, I didn’t notice the email.” “Well, maybe I should send it to your Gmail, send it to your personal, I’ll post it to your Facebook wall.” Anyway, I haven’t gotten that far yet, but maybe next weekend when I get bored.
The global scan in this complicated crazy environment. Number one, we’ve got to keep stuff organized. Keeping it organized is key, we allow you to do that, we do auto-discovery and all that kind of stuff that you’d expect, but allowing you to kind of structure your data center in the way that makes sense to you. I like to think it’s kind of a physical, logical and technology, and then from the virtualized standpoint we do it just like you’d see it in VMware where we’ve got your data centers and your clusters and resource pools and all that lovely stuff.
That filters right through the same understanding, again, it’s working the way that you do and the way that makes sense. The same understanding filters through these dashboards. The global is basically everything that’s wrong, and so all I care about is Houston and all the other QA, SA whatever stuff I don’t really give a darn about, just the Houston things. I can focus in on that and then again from anybody that has the concerns of security or keeping things segregated by user group or whatnot, we can absolutely do that. The only thing I might ever see is just Houston, or something as narrowed down as just “Houston Network Components,” so that’s definitely a thing.
Resource scan – how are those resources being used across the entire environment? That’s this. This is your ninety-thousand-foot view. I can drill down into any areas that have problems versus others. And you notice IBM Agency, just kind of throw as an aside, it’s really not on the side. One of the most critical things of building the understanding of the application, getting that pragmatic model down, is getting everything in the door, and I don’t say that just because I like having license counts on the deals of what we do. It’s really, if I can do everything without my IBM P-series stuff, that stinks.
We’ve got monitors for AS/400. People give me hell about that sometimes, it’s like, “AS/400 ra-ra-ra.” You’d be surprised how many AS/400s are still running really important systems out there, or the newer I-series stuff, that’s a thing, we do that. HP-UX, AIX, I mean just about every major operating system in the world we have an agent for. Bringing it in the door and getting it monitored is key.
Looking at the application layer, again, let me get out of the granular and go up top. This is what that dashboard looks like. This might be the only thing that I ever look at on a regular basis, I want to just come here and just say, “Hey, my CMS is really angry, why?” Now granted, I probably don’t pay as much of attention to my emails as I should, but I’m here every day looking at the server gates, it’s what I do. I’m a dentist, these are my teeth.
Login tests. So there you go, I’m testing the actual login time, this is user experience stuff, this is super pragmatic. I don’t even care about Apache metrics. If everybody is logging in lickety-split and all the transactions are working well, who cares about bytes sent and received, unless I’m trying to do capacity planning. From a firefighting standpoint, from a “Do I care, do I need to pay attention to it?” I want to know that stuff pretty intuitively and pretty automatically.
If I’m the CIO, I care like about this, I don’t care about my Apache performance dashboard. If I’m your web guy, you bet I do. I mean, I need to come in here, and pardon the slowness here, but I need to be able to come in here and see a lot of deep metrics across the board and notice patterns. Here I see that my demo Apache 01 is restarting, and Uptime it’s “boom, boom, boom, boom,” what’s up with that?
Those are patterns that I might not even know if I’m not even looking at it. Again, the real granular stuff, but it really serves this purpose. Those servers are part of the CMS and if I’m seeing issues on a web page and my servers are recycling, I found out more about that environment in just a few seconds looking at the dashboards that I’ve set up than I could definitely by remoting to it. I’m not even sure where I’d start on some of that, to be honest with you.
Anyway, effort; and everybody is sort of thinking, “This is just crazy.” From the effort standpoint, how do I monitor stuff? Can you write a script for it? Yes. What we try to do is provide very common things, very common technologies that are out there. From the database standpoint, we’ve got, I want to say, every major database engine. I don’t have any of the NoSQL, I don’t have any of the time series stuff, but every major relational database is in here. From a web services standpoint, IIS, Apache, Tomcat, just dinging on down the line here. And then for the stuff that you might not see listed, there’s a lot of other stuff of course, but we’ve got these plug-ins. This is just a really easy way to go out, we’ve got public repositories on GitHub, you can see the code, you can make it your own, you can change it, whatever, it’s available for you there. So from a technology or software standpoint, if it’s a SAN, or if it’s SharePoint or Exchange or whatever.
That’s how we have it doing that and then essentially these are going to provide you with the metrics that you care about, and that’s the hardest part. I’ve written several of these plug-ins and the hardest part for me is like, “What are people going to want to know? What is actually important?” You look at any WMI system, there might be hundreds of things. Well, fine, I’ve got to just slow that down, nobody is going to want to see 400 metrics because then you’ve got to make sense of that world and there’s no value there.
Anyway, then SLAs. There’s tons of subject matter dashboards. I would encourage you guys, I mean if this is something that’s interesting you, obviously we can do demos and whatnot, we can do personalized stuff, we’re not trying to boil the ocean again. But you know, if I get an email saying that “My SLA is running, I’m exceeding it, here I’m failing miserably, I want to know why, what’s going on?” I can just drill right into that in this detailed report and see what are the particular things that are causing the SLA to fail, or even go back over time and understand if that’s a trend or not. Where are the red spots? This almost looks like a DNA analysis or something, we’ve got server outages – sorry, these are login test outages where I wasn’t able to log in. We’ve got response times and things in here and I can just really easily zip down to the things that are important for whether I met those goals or no. And again, I don’t expect you all to read all of these things, but there’s a lot of data here. It’s pretty convenient just to be able to have that in front of you. But the reality is, is why I’m failing is because of these login tests. All the back information here is provided to you as well.
Reporting is provided with the tool so you don’t need Crystal or SSRS or anything like that, the reporting engine’s built in; you can customize all the individual reports that are in here. I can have them run on a recurring basis. I can save them out for other people to see and use. You’ve got different output formats. You want to have something emailed to your manager every Friday at 4 p.m.? Ha ha ha, you can do that!
So pretty robust again, from the capacity planning standpoint. We don’t want to just focus in and we were talking about being able to predict things and doing predictive analytics. Aside from just having the ability to visualize the here and now and the historical trend, I want to be able to see capacity planning projections and it’s that fast, I’ve got compute memory and data storage capacity trending out on my whole vCenter and I can eyeball it and tell you I’ve got at the worst 132 days until I run out of space, I’d better do something about that.
This is a real lab and I am actually the proud daddy of, like, a lot of stuff, and it’s just got my work cut out for me here. But I know this stuff and so if that happens, that’s my problem, that’s my fault for not changing something or doing something about it. I’m well aware of this stuff. If I’m in a meeting and somebody goes, “Hey, we need to add a bunch of servers to the lab” – they’re not going to do that to me, but if they did, I could be like, “You know what? I’ve got the gigs. I’ve got the gigahertz. I’ve got you covered,” or not, and at a glance instead of having to open up another tool which is sort of another point and agreeing with all these things, I do this sort of as a joke.
Houston office, we were talking about traffic. My dentist and I were talking about traffic, she grew up in Iowa, she said, “The one thing I like about small towns is there’s not a lot of traffic.” Well Houston, if you live inside the loop, you don’t leave it, as you can see down here. I can integrate any web basically like an iframe, if any of you guys are familiar with HTML, I can integrate any web in any one of these gadgets. Whether it’s like your website or it’s a traffic camera outside of her office or whatever it is, I can do that. The gadgets are super easy to add.
I mean, dashboards – I show TV magic. This is like, “Oh look, it’s done, it’s all nice and polished,” but the reality is getting into these dashboards is a very easy thing to do. There’s a lot of different ways of displaying data across all the different data points that we have. Things like this pin-on image tend to be very popular with folks because when you’re trying to build up an understanding of an application you can just upload that Visio and then pin on the elements that make it up. You see, I can tell you where all the problems are along the way.
These things are just extremely helpful, I think – network topology, understanding what plugs in to what, what’s dependent on what, whatever that may be, work board, switches or websites, or whatever, are all things that are built in. And again, across different technology stacks. I haven’t brought it, I know we are running out of time here, I want to make sure that you guys have time for the Q&A and all, but there’s just tons of information we can gather from all different kinds of sources: log aggregations, APIs – whatever, you name it – SNMP, WMI, etc., etc., etc., alphabet soup. So it’s about gathering that data, building the understanding and then alerting and acting upon it in a pragmatic way. And so that’s it in a nutshell.
Eric Kavanagh: Great. That was a fantastic presentation from everyone. I gotta tell ya, I loved it. We’ve got a couple of extra minutes here to throw questions. Rick, why don’t you throw a question or two, and then Dez, and then we have just a couple questions from the audience that are kind of specific about implementation. But Rick first and then Dez.
Rick Sherman: Okay, great. Well first off, I especially loved the demo to kind of put it all together, especially about adding the servers, monitors, plug-ins, etc. I think that was terrific. One of the questions I have, you mentioned that it was a recurring theme, like in presales, whether people understand what the architecture or the apps are. They want to monitor stuff and then there’s this piece part. How do you go about educating them on how to break down the topology? I realize that there’s many things you can pick off, but how do you educate them? Because I’m not sure if they can quite grasp how much you can do.
Robert Vandervoort: Yeah absolutely, I’m a big fan of self-deprecating humor, so I usually just kind of start from that angle. I have ADHD if you can’t tell. My wife doesn’t like going to Home Depot with me anymore, let’s just put it that way. I use the analogy of, if you’ve got a squeaky hinge or a leaking whatever, go in there and figure, “I want to fix my faucet.” Think. Go to your Zen place, “I want to fix my faucet.” Don’t think, “Hmm, what can I fix in my house?” because you’ll be there all day and you’ll forget about the faucet seal and you’ll leave with gutters.
What I try to focus people on is the application. You’re telling me this hurts and that hurts, let’s take an app. Is it your ERP? Cool. Let’s get the app in a POC, find out for me, doesn’t matter who you gotta talk to or whatever information you’ve got to dredge up. What is that application made of? Database servers, file servers, you know, whatever, whatever, all the end points of the application. Find out, get all access to it. If you need help getting any tool, cool, we’re here. But let’s focus on a particular application because that’s where the value’s going to be in the end. I mean you could easily add hundreds or thousands of servers and start going at that angle, but then you’re very much in that serial model, which is not only unsustainable from a POC, but it’s also just not where we want our heads to be.
Rick Sherman: Yeah, and would you set up the dashboard, etc., sort of to give you that business view, that sort of composite view of pieces that support that entity, whatever it is they’re trying to monitor?
Robert Vandervoort: Absolutely. I generally suggest, okay, we want to have what I call app maps, where we have our application dashboard and it has to have all these pieces. Make the diagram if it doesn’t exist, slap it into Uptime, figure out what needs to go there. At least discover all this stuff and get it under the hood of monitoring and then start adding the services that actually add up to make that application function. Like in the case here with SharePoint there’s – and kind of just a cool point – these applications can be built of other applications. In the case where you have like a SQL cluster, that’s really an application. It’s multiple servers, multiple services and things. AD is an application etc., etc. I can build these aggregate views out of those as you see here in the SharePoint. We want to be able to build this. If I can’t build this, I haven’t added enough stuff. We need all the little bits in there that make it tick.
Rick Sherman: Do you kind of work backwards in a way?
Robert Vandervoort: Yep, think backwards, work forwards.
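[Editor's sketch] The "app map" idea above, where an application is built from servers, services and other applications, can be illustrated with a small recursive roll-up. This is a minimal sketch and not how Uptime Infrastructure Monitor is implemented: the class names, the three status levels and the "worst member wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels; a higher value means a worse state."""
    OK = 0
    WARN = 1
    CRIT = 2


@dataclass
class Element:
    """A single monitored piece: a server, a service check, and so on."""
    name: str
    status: Status = Status.OK

    def rollup(self) -> Status:
        return self.status


@dataclass
class Application:
    """An application made of elements and/or other applications."""
    name: str
    members: list = field(default_factory=list)

    def rollup(self) -> Status:
        # Assumed rule: the application is as unhealthy as its worst member.
        # An application with no members reports OK.
        return max((m.rollup() for m in self.members), default=Status.OK)


# A SQL cluster and AD are applications in their own right,
# and SharePoint is built on top of them.
sql_cluster = Application("SQL cluster", [
    Element("sql-node-1"),
    Element("sql-node-2", Status.WARN),
])
active_directory = Application("AD", [Element("dc-1")])
sharepoint = Application("SharePoint", [
    sql_cluster,
    active_directory,
    Element("web-frontend"),
])

print(sharepoint.name, sharepoint.rollup().name)
```

Working backwards from the application, as discussed above, amounts to listing the members first and only then wiring up the individual checks.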
Eric Kavanagh: Okay. Dez, take it away.
Dez Blanchfield: I’m keen to get your insight, just briefly because I know we’re short on time here, so I’ll just keep it to one deep question if I can. Can you give us an insight on where you think businesses and organizations are at currently as far as the view of the value of, not just service monitoring, but the type of approach you’re taking around the pragmatic end to end. Specifically, from the commercial benefits. So a lot of us have come from a technical background and we love to be able to ping things and see if they’re on. But from the business point of view they’re often not interested because it’s like, well as you said it calls us on that’s what we pay you for.
Are you seeing a transition away from just keeping the lights on to now putting KPIs at a commercial level and an operational level on the deep integration of service management-level monitoring for the whole [inaudible] framework thing working properly, so that people look at your tool from the point of view that we can keep the lights on, but have we actually put a dollar value on seeing the whole end-to-end view and ensuring that, “Okay, things are on, are we using them wisely as you said, are we meeting our SLAs, and if so, what does that mean to the business?” Are you seeing a transition towards that yet or are we still a little way away from that?
Robert Vandervoort: There’s definitely a want. There’s an urgency there. I ask people an open question, and it’s a loaded question obviously: do you have SLAs? And it’s almost unequivocally, “No, but our managers are kind of talking about it” and so on. I’m like, “Cool, how are you going to get there?” “Well, we’re not really sure. We’re kind of looking at ServiceNow or we’re doing this.” I’m like “Well, you’ve got to understand, ServiceNow is a thing, it’s an ITIL framework, basically follows lockstep with it,” alright, not to favor any particular ITSM platforms. But it’s not going to answer your SLA questions. It’s only going to talk about how many man hours you spent fixing a printer or how many resources went into a particular server if you had to buy parts for it. It can’t answer the real-world question of what servers really need to be end-of-lifed or whatever. Not anywhere near to that degree.
When we talk about it from the SLA standpoint, there are several of our customers that absolutely have SLAs where they lose money. It’s like delivering pizza: if it’s late, if they let you down, then they don’t get money. So there’s direct business impact there. Those guys tend to care a lot more about this stuff than the rest, and that’s why one of the things I really urge folks to do is just create an expectation for yourself, for your team, for IT. It doesn’t have to be real or written or promised to anybody, but when you go and create the expectation, you have something to flip on its head and say, “Hey, why am I not meeting server availability?” Well, it’s just one stinking server. We can focus on the one server and, “Hey look, we’ve got like perfect uptime.” And this is actually my case, I’ve got like a one right here, but you get the idea.
So yeah, to answer that question, yes, absolutely, I feel there’s more of a want than an actual move towards that because people are still grappling with how do you get a tool that can really answer the question, how does it monitor enough stuff and most folks have multiple tools. It’s because the network team went shopping for a network monitoring tool, and the dev team went shopping for an APM tool, and the database guys shop for their tool and none of them really talk to each other except in the lunch room.
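[Editor's sketch] The internal "expectation" described above can be made concrete with a small availability calculation that singles out the servers dragging the number down. This is a minimal sketch under stated assumptions: the server names, check counts and the 99.9% target are invented for illustration, and availability is taken as the share of successful checks, which is only one of several possible definitions.

```python
def availability_pct(up_checks: int, total_checks: int) -> float:
    """Availability as the percentage of checks that found the server up."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100.0 * up_checks / total_checks


def servers_missing_target(results: dict, target_pct: float) -> list:
    """Names of servers whose availability is below the target.

    `results` maps a server name to a (up_checks, total_checks) pair.
    """
    return sorted(
        name
        for name, (up, total) in results.items()
        if availability_pct(up, total) < target_pct
    )


# Hypothetical month of five-minute checks (8,640 per server).
results = {
    "web-01": (8640, 8640),
    "db-01": (8500, 8640),
    "file-01": (8639, 8640),
}

misses = servers_missing_target(results, target_pct=99.9)
print(misses)
```

In this made-up data only one server falls short of the target, which matches the point above: a missed availability goal often comes down to a single machine worth focusing on.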
Dez Blanchfield: Yeah, that’s a never-ending headache for me in my life. It’s like for the last 25 years I’ve had that constant issue that when you go into the organization because they’ve broken down and segmented into logical blocks. Like that photo of the very, very front of mind, is a network operation center and they worry about the network and as long as the network is running, they get paid and their job’s done and their handoff. So yeah, but it’s interesting.
One last quick question, partly out of personal interest, but I know a lot of people are going to want to know the same thing. How do we get our hands on this tool and how do we get started with it? Where are we going to find it, where do we get more information and can we get a demo or a trial or something to that effect?
Robert Vandervoort: Absolutely, yeah. I hate that word, absolutely, there’s no such thing. Idera.com is where you are going to go for that. There is a little like jack icon, it says “IT Management,” you’re going to click on that and then there’s two options. One’s for the cloud-based one that we have, and the other is for Uptime Infrastructure Monitor, which is what this product is we’re showing you today. The trial should be for about 30 days or so. Don’t put some BS in the form, put your real info. Our sales guys are really pretty hands off, nobody’s told me the sales guys are annoying. But really it’s because they’re your best pathway to folks like me on my team.
If you have those technical questions and the documentation is not cutting it for you – because what documentation ever does – you’ve got direct lines of support, concierge level if you will, as well as extensions because most folks are going to want to go and connect to the vCenter and you find hundreds of things. You’re going to blow out a trial license, so they’ll ask you those relevant questions to make sure you get squared away for a POC, or if you want a one-on-one demo that’s definitely the way to do that.
Dez Blanchfield: Fantastic. Well, thank you very much, I’m looking forward to that and hopefully we’ll see you again and we’ll talk about adding blockchain to it. Eric, we’ll hand it back to you.
Eric Kavanagh: There you go, sounds good, folks. I have a couple of quick questions I’ll throw over at you real quick. One is: is Uptime Infrastructure Monitor a web-based or client-server application, can you answer that?
Robert Vandervoort: Web-based. 100% web-based. On-premise.
Eric Kavanagh: Good, and another attendee asks: do you need to install some kind of proprietary daemon on the individual servers for IDERA to monitor them?
Robert Vandervoort: I saved these for everyone, so let’s look at these instructions. So agentless. I say agentless, agentless, agentless, just like I say wired unless you have to go wireless, and I’ll save you some of the other not-so-proper analogies about wireless. But anyway, we do have agents for just about every OS. The only thing you miss out on if you don’t use them is a TLS 1.2 encrypted pathway to the server the agent runs on, as well as the ability to run scripts directly on it.
Outside of that, Windows has WMI, there’s Net-SNMP for the rest of the world, SNMP for all your network stuff, etc., etc., etc. So no, I always say no, you don’t have to unless you want to. And then as far as the technology for installing it, it comes with everything you need, that right-hand side of the diagram: it runs on MySQL, Java, PHP, Apache. You don’t need to find any other servers to run it on. It will even run on Windows 7 Service Pack 1 on up. We’ve got a Linux-based and a Solaris-based distribution as well, so technically you don’t even really have to pay for server licensing to slap it on, just some extra hardware.
Eric Kavanagh: Cool, I have to say this was a fantastic presentation, so thanks to both of our analysts today, and thanks to you, and of course to IDERA. I think this is great stuff and I think you guys are looking forward in a very positive and compelling way, and we will hear again from IDERA later in the year, folks. We’ve got several more events lined up with them. This has been fantastic, thank you so much for your time. The archive usually goes up within about a day, so hop online to either Techopedia or InsideAnalysis.com to get the details there, and we’ll talk to you next time folks, take care. Bye, bye.
Rick Sherman: Thanks guys.
Eric Kavanagh: Yeah, and Dez –