Eric Kavanagh: Ladies and gentlemen, welcome back again, it’s Wednesday at 4:00 Eastern and for the last couple of years that’s meant it’s time for Hot Technologies, yes, indeed. My name is Eric Kavanagh, I will be your host for today’s show. I love this topic: “Health Check: Maintaining Healthy Enterprise BI,” that’s what we’re going to talk about today. There’s a slide about yours truly.
So this year it’s hot – Hot Technologies was really designed to define particular kinds of technology and you can imagine out there in the world of enterprise software there are lots and lots of vendors who sell all sorts of different products and what winds up happening is there are these buzzwords that wind up getting used and getting glommed onto by various vendors for very different things. And so, the purpose of this show is really to help our vendor friends and help our audience both identify and wrap our heads around what specific kinds of technologies really are and what these words all mean when you get right down to brass tacks.
So, I’m going to stand in as one of the analysts today; we also have Dr. Robin Bloor on the line and Stan Geiger from IDERA. Let’s just talk quickly about the importance of business intelligence and analytics in general. This is a basic decision tree, if you will, or a flow chart that just kind of talks about how you work through issues in your company: having discussions about different topics, putting proposals together and then finding out what people think. Do they agree? Do they disagree? What is the consensus, if you have one, and how do you work through that process?
Well, this is all obviously very generic, but it’s a good reminder of the process by which we propose ideas in companies, make our decisions and then move forward. And the bottom line is that data is required for each and every one of those components. That’s even more true these days in the world of big data, because of course, big data is like this giant truth engine out there. Big data is really what’s happening; it’s representative of who is where, what they’re doing, what they’re buying, what they’re tweeting on their social media handles, for example. Of course, all that stuff can be hacked – you have to watch out for that – but the point is that data is the reference architecture, if you will, for reality.
So, you want data at every point in this decision-making process. Now, consensus is important. If you want happy users, sometimes a boss may have to go against the grain of what everybody wants. We were just talking about Steve Jobs right before this webcast started and he was notorious for that kind of thing. He’s got a famous quote where he recommends that people drown out the noise they hear around them and stick to their vision, if they know what they’re doing is right. So, you don’t always need consensus, but usually it’s a pretty good idea. But the general purpose of this slide and this commentary is to drive home the importance of making our decisions based on data, not just on instinct, although gut is usually really good at helping you know where you want to go, and then you really look to validate that, or invalidate that, with your data. And I would say don’t be afraid to look back through that process on occasion, just as a nice little marker or reminder, so you can at least get some frame of reference, understand where you’ve been coming from and be honest about the mistakes that you’ve made. We’ve all made mistakes, it happens.
So, if you have performance issues in your business intelligence systems, well, there’s the old expression “patience is a virtue” – not in the world of IT, I can tell you right now. If users are waiting a long time for their queries to come back, or they’re not getting their reports, that erodes trust, and when trust is gone it’s very difficult to get that back. So, I’ve put a line in here – 40 seconds these days is like 40 minutes in a lot of cases – if a query is going to take 40 seconds, people forget what they’re even talking about, what they were asking of the data. Just imagine in a conversation if you ask someone, let’s say your boss, “Hey, I’d like to know why it is that we’re going down this route,” and you had to wait 40 seconds to get an answer. You’d walk out of the room! You’d think that your boss has lost his or her mind. So, that latency that we have in some information systems, when there are performance problems, is going to truncate the analytical process, the analytical flow, or as some people call it, the conversation that you’re having with your data. You need speed in these systems, whatever you have to do to get that done, and we’re going to talk about that today, because without that fluid flow of ideas back and forth, you’re really damaging the whole process of analytics. So, once again, I’ll throw this comment out: lack of trust is a silent killer. People won’t really raise their hands too much if they don’t trust you, but they’ll just kind of look at you sideways and wonder what’s going on. And once that trust is gone you’re going to have a very, very difficult time getting it back.
So, artificial intelligence, well we keep hearing about machine learning and AI and “Oh, isn’t that going to solve all these problems?” Robin and I have been hearing for years now about self-tuning databases and all this fun stuff – there is some of that going on, but just ask yourself the question: how often does Siri get it right for you? How often has Siri accidentally popped up and gone, “I’m sorry, I didn’t get that”? That’s ’cause I wasn’t asking you anything, I just accidentally hit that darned button. So there are lots of flaws still, and by the way, on the left-hand side, that’s the ASIC chip from an Apple Newton – remember that puppy from years and years ago? That was one of the first smart devices, and that’s kind of a long time ago, that’s like the early ’90s or mid-’90s I want to say. The Newton came out and it wasn’t very good, but it had the vision; they knew where they were going. But even now, with the iPhone, AI and machine learning are widely misunderstood concepts, I would say.
And certainly with respect to machine learning, it can be very useful and actually can be used in some of these environments where you’re trying to understand what’s going on with your complex information architecture, where things are going wrong. Machine learning can be very valuable in that context, but only if applied in a very acute way. So, I was just at a big event out in California, in fact – one of the big Hadoop distributors, Cloudera, had their analyst summit – and I was talking with their chief strategy officer and said, “You know, it seems to me that machine learning really only does two things: it segments and it refines.” Meaning it’ll give you different segments or clusters of activities, including anomalies, which would be a segment. And it refines, meaning it helps you improve a certain kind of decision. The classic example you hear about is deciding whether there is a human being in this photograph, for example. So that’s something machine learning can do, and it is useful in certain contexts, when you’re talking about troubleshooting, because you can look for patterns of behavior in CPU usage, in memory usage, in the speed of disk and what the disks are doing, and all that kind of fun stuff. So it can be useful, but it’s really something that has to be very focused to generate any value.
So, one of my other favorite things to talk about – and we’ll kind of see a little bit of this, I think, when we take our demo today from IDERA – in many ways I think human beings are still learning to speak silicon. There’s a material science underneath all of this, and for those of you who have done troubleshooting and really taken a hard look at complex information architectures, when you’re trying to understand what’s going on, even in like a Hadoop cluster for example, really you’re usually just looking at histograms. And then you have to correlate what these different histograms mean at a particular moment in time, and that takes intelligence; that takes human intelligence and experience. So, I’m not fearful at all that ML, or machine learning or AI are going to take away too many jobs in this world any time soon. I think there’s always going to be a need for human beings, who frankly know what they’re talking about to help us out and make this all happen.
So, let’s keep moving along. So, what happens if you’re not data driven? This is a famous painting, “The Blind Leading the Blind” – this is not what you’re looking for, folks. You don’t want this kind of environment in your organization. So what we want is we want our decisions to be driven by data and we want the decisions to be driven by good data, good quality data and that’s only going to happen if you gather the correct data, if it’s nice and clean, and if your systems are running properly, if your BI systems are healthy, your analytics systems are healthy and users are getting what they want in a timely fashion.
So with that I’m going to wrap up and hand over to the inimitable Robin Bloor. Robin, take it away.
Robin Bloor: Okay, well, thanks for passing me the ball. I was thinking while you were speaking, Eric, about BI, and there was a vendor presentation I attended recently where someone remarked that a particular customer, running a particular system against a big, bad data warehouse, could at a given point in time do 70,000 BI transactions that would lead to information being presented to a lot of people. It did occur to me that if you actually have that kind of workload, and you waste even a few seconds in terms of executing the software, then it’s actually going to be very expensive, and if you waste minutes it’s going to be horrendously expensive. And then I remembered that an awful lot of the world runs on spreadsheets – I think they were called “shadow systems,” weren’t they? In the first instance, people would just put together systems using spreadsheets and email, and they would make things happen, because the IT department can’t build applications for everyone, so they kind of do that. And a lot of BI, I think, gets involved in systems like that anyway.
Anyway, having said that, let’s get onto talking about what I’m going to talk about. BI’s a feedback loop for corporate systems; it’s really that simple or that complicated, depending upon exactly what role it plays in the organization. But if we look at this – this is a diagram from about four years ago, when we were trying in one way or another to understand what was happening on the analytics side. Pretty much, everything that’s hindsight, looking back at what’s previously happened, and everything that’s oversight, in terms of the way the system works, tends to be BI. It didn’t used to be the case that foresight, predictive analytics, was BI, but that’s increasingly becoming the case. Eric mentioned machine learning; a lot of machine learning can actually, in one way or another, just be run against a stream of data and can give you predictive analytics for the coming five minutes, or even almost in real time, so you can respond to a customer with a calculated knowledge of what’s actually happening.
But at the center of this diagram, the insight comes from analytics. What normally happens is that various analytical activities are pointed at particular collections of data and something new is learned, knowledge is learned about the business. And that piece of knowledge is then strapped into the business processes that can feed from it. And usually it’s manifested in one way or another as BI alerts appearing, or just various things being put on dashboards, and so on and so forth. When we actually did this, there are four terms there and they happen to end with the word “sight,” which is very nice. But in actual fact that isn’t everything in the field of what people want to do; there’s also the problem of optimization, and optimization doesn’t yield to simple analytics. It’s a very complex problem and a lot of optimization problems are not uniquely soluble. You can only have good solutions; you can’t prove you have the best one. And that’s an area where there is activity going on, but it’s less so than most other areas of analytics. So, people say we live in the age of analytics – well, we do compared to ten years ago, but it can go much further than it’s already gone.
So, the begetting of BI: the desire for knowledge begets user requests, which beget analytics projects, and the analytics projects beget data lakes, and data lakes plus analytics beget insights, and insights beget BI. That’s a story I just told; I just thought I’d write that out. The whole point of this slide, and actually most of the other slides, is just to emphasize how complex the world of business intelligence actually is. It’s not a simple thing – I could have made this particular slide way more complicated than it actually is – but you have at the bottom here external data and internal data that in one way or another is going to be put into a staging area, which nowadays is kind of data lake stuff, although not everybody has data lakes. And people that do don’t necessarily have successful ones. And then, there’s an ingest and cleansing activity and a governing activity required on the data before you can actually really use it. And then, you serve that data up and you either report on it or analyze it, and the analysis leads to action.
And if you actually look at the various kinds of analysis that exist, this is an incredibly long list, but it’s not necessarily a completely comprehensive list; it’s just what I thought to write down when I was actually creating this slide. So, there are a lot of things that go on in a BI environment: visualizations, OLAP, performance management, scorecards, dashboards, various kinds of forecasting, data lakes, text mining, video mining, predictive stuff – there’s a vast spectrum of stuff that actually goes on. If you look at it in a different way, the corporate reality – basically this is a similar diagram to the last one, it’s just done in a different way. I separated out what you would call BI because it is regular and it’s known what’s required; that doesn’t mean that what’s actually happening is efficient, but at least you will have regular things happening in, let’s say, Tableau, or in Qlik, or in Cognos – there’s a subject source, and so on and so forth, and various regular reports or capabilities will be going on. And then you have the analytics apps, and they’re different, because the analytics apps are really about exploring data, and in my mind that kind of equates to research and development. And then you have workflow. Under workflow you mix your stuff up with operational apps and office apps, if that’s necessary – and that’s the corporate reality as I see it – although in most organizations it’s not that well organized.
So, BI disruption – this is just a set of things that make BI harder than it used to be, because the old BI world consisted primarily of fairly clean datasets being in one way or another captured, probably from a data warehouse, and fed into specific BI software. And in those days – I really am talking five or ten years ago – the data volumes weren’t expanding, the data sources were known. The speed of arrival of the data was known, although often some BI would not be happening fast enough for certain users’ liking. There wasn’t any unstructured data, there was almost no social data, certainly no IoT data, and you didn’t care about data provenance. The compute environment didn’t have parallelism, in terms of the infrastructure, to be able to in one way or another do things extraordinarily fast. You didn’t have machine learning, and the number of analytic workloads was fairly slim. And all of that’s changed: data volume now can be growing very dramatically, and the number of data sources just keeps on going up. Streaming arrival of data, very fast, lots of unstructured data, certainly social data which will need cleansing, other data which might need cleansing, and certainly IoT data – that’s the deal now.
Data provenance is an issue and we do care about it. The computing power is there, which is neat, because that makes all sorts of things feasible, and you’ve got machine learning now as a phenomenon that leads to the creation of more BI capability, and new analytic workloads that’ll do the same. So, BI is not a static situation, and I think that’s the last thing I’m going to say before I hand it over to Stan. Oh no, it’s not, there’s something else. The future BI landscape: the internet of things, event-driven architectures, real-time everything. OK, that’s enough – BI of the user, by the user, for the user. The issues in summary: data flow, performance, timeliness, data coverage, data cleansing, data access, skills, visualization, shareability and actionability.
Unless the BI service is dependable and timely, it isn’t a service. So now I can pass it on to Stan. Stan?
Eric Kavanagh: All right, Stan, I’m giving you the ball, take it away.
Stan Geiger: OK. So, just to give you my background: I’m a senior manager at IDERA in product management, and one of the responsibilities I have is our business intelligence product offering. So I’m going to expand a little bit on what Robin was talking about; the key area with business intelligence that I’ll talk about is monitoring your platform health. Like he said, it used to be that we had all this data and it would take weeks to analyze, and then we’d come back with reports and things. But the BI landscape is changing such that we’re getting closer to almost real-time analytics now, and in a lot of cases, actual real-time analytics. So, I’ll talk about this slide a little bit – this is just kind of an overview – and just as full disclosure, I’m going to talk about it from a Microsoft perspective, but all of these concepts apply whether your BI platforms are on Oracle, or you’re using Informatica and Oracle, or mixed-mode, hybrid environments. I’m just going to reference the Microsoft environment, but this is pretty standard.
Robin had a slide in there that touched on this: you’ve got source systems, where I’ve got all my data sitting. It used to be these were all in relational databases and data storage such as that, but now we’ve got Hadoop and the internet and things, and all of this unstructured data sitting out there, and we can now bring those into this BI architecture. So the middle tier there is the data storage and aggregation; this is where we pull data in, we might clean it, we might restructure it, and then put it in some type of data store, and then the presentation layer sits on top of that, and that’s where your users are getting access. And we’re doing analytics on that data in those data stores, and we’re doing dashboards, and we’ve got Tableau sitting on there, reporting services, things like that. I always laugh because when I was a BI architect, we always laughed about Excel, because let’s face it, Excel is the BI tool of the masses, still.
So, a little bit of an overview there, but just to talk about the platform architecture: you’ve got your source data, and I talked about that being in multiple data stores. And then I’ve got my storage and aggregation – in the Microsoft world, you’ll have your SQL Server database, maybe where your data warehouse is, or maybe you have your data warehouse in the cloud, with Azure SQL Data Warehouse. You’ve got analysis services, which is your OLAP cubes and things like that, for doing aggregations and looking at things across multiple dimensions. Then you’ve got your presentation layer, which I talked about briefly, all of these things that sit on top of those data stores and aggregations. And I always like this quote, “You don’t know what you don’t know,” which is true. If you’re not monitoring and you’re not looking at what’s going on across all of these areas of your BI platform, how do you know when you have a problem, other than when the users start sending you nasty emails and the phone starts ringing about why are my reports not running? Why is everything taking so long?
So, in that vein, what you’ve got to do is be able to monitor the platforms that you’re serving business intelligence from. And I basically broke that down into three areas: availability, performance and utilization. Availability meaning whether the resource is available: is it up or down? Pretty simple there. But also, the platform may be available and you may still be having issues, so you’ve got to be able to do root cause identification, and you’ve got to have alerting, to let somebody know what’s going on before things get to a critical state. That leads into the performance side, too: you’ve got performance metrics at the server level, where the BI services or BI platforms are hosted, and you’ve got resource-level performance, where maybe I’m accessing data from a SAN, for example – the SAN being the resource – or network resources. You need to be able to monitor the performance of all of that, to be able to identify bottlenecks and keep your users happy, and if you’re in an environment where you’re doing real-time analytics, you need to be able to identify bottlenecks or problems before they start happening.
And the last area is utilization: what are the users doing? Who’s connected to my BI sources? Who’s running what? What queries are they running? What reports are they running? Knowing this information helps you do capacity planning, for example. It also shows what’s being utilized in your BI environment. We had a customer that wanted our monitoring product for BI just so they knew what parts of the BI environment they were utilizing, so they could move resources around. For example, if they weren’t utilizing certain reports, or certain analysis services cubes, then they would move resources from those to other areas that were being highly utilized. Another quote that I like – I love great movies like “Tremors,” so that tells you my taste in movies – is from Burt Gummer, who was played by Michael Gross; he’s kind of the survivalist gun guy. He shows up and pulls out this huge 50-caliber sniper rifle, and one of the guys says, “Damn, Burt.” And he replies, “When you need it and you don’t have it, you sing a different tune.” In other words, he was prepared for anything. And what I mean by this is: if you’re not monitoring your BI environment for resources and utilization and the things I just talked about, then you don’t realize you need a tool or a structure that’s monitoring it until you don’t have it. And then you realize you really did need it, and that’s kind of the way a lot of our customers are.
So, having said that, we will move into, and we’ll take a look at what we’re doing here at IDERA to solve some of these issues. And—
Eric Kavanagh: Okay, there you go, I see it.
Stan Geiger: You see it? Okay. So, what we’ve got here is our BI Manager product. IDERA traditionally has been a company in the Microsoft SQL Server environment, and then we bought Embarcadero, so now we’ve expanded out to some other platforms, but our BI product traditionally monitors the BI stack in the Microsoft environment. That would be analysis services, for your multi-dimensional and tabular analysis; reporting services, the reporting tool; and then integration services, which is an ETL platform, similar to Informatica.
And you are able to monitor all three of those environments through one product. What you’re seeing here is the overall dashboard, and the thing to note here, when I talk about alerting, is that it’s one thing to monitor, but that’s not enough – you need to have an alert mechanism. In other words, I need to be able to be notified before things get to a critical state. So, there’s a whole set of metrics that we capture, and the thresholds are configurable, because depending on your environment you may be okay with a thirty-millisecond read time. In other environments it may be more critical that that threshold be lower, so it’s important not only to have alerting, but to have it configurable, because environments are different depending on resources.
So, basically, this is an overview of all of the environments that are being monitored here, and I’ve got three instances here: one for analysis services, one for integration services, one for reporting services. And you see I’ve got a couple of alerts here, and because these are red it tells me they’re critical, because I have multiple levels that I can set those alerts at, and the alerts can be emailed out to people who are responsible for looking into what the problem is. So, just briefly we’ll take a look – and I’ll come back to the alerting – so we can go into the analysis services piece; I’m sure it’s waiting to load here. Basically, we have a data collector; it goes out there periodically and collects snapshots of what your environments are doing. I have mine set for every six minutes, so every six minutes it goes out there and polls the environment. I had my VM asleep for a while, so it’s going to take a second for this to come back up. There we go.
So, we take a look at the analysis services piece, and I’m going to click on my instance here. Remember, I talked about how one of the things we monitor is performance at the server level, because a lot of people have multiple things running on their server. I may have a database running on my server as well as analysis services, for example. So, if something’s going on in the database, or I have an issue at the server level, it’s going to impact whatever’s running on there. So, we’ll monitor things at the server level, things like how the disk performance is doing, and you can see we capture metrics around all of this. And all of this is configurable. And I can take a look at what’s going on CPU-wise – and again, this is at the server level, not at the analysis services level, in my example here.
And I can look at things like the overall memory usage, for example – what’s available? So now I get an idea of the health of the server itself. Then we can start taking a look at things that are particular to, in this case, analysis services. I can look and see how my cube processing is going here, for example, and this gives me a measure of the health. If I start seeing that the processing is taking longer, or the rows are not being written nearly as quickly, then I can start taking a look at – and this goes to the correlation piece that I believe Robin was talking about – it still takes a human to be able to do all of this. We talk about AI and machine learning, but it still takes a human to be able to correlate these events. We can take a look at things like what’s going on as far as queries: what queries are being run and how long are they taking? I can sort, so I can start to get an idea of which queries are taking the longest amount of time. You can take a look here at elapsed time, and I can take a look and see, OK, what was that query and who was running that query at that time?
So then I can start to put a story around this: when I start seeing things spiking, I can go back and look and see what users were doing at that point in time. And you’ll see one of the things that we do is we put this time picker in here to allow you to pick a window of time. So, for example, I can go back to those alerts – there was actually a link on those alerts that I can click on – and it would take me to that point in time when that alert occurred. And then I can start piecing the story together: I can see, oh, well, the disk reads were up, or I had memory issues or whatever, and then I can jump over to the query activity at that same point in time and actually start correlating who was running what queries that might have caused those spikes. And then you can start doing things like tuning. This is like a car: if you build a race car and you just drop in the engine and turn the key, the engine might start, but if I need to go 180 miles per hour to win and that engine only runs 100 miles an hour, I need to go in there and start tuning that engine to get there. And that’s what this enables you to do: it gives you enough information to start tuning your environment, to increase the health, the productivity and the efficiency of that environment.
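Just to give a feel for where that session and query information lives, if you wanted to pull it by hand instead of through a tool: Analysis Services exposes dynamic management views you can query from an MDX window in Management Studio. This is only a rough sketch – column sets vary a bit by version, and these DMV queries don’t support joins, so you end up correlating the two result sets yourself on the SPID, which is exactly the kind of human correlation work we’ve been talking about:

    -- Who is connected right now, and the last command each session ran
    SELECT SESSION_SPID, SESSION_USER_NAME, SESSION_CURRENT_DATABASE,
           SESSION_LAST_COMMAND
    FROM $SYSTEM.DISCOVER_SESSIONS;

    -- What is executing, and how long it has been running (longest first)
    SELECT SESSION_SPID, COMMAND_START_TIME, COMMAND_ELAPSED_TIME_MS,
           COMMAND_TEXT
    FROM $SYSTEM.DISCOVER_COMMANDS
    ORDER BY COMMAND_ELAPSED_TIME_MS DESC;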
And then, we monitor things around memory that are particular to analysis services, in this case. And this is where you can start to see where things might start to go awry, when you start seeing things spiking up against your memory limits, things like that. The other thing that’s good to look at: anytime you’re running any type of queries, you want data to get cached, because when it gets cached it’s in memory, and that’s a lot more efficient than having to read data from disk. So you can start taking a look at things that are going on, excuse me, in the data cache, for example. I had a bunch of queries running earlier to get this data, and you can see that most of the time the cache hits and lookups are overlapping, which is good. But I had a period here where the hits were a lot lower than the lookups, which tells me that I had something going on that was memory intensive, such that the cache was getting flushed a lot quicker, so data was having to be read from disk. And we can see that when we look at the storage engine. This is the same point in time as that other graph, and you can see the spike there, where the queries from file really jumped up during that period. And that means that data was being read from disk. Now, I can go back and correlate that to the queries that were running, and not to make everyone’s ears bleed, but analysis services uses a language called MDX, and there are ways to write queries more efficiently, so that they use the cache more and the storage engine less. So, there’s an example of tuning that engine, and this gives you all the pieces needed to be able to correlate that.
Just quickly, we can also flip it the other way: when we look at the queries, we can now look at the sessions – who’s actually connected at this point in time and what are they running? So this gives you the opposite view of the queries and who’s running them: this is who’s connected, and then I can see what they’re running currently. The other thing, just to quickly go over, is you can see all the objects in my multi-dimensional MOLAP cubes, and I can get information on them. So, for example, I can sort by this read column, and I can see that the most utilized object is the time dimension and the second most utilized is the customer dimension. And this helps people who develop and build things to more efficiently build their cubes. I may want to change my partitioning strategy for the data, for example, on these highly utilized dimensions in my cube, and that’s going to increase the performance of queries. It may decrease the performance of processing the cube, because now I’ve got more partitions, but from a user perspective it’s going to tune that engine to be more efficient at utilizing these objects.
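There’s a similar view for that object-level activity, by the way, if you ever want to see it raw – again, just a sketch, and treat the exact column names as version-dependent:

    -- Which cube objects are getting read the hardest
    SELECT OBJECT_PARENT_PATH, OBJECT_ID, OBJECT_READS, OBJECT_CPU_TIME_MS
    FROM $SYSTEM.DISCOVER_OBJECT_ACTIVITY
    ORDER BY OBJECT_READS DESC;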
So, moving on, let’s talk about integration services here. Integration services, as I mentioned, is the ETL platform in a Microsoft environment. What we do here – and this is consistent – is monitor the server performance, and these would be the same metrics we looked at, because all of my services are running on the same server. But again, this is an overview of what’s going on on the server. And then I can look at the activity for integration services, my ETL processes. So, I can get an idea of when these processes ran and whether they were successful or not; I can highlight a particular run of an ETL process and it will show me the breakdown of the steps within that ETL process, whether each was successful or not and how long it took.
Now, if I had a failed package here, a failed ETL process, I could go down to the details and see the error message, and it would show me the step in that package where that ETL process failed, along with all the messages associated with it. And I can get an alert if it fails, so if I get an alert, I can go in here, go to that alert, see the package failure, look at the steps, see where it failed, look at the error message, and I immediately know what I need to do to fix it: redeploy it and then start it over again. So, what this allows you to do is what we call shortening the window between identification of the problem and resolution of the problem. In a prior life, when I was responsible for this kind of thing, we had ETL processes that would run at night to load our data warehouse. If I had this information first thing in the morning when I came in, and something had failed, then I could quickly address it and get that process back up, to make sure that the data warehouse was up and running and refreshed by the time the users came in and started accessing reporting.
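If you’re on a SQL Server version with the SSIS catalog and your packages are deployed there, you can piece that same failure story together by hand from the SSISDB views – a minimal sketch, with a made-up execution ID just for illustration:

    -- Recent failed package executions in the SSIS catalog
    SELECT e.execution_id, e.folder_name, e.project_name, e.package_name,
           e.start_time, e.end_time
    FROM   SSISDB.catalog.executions AS e
    WHERE  e.status = 4                       -- 4 = failed
    ORDER BY e.start_time DESC;

    -- Error messages for one failed execution (plug in the execution_id from above)
    SELECT m.message_time, m.message_source_name, m.message
    FROM   SSISDB.catalog.event_messages AS m
    WHERE  m.operation_id = 12345             -- hypothetical execution_id
      AND  m.message_type = 120;              -- 120 = error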
The other thing, with these two processes that run, is to look and see how they ran over time. That’s important because if I start seeing these processes taking longer, seeing these times ramp up, then I may need to take a look at, for example, my maintenance window; I may have other things going on on that server. Take backups, for example: I may have a backup going on that’s causing my process to wait until it’s done. I may need to reschedule or juggle my processes around things that are starting to impact my ETL.
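You can approximate that run-time trend straight off the catalog as well; another sketch, assuming successful runs are what you want to trend:

    -- Average run time per package, by day, to spot ETL processes that are slowing down
    SELECT package_name,
           CAST(start_time AS date)                    AS run_date,
           AVG(DATEDIFF(SECOND, start_time, end_time)) AS avg_duration_sec
    FROM   SSISDB.catalog.executions
    WHERE  status = 7                                  -- 7 = succeeded
    GROUP BY package_name, CAST(start_time AS date)
    ORDER BY package_name, run_date;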
And the last piece is reporting services. Reporting services is basically Microsoft’s enterprise reporting tool. Again, we can look at things at the server level, and we can look at things across the report server, the reporting services server itself. I don’t have a lot of stuff running here; I have some subscriptions that run every 15 minutes to run a report. So, you won’t see a lot of active connections, because it connects, runs the report, disconnects and sends it off.
But in high-transactional environments where a lot of reporting is being done, being able to monitor these things is key. So, you can see where I had things going on here, and it gives you a pretty good idea of what’s going on at the actual service and platform level. And then, as I talked about in the slides, who’s running what and what are they doing? One of our customers bought this product just for this piece, because they wanted to know what reports people were running, and who was running those reports. So this is one of the things you can see here in the report execution view: I can see what report, I can see any parameters that were in that report, I can see who’s running it, I can see the format of the report. And then I’ve got all these metrics around it, so, again, I can rank these things – for example, which report took the longest to retrieve data – and I can go right to that and see which report it is. And again, this all gives me data in order to tune that engine. Now I can start tuning my reporting environment around that.
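Reporting services keeps that execution history itself, in the report server database, so a hand-rolled version of that ranking could look something like this – a sketch against the standard ExecutionLog3 view, where the database name depends on your install:

    -- Slowest report executions: who ran what, in which format, and where the time went
    SELECT TOP (25)
           ItemPath, UserName, Format,
           TimeDataRetrieval, TimeProcessing, TimeRendering,
           TimeStart, Status, [RowCount]
    FROM   ReportServer.dbo.ExecutionLog3
    ORDER BY TimeDataRetrieval DESC;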
And the last thing is I can take a look at the user activity: who’s currently connected and what are they doing? In an environment with multiple users, these are all sortable, so I can rank and see who’s utilizing the environment the most. So, just to quickly go back and take a look at those alerts: here was that alert; I can click on this link here and it will take me to the graph for that point in time and show me which metric was under alert. You can see here that it was the average milliseconds for read and write, for example. So, again, it’s about getting to that point of identification of the problem. And it’s really important to have a holistic tool, not just something that looks at one thing, because a human’s got to come in here and correlate these events that are going on, so you need to be able to look at what was going on at that point in time across the multiple areas of that environment, and that’s one of the things we do through this time picker here.
Eric Kavanagh: Yeah, this is Eric here, just with a quick question, ’cause I think you probably hit the nail on the head, and this is what I was talking about at the top of the hour: a human being has to come in and draw these correlations between different environments. I’m curious to know, is there some educational material that you guys can share, or maybe do you do some kind of engagement with folks to help them identify some of those patterns? You had a really good example a minute ago, about when one of those charts is spiking, that tells you something is going on in memory because the cache kept getting flushed. It gives you a clue, but how do people map these statistics against real-world problems? That’s the real question.
Stan Geiger: Yeah, that’s a good point, and one of the things I was just talking about, on the road map for the product, is that later this year we’re going to release a version where one of the things we’re going to start adding to each one of these graphs is a description of what the graph means, why you should care, and what the impact is. So you’d be able to click a question mark or something on the chart and pull up a window that’ll give you a lot of that information and tell you these are the possible causes, these are the areas that are impacted, and guide you in a direction – in this case, like you said, here’s that spike, I know from my personal experience what this means – and then I can start drilling into an area and finding the root cause.
Now, we have a lot of that, actually, in our diagnostic manager product for SQL Server, for the actual database. We have a lot of that type of functionality in a product like that, and also we have some analysis bolt-ons to diagnostic manager that clue you in a lot quicker. And that’s where we’re going down the road with this product.
Eric Kavanagh: And I’m guessing there are signatures to certain kinds of activity. Does this tool allow you to identify when a certain kind of event took place and catalog that, such that over time it’s going to recognize a similar pattern down the line and help you figure out maybe if it’s a new user, for example, using the same tool? Help you understand, oh, this is because these servers went down or because this region went down? Is there some way to catalog signatures of problems, such that you can easily identify them later?
Stan Geiger: No, actually, but that’s actually an interesting concept, because it’s almost like – what is it, principal component analysis, I guess – where you identify patterns and you log those patterns, and so if you see them again you can go back and see, OK, this was the cause at that point. Yeah, that’s something – it’s not on the road map, but it’s something that I’ve been thinking about from the product management standpoint.
Eric Kavanagh: I can imagine. Oh, go ahead.
Stan Geiger: No, I was going to say – and we get a lot of requests, because I don’t know what your experience is – but what we find is that DBAs know databases like the back of their hand, but the BI stuff is kind of a black box to them when it comes to platform health. They don’t have a lot of knowledge base around that. I do, just from having worked in it for five to ten years, right? But for the typical people who are responsible for getting these alerts and figuring out what’s going on, it’s kind of a black box.
Eric Kavanagh: Yeah, I can imagine. I’d be curious to know, too, so you were showing in that one screen how you can see all the queries that are coming through, how long they took to run, and who generated them. Can you also see the actual structure of the SQL query itself and kind of do some analysis around that? Like maybe sometimes people put together SQL queries that are kind of bulky, let’s say, and cumbersome, as opposed to a master who really puts together a nice, tight query. Is that something that you can visualize through this tool and then help you [inaudible] that’s the problem?
Stan Geiger: Yeah, so what you can do is, like what I’ve done here, I’ve just sorted by elapsed time, for example. So I can see the ones that took the longest and then I get the text, but then it’s still up to somebody who is more or less the subject matter expert to look at that and go, “Oh, OK, here’s why that took so long.” We have a kind of workload analysis product, we call it SQL Workload Analyzer, for the database side, and I’ve been toying with the idea of maybe, down the road, coming up with a similar thing that identifies these queries and then gives you recommendations on how to tune them. But one of the issues is that MDX is a pretty specialized language.
Eric Kavanagh: Yeah, I can imagine. But you can see, for example, who the people are, so it’s not too hard to figure out if one person, if one guy is responsible for ten of the longest-running queries, then if nothing else you can call him up, or call his manager or someone and say, “Hey, this guy’s chewing up a lot of bandwidth,” and maybe it turns out those are the most valuable queries for the business, right? You have to put it in the context of what the business value is from the queries themselves; it’s not just a straight numbers game, right? It might turn out, well, this guy is our power user, and he’s the one changing the business, right?
Stan Geiger: No, you’re exactly right. I mean, that’s one of the ways customers use this, to be able to do that. Like you said, you may find one area – because, and this is one of the things I talk about, I always slag on Excel, but you can connect to analysis services from Excel and run pivot tables off of OLAP, and it generates its own queries and sends them, and sometimes they’re not in the best form. So you can go back and identify those, actually rewrite them, and give them to the user to run, so that it doesn’t take half an hour for their pivot table to come back.
Eric Kavanagh: Exactly. And when we talk about queries, you guys cover the gamut of queries, so you mentioned MDX, what about some of the other queries like a DAX query, or some of these other—?
Stan Geiger: Yeah, we cover both DAX and MDX. One of the things I didn’t mention, or maybe I did, is that we do support both tabular and OLAP in the Microsoft world, DAX being the tabular side – I think you and I talked about this a while back – and we’re seeing a lot more tabular now than we are OLAP, ’cause it’s just easier to bring up the tabular models and things like that. So you’re obviously going to see DAX queries, and we’ll pick those up, also.
Eric Kavanagh: Yeah, that’s interesting. Do you have any context around why that’s happening? Is it maybe because more and more people are getting into this stuff and because OLAP of course is not something new, that’s been around for what, at least 30-odd years?
Stan Geiger: Right, well, it’s kind of a combination. One of the things is that designing cubes is an art. Cubes were built to pre-aggregate data so it’s real fast to get data out, but processing the cube takes a while because it’s got to do all those aggregations. And then hardware got cheaper, memory got cheaper, and everybody was coming out with columnar stores and in-memory databases. Also, tabular is probably the closest to traditional relational databases, and it’s just a lot easier and quicker to bring up tabular models than it is with OLAP. But the drawback is that the whole thing resides in memory, so it’s very memory intensive, and the data doesn’t aggregate until you request it. But having said all of that, we’re starting to see a lot more tabular out there.
Eric Kavanagh: That’s interesting. It might also be because this industry is kind of flattening out a bit, and what I mean by that is we’re getting a lot more people who are interacting with data and using various tools, and certainly when you talk about Microsoft, I think that’s definitely the case: you have many, many more users in small and mid-sized businesses, and even some larger organizations, who are digging into this stuff, getting access to tools, running queries, and they’re maybe not as familiar with the whole process and the technologies around building cubes, to your point, right? ’Cause it does take some thought, and it’s also expensive, right? It takes time, it takes energy to build these cubes, unless you’re using some of the newer technologies out there. We’ve talked to companies like Snowflake, for example, that are doing pretty interesting stuff, but I think you do have a lot more people using this stuff, and they’re probably going with what you just described, the tabular format, as opposed to formally building cubes, right?
Stan Geiger: Yeah, well, I mean, I guess Excel – what was it, Power Pivot, I believe – that’s actually tabular, if you take a look at it; it’s the way you build tabular models. And then the next iteration was, I can take the tabular models that I build and deploy them up to SQL Server so that I can share them with everybody else. So, it’s kind of a natural extension off of Excel, almost.
Eric Kavanagh: Yeah, that’s a good point. What we’ve seen over the last, I’d say five to seven years, is just a tremendous expansion of the use of these technologies, right? And Microsoft, frankly, has been a pioneer in that, really democratizing the power of data through analysis services and through Power Pivot, right? I mean, that was a game changer for the industry, right?
Stan Geiger: Yeah, no, you’re exactly right. I mean, I have a slide, when I give a longer presentation, that shows the transition from the semantic model, which was the OLAP side, to the tabular. And I think I have a quote from Microsoft; they want to put data in the hands of the users, not just over the wall in the IT shop – they want to get more of the data in the hands of the people that are consuming it.
Eric Kavanagh: And that gets right back to that first very simple slide that I showed, which was the basic decision-making process for any organization, and now – and I think this is a great thing – we’re getting more and more people from the entire hierarchy of the organization paying attention to what’s happening, bringing their story to the table, and you do that with data, that’s the bottom line. I mean, you can use other means, but if you back your story up with data, you’re going to have a much stronger argument than those who don’t, right?
Stan Geiger: Exactly, yeah, that’s exactly right. I mean, it used to be, “Hey, I need this report,” and I’d have to go through the report request process to get my report; now I can sit right there at my desk, I have access to the data, I can generate it myself and make my business decisions.
Eric Kavanagh: That’s right. You know, I came back from a conference just this past week and there was a hysterical comment from a guy who runs a rather large BI environment for the retailer Target, and he was referencing self-service analytics and self-service BI, and obviously that’s a big issue these days. I’m sure it’s something that’s driving a lot of activity for what you guys do at IDERA, because when you want to roll out self-service, first of all you’d better have a healthy BI environment, right? If you’re going to get all kinds of people out there asking all kinds of questions in all kinds of ways, you are going to want to have something like this tool right here, to be able to understand who’s asking which questions and where. And the funny quote I’ll throw out just for kicks here, as he said: “There’s a fine line between self-service BI and go F yourself.”
Stan Geiger: Yeah.
Eric Kavanagh: I thought that was hysterical. But are you seeing that self-service trend really drive a lot of awareness around what you’re doing with the technology?
Stan Geiger: Yeah, because like you said, if you’re going to allow self-service BI, then you’re probably going to get some performance issues, just because of: A) the amount of access, the number of people going at the data, and B) the number of poorly formed queries and ways of accessing it that you’ll have. So, it’s really imperative that you monitor the environment so that you’re able to keep everybody happy who’s trying to consume the data, right?
Eric Kavanagh: Yeah, I think that’s exactly right. It’s a blessing and a curse: it’s good that people are trying to use the stuff, but again, to your point, if you don’t have the right tool at the time, you’re going to be an unhappy camper because to roll out self-service without a tool like this, it seems to me it’s just asking for a mountain of trouble.
Stan Geiger: Yeah, I mean, it’s similar to when I was building data warehouses: if you didn’t get your dimensions and fact tables right and then you turned it loose for ad hoc reporting, you might want to crawl under a rock.
Eric Kavanagh: That’s awesome. Yeah, again, it’s good news that people are using this stuff, but I have to believe that self-service is going to drive a lot of activity for what you’re doing, because you’re talking about ramping up the amount of contention and the amount of pressure on these systems by orders of magnitude – not just by one or two – and it’s at that point that you really want to have some visibility, and you want to be able to see who’s doing what, where, when, how and why. Ask those questions and then make some decisions about how you can monitor and change the environment, and change your policies around who gets access to what, right?
Stan Geiger: Right. And you know, seeing that utilization also allows you to go in there and potentially – like I mentioned with the objects within the cube – do things to improve the way I build and design things. So, it’s imperative not only to look at the performance of things, but to be able to see how your schema and your design are performing at that level, too, in order to be able to make tweaks. And it’s just going to get bigger and bigger, as things like Power BI are the big deal now with Microsoft – now I can build my own dashboards and widgets and things, and not have to be a BI developer.
Eric Kavanagh: That’s right. Yes, it’s good stuff, it’s getting everywhere, but you’re going to need some way to manage that environment or you’re going to get unhappy users. That leads to unhappy management, which leads to people getting fired. There’s a pretty clear domino effect when things start to fall apart, but this is great stuff.
So I kind of chewed up the last five minutes here. Robin, did you have any questions?
Robin Bloor: Well, I think it’s fascinating, actually, to be honest. It has me thinking about the fact that we used to have very constrained environments, and self-service is actually changing the world, and a lot of that’s happening really because an awful lot more data has come into the environment than before. The only question, ’cause we haven’t got much time, that I’d be interested in asking – ’cause I thought it was a very good demo – is about the way that the BI monitoring works. I was wondering, what do people that don’t have this kind of stuff actually do? Because it must be very difficult. There are a number of things where you make a difference; with root cause, well, you don’t necessarily always get to the root cause, but you can get to it with some of the things that you’re looking at. And when you said that a number of people buy the tool just to know who’s running what, that set my mind spinning, because if you don’t know who’s running what, then stuff’s out of control. So, what does the environment look like when it is out of control?
Stan Geiger: I mean, you could get all this information that we have in the tool yourself, but you’d have to write a bunch of homegrown scripting, because the data’s all out there; you’ve just got to know where to get it, which requires a level of expertise, right? So, in environments where you don’t have that level of expertise, basically what you get is: hey, is it up or down? I really don’t know if it’s running efficiently or not, but it’s up, right? And then I start getting phone calls, or people going, “Hey, my report is not in my inbox, what’s going on?” or “I just submitted this report through reporting services,” or they may be doing a query over here in analysis services, but it’s taking like half an hour, and it used to only take like 30 seconds – what’s going on? Well, now you have to do the fire drill and try and figure it out, and without a tool it becomes very difficult.
Robin Bloor: Well, right, that was the thing that was just becoming increasingly apparent to me, as you demonstrated each of the dimensions of what you’ve actually got here. The other thing, it’s like at a very, very primitive level, if you don’t have alerts that tell you that stuff’s going wrong, then it’s just an expensive— you get into an expensive situation, trying to cure what’s happened, because you don’t find out until stuff starts falling over badly, right?
Stan Geiger: Right, you don’t know what you don’t know.
Eric Kavanagh: You got it. Well, hey folks, we’ve burned through an hour and change here. Very big thanks to our own Robin Bloor, and of course our friend, Stan Geiger, from IDERA Software. They’re going to be at Enterprise Data World, in fact; if any of you are going down there, yours truly will be there as well, in Atlanta. Our good friend, Tony Shaw, has been doing a great job running that conference for years now, and hey, what’s old is new again. It’s all hot stuff. Hopefully, we’ll see you out there; if not, check back with us next week, we’ve got a bunch of other webcasts lined up.
Always curious to hear your thoughts, send an email to [email protected], that goes right to me, if you have any questions or suggestions, or other technologies you’d like to learn about in Hot Technologies. And with that, we’re going to bid you farewell, folks. Thanks again for joining us, we’ll talk to you next time. Take care. Bye bye.