Eric Kavanagh: Alright, ladies and gentlemen. Hello and welcome back once again. It is a Wednesday at 4 o’clock Eastern and, for the last few years, that’s meant in this world of IT and big business and data, it’s time for Hot Technologies. Yes indeed, my name is Eric Kavanagh. I’ll be your moderator for today’s event.
We’re going to talk about the systems that run business, folks; we’re talking about PeopleSoft, how to manage the performance of complex environments. I always like to mention, you play a big role in these events, so please don’t be shy. Ask your question at any time; you can do so using the chat window or the Q&A – either way it gets through. I would love to hear what you want to know and that’s the best way to get the best value for your time. We do archive all of these webcasts for later listening, so just keep that in mind.
If systems are running slowly, just keep in mind how life used to be. This photo is actually from 1968, courtesy of a lady named Danelle, and I have to say this really is a stark reminder of just how much things have changed. The world has gotten remarkably more complex and of course business needs and user experience tend to go hand in hand. But these days, there’s a little bit of a disconnect. There’s a mismatch, as we often say, and the fact is that business people always want things faster and faster, IT teams who have to deliver are the ones who get put under pressure to get the job done and it’s an intense world out there.
I have to say, competition has heated up everywhere. If you just look at any industry, you can see that there are major developments these days – Amazon buying Whole Foods, for example. You can rest assured the grocery industry is taking a hard look at that one. We see this all over the place, so it is really incumbent upon business leaders to make sure they figure out how to – and here’s the buzzword these days – digitally transform, how to move beyond the old switchboard to much more new and robust systems. That’s what we’ll talk about today.
One of the issues facing a lot of organizations, especially ones that have been around for a while, is legacy systems. That’s an old IBM mainframe from back in the day. There are legacy systems everywhere. One of the jokes is that a legacy system is a system that’s in production, meaning the moment it goes into production, technically it’s a legacy system. There are always going to be new ways of doing things.
And there are some very interesting developments in the last few years about finding ways to virtually reconcile systems to not necessarily just improve the performance of one system, but to find a way to create sort of an offshoot or an off-loading tactic to handle performance in other ways. Today, we’re going to talk more about how to improve the performance of a system like PeopleSoft, which of course is incredibly complex. But when done well, when loaded, when implemented, when managed well, it can do wonderful things. But when it’s not managed well, that’s when you have all kinds of problems.
So what happens? You have to be realistic about things and in any environment, if users don’t get what they want, sooner or later they go to shadow systems. It happens all the time. Shadow systems can be very productive, they can help people get the job done. But of course there are lots of issues. Certainly in the whole area of compliance and regulation, shadow systems are a big no-no. But they’re out there and I think it’s important to remember that your systems, if your main system is not working quickly or not working efficiently, sooner or later there are going to be workarounds and those workarounds can be very hard to unearth, they can be hard to sunset because they wind up being critical to the business. They can be hard to integrate, so just keep in mind it’s out there and it’s just another reason to improve performance.
Just recently I heard of this expression and I have to throw it out there: “the tyranny of urgency.” I think just hearing that you probably know what I’m talking about and what happens in most organizations is the workload reaches a critical mass, and people are doing as much as they can, and it becomes very difficult to change anything. You wind up suffering from “the tyranny of urgency” – everything has to get done right away. Well, upgrading a system does not happen right away.
Anybody who’s ever lived through upgrading an ERP from one version to another version knows that it’s a relatively painful process, so just be mindful of this: If you see it in your organization, recognize it. Hopefully you can get through to someone or if you’re a senior person like a CIO or CTO or CEO, recognize that this is a very dangerous scenario because once you’re behind the eight ball, it’s really hard to get out from behind the eight ball.
It’s like the whole marathon conundrum: If you wind up way far behind in a race of some kind and everyone’s ahead of you and you’re all still running, it’s going to be really hard to catch up if you fall too far behind. So just watch out for that and keep that in mind.
And with that, I’m going to hand it off to Matt Sarrel to give us some insights about how to handle complexity with PeopleSoft environments. Matt, take it away.
Matt Sarrel: OK, thank you, Eric. Hello, everyone. And so, let’s see, I’ll start off by telling you why I think I’m the right person to be talking to you about managing performance. So I have 30 years of experience in technology. I kind of like to say I worked my way up through being a hands-on, a network administrator, director of IT, VP of engineering at a couple of start-ups. Then I made this transition into being a technical director at PC Mag. There’s my picture there, but basically I look like a little kid.
And then going on and being a journalist at a variety of different publications like eWeek and InfoWorld, being an analyst at Gigaom, working with the Bloor Group and running a consultancy as well. And there’s me: This picture on the left is what I look like now. This picture in the middle is sort of where I’m very happy – in a room full of wires and blinky lights, and where it’s cold – it’s got to be very cold and everyone else has to be uncomfortable for me to feel comfortable temperature-wise. And there’s my contact info, should you have any follow-up questions.
I want to set the stage here and just talk about performance, as Eric talked about. We’ve now entered this world where users have this expectation that has been set by consumer apps and websites. And people used to be willing to go to work and sit there and wait for their systems because it’s what they needed, and now people aren’t really willing to sit there. So it’s a question of whether they want this motorcycle flying around the track. They probably don’t want the guy riding his bike and carrying his daughter to school. But which are you going to provide?
And it’s hard because – really I was kind of generous with this one to three seconds as good – people want an immediate response too, and they want access from anywhere. That anywhere could be anywhere in your building or on your campus, or it could be anywhere in the world at any time depending on how well your business works. And I guess what I’m building up to is that when we talk about performance, it’s important to think about performance from the angle of the user experience.
It’s important to define performance goals before measuring and tuning. I have this picture of a tuner and then a tuner. The actual man who’s a tuner, he needs to know what he’s tuning for or there’s no point actually putting his hands on the piano and tuning it. So defining goals beforehand, that’ll sort of keep it real instead of adapting goals to fit the current situation. It’s important to monitor metrics over time and realize how systems change with user load. Application performance is affected by resource constraints and usage patterns.
It’s always important to correlate all this together with a user experience or support incidents, establish a baseline for the performance that you expect to be able to deliver and when you’re approaching deviations from that baseline, have proactive alerts so that you can take action before we hit “fail whale” status. And you know that requires the ability to be able to determine and address the root cause of the performance issue very quickly and easily. And again, this is the earlier, the better, right?
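To make the baseline-and-alert idea concrete, here is a minimal Python sketch. It assumes response times are available as plain numbers in seconds; the function names, the sample history and the two-standard-deviation threshold are illustrative assumptions, not part of any product discussed in this webcast.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize historical response times (seconds) as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def check_deviation(baseline, current, warn_sigma=2.0):
    """Return True when `current` is more than `warn_sigma` standard
    deviations above the baseline mean (an assumed alerting rule)."""
    avg, sd = baseline
    return current > avg + warn_sigma * sd

# Hypothetical history of page response times, in seconds.
history = [1.1, 0.9, 1.0, 1.2, 0.8, 1.0]
baseline = build_baseline(history)

# A 1.5-second response is well outside the usual spread; 1.2 seconds is not.
slow_alert = check_deviation(baseline, 1.5)
normal_alert = check_deviation(baseline, 1.2)
```

The point of alerting on deviation from a baseline, rather than on a fixed limit alone, is that the warning arrives while the system is still usable.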
We know, from past history looking at development efforts, the earlier you can find and fix performance problems, the better you are. If you wait until all your code or your system is live to start performance testing or to start uncovering problems, I’m not gonna say it’s too late, but again, now you’re the guy who got a bad start in the marathon and now you’re playing catch-up instead of jumping right out and getting ahead. So how do you do this? Do you anticipate your average and your peak load?
And you go ahead and you size your physical servers or your virtual servers or your cloud instances or your containers and your container resources and then run a proof of concept and run a pilot? These are the times that this is sort of, the end of where you’d want to catch something, although still you’re better off catching it in production than ignoring it in production. But really, by the time you’re in your pilot you should already have established your methodology and procedures around continuous monitoring and improvement.
OK, so a lot of companies – we talk about digital transformation. DevOps, and the DevOps revolution, is playing a huge role in that digital transformation. And this is an end-to-end process that really never stops. So it’s like the two hands drawing each other, and this is good stuff. It’s an infinite loop between these two hands of plan, code, build, test, release, deploy, operate, monitor, back to plan. It feeds itself and we automate it so it goes quickly. It creates a production performance monitoring feedback loop and it uses it to proactively uncover performance problems and fix them before they impact your entire user base.
And another thing, now that you’ve got IT developers and operations staff moving very quickly and aligned, you can also easily align these efforts with the business staff as well. Enterprise software performance is a complex beast. One might liken it to a football team sitting in front of a chalkboard taking direction, and everything works separately and everything works together. I always think of it as the old story of when I got my first car and I fixed one thing. I fixed the air conditioner and then what happened was that then the rest of the cooling system failed. So you’ve got your pain points and everything’s going together and making adjustments. You have to organize everything in such a way and build the processes so that when you make your changes, you understand how everything impacts everything else.
And also be careful and double-check. Test, validate, implement. And again we come to this issue of building continuous monitoring and performance improvement programs. And this is, in fact, my last slide. While we talk about this complexity, and it’s a beautiful complexity just like the inside of this watch, we have so many moving pieces to PeopleSoft. Each thing affects everything else all the way up and down the stack. And there’s so many different places where you can look for keys to performance issues that you could very easily get lost without the right tool and without the right process. And again on everything, in many cases what I think we’ve learned is you can troubleshoot infrastructure, but the huge variable is going to be your custom application code. And so having the right processes in place for testing and continuously improving your application code is what’s going to be key.
And so that’s the end of my portion, and I’ll turn this over to Bill.
Eric Kavanagh: Alright, Bill, let me give you the keys for the WebEx here. I like that beautiful complexity – that’s a nice one. You had a couple really good quotes there, Matt. OK, Bill, take it away. Go to “quick start” if you want to share your screen. All you.
Bill Ellis: Thank you, Matt, and thank you, Eric. Just to confirm, can you all see my screen now?
Eric Kavanagh: Yes, indeed.
Bill Ellis: So we’re going to talk about IDERA’s product Precise for PeopleSoft and the visibility it can provide to help you succeed at managing the complex application stack. A way to position the difficulty is this: one application, a minimum of six technologies and numerous end users make it very difficult to answer even simple questions. Is an end user having a problem? Who is the end user, what are they doing, what’s the root cause?
What we typically see is this situation – and this can apply to PeopleSoft as well as other applications or PeopleSoft interacting with other applications. Whether it’s within the data center, or it could be the cloud these days, an end user doesn’t really care about that complexity. They just want to complete the transaction: a purchase, an inventory lookup, reporting a time card, those types of things. If things are slow or not available, typically all of these intelligent, well-intended people are unaware until the end user complains.
That’s kind of a visibility gap right there, and then what can happen is it can kick off a time-consuming and frustrating process where people might open up a tool and they look at, unfortunately, just a subset of the application stack. So kind of the difficulty in answering those basic questions remains.
And a lot of times there might be an issue and you’ll go to the WebLogic administrator and he’ll say, “Well, the memory, the garbage collections all look great. I really don’t think it’s WebLogic.” You go to the database administrator and they say, “Well, the database, it’s running just the way it was yesterday. The top ten look good.” Maybe the storage administrator hits you with some metrics like I/Os per second or throughput, which are frame-level metrics and might not reflect on your particular application, much less the database or particular process.
And so they all have these metrics that seem to show that the problem is elsewhere, yet this end user is having a problem or has reported a problem, but how can we solve this problem in a better way? And the better way, the Precise way – or this is one way we’re offering – is to measure user transactions starting in the browser through the network, into the web server, into the Java Jolt, into Tuxedo, into the database including DB2 and then finally into storage.
And what this shows is that total time says, “Well, who’s having a problem?” And then we can identify the end user by how they signed onto PeopleSoft and we can also capture via the Tuxedo translation what PeopleSoft panels are executing.
So the timings are fed into a historic repository that we call the performance management database and this becomes a single sheet of music that greatly simplifies the who, what, when, where, why. Precise also includes recommendations. Probably the most important thing is because we capture all of the information all the time – at both the technical IT staff level – you can measure the before and after. So you can bring management by measurement or Six Sigma to the whole operation of performance.
And so let’s take a look at like “a day in the life.” First of all, you might open up the Precise alert screen and this is where you’re going to get early warning. The very top alert is you have activity alerts. So that’s users exercising transactions and we’re basically not meeting our SLAs. Likewise, we have a status alert on availability – and this is basically saying that a portion of our application infrastructure is unavailable – so we can drill in and we can actually see the Tuxedo instances in the farm and you can actually see that one of the instances is down. All of the activity is being pushed to this one instance and it’s having to deal with that. We’ve basically created a bottleneck.
Now, just as a thing, for the activity that’s running on this, you can actually start to get into findings that, even though we have this overall infrastructure issue, there’s ways to improve the processing efficiency within this particular JVM for WebLogic. And this is where there’s a really important thing: A lot of times people are moving like into a cloud and they say, “Well how much CPU and how much memory do you need?”
Well, the other side of that coin known as capacity is processing efficiency. If I use less memory, if I use less CPU, I just simply don’t need as much. And so like Matt said earlier, everything is sort of related. Now what I can do is I can open up the PeopleSoft transaction screen and in the screen, the y-axis is response time, the x-axis is time across the day.
We have a stack bar graph here that shows client time. That’s actually the browser, web server. The green is Java time, the kind of pink is Tuxedo, the dark blue is database time. This profile didn’t happen by itself; it happened because of the particular PeopleSoft panels – they had been executed and they are presented to you by response time. There’s actually a timing of every step within the application as well as a stack bar graph that shows the application here panel by panel. I’m also able to drill in and find a particular user or rank my users.
This screen allows me to specify a particular user by sign-in name. Think about how remarkable or how powerful this is. A lot of times, it’s not just about the infrastructure and how it’s set up, it’s how end users are using the system. You might have a new hire or somebody has a new job function: They might not know how to use the application correctly. This can actually help identify training opportunities.
The other side of the coin is if I can focus in on a particular user – here I’m looking at that user in their particular transactions and the response time that they experienced – I’m able to address directly the user experience of a particular user. It’s no longer about generic metrics at the system level, it’s about the end-user experience and that’s very powerful. Portions of your environment are certainly going to be internal, HR, etc. There can be other parts that are customer facing. Either way, you want to provide the best, most productive customer experience possible.
Now for a particular panel, I can go in and drill in to answer questions. So this is kind of the deep dive that we can do to kind of uncover what’s happening and you might do this deep dive before you call an end user or if an end user had called you, you’d be able to initiate a process to say, “Well where exactly is the root cause?” And it’s not going to be something overarching like CPU utilization; it’s going to be at the application code that they exercise.
Let’s drill in and we’ll take a look at that content management and you can actually see an analysis of that transaction: starting the browser, entry point to the web server into Java Jolt and we’re actually showing code that’s executing down into the Tuxedo panel, finally to the SQL statement where Precise reveals the text of the SQL statement that is executed by this particular PeopleSoft panel.
Everybody that we talk to has tools, but what they do not have is context. Connecting the dots or following the transaction from the browser all the way to the SQL statement is context. What this does for, like your DBA, is rather than look at things at an instance or a database level, I can now investigate at a SQL statement level.
So I can say, “Well what are the bottlenecks for an individual SQL statement,” and this is extremely powerful. Please consider that this transaction cannot run faster than the SQL statement and every significant business transaction interacts with the system of record. The database, like it or not, is the foundation of performance, and if I can be so granular as to focus on individual SQL statements that are vital for a business transaction, I can really take my game to the next level.
Being able to identify that allows IT to not bark up the wrong tree, but to address the foundational root cause of different issues that can come up. Now what I’m able to do is for a particular SQL statement, I can then analyze exactly what’s happening at that SQL statement. So here we’ve dropped to the database expert view.
One of the things that distinguishes Precise at the database level is that we sample on a sub-second basis. This is in comparison to our competitors that only look once every 10, once every 15 minutes. So that the level of granularity, the level of resolution is orders of magnitude better than our competitors.
And once again, since the database is part of our foundation, we will allow your DBA to really take performance to the next level. So I can see that this SQL statement actually spent 50 percent of its time accessing the storage subsystem, 50 percent of its time using the CPU. Click the tune button and I can then go in and drill down on execution plans and exactly what drove that usage pattern.
Now a quote from one of our customers – they were an Oracle shop, they used an Oracle tool called OEM, and OEM is really kind of database or instance focused – was that its DBAs were constantly looking at the top 10 list. But with Precise we’re able to connect the dots to the individual SQL statements and so that granularity allows the DBA to really tune at the transaction level and not just at the much higher database level.
The second point that was really vital to this customer is that Precise translates what is a complicated URL into a PeopleSoft panel name. If I’m in IT and I can talk about tree manager, content manager, a particular HR page, that way the person I’m trying to help knows I’m actually looking at and understanding what they’re looking at, because it’s no longer these hieroglyphics, it’s the name that they are familiar with.
One of the questions that we’re asked – it seems like all the time, so I thought I’d just kind of proactively answer the questions – how in the world do you capture that PeopleSoft user ID? Let me kind of go through the steps. Here is a PeopleSoft sign-on screen. To access it, I had to navigate to my web server, and this screen appears. When the application is instrumented with Precise, this screen actually contains a Precise script and I can reveal by doing a right click, view source. And this will actually show me the code that makes up the underlying page and up here in the page frame is actually the Precise for web code and this allows me to capture the sign-on screen, the IP address, the browser type, a whole bunch of information about rendering and the true end-user experience. And so when I put in my username and click sign in, Precise is then able to measure what I’m doing.
I open up, go to the tree manager, I want to do a search operation, fill in the field and I click search. A result set is presented to me, so I’ve clearly traversed the entire application stack all the way down to the database. How does Precise show this? Let’s go ahead and take a look. Open up Precise, I go in, I can see the activity, I can click the activity tab that’s going to bring up this screen. These are the untranslated URLs. I can show the users and here is my user ID that I just signed in on and here is my activity.
You could see that I was using Firefox version 45 to bring this up. I exercised the application 12 times and abandon is basically when someone leaves a web page before it fully renders, which suggests a business issue. So that’s how we were able to pick up the end-user ID. It’s very nice, people really appreciate when you know exactly what was going on.
Now we want to shift gears a little bit. We were looking at the transaction layer. We did a deep dive on a particular transaction and looked at its SQL statements. Now I want to shift gears and take a look at some of the other technologies within the PeopleSoft application stack starting with WebLogic.
And so here is a WebLogic instance and you can see the activity over time. You have a findings report. It tells me right off the bat, memory is used near maximum. One of the things that we find is most people run the entire application stack, or at least a portion, under a shared environment, very often it’s VMware. You have to kind of balance how many resources you request and how much you need. You don’t want to be a resource hog. Likewise, you don’t want to put a processing constraint by not asking for enough memory in this case.
The configuration is vital to performance management as well. So we can actually get into memory garbage collection and all of the JMX WebLogic counters so I know exactly the health of my WebLogic farm.
Now into Tuxedo. Tuxedo at many shops is kind of a black box and it’s a very important part of PeopleSoft. It’s kind of the glue that holds everything together and so I kind of almost think of it as an extension of the operating system. It’s something that you use and configure very carefully. Incidentally – this is a little side note – in the opening comments Eric had mentioned “the tyranny of urgency,” and I think that that really comes into play when PeopleSoft shops are considering moving from the classic UI to the fluid UI because you’ll find that you are behind the curve due to the way the fluid UI exercises the PeopleSoft environment.
Now you have issues at WebLogic, at Tuxedo, at database and at the storage here just because HTML5 does a tremendous amount of messaging. It’s probably at least 10x what the classic UI does and that additional messaging means additional traffic. So the configuration of Tuxedo has to be modified to accommodate the additional traffic. A couple things about this screen is over on the right side we have over-time graphs for weighted response time, average response time as well as execution count.
Over here we have information about all of the Tuxedo domains within the environment. We divided out the services, users, server processes as well as IPs. I can shift this to execution count and present those in descending order so I can see what is being executed the most times. I can also scroll down to reveal the domains; most people have multiple domains in their environment, to basically spread out the activity, and I’m able to set SLA compliance, therefore alerts at the Tuxedo layer.
If you have queuing, you have different issues that come about because of the configuration. You typically – because it’s global in impact – you typically are not going to make changes on the fly. You kind of want to gradually increment the system as part of a QA process, which bounces back to a point that Matt had made earlier about addressing performance issues early in the process. It’s much better to have the configuration correct when you go to production rather than go to production and find out that the configuration doesn’t match the usage patterns. I really like the introduction that Eric and Matt had provided today. I thought that they were really on target in terms of the challenges you face in managing and evolving a PeopleSoft environment.
Now, I said this once before – I think it’s worth saying again: Every significant business transaction interacts with the database. And so let’s kind of explore how Precise can provide additional information. Here is a particular Oracle instance. The same exact approach that we saw – the y-axis is execution time, the x-axis is time across the day, but now the stack bar graphs are execution states within Oracle. This is showing us what are the processing constraints on the system. Down here there’s actually a findings report that tells me you’ve got this high redo log buffer.
I’m also looking at this select version from PSVersion. It’s actually consuming a lot of resources. Incidentally, because we’re sampling and we provide this high-resolution view of what’s actually happening on the system, you might be surprised what are the true resource consumers on your system, because if you’re just looking every 10 minutes, it’s not going to show you what those resource consumers are. And so by knowing what the true resource consumers are, you can actually address the true processing bottlenecks on the system.
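The effect of sampling interval can be sketched with a toy model in Python. Everything here is an assumption made for illustration: a single 90-second burst of activity inside one hour, sampling instants at whole seconds, and one-second sampling standing in for sub-second sampling. It says nothing about how any particular tool samples internally.

```python
def samples_hitting_burst(burst_start, burst_end, interval, horizon):
    """Count sampling instants (every `interval` seconds, starting at t=0)
    that land inside the half-open burst window [burst_start, burst_end)."""
    return sum(1 for t in range(0, horizon, interval)
               if burst_start <= t < burst_end)

HOUR = 3600  # observation window, in seconds

# Hypothetical burst of heavy activity from t=130s to t=220s.
coarse = samples_hitting_burst(130, 220, 600, HOUR)  # one look every 10 minutes
fine = samples_hitting_burst(130, 220, 1, HOUR)      # one look every second
```

In this model the ten-minute sampler never observes the burst at all, while the one-second sampler sees it 90 times, which is why short-lived resource consumers can be invisible at coarse resolution.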
Now here we’ve jumped over to the activity tab and this is the activity. You can see we’re looking at CPU, storage subsystem, application locks, OS waits, RAC, commit, Oracle server, communication, and internal aggregate together. This is the y-axis, this is the total execution time.
Down here are the SQL statements that drove this profile and one of the things that you see is this low latency – two milliseconds – but almost 4,500 executions means that SQL statement is actually the number-one resource consumer on your system, and that’s good to know. It’s also not waiting on a lock or a wait. It’s using the CPU 100% of the time. It doesn’t mean there aren’t things I can do about it. There’s plenty of things I can do about it if I know what SQL statements and objects are being accessed. And so these are some of the ways that we can help.
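The arithmetic behind that observation is worth spelling out: 2 milliseconds times roughly 4,500 executions is about 9 seconds of total time, which can outweigh a statement that looks far slower per call. A minimal Python sketch follows; the statement names and the second statement's numbers are made up for illustration, and only the 2 ms and 4,500 figures come from the talk.

```python
def rank_by_total_time(statements):
    """Sort statements by total time (average latency x executions), highest first."""
    return sorted(statements,
                  key=lambda s: s["avg_ms"] * s["executions"],
                  reverse=True)

# Hypothetical workload: one quick but very frequent statement,
# one slow statement that runs only a few times.
workload = [
    {"name": "slow_but_rare", "avg_ms": 1500, "executions": 3},
    {"name": "fast_but_frequent", "avg_ms": 2, "executions": 4500},
]

ranked = rank_by_total_time(workload)
top = ranked[0]
top_total_ms = top["avg_ms"] * top["executions"]  # 2 ms x 4,500 = 9,000 ms
```

Ranking by latency alone would put the 1.5-second statement first; ranking by total time puts the 2-millisecond statement first, at 9,000 ms against 4,500 ms.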
Now down here there’s this drill-down and this can put us in context of the individual PeopleSoft programs and each of these programs kind of serves a different purpose within PeopleSoft. You can actually start to address at the database level how the application is being used.
And if I select a particular program, I can then isolate the SQL statements that that program submitted so I can be very application focused rather than database-technology focused when I’m basically looking and viewing database optimization and database configuration. I want to just bring this to your attention. Oftentimes many large organizations are divided into infrastructure DBAs and application DBAs. By showing the application as well as the resource consumption, Precise is actually able to bridge the gap, and this solution is useful to both types of DBAs on the system.
Now, this part really kind of is where we show off what we can do at the database level. And what happened here is we had a screen freeze, there was a select from PS_Prod and what we did is we clicked this tune button and what this does is it brings us into this SQL workspace. Now, for you people who are not DBAs, this might not look real exciting. For people who are DBAs, you might find this to be pretty exciting. What we’re showing here is the duration of this particular SQL statement versus changes on the system. And this is showing Wednesday, Thursday, Friday, the duration is about 2/10 of a second. Saturday and Sunday this company does not work – lucky them. Come Monday, there was a change: The access plan changed. The new access plan is all of a sudden way up here. That’s actually slow enough it’s resulting in a screen freeze.
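The idea of lining up statement duration against plan changes can be sketched in a few lines of Python. This is an assumed, simplified model: the plan identifiers, the Monday duration of 4 seconds and the three-times slowdown threshold are all hypothetical, and only the roughly 0.2-second weekday figure comes from the talk.

```python
def find_plan_regressions(history, slowdown_factor=3.0):
    """history: chronological list of (day, plan_id, avg_seconds).
    Report each day where the plan changed AND the duration grew by at
    least `slowdown_factor` compared with the previous observation."""
    regressions = []
    for prev, curr in zip(history, history[1:]):
        _, prev_plan, prev_secs = prev
        day, plan, secs = curr
        if plan != prev_plan and secs >= slowdown_factor * prev_secs:
            regressions.append((day, prev_plan, plan))
    return regressions

# Hypothetical observations for one SQL statement (no weekend activity).
history = [
    ("Wed", "plan_A", 0.2),
    ("Thu", "plan_A", 0.2),
    ("Fri", "plan_A", 0.2),
    ("Mon", "plan_B", 4.0),  # made-up value standing in for "way up here"
]

regressions = find_plan_regressions(history)
```

Requiring both conditions keeps harmless plan changes and ordinary load-driven slowdowns out of the report.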
Now if I’m a DBA, I need additional information to know the true root cause. I need to know the choice the database optimizer made. So Precise offers this comparison that shows the execution plan that was fast and efficient when things were running great as well as the execution plan that was slow and inefficient. This filter join is familiar to DBAs that run PeopleSoft. What filter does is, for every row in one table, it looks at every single row in the joining table – that takes a lot of CPU. It’s extremely inefficient because there’s no filtering down to just the subset of rows that are needed by the SQL statement, and that inefficiency results in the slower execution time. That, ultimately, is what led to the slow PeopleSoft panel and the screen freeze, and Precise was able to get to the true root cause that you would never know about unless you had a tool that reveals the application code, the SQL statements and so forth.
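The cost difference described here can be illustrated with a small Python model that counts row comparisons. It is a deliberately simplified sketch under stated assumptions: rows are plain integers with unique keys, the second function is a generic hash-style lookup chosen for contrast, and neither function represents how Oracle actually executes a plan.

```python
def nested_scan_join(outer, inner):
    """For every outer row, compare against every inner row (the pattern
    described above). Returns (matching rows, comparisons performed)."""
    comparisons = 0
    matches = []
    for o in outer:
        for i in inner:
            comparisons += 1
            if o == i:
                matches.append(o)
    return matches, comparisons

def hashed_join(outer, inner):
    """Build a lookup of inner rows once, then probe it once per outer row.
    Returns (matching rows, rows touched = build pass + probes)."""
    lookup = set(inner)          # one pass over the inner rows
    matches = [o for o in outer if o in lookup]
    return matches, len(inner) + len(outer)

# Hypothetical tables of 1,000 rows each, overlapping on 500 keys.
outer_rows = list(range(1000))
inner_rows = list(range(500, 1500))

slow_matches, slow_work = nested_scan_join(outer_rows, inner_rows)
fast_matches, fast_work = hashed_join(outer_rows, inner_rows)
```

Both approaches return the same 500 matching rows, but the row-by-row scan performs 1,000,000 comparisons while the lookup-based version touches 2,000 rows, which is the kind of gap that turns a sub-second statement into a screen freeze.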
That was kind of the deep dive. We’re now going to pull the view up to the 10,000-foot view of dashboards. In Precise, dashboards are really not for the technical team – they’re really for you to use to share information with operations, maybe with the application team, maybe with your chain of command. And so one set of dashboards might show PeopleSoft panels and the client time so you know what the end-user experience is. Another dashboard may have been configured for operations, and this dashboard might look at whether there have been any alerts or freezes. We actually have alerts at the OS, the web, WebLogic, Tuxedo and the database levels. No alerts here, average response time. You can see that we’re running about a third of a second. Here I can actually look at my infrastructure, see all the VMs in my environment, and I can start to get into processing, load balancing, and I can also look at my Tuxedo domains. This particular environment has six different domains and so I can see those domains and I can actually get into web balancing.
Now, Precise’s historical repository, the PMDB, the performance management database, has tons of metrics. And sometimes somebody wants to know about the browser access count, or you could do access count by the type of browser or performance by the type of browser. There’s a whole bunch of things that can be done to provide additional visibility on your system.
Here, this one, we’re actually looking at the WebLogic memory usage and you see this nice sawtooth pattern in the memory usage. There’s the garbage collection, it reclaims the unreferenced objects. It goes back up and so this is a very nice pattern that you like to see. So this is kind of looking at the PeopleSoft environment as a collection of subsystems and this would be appropriate for operations. The most basic question is, “Well, what’s happening at the server?” Precise has all of this visibility. It also provides the server metrics as well. And so here you’re actually able to measure CPU, memory, I/O, server, users on the system and so you have that full visibility. And that – combined with the long-term trending – is how people use Precise for capacity planning.
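One way to reason about that sawtooth is to look at the troughs, the heap level right after each garbage collection. If those troughs stay flat, memory is being reclaimed; if they keep climbing, that is commonly read as a sign of a leak. The sketch below is a hedged Python illustration with made-up heap samples in MB; the function names and the tolerance are assumptions, not anything taken from Precise or WebLogic.

```python
def gc_floors(heap_mb):
    """Return the local minima of a heap-usage series (the post-GC troughs)."""
    floors = []
    for prev, cur, nxt in zip(heap_mb, heap_mb[1:], heap_mb[2:]):
        if cur < prev and cur <= nxt:
            floors.append(cur)
    return floors

def floor_is_rising(heap_mb, tolerance_mb=5):
    """True if the last post-GC trough sits more than tolerance_mb above the first."""
    floors = gc_floors(heap_mb)
    if len(floors) < 2:
        return False  # not enough GC cycles to judge a trend
    return floors[-1] - floors[0] > tolerance_mb

# Hypothetical samples: a healthy sawtooth and one whose troughs keep climbing.
healthy = [200, 350, 500, 210, 360, 510, 205, 355, 505, 208, 340]
leaking = [200, 350, 500, 260, 400, 540, 330, 470, 600, 410, 520]

print(gc_floors(healthy), floor_is_rising(healthy))
print(gc_floors(leaking), floor_is_rising(leaking))
```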
And I just want to throw a little note there. Typically a shop will have so much budget for hardware, for server, so much budget for staff. How are you going to invest, where are you going to place your bets? Using Precise, you get an edge because you see how the storage subsystem is being used. If you’re doing a lot of random I/O, Precise is going to show you that. It’s going to help justify the investment in solid-state storage. That might be more important to your shop than buying additional CPU if the CPU utilization happens to be low.
You want to invest where the true processing bottlenecks are, where you can actually have a payoff. And by Precise addressing everything from application coding processing efficiency all the way down to capacity, we allow you to assess and document where those needs are with numbers.
Now the last piece is alerting and the alerting is actually the way this started. Remember that? We saw an alert that there was a performance SLA breach and we saw that a WebLogic instance was down. So let’s take a look at the alerting interface. And once again, what’s happening? One of the things I want to point out on this view is that Precise not only has these performance alerts and status alerts about availability, we also have trending alerts. The reason that trending alerts are important is that if your system is idle or has one or two users, probably things run great. It’s not until you start to add users and they start to do more and more activity that you start to contend for data, for resources at the Tuxedo level, at the WebLogic level, at the network level, at the database level. And that contention results in performance degradation and then finally you might cross a line and that’s a performance alert, and that basically means you’re not meeting the SLA goals for the organization. And so these sets of alerts are very nice.
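The distinction between a performance alert and a trending alert can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `classify` function, the least-squares projection and the five-sample horizon are hypothetical choices, not Precise's actual alerting logic.

```python
def linear_slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def classify(response_times, sla_seconds, horizon=5):
    """Return 'performance_alert' if the SLA is already breached,
    'trending_alert' if the trend projects a breach within `horizon`
    samples, and 'ok' otherwise."""
    latest = response_times[-1]
    if latest > sla_seconds:
        return "performance_alert"
    projected = latest + linear_slope(response_times) * horizon
    if projected > sla_seconds:
        return "trending_alert"
    return "ok"

# Hypothetical response times in seconds against a 1-second SLA.
print(classify([0.30, 0.31, 0.30, 0.32, 0.31], sla_seconds=1.0))
print(classify([0.30, 0.45, 0.60, 0.75, 0.90], sla_seconds=1.0))
print(classify([0.30, 0.50, 1.20], sla_seconds=1.0))
```

The point of the middle case is that the system is still within its SLA, but the degradation under growing load is already visible before the line is crossed.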
The web tier, over on the left-hand side, the web tier actually measures the end-user experience and then you get into the technologies within the underlying application stack. This is kind of our architecture screen of how do we do all of this. Ideally you would like to have a Precise server that’s independent of the monitored environment or environments. One Precise server can handle numerous applications.
For PeopleSoft and for the Oracle and DB2 database, we do require a local agent. If your PeopleSoft environment is back-ended by SQL Server, there is an option to do agentless. We also have agentless for Sybase. The heart of our security model is that data is collected over here, whereas users of Precise authenticate into Precise. It’s totally separate processes, separate credentials, separate authentication, and so that’s part of our security model. And there’s additional details.
I think that this is enough of an introduction to the architecture for now. If there are any burning questions, please do ask them, as Eric had mentioned.
Just as a quick recap, this solution is designed for 24 by 7 in production. It’s highly recommended that you use us in QA. If you do in-house development, start using us in development. We’re going to translate the complicated URL, URI into a PeopleSoft panel name. When I talk about production, we’re extremely low overhead so you have visibility, you always know what’s happening, you’re identifying the end user.
I did not have to go in and define these transactions – there are just natural connection points from the browser, the URL, the entry points, the web server connection into WebLogic, the invocation context down to the [inaudible] which provides the SQL statement. Then we’re able to capture the SQL statement and what it is doing. Precise is database intelligent and I think that this is a distinguishing factor for us and it allows your DBA to collaborate and enhance application visibility.
The final point is because we’re always on, we’re always collecting, you can always measure before and after and quantify the improvement or, in the rare case where a change may have hurt performance, you would know that and you could roll it back immediately. Most of our competitors, what they do is if you need to see additional information, you have to turn on additional visibility and typically that additional visibility imposes a lot of overhead. With Precise, you always have visibility and you can always solve the problem. So if you go to the Precise website, please check any of the Precise products, whether it’s Precise for Oracle. We’re listed as Precise Application Performance Platform and there is a button there to request a demo.
Actually, if I share my screen I think I might just navigate there to show you what that looks like just so you can see this right upfront. Here’s the IDERA website. You go to products. I can choose any of these Precise components and I just want to see it in action. This will kick off our process for sharing additional information that might be important to your site. Or if you would like to know more about migrating to the fluid UI, you are welcome to contact us.
And with that, Eric, I’d like to pass the baton back to you.
Eric Kavanagh: OK, good deal. I have to say once again – a rather comprehensive and impressive presentation there, Bill. You mentioned a whole bunch of stuff that I’d like to ask about. We don’t have much time – about nine minutes – and I’d like Matt to get a chance to ask a couple questions too, and have at least one or two from the audience.
But you mentioned something I thought that was very, very interesting with respect to how Precise can aid in procurement for the IT team because you can point out, you can make a case to whomever makes that decision that what you need is more solid-state storage, for example, or what you need is improvements to the network or whatever the case may be. But that’s a big deal. Do you often see companies recognizing that and using that or are you trying to evangelize that some more?
Bill Ellis: Well, actually both, and the thing is that usage patterns, even for a package application like PeopleSoft, the usage patterns are distinct at each site. I had the fortune of doing a PeopleSoft migration at a bank, and banks use the general ledger system very differently than most organizations. You could actually have individual transactions that were done at a branch, they all post to the general ledger.
And so rather than posting dozens or hundreds of general ledgers, you’re actually posting hundreds of thousands. And so that’s how I got involved in Precise, because of the usage patterns, and it allowed us to address the needs of the application at a code level, at a configuration level, as well as at the infrastructure level. So absolutely I’m a big believer and I want to evangelize that as well because you shouldn’t be making the hardware decisions simply based upon utilization. You should base them upon the needs of your environment.
Eric Kavanagh: And there’s a question from an attendee, and then, Matt, I’ll turn it over to you for a question or two. Well, this is a good one and that’s funny because it’s a big, long answer you could give. The attendee asks: “How do you collect performance metric at the user’s end after deployment and during testing?”
I think you did a pretty good job of diving into just how deep and rich those performance metrics are. You talked about even sub-second for some of these things compared to every five minutes or 10 minutes. That’s when you’re going to get the level of detail necessary to find your answers, right?
Bill Ellis: Yeah, so the crucial thing is that the individual collectors of the performance information are technology based. So when we do a deployment, we need to know about how your application stack is built, starting with the operating system, its version, what version of Tuxedo, WebLogic, what version of PeopleTools you’re running.
And it’s really the design of those agents, the data collection, that allows us to reach the level of visibility Precise provides. And that visibility, I think, sometimes can be a little intimidating to folks. But if your goal is to really get in and improve things and take performance to 11, that’s really the level of visibility that you would like to have. And if Precise can provide it and it’s low overhead, the question is why not? So I think that that’s a great question and please do contact us if you would like to discuss that further.
Eric Kavanagh: OK, good. And Matt, did you have any questions?
Matt Sarrel: I think I’m OK. I mean, I’ve been dealing with WebEx crashing over here so.
Eric Kavanagh: Oh no. We need Precise to understand exactly why.
Matt Sarrel: Yeah, I guess the question that I had thought of while you were talking, Bill, was if you could discuss a little bit about how multiple teams can get on the same page when troubleshooting performance issues, because I know that’s something that comes up over and over again is who’s responsible for what and how can everyone work together to deliver the best quality to employees.
Bill Ellis: Yes, so IT staff tends to be expensive. In most shops, you’re divided into teams based upon technology, given the complexity of the technology. One of the big things that happens is there’s a performance issue, a lot of times there’s conflict, and the war room convenes. And that’s where everybody has the metrics to somehow exonerate their tier because they don’t have the context. They’re looking at what’s happening at the WebLogic level rather than what’s happening at the transaction-code level. Or they’re looking at the database level rather than the transaction’s individual SQL statement.
And by being able to pinpoint the problem tier and the problem code within that tier, what it does is it frees up the other teams so they aren’t spending time and resources looking for a problem that’s not within their area. If it’s a database problem, hand it off to the DBA with the information that they need to solve the problem. They’ll be glad to do it.
But likewise, don’t waste the time of the Tuxedo and WebLogic teams by having them focus on problems in the database. Likewise, if the problem happens to be in WebLogic configuration, don’t take the DBA’s time in some kind of war room trying to defend themselves. Just go and fix the problem in WebLogic.
We find that IT staff appreciates Precise because of the time savings, because typically those war rooms are not budgeted into the time plan for each FTE in the organization. It’s kind of like additional time. And so being able to handle those issues more efficiently is really vital. And for the organization that rolled out the fluid UI, being able to scale in production and solve the problems they actually experienced in production was really vital, not just to individual staff or teams, but actually to IT management overall because it would have been really bad news if they had to roll back. So, great question, because it’s not just the technology. It’s really always about the people.
Matt Sarrel: Right, it’s the people and the processes. Yeah that was the only question that came up for me during the demo. If there are any others from the audience?
Eric Kavanagh: Yeah, I’ll just throw one last one at you, Bill, and Matt talked about this briefly in his presentation. We’ve started to see this crop up. It is still very forward looking, but containers and the use of containerization and Docker and things of that nature, how big of a curveball does that throw you guys?
Bill Ellis: So the word means different things depending on the technology. So we are evolving our products to take care of containers at the database level and at the application level. And as part of that, it’s kind of the whole environment with the movement to the cloud, and we do operate within the cloud. But there is a discovery process and so depending upon how these applications – including PeopleSoft – are evolving, we’re evolving our monitoring solution so that we can provide the level of depth that’s been so valuable in the past.
Eric Kavanagh: Yeah. And I have to say, every time I see these demos I’m just amazed at the granularity that you have and that’s what you need to be able to piece together an understanding and you do need to have some education around what is the normal situation, what’s standard.
And you folks offer a lot of content around that – helping people identify what is normal, what’s not normal. You talked about trending alerts, for example, these are all mechanisms that you can use to better understand is something wrong, is something not wrong, and then of course from there have to drill down to find it, but you have all the data.
Bill Ellis: Yeah, and that’s a really important thing; I think Matt had talked about that. What is normal? Different environments have a different level of normal. If you’re running with high-end hardware, Oracle Exalogic and Exadata, what’s normal in your shop or what’s achievable in your shop is going to be different than if you were running under a less powerful infrastructure. So the first thing is to find out what’s normal, start calculating that baseline and that way you can begin to make improvements from there.
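The point that "normal" is relative to each shop can be shown with a small baseline calculation. The Python sketch below is a simple mean-and-standard-deviation baseline, chosen here only for illustration; the sample response times, the function names and the three-sigma threshold are assumptions, not the way Precise computes its baselines.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize historical response times as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def is_abnormal(value, baseline, n_sigmas=3.0):
    """True if value is more than n_sigmas standard deviations from the mean."""
    avg, sd = baseline
    return abs(value - avg) > n_sigmas * sd

# Hypothetical response times (seconds) from two different environments.
fast_shop = [0.30, 0.32, 0.31, 0.29, 0.33, 0.30]
slow_shop = [1.10, 1.25, 1.05, 1.20, 1.15, 1.30]

fast_baseline = build_baseline(fast_shop)
slow_baseline = build_baseline(slow_shop)

# The same 1.2-second response is abnormal in one shop and routine in the other.
print(is_abnormal(1.2, fast_baseline))
print(is_abnormal(1.2, slow_baseline))
```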
Eric Kavanagh: OK, that’s a good point. We do have one last question coming in, it looks like. Just one last question I’ll throw to you, Bill. Any difference between SQL and database performance monitoring from the point of view of system-level and application-level data? What’s the difference between monitoring SQL and database performance, from your perspective?
Bill Ellis: Well, nothing happens in a database until a SQL statement executes. The SQL statement is what drives the contention – the locking, the waiting, the contention for resources at the data level and at the SQL Server level. And so if I am able to see both the driver, the SQL statement, and its impact on the system, I have cause and effect; I’m able to link what the application DBA cares about with what the infrastructure DBA cares about, and I’m able to really get the most out of the Precise tool.
If I’m an infrastructure DBA and I’m looking at things like utilization, I’m really kind of managing with a broad brush versus if I’m able to look at an individual SQL statement and I’m able to actually minimize resource consumption – whether it’s CPU, memory, I/O – I’m able to address both sides of that same coin.
Eric Kavanagh: OK, folks. We burnt through just over an hour. Big, big thanks to our friends at IDERA. A big thanks to Matt Sarrel for joining us today. We do archive all these webcasts for later viewing, so feel free to come back and usually in just a couple hours the archive goes up. So check that out and all I have to say is I love this stuff, I love Precise, I love being able to get into the weeds. And I don’t know any other tool that allows you to dig around into all those different pieces and parts of the application stack than what those folks have at IDERA with Precise.
With that, we bid you farewell, folks. Thanks again, we’ll talk to you next time.