Creating a robust infrastructure on which to host SolarWinds instances is of vital importance. It has a direct impact on what can be monitored, as well as on stakeholder confidence in the resulting outputs. The latter is a key consideration that is often given a back seat.
Building trust in a monitoring system requires that the system reside on a trustworthy platform. It is hard to have confidence in a tool when you’re not sure about the underlying mechanics and their impact on results. It is a bit like depending on a structure built using crooked measuring sticks, which is a tongue-in-cheek metaphor for the topic of this article: identifying a scaling strategy for hosting SolarWinds that inspires confidence and trust in the results produced from network, systems, and applications monitoring.
The following article presents a review of methods and techniques used by our consulting team when formulating recommendations for a robust and trustworthy SolarWinds implementation. The vendor provides a lot of valuable information you should consider and review as an addendum to this article. Keep in mind, values presented are current as of the article’s posting, but may change frequently. The approach is the key, and the specific values should be reviewed for relevance before making recommendations of your own.
Start with the classification of monitoring types, defined as networking, applications, or a combination of both. The basic question is “What kinds of things do I want to monitor?” Put the answers into the aforementioned classifications. It is generally recommended to include Network Performance Monitor (NPM) as a starting point for networking and Server and Application Monitor (SAM) for applications.
From this exercise, expand each classification type identified and select modules that include desired monitoring features.
There are, of course, a few outliers like DPA and LEM, but these modules are not part of the Orion Core and hence would be scoped externally to a standard SolarWinds instance.
Jumping right in…
With those considerations in mind, define Poller Boundaries. This term describes the limits that, once reached, cause default and custom polling thresholds to be throttled programmatically. The boundary applies per polling engine, meaning the Main Polling Engine (MPE) or an Additional Polling Engine (APE). These are hard limits set by SolarWinds that cannot be modified, nor can the throttling behavior be changed.
Having polling thresholds throttled unexpectedly will erode trust in your monitoring system, which is the reason for defining Poller Boundaries.
The chart below lists several SolarWinds modules and their default boundaries. The “Max Limit” is the upper boundary for a single SolarWinds instance.
Keep in mind, these numbers are subject to change as each module is updated to newer versions. Always check the SolarWinds web site for current values.
|Module|Single Poller / Instance|Measured By|Max Limit|
|---|---|---|---|
|NPM|12K|Elements (Node / Interface / Volume)|400K|
|SAM|10K|Components (Process / Service / Counter)|150K|
|NTA|50K|Flows per second|300K|
|Agents|1K|Per Polling Engine|N/A|
|WPM|12|Recordings per Player|Based on Complexity|
|VNQM (1 of 2)|5K|IP SLA Operations per Hour|15K per day|
|VNQM (2 of 2)|20K|Calls per Hour|200K per day|
For each module you plan to use, estimate the number of objects you anticipate monitoring. Use the above chart to determine what each module considers a monitored object. Then divide that number by the Single Poller / Instance value found in the chart, and round up to the next whole number. The result is the number of polling engines needed for that module.
Here is an example using NPM, where you have estimated a need to monitor 50K elements.
50K / 12K ≈ 4.2
Round 4.2 up to the next whole number
Resulting in 5 polling engines needed for NPM.
In summary, at a minimum, you would need 1 instance, 1 MPE (which is included with each instance), and 4 APEs to meet your NPM monitoring requirements without crossing any Poller Boundaries. Keep in mind this is an article on scaling. There are scenarios requiring multiple instances even when a single instance is within Poller Boundaries. That, however, would be an architecture discussion and is out of scope for this article.
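The NPM arithmetic above can be sketched in a few lines of Python. This is only an illustration of the divide-and-round-up step; the function names `polling_engines_needed` and `split_engines` are hypothetical and not part of any SolarWinds tooling, and the 12K figure is the chart value, which may have changed since posting.

```python
def polling_engines_needed(estimated_objects: int, per_poller_limit: int) -> int:
    """Divide the estimate by the per-poller boundary and round up.

    Integer arithmetic is used so the rounding is exact. At least one
    engine is always returned, because every instance includes an MPE.
    """
    if per_poller_limit <= 0:
        raise ValueError("per_poller_limit must be positive")
    engines = (estimated_objects + per_poller_limit - 1) // per_poller_limit
    return max(1, engines)


def split_engines(total_engines: int) -> tuple[int, int]:
    """Split a polling engine count into (MPE count, APE count) for one instance."""
    return 1, total_engines - 1


# NPM example from the article: 50K elements at 12K elements per poller.
npm_engines = polling_engines_needed(50_000, 12_000)
mpe_count, ape_count = split_engines(npm_engines)
print(npm_engines, mpe_count, ape_count)  # 5 1 4
```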
Finally, perform the same basic steps for each identified module. If, say, SAM results in 4 polling engines needed, then the total polling engine count remains 5 to accommodate NPM.
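Repeating the calculation per module and keeping the largest result can be sketched as follows. The limits are copied from the chart in this article for NPM, SAM, and NTA only and should be re-checked against current SolarWinds documentation. The function names and the 40K SAM estimate are assumptions made for illustration. The sketch also flags any module whose estimate exceeds the chart's Max Limit for a single instance.

```python
# Per-poller boundaries and single-instance maximums, taken from the chart
# in this article. Values are subject to change with newer module versions.
SINGLE_POLLER_LIMITS = {"NPM": 12_000, "SAM": 10_000, "NTA": 50_000}
MAX_LIMITS = {"NPM": 400_000, "SAM": 150_000, "NTA": 300_000}


def engines_per_module(estimates: dict[str, int]) -> dict[str, int]:
    """Return the polling engines each module needs (divide, then round up)."""
    result = {}
    for module, count in estimates.items():
        limit = SINGLE_POLLER_LIMITS[module]
        result[module] = max(1, (count + limit - 1) // limit)
    return result


def total_engines(estimates: dict[str, int]) -> int:
    """The instance needs enough engines for its most demanding module."""
    return max(engines_per_module(estimates).values())


def over_max_limit(estimates: dict[str, int]) -> list[str]:
    """List modules whose estimate exceeds the single-instance Max Limit."""
    return [m for m, count in estimates.items() if count > MAX_LIMITS[m]]


# Hypothetical estimates: 50K NPM elements and 40K SAM components.
estimates = {"NPM": 50_000, "SAM": 40_000}
print(engines_per_module(estimates))  # {'NPM': 5, 'SAM': 4}
print(total_engines(estimates))       # 5
print(over_max_limit(estimates))      # []
```

A non-empty result from `over_max_limit` suggests the estimate does not fit a single instance, which moves the conversation from scaling into architecture.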
SolarWinds does not provide direct support for Microsoft SQL Server. It is required for the installation of SolarWinds, but it is not a product they officially support. This is often a point of confusion and consternation. Make sure you have someone on staff who is qualified to handle the database side of things, because this also means you’ll need to support SQL Server High Availability solutions if you want failover and X-scaling capabilities.
Choose a configuration that meets your company’s RTO and RPO business requirements. At a minimum you likely want to use Always On Basic Availability Groups, which replace the deprecated SQL Server Database Mirroring feature. This is a complicated decision to make, but it is likely the most critical. SolarWinds is built entirely around the database and hence is extremely dependent on it.
If company RTO standards require continuous service during failure events, then SolarWinds High Availability Pools will be required. Like any X-scale solution based on duplication, this means the host count is doubled. However, only one High Availability Pool license is required per pair. Technically this is getting into Architecture and Licensing, but it is an important consideration that crosses into the boundary of Scaling and deserves mention in this article.
Continuing with our NPM example, five polling engines would mean 10 hosts (2 x MPE, 8 x APE) and 5 High Availability Pool licenses.
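The doubling rule can be expressed as a small helper. This is a sketch under the assumptions stated in the article: every polling engine is paired with a standby host, and one High Availability Pool license covers each pair. The function name `ha_plan` and the dictionary keys are hypothetical.

```python
def ha_plan(polling_engines: int) -> dict[str, int]:
    """Estimate hosts and HA Pool licenses when every engine is paired.

    Assumes one MPE per instance, with the remaining engines being APEs,
    and one High Availability Pool license per active/standby pair.
    """
    if polling_engines < 1:
        raise ValueError("an instance always has at least one polling engine (the MPE)")
    ape_count = polling_engines - 1
    return {
        "mpe_hosts": 2,                      # active MPE plus its standby
        "ape_hosts": 2 * ape_count,          # each APE plus its standby
        "total_hosts": 2 * polling_engines,
        "ha_pool_licenses": polling_engines, # one license per pair
    }


# NPM example from the article: five polling engines.
print(ha_plan(5))
# {'mpe_hosts': 2, 'ape_hosts': 8, 'total_hosts': 10, 'ha_pool_licenses': 5}
```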
If you anticipate more than 20-25 concurrent users accessing the web console, then an Additional Web Server (AWS) is recommended. This is not a hard rule because much depends on the sophistication of configured dashboards. However, the recommendation is a good rule of thumb and is supported by SolarWinds.
SolarWinds provides resource allocation guidelines based on Small, Medium, Large, Extra Large, and Amazon Web Services deployments. These guidelines are important and should be followed to the letter if at all possible. We recommend classifying the size of your deployment based on these guidelines and deploying the solution using the resource quantities provided in the guide.
The guide explains resource quantities based on the number of modules plus licenses purchased. To make the conversion to Poller Boundaries, take the number of objects per module from the previous steps and apply them to the matching license quantity in the guide.
Using the NPM example again, 50K objects would require an NPM SLX license. In the guide this qualifies as a Large deployment.
The primary host types you’re sizing for include:
Using the above techniques, it should be possible to accurately describe the resources needed for building a SolarWinds instance that is ready to grow and scale with your company. Next steps might include refining the solution further by architecting an infrastructure that takes into account other important factors, such as latency between sites. That, however, is a different topic altogether.
Need more information on Scaling, Architecture or Licensing? Let’s talk! Monalytic is an authorized SolarWinds partner specializing in project-based services, managed services, training, licensing, and maintenance renewals.