Quickly turn raw profiling data into performance insights using the graphical interface to sort, filter and visualize data from a local or remote target. Or use the command-line interface to automate analysis.
Powerful Analysis Lets You Create Faster Code
Whether you’re tuning for the first time or doing advanced performance optimization, Intel VTune Amplifier provides accurate profiling data, collected with very low overhead. But good data isn’t enough. Intel VTune Amplifier also gives you the tools to mine and interpret it.
What’s New For 2017
- Profile both Python* and native code with low overhead and source-line detail. Google’s Go* programming language is also supported.
- Quickly profile three critical metrics for modern hardware performance: CPU utilization (threading), memory access, and FPU utilization (FLOPS).
- Tune Intel® Xeon Phi™ processors including in-package MCDRAM.
- Storage analysis. Tune interplay of I/O and compute.
- Enhanced memory access analysis: Tune data structures for performance and optimize NUMA latency and scalability.
- Simplified OpenCL profiling: New summary view, easier hotspot analysis setup, OpenCL 2.0 shared virtual memory detection.
- Easier remote analysis and command line use: Configure a command line for any target architecture from the user interface, including support for MPI launchers.
- Add custom counters to the timeline: Import a file or use the new API to visualize your custom software counters on the timeline.
- Intel® Performance Snapshots: Simple enough to run during a coffee break, they highlight where code modernization or faster storage can improve performance. Pre-installed with Intel VTune Amplifier or available separately for free.
Specs at a Glance
|Processors||Intel® and compatible processors and coprocessors including Intel® Xeon Phi™ processors.|
|Languages||C, C++, C#, Fortran, Java*, Python*, Go*, ASM assembly, and more.|
|Compilers||Works with compilers from Microsoft, GCC, Intel and others that follow standards.|
|Development Environments||Integrated with Microsoft Visual Studio* or runs stand alone.|
|Host Operating Systems||Windows*, Linux* and OS X* (optional download¹)|
|Target Operating Systems||Windows*, Linux*, Android*, Tizen*, Wind River Linux* and Yocto Project*|
|Basic Threading Analysis (full threading information)||OpenMP*, Intel® Threading Building Blocks, Intel® Cilk™ Plus, and native threads.|
|Extended Threading Performance Analysis||OpenMP* and Intel Threading Building Blocks|
|MPI parallelism||Integration with Intel Trace Analyzer and Collector MPI profiler|
|GPU||OpenCL and media application tuning on newer Intel processors.|
|Intel VTune™ Amplifier XE||Performance profiling of Windows* and Linux* applications¹. Sold alone or as part of an Intel Parallel Studio XE suite.|
|Intel VTune™ Amplifier for Systems||Profiling embedded targets. Includes energy profiling for battery operated systems. Sold only as part of an Intel System Studio suite.|
Specs at a Glance
|Processors||Intel® and compatible processors and coprocessors|
|Languages||C, C++, C#, Fortran, Java*, ASM and more.|
|Compilers||Works with compilers from Microsoft, GCC, Intel and others that follow standards.|
|Development Environments||Integrated with Microsoft Visual Studio* or Eclipse* or runs stand alone.|
|Operating Systems||Windows* or Linux*|
On newer processors, optionally collect GPU data for tuning OpenCL applications. Correlate GPU and CPU activities. (Windows* only.)
No special builds
Use a production build with symbols from your normal compiler.
Accurate results you can count on.
Automate regression analysis. Simple remote collection.
System Wide Analysis
Tune drivers, kernel modules and multi-process apps.
Tune Inlining with Call Counts
When a function is called frequently, it may make sense to "inline" the code and eliminate the overhead of the function call. VTune Amplifier XE now provides statistical call count data to help you make better inlining decisions. It also displays profile results on the source code, even if the code is inlined, making the results easier to interpret.
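As a rough illustration of the kind of code where call counts matter, the C++ sketch below shows a tiny helper called once per element of a hot loop. The names `Vec2`, `dot` and `sum_of_dots` are hypothetical and chosen only for this example. Note that the `inline` keyword is a hint; the compiler makes the final decision, which is why measured call counts are useful when judging whether inlining is worth pursuing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D vector type used only for this illustration.
struct Vec2 {
    double x;
    double y;
};

// A very small function. If a profile shows a high call count here, the
// call overhead may be comparable to the work done, so it is a natural
// inlining candidate. `inline` is only a request to the compiler.
inline double dot(const Vec2& a, const Vec2& b) {
    return a.x * b.x + a.y * b.y;
}

// Hot loop: calls dot() once per element pair.
double sum_of_dots(const std::vector<Vec2>& a, const std::vector<Vec2>& b) {
    assert(a.size() == b.size());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        total += dot(a[i], b[i]);
    }
    return total;
}
```

Calling `sum_of_dots` on two large vectors from your own `main` gives a workload where a per-function call count for `dot` would equal the number of elements processed.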
Auto Detect Microsoft DirectX* Frames
Got a slow spot in your Windows* game play? You don't just want to know where you are spending a lot of time; you want to know where you are spending a lot of time while the frame rate is slow. VTune Amplifier can now automatically detect Microsoft DirectX* frames and filter results to show you what is happening in slow frames. Not using DirectX*? Just define the critical region using the API and frame analysis becomes a powerful tool for analyzing latency.
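For code that does not use DirectX*, marking a critical region might look roughly like the sketch below. It assumes the ITT instrumentation header `ittnotify.h` that ships with VTune Amplifier and its frame calls `__itt_frame_begin_v3` and `__itt_frame_end_v3`; when the header is not on the include path, the annotations compile away so the example still builds. The domain name `Example.Frames` and the functions `simulate_frame` and `run_frames` are hypothetical placeholders for your own loop.

```cpp
#include <cassert>

// Use the real ITT header when available; otherwise build without annotations.
#if defined(__has_include)
#  if __has_include(<ittnotify.h>)
#    include <ittnotify.h>
#    define EXAMPLE_FRAMES_HAVE_ITT 1
#  endif
#endif

#ifdef EXAMPLE_FRAMES_HAVE_ITT
// The domain name is an arbitrary label picked for this example.
static __itt_domain* const g_frame_domain =
    __itt_domain_create("Example.Frames");
#endif

// Placeholder per-frame workload: sums the integers 0..index.
long long simulate_frame(int index) {
    long long acc = 0;
    for (int i = 0; i <= index; ++i) {
        acc += i;
    }
    return acc;
}

// Runs frame_count iterations, marking each one as a frame.
long long run_frames(int frame_count) {
    assert(frame_count >= 0);
    long long total = 0;
    for (int f = 0; f < frame_count; ++f) {
#ifdef EXAMPLE_FRAMES_HAVE_ITT
        if (g_frame_domain) __itt_frame_begin_v3(g_frame_domain, nullptr);
#endif
        total += simulate_frame(f);
#ifdef EXAMPLE_FRAMES_HAVE_ITT
        if (g_frame_domain) __itt_frame_end_v3(g_frame_domain, nullptr);
#endif
    }
    return total;
}
```

When built against the real header, the program also has to be linked with the ITT notify library from the VTune Amplifier installation; the exact paths depend on your install and are not shown here.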
Intel® Threading Building Blocks, OpenMP 4.0, Intel® Cilk™ Plus support
Built-in understanding of parallel programming models means profiling data is described using familiar terms from the source, not with cryptic internal runtime labels.
Low Overhead Java* Profiling
Analyze Java or mixed Java and native code. Results are mapped to the original Java source. Unlike some Java profilers that instrument the code, VTune Amplifier uses low overhead statistical sampling with either a hardware or software collector. Hardware collection has extremely low overhead because it uses the on-chip performance monitoring hardware.
Analyze User Tasks
The task annotation API lets you annotate your source so VTune Amplifier XE can display which tasks are executing. For example, if you label the stages of your pipeline, they will be marked in the timeline and hovering over them will reveal details. This makes profiling data much easier to understand.
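Labeling pipeline stages as described above could be sketched as follows. This assumes the ITT header `ittnotify.h` from the VTune Amplifier installation and its task calls `__itt_task_begin` and `__itt_task_end`; if the header is not found, the annotations become no-ops so the sketch still compiles. The `ScopedTask` wrapper, the domain name `Example.Pipeline` and the three stage functions are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Use the real ITT header when available; otherwise build without annotations.
#if defined(__has_include)
#  if __has_include(<ittnotify.h>)
#    include <ittnotify.h>
#    define EXAMPLE_TASKS_HAVE_ITT 1
#  endif
#endif

#ifdef EXAMPLE_TASKS_HAVE_ITT
// The domain name is an arbitrary label picked for this example.
static __itt_domain* const g_task_domain =
    __itt_domain_create("Example.Pipeline");
#endif

// Hypothetical RAII helper: begins a named task on construction and ends it
// on destruction, so each stage is bracketed even on early return.
class ScopedTask {
public:
    explicit ScopedTask(const char* name) {
#ifdef EXAMPLE_TASKS_HAVE_ITT
        if (g_task_domain) {
            __itt_task_begin(g_task_domain, __itt_null, __itt_null,
                             __itt_string_handle_create(name));
        }
#else
        (void)name;  // No ITT header: the label is unused.
#endif
    }
    ~ScopedTask() {
#ifdef EXAMPLE_TASKS_HAVE_ITT
        if (g_task_domain) __itt_task_end(g_task_domain);
#endif
    }
    ScopedTask(const ScopedTask&) = delete;
    ScopedTask& operator=(const ScopedTask&) = delete;
};

// Stage 1: produce the values 1..n.
std::vector<long long> load_stage(int n) {
    ScopedTask task("load");
    std::vector<long long> values(static_cast<std::size_t>(n));
    std::iota(values.begin(), values.end(), 1LL);
    return values;
}

// Stage 2: square every value in place.
void transform_stage(std::vector<long long>& values) {
    ScopedTask task("transform");
    for (auto& v : values) v *= v;
}

// Stage 3: sum the values.
long long reduce_stage(const std::vector<long long>& values) {
    ScopedTask task("reduce");
    return std::accumulate(values.begin(), values.end(), 0LL);
}

// Runs the three labeled stages in order and returns the final sum.
long long run_pipeline(int n) {
    assert(n >= 0);
    auto values = load_stage(n);
    transform_stage(values);
    return reduce_stage(values);
}
```

With the real header and library in place, each call to `run_pipeline` would be expected to produce three labeled regions ("load", "transform", "reduce") in the collected data. Wrapping the calls in an RAII object is a design choice of this sketch, not something the API requires.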
Tune for Intel® Xeon Phi™ Products
Hardware profiling is supported for Intel® Xeon Phi™ products and can be launched from the graphical user interface. It can collect advanced hotspots and advanced event data and has time markers for correlation of data across multiple cards. Software collection (e.g., locks and waits analysis) is not supported on Intel® Xeon Phi™ products.
"Hot Keys" Start and Stop Analysis
Add a shortcut to quickly launch performance analysis whenever you see your app running slowly. Program hot keys to start and stop the collection of performance data.
Tune MPI Applications
Analyze hybrid applications using MPI and OpenMP. Install on a cluster.