Programs for Programmers

Intel VTune Amplifier XE

Intel® VTune™ Amplifier 2017

Performance Profiler

  • Create faster code: Get accurate data, low overhead
  • More data: CPU, GPU, FPU, threading, memory, and more
  • Fast answers: Easy analysis turns data into insight

View the Product Brief

Modern Processor Performance Analysis

Performance on modern processors requires much more than optimizing single-thread performance. High-performing code must be:

  • Threaded and scalable to utilize multiple CPUs
  • Vectorized for efficient use of multiple FPUs
  • Tuned to take advantage of non-uniform memory architectures and caches

With Intel® VTune™ Amplifier, you get all these advanced profiling capabilities with a single, friendly analysis interface. And for media applications, you also get powerful tools to tune OpenCL* and the GPU.

 

Quickly turn raw profiling data into performance insights using the graphical interface to sort, filter, and visualize data from a local or remote target. Or use the command-line interface to automate analysis.

Powerful Analysis Lets You Create Faster Code

Whether you’re tuning for the first time or doing advanced performance optimization, Intel VTune Amplifier provides accurate profiling data, collected with very low overhead. But good data isn’t enough. Intel VTune Amplifier gives you the tools to mine it and interpret it.

What’s New For 2017

  • Profile both Python* and native code with low overhead and source-line detail. Google’s Go* programming language is also supported.
  • Quickly profile three critical metrics for modern hardware performance: CPU utilization (threading), memory access, and FPU utilization (FLOPS).
  • Tune Intel® Xeon Phi™ processors including in-package MCDRAM.
  • Storage analysis. Tune interplay of I/O and compute.
  • Enhanced memory access analysis: Tune data structures for performance and optimize NUMA latency and scalability.
  • Simplified OpenCL profiling: New summary view, easier hotspot analysis setup, OpenCL 2.0 shared virtual memory detection.
  • Easier remote analysis and command line use: Configure a command line for any target architecture from the user interface, including support for MPI launchers.
  • Add custom counters to the timeline: Import a file or use the new API to visualize your custom software counters on the timeline.
  • Intel® Performance Snapshots: Simple enough to run during a coffee break, they highlight where code modernization or faster storage can improve performance. Pre-installed with Intel VTune Amplifier or available separately for free.

 

Specs at a Glance

Processors: Intel® and compatible processors and coprocessors, including Intel® Xeon Phi™ processors.
Languages: C, C++, C#, Fortran, Java*, Python*, Go*, assembly, and more.
Compilers: Works with compilers from Microsoft, GCC, Intel, and others that follow standards.
Development Environments: Integrated with Microsoft Visual Studio* or runs standalone.
Host Operating Systems: Windows*, Linux*, and OS X* (optional download).
Target Operating Systems: Windows*, Linux*, Android*, Tizen*, Wind River Linux*, and Yocto Project*.
Basic Threading Analysis: Full threading information. OpenMP*, Intel® Threading Building Blocks, Intel® Cilk™ Plus, and native threads.
Extended Threading Performance Analysis: OpenMP* and Intel® Threading Building Blocks.
MPI Parallelism: Integration with the Intel Trace Analyzer and Collector MPI profiler.
GPU: OpenCL* and media application tuning on newer Intel processors.
Intel VTune™ Amplifier XE: Performance profiling of Windows* and Linux* applications. Sold alone or as part of an Intel Parallel Studio XE suite.
Intel VTune™ Amplifier for Systems: Profiling of embedded targets. Includes energy profiling for battery-operated systems. Sold only as part of an Intel System Studio suite.

Specs at a Glance

Processors: Intel® and compatible processors and coprocessors.
Languages: C, C++, C#, Fortran, Java*, assembly, and more.
Compilers: Works with compilers from Microsoft, GCC, Intel, and others that follow standards.
Development Environments: Integrated with Microsoft Visual Studio* or Eclipse*, or runs standalone.
Operating Systems: Windows* or Linux*.

 

On newer processors, optionally collect GPU data for tuning OpenCL applications. Correlate GPU and CPU activities. (Windows* only.)

 

No special builds
Use a production build with symbols from your normal compiler.

Low overhead
Accurate results you can count on.

Command line
Automate regression analysis. Simple remote collection.
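
As a rough illustration of scripted collection, the sketch below builds command lines for the `amplxe-cl` tool and runs them only if the tool is found on the PATH. The analysis type `hotspots`, the result directory `r001hs`, and the application `./my_app` are placeholder assumptions; check the options against your installed version before relying on them.

```python
import shutil
import subprocess

VTUNE_CLI = "amplxe-cl"  # command-line tool name in VTune Amplifier XE releases


def collect_cmd(analysis, result_dir, app, *app_args):
    """Build the command that profiles `app` and stores results in `result_dir`."""
    return [VTUNE_CLI, "-collect", analysis, "-result-dir", result_dir,
            "--", app, *app_args]


def report_cmd(report, result_dir):
    """Build the command that prints a report from an existing result directory."""
    return [VTUNE_CLI, "-report", report, "-result-dir", result_dir]


def run_if_available(cmd):
    """Run cmd when its executable is on PATH; return stdout, or None if missing."""
    if shutil.which(cmd[0]) is None:
        return None
    done = subprocess.run(cmd, check=True, stdout=subprocess.PIPE,
                          universal_newlines=True)
    return done.stdout


if __name__ == "__main__":
    # Placeholder names: "./my_app" and "r001hs" are assumptions for illustration.
    run_if_available(collect_cmd("hotspots", "r001hs", "./my_app"))
    summary = run_if_available(report_cmd("summary", "r001hs"))
    if summary is not None:
        print(summary)
```

A nightly job could call a script like this after each build and archive the summary text next to the build artifacts for later comparison.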

System Wide Analysis
Tune drivers, kernel modules and multi-process apps.

Tune Inlining with Call Counts
When a function is called frequently, it may make sense to "inline" the code and eliminate the overhead of the function call. VTune Amplifier XE now provides statistical call count data to help you make better inlining decisions. It also displays profile results on the source code, even if the code is inlined, making it easier to interpret profile results.
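
To make the idea concrete, here is a small sketch of how call counts can point at an inlining candidate. It uses Python's standard `cProfile` module, which records exact call counts through instrumentation, rather than VTune Amplifier's statistical call counts. The functions `scale`, `total_with_calls`, and `total_inlined` are invented for illustration.

```python
import cProfile
import pstats


def scale(x):
    # Tiny helper: the call overhead can rival the work done inside it.
    return x * 3


def total_with_calls(values):
    return sum(scale(v) for v in values)


def total_inlined(values):
    # Same arithmetic with the helper's body written in place.
    return sum(v * 3 for v in values)


def call_counts(func, *args):
    """Run func under cProfile; return ({function name: calls}, func's result)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    counts = {}
    for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers) in stats.stats.items():
        counts[name] = counts.get(name, 0) + ncalls
    return counts, result


if __name__ == "__main__":
    counts, _ = call_counts(total_with_calls, list(range(1000)))
    print("calls to scale:", counts["scale"])
```

A helper that is called once per element and does very little work, like `scale` here, is the kind of function where a high call count suggests that inlining may pay off.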

Auto Detect Microsoft DirectX* Frames
Got a slow spot in your Windows* gameplay? You don't just want to know where you are spending a lot of time; you want to know where you are spending a lot of time while the frame rate is slow. VTune Amplifier can now automatically detect Microsoft DirectX* frames and filter results to show you what is happening in slow frames. Not using DirectX*? Just define the critical region using the API and frame analysis becomes a powerful tool for analyzing latency.

Intel® Threading Building Blocks, OpenMP 4.0, Intel® Cilk™ Plus support
Built-in understanding of parallel programming models means profiling data is described using familiar terms from the source, not with cryptic internal runtime labels.

Low Overhead Java* Profiling
Analyze Java or mixed Java and native code. Results are mapped to the original Java source. Unlike some Java profilers that instrument the code, VTune Amplifier uses low-overhead statistical sampling with either a hardware or software collector. Hardware collection has extremely low overhead because it uses the on-chip performance monitoring hardware.

Analyze User Tasks
The task annotation API lets you annotate your source so VTune Amplifier XE can display which tasks are executing. For example, if you label the stages of your pipeline, they will be marked in the timeline, and hovering over them will reveal details. This makes profiling data much easier to understand.
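
The sketch below mimics the idea of labeling pipeline stages so they can be placed on a timeline. It does not call the VTune Amplifier task annotation API. The `task` context manager, the `timeline` list, and the stage names are hypothetical stand-ins that only record when each labeled region ran.

```python
import time
from contextlib import contextmanager

timeline = []  # (task name, start seconds, end seconds), in completion order


@contextmanager
def task(name):
    """Hypothetical stand-in for a task annotation: records when `name` ran."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timeline.append((name, start, time.perf_counter()))


def run_pipeline(data):
    """Three labeled stages: parse strings, square the numbers, add them up."""
    with task("parse"):
        parsed = [int(x) for x in data]
    with task("transform"):
        squared = [x * x for x in parsed]
    with task("reduce"):
        return sum(squared)


if __name__ == "__main__":
    run_pipeline(["1", "2", "3"])
    for name, start, end in timeline:
        print(f"{name}: {(end - start) * 1e6:.1f} microseconds")
```

With real annotations in place, the profiler rather than the program keeps these intervals and draws them alongside the thread activity.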

Tune for Intel® Xeon Phi™ Products
Hardware profiling is supported for Intel® Xeon Phi™ products and can be launched from the graphical user interface. It can collect advanced hotspots and advanced event data, and it has time markers for correlating data across multiple cards. Software collection (e.g., locks and waits analysis) is not supported on Intel® Xeon Phi™ products.

"Hot Keys" Start and Stop Analysis
Add a shortcut to quickly launch performance analysis whenever you see your app running slowly. Program hot keys to start and stop the collection of performance data.

Tune MPI Applications
Analyze hybrid applications using MPI and OpenMP. Install on a cluster.