Intel® Threading Building Blocks Celebrates 10 Years!
Intel® TBB and C++: Partners in Parallel
The Future: How Should Intel® TBB Evolve?
Ideal for Heterogeneous Systems
Special edition 2016
CONTENTS
The Parallel Universe
Letter from the Editor
From Hatchling to Soaring: Intel® TBB
by James Reinders
FEATURE
The Genesis and Evolution of Intel® Threading Building Blocks
A Decade after the Introduction of Intel Threading Building Blocks, the Original Architect Shares His Perspective
A Tale of Two High-Performance Libraries
How Intel® Math Kernel Library and Intel® Threading Building Blocks Work Together to Improve Performance
Heterogeneous Programming with Intel® Threading Building Blocks
With New Features, Intel® Threading Building Blocks Can Coordinate the Execution of Computations Across Multiple Devices
Preparing for a Many-Core Future
Johns Hopkins University Adds Multicore Parallelism to Increase Performance of Its Bowtie 2* Application
Leading and Following the C++ Standard
Intel® Threading Building Blocks Adheres Tightly to the C++ Standard Where It Can, and Paves the Way for Supporting Parallelism Best
Intel® Threading Building Blocks: Toward the Future
The Architect of Intel® Threading Building Blocks Shares Thoughts on the Opportunities Ahead
For more complete information about compiler optimizations, see our
Optimization Notice.
Sign up for future issues
Share with a friend
LETTER FROM THE EDITOR
James Reinders, an expert on parallel programming, is coauthor of the new Intel® Xeon Phi™ Processor
High Performance Programming – Knights Landing Edition (June 2016), and coeditor of the recent High
Performance Parallel Programming Pearls Volumes One and Two (2014 and 2015). His earlier book
credits include Multithreading for Visual Effects (2014), Intel® Xeon Phi™ Coprocessor High Performance
Programming (2013), Structured Parallel Programming (2012), Intel® Threading Building Blocks: Outfitting
C++ for Multicore Processor Parallelism (2007), and VTune™ Performance Analyzer Essentials (2005).
From Hatchling to Soaring: Intel® TBB
Intel® TBB Is One of the Most Important Contributions to Modern Parallel
Programming—and There's More to Come
This edition of
The Parallel Universe
celebrates the 10th anniversary of the introduction of Intel®
Threading Building Blocks (Intel® TBB). Intel TBB has been called the most important new addition to
parallel programming in the last decade, and I would not argue with that. The articles in this issue will
help you understand why. If you will be so kind as to indulge me, I will share my own thoughts about
Intel TBB. I have four things in mind to touch on as I ramble about TBB.
It Was a Revolution Inside Intel
Intel TBB is our first commercially successful software product to embrace open source. We knew we
wanted to open source Intel TBB from the start, but we were not ready when we launched in 2006.
Open source projects were new to our small team—and to Intel.
We focused first on creating a strong Intel TBB and launching it as
a product in mid-2006. Then we shifted our attention to revising
our build system, cleaning up code (commenting!), and a dozen
other things that would be inviting to others who would want to
understand and contribute to our source code. We had a goal to
be open source in mid-2007.
But a new problem arose: Intel TBB became an immediate hit
with customers. We did not hide our desire to be open source from
our customers, and this only intensified their interest in Intel TBB.
Some of our management asked, “Why give away the source code
to such a successful product?” and I boldly presented a multitude
of reasons, armed with facts and figures from our team, why we
should open up. That was a mistake, and I failed to get the needed
permissions before 2006 ended. I licked my wounds, and we
eventually realized we needed to prove only one thing: Intel TBB
would have far greater adoption if we open sourced it than if we
did not. After all, developers bet the very future of their code when they adopt a programming model.
Perhaps openness matters more for programming models than it does for most other software. We had failed
to articulate to our management that this was all that really mattered, and that it was all we needed
to know to understand that we must open source Intel TBB.
Figure: The original O'Reilly book cover
Armed with this perspective, I approached our senior VP, Renee James, who had to approve our proposal. I
surprised her by showing up with only a single piece of paper with a simple graph on it, which compared
projected Intel TBB adoption with and without open sourcing. We predicted that Intel TBB
would vanish and be replaced within five years if we didn’t offer this critical programming model via open
source. We predicted great success if we did open source (we actually far underestimated the success, as
it turns out). Renee listened to my two-minute pitch, looked at me, and asked, “Why didn’t you say this the
first time? Of course we should do this.” Of course, I could have pointed out that this graph was identical to
slide 7 of the original way-too-long presentation from two months earlier, but I settled on “Thank you,” and
the rest is history. We chose the most popular open source licensing at the time: GPL v2* with classpath
exception (important for C++ template libraries). Ten years later, we are switching Intel TBB to the Apache*
license. We have received a great deal of feedback from the community of users and contributors that this
is the right license to use for Intel TBB today.
Figure: The design of this special edition of The Parallel Universe was inspired by Intel® TBB’s
premiere 10 years ago at OSCON. The theme, “Into the Great Wide Open,” was a huge hit.
Intel TBB's First Revolution of Parallelism: Embrace Task-Stealing Abstraction, Fully
Composable, Fully C++
OpenMP is incredibly important, but it is not composable. This is a mistake of epic proportions, with
long-reaching ramifications, and it cannot be changed because OpenMP is so important and committed to
compatibility. I am complicit in the OpenMP mistake, along with everyone else who helped pull it together,
review it, and promote it starting in 1997. We overlooked the importance that nested parallelism would
have as the amount of hardware parallelism grew. It simply was not a concern in 1997.
Being composable is the most amazing feature of Intel TBB. I cannot overstate the importance of never
worrying about oversubscription, nested parallelism, etc. Intel TBB is gradually revolutionizing certain
communities of developers who demand composability for their applications. The Intel® Math Kernel
Library (Intel® MKL), which has long been based on OpenMP, offers a version built on top of Intel TBB
for exactly this reason. And the much newer (and open source) Intel® Data Analytics Acceleration Library
(Intel® DAAL) always uses Intel TBB and the Intel TBB-powered Intel MKL. In fact, Intel TBB is finding use in
some versions of Python* too.
Figure: “Into the Great Wide Open” called attention to the open source nature of Intel® TBB. At
OSCON, attendees were invited to experience and explore the new Intel TBB for themselves.
Of course, the task-stealing scheduler at the heart of Intel TBB is the real magic. While HPC customers
worry about squeezing out the ultimate performance while running an application on dedicated cores,
Intel TBB tackles a problem that HPC users never worry about: How can you make parallelism work well when
you share the cores that you run upon? Imagine running on eight cores, but a virus checker happens to run
on one core during your application’s run. That would never happen on a supercomputer, but it happens all
the time on workstations and laptops. Without the dynamic nature of the Intel TBB task-stealing scheduler,
such a program would simply be delayed by the full time that the virus checker stole, because it would
effectively delay every thread in the application. When using Intel TBB on eight cores, an interruption of
duration TIME on one core may delay the application by as little as TIME/8. This real-world flexibility
matters a lot.
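The TIME/8 estimate can be checked with a small model. The sketch below is not the Intel TBB scheduler. It is a simplified simulation in plain C++ under stated assumptions: all tasks cost one time unit, core 0 is unavailable for the first `delay` units, and "dynamic" means each task goes to whichever core frees up first. The function names static_finish and dynamic_finish are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Static split: every core is handed tasks/workers tasks up front.
// Core 0 starts `delay` late but still owes its full share, so the
// whole run ends `delay` later than it would undisturbed.
double static_finish(int workers, int tasks, double delay) {
    assert(workers > 0 && tasks >= 0 && delay >= 0.0);
    return static_cast<double>(tasks) / workers + delay;
}

// Dynamic split (idealized): each unit task is given to the core that
// becomes free earliest, so the other cores absorb core 0's lost time.
double dynamic_finish(int workers, int tasks, double delay) {
    assert(workers > 0 && tasks >= 0 && delay >= 0.0);
    std::vector<double> free_at(workers, 0.0);
    free_at[0] = delay;  // core 0 is busy with the interruption first
    for (int t = 0; t < tasks; ++t) {
        auto next = std::min_element(free_at.begin(), free_at.end());
        *next += 1.0;    // that core runs one more unit task
    }
    return *std::max_element(free_at.begin(), free_at.end());
}
```

With 8 cores, 800 unit tasks, and an 80-unit interruption, the static split finishes at 180 and the dynamic split at 110. The undisturbed time is 100, so the delay shrinks from 80 to 10, which is TIME/8 in this idealized model.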
Finally, Intel TBB is a C++ template library that fully embraces bringing parallelism to C++. The dedication
of Intel TBB to C++ has helped inspire changes to the C++ standard. Perhaps our biggest dream of all is
that Intel TBB will one day be only the scheduler and the algorithms that use it. The many other things
in Intel TBB (helping parallelize parts of STL, creating truly portable locks and atomics, addressing
shortcomings in memory allocations, and other features to bring parallelism to C++) can and should
eventually be part of the standard language. Maybe even more of Intel TBB? Time will tell.
Intel TBB’s Second Revolution of Parallelism: Offer Superior Alternatives to Bulk
Synchronous Programming
As much as we can praise Intel TBB’s task-stealing scheduler, the algorithms most often used in
applications are organized with a lot of synchronization happening at runtime. This is a sign of the times
in terms of how parallel programming has been done successfully for years. However, as the amount of
parallelism has grown, this has become a great obstacle in the pursuit of scaling. A better approach is to
express the flow of data and require a minimal level of synchronization. The flow graph addition to Intel
TBB is a leader in this critical revolution in parallel programming. This type of thinking is required for any
parallel programming model to support the future well.