[boost] GSoC 2015 Boost Compute Proposal

27 Mar 2015

Hi,

This is my proposal. Please suggest edits.

Thank you!! :)

*Personal Details:*

Name: Aditya Atluri
University: The George Washington University
Major: Computer Engineering
Degree: MS
Email: adityaavinash1@gmail.com
Homepage: adityaatluri.github.io
Availability:
    1. 7  days a week
    2. 27 April to 28 August
    3. Nothing.

*Background Information:*

*Educational Background:*
I completed my undergraduate degree at the Indian Institute of Technology,
Dhanbad, India. I am currently pursuing my Master's at The George
Washington University. The courses I have taken are High Performance
Computing, Advanced Computer Micro Architecture, Computer Graphics,
Advanced Topics in Computer Graphics, VLSI Design, Low Power VLSI Design,
Compiler, and Machine Learning. I have done many projects in these courses,
which can be seen on my website and in my bio[1].

*Programming Background and Interests:*
I am a researcher. I program GPUs at the IR level, write workarounds, and
use specialized hardware for other applications (for example, implementing
ray tracing on the GPU rasterizer, or using tessellators to generate
procedural textures). I wrote high-productivity parallel programming
abstractions for GPUs[2] and presented this work at GTC 2014. For this
work, NVIDIA supported me with a couple of Tesla K40s. As CUDA, OpenGL,
OpenCL, Metal, and Thrust are C/C++ APIs, I use C++ extensively. I also
build databases on GPUs, an area that is now of strong interest to NVIDIA,
Google, and MapD.

I worked with Mesa, the open source graphics drivers for Linux, to build
ARB_shader_atomic_counters for the R600 backend. I also worked at IBM
Software Labs, where I worked mainly with Java.

*Why did I choose Boost?*
I use Boost on a daily basis as part of PyCUDA. I write workarounds which
involve some of the Boost libraries. The part of Boost I am most interested
in next is its Compute library. I have used other STL-style GPU libraries,
but none are as good as Compute (simple and elegant). I couldn't contribute
to Compute via GSoC last year due to my CPT. I don't want to lose that
chance again. This time, I want to take Boost.Compute to a whole new level.
I like to program anything related to GPUs, compilers, drivers, and APIs.

*Interest in Boost.Compute:*
As I mentioned earlier, I am all-in for anything related to GPUs, whether
graphics or compute libraries. I believe Boost.Compute can become a widely
used API, so I want to make it more powerful and more useful.

*Previous Work:*
I work on graphics drivers on a regular basis. I write a lot of OpenGL
compute and shader code (as part of my TA job and Mesa testing on local
systems). I am good at using APIs, making optimizations, and reducing
memory footprint. Examples include increasing the number of draw calls per
second without stalling the CPU, writing shaders with more ALU operations
than load/store operations, aligning data properly to the memory width, and
minimizing compiler-inserted padding by struct unrolling (to get an exact
multiple of the word size). These are a few of the optimizations I do every
day. I am currently working on running Metal Shading Language on Intel
processors (using AVX)[3].
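
As a small illustration of the padding point above, here is a minimal
sketch that uses member reordering (one common way to reduce padding, not
necessarily the exact technique meant above). The struct names are
hypothetical. On common 32- and 64-bit ABIs the first layout typically
takes 16 bytes and the second 12, but exact sizes depend on the compiler
and target.

```cpp
#include <cstddef>
#include <cstdint>

// Members ordered carelessly: the compiler usually inserts padding after
// each 1-byte member so that the following 4-byte member is aligned.
struct vertex_padded {
    std::uint8_t  flag;  // 1 byte, typically followed by 3 bytes of padding
    float         x;     // 4 bytes
    std::uint8_t  tag;   // 1 byte, typically followed by 3 bytes of padding
    std::uint32_t id;    // 4 bytes
};

// Same members, largest first: the two 1-byte members can share one slot,
// so only tail padding (if any) remains.
struct vertex_packed {
    float         x;
    std::uint32_t id;
    std::uint8_t  flag;
    std::uint8_t  tag;
};

// The reordered layout should never be larger than the careless one.
static_assert(sizeof(vertex_packed) <= sizeof(vertex_padded),
              "reordering should not increase the struct size");
```

Printing sizeof() for both structs on the target device is a quick way to
confirm what the compiler actually does there.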

*Plans beyond SoC*:
I want to extend Compute to ARM and Qualcomm devices that support both
OpenCL and ARM Neon. This brings all "compute-intense" APIs under one roof.

C++ 98/03 - 3
C++ 11/14 - 3
C++ STL - 4
Boost C++ Libraries - (Knowledge: 2, Usage: 2)
Git - 3

I program using Visual Studio and Xcode, and the terminal on Linux.
I am comfortable with Doxygen.

*Project Proposal:*

Introduction: The Boost Compute library is so far the best portable
STL-like C++ library for accelerated computing that leverages the
underlying parallel hardware. Boost Compute works well on OpenCL-compatible
desktop/notebook devices. In this project, we implement Boost Compute on
mobile devices. Current mobile devices do not provide good STL-style
abstractions that are accelerated by SIMD and OpenCL. With the introduction
of the Metal graphics API for iOS devices, developing compute applications
for both iOS and non-iOS devices requires multiple skill sets. With this
project, we bring the compute abstractions of different mobile devices
under one roof: Boost.Compute.

Project Goals:

Make C++ extensions from Objective-C.

Integrate device containers into the Boost.Compute front end.

Make Metal API data types visible to the user through Boost.Compute.

Implement BOOST_COMPUTE_FUNCTION() to run custom kernels.

Most importantly, implement algorithms for Boost.Compute on iOS devices
using the Metal Shading Language.
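
For the BOOST_COMPUTE_FUNCTION() goal, the core idea is that the macro
turns the function body into a source string that a backend can later
compile. The following is only a simplified sketch of that stringification
step, not the library's implementation: SKETCH_COMPUTE_FUNCTION,
function_source, and make_function_source are hypothetical names, and the
real macro additionally produces a function object usable in algorithms.

```cpp
#include <string>

// Hypothetical stand-in for a generated device function: its name plus the
// source text that a backend (OpenCL or Metal) would compile later.
struct function_source {
    std::string name;
    std::string source;
};

// Assemble "<ret> <name><args> <body>" from the stringified macro pieces.
inline function_source make_function_source(const char* ret,
                                            const char* name,
                                            const char* args,
                                            const char* body) {
    return { name, std::string(ret) + " " + name + args + " " + body };
}

// '#' stringifies each macro argument; the variadic tail lets the body
// contain commas.
#define SKETCH_COMPUTE_FUNCTION(ret, name, args, ...) \
    const function_source name = \
        make_function_source(#ret, #name, #args, #__VA_ARGS__)

// Same call shape as BOOST_COMPUTE_FUNCTION(int, add_four, (int x), {...}).
SKETCH_COMPUTE_FUNCTION(int, add_four, (int x), { return x + 4; });
```

Here add_four.source holds plain C-like text, which is why the same
front-end macro could in principle feed either an OpenCL C or a Metal
Shading Language compiler.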

*Implementation:*

Building Boost.Compute APIs for iOS devices.
- The Metal graphics API is used as the backend for the Boost.Compute APIs,
in conjunction with the Metal Shading Language.

Make C++ extensions from Objective-C.
- Metal is an Objective-C graphics API, whereas Boost.Compute is a C++
compute API, so a bridge between the two is needed to use Compute on iOS
devices. We therefore build C++ extensions from the Objective-C APIs. As
C++ is also used in building iOS apps, writing Boost.Compute for iOS
devices makes it easier to leverage compute performance from an iOS app.
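
One common way to build such a bridge is the pimpl idiom: the public C++
class exposes no Objective-C types, and the Objective-C object is hidden
behind an opaque pointer. The sketch below is a minimal illustration under
that assumption. The class name metal_buffer is hypothetical, and a host
vector stands in for the Metal object so the example compiles as plain C++.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Public C++ interface: no Objective-C types appear here, so the header
// can be included from ordinary .cpp files.
class metal_buffer {
public:
    explicit metal_buffer(std::size_t bytes);
    ~metal_buffer();
    std::size_t size() const;

private:
    struct impl;                  // defined only in the implementation file
    std::unique_ptr<impl> impl_;
};

// In a real bridge this part would live in an Objective-C++ (.mm) file and
// impl would hold an id<MTLBuffer>. A host vector is a stand-in here.
struct metal_buffer::impl {
    std::vector<unsigned char> storage;
};

metal_buffer::metal_buffer(std::size_t bytes)
    : impl_(new impl{std::vector<unsigned char>(bytes)}) {}

// Defined after impl is complete so unique_ptr can destroy it.
metal_buffer::~metal_buffer() = default;

std::size_t metal_buffer::size() const { return impl_->storage.size(); }
```

With this split, only the .mm files need the Objective-C++ compiler mode,
and the rest of the library stays ordinary C++.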

Build a resource management Metal API backend for Boost.Compute.
- Here we develop the infrastructure that makes "algorithms" run on the
hardware, such as automatically created CommandQueues, CommandBuffers, and
Resources (buffers or textures).
- As the Metal API does not have a ‘context’-related API, the mapping can
be done to the Metal command buffer API. Some OpenCL APIs that are natively
available (and used in Boost.Compute) are not present in the Metal API; an
appropriate mapping from Boost.Compute to Metal API commands will be found.
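
To make the mapping idea concrete, here is a rough sketch of an
OpenCL-style enqueue/finish facade layered over a record-then-commit
command buffer, which is the general shape of Metal's command buffers. Both
classes are hypothetical stand-ins written in plain C++; a real backend
would wrap MTLCommandQueue and MTLCommandBuffer objects instead.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for a Metal-style command buffer: commands are recorded first
// and nothing runs until commit() is called.
class command_buffer {
public:
    void encode(std::function<void()> cmd) {
        commands_.push_back(std::move(cmd));
    }
    void commit() {
        for (auto& cmd : commands_) cmd();  // run in submission order
        commands_.clear();
    }
    std::size_t pending() const { return commands_.size(); }

private:
    std::vector<std::function<void()>> commands_;
};

// OpenCL-style facade: enqueue() records work, finish() submits it.
class command_queue {
public:
    void enqueue(std::function<void()> kernel) {
        buffer_.encode(std::move(kernel));
    }
    void finish() { buffer_.commit(); }
    std::size_t pending() const { return buffer_.pending(); }

private:
    command_buffer buffer_;
};
```

The design choice to settle in the real backend is when to commit: on every
enqueue (simple, more overhead) or lazily at finish() and on reads (fewer
submissions, more bookkeeping).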

Integrate device containers from Metal API to Boost.Compute front end.
- Boost.Compute provides vector and array containers. These should be
declared in Metal and integrated into Boost. Device container work includes
creating, initializing, and moving device vectors, and similarly for
arrays, along with their STL-style functionality (.size(), .length()).
- Iterators for the containers should also be implemented.
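
A minimal sketch of what such a container and its iterator could look like
is given below. The names device_vector and device_iterator are
hypothetical, and a host std::vector stands in for the device buffer. The
iterator is modelled as a (container, index) pair, since device memory
generally cannot be dereferenced through a raw host pointer.

```cpp
#include <cstddef>
#include <vector>

template <class T>
class device_vector;

// Iterator modelled as (container, index); reads are explicit because they
// would be device-to-host transfers in a real backend.
template <class T>
class device_iterator {
public:
    device_iterator(device_vector<T>* v, std::size_t i) : vec_(v), index_(i) {}
    std::size_t index() const { return index_; }
    device_iterator& operator++() { ++index_; return *this; }
    bool operator!=(const device_iterator& o) const {
        return vec_ != o.vec_ || index_ != o.index_;
    }
    T read() const;

private:
    device_vector<T>* vec_;
    std::size_t index_;
};

template <class T>
class device_vector {
public:
    explicit device_vector(std::size_t n, T value = T())
        : storage_(n, value) {}
    std::size_t size() const { return storage_.size(); }
    device_iterator<T> begin() { return device_iterator<T>(this, 0); }
    device_iterator<T> end() {
        return device_iterator<T>(this, storage_.size());
    }
    T read(std::size_t i) const { return storage_[i]; }
    void write(std::size_t i, T v) { storage_[i] = v; }

private:
    std::vector<T> storage_;  // stand-in for a device buffer
};

template <class T>
T device_iterator<T>::read() const { return vec_->read(index_); }
```

Algorithms would then take a pair of such iterators and use the index range
to decide how many threads to launch.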

Make Metal API data types visible to user through Boost.Compute.
- Map Metal data types to the current Boost.Compute data types, while
keeping them consistent with the current Boost.Compute APIs. Certain new
data types are being added to Metal which improve performance and decrease
memory footprint. Making these visible to the programmer and building
algorithms with them is done in this section.
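
One possible shape for this mapping is a compile-time trait from host types
to the type names used in shader source. The sketch below assumes that
approach. metal_type_name and float4_host are hypothetical names
(Boost.Compute has its own vector types such as float4_), while the strings
are Metal Shading Language type names.

```cpp
#include <string>

// Stand-in for a host-side 4-component float vector type.
struct float4_host {
    float x, y, z, w;
};

// The primary template is left undefined, so asking for the name of an
// unmapped type is a compile-time error rather than a runtime surprise.
template <class T>
struct metal_type_name;

template <>
struct metal_type_name<float> {
    static std::string get() { return "float"; }
};

template <>
struct metal_type_name<int> {
    static std::string get() { return "int"; }
};

template <>
struct metal_type_name<unsigned int> {
    static std::string get() { return "uint"; }
};

template <>
struct metal_type_name<float4_host> {
    static std::string get() { return "float4"; }
};
```

Kernel-generating code can then call metal_type_name<T>::get() when
emitting source for a templated algorithm.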

Implement algorithms for Boost.Compute in Metal Shading Language.
- This is the biggest challenge, as Metal compute shaders support two modes
of execution (fast and precise). Algorithms written in each mode should be
iterated on to find the best performance without sacrificing precision for
speed.
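
The precision side of that comparison can be sketched on the host as a
small harness that runs a fast and a precise variant over the same inputs
and reports the worst-case difference. Everything here is an illustrative
assumption: fast_sin is a 5th-order polynomial standing in for a fast-mode
function, precise_sin is std::sin, and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference variant.
inline float precise_sin(float x) { return std::sin(x); }

// Stand-in for a fast-mode variant: x - x^3/6 + x^5/120.
inline float fast_sin(float x) {
    const float x2 = x * x;
    return x * (1.0f - x2 / 6.0f + x2 * x2 / 120.0f);
}

// n evenly spaced samples covering [lo, hi], endpoints included.
inline std::vector<float> sample_range(float lo, float hi, std::size_t n) {
    assert(n >= 2);
    std::vector<float> xs(n);
    for (std::size_t i = 0; i < n; ++i)
        xs[i] = lo + (hi - lo) * static_cast<float>(i)
                         / static_cast<float>(n - 1);
    return xs;
}

// Worst absolute difference between the two variants over the inputs.
template <class F, class G>
float max_abs_error(F fast, G precise, const std::vector<float>& inputs) {
    float worst = 0.0f;
    for (float x : inputs)
        worst = std::max(worst, std::fabs(fast(x) - precise(x)));
    return worst;
}
```

An algorithm would keep its fast variant only if the measured error stays
within a tolerance chosen for that algorithm; timing the two variants on
the device is the other half of the comparison.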

Test the existing Boost.Compute code on iOS devices and debug errors.
- As Metal emulators are not available for Xcode, testing will be done on
hardware[4].

*Timeline:*

The timeline is straightforward; for details, see the Implementation
section.
27 April – 25 May: Community bonding; test the Xcode and iOS infrastructure
to find any possible build problems. Build a test run environment (an iOS
app) to check for any other hardware/software issues in deploying Metal API
code to an iOS device.

25 May – 10 June: Wrap the Objective-C Metal APIs in C++ extensions, submit
patches.

10 June – 25 June: Implement data types and containers for Boost from the
C++ Metal APIs. Make appropriate changes to the code after getting feedback
from the community. Submit patches.

25 June – 3 July: Review community feedback and change patches accordingly.

3 July – 3 August: Write BOOST_COMPUTE_FUNCTION() and the Boost.Compute
algorithms in the Metal Shading Language and integrate them into the
Boost.Compute Metal APIs. Submit patches.

3 August – 17 August: Write tests and debug by running them on the iOS test
app.

17 August – 28 August: Clean up code and review with the community.

Available Hardware: iPhone 4s, iPhone 6 Plus, Mac Mini.

Mentor: Kyle Lutz (kyle.r.lutz@gmail.com)

*Competency:*

The project I started, iaMetal (Metal for Intel processors)[3], implements
the actual Metal API. It does not run Metal SL directly; instead, it
converts Metal SL to AVX instructions.

[1]. https://github.com/adityaatluri/adityaatluri.github.co/blob/master/CV.md
[2]. https://github.com/urutu/Urutu
[3]. https://github.com/iaMetal/iaMetal
[4]. https://devforums.apple.com/message/971605#971605

-- 
Regards,
Aditya Avinash Atluri,
Graduate Student,
Electrical and Computer Engineering,
The George Washington University,
Washington, DC.